Re: [PATCH 07/11] sockets: Fix default of UnixSocketAddress member @tight

2020-10-29 Thread Markus Armbruster
Paolo Bonzini  writes:

> On 29/10/20 18:39, Paolo Bonzini wrote:
>>> When @tight was set to false as it should be, absent @tight defaults
>>> to false.  Wrong, it should default to true.  This is what breaks QMP.
>> When @has_tight...
>
> Ah, I see what you meant here.  Suggested reword:
>
> -
> An optional bool member of a QAPI struct can be false, true, or absent.
> The previous commit demonstrated that socket_listen() and
> socket_connect() are broken for absent @tight, and indeed QMP chardev-
> add also defaults absent member @tight to false instead of true.
>
> In C, QAPI members are represented by two fields, has_MEMBER and MEMBER.
> We have:
>
>             has_MEMBER  MEMBER
>   false     true        false
>   true      true        true
>   absent    false       false/ignore
>
> When has_MEMBER is false, MEMBER should be set to false on write, and
> ignored on read.
>
> For QMP, the QAPI visitors handle absent @tight by setting both
> @has_tight and @tight to false.  unix_listen_saddr() and
> unix_connect_saddr() however use @tight only, disregarding @has_tight.
> This is wrong and means that absent @tight defaults to false whereas it
> should default to true.
>
> The same is true for @has_abstract, though @abstract defaults to
> false and therefore has the same behavior for all of QMP, HMP and CLI.
> Fix unix_listen_saddr() and unix_connect_saddr() to check
> @has_abstract/@has_tight, and to default absent @tight to true.
>
> However, this is only half of the story.  HMP chardev-add and CLI
> -chardev so far correctly defaulted @tight to true, but they default to
> false again with the above fix for HMP and CLI.  In fact, the "tight"
> and "abstract" options now break completely.
>
> Digging deeper, we find that qemu_chr_parse_socket() also ignores
> @has_tight, leaving it false when it sets @tight.  That is also wrong,
> but the two wrongs cancelled out.  Fix qemu_chr_parse_socket() to set
> @has_tight and @has_abstract; writing testcases for HMP and CLI is left
> for another day.
> -
>
> Apologies if the last sentence is incorrect. :)

Sold (with the table fixed as per Eric's review)!




Re: [PATCH 07/11] sockets: Fix default of UnixSocketAddress member @tight

2020-10-29 Thread Markus Armbruster
Eric Blake  writes:

> On 10/29/20 8:38 AM, Markus Armbruster wrote:
>> QMP chardev-add defaults absent member @tight to false instead of
>> true.  HMP chardev-add and CLI -chardev correctly default to true.
>> 
>> The previous commit demonstrated that socket_listen() and
>> socket_connect() are broken for absent @tight.  That explains why QMP
>> is broken, but not why HMP and CLI work.  We need to dig deeper.
>> 
>> An optional bool member of a QAPI struct can be false, true, or
>> absent.  In C, we have:
>> 
>>  has_MEMBERMEMBER
>> false true  false
>> true   true false
>> absentfalse  false/ignore
>
> I'm not sure the TAB in this table made it very legible (it's hard to
> tell if has_MEMBER is the label of column 1 or 2).

Use of TAB is an accident.

> Row two is wrong: MEMBER (column 3) is set to true when the QMP code
> passed true on the wire.

Pasto, fixing...

Result:

            has_MEMBER  MEMBER
  false     true        false
  true      true        true
  absent    false       false/ignore

>> When has_MEMBER is false, MEMBER should be set to false on write, and
>> ignored on read.
>> 
>> unix_listen_saddr() and unix_connect_saddr() use member @tight without
>> checking @has_tight.  This is wrong.
>
> It generally works if addr was constructed by the same way as the
> generated QAPI parser code - but as you demonstrated, in this particular
> case, because our construction did not obey the rules of the QAPI
> parser, our lack of checking bit us.
>
>> When @tight was set to false as it should be, absent @tight defaults
>> to false.  Wrong, it should default to true.  This is what breaks QMP.
>> 
>> There is one exception: qemu_chr_parse_socket() leaves @has_tight
>> false when it sets @tight.  Wrong, but the wrongs cancel out.  This is
>> why HMP and CLI work.  Same for @has_abstract.
>> 
>> Fix unix_listen_saddr() and unix_connect_saddr() to default absent
>> @tight to true.
>> 
>> Fix qemu_chr_parse_socket() to set @has_tight and @has_abstract.
>
> At any rate, the fix looks correct:
> - as producers, anywhere we hand-construct an addr (rather than using
> generated QAPI code), we MUST set both has_MEMBER and MEMBER, including
> setting MEMBER to false if has_MEMBER is false, if we want to preserve
> the assumptions made in the rest of the code;
> - as consumers, rather than relying on the QAPI parsers only setting
> MEMBER to true when has_MEMBER is true, we can ensure that has_MEMBER
> has priority by checking it ourselves

As long as the instance is built according to the rules, you can
contract has_MEMBER && MEMBER to just MEMBER.  Both mean "true".

However, has_MEMBER && !MEMBER cannot be contracted to !MEMBER.  The
former means "false", the latter means "false or absent".

Doubters, see the table above.

Putting defaults in the schema would let us eliminate the "absent"
state, the has_MEMBER flags, and the bugs that come with them.  Sadly,
this project has been crowded out by more urgent or important work since
forever.

>> +++ b/util/qemu-sockets.c
>> @@ -919,7 +919,7 @@ static int unix_listen_saddr(UnixSocketAddress *saddr,
>>  if (saddr->abstract) {
>>  un.sun_path[0] = '\0';
>>  memcpy(&un.sun_path[1], path, pathlen);
>> -if (saddr->tight) {
>> +if (!saddr->has_tight || saddr->tight) {
>>  addrlen = offsetof(struct sockaddr_un, sun_path) + 1 + pathlen;
>>  }
>>  } else {
>> @@ -979,7 +979,7 @@ static int unix_connect_saddr(UnixSocketAddress *saddr, 
>> Error **errp)
>>  if (saddr->abstract) {
>>  un.sun_path[0] = '\0';
>>  memcpy(&un.sun_path[1], saddr->path, pathlen);
>> -if (saddr->tight) {
>> +if (!saddr->has_tight || saddr->tight) {
>>  addrlen = offsetof(struct sockaddr_un, sun_path) + 1 + pathlen;
>>  }
>>  } else {
>> 
>
> Reviewed-by: Eric Blake 

Thanks!




Re: [PATCH 05/11] test-util-sockets: Synchronize properly, don't sleep(1)

2020-10-29 Thread Markus Armbruster
Eric Blake  writes:

> On 10/29/20 8:38 AM, Markus Armbruster wrote:
>> The abstract sockets test spawns a thread to listen and a accept, and
>
> s/and a/and/

Yes.

>> a second one to connect, with a sleep(1) in between to "ensure" the
>> former is listening when the latter tries to connect.  Review fail.
>> Risks spurious test failure, say when a heavily loaded machine doesn't
>> schedule the first thread quickly enough.  It's also slow.
>> 
>> Listen and accept in the main thread, and start the connect thread in
>> between.  Look ma, no sleep!  Run time drops from 2s wall clock to a
>> few milliseconds.
>> 
>> Signed-off-by: Markus Armbruster 
>> ---
>>  tests/test-util-sockets.c | 39 +--
>>  1 file changed, 13 insertions(+), 26 deletions(-)
>> 
>
> Reviewed-by: Eric Blake 

Thanks!




Re: [PATCH v2 8/8] target/ppc: replaced the TODO with LOG_UNIMP and add break for silence warnings

2020-10-29 Thread Philippe Mathieu-Daudé
On 10/30/20 1:40 AM, Chen Qun wrote:
> When using -Wimplicit-fallthrough in our CFLAGS, the compiler showed warning:
> target/ppc/mmu_helper.c: In function ‘dump_mmu’:
> target/ppc/mmu_helper.c:1351:12: warning: this statement may fall through 
> [-Wimplicit-fallthrough=]
>  1351 | if (ppc64_v3_radix(env_archcpu(env))) {
>   |^
> target/ppc/mmu_helper.c:1358:5: note: here
>  1358 | default:
>   | ^~~
> 
> Use "qemu_log_mask(LOG_UNIMP**)" instead of the TODO comment.
> And add the break statement to fix it.
> 
> Reported-by: Euler Robot 
> Signed-off-by: Chen Qun 
> ---
> v1->v2: replace the TODO by a LOG_UNIMP call and add break statement(Base on 
> Philippe's comments)
> 
> Cc: Thomas Huth 
> Cc: David Gibson 
> Cc: Philippe Mathieu-Daudé 
> ---
>  target/ppc/mmu_helper.c | 5 +++--
>  1 file changed, 3 insertions(+), 2 deletions(-)
> 
> diff --git a/target/ppc/mmu_helper.c b/target/ppc/mmu_helper.c
> index 8972714775..12723362b7 100644
> --- a/target/ppc/mmu_helper.c
> +++ b/target/ppc/mmu_helper.c
> @@ -1349,11 +1349,12 @@ void dump_mmu(CPUPPCState *env)
>  break;
>  case POWERPC_MMU_3_00:
>  if (ppc64_v3_radix(env_archcpu(env))) {
> -/* TODO - Unsupported */
> +qemu_log_mask(LOG_UNIMP, "%s: the PPC64 MMU unsupported\n",
> +  __func__);
>  } else {
>  dump_slb(env_archcpu(env));
> -break;
>  }
> +break;
>  #endif
>  default:
>  qemu_log_mask(LOG_UNIMP, "%s: unimplemented\n", __func__);
> 

Reviewed-by: Philippe Mathieu-Daudé 




Re: [PATCH v7 0/3] hw/block/nvme: dulbe and dsm support

2020-10-29 Thread Klaus Jensen
On Oct 27 18:57, Klaus Jensen wrote:
> From: Klaus Jensen 
> 
> This adds support for the Deallocated or Unwritten Logical Block error
> recovery feature as well as the Dataset Management command.
> 
> v7:
>   - Handle negative return value from bdrv_block_status.
>   - bdrv_get_info may not be supported on all block drivers, so do not
> consider it a fatal error.
> 
> v6:
>   - Skip the allocation of the discards integer and just use the opaque
> value directly (Philippe)
>   - Split changes to include/block/nvme.h into a separate patch
> (Philippe)
>   - Clean up some convoluted checks on the discards value (Philippe)
>   - Use unambiguous units in the commit messages (Philippe)
>   - Stack allocate the range array (Keith)
> 
> v5:
>   - Restore status code from callback (Keith)
> 
> v4:
>   - Removed mixed declaration and code (Keith)
>   - Set NPDG and NPDA and account for the blockdev cluster size.
> 
> Klaus Jensen (3):
>   hw/block/nvme: add dulbe support
>   nvme: add namespace I/O optimization fields to shared header
>   hw/block/nvme: add the dataset management command
> 
>  hw/block/nvme-ns.h|   4 +
>  hw/block/nvme.h   |   2 +
>  include/block/nvme.h  |  12 ++-
>  hw/block/nvme-ns.c|  34 ++--
>  hw/block/nvme.c   | 193 +-
>  hw/block/trace-events |   4 +
>  6 files changed, 240 insertions(+), 9 deletions(-)
> 
> -- 
> 2.29.1
> 

Keith, I cleared your R-b's from both patch 1 and 3 - please re-review.
The diff from v6 is very small, but it does include functional changes.




Re: [PATCH v2 6/8] target/sparc/win_helper: silence the compiler warnings

2020-10-29 Thread Philippe Mathieu-Daudé
On 10/30/20 1:40 AM, Chen Qun wrote:
> When using -Wimplicit-fallthrough in our CFLAGS, the compiler showed warning:
> target/sparc/win_helper.c: In function ‘get_gregset’:
> target/sparc/win_helper.c:304:9: warning: this statement may fall through 
> [-Wimplicit-fallthrough=]
>   304 | trace_win_helper_gregset_error(pstate);
>   | ^~
> target/sparc/win_helper.c:306:5: note: here
>   306 | case 0:
>   | ^~~~
> 
> Add the corresponding "fall through" comment to fix it.
> 
> Reported-by: Euler Robot 
> Signed-off-by: Chen Qun 
> Reviewed-by: Artyom Tarasenko 
> ---
> v1->v2: Combine the /* fall through */ to the preceding comments
> (Base on Philippe's comments).

Reviewed-by: Philippe Mathieu-Daudé 



Re: [PATCH v2 3/8] accel/tcg/user-exec: silence the compiler warnings

2020-10-29 Thread Philippe Mathieu-Daudé
On 10/30/20 1:40 AM, Chen Qun wrote:
> When using -Wimplicit-fallthrough in our CFLAGS, the compiler showed warning:
> ../accel/tcg/user-exec.c: In function ‘handle_cpu_signal’:
> ../accel/tcg/user-exec.c:169:13: warning: this statement may fall through 
> [-Wimplicit-fallthrough=]
>   169 | cpu_exit_tb_from_sighandler(cpu, old_set);
>   | ^
> ../accel/tcg/user-exec.c:172:9: note: here
>   172 | default:
> 
> Mark the cpu_exit_tb_from_sighandler() function with QEMU_NORETURN to fix it.
> 
> Reported-by: Euler Robot 
> Signed-off-by: Chen Qun 
> ---
> v1->v2: Add QEMU_NORETURN to cpu_exit_tb_from_sighandler() function
> to avoid the compiler warnings(Base on Thomas's and Richard's comments).

Reviewed-by: Philippe Mathieu-Daudé 




[PATCH v2] block: Remove unused BlockDeviceMapEntry

2020-10-29 Thread Markus Armbruster
BlockDeviceMapEntry has never been used.  It was added in commit
facd6e2 "so that it is published through the introspection mechanism."
What exactly introspecting types that aren't used for anything could
accomplish isn't clear.  What "introspection mechanism" to use is also
nebulous.  To the best of my knowledge, there has never been one that
covered this type.  Certainly not query-qmp-schema, which includes
only types that are actually used in QMP.

Not being able to introspect BlockDeviceMapEntry hasn't bothered
anyone enough to complain in almost four years.  Get rid of it.

Cc: Paolo Bonzini 
Cc: Eric Blake 
Reviewed-by: Eric Blake 
Signed-off-by: Markus Armbruster 
---
I found an old patch I neglected to merge.

Max replied to a remark in Eric's review of v1:

Max Reitz  writes:

> On 2017-07-28 20:10, Eric Blake wrote:
>> This type is the schema for 'qemu-img map --output=json'.  And I had a
>> patch once (that I need to revive) that added a JSON Output visitor; at
>> which point I fixed qemu-img to convert from QAPI to JSON instead of
>> open-coding its construction of its output string, at which point the
>> QAPI generated code for this type is useful.
> (Very late reply, I know, I just stumbled over *MapEntry when looking
> over block-core.json what we might want to deprecate in 3.0)
>
> We already use MapEntry there -- why don't we output just that instead?
> The only difference seems to be an additional @filename parameter which
> would probably be actually nice to include in the output.
>
> Except that BlockDeviceMapEntry's documentation is better, so we should
> merge that into MapEntry before removing the former.
>
> Max

https://lists.nongnu.org/archive/html/qemu-devel/2017-12/msg02933.html

Me doing the doc update Max suggested could take more than one
iteration, as I know nothing about this stuff.  Max, could you give it
a try?  Feel free to take over my patch.

 qapi/block-core.json | 29 -
 1 file changed, 29 deletions(-)

diff --git a/qapi/block-core.json b/qapi/block-core.json
index e00fc27b5e..2aa499a72e 100644
--- a/qapi/block-core.json
+++ b/qapi/block-core.json
@@ -418,35 +418,6 @@
 ##
 { 'enum': 'BlockDeviceIoStatus', 'data': [ 'ok', 'failed', 'nospace' ] }
 
-##
-# @BlockDeviceMapEntry:
-#
-# Entry in the metadata map of the device (returned by "qemu-img map")
-#
-# @start: Offset in the image of the first byte described by this entry
-# (in bytes)
-#
-# @length: Length of the range described by this entry (in bytes)
-#
-# @depth: Number of layers (0 = top image, 1 = top image's backing file, etc.)
-# before reaching one for which the range is allocated.  The value is
-# in the range 0 to the depth of the image chain - 1.
-#
-# @zero: the sectors in this range read as zeros
-#
-# @data: reading the image will actually read data from a file (in particular,
-#if @offset is present this means that the sectors are not simply
-#preallocated, but contain actual data in raw format)
-#
-# @offset: if present, the image file stores the data for this range in
-#  raw format at the given offset.
-#
-# Since: 1.7
-##
-{ 'struct': 'BlockDeviceMapEntry',
-  'data': { 'start': 'int', 'length': 'int', 'depth': 'int', 'zero': 'bool',
-'data': 'bool', '*offset': 'int' } }
-
 ##
 # @DirtyBitmapStatus:
 #
-- 
2.26.2




Re: Out-of-Process Device Emulation session at KVM Forum 2020

2020-10-29 Thread Stefan Hajnoczi
On Fri, Oct 30, 2020 at 3:04 AM Alex Williamson
 wrote:
> It's great to revisit ideas, but proclaiming a uAPI is bad solely
> because the data transfer is opaque, without defining why that's bad,
> evaluating the feasibility and implementation of defining a well
> specified data format rather than protocol, including cross-vendor
> support, or proposing any sort of alternative is not so helpful imo.

The migration approaches in VFIO and vDPA/vhost were designed for
different requirements and I think this is why there are different
perspectives on this. Here is a comparison and how VFIO could be
extended in the future. I see 3 levels of device state compatibility:

1. The device cannot save/load state blobs, instead userspace fetches
and restores specific values of the device's runtime state (e.g. last
processed ring index). This is the vhost approach.

2. The device can save/load state in a standard format. This is
similar to #1 except that there is a single read/write blob interface
instead of fine-grained get_FOO()/set_FOO() interfaces. This approach
pushes the migration state parsing into the device so that userspace
doesn't need knowledge of every device type. With this approach it is
possible for a device from vendor A to migrate to a device from vendor
B, as long as they both implement the same standard migration format.
The limitation of this approach is that vendor-specific state cannot
be transferred.

3. The device can save/load opaque blobs. This is the initial VFIO
approach. A device from vendor A cannot migrate to a device from
vendor B because the format is incompatible. This approach works well
when devices have unique guest-visible hardware interfaces so the
guest wouldn't be able to handle migrating a device from vendor A to a
device from vendor B anyway.

I think we will see more NVMe and VIRTIO hardware VFIO devices in the
future. Those are standard guest-visible hardware interfaces. It makes
sense to define standard migration formats so it's possible to migrate
a device from vendor A to a device from vendor B.

This can be achieved as follows:
1. The VFIO migration blob starts with a unique format identifier such
as a UUID. This way the destination device can identify standard
device state formats and parse them.
2. The VFIO device state ioctl is extended so userspace can enumerate
and select device state formats. This way it's possible to check
available formats on the source and destination devices before
migration and to configure the source device to produce device state
in a common format.

To me it seems #3 makes sense as an initial approach for VFIO since
guest-visible hardware interfaces are often not compatible between PCI
devices. #2 can be added in the future, especially when VFIO drivers
from different vendors become available that present the same
guest-visible hardware interface (NVMe, VIRTIO, etc).

Stefan



Re: [PATCH v2 6/8] target/sparc/win_helper: silence the compiler warnings

2020-10-29 Thread Richard Henderson
On 10/29/20 5:40 PM, Chen Qun wrote:
> When using -Wimplicit-fallthrough in our CFLAGS, the compiler showed warning:
> target/sparc/win_helper.c: In function ‘get_gregset’:
> target/sparc/win_helper.c:304:9: warning: this statement may fall through 
> [-Wimplicit-fallthrough=]
>   304 | trace_win_helper_gregset_error(pstate);
>   | ^~
> target/sparc/win_helper.c:306:5: note: here
>   306 | case 0:
>   | ^~~~
> 
> Add the corresponding "fall through" comment to fix it.
> 
> Reported-by: Euler Robot 
> Signed-off-by: Chen Qun 
> Reviewed-by: Artyom Tarasenko 
> ---
> v1->v2: Combine the /* fall through */ to the preceding comments
> (Base on Philippe's comments).

Reviewed-by: Richard Henderson 


r~




Re: [PATCH v2 3/8] accel/tcg/user-exec: silence the compiler warnings

2020-10-29 Thread Richard Henderson
On 10/29/20 5:40 PM, Chen Qun wrote:
> When using -Wimplicit-fallthrough in our CFLAGS, the compiler showed warning:
> ../accel/tcg/user-exec.c: In function ‘handle_cpu_signal’:
> ../accel/tcg/user-exec.c:169:13: warning: this statement may fall through 
> [-Wimplicit-fallthrough=]
>   169 | cpu_exit_tb_from_sighandler(cpu, old_set);
>   | ^
> ../accel/tcg/user-exec.c:172:9: note: here
>   172 | default:
> 
> Mark the cpu_exit_tb_from_sighandler() function with QEMU_NORETURN to fix it.
> 
> Reported-by: Euler Robot 
> Signed-off-by: Chen Qun 
> ---
> v1->v2: Add QEMU_NORETURN to cpu_exit_tb_from_sighandler() function
> to avoid the compiler warnings(Base on Thomas's and Richard's comments).

Reviewed-by: Richard Henderson 


r~



block: Fix some code style problems, "foo* bar" should be "foo *bar"

2020-10-29 Thread shiliyang
Some code style problems were found while reading the block driver code.
This patch fixes occurrences of the error: "foo* bar" should be "foo *bar".

Signed-off-by: Liyang Shi 
Reported-by: Euler Robot 
---
 block/blkdebug.c |  2 +-
 block/dmg.c  |  2 +-
 block/qcow2.c|  4 ++--
 block/qcow2.h|  6 +++---
 block/vpc.c  | 10 +-
 5 files changed, 12 insertions(+), 12 deletions(-)

diff --git a/block/blkdebug.c b/block/blkdebug.c
index 54da719dd1..5fe6172da9 100644
--- a/block/blkdebug.c
+++ b/block/blkdebug.c
@@ -173,7 +173,7 @@ static int add_rule(void *opaque, QemuOpts *opts, Error 
**errp)
 {
 struct add_rule_data *d = opaque;
 BDRVBlkdebugState *s = d->s;
-const char* event_name;
+const char *event_name;
 int event;
 struct BlkdebugRule *rule;
 int64_t sector;
diff --git a/block/dmg.c b/block/dmg.c
index 0d6c317296..ef35a505f2 100644
--- a/block/dmg.c
+++ b/block/dmg.c
@@ -559,7 +559,7 @@ static void dmg_refresh_limits(BlockDriverState *bs, Error 
**errp)
 bs->bl.request_alignment = BDRV_SECTOR_SIZE; /* No sub-sector I/O */
 }

-static inline int is_sector_in_chunk(BDRVDMGState* s,
+static inline int is_sector_in_chunk(BDRVDMGState *s,
 uint32_t chunk_num, uint64_t sector_num)
 {
 if (chunk_num >= s->n_chunks || s->sectors[chunk_num] > sector_num ||
diff --git a/block/qcow2.c b/block/qcow2.c
index b6cb4db8bb..0f94c43ce9 100644
--- a/block/qcow2.c
+++ b/block/qcow2.c
@@ -269,7 +269,7 @@ static int qcow2_read_extensions(BlockDriverState *bs, 
uint64_t start_offset,

 case QCOW2_EXT_MAGIC_FEATURE_TABLE:
 if (p_feature_table != NULL) {
-void* feature_table = g_malloc0(ext.len + 2 * 
sizeof(Qcow2Feature));
+void *feature_table = g_malloc0(ext.len + 2 * 
sizeof(Qcow2Feature));
 ret = bdrv_pread(bs->file, offset , feature_table, ext.len);
 if (ret < 0) {
 error_setg_errno(errp, -ret, "ERROR: ext_feature_table: "
@@ -3374,7 +3374,7 @@ qcow2_co_create(BlockdevCreateOptions *create_options, 
Error **errp)
 size_t cluster_size;
 int version;
 int refcount_order;
-uint64_t* refcount_table;
+uint64_t *refcount_table;
 int ret;
 uint8_t compression_type = QCOW2_COMPRESSION_TYPE_ZLIB;

diff --git a/block/qcow2.h b/block/qcow2.h
index 125ea9679b..2da03e1d1e 100644
--- a/block/qcow2.h
+++ b/block/qcow2.h
@@ -343,8 +343,8 @@ typedef struct BDRVQcow2State {
 uint64_t l1_table_offset;
 uint64_t *l1_table;

-Qcow2Cache* l2_table_cache;
-Qcow2Cache* refcount_block_cache;
+Qcow2Cache *l2_table_cache;
+Qcow2Cache *refcount_block_cache;
 QEMUTimer *cache_clean_timer;
 unsigned cache_clean_interval;

@@ -394,7 +394,7 @@ typedef struct BDRVQcow2State {
 uint64_t autoclear_features;

 size_t unknown_header_fields_size;
-void* unknown_header_fields;
+void *unknown_header_fields;
 QLIST_HEAD(, Qcow2UnknownHeaderExtension) unknown_header_ext;
 QTAILQ_HEAD (, Qcow2DiscardRegion) discards;
 bool cache_discards;
diff --git a/block/vpc.c b/block/vpc.c
index 890554277e..1ab55f9287 100644
--- a/block/vpc.c
+++ b/block/vpc.c
@@ -172,7 +172,7 @@ static QemuOptsList vpc_runtime_opts = {

 static QemuOptsList vpc_create_opts;

-static uint32_t vpc_checksum(uint8_t* buf, size_t size)
+static uint32_t vpc_checksum(uint8_t *buf, size_t size)
 {
 uint32_t res = 0;
 int i;
@@ -528,7 +528,7 @@ static inline int64_t get_image_offset(BlockDriverState 
*bs, uint64_t offset,
  *
  * Returns 0 on success and < 0 on error
  */
-static int rewrite_footer(BlockDriverState* bs)
+static int rewrite_footer(BlockDriverState *bs)
 {
 int ret;
 BDRVVPCState *s = bs->opaque;
@@ -548,7 +548,7 @@ static int rewrite_footer(BlockDriverState* bs)
  *
  * Returns the sectors' offset in the image file on success and < 0 on error
  */
-static int64_t alloc_block(BlockDriverState* bs, int64_t offset)
+static int64_t alloc_block(BlockDriverState *bs, int64_t offset)
 {
 BDRVVPCState *s = bs->opaque;
 int64_t bat_offset;
@@ -781,8 +781,8 @@ static int coroutine_fn 
vpc_co_block_status(BlockDriverState *bs,
  * the hardware EIDE and ATA-2 limit of 16 heads (max disk size of 127 GB)
  * and instead allow up to 255 heads.
  */
-static int calculate_geometry(int64_t total_sectors, uint16_t* cyls,
-uint8_t* heads, uint8_t* secs_per_cyl)
+static int calculate_geometry(int64_t total_sectors, uint16_t *cyls,
+uint8_t *heads, uint8_t *secs_per_cyl)
 {
 uint32_t cyls_times_heads;

-- 
2.17.1.windows.2






[PATCH v3 1/3] hw/9pfs : add spaces around operator

2020-10-29 Thread Xinhao Zhang
Fix code style: operators need spaces on both sides.

Signed-off-by: Xinhao Zhang 
Signed-off-by: Kai Deng 
Reported-by: Euler Robot 
Reviewed-by: Greg Kurz 
---
 hw/9pfs/9p-local.c | 10 +-
 hw/9pfs/9p.c   | 16 
 2 files changed, 13 insertions(+), 13 deletions(-)

diff --git a/hw/9pfs/9p-local.c b/hw/9pfs/9p-local.c
index 3107637209..af52c1daac 100644
--- a/hw/9pfs/9p-local.c
+++ b/hw/9pfs/9p-local.c
@@ -162,13 +162,13 @@ static void local_mapped_file_attr(int dirfd, const char 
*name,
 memset(buf, 0, ATTR_MAX);
 while (fgets(buf, ATTR_MAX, fp)) {
 if (!strncmp(buf, "virtfs.uid", 10)) {
-stbuf->st_uid = atoi(buf+11);
+stbuf->st_uid = atoi(buf + 11);
 } else if (!strncmp(buf, "virtfs.gid", 10)) {
-stbuf->st_gid = atoi(buf+11);
+stbuf->st_gid = atoi(buf + 11);
 } else if (!strncmp(buf, "virtfs.mode", 11)) {
-stbuf->st_mode = atoi(buf+12);
+stbuf->st_mode = atoi(buf + 12);
 } else if (!strncmp(buf, "virtfs.rdev", 11)) {
-stbuf->st_rdev = atoi(buf+12);
+stbuf->st_rdev = atoi(buf + 12);
 }
 memset(buf, 0, ATTR_MAX);
 }
@@ -823,7 +823,7 @@ static int local_open2(FsContext *fs_ctx, V9fsPath 
*dir_path, const char *name,
 if (fd == -1) {
 goto out;
 }
-credp->fc_mode = credp->fc_mode|S_IFREG;
+credp->fc_mode = credp->fc_mode | S_IFREG;
 if (fs_ctx->export_flags & V9FS_SM_MAPPED) {
 /* Set cleint credentials in xattr */
 err = local_set_xattrat(dirfd, name, credp);
diff --git a/hw/9pfs/9p.c b/hw/9pfs/9p.c
index 741d222c3f..94df440fc7 100644
--- a/hw/9pfs/9p.c
+++ b/hw/9pfs/9p.c
@@ -1091,7 +1091,7 @@ static mode_t v9mode_to_mode(uint32_t mode, V9fsString 
*extension)
 }
 }
 
-if (!(ret&~0777)) {
+if (!(ret & ~0777)) {
 ret |= S_IFREG;
 }
 
@@ -2776,7 +2776,7 @@ static void coroutine_fn v9fs_create(void *opaque)
 v9fs_path_unlock(s);
 } else {
 err = v9fs_co_open2(pdu, fidp, &name, -1,
-omode_to_uflags(mode)|O_CREAT, perm, &stbuf);
+omode_to_uflags(mode) | O_CREAT, perm, &stbuf);
 if (err < 0) {
 goto out;
 }
@@ -3428,7 +3428,7 @@ static int v9fs_fill_statfs(V9fsState *s, V9fsPDU *pdu, 
struct statfs *stbuf)
  * compute bsize factor based on host file system block size
  * and client msize
  */
-bsize_factor = (s->msize - P9_IOHDRSZ)/stbuf->f_bsize;
+bsize_factor = (s->msize - P9_IOHDRSZ) / stbuf->f_bsize;
 if (!bsize_factor) {
 bsize_factor = 1;
 }
@@ -3440,9 +3440,9 @@ static int v9fs_fill_statfs(V9fsState *s, V9fsPDU *pdu, 
struct statfs *stbuf)
  * adjust(divide) the number of blocks, free blocks and available
  * blocks by bsize factor
  */
-f_blocks = stbuf->f_blocks/bsize_factor;
-f_bfree  = stbuf->f_bfree/bsize_factor;
-f_bavail = stbuf->f_bavail/bsize_factor;
+f_blocks = stbuf->f_blocks / bsize_factor;
+f_bfree  = stbuf->f_bfree / bsize_factor;
+f_bavail = stbuf->f_bavail / bsize_factor;
 f_files  = stbuf->f_files;
 f_ffree  = stbuf->f_ffree;
 fsid_val = (unsigned int) stbuf->f_fsid.__val[0] |
@@ -4185,6 +4185,6 @@ static void __attribute__((__constructor__)) 
v9fs_set_fd_limit(void)
 error_report("Failed to get the resource limit");
 exit(1);
 }
-open_fd_hw = rlim.rlim_cur - MIN(400, rlim.rlim_cur/3);
-open_fd_rc = rlim.rlim_cur/2;
+open_fd_hw = rlim.rlim_cur - MIN(400, rlim.rlim_cur / 3);
+open_fd_rc = rlim.rlim_cur / 2;
 }
-- 
2.29.0-rc1




[PATCH v3 2/3] hw/9pfs : open brace '{' following struct should go on the same line

2020-10-29 Thread Xinhao Zhang
Fix code style. Open braces for struct should go on the same line.

Signed-off-by: Xinhao Zhang 
Signed-off-by: Kai Deng 
Reported-by: Euler Robot 
Reviewed-by: Greg Kurz 
---
 hw/9pfs/9p.h | 9 +++--
 1 file changed, 3 insertions(+), 6 deletions(-)

diff --git a/hw/9pfs/9p.h b/hw/9pfs/9p.h
index 3dd1b50b1a..32df81f360 100644
--- a/hw/9pfs/9p.h
+++ b/hw/9pfs/9p.h
@@ -143,8 +143,7 @@ typedef struct {
  */
 QEMU_BUILD_BUG_ON(sizeof(P9MsgHeader) != 7);
 
-struct V9fsPDU
-{
+struct V9fsPDU {
 uint32_t size;
 uint16_t tag;
 uint8_t id;
@@ -270,8 +269,7 @@ union V9fsFidOpenState {
 void *private;
 };
 
-struct V9fsFidState
-{
+struct V9fsFidState {
 int fid_type;
 int32_t fid;
 V9fsPath path;
@@ -338,8 +336,7 @@ typedef struct {
 uint64_t path;
 } QpfEntry;
 
-struct V9fsState
-{
+struct V9fsState {
 QLIST_HEAD(, V9fsPDU) free_list;
 QLIST_HEAD(, V9fsPDU) active_list;
 V9fsFidState *fid_list;
-- 
2.29.0-rc1




[PATCH v3 3/3] hw/9pfs : add space before the open parenthesis '('

2020-10-29 Thread Xinhao Zhang
Fix code style. Space required before the open parenthesis '('.

Signed-off-by: Xinhao Zhang 
Signed-off-by: Kai Deng 
Reported-by: Euler Robot 
Reviewed-by: Greg Kurz 
---
 hw/9pfs/cofs.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/hw/9pfs/cofs.c b/hw/9pfs/cofs.c
index 55991916ec..0b321b456e 100644
--- a/hw/9pfs/cofs.c
+++ b/hw/9pfs/cofs.c
@@ -23,7 +23,7 @@ static ssize_t __readlink(V9fsState *s, V9fsPath *path, 
V9fsString *buf)
 ssize_t len, maxlen = PATH_MAX;
 
 buf->data = g_malloc(PATH_MAX);
-for(;;) {
+for (;;) {
 len = s->ops->readlink(&s->ctx, path, buf->data, maxlen);
 if (len < 0) {
 g_free(buf->data);
-- 
2.29.0-rc1




Live migration not possible from 5.0 to 5.1?

2020-10-29 Thread Antti Antinoja
Hi All,

I couldn't find any mention of live migration incompatibility between 5.0 
and 5.1 in the release notes, but at least on our AMD-based platform, live 
migration from 5.0 to 5.1 is not possible.

The upgraded host had versions identical to its pair before the upgrade was 
started:
* qemu 5.0.0-r2
* kernel 5.7.17

After upgrade:
* qemu 5.1.0-r1
* kernel 5.8.16

After reverting qemu back to 5.0.0-r2 migration worked normally.

On the sending end "info migrate" shows:
(qemu) info migrate
info migrate
globals:
store-global-state: on
only-migratable: off
send-configuration: on
send-section-footer: on
decompress-error-check: on
clear-bitmap-shift: 18
Migration status: failed (Unable to write to socket: Broken pipe)
total time: 0 milliseconds

At least once, the receiving end died (while running 5.1.0-r1). All attempts 
resulted in a "Broken pipe" error on the sending (5.0.0-r2) end.

Cheers,
Antti

-- 
Antti Antinoja 



[PATCH] migration/dirtyrate: simplify includes in dirtyrate.c

2020-10-29 Thread Chuan Zheng
Remove the redundant blank line left behind by commit 662770af7c6e8c,
and take this opportunity to also remove redundant includes in dirtyrate.c.

Signed-off-by: Chuan Zheng 
---
 migration/dirtyrate.c | 5 -
 1 file changed, 5 deletions(-)

diff --git a/migration/dirtyrate.c b/migration/dirtyrate.c
index 8f728d2..ccb9814 100644
--- a/migration/dirtyrate.c
+++ b/migration/dirtyrate.c
@@ -11,17 +11,12 @@
  */
 
 #include "qemu/osdep.h"
-
 #include 
 #include "qapi/error.h"
 #include "cpu.h"
-#include "qemu/config-file.h"
-#include "exec/memory.h"
 #include "exec/ramblock.h"
-#include "exec/target_page.h"
 #include "qemu/rcu_queue.h"
 #include "qapi/qapi-commands-migration.h"
-#include "migration.h"
 #include "ram.h"
 #include "trace.h"
 #include "dirtyrate.h"
-- 
1.8.3.1




Re: [PATCH] vfio-pci: add Ascend devices passthrough quirks

2020-10-29 Thread Alex Williamson
On Thu, 29 Oct 2020 19:40:48 +0800
Binfeng Wu  wrote:

> Ascend is a series of SoC processors developed by Huawei. Ascend310/910
> are highly efficient, flexible, and programmable AI processors in this
> series and support device passthrough via vfio-pci. An Ascend device's
> xloader update is only allowed on the host, because an update triggered by
> VM users may affect other users when Ascend devices are passed through to
> a VM. Setting a BAR quirk is an effective method to keep VM users from
> updating the xloader. In this patch, two BAR quirks are proposed to cover
> Ascend310/910 respectively.


If you're trying to say that userspace, not just a VM, should not be
able to update this feature, then is QEMU the right place to implement
this quirk versus device specific protection within the host kernel?
Thanks,

Alex

> Signed-off-by: Binfeng Wu 
> ---
>  hw/vfio/pci-quirks.c | 104 +++
>  1 file changed, 104 insertions(+)
> 
> diff --git a/hw/vfio/pci-quirks.c b/hw/vfio/pci-quirks.c
> index 57150913b7..291a45d3ab 100644
> --- a/hw/vfio/pci-quirks.c
> +++ b/hw/vfio/pci-quirks.c
> @@ -1202,6 +1202,108 @@ int vfio_pci_igd_opregion_init(VFIOPCIDevice *vdev,
>  return 0;
>  }
>  
> +#define PCI_VENDOR_ID_HUAWEI  0x19e5
> +#define PCI_DEVICE_ID_ASCEND910   0xd801
> +#define PCI_DEVICE_ID_ASCEND310   0xd100
> +#define ASCEND910_XLOADER_SIZE4
> +#define ASCEND910_XLOADER_OFFSET  0x80400
> +#define ASCEND310_XLOADER_SIZE4
> +#define ASCEND310_XLOADER_OFFSET  0x400
> +
> +typedef struct VFIOAscendBarQuirk {
> +struct VFIOPCIDevice *vdev;
> +pcibus_t offset;
> +uint8_t bar;
> +MemoryRegion *mem;
> +} VFIOAscendBarQuirk;
> +
> +static uint64_t vfio_ascend_quirk_read(void *opaque,
> +   hwaddr addr, unsigned size)
> +{
> +VFIOAscendBarQuirk *quirk = opaque;
> +VFIOPCIDevice *vdev = quirk->vdev;
> +
> +return vfio_region_read(&vdev->bars[quirk->bar].region,
> +addr + quirk->offset, size);
> +}
> +
> +static void vfio_ascend_quirk_write(void *opaque, hwaddr addr,
> +uint64_t data, unsigned size)
> +{
> +}
> +
> +static const MemoryRegionOps vfio_ascend_intercept_regs_quirk = {
> +.read = vfio_ascend_quirk_read,
> +.write = vfio_ascend_quirk_write,
> +.endianness = DEVICE_LITTLE_ENDIAN,
> +};
> +
> +static void vfio_probe_ascend_bar0_quirk(VFIOPCIDevice *vdev, int nr)
> +{
> +VFIOQuirk *quirk;
> +VFIOAscendBarQuirk *bar0_quirk;
> +
> +if (!vfio_pci_is(vdev, PCI_VENDOR_ID_HUAWEI, PCI_DEVICE_ID_ASCEND910) ||
> +nr != 0) {
> +return;
> +}
> +
> +quirk = g_malloc0(sizeof(*quirk));
> +quirk->nr_mem = 1;
> +quirk->mem = g_new0(MemoryRegion, quirk->nr_mem);
> +bar0_quirk = quirk->data = g_new0(typeof(*bar0_quirk), quirk->nr_mem);
> +bar0_quirk[0].vdev = vdev;
> +bar0_quirk[0].offset = ASCEND910_XLOADER_OFFSET;
> +bar0_quirk[0].bar = nr;
> +
> +/*
> + * intercept w/r to the xloader-updating register,
> + * so the vm can't enable xloader-updating
> + */
> +memory_region_init_io(&quirk->mem[0], OBJECT(vdev),
> +  &vfio_ascend_intercept_regs_quirk,
> +  &bar0_quirk[0],
> +  "vfio-ascend-bar0-intercept-regs-quirk",
> +  ASCEND910_XLOADER_SIZE);
> +memory_region_add_subregion_overlap(vdev->bars[nr].region.mem,
> +bar0_quirk[0].offset,
> +&quirk->mem[0], 1);
> +QLIST_INSERT_HEAD(&vdev->bars[nr].quirks, quirk, next);
> +}
> +
> +static void vfio_probe_ascend_bar4_quirk(VFIOPCIDevice *vdev, int nr)
> +{
> +VFIOQuirk *quirk;
> +VFIOAscendBarQuirk *bar4_quirk;
> +
> +if (!vfio_pci_is(vdev, PCI_VENDOR_ID_HUAWEI, PCI_DEVICE_ID_ASCEND310) ||
> +nr != 4) {
> +return;
> +}
> +
> +quirk = g_malloc0(sizeof(*quirk));
> +quirk->nr_mem = 1;
> +quirk->mem = g_new0(MemoryRegion, quirk->nr_mem);
> +bar4_quirk = quirk->data = g_new0(typeof(*bar4_quirk), quirk->nr_mem);
> +bar4_quirk[0].vdev = vdev;
> +bar4_quirk[0].offset = ASCEND310_XLOADER_OFFSET;
> +bar4_quirk[0].bar = nr;
> +
> +/*
> + * intercept w/r to the xloader-updating register,
> + * so the vm can't enable xloader-updating
> + */
> +memory_region_init_io(&quirk->mem[0], OBJECT(vdev),
> +  &vfio_ascend_intercept_regs_quirk,
> +  &bar4_quirk[0],
> +  "vfio-ascend-bar4-intercept-regs-quirk",
> +  ASCEND310_XLOADER_SIZE);
> +memory_region_add_subregion_overlap(vdev->bars[nr].region.mem,
> +bar4_quirk[0].offset,
> +&quirk->mem[0], 1);
> +QLIST_INSERT_HEAD(&vdev->bars[nr].quirks, quirk, next);
> +}
> +
>  /*
>  

Re: Out-of-Process Device Emulation session at KVM Forum 2020

2020-10-29 Thread Alex Williamson
On Fri, 30 Oct 2020 09:11:23 +0800
Jason Wang  wrote:

> On 2020/10/29 下午11:46, Alex Williamson wrote:
> > On Thu, 29 Oct 2020 23:09:33 +0800
> > Jason Wang  wrote:
> >  
> >> On 2020/10/29 下午10:31, Alex Williamson wrote:  
> >>> On Thu, 29 Oct 2020 21:02:05 +0800
> >>> Jason Wang  wrote:
> >>> 
>  On 2020/10/29 下午8:08, Stefan Hajnoczi wrote:  
> > Here are notes from the session:
> >
> > protocol stability:
> >* vhost-user already exists for existing third-party applications
> >* vfio-user is more general but will take more time to develop
> >* libvfio-user can be provided to allow device implementations
> >
> > management:
> >* Should QEMU launch device emulation processes?
> >* Nicer user experience
> >* Technical blockers: forking, hotplug, security is hard once
> > QEMU has started running
> >* Probably requires a new process model with a long-running
> > QEMU management process proxying QMP requests to the emulator process
> >
> > migration:
> >* dbus-vmstate
> >* VFIO live migration ioctls
> >* Source device can continue if migration fails
> >* Opaque blobs are transferred to destination, destination 
> > can
> > fail migration if it decides the blobs are incompatible  
>  I'm not sure this can work:
> 
>  1) Reading something that is opaque to userspace is probably a hint of
>  bad uAPI design
>  2) Did qemu even try to migrate opaque blobs before? It's probably a bad
>  design of migration protocol as well.
> 
>  It looks to me like having a migration driver in qemu that can clearly
>  define each byte in the migration stream is a better approach.  
> >>> Any time during the previous two years of development might have been a
> >>> more appropriate time to express your doubts.  
> >>
> >> Somehow I did that in this series[1]. But the main issue is still there.  
> > That series is related to a migration compatibility interface, not the
> > migration data itself.  
> 
> 
> They are not independent. The compatibility interface design depends on 
> the migration data design. I raised the uAPI issue in that thread but 
> got no response.
> 
> 
> >  
> >> Is it legal to have a uAPI that turns out to be opaque to userspace?
> >> (VFIO seems to be the first.) If it's not, the only choice is to do
> >> that in Qemu.  
> > So you're suggesting that any time the kernel is passing through opaque
> > data that gets interpreted by some entity elsewhere, potentially with
> > proprietary code, that we're in legal jeopardy?  VFIO is certainly not
> > the first to do that (storage and network devices come to mind).
> > Devices are essentially opaque data themselves, vfio provides access to
> > (ex.) BARs, but the interpretation of what resides in that BAR is device
> > specific.  Sometimes it's defined in a public datasheet, sometimes not.
> > Suggesting that we can't move opaque data through a uAPI seems rather
> > absurd.  
> 
> 
> No, I think we are talking about different things. What I meant is that 
> the data carried via the uAPI should not be opaque to userspace. What you 
> said here is actually a good example of this. When you expose a BAR to 
> userspace, there should be a driver running in userspace that knows the 
> semantics of the BAR, so it's not opaque to userspace.


But the thing running in userspace might be QEMU, which doesn't know
the semantics of the BAR; it might not be until a driver runs in the guest
that we have something that understands the BAR semantics beyond opaque
data.  We might have nested guests, so the data could be passed through
multiple userspaces as opaque data.  The requirement makes no sense.


> >>> Note that we're not talking about vDPA devices here, we're talking
> >>> about arbitrary devices with arbitrary state.  Some degree of migration
> >>> support for assigned devices can be implemented in QEMU, Alex Graf
> >>> proved this several years ago with i40evf.  Years later, we don't have
> >>> any vendors proposing device specific migration code for QEMU.  
> >>
> >> Yes but it's not necessarily VFIO as well.  
> > I don't know what this means.  
> 
> 
> I meant we can't assume VFIO is the only uAPI that will be used by Qemu.

 
And we don't; DPDK, SPDK, and various other userspaces exist.  All can take
advantage of the migration uAPI that we've developed rather than
implementing device specific code in their projects.  I'm not sure how
this is strengthening your argument for device specific migration code
in QEMU, which would need to be replicated in every other userspace.  As
opaque data with a well defined protocol, each userspace can implement
support for this migration protocol once and it should work independent
of the device or vendor.  It only requires support in the code
implementing the device, which is already necessarily device specific.


> >>> Clearly 

Question on UEFI ACPI tables setup and probing on arm64

2020-10-29 Thread Ying Fang

Hi,

I have a question on UEFI/ACPI tables setup and probing on arm64 platform.

Currently, on the arm64 platform, a guest can be booted with both fdt and
ACPI supported. If ACPI is enabled, [1] says the only defined method for
passing ACPI tables to the kernel is via the UEFI system configuration
table. So AFAIK, ACPI should be dependent on UEFI.

What's more, [2] says UEFI kernel support on the ARM architectures
is only available through a *stub*. The stub populates the FDT /chosen
node with UEFI parameters describing the UEFI location info.

So I dumped /sys/firmware/fdt from the guest, and it does have something like:

/dts-v1/;

/ {
#size-cells = <0x02>;
#address-cells = <0x02>;

chosen {
linux,uefi-mmap-desc-ver = <0x01>;
linux,uefi-mmap-desc-size = <0x30>;
linux,uefi-mmap-size = <0x810>;
linux,uefi-mmap-start = <0x04 0x3c0ce018>;
linux,uefi-system-table = <0x04 0x3f8b0018>;
		bootargs = "BOOT_IMAGE=/vmlinuz-4.19.90-2003.4.0.0036.oe1.aarch64 
root=/dev/mapper/openeuler-root ro rd.lvm.lv=openeuler/root 
rd.lvm.lv=openeuler/swap video=VGA-1:640x480-32@60me 
smmu.bypassdev=0x1000:0x17 smmu.bypassdev=0x1000:0x15 
crashkernel=1024M,high video=efifb:off video=VGA-1:640x480-32@60me";

linux,initrd-end = <0x04 0x3a85a5da>;
linux,initrd-start = <0x04 0x392f2000>;
};
};

But the question is that I did not see any code adding the UEFI
properties to the fdt /chosen node in *arm_load_dtb* or anywhere else.
QEMU only maps the OVMF binary file into a pflash device.
So I'm really confused about how UEFI information is provided to the
guest by QEMU. Does anybody know the details about it?

[1] https://www.kernel.org/doc/html/latest/arm64/arm-acpi.html
[2] https://www.kernel.org/doc/Documentation/arm/uefi.rst

Thanks.
Ying



[PATCH] net/l2tpv3: Remove redundant check in net_init_l2tpv3()

2020-10-29 Thread AlexChen
The result has already been checked against NULL earlier, so it cannot be
NULL here and the check is redundant. Remove it.

Reported-by: Euler Robot 
Signed-off-by: AlexChen 
---
 net/l2tpv3.c | 9 +++--
 1 file changed, 3 insertions(+), 6 deletions(-)

diff --git a/net/l2tpv3.c b/net/l2tpv3.c
index 55fea17c0f..e4d4218db6 100644
--- a/net/l2tpv3.c
+++ b/net/l2tpv3.c
@@ -655,9 +655,8 @@ int net_init_l2tpv3(const Netdev *netdev,
 error_setg(errp, "could not bind socket err=%i", errno);
 goto outerr;
 }
-if (result) {
-freeaddrinfo(result);
-}
+
+freeaddrinfo(result);

 memset(&hints, 0, sizeof(hints));

@@ -686,9 +685,7 @@ int net_init_l2tpv3(const Netdev *netdev,
 memcpy(s->dgram_dst, result->ai_addr, result->ai_addrlen);
 s->dst_size = result->ai_addrlen;

-if (result) {
-freeaddrinfo(result);
-}
+freeaddrinfo(result);

 if (l2tpv3->has_counter && l2tpv3->counter) {
 s->has_counter = true;
-- 
2.19.1



RE: [PATCH] tcg/optimize: Add fallthrough annotations

2020-10-29 Thread Chenqun (kuhn)
> -Original Message-
> From: Richard Henderson [mailto:richard.hender...@linaro.org]
> Sent: Friday, October 30, 2020 4:07 AM
> To: Thomas Huth ; Richard Henderson ;
> qemu-devel@nongnu.org
> Cc: Chenqun (kuhn) 
> Subject: Re: [PATCH] tcg/optimize: Add fallthrough annotations
> 
> On 10/29/20 5:28 AM, Thomas Huth wrote:
> > To be able to compile this file with -Werror=implicit-fallthrough, we
> > need to add some fallthrough annotations to the case statements that
> > might fall through. Unfortunately, the typical "/* fallthrough */"
> > comments do not work here as expected since some case labels are
> > wrapped in macros and the compiler fails to match the comments in this
> > case. But using __attribute__((fallthrough)) seems to work fine, so
> > let's use that instead.
> 
> Why would the macro matter?  It expands to two case statements with
> nothing in between them.
> 
> This sounds like a compiler bug that should be reported.
> 
Hi all,
  I have checked the GCC documentation for the -Wimplicit-fallthrough option 
and verified its behavior.
The value of -Wimplicit-fallthrough ranges from 0 to 5. 
The value 0 ignores all warnings, which is certainly not what we need.
If the value is set to 1 or 2, most fallthrough comments in QEMU take effect, 
e.g. /* FALLTHRU */, /* fallthru */, /* fall-through */, /* FALLTHOUGH */, 
/* fall through */, /* fallthrough */, etc.

When the value ranges from 3 to 5, more fallthrough comments become invalid as 
the value increases.

So, I agree with Philippe's suggestion to add a QEMU_FALLTHROUGH macro to 
unify this compiler annotation.

Thanks,
Chen Qun

Additional gcc information is as follows:
https://gcc.gnu.org/onlinedocs/gcc/Warning-Options.html
-Wimplicit-fallthrough is the same as -Wimplicit-fallthrough=3 and 
-Wno-implicit-fallthrough is the same as -Wimplicit-fallthrough=0.

The option argument n specifies what kind of comments are accepted:
-Wimplicit-fallthrough=0 disables the warning altogether.
-Wimplicit-fallthrough=1 matches .* regular expression, any comment is used as 
fallthrough comment.
-Wimplicit-fallthrough=2 case insensitively matches .*falls?[ 
\t-]*thr(ough|u).* regular expression.
-Wimplicit-fallthrough=3 case sensitively matches one of the following regular 
expressions:
   -fallthrough
   @fallthrough@
   lint -fallthrough[ \t]*
   [ \t.!]*(ELSE,? |INTENTIONAL(LY)? )?
   FALL(S | |-)?THR(OUGH|U)[ \t.!]*(-[^\n\r]*)?
   [ \t.!]*(Else,? |Intentional(ly)? )?
   Fall((s | |-)[Tt]|t)hr(ough|u)[ \t.!]*(-[^\n\r]*)?
   [ \t.!]*([Ee]lse,? |[Ii]ntentional(ly)? )?
   fall(s | |-)?thr(ough|u)[ \t.!]*(-[^\n\r]*)?
-Wimplicit-fallthrough=4 case sensitively matches one of the following regular 
expressions:
-fallthrough
@fallthrough@
lint -fallthrough[ \t]*
[ \t]*FALLTHR(OUGH|U)[ \t]*
-Wimplicit-fallthrough=5 doesn’t recognize any comments as fallthrough 
comments, only attributes disable the warning.


[PATCH v8 11/11] hw/block/nvme: Document zoned parameters in usage text

2020-10-29 Thread Dmitry Fomichev
Added brief descriptions of the new device properties that are
now available to users to configure features of the Zoned Namespace
Command Set in the emulator.

This patch is for documentation only, no functionality change.

Signed-off-by: Dmitry Fomichev 
Reviewed-by: Niklas Cassel 
---
 hw/block/nvme.c | 47 ++-
 1 file changed, 42 insertions(+), 5 deletions(-)

diff --git a/hw/block/nvme.c b/hw/block/nvme.c
index 339becd3e2..10f5c752ba 100644
--- a/hw/block/nvme.c
+++ b/hw/block/nvme.c
@@ -9,7 +9,7 @@
  */
 
 /**
- * Reference Specs: http://www.nvmexpress.org, 1.2, 1.1, 1.0e
+ * Reference Specs: http://www.nvmexpress.org, 1.4, 1.3, 1.2, 1.1, 1.0e
  *
  *  https://nvmexpress.org/developers/nvme-specification/
  */
@@ -22,8 +22,9 @@
  *  [pmrdev=,] \
  *  max_ioqpairs=, \
  *  aerl=, aer_max_queued=, \
- *  mdts=
- *  -device nvme-ns,drive=,bus=bus_name,nsid=
+ *  mdts=,zoned.append_size_limit= \
+ *  -device nvme-ns,drive=,bus=,nsid=,\
+ *  zoned=
  *
  * Note cmb_size_mb denotes size of CMB in MB. CMB is assumed to be at
  * offset 0 in BAR2 and supports only WDS, RDS and SQS for now.
@@ -41,14 +42,50 @@
  * ~~
  * - `aerl`
  *   The Asynchronous Event Request Limit (AERL). Indicates the maximum number
- *   of concurrently outstanding Asynchronous Event Request commands suppoert
+ *   of concurrently outstanding Asynchronous Event Request commands support
  *   by the controller. This is a 0's based value.
  *
  * - `aer_max_queued`
  *   This is the maximum number of events that the device will enqueue for
- *   completion when there are no oustanding AERs. When the maximum number of
+ *   completion when there are no outstanding AERs. When the maximum number of
  *   enqueued events are reached, subsequent events will be dropped.
  *
+ * - `zoned.append_size_limit`
+ *   The maximum I/O size in bytes that is allowed in Zone Append command.
+ *   The default is 128KiB. Since internally this this value is maintained as
+ *   ZASL = log2( / ), some values assigned
+ *   to this property may be rounded down and result in a lower maximum ZA
+ *   data size being in effect. By setting this property to 0, users can make
+ *   ZASL to be equal to MDTS. This property only affects zoned namespaces.
+ *
+ * Setting `zoned` to true selects Zoned Command Set at the namespace.
+ * In this case, the following namespace properties are available to configure
+ * zoned operation:
+ * zoned.zsze=
+ * The number may be followed by K, M, G as in kilo-, mega- or giga-.
+ *
+ * zoned.zcap=
+ * The value 0 (default) forces zone capacity to be the same as zone
+ * size. The value of this property may not exceed zone size.
+ *
+ * zoned.descr_ext_size=
+ * This value needs to be specified in 64B units. If it is zero,
+ * namespace(s) will not support zone descriptor extensions.
+ *
+ * zoned.max_active=
+ * The default value means there is no limit to the number of
+ * concurrently active zones.
+ *
+ * zoned.max_open=
+ * The default value means there is no limit to the number of
+ * concurrently open zones.
+ *
+ * zoned.offline_zones=
+ *
+ * zoned.rdonly_zones=
+ *
+ * zoned.cross_zone_read=
+ * Setting this property to true enables Read Across Zone Boundaries.
  */
 
 #include "qemu/osdep.h"
-- 
2.21.0




[PATCH v8 06/11] hw/block/nvme: Support allocated CNS command variants

2020-10-29 Thread Dmitry Fomichev
From: Niklas Cassel 

Many CNS commands have "allocated" command variants. These include
a namespace as long as it is allocated; that is, a namespace is
included regardless of whether it is active (attached) or not.

While these commands are optional (they are mandatory for controllers
supporting the namespace attachment command), our QEMU implementation
is more complete by actually providing support for these CNS values.

However, since our QEMU model currently does not support the namespace
attachment command, these new allocated CNS commands will return the
same result as the active CNS command variants.

In NVMe, a namespace is active if it exists and is attached to the
controller.

Add a new Boolean namespace flag, "attached", to provide the most
basic namespace attachment support. The default value for this new
flag is true. Also, implement the logic in the new CNS values to
include/exclude namespaces based on this new property. The only thing
missing is hooking up the actual Namespace Attachment command opcode,
which will allow a user to toggle the "attached" flag per namespace.

The reason for not hooking up this command completely is because the
NVMe specification requires the namespace management command to be
supported if the namespace attachment command is supported.

Signed-off-by: Niklas Cassel 
Signed-off-by: Dmitry Fomichev 
Reviewed-by: Keith Busch 
---
 hw/block/nvme-ns.c   |  1 +
 hw/block/nvme-ns.h   |  1 +
 hw/block/nvme.c  | 68 
 include/block/nvme.h | 20 +++--
 4 files changed, 70 insertions(+), 20 deletions(-)

diff --git a/hw/block/nvme-ns.c b/hw/block/nvme-ns.c
index c0362426cc..e191ef9be0 100644
--- a/hw/block/nvme-ns.c
+++ b/hw/block/nvme-ns.c
@@ -42,6 +42,7 @@ static void nvme_ns_init(NvmeNamespace *ns)
 id_ns->nsze = cpu_to_le64(nvme_ns_nlbas(ns));
 
 ns->csi = NVME_CSI_NVM;
+ns->attached = true;
 
 /* no thin provisioning */
 id_ns->ncap = id_ns->nsze;
diff --git a/hw/block/nvme-ns.h b/hw/block/nvme-ns.h
index d795e44bab..2d9cd29d07 100644
--- a/hw/block/nvme-ns.h
+++ b/hw/block/nvme-ns.h
@@ -31,6 +31,7 @@ typedef struct NvmeNamespace {
 int64_t  size;
 NvmeIdNs id_ns;
 const uint32_t *iocs;
+bool attached;
 uint8_t  csi;
 
 NvmeNamespaceParams params;
diff --git a/hw/block/nvme.c b/hw/block/nvme.c
index d9e9fd264c..7b3eafad03 100644
--- a/hw/block/nvme.c
+++ b/hw/block/nvme.c
@@ -1084,6 +1084,9 @@ static uint16_t nvme_io_cmd(NvmeCtrl *n, NvmeRequest *req)
 if (unlikely(!req->ns)) {
 return NVME_INVALID_FIELD | NVME_DNR;
 }
+if (!req->ns->attached) {
+return NVME_INVALID_FIELD | NVME_DNR;
+}
 
 if (!(req->ns->iocs[req->cmd.opcode] & NVME_CMD_EFF_CSUPP)) {
 trace_pci_nvme_err_invalid_opc(req->cmd.opcode);
@@ -1245,6 +1248,7 @@ static uint16_t nvme_smart_info(NvmeCtrl *n, uint8_t rae, 
uint32_t buf_len,
 uint32_t trans_len;
 NvmeNamespace *ns;
 time_t current_ms;
+int i;
 
 if (off >= sizeof(smart)) {
 return NVME_INVALID_FIELD | NVME_DNR;
@@ -1255,15 +1259,18 @@ static uint16_t nvme_smart_info(NvmeCtrl *n, uint8_t 
rae, uint32_t buf_len,
 if (!ns) {
 return NVME_INVALID_NSID | NVME_DNR;
 }
-nvme_set_blk_stats(ns, &stats);
+if (ns->attached) {
+nvme_set_blk_stats(ns, &stats);
+}
 } else {
-int i;
-
 for (i = 1; i <= n->num_namespaces; i++) {
 ns = nvme_ns(n, i);
 if (!ns) {
 continue;
 }
+if (!ns->attached) {
+continue;
+}
 nvme_set_blk_stats(ns, &stats);
 }
 }
@@ -1560,7 +1567,8 @@ static uint16_t nvme_identify_ctrl_csi(NvmeCtrl *n, 
NvmeRequest *req)
 return NVME_INVALID_FIELD | NVME_DNR;
 }
 
-static uint16_t nvme_identify_ns(NvmeCtrl *n, NvmeRequest *req)
+static uint16_t nvme_identify_ns(NvmeCtrl *n, NvmeRequest *req,
+ bool only_active)
 {
 NvmeNamespace *ns;
 NvmeIdentify *c = (NvmeIdentify *)&req->cmd;
@@ -1577,11 +1585,16 @@ static uint16_t nvme_identify_ns(NvmeCtrl *n, 
NvmeRequest *req)
 return nvme_rpt_empty_id_struct(n, req);
 }
 
+if (only_active && !ns->attached) {
+return nvme_rpt_empty_id_struct(n, req);
+}
+
 return nvme_dma(n, (uint8_t *)&ns->id_ns, sizeof(NvmeIdNs),
 DMA_DIRECTION_FROM_DEVICE, req);
 }
 
-static uint16_t nvme_identify_ns_csi(NvmeCtrl *n, NvmeRequest *req)
+static uint16_t nvme_identify_ns_csi(NvmeCtrl *n, NvmeRequest *req,
+ bool only_active)
 {
 NvmeNamespace *ns;
 NvmeIdentify *c = (NvmeIdentify *)&req->cmd;
@@ -1598,6 +1611,10 @@ static uint16_t nvme_identify_ns_csi(NvmeCtrl *n, 
NvmeRequest *req)
 return nvme_rpt_empty_id_struct(n, req);
 }
 
+if (only_active && !ns->attached) {
+return nvme_rpt_empty

[PATCH v8 09/11] hw/block/nvme: Support Zone Descriptor Extensions

2020-10-29 Thread Dmitry Fomichev
A Zone Descriptor Extension is a label that can be assigned to a zone.
It can be set on an Empty zone and stays assigned until the zone
is reset.

This commit adds a new optional module property,
"zoned.descr_ext_size". Its value must be a multiple of 64 bytes.
If this value is non-zero, it becomes possible to assign extensions
of that size to any Empty zones. The default value for this property
is 0, therefore setting extensions is disabled by default.

Signed-off-by: Hans Holmberg 
Signed-off-by: Dmitry Fomichev 
Reviewed-by: Klaus Jensen 
Reviewed-by: Niklas Cassel 
---
 hw/block/nvme-ns.c| 25 +++--
 hw/block/nvme-ns.h|  8 +++
 hw/block/nvme.c   | 51 +--
 hw/block/trace-events |  2 ++
 4 files changed, 82 insertions(+), 4 deletions(-)

diff --git a/hw/block/nvme-ns.c b/hw/block/nvme-ns.c
index 2e45838c15..85dc73cf06 100644
--- a/hw/block/nvme-ns.c
+++ b/hw/block/nvme-ns.c
@@ -133,6 +133,18 @@ static int nvme_calc_zone_geometry(NvmeNamespace *ns, 
Error **errp)
 return -1;
 }
 
+if (ns->params.zd_extension_size) {
+if (ns->params.zd_extension_size & 0x3f) {
+error_setg(errp,
+"zone descriptor extension size must be a multiple of 64B");
+return -1;
+}
+if ((ns->params.zd_extension_size >> 6) > 0xff) {
+error_setg(errp, "zone descriptor extension size is too large");
+return -1;
+}
+}
+
 return 0;
 }
 
@@ -144,6 +156,10 @@ static void nvme_init_zone_state(NvmeNamespace *ns)
 int i;
 
 ns->zone_array = g_malloc0(ns->zone_array_size);
+if (ns->params.zd_extension_size) {
+ns->zd_extensions = g_malloc0(ns->params.zd_extension_size *
+  ns->num_zones);
+}
 
 QTAILQ_INIT(&ns->exp_open_zones);
 QTAILQ_INIT(&ns->imp_open_zones);
@@ -186,7 +202,8 @@ static int nvme_zoned_init_ns(NvmeCtrl *n, NvmeNamespace 
*ns, int lba_index,
 id_ns_z->ozcs = ns->params.cross_zone_read ? 0x01 : 0x00;
 
 id_ns_z->lbafe[lba_index].zsze = cpu_to_le64(ns->zone_size);
-id_ns_z->lbafe[lba_index].zdes = 0;
+id_ns_z->lbafe[lba_index].zdes =
+ns->params.zd_extension_size >> 6; /* Units of 64B */
 
 ns->csi = NVME_CSI_ZONED;
 ns->id_ns.nsze = cpu_to_le64(ns->zone_size * ns->num_zones);
@@ -204,7 +221,8 @@ static void nvme_clear_zone(NvmeNamespace *ns, NvmeZone 
*zone)
 
 zone->w_ptr = zone->d.wp;
 state = nvme_get_zone_state(zone);
-if (zone->d.wp != zone->d.zslba) {
+if (zone->d.wp != zone->d.zslba ||
+(zone->d.za & NVME_ZA_ZD_EXT_VALID)) {
 if (state != NVME_ZONE_STATE_CLOSED) {
 trace_pci_nvme_clear_ns_close(state, zone->d.zslba);
 nvme_set_zone_state(zone, NVME_ZONE_STATE_CLOSED);
@@ -301,6 +319,7 @@ void nvme_ns_cleanup(NvmeNamespace *ns)
 if (ns->params.zoned) {
 g_free(ns->id_ns_zoned);
 g_free(ns->zone_array);
+g_free(ns->zd_extensions);
 }
 }
 
@@ -332,6 +351,8 @@ static Property nvme_ns_props[] = {
params.max_active_zones, 0),
 DEFINE_PROP_UINT32("zoned.max_open", NvmeNamespace,
params.max_open_zones, 0),
+DEFINE_PROP_UINT32("zoned.descr_ext_size", NvmeNamespace,
+   params.zd_extension_size, 0),
 DEFINE_PROP_END_OF_LIST(),
 };
 
diff --git a/hw/block/nvme-ns.h b/hw/block/nvme-ns.h
index 421bab0a57..50a6a0e1ac 100644
--- a/hw/block/nvme-ns.h
+++ b/hw/block/nvme-ns.h
@@ -35,6 +35,7 @@ typedef struct NvmeNamespaceParams {
 uint64_t zone_cap_bs;
 uint32_t max_active_zones;
 uint32_t max_open_zones;
+uint32_t zd_extension_size;
 } NvmeNamespaceParams;
 
 typedef struct NvmeNamespace {
@@ -58,6 +59,7 @@ typedef struct NvmeNamespace {
 uint64_tzone_capacity;
 uint64_tzone_array_size;
 uint32_tzone_size_log2;
+uint8_t *zd_extensions;
 int32_t nr_open_zones;
 int32_t nr_active_zones;
 
@@ -127,6 +129,12 @@ static inline bool nvme_wp_is_valid(NvmeZone *zone)
st != NVME_ZONE_STATE_OFFLINE;
 }
 
+static inline uint8_t *nvme_get_zd_extension(NvmeNamespace *ns,
+ uint32_t zone_idx)
+{
+return &ns->zd_extensions[zone_idx * ns->params.zd_extension_size];
+}
+
 static inline void nvme_aor_inc_open(NvmeNamespace *ns)
 {
 assert(ns->nr_open_zones >= 0);
diff --git a/hw/block/nvme.c b/hw/block/nvme.c
index 485ac6fc40..339becd3e2 100644
--- a/hw/block/nvme.c
+++ b/hw/block/nvme.c
@@ -1711,6 +1711,26 @@ static uint16_t nvme_offline_zone(NvmeNamespace *ns, 
NvmeZone *zone,
 return NVME_ZONE_INVAL_TRANSITION;
 }
 
+static uint16_t nvme_set_zd_ext(NvmeNamespace *ns, NvmeZone *zone)
+{
+uint16_t status;
+uint8_t state = nvme_get_zone_state(zone);
+
+if (state == NVME_ZONE_STATE_EMPTY) {
+nvme_auto_transition_zone(ns, false, true);

[PATCH v8 08/11] hw/block/nvme: Introduce max active and open zone limits

2020-10-29 Thread Dmitry Fomichev
Add two module properties, "zoned.max_active" and "zoned.max_open"
to control the maximum number of zones that can be active or open.
Once these variables are set to non-default values, these limits are
checked during I/O and Too Many Active or Too Many Open command status
is returned if they are exceeded.

Signed-off-by: Hans Holmberg 
Signed-off-by: Dmitry Fomichev 
Reviewed-by: Niklas Cassel 
---
 hw/block/nvme-ns.c| 30 +-
 hw/block/nvme-ns.h| 41 +++
 hw/block/nvme.c   | 94 +++
 hw/block/trace-events |  2 +
 4 files changed, 165 insertions(+), 2 deletions(-)

diff --git a/hw/block/nvme-ns.c b/hw/block/nvme-ns.c
index e6db7f7d3b..2e45838c15 100644
--- a/hw/block/nvme-ns.c
+++ b/hw/block/nvme-ns.c
@@ -119,6 +119,20 @@ static int nvme_calc_zone_geometry(NvmeNamespace *ns, 
Error **errp)
 ns->zone_size_log2 = 63 - clz64(ns->zone_size);
 }
 
+/* Make sure that the values of all ZNS properties are sane */
+if (ns->params.max_open_zones > nz) {
+error_setg(errp,
+   "max_open_zones value %u exceeds the number of zones %u",
+   ns->params.max_open_zones, nz);
+return -1;
+}
+if (ns->params.max_active_zones > nz) {
+error_setg(errp,
+   "max_active_zones value %u exceeds the number of zones %u",
+   ns->params.max_active_zones, nz);
+return -1;
+}
+
 return 0;
 }
 
@@ -166,8 +180,8 @@ static int nvme_zoned_init_ns(NvmeCtrl *n, NvmeNamespace *ns, int lba_index,
 id_ns_z = g_malloc0(sizeof(NvmeIdNsZoned));
 
 /* MAR/MOR are zeroes-based, 0xffffffff means no limit */
-id_ns_z->mar = 0x;
-id_ns_z->mor = 0x;
+id_ns_z->mar = cpu_to_le32(ns->params.max_active_zones - 1);
+id_ns_z->mor = cpu_to_le32(ns->params.max_open_zones - 1);
 id_ns_z->zoc = 0;
 id_ns_z->ozcs = ns->params.cross_zone_read ? 0x01 : 0x00;
 
@@ -195,6 +209,7 @@ static void nvme_clear_zone(NvmeNamespace *ns, NvmeZone *zone)
 trace_pci_nvme_clear_ns_close(state, zone->d.zslba);
 nvme_set_zone_state(zone, NVME_ZONE_STATE_CLOSED);
 }
+nvme_aor_inc_active(ns);
 QTAILQ_INSERT_HEAD(&ns->closed_zones, zone, entry);
 } else {
 trace_pci_nvme_clear_ns_reset(state, zone->d.zslba);
@@ -211,16 +226,23 @@ static void nvme_zoned_ns_shutdown(NvmeNamespace *ns)
 
 QTAILQ_FOREACH_SAFE(zone, &ns->closed_zones, entry, next) {
 QTAILQ_REMOVE(&ns->closed_zones, zone, entry);
+nvme_aor_dec_active(ns);
 nvme_clear_zone(ns, zone);
 }
 QTAILQ_FOREACH_SAFE(zone, &ns->imp_open_zones, entry, next) {
 QTAILQ_REMOVE(&ns->imp_open_zones, zone, entry);
+nvme_aor_dec_open(ns);
+nvme_aor_dec_active(ns);
 nvme_clear_zone(ns, zone);
 }
 QTAILQ_FOREACH_SAFE(zone, &ns->exp_open_zones, entry, next) {
 QTAILQ_REMOVE(&ns->exp_open_zones, zone, entry);
+nvme_aor_dec_open(ns);
+nvme_aor_dec_active(ns);
 nvme_clear_zone(ns, zone);
 }
+
+assert(ns->nr_open_zones == 0);
 }
 
 static int nvme_ns_check_constraints(NvmeNamespace *ns, Error **errp)
@@ -306,6 +328,10 @@ static Property nvme_ns_props[] = {
 DEFINE_PROP_SIZE("zoned.zcap", NvmeNamespace, params.zone_cap_bs, 0),
 DEFINE_PROP_BOOL("zoned.cross_read", NvmeNamespace,
  params.cross_zone_read, false),
+DEFINE_PROP_UINT32("zoned.max_active", NvmeNamespace,
+   params.max_active_zones, 0),
+DEFINE_PROP_UINT32("zoned.max_open", NvmeNamespace,
+   params.max_open_zones, 0),
 DEFINE_PROP_END_OF_LIST(),
 };
 
diff --git a/hw/block/nvme-ns.h b/hw/block/nvme-ns.h
index d2631ff5a3..421bab0a57 100644
--- a/hw/block/nvme-ns.h
+++ b/hw/block/nvme-ns.h
@@ -33,6 +33,8 @@ typedef struct NvmeNamespaceParams {
 bool cross_zone_read;
 uint64_t zone_size_bs;
 uint64_t zone_cap_bs;
+uint32_t max_active_zones;
+uint32_t max_open_zones;
 } NvmeNamespaceParams;
 
 typedef struct NvmeNamespace {
@@ -56,6 +58,8 @@ typedef struct NvmeNamespace {
 uint64_t    zone_capacity;
 uint64_t    zone_array_size;
 uint32_t    zone_size_log2;
+int32_t nr_open_zones;
+int32_t nr_active_zones;
 
 NvmeNamespaceParams params;
 } NvmeNamespace;
@@ -123,6 +127,43 @@ static inline bool nvme_wp_is_valid(NvmeZone *zone)
st != NVME_ZONE_STATE_OFFLINE;
 }
 
+static inline void nvme_aor_inc_open(NvmeNamespace *ns)
+{
+assert(ns->nr_open_zones >= 0);
+if (ns->params.max_open_zones) {
+ns->nr_open_zones++;
+assert(ns->nr_open_zones <= ns->params.max_open_zones);
+}
+}
+
+static inline void nvme_aor_dec_open(NvmeNamespace *ns)
+{
+if (ns->params.max_open_zones) {
+assert(ns->nr_open_zones > 0);
+ns->nr_open_zones--;
+}
+assert(ns->nr_open_zones >=

[PATCH v8 05/11] hw/block/nvme: Add support for Namespace Types

2020-10-29 Thread Dmitry Fomichev
From: Niklas Cassel 

Define the structures and constants required to implement
Namespace Types support.

Namespace Types introduce a new command set, "I/O Command Sets",
that allows the host to retrieve the command sets associated with
a namespace. Introduce support for the command set and enable
detection for the NVM Command Set.

The new workflows for identify commands rely heavily on zero-filled
identify structs. E.g., certain CNS commands are defined to return
a zero-filled identify struct when an inactive namespace NSID
is supplied.

Add a helper function in order to avoid code duplication when
reporting zero-filled identify structures.
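
The helper's idea can be sketched as follows. This is an illustrative
model, not the QEMU code; "DMA" is stood in for by a plain copy into a
host buffer so the behavior can be checked without a guest:

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define IDENTIFY_DATA_SIZE 4096 /* NVME_IDENTIFY_DATA_SIZE */

/*
 * Build an all-zero identify structure on the stack and hand it to
 * the (modeled) DMA routine, as required for, e.g., inactive NSIDs.
 */
static void rpt_empty_id_struct(uint8_t *host_buf)
{
    uint8_t id[IDENTIFY_DATA_SIZE] = {};

    memcpy(host_buf, id, sizeof(id)); /* stands in for nvme_dma() */
}
```

Every identify path that must report an inactive namespace can then
call this one helper instead of keeping its own zero-filled struct.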

Signed-off-by: Niklas Cassel 
Signed-off-by: Dmitry Fomichev 
Reviewed-by: Keith Busch 
---
 hw/block/nvme-ns.c|   2 +
 hw/block/nvme-ns.h|   1 +
 hw/block/nvme.c   | 188 +++---
 hw/block/trace-events |   7 ++
 include/block/nvme.h  |  66 +++
 5 files changed, 219 insertions(+), 45 deletions(-)

diff --git a/hw/block/nvme-ns.c b/hw/block/nvme-ns.c
index de735eb9f3..c0362426cc 100644
--- a/hw/block/nvme-ns.c
+++ b/hw/block/nvme-ns.c
@@ -41,6 +41,8 @@ static void nvme_ns_init(NvmeNamespace *ns)
 
 id_ns->nsze = cpu_to_le64(nvme_ns_nlbas(ns));
 
+ns->csi = NVME_CSI_NVM;
+
 /* no thin provisioning */
 id_ns->ncap = id_ns->nsze;
 id_ns->nuse = id_ns->ncap;
diff --git a/hw/block/nvme-ns.h b/hw/block/nvme-ns.h
index a38071884a..d795e44bab 100644
--- a/hw/block/nvme-ns.h
+++ b/hw/block/nvme-ns.h
@@ -31,6 +31,7 @@ typedef struct NvmeNamespace {
 int64_t  size;
 NvmeIdNs id_ns;
 const uint32_t *iocs;
+uint8_t  csi;
 
 NvmeNamespaceParams params;
 } NvmeNamespace;
diff --git a/hw/block/nvme.c b/hw/block/nvme.c
index 2a33540542..d9e9fd264c 100644
--- a/hw/block/nvme.c
+++ b/hw/block/nvme.c
@@ -1336,7 +1336,7 @@ static uint16_t nvme_error_info(NvmeCtrl *n, uint8_t rae, uint32_t buf_len,
 DMA_DIRECTION_FROM_DEVICE, req);
 }
 
-static uint16_t nvme_cmd_effects(NvmeCtrl *n, uint32_t buf_len,
+static uint16_t nvme_cmd_effects(NvmeCtrl *n, uint8_t csi, uint32_t buf_len,
  uint64_t off, NvmeRequest *req)
 {
 NvmeEffectsLog log = {};
@@ -1351,8 +1358,15 @@ static uint16_t nvme_cmd_effects(NvmeCtrl *n, uint32_t buf_len,
 switch (NVME_CC_CSS(n->bar.cc)) {
 case NVME_CC_CSS_NVM:
 src_iocs = nvme_cse_iocs_nvm;
+/* fall through */
 case NVME_CC_CSS_ADMIN_ONLY:
 break;
+case NVME_CC_CSS_CSI:
+switch (csi) {
+case NVME_CSI_NVM:
+src_iocs = nvme_cse_iocs_nvm;
+break;
+}
 }
 
 memcpy(log.acs, nvme_cse_acs, sizeof(nvme_cse_acs));
@@ -1378,6 +1385,7 @@ static uint16_t nvme_get_log(NvmeCtrl *n, NvmeRequest *req)
 uint8_t  lid = dw10 & 0xff;
 uint8_t  lsp = (dw10 >> 8) & 0xf;
 uint8_t  rae = (dw10 >> 15) & 0x1;
+uint8_t  csi = le32_to_cpu(cmd->cdw14) >> 24;
 uint32_t numdl, numdu;
 uint64_t off, lpol, lpou;
 size_t   len;
@@ -1411,7 +1419,7 @@ static uint16_t nvme_get_log(NvmeCtrl *n, NvmeRequest *req)
 case NVME_LOG_FW_SLOT_INFO:
 return nvme_fw_log_info(n, len, off, req);
 case NVME_LOG_CMD_EFFECTS:
-return nvme_cmd_effects(n, len, off, req);
+return nvme_cmd_effects(n, csi, len, off, req);
 default:
 trace_pci_nvme_err_invalid_log_page(nvme_cid(req), lid);
 return NVME_INVALID_FIELD | NVME_DNR;
@@ -1524,6 +1532,13 @@ static uint16_t nvme_create_cq(NvmeCtrl *n, NvmeRequest *req)
 return NVME_SUCCESS;
 }
 
+static uint16_t nvme_rpt_empty_id_struct(NvmeCtrl *n, NvmeRequest *req)
+{
+uint8_t id[NVME_IDENTIFY_DATA_SIZE] = {};
+
+return nvme_dma(n, id, sizeof(id), DMA_DIRECTION_FROM_DEVICE, req);
+}
+
 static uint16_t nvme_identify_ctrl(NvmeCtrl *n, NvmeRequest *req)
 {
 trace_pci_nvme_identify_ctrl();
@@ -1532,11 +1547,23 @@ static uint16_t nvme_identify_ctrl(NvmeCtrl *n, NvmeRequest *req)
 DMA_DIRECTION_FROM_DEVICE, req);
 }
 
+static uint16_t nvme_identify_ctrl_csi(NvmeCtrl *n, NvmeRequest *req)
+{
+NvmeIdentify *c = (NvmeIdentify *)&req->cmd;
+
+trace_pci_nvme_identify_ctrl_csi(c->csi);
+
+if (c->csi == NVME_CSI_NVM) {
+return nvme_rpt_empty_id_struct(n, req);
+}
+
+return NVME_INVALID_FIELD | NVME_DNR;
+}
+
 static uint16_t nvme_identify_ns(NvmeCtrl *n, NvmeRequest *req)
 {
 NvmeNamespace *ns;
 NvmeIdentify *c = (NvmeIdentify *)&req->cmd;
-NvmeIdNs *id_ns, inactive = { 0 };
 uint32_t nsid = le32_to_cpu(c->nsid);
 
 trace_pci_nvme_identify_ns(nsid);
@@ -1547,23 +1574,46 @@ static uint16_t nvme_identify_ns(NvmeCtrl *n, NvmeRequest *req)
 
 ns = nvme_ns(n, nsid);
 if (unlikely(!ns)) {
-id_ns = &inactive;
-} else {
-id_ns = &ns->id_ns;
+return nvme_rpt_empty_id_struct(n, req);
 }
 
-return nvme_dma(n, (uint8_t *)id_ns, sizeof(NvmeI

[PATCH v8 03/11] hw/block/nvme: Separate read and write handlers

2020-10-29 Thread Dmitry Fomichev
With ZNS support in place, the majority of the code in nvme_rw() has
become read- or write-specific. Move these parts to two separate
handlers, nvme_read() and nvme_write(), to make the code more
readable and to remove the multiple is_write checks that previously
existed in the I/O path.

This is a refactoring patch with no change in functionality.

Signed-off-by: Dmitry Fomichev 
Reviewed-by: Niklas Cassel 
Acked-by: Klaus Jensen 
---
 hw/block/nvme.c   | 91 ++-
 hw/block/trace-events |  3 +-
 2 files changed, 67 insertions(+), 27 deletions(-)

diff --git a/hw/block/nvme.c b/hw/block/nvme.c
index 1ec8ccb3e6..3023774484 100644
--- a/hw/block/nvme.c
+++ b/hw/block/nvme.c
@@ -963,6 +963,54 @@ static uint16_t nvme_flush(NvmeCtrl *n, NvmeRequest *req)
 return NVME_NO_COMPLETE;
 }
 
+static uint16_t nvme_read(NvmeCtrl *n, NvmeRequest *req)
+{
+NvmeRwCmd *rw = (NvmeRwCmd *)&req->cmd;
+NvmeNamespace *ns = req->ns;
+uint64_t slba = le64_to_cpu(rw->slba);
+uint32_t nlb = (uint32_t)le16_to_cpu(rw->nlb) + 1;
+uint64_t data_size = nvme_l2b(ns, nlb);
+uint64_t data_offset;
+BlockBackend *blk = ns->blkconf.blk;
+uint16_t status;
+
+trace_pci_nvme_read(nvme_cid(req), nvme_nsid(ns), nlb, data_size, slba);
+
+status = nvme_check_mdts(n, data_size);
+if (status) {
+trace_pci_nvme_err_mdts(nvme_cid(req), data_size);
+goto invalid;
+}
+
+status = nvme_check_bounds(n, ns, slba, nlb);
+if (status) {
+trace_pci_nvme_err_invalid_lba_range(slba, nlb, ns->id_ns.nsze);
+goto invalid;
+}
+
+status = nvme_map_dptr(n, data_size, req);
+if (status) {
+goto invalid;
+}
+
+data_offset = nvme_l2b(ns, slba);
+
+block_acct_start(blk_get_stats(blk), &req->acct, data_size,
+ BLOCK_ACCT_READ);
+if (req->qsg.sg) {
+req->aiocb = dma_blk_read(blk, &req->qsg, data_offset,
+  BDRV_SECTOR_SIZE, nvme_rw_cb, req);
+} else {
+req->aiocb = blk_aio_preadv(blk, data_offset, &req->iov, 0,
+nvme_rw_cb, req);
+}
+return NVME_NO_COMPLETE;
+
+invalid:
+block_acct_invalid(blk_get_stats(blk), BLOCK_ACCT_READ);
+return status | NVME_DNR;
+}
+
 static uint16_t nvme_write_zeroes(NvmeCtrl *n, NvmeRequest *req)
 {
 NvmeRwCmd *rw = (NvmeRwCmd *)&req->cmd;
@@ -988,22 +1036,19 @@ static uint16_t nvme_write_zeroes(NvmeCtrl *n, NvmeRequest *req)
 return NVME_NO_COMPLETE;
 }
 
-static uint16_t nvme_rw(NvmeCtrl *n, NvmeRequest *req)
+static uint16_t nvme_write(NvmeCtrl *n, NvmeRequest *req)
 {
 NvmeRwCmd *rw = (NvmeRwCmd *)&req->cmd;
 NvmeNamespace *ns = req->ns;
-uint32_t nlb = (uint32_t)le16_to_cpu(rw->nlb) + 1;
 uint64_t slba = le64_to_cpu(rw->slba);
-
+uint32_t nlb = (uint32_t)le16_to_cpu(rw->nlb) + 1;
 uint64_t data_size = nvme_l2b(ns, nlb);
-uint64_t data_offset = nvme_l2b(ns, slba);
-enum BlockAcctType acct = req->cmd.opcode == NVME_CMD_WRITE ?
-BLOCK_ACCT_WRITE : BLOCK_ACCT_READ;
+uint64_t data_offset;
 BlockBackend *blk = ns->blkconf.blk;
 uint16_t status;
 
-trace_pci_nvme_rw(nvme_cid(req), nvme_io_opc_str(rw->opcode),
-  nvme_nsid(ns), nlb, data_size, slba);
+trace_pci_nvme_write(nvme_cid(req), nvme_io_opc_str(rw->opcode),
+ nvme_nsid(ns), nlb, data_size, slba);
 
 status = nvme_check_mdts(n, data_size);
 if (status) {
@@ -1022,29 +1067,22 @@ static uint16_t nvme_rw(NvmeCtrl *n, NvmeRequest *req)
 goto invalid;
 }
 
-block_acct_start(blk_get_stats(blk), &req->acct, data_size, acct);
+data_offset = nvme_l2b(ns, slba);
+
+block_acct_start(blk_get_stats(blk), &req->acct, data_size,
+ BLOCK_ACCT_WRITE);
 if (req->qsg.sg) {
-if (acct == BLOCK_ACCT_WRITE) {
-req->aiocb = dma_blk_write(blk, &req->qsg, data_offset,
-   BDRV_SECTOR_SIZE, nvme_rw_cb, req);
-} else {
-req->aiocb = dma_blk_read(blk, &req->qsg, data_offset,
-  BDRV_SECTOR_SIZE, nvme_rw_cb, req);
-}
+req->aiocb = dma_blk_write(blk, &req->qsg, data_offset,
+   BDRV_SECTOR_SIZE, nvme_rw_cb, req);
 } else {
-if (acct == BLOCK_ACCT_WRITE) {
-req->aiocb = blk_aio_pwritev(blk, data_offset, &req->iov, 0,
- nvme_rw_cb, req);
-} else {
-req->aiocb = blk_aio_preadv(blk, data_offset, &req->iov, 0,
-nvme_rw_cb, req);
-}
+req->aiocb = blk_aio_pwritev(blk, data_offset, &req->iov, 0,
+ nvme_rw_cb, req);
 }
 return NVME_NO_COMPLETE;
 
 invalid:
-block_acct_invalid(blk_get_stats(ns->blkconf.blk), acct);
-return status;
+block_acct_invalid(

[PATCH v8 10/11] hw/block/nvme: Add injection of Offline/Read-Only zones

2020-10-29 Thread Dmitry Fomichev
The ZNS specification defines two zone conditions for zones that can
no longer function properly, possibly because of flash wear or other
internal faults. It is useful to be able to "inject" a small number of
such zones for testing purposes.

This commit defines two optional device properties, "offline_zones"
and "rdonly_zones". Users can assign non-zero values to these
properties to specify the number of zones to be initialized as Offline
or Read-Only. The actual number of injected zones may be smaller than
the requested amount, since Read-Only and Offline counts are expected
to be much smaller than the total number of zones on a drive.
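
The sanity check on the two new properties can be sketched like this.
An illustrative model, not the QEMU code; the function name is made
up:

```c
#include <assert.h>
#include <stdint.h>

/*
 * The injected Offline and Read-Only zones are drawn from the zones
 * above index max_open, so the two counts together must fit into
 * nz - max_open.  Returns 0 if the configuration is acceptable, -1
 * if either count is too large.
 */
static int check_injection(uint32_t nz, uint32_t max_open,
                           uint32_t n_offline, uint32_t n_rdonly)
{
    if (max_open >= nz) {
        return 0; /* matches the "max_open_zones < nz" guard */
    }
    if (n_offline > nz - max_open) {
        return -1;
    }
    if (n_rdonly > nz - max_open - n_offline) {
        return -1;
    }
    return 0;
}
```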

Signed-off-by: Dmitry Fomichev 
Reviewed-by: Niklas Cassel 
---
 hw/block/nvme-ns.c | 52 ++
 hw/block/nvme-ns.h |  2 ++
 2 files changed, 54 insertions(+)

diff --git a/hw/block/nvme-ns.c b/hw/block/nvme-ns.c
index 85dc73cf06..5e4a6705cd 100644
--- a/hw/block/nvme-ns.c
+++ b/hw/block/nvme-ns.c
@@ -21,6 +21,7 @@
 #include "sysemu/sysemu.h"
 #include "sysemu/block-backend.h"
 #include "qapi/error.h"
+#include "crypto/random.h"
 
 #include "hw/qdev-properties.h"
 #include "hw/qdev-core.h"
@@ -145,6 +146,20 @@ static int nvme_calc_zone_geometry(NvmeNamespace *ns, Error **errp)
 }
 }
 
+if (ns->params.max_open_zones < nz) {
+if (ns->params.nr_offline_zones > nz - ns->params.max_open_zones) {
+error_setg(errp, "offline_zones value %u is too large",
+ns->params.nr_offline_zones);
+return -1;
+}
+if (ns->params.nr_rdonly_zones >
+nz - ns->params.max_open_zones - ns->params.nr_offline_zones) {
+error_setg(errp, "rdonly_zones value %u is too large",
+ns->params.nr_rdonly_zones);
+return -1;
+}
+}
+
 return 0;
 }
 
@@ -153,7 +168,9 @@ static void nvme_init_zone_state(NvmeNamespace *ns)
 uint64_t start = 0, zone_size = ns->zone_size;
 uint64_t capacity = ns->num_zones * zone_size;
 NvmeZone *zone;
+uint32_t rnd;
 int i;
+uint16_t zs;
 
 ns->zone_array = g_malloc0(ns->zone_array_size);
 if (ns->params.zd_extension_size) {
@@ -180,6 +197,37 @@ static void nvme_init_zone_state(NvmeNamespace *ns)
 zone->w_ptr = start;
 start += zone_size;
 }
+
+/* If required, make some zones Offline or Read Only */
+
+for (i = 0; i < ns->params.nr_offline_zones; i++) {
+do {
+qcrypto_random_bytes(&rnd, sizeof(rnd), NULL);
+rnd %= ns->num_zones;
+} while (rnd < ns->params.max_open_zones);
+zone = &ns->zone_array[rnd];
+zs = nvme_get_zone_state(zone);
+if (zs != NVME_ZONE_STATE_OFFLINE) {
+nvme_set_zone_state(zone, NVME_ZONE_STATE_OFFLINE);
+} else {
+i--;
+}
+}
+
+for (i = 0; i < ns->params.nr_rdonly_zones; i++) {
+do {
+qcrypto_random_bytes(&rnd, sizeof(rnd), NULL);
+rnd %= ns->num_zones;
+} while (rnd < ns->params.max_open_zones);
+zone = &ns->zone_array[rnd];
+zs = nvme_get_zone_state(zone);
+if (zs != NVME_ZONE_STATE_OFFLINE &&
+zs != NVME_ZONE_STATE_READ_ONLY) {
+nvme_set_zone_state(zone, NVME_ZONE_STATE_READ_ONLY);
+} else {
+i--;
+}
+}
 }
 
 static int nvme_zoned_init_ns(NvmeCtrl *n, NvmeNamespace *ns, int lba_index,
@@ -353,6 +401,10 @@ static Property nvme_ns_props[] = {
params.max_open_zones, 0),
 DEFINE_PROP_UINT32("zoned.descr_ext_size", NvmeNamespace,
params.zd_extension_size, 0),
+DEFINE_PROP_UINT32("zoned.offline_zones", NvmeNamespace,
+   params.nr_offline_zones, 0),
+DEFINE_PROP_UINT32("zoned.rdonly_zones", NvmeNamespace,
+   params.nr_rdonly_zones, 0),
 DEFINE_PROP_END_OF_LIST(),
 };
 
diff --git a/hw/block/nvme-ns.h b/hw/block/nvme-ns.h
index 50a6a0e1ac..b30478e5d7 100644
--- a/hw/block/nvme-ns.h
+++ b/hw/block/nvme-ns.h
@@ -36,6 +36,8 @@ typedef struct NvmeNamespaceParams {
 uint32_t max_active_zones;
 uint32_t max_open_zones;
 uint32_t zd_extension_size;
+uint32_t nr_offline_zones;
+uint32_t nr_rdonly_zones;
 } NvmeNamespaceParams;
 
 typedef struct NvmeNamespace {
-- 
2.21.0




[PATCH v8 01/11] hw/block/nvme: Add Commands Supported and Effects log

2020-10-29 Thread Dmitry Fomichev
This log page is necessary to implement in order to allow the host
to check for Zone Append command support in the Zoned Namespace
Command Set.

This commit adds the code to report this log page for the NVM Command
Set only. The parts that are specific to zoned operation will be
added later in the series.

All incoming admin and I/O commands are now processed only if their
corresponding support bits are set in this log. This provides an easy
way to control which commands are supported depending on the
configured CC.CSS.
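
The gating mechanism boils down to a per-opcode table lookup. A
minimal sketch in plain C, mirroring the idea of nvme_cse_iocs_nvm[]
and the check added to the I/O dispatch path (illustrative names, not
the QEMU code):

```c
#include <assert.h>
#include <stdint.h>

/* Opcode values from the NVMe spec; effects-log bits as in the patch. */
enum {
    CMD_EFF_CSUPP = 1u << 0, /* Command Supported */
    CMD_EFF_LBCC  = 1u << 1, /* Logical Block Content Change */
};
enum {
    CMD_FLUSH = 0x00,
    CMD_WRITE = 0x01,
    CMD_READ  = 0x02,
};

/* One 32-bit entry per opcode; CSUPP set only for supported commands. */
static const uint32_t iocs_nvm[256] = {
    [CMD_FLUSH] = CMD_EFF_CSUPP | CMD_EFF_LBCC,
    [CMD_WRITE] = CMD_EFF_CSUPP | CMD_EFF_LBCC,
    [CMD_READ]  = CMD_EFF_CSUPP,
};

/* The gate applied before dispatching an I/O command. */
static int cmd_supported(const uint32_t *iocs, uint8_t opc)
{
    return (iocs[opc] & CMD_EFF_CSUPP) != 0;
}
```

Unsupported opcodes fall through to a zero entry, so the gate rejects
them without any per-command special-casing.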

Signed-off-by: Dmitry Fomichev 
Reviewed-by: Niklas Cassel 
---
 hw/block/nvme-ns.h|  1 +
 hw/block/nvme.c   | 96 +++
 hw/block/trace-events |  1 +
 include/block/nvme.h  | 19 +
 4 files changed, 108 insertions(+), 9 deletions(-)

diff --git a/hw/block/nvme-ns.h b/hw/block/nvme-ns.h
index 83734f4606..ea8c2f785d 100644
--- a/hw/block/nvme-ns.h
+++ b/hw/block/nvme-ns.h
@@ -29,6 +29,7 @@ typedef struct NvmeNamespace {
 int32_t  bootindex;
 int64_t  size;
 NvmeIdNs id_ns;
+const uint32_t *iocs;
 
 NvmeNamespaceParams params;
 } NvmeNamespace;
diff --git a/hw/block/nvme.c b/hw/block/nvme.c
index 9d30ca69dc..9fed061a9d 100644
--- a/hw/block/nvme.c
+++ b/hw/block/nvme.c
@@ -111,6 +111,28 @@ static const uint32_t nvme_feature_cap[NVME_FID_MAX] = {
 [NVME_TIMESTAMP]= NVME_FEAT_CAP_CHANGE,
 };
 
+static const uint32_t nvme_cse_acs[256] = {
+[NVME_ADM_CMD_DELETE_SQ]= NVME_CMD_EFF_CSUPP,
+[NVME_ADM_CMD_CREATE_SQ]= NVME_CMD_EFF_CSUPP,
+[NVME_ADM_CMD_GET_LOG_PAGE] = NVME_CMD_EFF_CSUPP,
+[NVME_ADM_CMD_DELETE_CQ]= NVME_CMD_EFF_CSUPP,
+[NVME_ADM_CMD_CREATE_CQ]= NVME_CMD_EFF_CSUPP,
+[NVME_ADM_CMD_IDENTIFY] = NVME_CMD_EFF_CSUPP,
+[NVME_ADM_CMD_ABORT]= NVME_CMD_EFF_CSUPP,
+[NVME_ADM_CMD_SET_FEATURES] = NVME_CMD_EFF_CSUPP,
+[NVME_ADM_CMD_GET_FEATURES] = NVME_CMD_EFF_CSUPP,
+[NVME_ADM_CMD_ASYNC_EV_REQ] = NVME_CMD_EFF_CSUPP,
+};
+
+static const uint32_t nvme_cse_iocs_none[256];
+
+static const uint32_t nvme_cse_iocs_nvm[256] = {
+[NVME_CMD_FLUSH]= NVME_CMD_EFF_CSUPP | NVME_CMD_EFF_LBCC,
+[NVME_CMD_WRITE_ZEROES] = NVME_CMD_EFF_CSUPP | NVME_CMD_EFF_LBCC,
+[NVME_CMD_WRITE]= NVME_CMD_EFF_CSUPP | NVME_CMD_EFF_LBCC,
+[NVME_CMD_READ] = NVME_CMD_EFF_CSUPP,
+};
+
 static void nvme_process_sq(void *opaque);
 
 static uint16_t nvme_cid(NvmeRequest *req)
@@ -1032,10 +1054,6 @@ static uint16_t nvme_io_cmd(NvmeCtrl *n, NvmeRequest *req)
 trace_pci_nvme_io_cmd(nvme_cid(req), nsid, nvme_sqid(req),
   req->cmd.opcode, nvme_io_opc_str(req->cmd.opcode));
 
-if (NVME_CC_CSS(n->bar.cc) == NVME_CC_CSS_ADMIN_ONLY) {
-return NVME_INVALID_OPCODE | NVME_DNR;
-}
-
 if (!nvme_nsid_valid(n, nsid)) {
 return NVME_INVALID_NSID | NVME_DNR;
 }
@@ -1045,6 +1063,11 @@ static uint16_t nvme_io_cmd(NvmeCtrl *n, NvmeRequest *req)
 return NVME_INVALID_FIELD | NVME_DNR;
 }
 
+if (!(req->ns->iocs[req->cmd.opcode] & NVME_CMD_EFF_CSUPP)) {
+trace_pci_nvme_err_invalid_opc(req->cmd.opcode);
+return NVME_INVALID_OPCODE | NVME_DNR;
+}
+
 switch (req->cmd.opcode) {
 case NVME_CMD_FLUSH:
 return nvme_flush(n, req);
@@ -1054,8 +1077,7 @@ static uint16_t nvme_io_cmd(NvmeCtrl *n, NvmeRequest *req)
 case NVME_CMD_READ:
 return nvme_rw(n, req);
 default:
-trace_pci_nvme_err_invalid_opc(req->cmd.opcode);
-return NVME_INVALID_OPCODE | NVME_DNR;
+assert(false);
 }
 }
 
@@ -1291,6 +1313,37 @@ static uint16_t nvme_error_info(NvmeCtrl *n, uint8_t rae, uint32_t buf_len,
 DMA_DIRECTION_FROM_DEVICE, req);
 }
 
+static uint16_t nvme_cmd_effects(NvmeCtrl *n, uint32_t buf_len,
+ uint64_t off, NvmeRequest *req)
+{
+NvmeEffectsLog log = {};
+const uint32_t *src_iocs = NULL;
+uint32_t trans_len;
+
+if (off >= sizeof(log)) {
+trace_pci_nvme_err_invalid_log_page_offset(off, sizeof(log));
+return NVME_INVALID_FIELD | NVME_DNR;
+}
+
+switch (NVME_CC_CSS(n->bar.cc)) {
+case NVME_CC_CSS_NVM:
+src_iocs = nvme_cse_iocs_nvm;
+case NVME_CC_CSS_ADMIN_ONLY:
+break;
+}
+
+memcpy(log.acs, nvme_cse_acs, sizeof(nvme_cse_acs));
+
+if (src_iocs) {
+memcpy(log.iocs, src_iocs, sizeof(log.iocs));
+}
+
+trans_len = MIN(sizeof(log) - off, buf_len);
+
+return nvme_dma(n, ((uint8_t *)&log) + off, trans_len,
+DMA_DIRECTION_FROM_DEVICE, req);
+}
+
 static uint16_t nvme_get_log(NvmeCtrl *n, NvmeRequest *req)
 {
 NvmeCmd *cmd = &req->cmd;
@@ -1334,6 +1387,8 @@ static uint16_t nvme_get_log(NvmeCtrl *n, NvmeRequest *req)
 return nvme_smart_info(n, rae, len, off, req);
 case NVME_LOG_FW_SLOT_INFO:
 retur

[PATCH v8 07/11] hw/block/nvme: Support Zoned Namespace Command Set

2020-10-29 Thread Dmitry Fomichev
The emulation code has been changed to advertise the NVM Command Set
when the "zoned" device property is not set (the default) and the
Zoned Namespace Command Set otherwise.

Define values and structures that are needed to support Zoned
Namespace Command Set (NVMe TP 4053) in PCI NVMe controller emulator.
Define trace events where needed in newly introduced code.

In order to improve scalability, all open, closed and full zones
are organized in separate linked lists. Consequently, almost all
zone operations don't require scanning the entire zone array
(which can potentially be quite large); it is only necessary to
enumerate one or more zone lists.

Handlers for three new NVMe commands introduced in Zoned Namespace
Command Set specification are added, namely for Zone Management
Receive, Zone Management Send and Zone Append.

Device initialization code has been extended to create a proper
configuration for zoned operation using device properties.

The Read/Write command handler is modified to only allow writes at
the write pointer if the namespace is zoned. For the Zone Append
command, writes implicitly happen at the write pointer and the
starting write pointer value is returned as the result of the
command. The Write Zeroes handler is modified to add zoned checks
that are identical to those done as part of the Write flow.

Subsequent commits in this series add ZDE support and checks for
active and open zone limits.
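
The write-pointer rule described above can be sketched in a few lines
of plain C. This is an illustrative model under assumed names (Zone,
check_zone_write), not the QEMU data structures:

```c
#include <assert.h>
#include <stdint.h>

/* Minimal zone model; the field names are illustrative, not QEMU's. */
typedef struct {
    uint64_t zslba; /* zone start LBA */
    uint64_t wp;    /* write pointer */
    uint64_t cap;   /* zone capacity in LBAs */
} Zone;

/*
 * A zoned write is only valid if it begins exactly at the write
 * pointer and does not cross the writable capacity.  Zone Append
 * differs only in that slba is taken from z->wp implicitly.
 */
static int check_zone_write(const Zone *z, uint64_t slba, uint32_t nlb)
{
    if (slba != z->wp) {
        return -1; /* e.g. Zone Invalid Write */
    }
    if (slba + nlb > z->zslba + z->cap) {
        return -2; /* e.g. Zone Boundary Error */
    }
    return 0;
}
```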

Signed-off-by: Niklas Cassel 
Signed-off-by: Hans Holmberg 
Signed-off-by: Ajay Joshi 
Signed-off-by: Chaitanya Kulkarni 
Signed-off-by: Matias Bjorling 
Signed-off-by: Aravind Ramesh 
Signed-off-by: Shin'ichiro Kawasaki 
Signed-off-by: Adam Manzanares 
Signed-off-by: Dmitry Fomichev 
Reviewed-by: Niklas Cassel 
---
 block/nvme.c  |   2 +-
 hw/block/nvme-ns.c| 173 
 hw/block/nvme-ns.h|  54 +++
 hw/block/nvme.c   | 977 +-
 hw/block/nvme.h   |   8 +
 hw/block/trace-events |  18 +-
 include/block/nvme.h  | 113 -
 7 files changed, 1322 insertions(+), 23 deletions(-)

diff --git a/block/nvme.c b/block/nvme.c
index 05485fdd11..7a513c9a17 100644
--- a/block/nvme.c
+++ b/block/nvme.c
@@ -333,7 +333,7 @@ static inline int nvme_translate_error(const NvmeCqe *c)
 {
 uint16_t status = (le16_to_cpu(c->status) >> 1) & 0xFF;
 if (status) {
-trace_nvme_error(le32_to_cpu(c->result),
+trace_nvme_error(le32_to_cpu(c->result32),
  le16_to_cpu(c->sq_head),
  le16_to_cpu(c->sq_id),
  le16_to_cpu(c->cid),
diff --git a/hw/block/nvme-ns.c b/hw/block/nvme-ns.c
index e191ef9be0..e6db7f7d3b 100644
--- a/hw/block/nvme-ns.c
+++ b/hw/block/nvme-ns.c
@@ -25,6 +25,7 @@
 #include "hw/qdev-properties.h"
 #include "hw/qdev-core.h"
 
+#include "trace.h"
 #include "nvme.h"
 #include "nvme-ns.h"
 
@@ -77,6 +78,151 @@ static int nvme_ns_init_blk(NvmeCtrl *n, NvmeNamespace *ns, Error **errp)
 return 0;
 }
 
+static int nvme_calc_zone_geometry(NvmeNamespace *ns, Error **errp)
+{
+uint64_t zone_size, zone_cap;
+uint32_t nz, lbasz = ns->blkconf.logical_block_size;
+
+if (ns->params.zone_size_bs) {
+zone_size = ns->params.zone_size_bs;
+} else {
+zone_size = NVME_DEFAULT_ZONE_SIZE;
+}
+if (ns->params.zone_cap_bs) {
+zone_cap = ns->params.zone_cap_bs;
+} else {
+zone_cap = zone_size;
+}
+if (zone_cap > zone_size) {
+error_setg(errp, "zone capacity %luB exceeds zone size %luB",
+   zone_cap, zone_size);
+return -1;
+}
+if (zone_size < lbasz) {
+error_setg(errp, "zone size %luB too small, must be at least %uB",
+   zone_size, lbasz);
+return -1;
+}
+if (zone_cap < lbasz) {
+error_setg(errp, "zone capacity %luB too small, must be at least %uB",
+   zone_cap, lbasz);
+return -1;
+}
+ns->zone_size = zone_size / lbasz;
+ns->zone_capacity = zone_cap / lbasz;
+
+nz = DIV_ROUND_UP(ns->size / lbasz, ns->zone_size);
+ns->num_zones = nz;
+ns->zone_array_size = sizeof(NvmeZone) * nz;
+ns->zone_size_log2 = 0;
+if (is_power_of_2(ns->zone_size)) {
+ns->zone_size_log2 = 63 - clz64(ns->zone_size);
+}
+
+return 0;
+}
+
+static void nvme_init_zone_state(NvmeNamespace *ns)
+{
+uint64_t start = 0, zone_size = ns->zone_size;
+uint64_t capacity = ns->num_zones * zone_size;
+NvmeZone *zone;
+int i;
+
+ns->zone_array = g_malloc0(ns->zone_array_size);
+
+QTAILQ_INIT(&ns->exp_open_zones);
+QTAILQ_INIT(&ns->imp_open_zones);
+QTAILQ_INIT(&ns->closed_zones);
+QTAILQ_INIT(&ns->full_zones);
+
+zone = ns->zone_array;
+for (i = 0; i < ns->num_zones; i++, zone++) {
+if (start + zone_size > capacity) {
+zone_size = capacity - start;
+}
+zone->d.zt = NVME_ZONE_TYPE_SEQ_WRITE;
+nvme_set_zone_state(zone, NVME_ZONE_STATE_E

[PATCH v8 02/11] hw/block/nvme: Generate namespace UUIDs

2020-10-29 Thread Dmitry Fomichev
In NVMe 1.4, a namespace must report an ID descriptor of UUID type
if it doesn't support EUI64 or NGUID. Add a new namespace property,
"uuid", that provides the user the option to either specify the UUID
explicitly or have a UUID generated automatically every time a
namespace is initialized.

Suggested-by: Klaus Jensen 
Signed-off-by: Dmitry Fomichev 
Reviewed-by: Klaus Jensen 
Reviewed-by: Keith Busch 
Reviewed-by: Niklas Cassel 
---
 hw/block/nvme-ns.c | 1 +
 hw/block/nvme-ns.h | 1 +
 hw/block/nvme.c| 9 +
 3 files changed, 7 insertions(+), 4 deletions(-)

diff --git a/hw/block/nvme-ns.c b/hw/block/nvme-ns.c
index b69cdaf27e..de735eb9f3 100644
--- a/hw/block/nvme-ns.c
+++ b/hw/block/nvme-ns.c
@@ -129,6 +129,7 @@ static void nvme_ns_realize(DeviceState *dev, Error **errp)
 static Property nvme_ns_props[] = {
 DEFINE_BLOCK_PROPERTIES(NvmeNamespace, blkconf),
 DEFINE_PROP_UINT32("nsid", NvmeNamespace, params.nsid, 0),
+DEFINE_PROP_UUID("uuid", NvmeNamespace, params.uuid),
 DEFINE_PROP_END_OF_LIST(),
 };
 
diff --git a/hw/block/nvme-ns.h b/hw/block/nvme-ns.h
index ea8c2f785d..a38071884a 100644
--- a/hw/block/nvme-ns.h
+++ b/hw/block/nvme-ns.h
@@ -21,6 +21,7 @@
 
 typedef struct NvmeNamespaceParams {
 uint32_t nsid;
+QemuUUID uuid;
 } NvmeNamespaceParams;
 
 typedef struct NvmeNamespace {
diff --git a/hw/block/nvme.c b/hw/block/nvme.c
index 9fed061a9d..1ec8ccb3e6 100644
--- a/hw/block/nvme.c
+++ b/hw/block/nvme.c
@@ -1572,6 +1572,7 @@ static uint16_t nvme_identify_nslist(NvmeCtrl *n, NvmeRequest *req)
 
 static uint16_t nvme_identify_ns_descr_list(NvmeCtrl *n, NvmeRequest *req)
 {
+NvmeNamespace *ns;
 NvmeIdentify *c = (NvmeIdentify *)&req->cmd;
 uint32_t nsid = le32_to_cpu(c->nsid);
 uint8_t list[NVME_IDENTIFY_DATA_SIZE];
@@ -1591,7 +1592,8 @@ static uint16_t nvme_identify_ns_descr_list(NvmeCtrl *n, NvmeRequest *req)
 return NVME_INVALID_NSID | NVME_DNR;
 }
 
-if (unlikely(!nvme_ns(n, nsid))) {
+ns = nvme_ns(n, nsid);
+if (unlikely(!ns)) {
 return NVME_INVALID_FIELD | NVME_DNR;
 }
 
@@ -1600,12 +1602,11 @@ static uint16_t nvme_identify_ns_descr_list(NvmeCtrl *n, NvmeRequest *req)
 /*
  * Because the NGUID and EUI64 fields are 0 in the Identify Namespace data
  * structure, a Namespace UUID (nidt = 0x3) must be reported in the
- * Namespace Identification Descriptor. Add a very basic Namespace UUID
- * here.
+ * Namespace Identification Descriptor. Add the namespace UUID here.
  */
 ns_descrs->uuid.hdr.nidt = NVME_NIDT_UUID;
 ns_descrs->uuid.hdr.nidl = NVME_NIDT_UUID_LEN;
-stl_be_p(&ns_descrs->uuid.v, nsid);
+memcpy(&ns_descrs->uuid.v, ns->params.uuid.data, NVME_NIDT_UUID_LEN);
 
 return nvme_dma(n, list, NVME_IDENTIFY_DATA_SIZE,
 DMA_DIRECTION_FROM_DEVICE, req);
-- 
2.21.0




[PATCH v2 11/11] target/arm: Improve do_prewiden_3d

2020-10-29 Thread Richard Henderson
We can use proper widening loads to extend 32-bit inputs,
and skip the "widenfn" step.
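
The MemOp distinction the patch introduces can be modeled in plain C:
MO_SL sign-extends a 32-bit element to 64 bits in a single load, MO_UL
zero-extends it, so no separate "widenfn" step is needed. A sketch
with made-up helper names, not the TCG code:

```c
#include <assert.h>
#include <stdint.h>

/* Analogue of tcg_gen_ld32s_i64: one sign-extending widening load. */
static int64_t load32_signed(int32_t v)
{
    return (int64_t)v;
}

/* Analogue of tcg_gen_ld32u_i64: one zero-extending widening load. */
static uint64_t load32_unsigned(uint32_t v)
{
    return (uint64_t)v;
}
```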

Signed-off-by: Richard Henderson 
---
 target/arm/translate.c  |  6 +++
 target/arm/translate-neon.c.inc | 66 ++---
 2 files changed, 43 insertions(+), 29 deletions(-)

diff --git a/target/arm/translate.c b/target/arm/translate.c
index 7611c1f0f1..29ea1eb781 100644
--- a/target/arm/translate.c
+++ b/target/arm/translate.c
@@ -1183,6 +1183,12 @@ static void read_neon_element64(TCGv_i64 dest, int reg, int ele, MemOp memop)
 long off = neon_element_offset(reg, ele, memop);
 
 switch (memop) {
+case MO_SL:
+tcg_gen_ld32s_i64(dest, cpu_env, off);
+break;
+case MO_UL:
+tcg_gen_ld32u_i64(dest, cpu_env, off);
+break;
 case MO_Q:
 tcg_gen_ld_i64(dest, cpu_env, off);
 break;
diff --git a/target/arm/translate-neon.c.inc b/target/arm/translate-neon.c.inc
index 1c16c56e7e..59368cb243 100644
--- a/target/arm/translate-neon.c.inc
+++ b/target/arm/translate-neon.c.inc
@@ -1788,11 +1788,10 @@ static bool trans_Vimm_1r(DisasContext *s, arg_1reg_imm *a)
 static bool do_prewiden_3d(DisasContext *s, arg_3diff *a,
NeonGenWidenFn *widenfn,
NeonGenTwo64OpFn *opfn,
-   bool src1_wide)
+   int src1_mop, int src2_mop)
 {
 /* 3-regs different lengths, prewidening case (VADDL/VSUBL/VAADW/VSUBW) */
 TCGv_i64 rn0_64, rn1_64, rm_64;
-TCGv_i32 rm;
 
 if (!arm_dc_feature(s, ARM_FEATURE_NEON)) {
 return false;
@@ -1804,12 +1803,12 @@ static bool do_prewiden_3d(DisasContext *s, arg_3diff *a,
 return false;
 }
 
-if (!widenfn || !opfn) {
+if (!opfn) {
 /* size == 3 case, which is an entirely different insn group */
 return false;
 }
 
-if ((a->vd & 1) || (src1_wide && (a->vn & 1))) {
+if ((a->vd & 1) || (src1_mop == MO_Q && (a->vn & 1))) {
 return false;
 }
 
@@ -1821,40 +1820,48 @@ static bool do_prewiden_3d(DisasContext *s, arg_3diff *a,
 rn1_64 = tcg_temp_new_i64();
 rm_64 = tcg_temp_new_i64();
 
-if (src1_wide) {
-read_neon_element64(rn0_64, a->vn, 0, MO_64);
+if (src1_mop >= 0) {
+read_neon_element64(rn0_64, a->vn, 0, src1_mop);
 } else {
 TCGv_i32 tmp = tcg_temp_new_i32();
 read_neon_element32(tmp, a->vn, 0, MO_32);
 widenfn(rn0_64, tmp);
 tcg_temp_free_i32(tmp);
 }
-rm = tcg_temp_new_i32();
-read_neon_element32(rm, a->vm, 0, MO_32);
+if (src2_mop >= 0) {
+read_neon_element64(rm_64, a->vm, 0, src2_mop);
+} else {
+TCGv_i32 tmp = tcg_temp_new_i32();
+read_neon_element32(tmp, a->vm, 0, MO_32);
+widenfn(rm_64, tmp);
+tcg_temp_free_i32(tmp);
+}
 
-widenfn(rm_64, rm);
-tcg_temp_free_i32(rm);
 opfn(rn0_64, rn0_64, rm_64);
 
 /*
  * Load second pass inputs before storing the first pass result, to
  * avoid incorrect results if a narrow input overlaps with the result.
  */
-if (src1_wide) {
-read_neon_element64(rn1_64, a->vn, 1, MO_64);
+if (src1_mop >= 0) {
+read_neon_element64(rn1_64, a->vn, 1, src1_mop);
 } else {
 TCGv_i32 tmp = tcg_temp_new_i32();
 read_neon_element32(tmp, a->vn, 1, MO_32);
 widenfn(rn1_64, tmp);
 tcg_temp_free_i32(tmp);
 }
-rm = tcg_temp_new_i32();
-read_neon_element32(rm, a->vm, 1, MO_32);
+if (src2_mop >= 0) {
+read_neon_element64(rm_64, a->vm, 1, src2_mop);
+} else {
+TCGv_i32 tmp = tcg_temp_new_i32();
+read_neon_element32(tmp, a->vm, 1, MO_32);
+widenfn(rm_64, tmp);
+tcg_temp_free_i32(tmp);
+}
 
 write_neon_element64(rn0_64, a->vd, 0, MO_64);
 
-widenfn(rm_64, rm);
-tcg_temp_free_i32(rm);
 opfn(rn1_64, rn1_64, rm_64);
 write_neon_element64(rn1_64, a->vd, 1, MO_64);
 
@@ -1865,14 +1872,13 @@ static bool do_prewiden_3d(DisasContext *s, arg_3diff *a,
 return true;
 }
 
-#define DO_PREWIDEN(INSN, S, EXT, OP, SRC1WIDE) \
+#define DO_PREWIDEN(INSN, S, OP, SRC1WIDE, SIGN)\
 static bool trans_##INSN##_3d(DisasContext *s, arg_3diff *a)\
 {   \
 static NeonGenWidenFn * const widenfn[] = { \
 gen_helper_neon_widen_##S##8,   \
 gen_helper_neon_widen_##S##16,  \
-tcg_gen_##EXT##_i32_i64,\
-NULL,   \
+NULL, NULL, \
 };  \
 static NeonGenTwo64OpFn * const addfn[] = {

[PATCH v2 10/11] target/arm: Simplify do_long_3d and do_2scalar_long

2020-10-29 Thread Richard Henderson
In both cases, we can sink the write-back and perform
the accumulate into the normal destination temps.
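As a toy model (plain C, not the QEMU/TCG code), the restructuring can be sketched like this: the accumulate writes into the normal result temps, so the write-back becomes unconditional and shared by both paths:

```c
#include <assert.h>
#include <stdint.h>

/* Toy model of the simplification: accumulate into the result temps
 * rd0/rd1 themselves, so the write-back below is shared by the
 * accumulating and non-accumulating paths.  accfn may be NULL. */
typedef uint64_t (*accfn_t)(uint64_t, uint64_t);

static void finish_pass(uint64_t vd[2], uint64_t rd0, uint64_t rd1,
                        accfn_t accfn)
{
    if (accfn) {
        rd0 = accfn(vd[0], rd0);   /* read old vd, accumulate into temp */
        rd1 = accfn(vd[1], rd1);
    }
    vd[0] = rd0;                   /* single, unconditional write-back */
    vd[1] = rd1;
}

static uint64_t acc_add(uint64_t a, uint64_t b) { return a + b; }
```

The names `finish_pass` and `acc_add` are invented for illustration; the point is only that sinking the stores removes the duplicated write-back in the `if`/`else` arms.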

Signed-off-by: Richard Henderson 
---
 target/arm/translate-neon.c.inc | 23 +--
 1 file changed, 9 insertions(+), 14 deletions(-)

diff --git a/target/arm/translate-neon.c.inc b/target/arm/translate-neon.c.inc
index c2d67160f9..1c16c56e7e 100644
--- a/target/arm/translate-neon.c.inc
+++ b/target/arm/translate-neon.c.inc
@@ -2037,17 +2037,14 @@ static bool do_long_3d(DisasContext *s, arg_3diff *a,
 if (accfn) {
 tmp = tcg_temp_new_i64();
 read_neon_element64(tmp, a->vd, 0, MO_64);
-accfn(tmp, tmp, rd0);
-write_neon_element64(tmp, a->vd, 0, MO_64);
+accfn(rd0, tmp, rd0);
 read_neon_element64(tmp, a->vd, 1, MO_64);
-accfn(tmp, tmp, rd1);
-write_neon_element64(tmp, a->vd, 1, MO_64);
+accfn(rd1, tmp, rd1);
 tcg_temp_free_i64(tmp);
-} else {
-write_neon_element64(rd0, a->vd, 0, MO_64);
-write_neon_element64(rd1, a->vd, 1, MO_64);
 }
 
+write_neon_element64(rd0, a->vd, 0, MO_64);
+write_neon_element64(rd1, a->vd, 1, MO_64);
 tcg_temp_free_i64(rd0);
 tcg_temp_free_i64(rd1);
 
@@ -2670,16 +2667,14 @@ static bool do_2scalar_long(DisasContext *s, arg_2scalar *a,
 if (accfn) {
 TCGv_i64 t64 = tcg_temp_new_i64();
 read_neon_element64(t64, a->vd, 0, MO_64);
-accfn(t64, t64, rn0_64);
-write_neon_element64(t64, a->vd, 0, MO_64);
+accfn(rn0_64, t64, rn0_64);
 read_neon_element64(t64, a->vd, 1, MO_64);
-accfn(t64, t64, rn1_64);
-write_neon_element64(t64, a->vd, 1, MO_64);
+accfn(rn1_64, t64, rn1_64);
 tcg_temp_free_i64(t64);
-} else {
-write_neon_element64(rn0_64, a->vd, 0, MO_64);
-write_neon_element64(rn1_64, a->vd, 1, MO_64);
 }
+
+write_neon_element64(rn0_64, a->vd, 0, MO_64);
+write_neon_element64(rn1_64, a->vd, 1, MO_64);
 tcg_temp_free_i64(rn0_64);
 tcg_temp_free_i64(rn1_64);
 return true;
-- 
2.25.1




[PATCH v8 04/11] hw/block/nvme: Merge nvme_write_zeroes() with nvme_write()

2020-10-29 Thread Dmitry Fomichev
nvme_write() now handles WRITE, WRITE ZEROES and ZONE_APPEND.

Signed-off-by: Dmitry Fomichev 
Reviewed-by: Niklas Cassel 
Acked-by: Klaus Jensen 
---
 hw/block/nvme.c   | 72 +--
 hw/block/trace-events |  1 -
 2 files changed, 28 insertions(+), 45 deletions(-)

diff --git a/hw/block/nvme.c b/hw/block/nvme.c
index 3023774484..2a33540542 100644
--- a/hw/block/nvme.c
+++ b/hw/block/nvme.c
@@ -1011,32 +1011,7 @@ invalid:
 return status | NVME_DNR;
 }
 
-static uint16_t nvme_write_zeroes(NvmeCtrl *n, NvmeRequest *req)
-{
-NvmeRwCmd *rw = (NvmeRwCmd *)&req->cmd;
-NvmeNamespace *ns = req->ns;
-uint64_t slba = le64_to_cpu(rw->slba);
-uint32_t nlb = (uint32_t)le16_to_cpu(rw->nlb) + 1;
-uint64_t offset = nvme_l2b(ns, slba);
-uint32_t count = nvme_l2b(ns, nlb);
-uint16_t status;
-
-trace_pci_nvme_write_zeroes(nvme_cid(req), nvme_nsid(ns), slba, nlb);
-
-status = nvme_check_bounds(n, ns, slba, nlb);
-if (status) {
-trace_pci_nvme_err_invalid_lba_range(slba, nlb, ns->id_ns.nsze);
-return status;
-}
-
-block_acct_start(blk_get_stats(req->ns->blkconf.blk), &req->acct, 0,
- BLOCK_ACCT_WRITE);
-req->aiocb = blk_aio_pwrite_zeroes(req->ns->blkconf.blk, offset, count,
-   BDRV_REQ_MAY_UNMAP, nvme_rw_cb, req);
-return NVME_NO_COMPLETE;
-}
-
-static uint16_t nvme_write(NvmeCtrl *n, NvmeRequest *req)
+static uint16_t nvme_write(NvmeCtrl *n, NvmeRequest *req, bool wrz)
 {
 NvmeRwCmd *rw = (NvmeRwCmd *)&req->cmd;
 NvmeNamespace *ns = req->ns;
@@ -1050,10 +1025,12 @@ static uint16_t nvme_write(NvmeCtrl *n, NvmeRequest *req)
 trace_pci_nvme_write(nvme_cid(req), nvme_io_opc_str(rw->opcode),
  nvme_nsid(ns), nlb, data_size, slba);
 
-status = nvme_check_mdts(n, data_size);
-if (status) {
-trace_pci_nvme_err_mdts(nvme_cid(req), data_size);
-goto invalid;
+if (!wrz) {
+status = nvme_check_mdts(n, data_size);
+if (status) {
+trace_pci_nvme_err_mdts(nvme_cid(req), data_size);
+goto invalid;
+}
 }
 
 status = nvme_check_bounds(n, ns, slba, nlb);
@@ -1062,21 +1039,28 @@ static uint16_t nvme_write(NvmeCtrl *n, NvmeRequest *req)
 goto invalid;
 }
 
-status = nvme_map_dptr(n, data_size, req);
-if (status) {
-goto invalid;
-}
-
 data_offset = nvme_l2b(ns, slba);
 
-block_acct_start(blk_get_stats(blk), &req->acct, data_size,
- BLOCK_ACCT_WRITE);
-if (req->qsg.sg) {
-req->aiocb = dma_blk_write(blk, &req->qsg, data_offset,
-   BDRV_SECTOR_SIZE, nvme_rw_cb, req);
+if (!wrz) {
+status = nvme_map_dptr(n, data_size, req);
+if (status) {
+goto invalid;
+}
+
+block_acct_start(blk_get_stats(blk), &req->acct, data_size,
+ BLOCK_ACCT_WRITE);
+if (req->qsg.sg) {
+req->aiocb = dma_blk_write(blk, &req->qsg, data_offset,
+   BDRV_SECTOR_SIZE, nvme_rw_cb, req);
+} else {
+req->aiocb = blk_aio_pwritev(blk, data_offset, &req->iov, 0,
+ nvme_rw_cb, req);
+}
 } else {
-req->aiocb = blk_aio_pwritev(blk, data_offset, &req->iov, 0,
- nvme_rw_cb, req);
+block_acct_start(blk_get_stats(blk), &req->acct, 0, BLOCK_ACCT_WRITE);
+req->aiocb = blk_aio_pwrite_zeroes(blk, data_offset, data_size,
+   BDRV_REQ_MAY_UNMAP, nvme_rw_cb,
+   req);
 }
 return NVME_NO_COMPLETE;
 
@@ -1110,9 +1094,9 @@ static uint16_t nvme_io_cmd(NvmeCtrl *n, NvmeRequest *req)
 case NVME_CMD_FLUSH:
 return nvme_flush(n, req);
 case NVME_CMD_WRITE_ZEROES:
-return nvme_write_zeroes(n, req);
+return nvme_write(n, req, true);
 case NVME_CMD_WRITE:
-return nvme_write(n, req);
+return nvme_write(n, req, false);
 case NVME_CMD_READ:
 return nvme_read(n, req);
 default:
diff --git a/hw/block/trace-events b/hw/block/trace-events
index d81b1891d8..658633177d 100644
--- a/hw/block/trace-events
+++ b/hw/block/trace-events
@@ -43,7 +43,6 @@ pci_nvme_admin_cmd(uint16_t cid, uint16_t sqid, uint8_t opcode, const char *opna
pci_nvme_read(uint16_t cid, uint32_t nsid, uint32_t nlb, uint64_t count, uint64_t lba) "cid %"PRIu16" nsid %"PRIu32" nlb %"PRIu32" count %"PRIu64" lba 0x%"PRIx64""
pci_nvme_write(uint16_t cid, const char *verb, uint32_t nsid, uint32_t nlb, uint64_t count, uint64_t lba) "cid %"PRIu16" opname '%s' nsid %"PRIu32" nlb %"PRIu32" count %"PRIu64" lba 0x%"PRIx64""
 pci_nvme_rw_cb(uint16_t cid, const char *blkname) "cid %"PRIu16" blk '%s'"
-pci_nvme_write_zeroes(uint16_t cid, uint32_t nsid, uint64_t s

[PATCH v8 00/11] hw/block/nvme: Support Namespace Types and Zoned Namespace Command Set

2020-10-29 Thread Dmitry Fomichev
v7 -> v8:

 - Move refactoring commits to the front of the series.

 - Remove "attached" and "fill_pattern" device properties.

 - Only close open zones upon subsystem shutdown, not when the CC.EN flag
   is set to 0. Avoid looping through all zones by iterating through the
   lists of open and closed zones.

 - Improve bulk processing of zones, i.e. zone operations with the "all"
   flag set. Avoid looping through the entire zone array for all zone
   operations except Offline Zone.

 - Prefix ZNS-related property names with "zoned.". The "zoned" Boolean
   property is retained to turn on the zoned command set, since it is much
   more intuitive and user-friendly than setting a magic number value in
   the "csi" property.

 - Address review comments.

 - Remove unused trace events.

v6 -> v7:

 - Introduce ns->iocs initialization function earlier in the series,
   in CSE Log patch.

 - Set NVM iocs for zoned namespaces when CC.CSS is set to
   NVME_CC_CSS_NVM.

 - Clean up code in CSE log handler.
 
v5 -> v6:

 - Remove zoned state persistence code. Replace position-independent
   zone lists with QTAILQs.

 - Close all open zones upon clearing of the controller. This is
   a similar procedure to the one previously performed upon powering
   up with zone persistence. 

 - Squash NS Types and ZNS triplets of commits to keep definitions
   and trace event definitions together with the implementation code.

 - Move namespace UUID generation to a separate patch. Add the new
   "uuid" property as suggested by Klaus.

 - Rework Commands and Effects patch to make sure that the log is
   always in sync with the actual set of commands supported.

 - Add two refactoring commits at the end of the series to
   optimize read and write i/o path.

- Incorporate feedback from Keith, Klaus and Niklas:

  * fix rebase errors in nvme_identify_ns_descr_list()
  * remove unnecessary code from nvme_write_bar()
  * move csi to NvmeNamespace and use it from the beginning in NSTypes
patch
  * change zone read processing to cover all corner cases with RAZB=1
  * sync w_ptr and d.wp in case of an i/o error in the preceding zone
  * reword the commit message in active/inactive patch with the new
text from Niklas
  * correct dlfeat reporting depending on the fill pattern set
  * add more checks for "attached" n/s parameter to prevent i/o and
get/set features on inactive namespaces
  * Use DEFINE_PROP_SIZE and DEFINE_PROP_SIZE32 for zone size/capacity
and ZASL respectively
  * Improve zone size and capacity validation
  * Correctly report NSZE

v4 -> v5:

 - Rebase to the current qemu-nvme.

 - Use HostMemoryBackendFile as the backing storage for persistent
   zone metadata.

 - Fix the issue with filling the valid data in the next zone if RAZB
   is enabled.

v3 -> v4:

 - Fix bugs introduced in v2/v3 for QD > 1 operation. Now, all writes
   to a zone happen at the new write pointer variable, zone->w_ptr,
   that is advanced right after submitting the backend i/o. The existing
   zone->d.wp variable is updated upon the successful write completion
   and it is used for zone reporting. Some code has been split from
   nvme_finalize_zoned_write() function to a new function,
   nvme_advance_zone_wp().

 - Make the code compile under mingw. Switch to using QEMU API for
   mmap/msync, i.e. memory_region...(). Since mmap is not available in
   mingw (even though there is mman-win32 library available on Github),
   conditional compilation is added around these calls to avoid
   undefined symbols under mingw. A better fix would be to add stub
   functions to softmmu/memory.c for the case when CONFIG_POSIX is not
   defined, but such a change is beyond the scope of this patchset and
   can be made in a separate patch.

 - Correct permission mask used to open zone metadata file.

 - Fold "Define 64 bit cqe.result" patch into ZNS commit.

 - Use clz64/clz32 instead of defining nvme_ilog2() function.

 - Simplify rpt_empty_id_struct() code, move nvme_fill_data() back
   to ZNS patch.

 - Fix a power-on processing bug.

 - Rename NVME_CMD_ZONE_APND to NVME_CMD_ZONE_APPEND.

 - Make the list of review comments addressed in v2 of the series
   (see below).

v2 -> v3:

 - Moved nvme_fill_data() function to the NSTypes patch as it is
   now used there to output empty namespace identify structs.
 - Fixed typo in Maxim's email address.

v1 -> v2:

 - Rebased on top of qemu-nvme/next branch.
 - Incorporated feedback from Klaus and Alistair.
* Allow a subset of CSE log to be read, not the entire log
* Assign admin command entries in CSE log to ACS fields
* Set LPA bit 1 to indicate support of CSE log page
* Rename CC.CSS value CSS_ALL_NSTYPES (110b) to CSS_CSI
* Move the code to assign lbaf.ds to a separate patch
* Remove the change in firmware revision
* Change "driver" to "device" in comments and annotations
* Rename ZAMDS to ZASL
* Correct a few format expressions and some wording in
  trace event definitions
* Re

[PATCH v2 08/11] target/arm: Add read/write_neon_element64

2020-10-29 Thread Richard Henderson
Replace all uses of neon_load/store_reg64 within translate-neon.c.inc.

Signed-off-by: Richard Henderson 
---
 target/arm/translate.c  | 26 +
 target/arm/translate-neon.c.inc | 94 -
 2 files changed, 73 insertions(+), 47 deletions(-)

diff --git a/target/arm/translate.c b/target/arm/translate.c
index 8491ab705b..4fb0a62200 100644
--- a/target/arm/translate.c
+++ b/target/arm/translate.c
@@ -1178,6 +1178,19 @@ static void read_neon_element32(TCGv_i32 dest, int reg, int ele, MemOp memop)
 }
 }
 
+static void read_neon_element64(TCGv_i64 dest, int reg, int ele, MemOp memop)
+{
+long off = neon_element_offset(reg, ele, memop);
+
+switch (memop) {
+case MO_Q:
+tcg_gen_ld_i64(dest, cpu_env, off);
+break;
+default:
+g_assert_not_reached();
+}
+}
+
 static void write_neon_element32(TCGv_i32 src, int reg, int ele, MemOp memop)
 {
 long off = neon_element_offset(reg, ele, memop);
@@ -1197,6 +1210,19 @@ static void write_neon_element32(TCGv_i32 src, int reg, int ele, MemOp memop)
 }
 }
 
+static void write_neon_element64(TCGv_i64 src, int reg, int ele, MemOp memop)
+{
+long off = neon_element_offset(reg, ele, memop);
+
+switch (memop) {
+case MO_64:
+tcg_gen_st_i64(src, cpu_env, off);
+break;
+default:
+g_assert_not_reached();
+}
+}
+
 static TCGv_ptr vfp_reg_ptr(bool dp, int reg)
 {
 TCGv_ptr ret = tcg_temp_new_ptr();
diff --git a/target/arm/translate-neon.c.inc b/target/arm/translate-neon.c.inc
index 549381703e..c2d67160f9 100644
--- a/target/arm/translate-neon.c.inc
+++ b/target/arm/translate-neon.c.inc
@@ -1265,9 +1265,9 @@ static bool do_2shift_env_64(DisasContext *s, arg_2reg_shift *a,
 for (pass = 0; pass < a->q + 1; pass++) {
 TCGv_i64 tmp = tcg_temp_new_i64();
 
-neon_load_reg64(tmp, a->vm + pass);
+read_neon_element64(tmp, a->vm, pass, MO_64);
 fn(tmp, cpu_env, tmp, constimm);
-neon_store_reg64(tmp, a->vd + pass);
+write_neon_element64(tmp, a->vd, pass, MO_64);
 tcg_temp_free_i64(tmp);
 }
 tcg_temp_free_i64(constimm);
@@ -1375,8 +1375,8 @@ static bool do_2shift_narrow_64(DisasContext *s, arg_2reg_shift *a,
 rd = tcg_temp_new_i32();
 
 /* Load both inputs first to avoid potential overwrite if rm == rd */
-neon_load_reg64(rm1, a->vm);
-neon_load_reg64(rm2, a->vm + 1);
+read_neon_element64(rm1, a->vm, 0, MO_64);
+read_neon_element64(rm2, a->vm, 1, MO_64);
 
 shiftfn(rm1, rm1, constimm);
 narrowfn(rd, cpu_env, rm1);
@@ -1579,7 +1579,7 @@ static bool do_vshll_2sh(DisasContext *s, arg_2reg_shift *a,
 tcg_gen_shli_i64(tmp, tmp, a->shift);
 tcg_gen_andi_i64(tmp, tmp, ~widen_mask);
 }
-neon_store_reg64(tmp, a->vd);
+write_neon_element64(tmp, a->vd, 0, MO_64);
 
 widenfn(tmp, rm1);
 tcg_temp_free_i32(rm1);
@@ -1587,7 +1587,7 @@ static bool do_vshll_2sh(DisasContext *s, arg_2reg_shift *a,
 tcg_gen_shli_i64(tmp, tmp, a->shift);
 tcg_gen_andi_i64(tmp, tmp, ~widen_mask);
 }
-neon_store_reg64(tmp, a->vd + 1);
+write_neon_element64(tmp, a->vd, 1, MO_64);
 tcg_temp_free_i64(tmp);
 return true;
 }
@@ -1822,7 +1822,7 @@ static bool do_prewiden_3d(DisasContext *s, arg_3diff *a,
 rm_64 = tcg_temp_new_i64();
 
 if (src1_wide) {
-neon_load_reg64(rn0_64, a->vn);
+read_neon_element64(rn0_64, a->vn, 0, MO_64);
 } else {
 TCGv_i32 tmp = tcg_temp_new_i32();
 read_neon_element32(tmp, a->vn, 0, MO_32);
@@ -1841,7 +1841,7 @@ static bool do_prewiden_3d(DisasContext *s, arg_3diff *a,
  * avoid incorrect results if a narrow input overlaps with the result.
  */
 if (src1_wide) {
-neon_load_reg64(rn1_64, a->vn + 1);
+read_neon_element64(rn1_64, a->vn, 1, MO_64);
 } else {
 TCGv_i32 tmp = tcg_temp_new_i32();
 read_neon_element32(tmp, a->vn, 1, MO_32);
@@ -1851,12 +1851,12 @@ static bool do_prewiden_3d(DisasContext *s, arg_3diff *a,
 rm = tcg_temp_new_i32();
 read_neon_element32(rm, a->vm, 1, MO_32);
 
-neon_store_reg64(rn0_64, a->vd);
+write_neon_element64(rn0_64, a->vd, 0, MO_64);
 
 widenfn(rm_64, rm);
 tcg_temp_free_i32(rm);
 opfn(rn1_64, rn1_64, rm_64);
-neon_store_reg64(rn1_64, a->vd + 1);
+write_neon_element64(rn1_64, a->vd, 1, MO_64);
 
 tcg_temp_free_i64(rn0_64);
 tcg_temp_free_i64(rn1_64);
@@ -1928,15 +1928,15 @@ static bool do_narrow_3d(DisasContext *s, arg_3diff *a,
 rd0 = tcg_temp_new_i32();
 rd1 = tcg_temp_new_i32();
 
-neon_load_reg64(rn_64, a->vn);
-neon_load_reg64(rm_64, a->vm);
+read_neon_element64(rn_64, a->vn, 0, MO_64);
+read_neon_element64(rm_64, a->vm, 0, MO_64);
 
 opfn(rn_64, rn_64, rm_64);
 
 narrowfn(rd0, rn_64);
 
-neon_load_reg64(rn_64, a->vn + 1);
-neon_load_reg64(rm_64, a->vm + 1);
+read_neon_element64(rn_64, a->

[PATCH v2 09/11] target/arm: Rename neon_load_reg64 to vfp_load_reg64

2020-10-29 Thread Richard Henderson
The only uses of this function are for loading VFP
double-precision values, and nothing to do with NEON.

Signed-off-by: Richard Henderson 
---
 target/arm/translate.c |  8 ++--
 target/arm/translate-vfp.c.inc | 84 +-
 2 files changed, 46 insertions(+), 46 deletions(-)

diff --git a/target/arm/translate.c b/target/arm/translate.c
index 4fb0a62200..7611c1f0f1 100644
--- a/target/arm/translate.c
+++ b/target/arm/translate.c
@@ -1132,14 +1132,14 @@ static long vfp_reg_offset(bool dp, unsigned reg)
 }
 }
 
-static inline void neon_load_reg64(TCGv_i64 var, int reg)
+static inline void vfp_load_reg64(TCGv_i64 var, int reg)
 {
-tcg_gen_ld_i64(var, cpu_env, vfp_reg_offset(1, reg));
+tcg_gen_ld_i64(var, cpu_env, vfp_reg_offset(true, reg));
 }
 
-static inline void neon_store_reg64(TCGv_i64 var, int reg)
+static inline void vfp_store_reg64(TCGv_i64 var, int reg)
 {
-tcg_gen_st_i64(var, cpu_env, vfp_reg_offset(1, reg));
+tcg_gen_st_i64(var, cpu_env, vfp_reg_offset(true, reg));
 }
 
 static inline void vfp_load_reg32(TCGv_i32 var, int reg)
diff --git a/target/arm/translate-vfp.c.inc b/target/arm/translate-vfp.c.inc
index d2a9b658bb..f966de5b1f 100644
--- a/target/arm/translate-vfp.c.inc
+++ b/target/arm/translate-vfp.c.inc
@@ -236,8 +236,8 @@ static bool trans_VSEL(DisasContext *s, arg_VSEL *a)
 tcg_gen_ext_i32_i64(nf, cpu_NF);
 tcg_gen_ext_i32_i64(vf, cpu_VF);
 
-neon_load_reg64(frn, rn);
-neon_load_reg64(frm, rm);
+vfp_load_reg64(frn, rn);
+vfp_load_reg64(frm, rm);
 switch (a->cc) {
 case 0: /* eq: Z */
 tcg_gen_movcond_i64(TCG_COND_EQ, dest, zf, zero,
@@ -264,7 +264,7 @@ static bool trans_VSEL(DisasContext *s, arg_VSEL *a)
 tcg_temp_free_i64(tmp);
 break;
 }
-neon_store_reg64(dest, rd);
+vfp_store_reg64(dest, rd);
 tcg_temp_free_i64(frn);
 tcg_temp_free_i64(frm);
 tcg_temp_free_i64(dest);
@@ -385,9 +385,9 @@ static bool trans_VRINT(DisasContext *s, arg_VRINT *a)
 TCGv_i64 tcg_res;
 tcg_op = tcg_temp_new_i64();
 tcg_res = tcg_temp_new_i64();
-neon_load_reg64(tcg_op, rm);
+vfp_load_reg64(tcg_op, rm);
 gen_helper_rintd(tcg_res, tcg_op, fpst);
-neon_store_reg64(tcg_res, rd);
+vfp_store_reg64(tcg_res, rd);
 tcg_temp_free_i64(tcg_op);
 tcg_temp_free_i64(tcg_res);
 } else {
@@ -463,7 +463,7 @@ static bool trans_VCVT(DisasContext *s, arg_VCVT *a)
 tcg_double = tcg_temp_new_i64();
 tcg_res = tcg_temp_new_i64();
 tcg_tmp = tcg_temp_new_i32();
-neon_load_reg64(tcg_double, rm);
+vfp_load_reg64(tcg_double, rm);
 if (is_signed) {
 gen_helper_vfp_tosld(tcg_res, tcg_double, tcg_shift, fpst);
 } else {
@@ -1002,9 +1002,9 @@ static bool trans_VLDR_VSTR_dp(DisasContext *s, arg_VLDR_VSTR_dp *a)
 tmp = tcg_temp_new_i64();
 if (a->l) {
 gen_aa32_ld64(s, tmp, addr, get_mem_index(s));
-neon_store_reg64(tmp, a->vd);
+vfp_store_reg64(tmp, a->vd);
 } else {
-neon_load_reg64(tmp, a->vd);
+vfp_load_reg64(tmp, a->vd);
 gen_aa32_st64(s, tmp, addr, get_mem_index(s));
 }
 tcg_temp_free_i64(tmp);
@@ -1149,10 +1149,10 @@ static bool trans_VLDM_VSTM_dp(DisasContext *s, arg_VLDM_VSTM_dp *a)
 if (a->l) {
 /* load */
 gen_aa32_ld64(s, tmp, addr, get_mem_index(s));
-neon_store_reg64(tmp, a->vd + i);
+vfp_store_reg64(tmp, a->vd + i);
 } else {
 /* store */
-neon_load_reg64(tmp, a->vd + i);
+vfp_load_reg64(tmp, a->vd + i);
 gen_aa32_st64(s, tmp, addr, get_mem_index(s));
 }
 tcg_gen_addi_i32(addr, addr, offset);
@@ -1416,15 +1416,15 @@ static bool do_vfp_3op_dp(DisasContext *s, VFPGen3OpDPFn *fn,
 fd = tcg_temp_new_i64();
 fpst = fpstatus_ptr(FPST_FPCR);
 
-neon_load_reg64(f0, vn);
-neon_load_reg64(f1, vm);
+vfp_load_reg64(f0, vn);
+vfp_load_reg64(f1, vm);
 
 for (;;) {
 if (reads_vd) {
-neon_load_reg64(fd, vd);
+vfp_load_reg64(fd, vd);
 }
 fn(fd, f0, f1, fpst);
-neon_store_reg64(fd, vd);
+vfp_store_reg64(fd, vd);
 
 if (veclen == 0) {
 break;
@@ -1433,10 +1433,10 @@ static bool do_vfp_3op_dp(DisasContext *s, VFPGen3OpDPFn *fn,
 veclen--;
 vd = vfp_advance_dreg(vd, delta_d);
 vn = vfp_advance_dreg(vn, delta_d);
-neon_load_reg64(f0, vn);
+vfp_load_reg64(f0, vn);
 if (delta_m) {
 vm = vfp_advance_dreg(vm, delta_m);
-neon_load_reg64(f1, vm);
+vfp_load_reg64(f1, vm);
 }
 }
 
@@ -1599,11 +1599,11 @@ static bool do_vfp_2op_dp(DisasContext *s, VFPGen2OpDPFn *fn, int vd, int vm)
 f0 = tcg_temp_new_i64();
  

[PATCH v2 04/11] target/arm: Use neon_element_offset in vfp_reg_offset

2020-10-29 Thread Richard Henderson
This seems a bit more readable than using offsetof CPU_DoubleU.

Signed-off-by: Richard Henderson 
---
 target/arm/translate.c | 13 -
 1 file changed, 4 insertions(+), 9 deletions(-)

diff --git a/target/arm/translate.c b/target/arm/translate.c
index 88a926d1df..88ded4ac2c 100644
--- a/target/arm/translate.c
+++ b/target/arm/translate.c
@@ -1122,18 +1122,13 @@ static long neon_element_offset(int reg, int element, MemOp size)
 return neon_full_reg_offset(reg) + ofs;
 }
 
-static inline long vfp_reg_offset(bool dp, unsigned reg)
+/* Return the offset of a VFP Dreg (dp = true) or VFP Sreg (dp = false). */
+static long vfp_reg_offset(bool dp, unsigned reg)
 {
 if (dp) {
-return offsetof(CPUARMState, vfp.zregs[reg >> 1].d[reg & 1]);
+return neon_element_offset(reg, 0, MO_64);
 } else {
-long ofs = offsetof(CPUARMState, vfp.zregs[reg >> 2].d[(reg >> 1) & 1]);
-if (reg & 1) {
-ofs += offsetof(CPU_DoubleU, l.upper);
-} else {
-ofs += offsetof(CPU_DoubleU, l.lower);
-}
-return ofs;
+return neon_element_offset(reg >> 1, reg & 1, MO_32);
 }
 }
 
-- 
2.25.1




[PATCH v2 07/11] target/arm: Rename neon_load_reg32 to vfp_load_reg32

2020-10-29 Thread Richard Henderson
The only uses of this function are for loading VFP
single-precision values, and nothing to do with NEON.

Signed-off-by: Richard Henderson 
---
 target/arm/translate.c |   4 +-
 target/arm/translate-vfp.c.inc | 184 -
 2 files changed, 94 insertions(+), 94 deletions(-)

diff --git a/target/arm/translate.c b/target/arm/translate.c
index 55d5f4ed73..8491ab705b 100644
--- a/target/arm/translate.c
+++ b/target/arm/translate.c
@@ -1142,12 +1142,12 @@ static inline void neon_store_reg64(TCGv_i64 var, int reg)
 tcg_gen_st_i64(var, cpu_env, vfp_reg_offset(1, reg));
 }
 
-static inline void neon_load_reg32(TCGv_i32 var, int reg)
+static inline void vfp_load_reg32(TCGv_i32 var, int reg)
 {
 tcg_gen_ld_i32(var, cpu_env, vfp_reg_offset(false, reg));
 }
 
-static inline void neon_store_reg32(TCGv_i32 var, int reg)
+static inline void vfp_store_reg32(TCGv_i32 var, int reg)
 {
 tcg_gen_st_i32(var, cpu_env, vfp_reg_offset(false, reg));
 }
diff --git a/target/arm/translate-vfp.c.inc b/target/arm/translate-vfp.c.inc
index 28f22f9872..d2a9b658bb 100644
--- a/target/arm/translate-vfp.c.inc
+++ b/target/arm/translate-vfp.c.inc
@@ -283,8 +283,8 @@ static bool trans_VSEL(DisasContext *s, arg_VSEL *a)
 frn = tcg_temp_new_i32();
 frm = tcg_temp_new_i32();
 dest = tcg_temp_new_i32();
-neon_load_reg32(frn, rn);
-neon_load_reg32(frm, rm);
+vfp_load_reg32(frn, rn);
+vfp_load_reg32(frm, rm);
 switch (a->cc) {
 case 0: /* eq: Z */
 tcg_gen_movcond_i32(TCG_COND_EQ, dest, cpu_ZF, zero,
@@ -315,7 +315,7 @@ static bool trans_VSEL(DisasContext *s, arg_VSEL *a)
 if (sz == 1) {
tcg_gen_andi_i32(dest, dest, 0xffff);
 }
-neon_store_reg32(dest, rd);
+vfp_store_reg32(dest, rd);
 tcg_temp_free_i32(frn);
 tcg_temp_free_i32(frm);
 tcg_temp_free_i32(dest);
@@ -395,13 +395,13 @@ static bool trans_VRINT(DisasContext *s, arg_VRINT *a)
 TCGv_i32 tcg_res;
 tcg_op = tcg_temp_new_i32();
 tcg_res = tcg_temp_new_i32();
-neon_load_reg32(tcg_op, rm);
+vfp_load_reg32(tcg_op, rm);
 if (sz == 1) {
 gen_helper_rinth(tcg_res, tcg_op, fpst);
 } else {
 gen_helper_rints(tcg_res, tcg_op, fpst);
 }
-neon_store_reg32(tcg_res, rd);
+vfp_store_reg32(tcg_res, rd);
 tcg_temp_free_i32(tcg_op);
 tcg_temp_free_i32(tcg_res);
 }
@@ -470,7 +470,7 @@ static bool trans_VCVT(DisasContext *s, arg_VCVT *a)
 gen_helper_vfp_tould(tcg_res, tcg_double, tcg_shift, fpst);
 }
 tcg_gen_extrl_i64_i32(tcg_tmp, tcg_res);
-neon_store_reg32(tcg_tmp, rd);
+vfp_store_reg32(tcg_tmp, rd);
 tcg_temp_free_i32(tcg_tmp);
 tcg_temp_free_i64(tcg_res);
 tcg_temp_free_i64(tcg_double);
@@ -478,7 +478,7 @@ static bool trans_VCVT(DisasContext *s, arg_VCVT *a)
 TCGv_i32 tcg_single, tcg_res;
 tcg_single = tcg_temp_new_i32();
 tcg_res = tcg_temp_new_i32();
-neon_load_reg32(tcg_single, rm);
+vfp_load_reg32(tcg_single, rm);
 if (sz == 1) {
 if (is_signed) {
 gen_helper_vfp_toslh(tcg_res, tcg_single, tcg_shift, fpst);
@@ -492,7 +492,7 @@ static bool trans_VCVT(DisasContext *s, arg_VCVT *a)
 gen_helper_vfp_touls(tcg_res, tcg_single, tcg_shift, fpst);
 }
 }
-neon_store_reg32(tcg_res, rd);
+vfp_store_reg32(tcg_res, rd);
 tcg_temp_free_i32(tcg_res);
 tcg_temp_free_i32(tcg_single);
 }
@@ -776,14 +776,14 @@ static bool trans_VMOV_half(DisasContext *s, arg_VMOV_single *a)
 if (a->l) {
 /* VFP to general purpose register */
 tmp = tcg_temp_new_i32();
-neon_load_reg32(tmp, a->vn);
+vfp_load_reg32(tmp, a->vn);
tcg_gen_andi_i32(tmp, tmp, 0xffff);
 store_reg(s, a->rt, tmp);
 } else {
 /* general purpose register to VFP */
 tmp = load_reg(s, a->rt);
tcg_gen_andi_i32(tmp, tmp, 0xffff);
-neon_store_reg32(tmp, a->vn);
+vfp_store_reg32(tmp, a->vn);
 tcg_temp_free_i32(tmp);
 }
 
@@ -805,7 +805,7 @@ static bool trans_VMOV_single(DisasContext *s, arg_VMOV_single *a)
 if (a->l) {
 /* VFP to general purpose register */
 tmp = tcg_temp_new_i32();
-neon_load_reg32(tmp, a->vn);
+vfp_load_reg32(tmp, a->vn);
 if (a->rt == 15) {
 /* Set the 4 flag bits in the CPSR.  */
 gen_set_nzcv(tmp);
@@ -816,7 +816,7 @@ static bool trans_VMOV_single(DisasContext *s, arg_VMOV_single *a)
 } else {
 /* general purpose register to VFP */
 tmp = load_reg(s, a->rt);
-neon_store_reg32(tmp, a->vn);
+vfp_store_reg32(tmp, a->vn);
 tcg_temp_free_i32(tmp);
 }
 
@@ -842,18 +842,18 @@ static bool trans_VMOV_64_sp(

[PATCH v2 06/11] target/arm: Expand read/write_neon_element32 to all MemOp

2020-10-29 Thread Richard Henderson
We can then use this to improve VMOV (scalar to gp) and
VMOV (gp to scalar) so that we simply perform the memory
operation that we wanted, rather than inserting or
extracting from a 32-bit quantity.

These were the last uses of neon_load/store_reg, so remove them.
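The equivalence being relied on can be checked in plain C (a standalone sketch with made-up helper names, not the QEMU API): a direct sized, sign- or zero-extending access to a byte element — what `read_neon_element32` with `MO_SB`/`MO_UB` expresses — gives the same value as loading the containing 32-bit lane and shifting/extending, which is what the removed open-coded sequence did:

```c
#include <assert.h>
#include <stdint.h>

/* Old-style extraction: load the whole 32-bit lane, shift the wanted
 * byte down, then sign- or zero-extend it. */
static int32_t lane_extract_u8(uint32_t lane, int byte, int is_signed)
{
    uint32_t t = lane >> (byte * 8);
    return is_signed ? (int32_t)(int8_t)t : (int32_t)(uint8_t)t;
}

/* New-style: address the element directly with a sized, extending
 * access (the MO_SB/MO_UB idea, modeled on a byte array). */
static int32_t element_load_u8(const uint8_t *reg, int ele, int is_signed)
{
    return is_signed ? (int32_t)(int8_t)reg[ele] : (int32_t)reg[ele];
}
```

Both helper names are hypothetical; the test below confirms the two paths agree for every byte position and signedness.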

Signed-off-by: Richard Henderson 
---
 target/arm/translate.c | 50 +---
 target/arm/translate-vfp.c.inc | 71 +-
 2 files changed, 37 insertions(+), 84 deletions(-)

diff --git a/target/arm/translate.c b/target/arm/translate.c
index 0ed9eab0b0..55d5f4ed73 100644
--- a/target/arm/translate.c
+++ b/target/arm/translate.c
@@ -1106,9 +1106,9 @@ static long neon_full_reg_offset(unsigned reg)
  * Return the offset of a 2**SIZE piece of a NEON register, at index ELE,
  * where 0 is the least significant end of the register.
  */
-static long neon_element_offset(int reg, int element, MemOp size)
+static long neon_element_offset(int reg, int element, MemOp memop)
 {
-int element_size = 1 << size;
+int element_size = 1 << (memop & MO_SIZE);
 int ofs = element * element_size;
 #ifdef HOST_WORDS_BIGENDIAN
 /*
@@ -1132,19 +1132,6 @@ static long vfp_reg_offset(bool dp, unsigned reg)
 }
 }
 
-static TCGv_i32 neon_load_reg(int reg, int pass)
-{
-TCGv_i32 tmp = tcg_temp_new_i32();
-tcg_gen_ld_i32(tmp, cpu_env, neon_element_offset(reg, pass, MO_32));
-return tmp;
-}
-
-static void neon_store_reg(int reg, int pass, TCGv_i32 var)
-{
-tcg_gen_st_i32(var, cpu_env, neon_element_offset(reg, pass, MO_32));
-tcg_temp_free_i32(var);
-}
-
 static inline void neon_load_reg64(TCGv_i64 var, int reg)
 {
 tcg_gen_ld_i64(var, cpu_env, vfp_reg_offset(1, reg));
@@ -1165,12 +1152,25 @@ static inline void neon_store_reg32(TCGv_i32 var, int reg)
 tcg_gen_st_i32(var, cpu_env, vfp_reg_offset(false, reg));
 }
 
-static void read_neon_element32(TCGv_i32 dest, int reg, int ele, MemOp size)
+static void read_neon_element32(TCGv_i32 dest, int reg, int ele, MemOp memop)
 {
-long off = neon_element_offset(reg, ele, size);
+long off = neon_element_offset(reg, ele, memop);
 
-switch (size) {
-case MO_32:
+switch (memop) {
+case MO_SB:
+tcg_gen_ld8s_i32(dest, cpu_env, off);
+break;
+case MO_UB:
+tcg_gen_ld8u_i32(dest, cpu_env, off);
+break;
+case MO_SW:
+tcg_gen_ld16s_i32(dest, cpu_env, off);
+break;
+case MO_UW:
+tcg_gen_ld16u_i32(dest, cpu_env, off);
+break;
+case MO_UL:
+case MO_SL:
 tcg_gen_ld_i32(dest, cpu_env, off);
 break;
 default:
@@ -1178,11 +1178,17 @@ static void read_neon_element32(TCGv_i32 dest, int reg, int ele, MemOp size)
 }
 }
 
-static void write_neon_element32(TCGv_i32 src, int reg, int ele, MemOp size)
+static void write_neon_element32(TCGv_i32 src, int reg, int ele, MemOp memop)
 {
-long off = neon_element_offset(reg, ele, size);
+long off = neon_element_offset(reg, ele, memop);
 
-switch (size) {
+switch (memop) {
+case MO_8:
+tcg_gen_st8_i32(src, cpu_env, off);
+break;
+case MO_16:
+tcg_gen_st16_i32(src, cpu_env, off);
+break;
 case MO_32:
 tcg_gen_st_i32(src, cpu_env, off);
 break;
diff --git a/target/arm/translate-vfp.c.inc b/target/arm/translate-vfp.c.inc
index 368bae0a73..28f22f9872 100644
--- a/target/arm/translate-vfp.c.inc
+++ b/target/arm/translate-vfp.c.inc
@@ -511,11 +511,9 @@ static bool trans_VMOV_to_gp(DisasContext *s, arg_VMOV_to_gp *a)
 {
 /* VMOV scalar to general purpose register */
 TCGv_i32 tmp;
-int pass;
-uint32_t offset;
 
-/* SIZE == 2 is a VFP instruction; otherwise NEON.  */
-if (a->size == 2
+/* SIZE == MO_32 is a VFP instruction; otherwise NEON.  */
+if (a->size == MO_32
 ? !dc_isar_feature(aa32_fpsp_v2, s)
 : !arm_dc_feature(s, ARM_FEATURE_NEON)) {
 return false;
@@ -526,44 +524,12 @@ static bool trans_VMOV_to_gp(DisasContext *s, arg_VMOV_to_gp *a)
 return false;
 }
 
-offset = a->index << a->size;
-pass = extract32(offset, 2, 1);
-offset = extract32(offset, 0, 2) * 8;
-
 if (!vfp_access_check(s)) {
 return true;
 }
 
-tmp = neon_load_reg(a->vn, pass);
-switch (a->size) {
-case 0:
-if (offset) {
-tcg_gen_shri_i32(tmp, tmp, offset);
-}
-if (a->u) {
-gen_uxtb(tmp);
-} else {
-gen_sxtb(tmp);
-}
-break;
-case 1:
-if (a->u) {
-if (offset) {
-tcg_gen_shri_i32(tmp, tmp, 16);
-} else {
-gen_uxth(tmp);
-}
-} else {
-if (offset) {
-tcg_gen_sari_i32(tmp, tmp, 16);
-} else {
-gen_sxth(tmp);
-}
-}
-break;
-case 2:
-break;
-}
+tmp = tcg_temp_new_i32();
+read_neon_e

[PATCH v2 02/11] target/arm: Move neon_element_offset to translate.c

2020-10-29 Thread Richard Henderson
This will shortly have users outside of translate-neon.c.inc.

Signed-off-by: Richard Henderson 
---
 target/arm/translate.c  | 20 
 target/arm/translate-neon.c.inc | 19 ---
 2 files changed, 20 insertions(+), 19 deletions(-)

diff --git a/target/arm/translate.c b/target/arm/translate.c
index 1b61e50f9c..bf0b5cac61 100644
--- a/target/arm/translate.c
+++ b/target/arm/translate.c
@@ -1102,6 +1102,26 @@ static long neon_full_reg_offset(unsigned reg)
 return offsetof(CPUARMState, vfp.zregs[reg >> 1].d[reg & 1]);
 }
 
+/*
+ * Return the offset of a 2**SIZE piece of a NEON register, at index ELE,
+ * where 0 is the least significant end of the register.
+ */
+static long neon_element_offset(int reg, int element, MemOp size)
+{
+int element_size = 1 << size;
+int ofs = element * element_size;
+#ifdef HOST_WORDS_BIGENDIAN
+/*
+ * Calculate the offset assuming fully little-endian,
+ * then XOR to account for the order of the 8-byte units.
+ */
+if (element_size < 8) {
+ofs ^= 8 - element_size;
+}
+#endif
+return neon_full_reg_offset(reg) + ofs;
+}
+
 static inline long vfp_reg_offset(bool dp, unsigned reg)
 {
 if (dp) {
diff --git a/target/arm/translate-neon.c.inc b/target/arm/translate-neon.c.inc
index e259e24c05..96ab2248fc 100644
--- a/target/arm/translate-neon.c.inc
+++ b/target/arm/translate-neon.c.inc
@@ -60,25 +60,6 @@ static inline int neon_3same_fp_size(DisasContext *s, int x)
 #include "decode-neon-ls.c.inc"
 #include "decode-neon-shared.c.inc"
 
-/* Return the offset of a 2**SIZE piece of a NEON register, at index ELE,
- * where 0 is the least significant end of the register.
- */
-static inline long
-neon_element_offset(int reg, int element, MemOp size)
-{
-int element_size = 1 << size;
-int ofs = element * element_size;
-#ifdef HOST_WORDS_BIGENDIAN
-/* Calculate the offset assuming fully little-endian,
- * then XOR to account for the order of the 8-byte units.
- */
-if (element_size < 8) {
-ofs ^= 8 - element_size;
-}
-#endif
-return neon_full_reg_offset(reg) + ofs;
-}
-
 static void neon_load_element(TCGv_i32 var, int reg, int ele, MemOp mop)
 {
 long offset = neon_element_offset(reg, ele, mop & MO_SIZE);
-- 
2.25.1




[PATCH v2 00/11] target/arm: Fix neon reg offsets

2020-10-29 Thread Richard Henderson
Much of the existing usage of neon_reg_offset is broken for
big-endian hosts, as it computes the offset of the first
32-bit unit, not the offset of the entire vector register.

Fix this by separating out the different usages.  Make the
whole thing look a bit more like the aarch64 code.

Changes for v2:
  * Fix two tcg temp leaks.

r~

Richard Henderson (11):
  target/arm: Introduce neon_full_reg_offset
  target/arm: Move neon_element_offset to translate.c
  target/arm: Use neon_element_offset in neon_load/store_reg
  target/arm: Use neon_element_offset in vfp_reg_offset
  target/arm: Add read/write_neon_element32
  target/arm: Expand read/write_neon_element32 to all MemOp
  target/arm: Rename neon_load_reg32 to vfp_load_reg32
  target/arm: Add read/write_neon_element64
  target/arm: Rename neon_load_reg64 to vfp_load_reg64
  target/arm: Simplify do_long_3d and do_2scalar_long
  target/arm: Improve do_prewiden_3d

 target/arm/translate.c  | 153 ---
 target/arm/translate-neon.c.inc | 472 +---
 target/arm/translate-vfp.c.inc  | 341 ++-
 3 files changed, 516 insertions(+), 450 deletions(-)

-- 
2.25.1




[PATCH v2 01/11] target/arm: Introduce neon_full_reg_offset

2020-10-29 Thread Richard Henderson
This function makes it clear that we're talking about the whole
register, and not the 32-bit piece at index 0.  This fixes a bug
when running on a big-endian host.

Signed-off-by: Richard Henderson 
---
 target/arm/translate.c  |  8 ++
 target/arm/translate-neon.c.inc | 44 -
 target/arm/translate-vfp.c.inc  |  2 +-
 3 files changed, 31 insertions(+), 23 deletions(-)

diff --git a/target/arm/translate.c b/target/arm/translate.c
index 38371db540..1b61e50f9c 100644
--- a/target/arm/translate.c
+++ b/target/arm/translate.c
@@ -1094,6 +1094,14 @@ static inline void gen_hlt(DisasContext *s, int imm)
 unallocated_encoding(s);
 }
 
+/*
+ * Return the offset of a "full" NEON Dreg.
+ */
+static long neon_full_reg_offset(unsigned reg)
+{
+return offsetof(CPUARMState, vfp.zregs[reg >> 1].d[reg & 1]);
+}
+
 static inline long vfp_reg_offset(bool dp, unsigned reg)
 {
 if (dp) {
diff --git a/target/arm/translate-neon.c.inc b/target/arm/translate-neon.c.inc
index 4d1a292981..e259e24c05 100644
--- a/target/arm/translate-neon.c.inc
+++ b/target/arm/translate-neon.c.inc
@@ -76,7 +76,7 @@ neon_element_offset(int reg, int element, MemOp size)
 ofs ^= 8 - element_size;
 }
 #endif
-return neon_reg_offset(reg, 0) + ofs;
+return neon_full_reg_offset(reg) + ofs;
 }
 
 static void neon_load_element(TCGv_i32 var, int reg, int ele, MemOp mop)
@@ -585,12 +585,12 @@ static bool trans_VLD_all_lanes(DisasContext *s, 
arg_VLD_all_lanes *a)
  * We cannot write 16 bytes at once because the
  * destination is unaligned.
  */
-tcg_gen_gvec_dup_i32(size, neon_reg_offset(vd, 0),
+tcg_gen_gvec_dup_i32(size, neon_full_reg_offset(vd),
  8, 8, tmp);
-tcg_gen_gvec_mov(0, neon_reg_offset(vd + 1, 0),
- neon_reg_offset(vd, 0), 8, 8);
+tcg_gen_gvec_mov(0, neon_full_reg_offset(vd + 1),
+ neon_full_reg_offset(vd), 8, 8);
 } else {
-tcg_gen_gvec_dup_i32(size, neon_reg_offset(vd, 0),
+tcg_gen_gvec_dup_i32(size, neon_full_reg_offset(vd),
  vec_size, vec_size, tmp);
 }
 tcg_gen_addi_i32(addr, addr, 1 << size);
@@ -691,9 +691,9 @@ static bool trans_VLDST_single(DisasContext *s, 
arg_VLDST_single *a)
 static bool do_3same(DisasContext *s, arg_3same *a, GVecGen3Fn fn)
 {
 int vec_size = a->q ? 16 : 8;
-int rd_ofs = neon_reg_offset(a->vd, 0);
-int rn_ofs = neon_reg_offset(a->vn, 0);
-int rm_ofs = neon_reg_offset(a->vm, 0);
+int rd_ofs = neon_full_reg_offset(a->vd);
+int rn_ofs = neon_full_reg_offset(a->vn);
+int rm_ofs = neon_full_reg_offset(a->vm);
 
 if (!arm_dc_feature(s, ARM_FEATURE_NEON)) {
 return false;
@@ -1177,8 +1177,8 @@ static bool do_vector_2sh(DisasContext *s, arg_2reg_shift 
*a, GVecGen2iFn *fn)
 {
 /* Handle a 2-reg-shift insn which can be vectorized. */
 int vec_size = a->q ? 16 : 8;
-int rd_ofs = neon_reg_offset(a->vd, 0);
-int rm_ofs = neon_reg_offset(a->vm, 0);
+int rd_ofs = neon_full_reg_offset(a->vd);
+int rm_ofs = neon_full_reg_offset(a->vm);
 
 if (!arm_dc_feature(s, ARM_FEATURE_NEON)) {
 return false;
@@ -1620,8 +1620,8 @@ static bool do_fp_2sh(DisasContext *s, arg_2reg_shift *a,
 {
 /* FP operations in 2-reg-and-shift group */
 int vec_size = a->q ? 16 : 8;
-int rd_ofs = neon_reg_offset(a->vd, 0);
-int rm_ofs = neon_reg_offset(a->vm, 0);
+int rd_ofs = neon_full_reg_offset(a->vd);
+int rm_ofs = neon_full_reg_offset(a->vm);
 TCGv_ptr fpst;
 
 if (!arm_dc_feature(s, ARM_FEATURE_NEON)) {
@@ -1756,7 +1756,7 @@ static bool do_1reg_imm(DisasContext *s, arg_1reg_imm *a,
 return true;
 }
 
-reg_ofs = neon_reg_offset(a->vd, 0);
+reg_ofs = neon_full_reg_offset(a->vd);
 vec_size = a->q ? 16 : 8;
 imm = asimd_imm_const(a->imm, a->cmode, a->op);
 
@@ -2300,9 +2300,9 @@ static bool trans_VMULL_P_3d(DisasContext *s, arg_3diff 
*a)
 return true;
 }
 
-tcg_gen_gvec_3_ool(neon_reg_offset(a->vd, 0),
-   neon_reg_offset(a->vn, 0),
-   neon_reg_offset(a->vm, 0),
+tcg_gen_gvec_3_ool(neon_full_reg_offset(a->vd),
+   neon_full_reg_offset(a->vn),
+   neon_full_reg_offset(a->vm),
16, 16, 0, fn_gvec);
 return true;
 }
@@ -2445,8 +2445,8 @@ static bool do_2scalar_fp_vec(DisasContext *s, 
arg_2scalar *a,
 {
 /* Two registers and a scalar, using gvec */
 int vec_size = a->q ? 16 : 8;
-int rd_ofs = neon_reg_offset(a->vd, 0);
-int rn_ofs = neon_reg_offset(a->vn, 0);
+int rd_ofs = neon_full_reg_offset(a->vd);
+int rn_ofs = neon_full_reg_offset(a->vn);
 int rm_ofs;
 int idx;
 TCGv_ptr fpstatus;
@@ -2477,7 +2477,7 @@ static bool do_2scalar_fp_vec(DisasContex

[PATCH v2 03/11] target/arm: Use neon_element_offset in neon_load/store_reg

2020-10-29 Thread Richard Henderson
These are the only users of neon_reg_offset, so remove that.

Signed-off-by: Richard Henderson 
---
 target/arm/translate.c | 14 ++
 1 file changed, 2 insertions(+), 12 deletions(-)

diff --git a/target/arm/translate.c b/target/arm/translate.c
index bf0b5cac61..88a926d1df 100644
--- a/target/arm/translate.c
+++ b/target/arm/translate.c
@@ -1137,26 +1137,16 @@ static inline long vfp_reg_offset(bool dp, unsigned reg)
 }
 }
 
-/* Return the offset of a 32-bit piece of a NEON register.
-   zero is the least significant end of the register.  */
-static inline long
-neon_reg_offset (int reg, int n)
-{
-int sreg;
-sreg = reg * 2 + n;
-return vfp_reg_offset(0, sreg);
-}
-
 static TCGv_i32 neon_load_reg(int reg, int pass)
 {
 TCGv_i32 tmp = tcg_temp_new_i32();
-tcg_gen_ld_i32(tmp, cpu_env, neon_reg_offset(reg, pass));
+tcg_gen_ld_i32(tmp, cpu_env, neon_element_offset(reg, pass, MO_32));
 return tmp;
 }
 
 static void neon_store_reg(int reg, int pass, TCGv_i32 var)
 {
-tcg_gen_st_i32(var, cpu_env, neon_reg_offset(reg, pass));
+tcg_gen_st_i32(var, cpu_env, neon_element_offset(reg, pass, MO_32));
 tcg_temp_free_i32(var);
 }
 
-- 
2.25.1




[PATCH v2 05/11] target/arm: Add read/write_neon_element32

2020-10-29 Thread Richard Henderson
Model these off the aa64 read/write_vec_element functions.
Use them within translate-neon.c.inc.  The new functions do
not allocate or free temps, so this rearranges the calling
code a bit.

Signed-off-by: Richard Henderson 
---
 target/arm/translate.c  |  26 
 target/arm/translate-neon.c.inc | 256 
 2 files changed, 183 insertions(+), 99 deletions(-)

diff --git a/target/arm/translate.c b/target/arm/translate.c
index 88ded4ac2c..0ed9eab0b0 100644
--- a/target/arm/translate.c
+++ b/target/arm/translate.c
@@ -1165,6 +1165,32 @@ static inline void neon_store_reg32(TCGv_i32 var, int 
reg)
 tcg_gen_st_i32(var, cpu_env, vfp_reg_offset(false, reg));
 }
 
+static void read_neon_element32(TCGv_i32 dest, int reg, int ele, MemOp size)
+{
+long off = neon_element_offset(reg, ele, size);
+
+switch (size) {
+case MO_32:
+tcg_gen_ld_i32(dest, cpu_env, off);
+break;
+default:
+g_assert_not_reached();
+}
+}
+
+static void write_neon_element32(TCGv_i32 src, int reg, int ele, MemOp size)
+{
+long off = neon_element_offset(reg, ele, size);
+
+switch (size) {
+case MO_32:
+tcg_gen_st_i32(src, cpu_env, off);
+break;
+default:
+g_assert_not_reached();
+}
+}
+
 static TCGv_ptr vfp_reg_ptr(bool dp, int reg)
 {
 TCGv_ptr ret = tcg_temp_new_ptr();
diff --git a/target/arm/translate-neon.c.inc b/target/arm/translate-neon.c.inc
index 96ab2248fc..549381703e 100644
--- a/target/arm/translate-neon.c.inc
+++ b/target/arm/translate-neon.c.inc
@@ -956,18 +956,24 @@ static bool do_3same_pair(DisasContext *s, arg_3same *a, 
NeonGenTwoOpFn *fn)
  * early. Since Q is 0 there are always just two passes, so instead
  * of a complicated loop over each pass we just unroll.
  */
-tmp = neon_load_reg(a->vn, 0);
-tmp2 = neon_load_reg(a->vn, 1);
+tmp = tcg_temp_new_i32();
+tmp2 = tcg_temp_new_i32();
+tmp3 = tcg_temp_new_i32();
+
+read_neon_element32(tmp, a->vn, 0, MO_32);
+read_neon_element32(tmp2, a->vn, 1, MO_32);
 fn(tmp, tmp, tmp2);
-tcg_temp_free_i32(tmp2);
 
-tmp3 = neon_load_reg(a->vm, 0);
-tmp2 = neon_load_reg(a->vm, 1);
+read_neon_element32(tmp3, a->vm, 0, MO_32);
+read_neon_element32(tmp2, a->vm, 1, MO_32);
 fn(tmp3, tmp3, tmp2);
-tcg_temp_free_i32(tmp2);
 
-neon_store_reg(a->vd, 0, tmp);
-neon_store_reg(a->vd, 1, tmp3);
+write_neon_element32(tmp, a->vd, 0, MO_32);
+write_neon_element32(tmp3, a->vd, 1, MO_32);
+
+tcg_temp_free_i32(tmp);
+tcg_temp_free_i32(tmp2);
+tcg_temp_free_i32(tmp3);
 return true;
 }
 
@@ -1275,7 +1281,7 @@ static bool do_2shift_env_32(DisasContext *s, 
arg_2reg_shift *a,
  * 2-reg-and-shift operations, size < 3 case, where the
  * helper needs to be passed cpu_env.
  */
-TCGv_i32 constimm;
+TCGv_i32 constimm, tmp;
 int pass;
 
 if (!arm_dc_feature(s, ARM_FEATURE_NEON)) {
@@ -1301,12 +1307,14 @@ static bool do_2shift_env_32(DisasContext *s, 
arg_2reg_shift *a,
  * by immediate using the variable shift operations.
  */
 constimm = tcg_const_i32(dup_const(a->size, a->shift));
+tmp = tcg_temp_new_i32();
 
 for (pass = 0; pass < (a->q ? 4 : 2); pass++) {
-TCGv_i32 tmp = neon_load_reg(a->vm, pass);
+read_neon_element32(tmp, a->vm, pass, MO_32);
 fn(tmp, cpu_env, tmp, constimm);
-neon_store_reg(a->vd, pass, tmp);
+write_neon_element32(tmp, a->vd, pass, MO_32);
 }
+tcg_temp_free_i32(tmp);
 tcg_temp_free_i32(constimm);
 return true;
 }
@@ -1364,21 +1372,21 @@ static bool do_2shift_narrow_64(DisasContext *s, 
arg_2reg_shift *a,
 constimm = tcg_const_i64(-a->shift);
 rm1 = tcg_temp_new_i64();
 rm2 = tcg_temp_new_i64();
+rd = tcg_temp_new_i32();
 
 /* Load both inputs first to avoid potential overwrite if rm == rd */
 neon_load_reg64(rm1, a->vm);
 neon_load_reg64(rm2, a->vm + 1);
 
 shiftfn(rm1, rm1, constimm);
-rd = tcg_temp_new_i32();
 narrowfn(rd, cpu_env, rm1);
-neon_store_reg(a->vd, 0, rd);
+write_neon_element32(rd, a->vd, 0, MO_32);
 
 shiftfn(rm2, rm2, constimm);
-rd = tcg_temp_new_i32();
 narrowfn(rd, cpu_env, rm2);
-neon_store_reg(a->vd, 1, rd);
+write_neon_element32(rd, a->vd, 1, MO_32);
 
+tcg_temp_free_i32(rd);
 tcg_temp_free_i64(rm1);
 tcg_temp_free_i64(rm2);
 tcg_temp_free_i64(constimm);
@@ -1428,10 +1436,14 @@ static bool do_2shift_narrow_32(DisasContext *s, 
arg_2reg_shift *a,
 constimm = tcg_const_i32(imm);
 
 /* Load all inputs first to avoid potential overwrite */
-rm1 = neon_load_reg(a->vm, 0);
-rm2 = neon_load_reg(a->vm, 1);
-rm3 = neon_load_reg(a->vm + 1, 0);
-rm4 = neon_load_reg(a->vm + 1, 1);
+rm1 = tcg_temp_new_i32();
+rm2 = tcg_temp_new_i32();
+rm3 = tcg_temp_new_i32();
+rm4 = tcg_temp_new_i32();
+read_neon_element32(rm1, a->vm, 0, MO_32);
+ 

Re: [PATCH v2 00/19] Mirror map JIT memory for TCG

2020-10-29 Thread no-reply
Patchew URL: 
https://patchew.org/QEMU/20201030004921.721096-1-richard.hender...@linaro.org/



Hi,

This series seems to have some coding style problems. See output below for
more information:

Type: series
Message-id: 20201030004921.721096-1-richard.hender...@linaro.org
Subject: [PATCH v2 00/19] Mirror map JIT memory for TCG

=== TEST SCRIPT BEGIN ===
#!/bin/bash
git rev-parse base > /dev/null || exit 0
git config --local diff.renamelimit 0
git config --local diff.renames True
git config --local diff.algorithm histogram
./scripts/checkpatch.pl --mailback base..
=== TEST SCRIPT END ===

Updating 3c8cf5a9c21ff8782164d1def7f44bd888713384
From https://github.com/patchew-project/qemu
 * [new tag] 
patchew/20201030004921.721096-1-richard.hender...@linaro.org -> 
patchew/20201030004921.721096-1-richard.hender...@linaro.org
Switched to a new branch 'test'
1cae0aa tcg/aarch64: Support split-rwx code generation
2a8bb6f tcg/aarch64: Implement flush_idcache_range manually
5aac937 tcg/aarch64: Use B not BL for tcg_out_goto_long
9f81275 tcg/i386: Support split-rwx code generation
5a11dd0 tcg: Return the rx mirror of TranslationBlock from exit_tb
af71330 RFC: accel/tcg: Support split-rwx for darwin/iOS with vm_remap
bf60965 accel/tcg: Support split-rwx for linux with memfd
90c1e77 tcg: Add --accel tcg,split-rwx property
d4805b8 tcg: Use Error with alloc_code_gen_buffer
9bfafc6 tcg: Make tb arg to synchronize_from_tb const
27ebfc9 tcg: Make DisasContextBase.tb const
5a7bc51 tcg: Adjust tb_target_set_jmp_target for split rwx
6a7c2c6 tcg: Adjust tcg_register_jit for const
51a0884 tcg: Adjust tcg_out_label for const
79f0c8e tcg: Adjust tcg_out_call for const
cd13bcf tcg: Introduce tcg_mirror_rw_to_rx/tcg_mirror_rx_to_rw
43a4727 tcg: Move tcg epilogue pointer out of TCGContext
5c5b85a tcg: Move tcg prologue pointer out of TCGContext
e320c51 tcg: Enhance flush_icache_range with separate data pointer

=== OUTPUT BEGIN ===
1/19 Checking commit e320c51e3e4a (tcg: Enhance flush_icache_range with 
separate data pointer)
2/19 Checking commit 5c5b85a1a024 (tcg: Move tcg prologue pointer out of 
TCGContext)
3/19 Checking commit 43a47275e60e (tcg: Move tcg epilogue pointer out of 
TCGContext)
4/19 Checking commit cd13bcf48f36 (tcg: Introduce 
tcg_mirror_rw_to_rx/tcg_mirror_rx_to_rw)
5/19 Checking commit 79f0c8e8dd04 (tcg: Adjust tcg_out_call for const)
6/19 Checking commit 51a088446659 (tcg: Adjust tcg_out_label for const)
7/19 Checking commit 6a7c2c61ac7c (tcg: Adjust tcg_register_jit for const)
8/19 Checking commit 5a7bc5100680 (tcg: Adjust tb_target_set_jmp_target for 
split rwx)
9/19 Checking commit 27ebfc9c7710 (tcg: Make DisasContextBase.tb const)
10/19 Checking commit 9bfafc6a1e66 (tcg: Make tb arg to synchronize_from_tb 
const)
11/19 Checking commit d4805b8c7ba9 (tcg: Use Error with alloc_code_gen_buffer)
12/19 Checking commit 90c1e773778d (tcg: Add --accel tcg,split-rwx property)
13/19 Checking commit bf60965b715c (accel/tcg: Support split-rwx for linux with 
memfd)
14/19 Checking commit af71330a9b77 (RFC: accel/tcg: Support split-rwx for 
darwin/iOS with vm_remap)
ERROR: externs should be avoided in .c files
#39: FILE: accel/tcg/translate-all.c:1195:
+extern kern_return_t mach_vm_remap(vm_map_t target_task,

total: 1 errors, 0 warnings, 86 lines checked

Patch 14/19 has style problems, please review.  If any of these errors
are false positives report them to the maintainer, see
CHECKPATCH in MAINTAINERS.

15/19 Checking commit 5a11dd063165 (tcg: Return the rx mirror of 
TranslationBlock from exit_tb)
16/19 Checking commit 9f812754c6c2 (tcg/i386: Support split-rwx code generation)
17/19 Checking commit 5aac937754cc (tcg/aarch64: Use B not BL for 
tcg_out_goto_long)
18/19 Checking commit 2a8bb6f3e71e (tcg/aarch64: Implement flush_idcache_range 
manually)
19/19 Checking commit 1cae0aa42840 (tcg/aarch64: Support split-rwx code 
generation)
=== OUTPUT END ===

Test command exited with code: 1


The full log is available at
http://patchew.org/logs/20201030004921.721096-1-richard.hender...@linaro.org/testing.checkpatch/?type=message.
---
Email generated automatically by Patchew [https://patchew.org/].
Please send your feedback to patchew-de...@redhat.com

Re: [PATCH V15 1/6] target/mips: Fix PageMask with variable page size

2020-10-29 Thread chen huacai
Hi, Richard,

On Wed, Oct 28, 2020 at 4:48 PM Richard Henderson
 wrote:
>
> On 10/27/20 9:17 PM, Huacai Chen wrote:
> > +invalid:
> > +/*
> > + * When invalid, ensure the value is bigger than or equal to
> > + * the minimal but smaller than or equal to the maxium.
> > + */
> > +maskbits = MIN(16, MAX(maskbits, TARGET_PAGE_BITS - 12));
> > +env->CP0_PageMask = ((1 << (16 + 1)) - 1) << CP0PM_MASK;
>
> maskbits is unused.  Did you mean to use it?
This is redundant and will be removed.

Huacai
>
>
> r~



-- 
Huacai Chen



Re: Out-of-Process Device Emulation session at KVM Forum 2020

2020-10-29 Thread Jason Wang



On 2020/10/29 at 11:46 PM, Alex Williamson wrote:

On Thu, 29 Oct 2020 23:09:33 +0800
Jason Wang  wrote:


On 2020/10/29 at 10:31 PM, Alex Williamson wrote:

On Thu, 29 Oct 2020 21:02:05 +0800
Jason Wang  wrote:
  

On 2020/10/29 at 8:08 PM, Stefan Hajnoczi wrote:

Here are notes from the session:

protocol stability:
   * vhost-user already exists for existing third-party applications
   * vfio-user is more general but will take more time to develop
   * libvfio-user can be provided to allow device implementations

management:
   * Should QEMU launch device emulation processes?
   * Nicer user experience
   * Technical blockers: forking, hotplug, security is hard once
QEMU has started running
   * Probably requires a new process model with a long-running
QEMU management process proxying QMP requests to the emulator process

migration:
   * dbus-vmstate
   * VFIO live migration ioctls
   * Source device can continue if migration fails
   * Opaque blobs are transferred to destination, destination can
fail migration if it decides the blobs are incompatible

I'm not sure this can work:

1) Reading something that is opaque to userspace is probably a hint of
bad uAPI design.
2) Did QEMU even try to migrate opaque blobs before? It's probably a bad
design of the migration protocol as well.

It looks to me like having a migration driver in QEMU that can clearly
define each byte in the migration stream is a better approach.

Any time during the previous two years of development might have been a
more appropriate time to express your doubts.


Somehow I did that in this series[1]. But the main issue is still there.

That series is related to a migration compatibility interface, not the
migration data itself.



They are not independent. The compatibility interface design depends on 
the migration data design. I asked about the uAPI issue in that thread 
but got no response.






Is it legal to have a uAPI that turns out to be opaque to userspace?
(VFIO seems to be the first.) If it's not, the only choice is to do
that in QEMU.

So you're suggesting that any time the kernel is passing through opaque
data that gets interpreted by some entity elsewhere, potentially with
proprietary code, that we're in legal jeopardy?  VFIO is certainly not
the first to do that (storage and network devices come to mind).
Devices are essentially opaque data themselves, vfio provides access to
(ex.) BARs, but the interpretation of what resides in that BAR is device
specific.  Sometimes it's defined in a public datasheet, sometimes not.
Suggesting that we can't move opaque data through a uAPI seems rather
absurd.



No, I think we are talking about different things. What I meant is that 
the data carried via the uAPI should not be opaque to userspace. What 
you said here is actually a good example of this: when you expose a BAR 
to userspace, there should be a driver running in userspace that knows 
the semantics of that BAR, so it's not opaque to userspace.






Note that we're not talking about vDPA devices here, we're talking
about arbitrary devices with arbitrary state.  Some degree of migration
support for assigned devices can be implemented in QEMU, Alex Graf
proved this several years ago with i40evf.  Years later, we don't have
any vendors proposing device specific migration code for QEMU.


Yes but it's not necessarily VFIO as well.

I don't know what this means.



I meant we can't assume VFIO is the only uAPI that will be used by QEMU.





Clearly we're also trying to account for proprietary devices where even
for suspend/resume support, proprietary drivers may be required for
manipulating that internal state.  When we move device emulation
outside of QEMU, whether in kernel or to other userspace processes,
does it still make sense to require code in QEMU to support
interpretation of that device for migration purposes?


Well, we could extend QEMU to support proprietary modules (or do we
support that now?). It could then talk to proprietary drivers via
either VFIO or a vendor-specific uAPI.

Yikes, I thought out-of-process devices was exactly the compromise
being developed to avoid QEMU supporting proprietary modules and ad-hoc
vendor specific uAPIs.



We can't even prevent this in the kernel, so I don't see how we can 
possibly make it work for QEMU.




I think you're actually questioning even the
premise of developing a standardized API for out-of-process devices
here.  Thanks,



Actually no; it's just a question that came to mind when looking at the 
VFIO migration compatibility patches. Since vfio-user is being proposed, 
it's a good time to revisit them.


Thanks




Alex







[PATCH v2 19/19] tcg/aarch64: Support split-rwx code generation

2020-10-29 Thread Richard Henderson
Signed-off-by: Richard Henderson 
---
 tcg/aarch64/tcg-target.h |  2 +-
 tcg/aarch64/tcg-target.c.inc | 57 
 tcg/tcg-pool.c.inc   |  6 +++-
 3 files changed, 38 insertions(+), 27 deletions(-)

diff --git a/tcg/aarch64/tcg-target.h b/tcg/aarch64/tcg-target.h
index e62d38ba55..abb94f9458 100644
--- a/tcg/aarch64/tcg-target.h
+++ b/tcg/aarch64/tcg-target.h
@@ -155,6 +155,6 @@ void tb_target_set_jmp_target(uintptr_t, uintptr_t, 
uintptr_t, uintptr_t);
 #define TCG_TARGET_NEED_LDST_LABELS
 #endif
 #define TCG_TARGET_NEED_POOL_LABELS
-#define TCG_TARGET_SUPPORT_MIRROR   0
+#define TCG_TARGET_SUPPORT_MIRROR   1
 
 #endif /* AARCH64_TCG_TARGET_H */
diff --git a/tcg/aarch64/tcg-target.c.inc b/tcg/aarch64/tcg-target.c.inc
index 5e8f3faad2..c082a06152 100644
--- a/tcg/aarch64/tcg-target.c.inc
+++ b/tcg/aarch64/tcg-target.c.inc
@@ -78,38 +78,42 @@ static const int tcg_target_call_oarg_regs[1] = {
 #define TCG_REG_GUEST_BASE TCG_REG_X28
 #endif
 
-static inline bool reloc_pc26(tcg_insn_unit *code_ptr, tcg_insn_unit *target)
+static bool reloc_pc26(tcg_insn_unit *src_rw, const tcg_insn_unit *target)
 {
-ptrdiff_t offset = target - code_ptr;
+const tcg_insn_unit *src_rx = tcg_mirror_rw_to_rx(src_rw);
+ptrdiff_t offset = target - src_rx;
+
 if (offset == sextract64(offset, 0, 26)) {
 /* read instruction, mask away previous PC_REL26 parameter contents,
set the proper offset, then write back the instruction. */
-*code_ptr = deposit32(*code_ptr, 0, 26, offset);
+*src_rw = deposit32(*src_rw, 0, 26, offset);
 return true;
 }
 return false;
 }
 
-static inline bool reloc_pc19(tcg_insn_unit *code_ptr, tcg_insn_unit *target)
+static bool reloc_pc19(tcg_insn_unit *src_rw, const tcg_insn_unit *target)
 {
-ptrdiff_t offset = target - code_ptr;
+const tcg_insn_unit *src_rx = tcg_mirror_rw_to_rx(src_rw);
+ptrdiff_t offset = target - src_rx;
+
 if (offset == sextract64(offset, 0, 19)) {
-*code_ptr = deposit32(*code_ptr, 5, 19, offset);
+*src_rw = deposit32(*src_rw, 5, 19, offset);
 return true;
 }
 return false;
 }
 
-static inline bool patch_reloc(tcg_insn_unit *code_ptr, int type,
-   intptr_t value, intptr_t addend)
+static bool patch_reloc(tcg_insn_unit *code_ptr, int type,
+intptr_t value, intptr_t addend)
 {
 tcg_debug_assert(addend == 0);
 switch (type) {
 case R_AARCH64_JUMP26:
 case R_AARCH64_CALL26:
-return reloc_pc26(code_ptr, (tcg_insn_unit *)value);
+return reloc_pc26(code_ptr, (const tcg_insn_unit *)value);
 case R_AARCH64_CONDBR19:
-return reloc_pc19(code_ptr, (tcg_insn_unit *)value);
+return reloc_pc19(code_ptr, (const tcg_insn_unit *)value);
 default:
 g_assert_not_reached();
 }
@@ -1050,12 +1054,13 @@ static void tcg_out_movi(TCGContext *s, TCGType type, 
TCGReg rd,
 /* Look for host pointer values within 4G of the PC.  This happens
often when loading pointers to QEMU's own data structures.  */
 if (type == TCG_TYPE_I64) {
-tcg_target_long disp = value - (intptr_t)s->code_ptr;
+intptr_t src_rx = (intptr_t)tcg_mirror_rw_to_rx(s->code_ptr);
+tcg_target_long disp = value - src_rx;
 if (disp == sextract64(disp, 0, 21)) {
 tcg_out_insn(s, 3406, ADR, rd, disp);
 return;
 }
-disp = (value >> 12) - ((intptr_t)s->code_ptr >> 12);
+disp = (value >> 12) - (src_rx >> 12);
 if (disp == sextract64(disp, 0, 21)) {
 tcg_out_insn(s, 3406, ADRP, rd, disp);
 if (value & 0xfff) {
@@ -1308,14 +1313,14 @@ static void tcg_out_cmp(TCGContext *s, TCGType ext, 
TCGReg a,
 
 static void tcg_out_goto(TCGContext *s, const tcg_insn_unit *target)
 {
-ptrdiff_t offset = target - s->code_ptr;
+ptrdiff_t offset = tcg_pcrel_diff(s, target) >> 2;
 tcg_debug_assert(offset == sextract64(offset, 0, 26));
 tcg_out_insn(s, 3206, B, offset);
 }
 
-static inline void tcg_out_goto_long(TCGContext *s, tcg_insn_unit *target)
+static void tcg_out_goto_long(TCGContext *s, const tcg_insn_unit *target)
 {
-ptrdiff_t offset = target - s->code_ptr;
+ptrdiff_t offset = tcg_pcrel_diff(s, target) >> 2;
 if (offset == sextract64(offset, 0, 26)) {
 tcg_out_insn(s, 3206, B, offset);
 } else {
@@ -1329,9 +1334,9 @@ static inline void tcg_out_callr(TCGContext *s, TCGReg 
reg)
 tcg_out_insn(s, 3207, BLR, reg);
 }
 
-static inline void tcg_out_call(TCGContext *s, const tcg_insn_unit *target)
+static void tcg_out_call(TCGContext *s, const tcg_insn_unit *target)
 {
-ptrdiff_t offset = target - s->code_ptr;
+ptrdiff_t offset = tcg_pcrel_diff(s, target) >> 2;
 if (offset == sextract64(offset, 0, 26)) {
 tcg_out_insn(s, 3206, BL, offset);
 } else {
@@ -1393,7 +1398,7 @@ static void tcg_out_brcond(TCGConte

[PATCH v2 16/19] tcg/i386: Support split-rwx code generation

2020-10-29 Thread Richard Henderson
Signed-off-by: Richard Henderson 
---
 tcg/i386/tcg-target.h |  2 +-
 tcg/i386/tcg-target.c.inc | 20 +++-
 2 files changed, 12 insertions(+), 10 deletions(-)

diff --git a/tcg/i386/tcg-target.h b/tcg/i386/tcg-target.h
index 1b9d41bd56..bbbd1c2d4a 100644
--- a/tcg/i386/tcg-target.h
+++ b/tcg/i386/tcg-target.h
@@ -236,6 +236,6 @@ static inline void tb_target_set_jmp_target(uintptr_t 
tc_ptr, uintptr_t jmp_rx,
 #define TCG_TARGET_NEED_LDST_LABELS
 #endif
 #define TCG_TARGET_NEED_POOL_LABELS
-#define TCG_TARGET_SUPPORT_MIRROR   0
+#define TCG_TARGET_SUPPORT_MIRROR   1
 
 #endif
diff --git a/tcg/i386/tcg-target.c.inc b/tcg/i386/tcg-target.c.inc
index 7f74c77d7f..e2c85381cd 100644
--- a/tcg/i386/tcg-target.c.inc
+++ b/tcg/i386/tcg-target.c.inc
@@ -165,7 +165,7 @@ static bool have_lzcnt;
 # define have_lzcnt 0
 #endif
 
-static tcg_insn_unit *tb_ret_addr;
+static const tcg_insn_unit *tb_ret_addr;
 
 static bool patch_reloc(tcg_insn_unit *code_ptr, int type,
 intptr_t value, intptr_t addend)
@@ -173,7 +173,7 @@ static bool patch_reloc(tcg_insn_unit *code_ptr, int type,
 value += addend;
 switch(type) {
 case R_386_PC32:
-value -= (uintptr_t)code_ptr;
+value -= (uintptr_t)tcg_mirror_rw_to_rx(code_ptr);
 if (value != (int32_t)value) {
 return false;
 }
@@ -182,7 +182,7 @@ static bool patch_reloc(tcg_insn_unit *code_ptr, int type,
 tcg_patch32(code_ptr, value);
 break;
 case R_386_PC8:
-value -= (uintptr_t)code_ptr;
+value -= (uintptr_t)tcg_mirror_rw_to_rx(code_ptr);
 if (value != (int8_t)value) {
 return false;
 }
@@ -1006,7 +1006,7 @@ static void tcg_out_movi(TCGContext *s, TCGType type,
 }
 
 /* Try a 7 byte pc-relative lea before the 10 byte movq.  */
-diff = arg - ((uintptr_t)s->code_ptr + 7);
+diff = tcg_pcrel_diff(s, (const void *)arg) - 7;
 if (diff == (int32_t)diff) {
 tcg_out_opc(s, OPC_LEA | P_REXW, ret, 0, 0);
 tcg_out8(s, (LOWREGMASK(ret) << 3) | 5);
@@ -1615,7 +1615,7 @@ static inline void tcg_out_call(TCGContext *s, const 
tcg_insn_unit *dest)
 tcg_out_branch(s, 1, dest);
 }
 
-static void tcg_out_jmp(TCGContext *s, tcg_insn_unit *dest)
+static void tcg_out_jmp(TCGContext *s, const tcg_insn_unit *dest)
 {
 tcg_out_branch(s, 0, dest);
 }
@@ -1786,7 +1786,8 @@ static void add_qemu_ldst_label(TCGContext *s, bool 
is_ld, bool is_64,
 label->datahi_reg = datahi;
 label->addrlo_reg = addrlo;
 label->addrhi_reg = addrhi;
-label->raddr = raddr;
+/* TODO: Cast goes away when all hosts converted */
+label->raddr = (void *)tcg_mirror_rw_to_rx(raddr);
 label->label_ptr[0] = label_ptr[0];
 if (TARGET_LONG_BITS > TCG_TARGET_REG_BITS) {
 label->label_ptr[1] = label_ptr[1];
@@ -2280,7 +2281,7 @@ static inline void tcg_out_op(TCGContext *s, TCGOpcode 
opc,
 /* jump displacement must be aligned for atomic patching;
  * see if we need to add extra nops before jump
  */
-gap = tcg_pcrel_diff(s, QEMU_ALIGN_PTR_UP(s->code_ptr + 1, 4));
+gap = QEMU_ALIGN_PTR_UP(s->code_ptr + 1, 4) - s->code_ptr;
 if (gap != 1) {
 tcg_out_nopn(s, gap - 1);
 }
@@ -3825,11 +3826,12 @@ static void tcg_target_qemu_prologue(TCGContext *s)
  * Return path for goto_ptr. Set return value to 0, a-la exit_tb,
  * and fall through to the rest of the epilogue.
  */
-tcg_code_gen_epilogue = s->code_ptr;
+/* TODO: Cast goes away when all hosts converted */
+tcg_code_gen_epilogue = (void *)tcg_mirror_rw_to_rx(s->code_ptr);
 tcg_out_movi(s, TCG_TYPE_REG, TCG_REG_EAX, 0);
 
 /* TB epilogue */
-tb_ret_addr = s->code_ptr;
+tb_ret_addr = tcg_mirror_rw_to_rx(s->code_ptr);
 
 tcg_out_addi(s, TCG_REG_CALL_STACK, stack_addend);
 
-- 
2.25.1




[PATCH v2 17/19] tcg/aarch64: Use B not BL for tcg_out_goto_long

2020-10-29 Thread Richard Henderson
A typo generated a branch-and-link insn instead of a plain branch.

Signed-off-by: Richard Henderson 
---
 tcg/aarch64/tcg-target.c.inc | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tcg/aarch64/tcg-target.c.inc b/tcg/aarch64/tcg-target.c.inc
index fea784cf75..bd888bc66d 100644
--- a/tcg/aarch64/tcg-target.c.inc
+++ b/tcg/aarch64/tcg-target.c.inc
@@ -1317,7 +1317,7 @@ static inline void tcg_out_goto_long(TCGContext *s, 
tcg_insn_unit *target)
 {
 ptrdiff_t offset = target - s->code_ptr;
 if (offset == sextract64(offset, 0, 26)) {
-tcg_out_insn(s, 3206, BL, offset);
+tcg_out_insn(s, 3206, B, offset);
 } else {
 tcg_out_movi(s, TCG_TYPE_I64, TCG_REG_TMP, (intptr_t)target);
 tcg_out_insn(s, 3207, BR, TCG_REG_TMP);
-- 
2.25.1




[PATCH v2 10/19] tcg: Make tb arg to synchronize_from_tb const

2020-10-29 Thread Richard Henderson
There is nothing within the translators that ought to be
changing the TranslationBlock data, so make it const.

This does not actually use the read-only copy of the
data structure that exists within the rx mirror.

Signed-off-by: Richard Henderson 
---
 include/hw/core/cpu.h   | 3 ++-
 target/arm/cpu.c| 3 ++-
 target/avr/cpu.c| 3 ++-
 target/hppa/cpu.c   | 3 ++-
 target/i386/cpu.c   | 3 ++-
 target/microblaze/cpu.c | 3 ++-
 target/mips/cpu.c   | 3 ++-
 target/riscv/cpu.c  | 3 ++-
 target/rx/cpu.c | 3 ++-
 target/sh4/cpu.c| 3 ++-
 target/sparc/cpu.c  | 3 ++-
 target/tricore/cpu.c| 2 +-
 12 files changed, 23 insertions(+), 12 deletions(-)

diff --git a/include/hw/core/cpu.h b/include/hw/core/cpu.h
index 9c3a45ad7b..67253e662b 100644
--- a/include/hw/core/cpu.h
+++ b/include/hw/core/cpu.h
@@ -189,7 +189,8 @@ struct CPUClass {
 void (*get_memory_mapping)(CPUState *cpu, MemoryMappingList *list,
Error **errp);
 void (*set_pc)(CPUState *cpu, vaddr value);
-void (*synchronize_from_tb)(CPUState *cpu, struct TranslationBlock *tb);
+void (*synchronize_from_tb)(CPUState *cpu,
+const struct TranslationBlock *tb);
 bool (*tlb_fill)(CPUState *cpu, vaddr address, int size,
  MMUAccessType access_type, int mmu_idx,
  bool probe, uintptr_t retaddr);
diff --git a/target/arm/cpu.c b/target/arm/cpu.c
index 07492e9f9a..2f9be1c0ee 100644
--- a/target/arm/cpu.c
+++ b/target/arm/cpu.c
@@ -54,7 +54,8 @@ static void arm_cpu_set_pc(CPUState *cs, vaddr value)
 }
 }
 
-static void arm_cpu_synchronize_from_tb(CPUState *cs, TranslationBlock *tb)
+static void arm_cpu_synchronize_from_tb(CPUState *cs,
+const TranslationBlock *tb)
 {
 ARMCPU *cpu = ARM_CPU(cs);
 CPUARMState *env = &cpu->env;
diff --git a/target/avr/cpu.c b/target/avr/cpu.c
index 5d9c4ad5bf..6f3d5a9e4a 100644
--- a/target/avr/cpu.c
+++ b/target/avr/cpu.c
@@ -41,7 +41,8 @@ static bool avr_cpu_has_work(CPUState *cs)
 && cpu_interrupts_enabled(env);
 }
 
-static void avr_cpu_synchronize_from_tb(CPUState *cs, TranslationBlock *tb)
+static void avr_cpu_synchronize_from_tb(CPUState *cs,
+const TranslationBlock *tb)
 {
 AVRCPU *cpu = AVR_CPU(cs);
 CPUAVRState *env = &cpu->env;
diff --git a/target/hppa/cpu.c b/target/hppa/cpu.c
index 71b6aca45d..e28f047d10 100644
--- a/target/hppa/cpu.c
+++ b/target/hppa/cpu.c
@@ -35,7 +35,8 @@ static void hppa_cpu_set_pc(CPUState *cs, vaddr value)
 cpu->env.iaoq_b = value + 4;
 }
 
-static void hppa_cpu_synchronize_from_tb(CPUState *cs, TranslationBlock *tb)
+static void hppa_cpu_synchronize_from_tb(CPUState *cs,
+ const TranslationBlock *tb)
 {
 HPPACPU *cpu = HPPA_CPU(cs);
 
diff --git a/target/i386/cpu.c b/target/i386/cpu.c
index 0d8606958e..01a8acafe3 100644
--- a/target/i386/cpu.c
+++ b/target/i386/cpu.c
@@ -7012,7 +7012,8 @@ static void x86_cpu_set_pc(CPUState *cs, vaddr value)
 cpu->env.eip = value;
 }
 
-static void x86_cpu_synchronize_from_tb(CPUState *cs, TranslationBlock *tb)
+static void x86_cpu_synchronize_from_tb(CPUState *cs,
+const TranslationBlock *tb)
 {
 X86CPU *cpu = X86_CPU(cs);
 
diff --git a/target/microblaze/cpu.c b/target/microblaze/cpu.c
index 9b2482159d..c8e754cfb1 100644
--- a/target/microblaze/cpu.c
+++ b/target/microblaze/cpu.c
@@ -83,7 +83,8 @@ static void mb_cpu_set_pc(CPUState *cs, vaddr value)
 cpu->env.iflags = 0;
 }
 
-static void mb_cpu_synchronize_from_tb(CPUState *cs, TranslationBlock *tb)
+static void mb_cpu_synchronize_from_tb(CPUState *cs,
+   const TranslationBlock *tb)
 {
 MicroBlazeCPU *cpu = MICROBLAZE_CPU(cs);
 
diff --git a/target/mips/cpu.c b/target/mips/cpu.c
index 76d50b00b4..79eee215cf 100644
--- a/target/mips/cpu.c
+++ b/target/mips/cpu.c
@@ -44,7 +44,8 @@ static void mips_cpu_set_pc(CPUState *cs, vaddr value)
 }
 }
 
-static void mips_cpu_synchronize_from_tb(CPUState *cs, TranslationBlock *tb)
+static void mips_cpu_synchronize_from_tb(CPUState *cs,
+ const TranslationBlock *tb)
 {
 MIPSCPU *cpu = MIPS_CPU(cs);
 CPUMIPSState *env = &cpu->env;
diff --git a/target/riscv/cpu.c b/target/riscv/cpu.c
index 0bbfd7f457..faaa9d1e8f 100644
--- a/target/riscv/cpu.c
+++ b/target/riscv/cpu.c
@@ -282,7 +282,8 @@ static void riscv_cpu_set_pc(CPUState *cs, vaddr value)
 env->pc = value;
 }
 
-static void riscv_cpu_synchronize_from_tb(CPUState *cs, TranslationBlock *tb)
+static void riscv_cpu_synchronize_from_tb(CPUState *cs,
+  const TranslationBlock *tb)
 {
 RISCVCPU *cpu = RISCV_CPU(cs);
 CPURISCVState *env = &cpu->env;
diff --git a/target/rx/cpu.c b/target/rx/cpu.c
index 23ee17a701..

[PATCH v2 13/19] accel/tcg: Support split-rwx for linux with memfd

2020-10-29 Thread Richard Henderson
We cannot use a real temp file, because we would need to find
a filesystem that does not have noexec enabled.  However, a
memfd is not associated with any filesystem.

Signed-off-by: Richard Henderson 
---
 accel/tcg/translate-all.c | 87 +++
 1 file changed, 80 insertions(+), 7 deletions(-)

diff --git a/accel/tcg/translate-all.c b/accel/tcg/translate-all.c
index 8918a09f10..3e69ebd1d3 100644
--- a/accel/tcg/translate-all.c
+++ b/accel/tcg/translate-all.c
@@ -1088,17 +1088,11 @@ static bool alloc_code_gen_buffer(size_t size, int 
mirror, Error **errp)
 return true;
 }
 #else
-static bool alloc_code_gen_buffer(size_t size, int mirror, Error **errp)
+static bool alloc_code_gen_buffer_anon(size_t size, int prot, Error **errp)
 {
-int prot = PROT_WRITE | PROT_READ | PROT_EXEC;
 int flags = MAP_PRIVATE | MAP_ANONYMOUS;
 void *buf;
 
-if (mirror > 0) {
-error_setg(errp, "jit split-rwx not supported");
-return false;
-}
-
 buf = mmap(NULL, size, prot, flags, -1, 0);
 if (buf == MAP_FAILED) {
 error_setg_errno(errp, errno,
@@ -1147,6 +1141,85 @@ static bool alloc_code_gen_buffer(size_t size, int 
mirror, Error **errp)
 tcg_ctx->code_gen_buffer = buf;
 return true;
 }
+
+#ifdef CONFIG_LINUX
+#include "qemu/memfd.h"
+
+static bool alloc_code_gen_buffer_mirror_memfd(size_t size, Error **errp)
+{
+void *buf_rw, *buf_rx;
+int fd;
+
+fd = qemu_memfd_create("tcg-jit", size, false, 0, 0, errp);
+if (fd < 0) {
+return false;
+}
+
+buf_rw = mmap(NULL, size, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
+if (buf_rw == MAP_FAILED) {
+error_setg_errno(errp, errno,
+ "allocate %zu bytes for jit buffer", size);
+close(fd);
+return false;
+}
+
+buf_rx = mmap(NULL, size, PROT_READ | PROT_EXEC, MAP_SHARED, fd, 0);
+if (buf_rx == MAP_FAILED) {
+error_setg_errno(errp, errno,
+ "allocate %zu bytes for jit mirror", size);
+munmap(buf_rw, size);
+close(fd);
+return false;
+}
+close(fd);
+
+tcg_ctx->code_gen_buffer = buf_rw;
+tcg_ctx->code_gen_buffer_size = size;
+tcg_rx_mirror_diff = buf_rx - buf_rw;
+
+/* Request large pages for the buffer and the mirror.  */
+qemu_madvise(buf_rw, size, QEMU_MADV_HUGEPAGE);
+qemu_madvise(buf_rx, size, QEMU_MADV_HUGEPAGE);
+return true;
+}
+#endif
+
+static bool alloc_code_gen_buffer_mirror(size_t size, Error **errp)
+{
+if (TCG_TARGET_SUPPORT_MIRROR) {
+#ifdef CONFIG_LINUX
+return alloc_code_gen_buffer_mirror_memfd(size, errp);
+#endif
+}
+error_setg(errp, "jit split-rwx not supported");
+return false;
+}
+
+static bool alloc_code_gen_buffer(size_t size, int mirror, Error **errp)
+{
+if (mirror) {
+Error *local_err = NULL;
+if (alloc_code_gen_buffer_mirror(size, &local_err)) {
+return true;
+}
+/*
+ * If mirror force-on (1), fail;
+ * if mirror default-on (-1), fall through to mirror off.
+ */
+if (mirror > 0) {
+error_propagate(errp, local_err);
+return false;
+}
+}
+
+int prot = PROT_READ | PROT_WRITE | PROT_EXEC;
+#ifdef CONFIG_TCG_INTERPRETER
+/* The tcg interpreter does not need execute permission. */
+prot = PROT_READ | PROT_WRITE;
+#endif
+
+return alloc_code_gen_buffer_anon(size, prot, errp);
+}
 #endif /* USE_STATIC_CODE_GEN_BUFFER, WIN32, POSIX */
 
 static bool tb_cmp(const void *ap, const void *bp)
-- 
2.25.1




[PATCH v2 04/19] tcg: Introduce tcg_mirror_rw_to_rx/tcg_mirror_rx_to_rw

2020-10-29 Thread Richard Henderson
Add two helper functions, using a global variable to hold
the displacement.  The displacement is currently always 0,
so no change in behaviour.

Begin using the functions in tcg common code only.

Signed-off-by: Richard Henderson 
---
 accel/tcg/tcg-runtime.h  |  2 +-
 include/disas/disas.h|  2 +-
 include/exec/exec-all.h  |  2 +-
 include/exec/log.h   |  2 +-
 include/tcg/tcg.h| 28 +
 accel/tcg/cpu-exec.c |  2 +-
 accel/tcg/tcg-runtime.c  |  2 +-
 accel/tcg/translate-all.c| 29 -
 disas.c  |  4 ++-
 tcg/tcg.c| 60 +++-
 tcg/tci.c|  5 +--
 accel/tcg/trace-events   |  2 +-
 tcg/aarch64/tcg-target.c.inc |  2 +-
 13 files changed, 101 insertions(+), 41 deletions(-)

diff --git a/accel/tcg/tcg-runtime.h b/accel/tcg/tcg-runtime.h
index 4eda24e63a..c276c8beb5 100644
--- a/accel/tcg/tcg-runtime.h
+++ b/accel/tcg/tcg-runtime.h
@@ -24,7 +24,7 @@ DEF_HELPER_FLAGS_1(clrsb_i64, TCG_CALL_NO_RWG_SE, i64, i64)
 DEF_HELPER_FLAGS_1(ctpop_i32, TCG_CALL_NO_RWG_SE, i32, i32)
 DEF_HELPER_FLAGS_1(ctpop_i64, TCG_CALL_NO_RWG_SE, i64, i64)
 
-DEF_HELPER_FLAGS_1(lookup_tb_ptr, TCG_CALL_NO_WG_SE, ptr, env)
+DEF_HELPER_FLAGS_1(lookup_tb_ptr, TCG_CALL_NO_WG_SE, cptr, env)
 
 DEF_HELPER_FLAGS_1(exit_atomic, TCG_CALL_NO_WG, noreturn, env)
 
diff --git a/include/disas/disas.h b/include/disas/disas.h
index 36c33f6f19..d363e95ede 100644
--- a/include/disas/disas.h
+++ b/include/disas/disas.h
@@ -7,7 +7,7 @@
 #include "cpu.h"
 
 /* Disassemble this for me please... (debugging). */
-void disas(FILE *out, void *code, unsigned long size);
+void disas(FILE *out, const void *code, unsigned long size);
 void target_disas(FILE *out, CPUState *cpu, target_ulong code,
   target_ulong size);
 
diff --git a/include/exec/exec-all.h b/include/exec/exec-all.h
index 4707ac140c..aa65103702 100644
--- a/include/exec/exec-all.h
+++ b/include/exec/exec-all.h
@@ -448,7 +448,7 @@ int probe_access_flags(CPUArchState *env, target_ulong addr,
  * Note: the address of search data can be obtained by adding @size to @ptr.
  */
 struct tb_tc {
-void *ptr;/* pointer to the translated code */
+const void *ptr;/* pointer to the translated code */
 size_t size;
 };
 
diff --git a/include/exec/log.h b/include/exec/log.h
index e02fff5de1..3c7fa65ead 100644
--- a/include/exec/log.h
+++ b/include/exec/log.h
@@ -56,7 +56,7 @@ static inline void log_target_disas(CPUState *cpu, 
target_ulong start,
 rcu_read_unlock();
 }
 
-static inline void log_disas(void *code, unsigned long size)
+static inline void log_disas(const void *code, unsigned long size)
 {
 QemuLogFile *logfile;
 rcu_read_lock();
diff --git a/include/tcg/tcg.h b/include/tcg/tcg.h
index 3c56a90abc..f6f84421b2 100644
--- a/include/tcg/tcg.h
+++ b/include/tcg/tcg.h
@@ -261,7 +261,7 @@ struct TCGLabel {
 unsigned refs : 16;
 union {
 uintptr_t value;
-tcg_insn_unit *value_ptr;
+const tcg_insn_unit *value_ptr;
 } u;
 QSIMPLEQ_HEAD(, TCGRelocation) relocs;
 QSIMPLEQ_ENTRY(TCGLabel) next;
@@ -678,8 +678,24 @@ struct TCGContext {
 extern TCGContext tcg_init_ctx;
 extern __thread TCGContext *tcg_ctx;
 extern void *tcg_code_gen_epilogue;
+extern uintptr_t tcg_rx_mirror_diff;
 extern TCGv_env cpu_env;
 
+#ifdef CONFIG_DEBUG_TCG
+const void *tcg_mirror_rw_to_rx(void *rw);
+void *tcg_mirror_rx_to_rw(const void *rx);
+#else
+static inline const void *tcg_mirror_rw_to_rx(void *rw)
+{
+return rw ? rw + tcg_rx_mirror_diff : NULL;
+}
+
+static inline void *tcg_mirror_rx_to_rw(const void *rx)
+{
+return rx ? (void *)rx - tcg_rx_mirror_diff : NULL;
+}
+#endif
+
 static inline size_t temp_idx(TCGTemp *ts)
 {
 ptrdiff_t n = ts - tcg_ctx->temps;
@@ -1098,7 +1114,7 @@ static inline TCGLabel *arg_label(TCGArg i)
  * correct result.
  */
 
-static inline ptrdiff_t tcg_ptr_byte_diff(void *a, void *b)
+static inline ptrdiff_t tcg_ptr_byte_diff(const void *a, const void *b)
 {
 return a - b;
 }
@@ -1112,9 +1128,9 @@ static inline ptrdiff_t tcg_ptr_byte_diff(void *a, void 
*b)
  * to the destination address.
  */
 
-static inline ptrdiff_t tcg_pcrel_diff(TCGContext *s, void *target)
+static inline ptrdiff_t tcg_pcrel_diff(TCGContext *s, const void *target)
 {
-return tcg_ptr_byte_diff(target, s->code_ptr);
+return tcg_ptr_byte_diff(target, tcg_mirror_rw_to_rx(s->code_ptr));
 }
 
 /**
@@ -1220,9 +1236,9 @@ static inline unsigned get_mmuidx(TCGMemOpIdx oi)
 #define TB_EXIT_REQUESTED 3
 
 #ifdef CONFIG_TCG_INTERPRETER
-uintptr_t tcg_qemu_tb_exec(CPUArchState *env, void *tb_ptr);
+uintptr_t tcg_qemu_tb_exec(CPUArchState *env, const void *tb_ptr);
 #else
-typedef uintptr_t tcg_prologue_fn(CPUArchState *env, void *tb_ptr);
+typedef uintptr_t tcg_prologue_fn(CPUArchState *env, const void *tb_ptr);
 extern tcg_prologue_fn *tcg_qemu_tb_exec;
 #endif
 
diff --git a/

Re: Out-of-Process Device Emulation session at KVM Forum 2020

2020-10-29 Thread Jason Wang



On 2020/10/30 2:07 AM, Paolo Bonzini wrote:

On 29/10/20 18:47, Kirti Wankhede wrote:

On 10/29/2020 10:12 PM, Daniel P. Berrangé wrote:

On Thu, Oct 29, 2020 at 04:15:30PM +, David Edmondson wrote:

On Thursday, 2020-10-29 at 21:02:05 +08, Jason Wang wrote:


2) Did qemu even try to migrate opaque blobs before? It's probably a
bad
design of migration protocol as well.

The TPM emulator backend migrates blobs that are only understood by
swtpm.

The separate slirp-helper net backend does the same too IIUC

When sys mem pages are marked dirty and content is copied to
destination, content of sys mem is also opaque to QEMU.

Non-opaque RAM might be a bit too much to expect, though. :)

Paolo



True, and in this case you know you don't need to care about compatibility.

Thanks





[PATCH v2 07/19] tcg: Adjust tcg_register_jit for const

2020-10-29 Thread Richard Henderson
We must change all targets at once, since all must match
the declaration in tcg.c.

Signed-off-by: Richard Henderson 
---
 include/tcg/tcg.h|  2 +-
 tcg/tcg.c| 10 +-
 tcg/aarch64/tcg-target.c.inc |  2 +-
 tcg/arm/tcg-target.c.inc |  2 +-
 tcg/i386/tcg-target.c.inc|  2 +-
 tcg/mips/tcg-target.c.inc|  2 +-
 tcg/ppc/tcg-target.c.inc |  2 +-
 tcg/riscv/tcg-target.c.inc   |  2 +-
 tcg/s390/tcg-target.c.inc|  2 +-
 tcg/sparc/tcg-target.c.inc   |  2 +-
 10 files changed, 14 insertions(+), 14 deletions(-)

diff --git a/include/tcg/tcg.h b/include/tcg/tcg.h
index f6f84421b2..76717b358b 100644
--- a/include/tcg/tcg.h
+++ b/include/tcg/tcg.h
@@ -1242,7 +1242,7 @@ typedef uintptr_t tcg_prologue_fn(CPUArchState *env, 
const void *tb_ptr);
 extern tcg_prologue_fn *tcg_qemu_tb_exec;
 #endif
 
-void tcg_register_jit(void *buf, size_t buf_size);
+void tcg_register_jit(const void *buf, size_t buf_size);
 
 #if TCG_TARGET_MAYBE_vec
 /* Return zero if the tuple (opc, type, vece) is unsupportable;
diff --git a/tcg/tcg.c b/tcg/tcg.c
index da16378d1c..4d5c95526c 100644
--- a/tcg/tcg.c
+++ b/tcg/tcg.c
@@ -96,7 +96,7 @@ typedef struct QEMU_PACKED {
 DebugFrameFDEHeader fde;
 } DebugFrameHeader;
 
-static void tcg_register_jit_int(void *buf, size_t size,
+static void tcg_register_jit_int(const void *buf, size_t size,
  const void *debug_frame,
  size_t debug_frame_size)
 __attribute__((unused));
@@ -1133,7 +1133,7 @@ void tcg_prologue_init(TCGContext *s)
 total_size -= prologue_size;
 s->code_gen_buffer_size = total_size;
 
-tcg_register_jit(s->code_gen_buffer, total_size);
+tcg_register_jit(tcg_mirror_rw_to_rx(s->code_gen_buffer), total_size);
 
 #ifdef DEBUG_DISAS
 if (qemu_loglevel_mask(CPU_LOG_TB_OUT_ASM)) {
@@ -4449,7 +4449,7 @@ static int find_string(const char *strtab, const char 
*str)
 }
 }
 
-static void tcg_register_jit_int(void *buf_ptr, size_t buf_size,
+static void tcg_register_jit_int(const void *buf_ptr, size_t buf_size,
  const void *debug_frame,
  size_t debug_frame_size)
 {
@@ -4651,13 +4651,13 @@ static void tcg_register_jit_int(void *buf_ptr, size_t 
buf_size,
 /* No support for the feature.  Provide the entry point expected by exec.c,
and implement the internal function we declared earlier.  */
 
-static void tcg_register_jit_int(void *buf, size_t size,
+static void tcg_register_jit_int(const void *buf, size_t size,
  const void *debug_frame,
  size_t debug_frame_size)
 {
 }
 
-void tcg_register_jit(void *buf, size_t buf_size)
+void tcg_register_jit(const void *buf, size_t buf_size)
 {
 }
 #endif /* ELF_HOST_MACHINE */
diff --git a/tcg/aarch64/tcg-target.c.inc b/tcg/aarch64/tcg-target.c.inc
index 6d8152c468..9ace859db3 100644
--- a/tcg/aarch64/tcg-target.c.inc
+++ b/tcg/aarch64/tcg-target.c.inc
@@ -2964,7 +2964,7 @@ static const DebugFrame debug_frame = {
 }
 };
 
-void tcg_register_jit(void *buf, size_t buf_size)
+void tcg_register_jit(const void *buf, size_t buf_size)
 {
 tcg_register_jit_int(buf, buf_size, &debug_frame, sizeof(debug_frame));
 }
diff --git a/tcg/arm/tcg-target.c.inc b/tcg/arm/tcg-target.c.inc
index d6dfe2b428..431af3107c 100644
--- a/tcg/arm/tcg-target.c.inc
+++ b/tcg/arm/tcg-target.c.inc
@@ -2353,7 +2353,7 @@ static const DebugFrame debug_frame = {
 }
 };
 
-void tcg_register_jit(void *buf, size_t buf_size)
+void tcg_register_jit(const void *buf, size_t buf_size)
 {
 tcg_register_jit_int(buf, buf_size, &debug_frame, sizeof(debug_frame));
 }
diff --git a/tcg/i386/tcg-target.c.inc b/tcg/i386/tcg-target.c.inc
index 0ac1ef3d82..7f74c77d7f 100644
--- a/tcg/i386/tcg-target.c.inc
+++ b/tcg/i386/tcg-target.c.inc
@@ -3998,7 +3998,7 @@ static const DebugFrame debug_frame = {
 #endif
 
 #if defined(ELF_HOST_MACHINE)
-void tcg_register_jit(void *buf, size_t buf_size)
+void tcg_register_jit(const void *buf, size_t buf_size)
 {
 tcg_register_jit_int(buf, buf_size, &debug_frame, sizeof(debug_frame));
 }
diff --git a/tcg/mips/tcg-target.c.inc b/tcg/mips/tcg-target.c.inc
index 064f46fc6d..b74dc15b86 100644
--- a/tcg/mips/tcg-target.c.inc
+++ b/tcg/mips/tcg-target.c.inc
@@ -2702,7 +2702,7 @@ static const DebugFrame debug_frame = {
 }
 };
 
-void tcg_register_jit(void *buf, size_t buf_size)
+void tcg_register_jit(const void *buf, size_t buf_size)
 {
 tcg_register_jit_int(buf, buf_size, &debug_frame, sizeof(debug_frame));
 }
diff --git a/tcg/ppc/tcg-target.c.inc b/tcg/ppc/tcg-target.c.inc
index 513d784a83..bdaffeabb3 100644
--- a/tcg/ppc/tcg-target.c.inc
+++ b/tcg/ppc/tcg-target.c.inc
@@ -3847,7 +3847,7 @@ static DebugFrame debug_frame = {
 }
 };
 
-void tcg_register_jit(void *buf, size_t buf_size)
+void tcg_register_jit(const void *buf, size_t buf_size)
 {
 uint8_t *p = &debug_frame.fde_reg_

[PATCH v2 00/19] Mirror map JIT memory for TCG

2020-10-29 Thread Richard Henderson
This is my take on Joelle's patch set:
https://lists.nongnu.org/archive/html/qemu-devel/2020-10/msg07837.html

First, lots more patches.  For the most part, I convert one interface
at a time, instead of trying to do it all at once.  Then, convert the
tcg backends one at a time, allowing for a backend to say that it has
not been updated and not to use the split.  This takes care of TCI,
for one, which would never be converted, as it makes no sense.  But I
don't expect to ever try to convert mips either -- the memory mapping
constraints there are ugly.

There are many more places that "const" could logically be pushed.
I stopped at several major interfaces and left TODO comments.

I have only converted tcg/i386 and tcg/aarch64 so far.  That should
certainly be sufficient for immediate darwin/iOS testing.

Second, I've taken the start with rw and offset to rx approach, which
is the opposite of Joelle's patch set.  It's a close call, but this
direction may be slightly cleaner.

Third, there are almost no ifdefs.  The only ones are related to host
specific support.  That means that this is always available, modulo
the actual tcg backend support.  When the feature is disabled, we will
be adding and subtracting a 0 stored in a global variable.

Fourth, I have renamed the command-line parameter to "split-rwx".
I don't think this is perfect, and I'm not even sure if it's better
than "mirror-jit".  What this has done, though, is left the code
with inconsistent language -- "mirror" in some places, "split" in
others.  I'll clean that up once we decide on naming.

Fifth, I have auto-enabled the feature for CONFIG_DEBUG_TCG, so that
it falls back to disabled without error.  But if you try to enable it
from the command line without complete host support, a fatal error
will be generated.  This will make sure that the feature is
regularly tested.


r~


Richard Henderson (19):
  tcg: Enhance flush_icache_range with separate data pointer
  tcg: Move tcg prologue pointer out of TCGContext
  tcg: Move tcg epilogue pointer out of TCGContext
  tcg: Introduce tcg_mirror_rw_to_rx/tcg_mirror_rx_to_rw
  tcg: Adjust tcg_out_call for const
  tcg: Adjust tcg_out_label for const
  tcg: Adjust tcg_register_jit for const
  tcg: Adjust tb_target_set_jmp_target for split rwx
  tcg: Make DisasContextBase.tb const
  tcg: Make tb arg to synchronize_from_tb const
  tcg: Use Error with alloc_code_gen_buffer
  tcg: Add --accel tcg,split-rwx property
  accel/tcg: Support split-rwx for linux with memfd
  RFC: accel/tcg: Support split-rwx for darwin/iOS with vm_remap
  tcg: Return the rx mirror of TranslationBlock from exit_tb
  tcg/i386: Support split-rwx code generation
  tcg/aarch64: Use B not BL for tcg_out_goto_long
  tcg/aarch64: Implement flush_idcache_range manually
  tcg/aarch64: Support split-rwx code generation

 accel/tcg/tcg-runtime.h  |   2 +-
 include/disas/disas.h|   2 +-
 include/exec/exec-all.h  |   2 +-
 include/exec/gen-icount.h|   4 +-
 include/exec/log.h   |   2 +-
 include/exec/translator.h|   2 +-
 include/hw/core/cpu.h|   3 +-
 include/sysemu/tcg.h |   2 +-
 include/tcg/tcg-op.h |   2 +-
 include/tcg/tcg.h|  37 +++--
 tcg/aarch64/tcg-target.h |   9 +-
 tcg/arm/tcg-target.h |  11 +-
 tcg/i386/tcg-target.h|  10 +-
 tcg/mips/tcg-target.h|  11 +-
 tcg/ppc/tcg-target.h |   5 +-
 tcg/riscv/tcg-target.h   |  11 +-
 tcg/s390/tcg-target.h|  12 +-
 tcg/sparc/tcg-target.h   |  11 +-
 tcg/tci/tcg-target.h |  12 +-
 accel/tcg/cpu-exec.c |  41 +++---
 accel/tcg/tcg-all.c  |  26 +++-
 accel/tcg/tcg-runtime.c  |   4 +-
 accel/tcg/translate-all.c| 255 ---
 accel/tcg/translator.c   |   4 +-
 bsd-user/main.c  |   2 +-
 disas.c  |   4 +-
 linux-user/main.c|   2 +-
 softmmu/physmem.c|   9 +-
 target/arm/cpu.c |   3 +-
 target/arm/translate-a64.c   |   2 +-
 target/avr/cpu.c |   3 +-
 target/hppa/cpu.c|   3 +-
 target/i386/cpu.c|   3 +-
 target/microblaze/cpu.c  |   3 +-
 target/mips/cpu.c|   3 +-
 target/riscv/cpu.c   |   3 +-
 target/rx/cpu.c  |   3 +-
 target/sh4/cpu.c |   3 +-
 target/sparc/cpu.c   |   3 +-
 target/tricore/cpu.c |   2 +-
 tcg/tcg-op.c |  15 ++-
 tcg/tcg.c|  85 +---
 tcg/tci.c|   4 +-
 accel/tcg/trace-events   |   2 +-
 tcg/aarch64/tcg-target.c.inc | 130 +-
 tcg/arm/tcg-target.c.inc |   6 +-
 tcg/i386/tcg-target.c.inc|  38 +++---
 tcg/mips/tcg-target.c.inc|  18 +--
 tcg/ppc/tcg-target.c.inc |  45 ---
 tcg/riscv/tcg-target.c.inc   |  12 +-
 tcg/s390/tcg-target.c.inc|   8 +-
 tcg/sparc/tcg-target.c.inc   |  22 +--
 tcg/tcg-pool.c.inc   |   6 +-
 tcg/tci

[PATCH v2 06/19] tcg: Adjust tcg_out_label for const

2020-10-29 Thread Richard Henderson
Simplify the arguments to always use s->code_ptr instead of
taking it as an argument.  That makes it easy to ensure that
the value_ptr is always the rx version.

Signed-off-by: Richard Henderson 
---
 tcg/tcg.c |  6 +++---
 tcg/i386/tcg-target.c.inc | 10 +-
 2 files changed, 8 insertions(+), 8 deletions(-)

diff --git a/tcg/tcg.c b/tcg/tcg.c
index ddc38b8c50..da16378d1c 100644
--- a/tcg/tcg.c
+++ b/tcg/tcg.c
@@ -301,11 +301,11 @@ static void tcg_out_reloc(TCGContext *s, tcg_insn_unit 
*code_ptr, int type,
 QSIMPLEQ_INSERT_TAIL(&l->relocs, r, next);
 }
 
-static void tcg_out_label(TCGContext *s, TCGLabel *l, tcg_insn_unit *ptr)
+static void tcg_out_label(TCGContext *s, TCGLabel *l)
 {
 tcg_debug_assert(!l->has_value);
 l->has_value = 1;
-l->u.value_ptr = tcg_mirror_rw_to_rx(ptr);
+l->u.value_ptr = tcg_mirror_rw_to_rx(s->code_ptr);
 }
 
 TCGLabel *gen_new_label(void)
@@ -4270,7 +4270,7 @@ int tcg_gen_code(TCGContext *s, TranslationBlock *tb)
 break;
 case INDEX_op_set_label:
 tcg_reg_alloc_bb_end(s, s->reserved_regs);
-tcg_out_label(s, arg_label(op->args[0]), s->code_ptr);
+tcg_out_label(s, arg_label(op->args[0]));
 break;
 case INDEX_op_call:
 tcg_reg_alloc_call(s, op);
diff --git a/tcg/i386/tcg-target.c.inc b/tcg/i386/tcg-target.c.inc
index 095553ce28..0ac1ef3d82 100644
--- a/tcg/i386/tcg-target.c.inc
+++ b/tcg/i386/tcg-target.c.inc
@@ -1452,7 +1452,7 @@ static void tcg_out_brcond2(TCGContext *s, const TCGArg 
*args,
 default:
 tcg_abort();
 }
-tcg_out_label(s, label_next, s->code_ptr);
+tcg_out_label(s, label_next);
 }
 #endif
 
@@ -1494,10 +1494,10 @@ static void tcg_out_setcond2(TCGContext *s, const 
TCGArg *args,
 
 tcg_out_movi(s, TCG_TYPE_I32, args[0], 0);
 tcg_out_jxx(s, JCC_JMP, label_over, 1);
-tcg_out_label(s, label_true, s->code_ptr);
+tcg_out_label(s, label_true);
 
 tcg_out_movi(s, TCG_TYPE_I32, args[0], 1);
-tcg_out_label(s, label_over, s->code_ptr);
+tcg_out_label(s, label_over);
 } else {
 /* When the destination does not overlap one of the arguments,
clear the destination first, jump if cond false, and emit an
@@ -1511,7 +1511,7 @@ static void tcg_out_setcond2(TCGContext *s, const TCGArg 
*args,
 tcg_out_brcond2(s, new_args, const_args+1, 1);
 
 tgen_arithi(s, ARITH_ADD, args[0], 1, 0);
-tcg_out_label(s, label_over, s->code_ptr);
+tcg_out_label(s, label_over);
 }
 }
 #endif
@@ -1525,7 +1525,7 @@ static void tcg_out_cmov(TCGContext *s, TCGCond cond, int 
rexw,
 TCGLabel *over = gen_new_label();
 tcg_out_jxx(s, tcg_cond_to_jcc[tcg_invert_cond(cond)], over, 1);
 tcg_out_mov(s, TCG_TYPE_I32, dest, v1);
-tcg_out_label(s, over, s->code_ptr);
+tcg_out_label(s, over);
 }
 }
 
-- 
2.25.1




Re: [RFC PATCH v5 00/33] Hexagon patch series

2020-10-29 Thread no-reply
Patchew URL: 
https://patchew.org/QEMU/1604016519-28065-1-git-send-email-tsimp...@quicinc.com/



Hi,

This series seems to have some coding style problems. See output below for
more information:

Type: series
Message-id: 1604016519-28065-1-git-send-email-tsimp...@quicinc.com
Subject: [RFC PATCH v5 00/33] Hexagon patch series

=== TEST SCRIPT BEGIN ===
#!/bin/bash
git rev-parse base > /dev/null || exit 0
git config --local diff.renamelimit 0
git config --local diff.renames True
git config --local diff.algorithm histogram
./scripts/checkpatch.pl --mailback base..
=== TEST SCRIPT END ===

Updating 3c8cf5a9c21ff8782164d1def7f44bd888713384
From https://github.com/patchew-project/qemu
 * [new tag] 
patchew/1604016519-28065-1-git-send-email-tsimp...@quicinc.com -> 
patchew/1604016519-28065-1-git-send-email-tsimp...@quicinc.com
Switched to a new branch 'test'
0c330fa Add Dockerfile for hexagon
d85cf6f Hexagon build infrastructure
3af1dde Hexagon (tests/tcg/hexagon) TCG tests
fa9a5dd Hexagon (linux-user/hexagon) Linux user emulation
5176cb3 Hexagon (target/hexagon) translation
274dd27 Hexagon (target/hexagon) TCG for floating point instructions
b802cba Hexagon (target/hexagon) TCG for instructions with multiple definitions
dcc2a90 Hexagon (target/hexagon) TCG generation
576f1cc Hexagon (target/hexagon) instruction classes
452b826 Hexagon (target/hexagon) macros
0a133bf Hexagon (target/hexagon) opcode data structures
767ce35 Hexagon (target/hexagon) generator phase 4 - decode tree
e217e1c Hexagon (target/hexagon) generator phase 3 - C preprocessor for decode 
tree
a776029 Hexagon (target/hexagon) generator phase 2 - generate header files
f7fce74 Hexagon (target/hexagon) generator phase 1 - C preprocessor for 
semantics
bd3fbea Hexagon (target/hexagon/imported) arch import
b0ec0c4 Hexagon (target/hexagon/fma_emu.[ch]) utility functions
71d3179 Hexagon (target/hexagon/conv_emu.[ch]) utility functions
4a2fc1e Hexagon (target/hexagon/arch.[ch]) utility functions
30e1f51 Hexagon (target/hexagon) instruction printing
7811188 Hexagon (target/hexagon) instruction/packet decode
14c10b4 Hexagon (target/hexagon) instruction attributes
04f03bb Hexagon (target/hexagon) register fields
d374a01 Hexagon (target/hexagon) instruction and packet types
797f862 Hexagon (target/hexagon) architecture types
f96acbe Hexagon (target/hexagon) GDB Stub
6e52390 Hexagon (target/hexagon) scalar core helpers
46e63e5 Hexagon (target/hexagon) register names
7ae71bf Hexagon (disas) disassembler
08fba28 Hexagon (target/hexagon) scalar core definition
efd74e2 Hexagon (include/elf.h) ELF machine definition
5771cf0 Hexagon (target/hexagon) README
bbfeea5 Hexagon Update MAINTAINERS file

=== OUTPUT BEGIN ===
1/33 Checking commit bbfeea5304db (Hexagon Update MAINTAINERS file)
2/33 Checking commit 5771cf011159 (Hexagon (target/hexagon) README)
WARNING: added, moved or deleted file(s), does MAINTAINERS need updating?
#13: 
new file mode 100644

total: 0 errors, 1 warnings, 235 lines checked

Patch 2/33 has style problems, please review.  If any of these errors
are false positives report them to the maintainer, see
CHECKPATCH in MAINTAINERS.
3/33 Checking commit efd74e2de913 (Hexagon (include/elf.h) ELF machine 
definition)
4/33 Checking commit 08fba2813f35 (Hexagon (target/hexagon) scalar core 
definition)
WARNING: added, moved or deleted file(s), does MAINTAINERS need updating?
#13: 
new file mode 100644

total: 0 errors, 1 warnings, 595 lines checked

Patch 4/33 has style problems, please review.  If any of these errors
are false positives report them to the maintainer, see
CHECKPATCH in MAINTAINERS.
5/33 Checking commit 7ae71bf8e6ef (Hexagon (disas) disassembler)
WARNING: added, moved or deleted file(s), does MAINTAINERS need updating?
#15: 
new file mode 100644

total: 0 errors, 1 warnings, 82 lines checked

Patch 5/33 has style problems, please review.  If any of these errors
are false positives report them to the maintainer, see
CHECKPATCH in MAINTAINERS.
6/33 Checking commit 46e63e56e7be (Hexagon (target/hexagon) register names)
WARNING: added, moved or deleted file(s), does MAINTAINERS need updating?
#12: 
new file mode 100644

total: 0 errors, 1 warnings, 83 lines checked

Patch 6/33 has style problems, please review.  If any of these errors
are false positives report them to the maintainer, see
CHECKPATCH in MAINTAINERS.
7/33 Checking commit 6e523900586c (Hexagon (target/hexagon) scalar core helpers)
WARNING: added, moved or deleted file(s), does MAINTAINERS need updating?
#14: 
new file mode 100644

total: 0 errors, 1 warnings, 1056 lines checked

Patch 7/33 has style problems, please review.  If any of these errors
are false positives report them to the maintainer, see
CHECKPATCH in MAINTAINERS.
8/33 Checking commit f96acbef0e83 (Hexagon (target/hexagon) GDB Stub)
WARNING: added, moved or deleted file(s), does MAINTAINERS need updating?
#27: 
new file mode 100644

total: 0 errors, 1 warnings, 64 lines checked

Patch 8/33 has style proble

[PATCH v2 2/8] hw/intc/arm_gicv3_kvm: silence the compiler warnings

2020-10-29 Thread Chen Qun
When using -Wimplicit-fallthrough in our CFLAGS, the compiler showed warnings:
hw/intc/arm_gicv3_kvm.c: In function ‘kvm_arm_gicv3_put’:
hw/intc/arm_gicv3_kvm.c:484:13: warning: this statement may fall through 
[-Wimplicit-fallthrough=]
 kvm_gicc_access(s, ICC_AP0R_EL1(1), ncpu, &reg64, true);
 ^~~
hw/intc/arm_gicv3_kvm.c:485:9: note: here
 default:
 ^~~
hw/intc/arm_gicv3_kvm.c:495:13: warning: this statement may fall through 
[-Wimplicit-fallthrough=]
 kvm_gicc_access(s, ICC_AP1R_EL1(2), ncpu, &reg64, true);
 ^~~
hw/intc/arm_gicv3_kvm.c:496:9: note: here
 case 6:
 ^~~~
hw/intc/arm_gicv3_kvm.c:498:13: warning: this statement may fall through 
[-Wimplicit-fallthrough=]
 kvm_gicc_access(s, ICC_AP1R_EL1(1), ncpu, &reg64, true);
 ^~~
hw/intc/arm_gicv3_kvm.c:499:9: note: here
 default:
 ^~~

hw/intc/arm_gicv3_kvm.c: In function ‘kvm_arm_gicv3_get’:
hw/intc/arm_gicv3_kvm.c:634:37: warning: this statement may fall through 
[-Wimplicit-fallthrough=]
 c->icc_apr[GICV3_G0][2] = reg64;
 ^~~
hw/intc/arm_gicv3_kvm.c:635:9: note: here
 case 6:
 ^~~~
hw/intc/arm_gicv3_kvm.c:637:37: warning: this statement may fall through 
[-Wimplicit-fallthrough=]
 c->icc_apr[GICV3_G0][1] = reg64;
 ^~~
hw/intc/arm_gicv3_kvm.c:638:9: note: here
 default:
 ^~~
hw/intc/arm_gicv3_kvm.c:648:39: warning: this statement may fall through 
[-Wimplicit-fallthrough=]
 c->icc_apr[GICV3_G1NS][2] = reg64;
 ~~^~~
hw/intc/arm_gicv3_kvm.c:649:9: note: here
 case 6:
 ^~~~
hw/intc/arm_gicv3_kvm.c:651:39: warning: this statement may fall through 
[-Wimplicit-fallthrough=]
 c->icc_apr[GICV3_G1NS][1] = reg64;
 ~~^~~
hw/intc/arm_gicv3_kvm.c:652:9: note: here
 default:
 ^~~

Reported-by: Euler Robot 
Signed-off-by: Chen Qun 
Reviewed-by: Peter Maydell 
---
Cc: Peter Maydell 
Cc: qemu-...@nongnu.org
---
 hw/intc/arm_gicv3_kvm.c | 8 
 1 file changed, 8 insertions(+)

diff --git a/hw/intc/arm_gicv3_kvm.c b/hw/intc/arm_gicv3_kvm.c
index 187eb054e0..d040a5d1e9 100644
--- a/hw/intc/arm_gicv3_kvm.c
+++ b/hw/intc/arm_gicv3_kvm.c
@@ -478,9 +478,11 @@ static void kvm_arm_gicv3_put(GICv3State *s)
 kvm_gicc_access(s, ICC_AP0R_EL1(3), ncpu, &reg64, true);
 reg64 = c->icc_apr[GICV3_G0][2];
 kvm_gicc_access(s, ICC_AP0R_EL1(2), ncpu, &reg64, true);
+/* fall through */
 case 6:
 reg64 = c->icc_apr[GICV3_G0][1];
 kvm_gicc_access(s, ICC_AP0R_EL1(1), ncpu, &reg64, true);
+/* fall through */
 default:
 reg64 = c->icc_apr[GICV3_G0][0];
 kvm_gicc_access(s, ICC_AP0R_EL1(0), ncpu, &reg64, true);
@@ -492,9 +494,11 @@ static void kvm_arm_gicv3_put(GICv3State *s)
 kvm_gicc_access(s, ICC_AP1R_EL1(3), ncpu, &reg64, true);
 reg64 = c->icc_apr[GICV3_G1NS][2];
 kvm_gicc_access(s, ICC_AP1R_EL1(2), ncpu, &reg64, true);
+/* fall through */
 case 6:
 reg64 = c->icc_apr[GICV3_G1NS][1];
 kvm_gicc_access(s, ICC_AP1R_EL1(1), ncpu, &reg64, true);
+/* fall through */
 default:
 reg64 = c->icc_apr[GICV3_G1NS][0];
 kvm_gicc_access(s, ICC_AP1R_EL1(0), ncpu, &reg64, true);
@@ -631,9 +635,11 @@ static void kvm_arm_gicv3_get(GICv3State *s)
 c->icc_apr[GICV3_G0][3] = reg64;
 kvm_gicc_access(s, ICC_AP0R_EL1(2), ncpu, &reg64, false);
 c->icc_apr[GICV3_G0][2] = reg64;
+/* fall through */
 case 6:
 kvm_gicc_access(s, ICC_AP0R_EL1(1), ncpu, &reg64, false);
 c->icc_apr[GICV3_G0][1] = reg64;
+/* fall through */
 default:
 kvm_gicc_access(s, ICC_AP0R_EL1(0), ncpu, &reg64, false);
 c->icc_apr[GICV3_G0][0] = reg64;
@@ -645,9 +651,11 @@ static void kvm_arm_gicv3_get(GICv3State *s)
 c->icc_apr[GICV3_G1NS][3] = reg64;
 kvm_gicc_access(s, ICC_AP1R_EL1(2), ncpu, &reg64, false);
 c->icc_apr[GICV3_G1NS][2] = reg64;
+/* fall through */
 case 6:
 kvm_gicc_access(s, ICC_AP1R_EL1(1), ncpu, &reg64, false);
 c->icc_apr[GICV3_G1NS][1] = reg64;
+/* fall through */
 default:
 kvm_gicc_access(s, ICC_AP1R_EL1(0), ncpu, &reg64, false);
 c->icc_apr[GICV3_G1NS][0] = reg64;
-- 
2.27.0




[PATCH v2 05/19] tcg: Adjust tcg_out_call for const

2020-10-29 Thread Richard Henderson
We must change all targets at once, since all must match
the declaration in tcg.c.

Signed-off-by: Richard Henderson 
---
 tcg/tcg.c| 2 +-
 tcg/aarch64/tcg-target.c.inc | 2 +-
 tcg/arm/tcg-target.c.inc | 2 +-
 tcg/i386/tcg-target.c.inc| 4 ++--
 tcg/mips/tcg-target.c.inc| 6 +++---
 tcg/ppc/tcg-target.c.inc | 8 
 tcg/riscv/tcg-target.c.inc   | 6 +++---
 tcg/s390/tcg-target.c.inc| 2 +-
 tcg/sparc/tcg-target.c.inc   | 4 ++--
 tcg/tci/tcg-target.c.inc | 2 +-
 10 files changed, 19 insertions(+), 19 deletions(-)

diff --git a/tcg/tcg.c b/tcg/tcg.c
index 88b13b9321..ddc38b8c50 100644
--- a/tcg/tcg.c
+++ b/tcg/tcg.c
@@ -148,7 +148,7 @@ static void tcg_out_st(TCGContext *s, TCGType type, TCGReg 
arg, TCGReg arg1,
intptr_t arg2);
 static bool tcg_out_sti(TCGContext *s, TCGType type, TCGArg val,
 TCGReg base, intptr_t ofs);
-static void tcg_out_call(TCGContext *s, tcg_insn_unit *target);
+static void tcg_out_call(TCGContext *s, const tcg_insn_unit *target);
 static int tcg_target_const_match(tcg_target_long val, TCGType type,
   const TCGArgConstraint *arg_ct);
 #ifdef TCG_TARGET_NEED_LDST_LABELS
diff --git a/tcg/aarch64/tcg-target.c.inc b/tcg/aarch64/tcg-target.c.inc
index 96dc9f4d0b..6d8152c468 100644
--- a/tcg/aarch64/tcg-target.c.inc
+++ b/tcg/aarch64/tcg-target.c.inc
@@ -1329,7 +1329,7 @@ static inline void tcg_out_callr(TCGContext *s, TCGReg 
reg)
 tcg_out_insn(s, 3207, BLR, reg);
 }
 
-static inline void tcg_out_call(TCGContext *s, tcg_insn_unit *target)
+static inline void tcg_out_call(TCGContext *s, const tcg_insn_unit *target)
 {
 ptrdiff_t offset = target - s->code_ptr;
 if (offset == sextract64(offset, 0, 26)) {
diff --git a/tcg/arm/tcg-target.c.inc b/tcg/arm/tcg-target.c.inc
index 1e32bf42b8..d6dfe2b428 100644
--- a/tcg/arm/tcg-target.c.inc
+++ b/tcg/arm/tcg-target.c.inc
@@ -1033,7 +1033,7 @@ static void tcg_out_goto(TCGContext *s, int cond, 
tcg_insn_unit *addr)
 
 /* The call case is mostly used for helpers - so it's not unreasonable
  * for them to be beyond branch range */
-static void tcg_out_call(TCGContext *s, tcg_insn_unit *addr)
+static void tcg_out_call(TCGContext *s, const tcg_insn_unit *addr)
 {
 intptr_t addri = (intptr_t)addr;
 ptrdiff_t disp = tcg_pcrel_diff(s, addr);
diff --git a/tcg/i386/tcg-target.c.inc b/tcg/i386/tcg-target.c.inc
index 424dd1cdcf..095553ce28 100644
--- a/tcg/i386/tcg-target.c.inc
+++ b/tcg/i386/tcg-target.c.inc
@@ -1591,7 +1591,7 @@ static void tcg_out_clz(TCGContext *s, int rexw, TCGReg 
dest, TCGReg arg1,
 }
 }
 
-static void tcg_out_branch(TCGContext *s, int call, tcg_insn_unit *dest)
+static void tcg_out_branch(TCGContext *s, int call, const tcg_insn_unit *dest)
 {
 intptr_t disp = tcg_pcrel_diff(s, dest) - 5;
 
@@ -1610,7 +1610,7 @@ static void tcg_out_branch(TCGContext *s, int call, 
tcg_insn_unit *dest)
 }
 }
 
-static inline void tcg_out_call(TCGContext *s, tcg_insn_unit *dest)
+static inline void tcg_out_call(TCGContext *s, const tcg_insn_unit *dest)
 {
 tcg_out_branch(s, 1, dest);
 }
diff --git a/tcg/mips/tcg-target.c.inc b/tcg/mips/tcg-target.c.inc
index f641105f9a..064f46fc6d 100644
--- a/tcg/mips/tcg-target.c.inc
+++ b/tcg/mips/tcg-target.c.inc
@@ -516,7 +516,7 @@ static void tcg_out_opc_sa64(TCGContext *s, MIPSInsn opc1, 
MIPSInsn opc2,
  * Type jump.
  * Returns true if the branch was in range and the insn was emitted.
  */
-static bool tcg_out_opc_jmp(TCGContext *s, MIPSInsn opc, void *target)
+static bool tcg_out_opc_jmp(TCGContext *s, MIPSInsn opc, const void *target)
 {
 uintptr_t dest = (uintptr_t)target;
 uintptr_t from = (uintptr_t)s->code_ptr + 4;
@@ -1079,7 +1079,7 @@ static void tcg_out_movcond(TCGContext *s, TCGCond cond, 
TCGReg ret,
 }
 }
 
-static void tcg_out_call_int(TCGContext *s, tcg_insn_unit *arg, bool tail)
+static void tcg_out_call_int(TCGContext *s, const tcg_insn_unit *arg, bool 
tail)
 {
 /* Note that the ABI requires the called function's address to be
loaded into T9, even if a direct branch is in range.  */
@@ -1097,7 +1097,7 @@ static void tcg_out_call_int(TCGContext *s, tcg_insn_unit 
*arg, bool tail)
 }
 }
 
-static void tcg_out_call(TCGContext *s, tcg_insn_unit *arg)
+static void tcg_out_call(TCGContext *s, const tcg_insn_unit *arg)
 {
 tcg_out_call_int(s, arg, false);
 tcg_out_nop(s);
diff --git a/tcg/ppc/tcg-target.c.inc b/tcg/ppc/tcg-target.c.inc
index be116c6164..513d784a83 100644
--- a/tcg/ppc/tcg-target.c.inc
+++ b/tcg/ppc/tcg-target.c.inc
@@ -1106,7 +1106,7 @@ static void tcg_out_xori32(TCGContext *s, TCGReg dst, 
TCGReg src, uint32_t c)
 tcg_out_zori32(s, dst, src, c, XORI, XORIS);
 }
 
-static void tcg_out_b(TCGContext *s, int mask, tcg_insn_unit *target)
+static void tcg_out_b(TCGContext *s, int mask, const tcg_insn_unit *target)
 {
 ptrdiff_t disp = tcg_pcrel_diff(s, target);
 if (in_range_b(disp)) {
@

[PATCH v2 7/8] ppc: Add a missing break for PPC6xx_INPUT_TBEN

2020-10-29 Thread Chen Qun
When compiling with -Wimplicit-fallthrough in our CFLAGS, the compiler emits this warning:
hw/ppc/ppc.c: In function ‘ppc6xx_set_irq’:
hw/ppc/ppc.c:118:16: warning: this statement may fall through 
[-Wimplicit-fallthrough=]
  118 | if (level) {
  |^
hw/ppc/ppc.c:123:9: note: here
  123 | case PPC6xx_INPUT_INT:
  | ^~~~

According to the discussion, a break statement needs to be added here.

Reported-by: Euler Robot 
Signed-off-by: Chen Qun 
---
v1->v2: Add a "break" statement here instead of /* fall through */ comments
(Based on Thomas's and David's review).

Cc: Thomas Huth 
Cc: David Gibson 
---
 hw/ppc/ppc.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/hw/ppc/ppc.c b/hw/ppc/ppc.c
index 4a11fb1640..1b98272076 100644
--- a/hw/ppc/ppc.c
+++ b/hw/ppc/ppc.c
@@ -120,6 +120,7 @@ static void ppc6xx_set_irq(void *opaque, int pin, int level)
 } else {
 cpu_ppc_tb_stop(env);
 }
+break;
 case PPC6xx_INPUT_INT:
 /* Level sensitive - active high */
 LOG_IRQ("%s: set the external IRQ state to %d\n",
-- 
2.27.0




[PATCH v2 1/8] target/i386: silence the compiler warnings in gen_shiftd_rm_T1

2020-10-29 Thread Chen Qun
The current "#ifdef TARGET_X86_64" block interferes with the
compiler's fall-through analysis.

When compiling with -Wimplicit-fallthrough in our CFLAGS, the compiler emits this warning:
target/i386/translate.c: In function ‘gen_shiftd_rm_T1’:
target/i386/translate.c:1773:12: warning: this statement may fall through 
[-Wimplicit-fallthrough=]
 if (is_right) {
^
target/i386/translate.c:1782:5: note: here
 case MO_32:
 ^~~~

Reported-by: Euler Robot 
Signed-off-by: Chen Qun 
Reviewed-by: Richard Henderson 
Reviewed-by: Thomas Huth 
---
v1->v2: Add comments to explain the two cases of fall through,
depending on whether TARGET_X86_64 is defined.

Cc: Thomas Huth 
Cc: Paolo Bonzini 
Cc: Richard Henderson 
Cc: Eduardo Habkost 
---
 target/i386/translate.c | 7 +--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/target/i386/translate.c b/target/i386/translate.c
index caea6f5fb1..77cb66208e 100644
--- a/target/i386/translate.c
+++ b/target/i386/translate.c
@@ -1777,9 +1777,12 @@ static void gen_shiftd_rm_T1(DisasContext *s, MemOp ot, 
int op1,
 } else {
 tcg_gen_deposit_tl(s->T1, s->T0, s->T1, 16, 16);
 }
-/* FALLTHRU */
-#ifdef TARGET_X86_64
+/*
+ * If TARGET_X86_64 defined then fall through into MO_32 case,
+ * otherwise fall through default case.
+ */
 case MO_32:
+#ifdef TARGET_X86_64
 /* Concatenate the two 32-bit values and use a 64-bit shift.  */
 tcg_gen_subi_tl(s->tmp0, count, 1);
 if (is_right) {
-- 
2.27.0




[PATCH v2 3/8] accel/tcg/user-exec: silence the compiler warnings

2020-10-29 Thread Chen Qun
When compiling with -Wimplicit-fallthrough in our CFLAGS, the compiler emits this warning:
../accel/tcg/user-exec.c: In function ‘handle_cpu_signal’:
../accel/tcg/user-exec.c:169:13: warning: this statement may fall through 
[-Wimplicit-fallthrough=]
  169 | cpu_exit_tb_from_sighandler(cpu, old_set);
  | ^
../accel/tcg/user-exec.c:172:9: note: here
  172 | default:

Mark the cpu_exit_tb_from_sighandler() function with QEMU_NORETURN to fix it.

Reported-by: Euler Robot 
Signed-off-by: Chen Qun 
---
v1->v2: Add QEMU_NORETURN to cpu_exit_tb_from_sighandler() function
to avoid the compiler warning (based on Thomas's and Richard's comments).

Cc: Thomas Huth 
Cc: Riku Voipio 
Cc: Richard Henderson 
Cc: Paolo Bonzini 
---
 accel/tcg/user-exec.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/accel/tcg/user-exec.c b/accel/tcg/user-exec.c
index 4ebe25461a..293ee86ea4 100644
--- a/accel/tcg/user-exec.c
+++ b/accel/tcg/user-exec.c
@@ -49,7 +49,8 @@ __thread uintptr_t helper_retaddr;
 /* exit the current TB from a signal handler. The host registers are
restored in a state compatible with the CPU emulator
  */
-static void cpu_exit_tb_from_sighandler(CPUState *cpu, sigset_t *old_set)
+static void QEMU_NORETURN cpu_exit_tb_from_sighandler(CPUState *cpu,
+  sigset_t *old_set)
 {
 /* XXX: use siglongjmp ? */
 sigprocmask(SIG_SETMASK, old_set, NULL);
-- 
2.27.0




[RFC PATCH v5 33/33] Add Dockerfile for hexagon

2020-10-29 Thread Taylor Simpson
Signed-off-by: Alessandro Di Federico 
Signed-off-by: Taylor Simpson 
---
 .../debian-hexagon-cross.build-toolchain.sh| 141 +
 .../docker/dockerfiles/debian-hexagon-cross.docker |  18 +++
 2 files changed, 159 insertions(+)
 create mode 100755 
tests/docker/dockerfiles/debian-hexagon-cross.build-toolchain.sh
 create mode 100644 tests/docker/dockerfiles/debian-hexagon-cross.docker

diff --git a/tests/docker/dockerfiles/debian-hexagon-cross.build-toolchain.sh 
b/tests/docker/dockerfiles/debian-hexagon-cross.build-toolchain.sh
new file mode 100755
index 000..a08c6cd
--- /dev/null
+++ b/tests/docker/dockerfiles/debian-hexagon-cross.build-toolchain.sh
@@ -0,0 +1,141 @@
+#!/bin/bash
+
+set -e
+
+BASE=$(readlink -f ${PWD})
+
+TOOLCHAIN_INSTALL=$(readlink -f "$TOOLCHAIN_INSTALL")
+ROOTFS=$(readlink -f "$ROOTFS")
+
+TOOLCHAIN_BIN=${TOOLCHAIN_INSTALL}/bin
+HEX_SYSROOT=${TOOLCHAIN_INSTALL}/hexagon-unknown-linux-musl
+HEX_TOOLS_TARGET_BASE=${HEX_SYSROOT}/usr
+
+function cdp() {
+  DIR="$1"
+  mkdir -p "$DIR"
+  cd "$DIR"
+}
+
+function fetch() {
+  DIR="$1"
+  URL="$2"
+  TEMP="$(readlink -f "$PWD/tmp.tar.gz")"
+  wget --quiet "$URL" -O "$TEMP"
+  cdp "$DIR"
+  tar xaf "$TEMP" --strip-components=1
+  rm "$TEMP"
+  cd -
+}
+
+build_llvm_clang() {
+  fetch "$BASE/llvm-project" "$LLVM_URL"
+  cdp "$BASE/build-llvm"
+
+  cmake -G Ninja \
+-DCMAKE_BUILD_TYPE=Release \
+-DCMAKE_INSTALL_PREFIX=${TOOLCHAIN_INSTALL} \
+-DLLVM_ENABLE_LLD=ON \
+-DLLVM_TARGETS_TO_BUILD="X86;Hexagon" \
+-DLLVM_ENABLE_PROJECTS="clang;lld" \
+"$BASE/llvm-project/llvm"
+  ninja all install
+  cd ${TOOLCHAIN_BIN}
+  ln -sf clang hexagon-unknown-linux-musl-clang
+  ln -sf clang++ hexagon-unknown-linux-musl-clang++
+  ln -sf llvm-ar hexagon-unknown-linux-musl-ar
+  ln -sf llvm-objdump hexagon-unknown-linux-musl-objdump
+  ln -sf llvm-objcopy hexagon-unknown-linux-musl-objcopy
+  ln -sf llvm-readelf hexagon-unknown-linux-musl-readelf
+  ln -sf llvm-ranlib hexagon-unknown-linux-musl-ranlib
+
+  # workaround for now:
+  cat <<EOF > hexagon-unknown-linux-musl.cfg
+-G0 --sysroot=${HEX_SYSROOT}
+EOF
+}
+
+build_clang_rt() {
+  cdp "$BASE/build-clang_rt"
+  cmake -G Ninja \
+-DCMAKE_BUILD_TYPE=Release \
+-DLLVM_CONFIG_PATH="$BASE/build-llvm/bin/llvm-config" \
+-DCMAKE_ASM_FLAGS="-G0 -mlong-calls -fno-pic 
--target=hexagon-unknown-linux-musl " \
+-DCMAKE_SYSTEM_NAME=Linux \
+-DCMAKE_C_COMPILER="${TOOLCHAIN_BIN}/hexagon-unknown-linux-musl-clang" \
+-DCMAKE_ASM_COMPILER="${TOOLCHAIN_BIN}/hexagon-unknown-linux-musl-clang" \
+-DCMAKE_INSTALL_PREFIX=${HEX_TOOLS_TARGET_BASE} \
+-DCMAKE_CROSSCOMPILING=ON \
+-DCMAKE_C_COMPILER_FORCED=ON \
+-DCMAKE_CXX_COMPILER_FORCED=ON \
+-DCOMPILER_RT_BUILD_BUILTINS=ON \
+-DCOMPILER_RT_BUILTINS_ENABLE_PIC=OFF \
+-DCMAKE_SIZEOF_VOID_P=4 \
+-DCOMPILER_RT_OS_DIR= \
+-DCAN_TARGET_hexagon=1 \
+-DCAN_TARGET_x86_64=0 \
+-DCOMPILER_RT_SUPPORTED_ARCH=hexagon \
+-DLLVM_ENABLE_PROJECTS="compiler-rt" \
+"$BASE/llvm-project/compiler-rt"
+  ninja install-compiler-rt
+}
+
+build_musl_headers() {
+  fetch "$BASE/musl" "$MUSL_URL"
+  cd "$BASE/musl"
+  make clean
+  CC=${TOOLCHAIN_BIN}/hexagon-unknown-linux-musl-clang \
+CROSS_COMPILE=hexagon-unknown-linux-musl \
+LIBCC=${HEX_TOOLS_TARGET_BASE}/lib/libclang_rt.builtins-hexagon.a \
+CROSS_CFLAGS="-G0 -O0 -mv65 -fno-builtin -fno-rounding-math 
--target=hexagon-unknown-linux-musl" \
+./configure --target=hexagon --prefix=${HEX_TOOLS_TARGET_BASE}
+  PATH=${TOOLCHAIN_BIN}:$PATH make CROSS_COMPILE= install-headers
+
+  cd ${HEX_SYSROOT}/..
+  ln -sf hexagon-unknown-linux-musl hexagon
+}
+
+build_kernel_headers() {
+  fetch "$BASE/linux" "$LINUX_URL"
+  mkdir -p "$BASE/build-linux"
+  cd "$BASE/linux"
+  make O=../build-linux ARCH=hexagon \
+   KBUILD_CFLAGS_KERNEL="-mlong-calls" \
+   CC=${TOOLCHAIN_BIN}/hexagon-unknown-linux-musl-clang \
+   LD=${TOOLCHAIN_BIN}/ld.lld \
+   KBUILD_VERBOSE=1 comet_defconfig
+  make mrproper
+
+  cd "$BASE/build-linux"
+  make \
+ARCH=hexagon \
+CC=${TOOLCHAIN_BIN}/clang \
+INSTALL_HDR_PATH=${HEX_TOOLS_TARGET_BASE} \
+V=1 \
+headers_install
+}
+
+build_musl() {
+  cd "$BASE/musl"
+  make clean
+  CROSS_COMPILE=hexagon-unknown-linux-musl- \
+AR=llvm-ar \
+RANLIB=llvm-ranlib \
+STRIP=llvm-strip \
+CC=clang \
+LIBCC=${HEX_TOOLS_TARGET_BASE}/lib/libclang_rt.builtins-hexagon.a \
+CFLAGS="-G0 -O0 -mv65 -fno-builtin -fno-rounding-math 
--target=hexagon-unknown-linux-musl" \
+./configure --target=hexagon --prefix=${HEX_TOOLS_TARGET_BASE}
+  PATH=${TOOLCHAIN_BIN}/:$PATH make -j CROSS_COMPILE= install
+  cd ${HEX_TOOLS_TARGET_BASE}/lib
+  ln -sf libc.so ld-musl-hexagon.so
+  ln -sf ld-musl-hexagon.so ld-musl-hexagon.so.1
+  cdp ${HEX_TOOLS_TARGET_BASE}/../lib
+  ln -sf ../usr/lib/ld-musl-hexagon.so.1
+}
+
+build_llvm_clang
+build_kernel_headers
+build_musl_headers
+build_clang_rt
+build_musl
diff 

[RFC PATCH v5 32/33] Hexagon build infrastructure

2020-10-29 Thread Taylor Simpson
Add file to default-configs
Add hexagon to meson.build
Add hexagon to target/meson.build
Add target/hexagon/meson.build
Change scripts/qemu-binfmt-conf.sh

We can build a hexagon-linux-user target and run programs on the Hexagon
scalar core.  With hexagon-linux-clang installed, "make check-tcg" will
pass.

Signed-off-by: Taylor Simpson 
---
 default-configs/targets/hexagon-linux-user.mak |   1 +
 meson.build|   1 +
 scripts/qemu-binfmt-conf.sh|   6 +-
 target/hexagon/meson.build | 178 +
 target/meson.build |   1 +
 5 files changed, 186 insertions(+), 1 deletion(-)
 create mode 100644 default-configs/targets/hexagon-linux-user.mak
 create mode 100644 target/hexagon/meson.build

diff --git a/default-configs/targets/hexagon-linux-user.mak 
b/default-configs/targets/hexagon-linux-user.mak
new file mode 100644
index 000..003ed0a
--- /dev/null
+++ b/default-configs/targets/hexagon-linux-user.mak
@@ -0,0 +1 @@
+TARGET_ARCH=hexagon
diff --git a/meson.build b/meson.build
index 47e32e1..6a393f9 100644
--- a/meson.build
+++ b/meson.build
@@ -814,6 +814,7 @@ disassemblers = {
   'arm' : ['CONFIG_ARM_DIS'],
   'avr' : ['CONFIG_AVR_DIS'],
   'cris' : ['CONFIG_CRIS_DIS'],
+  'hexagon' : ['CONFIG_HEXAGON_DIS'],
   'hppa' : ['CONFIG_HPPA_DIS'],
   'i386' : ['CONFIG_I386_DIS'],
   'x86_64' : ['CONFIG_I386_DIS'],
diff --git a/scripts/qemu-binfmt-conf.sh b/scripts/qemu-binfmt-conf.sh
index 9f1580a..7b5d54b 100755
--- a/scripts/qemu-binfmt-conf.sh
+++ b/scripts/qemu-binfmt-conf.sh
@@ -4,7 +4,7 @@
 qemu_target_list="i386 i486 alpha arm armeb sparc sparc32plus sparc64 \
 ppc ppc64 ppc64le m68k mips mipsel mipsn32 mipsn32el mips64 mips64el \
 sh4 sh4eb s390x aarch64 aarch64_be hppa riscv32 riscv64 xtensa xtensaeb \
-microblaze microblazeel or1k x86_64"
+microblaze microblazeel or1k x86_64 hexagon"
 
 
i386_magic='\x7fELF\x01\x01\x01\x00\x00\x00\x00\x00\x00\x00\x00\x00\x02\x00\x03\x00'
 
i386_mask='\xff\xff\xff\xff\xff\xfe\xfe\x00\xff\xff\xff\xff\xff\xff\xff\xff\xfe\xff\xff\xff'
@@ -136,6 +136,10 @@ 
or1k_magic='\x7fELF\x01\x02\x01\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x02\x00\
 
or1k_mask='\xff\xff\xff\xff\xff\xff\xff\x00\xff\xff\xff\xff\xff\xff\xff\xff\xff\xfe\xff\xff'
 or1k_family=or1k
 
+hexagon_magic='\x7fELF\x01\x01\x01\x00\x00\x00\x00\x00\x00\x00\x00\x00\x02\x00\xa4\x00'
+hexagon_mask='\xff\xff\xff\xff\xff\xff\xff\x00\xff\xff\xff\xff\xff\xff\xff\xff\xfe\xff\xff\xff'
+hexagon_family=hexagon
+
 qemu_get_family() {
 cpu=${HOST_ARCH:-$(uname -m)}
 case "$cpu" in
diff --git a/target/hexagon/meson.build b/target/hexagon/meson.build
new file mode 100644
index 000..8ff5cf6
--- /dev/null
+++ b/target/hexagon/meson.build
@@ -0,0 +1,178 @@
+##
+##  Copyright(c) 2020 Qualcomm Innovation Center, Inc. All Rights Reserved.
+##
+##  This program is free software; you can redistribute it and/or modify
+##  it under the terms of the GNU General Public License as published by
+##  the Free Software Foundation; either version 2 of the License, or
+##  (at your option) any later version.
+##
+##  This program is distributed in the hope that it will be useful,
+##  but WITHOUT ANY WARRANTY; without even the implied warranty of
+##  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+##  GNU General Public License for more details.
+##
+##  You should have received a copy of the GNU General Public License
+##  along with this program; if not, see .
+##
+
+hexagon_ss = ss.source_set()
+
+prog_python = import('python').find_installation('python3')
+
+hex_common_py = 'hex_common.py'
+attribs_def_h = meson.current_source_dir() / 'attribs_def.h'
+gen_tcg_h = meson.current_source_dir() / 'gen_tcg.h'
+
+#
+#  Step 1
+#  We use a C program to create semantics_generated.pyinc
+#
+gen_semantics = executable('gen_semantics', 'gen_semantics.c')
+
+semantics = custom_target(
+'semantics_generated.pyinc',
+output: 'semantics_generated.pyinc',
+input: gen_semantics,
+command: ['@INPUT@', '@OUTPUT@'],
+)
+hexagon_ss.add(semantics)
+
+#
+# Step 2
+# We use Python scripts to generate the following files
+# shortcode_generated.h
+# helper_protos_generated.h
+# tcg_funcs_generated.h
+# tcg_func_table_generated.h
+# helper_funcs_generated.h
+# printinsn_generated.h
+# op_regs_generated.h
+# op_attribs_generated.h
+# opcodes_def_generated.h
+#
+shortcode_h = custom_target(
+'shortcode_generated.h',
+output: 'shortcode_generated.h',
+input: 'gen_shortcode.py',
+depend_files: [hex_common_py],
+command: [prog_python, '@INPUT@', semantics, attribs_def_h, '@OUTPUT@'],
+)
+hexagon_ss.add(shortcode_h)
+
+helper_protos_h = custom_target(
+'helper_protos_generated.h',
+output: 'helper_protos_generated.h',
+input: 'gen_helper_protos.py',
+depend_files: [hex_common_py],
+command: [prog_python, '@INPUT@', 

[PATCH v2 18/19] tcg/aarch64: Implement flush_idcache_range manually

2020-10-29 Thread Richard Henderson
Copy the single-pointer implementation from libgcc and modify it to
support the double-pointer interface we require.  This halves the
number of cache operations required when split-rwx is enabled.

Signed-off-by: Richard Henderson 
---
 tcg/aarch64/tcg-target.h | 11 +---
 tcg/aarch64/tcg-target.c.inc | 53 
 2 files changed, 54 insertions(+), 10 deletions(-)

diff --git a/tcg/aarch64/tcg-target.h b/tcg/aarch64/tcg-target.h
index fa64058d43..e62d38ba55 100644
--- a/tcg/aarch64/tcg-target.h
+++ b/tcg/aarch64/tcg-target.h
@@ -148,16 +148,7 @@ typedef enum {
 #define TCG_TARGET_DEFAULT_MO (0)
 #define TCG_TARGET_HAS_MEMORY_BSWAP 1
 
-/* Flush the dcache at RW, and the icache at RX, as necessary. */
-static inline void flush_idcache_range(uintptr_t rx, uintptr_t rw, size_t len)
-{
-/* TODO: Copy this from gcc to avoid 4 loops instead of 2. */
-if (rw != rx) {
-__builtin___clear_cache((char *)rw, (char *)(rw + len));
-}
-__builtin___clear_cache((char *)rx, (char *)(rx + len));
-}
-
+void flush_idcache_range(uintptr_t rx, uintptr_t rw, size_t len);
 void tb_target_set_jmp_target(uintptr_t, uintptr_t, uintptr_t, uintptr_t);
 
 #ifdef CONFIG_SOFTMMU
diff --git a/tcg/aarch64/tcg-target.c.inc b/tcg/aarch64/tcg-target.c.inc
index bd888bc66d..5e8f3faad2 100644
--- a/tcg/aarch64/tcg-target.c.inc
+++ b/tcg/aarch64/tcg-target.c.inc
@@ -2968,3 +2968,56 @@ void tcg_register_jit(const void *buf, size_t buf_size)
 {
 tcg_register_jit_int(buf, buf_size, &debug_frame, sizeof(debug_frame));
 }
+
+/*
+ * Flush the dcache at RW, and the icache at RX, as necessary.
+ * This is a copy of gcc's __aarch64_sync_cache_range, modified
+ * to fit this three-operand interface.
+ */
+void flush_idcache_range(uintptr_t rx, uintptr_t rw, size_t len)
+{
+const unsigned CTR_IDC = 1u << 28;
+const unsigned CTR_DIC = 1u << 29;
+static unsigned int cache_info;
+uintptr_t icache_lsize, dcache_lsize, p;
+
+if (!cache_info) {
+/*
+ * CTR_EL0 [3:0] contains log2 of icache line size in words.
+ * CTR_EL0 [19:16] contains log2 of dcache line size in words.
+ */
+asm volatile("mrs\t%0, ctr_el0" : "=r"(cache_info));
+}
+
+icache_lsize = 4 << extract32(cache_info, 0, 4);
+dcache_lsize = 4 << extract32(cache_info, 16, 4);
+
+/*
+ * If CTR_EL0.IDC is enabled, Data cache clean to the Point of Unification
+ * is not required for instruction to data coherence.
+ */
+if (!(cache_info & CTR_IDC)) {
+/*
+ * Loop over the address range, clearing one cache line at once.
+ * Data cache must be flushed to unification first to make sure
+ * the instruction cache fetches the updated data.
+ */
+for (p = rw & -dcache_lsize; p < rw + len; p += dcache_lsize) {
+asm volatile("dc\tcvau, %0" : : "r" (p) : "memory");
+}
+asm volatile("dsb\tish" : : : "memory");
+}
+
+/*
+ * If CTR_EL0.DIC is enabled, Instruction cache cleaning to the Point
+ * of Unification is not required for instruction to data coherence.
+ */
+if (!(cache_info & CTR_DIC)) {
+for (p = rx & -icache_lsize; p < rx + len; p += icache_lsize) {
+asm volatile("ic\tivau, %0" : : "r"(p) : "memory");
+}
+asm volatile ("dsb\tish" : : : "memory");
+}
+
+asm volatile("isb" : : : "memory");
+}
-- 
2.25.1




[RFC PATCH v5 24/33] Hexagon (target/hexagon) macros

2020-10-29 Thread Taylor Simpson
Add the macros used to interface with the generator and the macros
referenced in the instruction semantics.

Signed-off-by: Taylor Simpson 
---
 target/hexagon/macros.h | 591 
 1 file changed, 591 insertions(+)
 create mode 100644 target/hexagon/macros.h

diff --git a/target/hexagon/macros.h b/target/hexagon/macros.h
new file mode 100644
index 000..ff2ddd0
--- /dev/null
+++ b/target/hexagon/macros.h
@@ -0,0 +1,591 @@
+/*
+ *  Copyright(c) 2019-2020 Qualcomm Innovation Center, Inc. All Rights 
Reserved.
+ *
+ *  This program is free software; you can redistribute it and/or modify
+ *  it under the terms of the GNU General Public License as published by
+ *  the Free Software Foundation; either version 2 of the License, or
+ *  (at your option) any later version.
+ *
+ *  This program is distributed in the hope that it will be useful,
+ *  but WITHOUT ANY WARRANTY; without even the implied warranty of
+ *  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ *  GNU General Public License for more details.
+ *
+ *  You should have received a copy of the GNU General Public License
+ *  along with this program; if not, see .
+ */
+
+#ifndef HEXAGON_MACROS_H
+#define HEXAGON_MACROS_H
+
+#include "cpu.h"
+#include "hex_regs.h"
+#include "reg_fields.h"
+
+#ifdef QEMU_GENERATE
+#define READ_REG(dest, NUM)  gen_read_reg(dest, NUM)
+#define READ_PREG(dest, NUM) gen_read_preg(dest, (NUM))
+#else
+#define READ_REG(NUM)(env->gpr[(NUM)])
+#define READ_PREG(NUM)   (env->pred[NUM])
+
+#define WRITE_RREG(NUM, VAL) log_reg_write(env, NUM, VAL, slot)
+#define WRITE_PREG(NUM, VAL) log_pred_write(env, NUM, VAL)
+#endif
+
+#define PCALIGN 4
+#define PCALIGN_MASK (PCALIGN - 1)
+
+#define GET_FIELD(FIELD, REGIN) \
+fEXTRACTU_BITS(REGIN, reg_field_info[FIELD].width, \
+   reg_field_info[FIELD].offset)
+
+#ifdef QEMU_GENERATE
+#define GET_USR_FIELD(FIELD, DST) \
+tcg_gen_extract_tl(DST, hex_gpr[HEX_REG_USR], \
+   reg_field_info[FIELD].offset, \
+   reg_field_info[FIELD].width)
+
+#define TYPE_INT(X)  __builtin_types_compatible_p(typeof(X), int)
+#define TYPE_TCGV(X) __builtin_types_compatible_p(typeof(X), TCGv)
+#define TYPE_TCGV_I64(X) __builtin_types_compatible_p(typeof(X), TCGv_i64)
+
+#define SET_USR_FIELD_FUNC(X) \
+__builtin_choose_expr(TYPE_INT(X), \
+gen_set_usr_fieldi, \
+__builtin_choose_expr(TYPE_TCGV(X), \
+gen_set_usr_field, (void)0))
+#define SET_USR_FIELD(FIELD, VAL) \
+SET_USR_FIELD_FUNC(VAL)(FIELD, VAL)
+#else
+#define GET_USR_FIELD(FIELD) \
+fEXTRACTU_BITS(env->gpr[HEX_REG_USR], reg_field_info[FIELD].width, \
+   reg_field_info[FIELD].offset)
+
+#define SET_USR_FIELD(FIELD, VAL) \
+fINSERT_BITS(env->gpr[HEX_REG_USR], reg_field_info[FIELD].width, \
+ reg_field_info[FIELD].offset, (VAL))
+#endif
+
+#ifdef QEMU_GENERATE
+/*
+ * Section 5.5 of the Hexagon V67 Programmer's Reference Manual
+ *
+ * Slot 1 store with slot 0 load
+ * A slot 1 store operation with a slot 0 load operation can appear in a 
packet.
+ * The packet attribute :mem_noshuf inhibits the instruction reordering that
+ * would otherwise be done by the assembler. For example:
+ * {
+ * memw(R5) = R2 // slot 1 store
+ * R3 = memh(R6) // slot 0 load
+ * }:mem_noshuf
+ * Unlike most packetized operations, these memory operations are not executed
+ * in parallel (Section 3.3.1). Instead, the store instruction in Slot 1
+ * effectively executes first, followed by the load instruction in Slot 0. If
+ * the addresses of the two operations are overlapping, the load will receive
+ * the newly stored data. This feature is supported in processor versions
+ * V65 or greater.
+ *
+ *
+ * For qemu, we look for a load in slot 0 when there is  a store in slot 1
+ * in the same packet.  When we see this, we call a helper that merges the
+ * bytes from the store buffer with the value loaded from memory.
+ */
+#define CHECK_NOSHUF \
+do { \
+if (insn->slot == 0 && pkt->pkt_has_store_s1) { \
+process_store(ctx, 1); \
+} \
+} while (0)
+
+#define MEM_LOAD1s(DST, VA) \
+do { \
+CHECK_NOSHUF; \
+tcg_gen_qemu_ld8s(DST, VA, ctx->mem_idx); \
+} while (0)
+#define MEM_LOAD1u(DST, VA) \
+do { \
+CHECK_NOSHUF; \
+tcg_gen_qemu_ld8u(DST, VA, ctx->mem_idx); \
+} while (0)
+#define MEM_LOAD2s(DST, VA) \
+do { \
+CHECK_NOSHUF; \
+tcg_gen_qemu_ld16s(DST, VA, ctx->mem_idx); \
+} while (0)
+#define MEM_LOAD2u(DST, VA) \
+do { \
+CHECK_NOSHUF; \
+tcg_gen_qemu_ld16u(DST, VA, ctx->mem_idx); \
+} while (0)
+#define MEM_LOAD4s(DST, VA) \
+do { \
+CHECK_NOSHUF; \
+tcg_gen_qemu_ld32s(DST, VA, ctx->mem_idx); \
+} w

[PATCH v2 14/19] RFC: accel/tcg: Support split-rwx for darwin/iOS with vm_remap

2020-10-29 Thread Richard Henderson
Cribbed from code posted by Joelle van Dyne ,
and rearranged to a cleaner structure.  Completely untested.

Signed-off-by: Richard Henderson 
---
 accel/tcg/translate-all.c | 68 ++-
 1 file changed, 67 insertions(+), 1 deletion(-)

diff --git a/accel/tcg/translate-all.c b/accel/tcg/translate-all.c
index 3e69ebd1d3..bf8263fdb4 100644
--- a/accel/tcg/translate-all.c
+++ b/accel/tcg/translate-all.c
@@ -1093,6 +1093,11 @@ static bool alloc_code_gen_buffer_anon(size_t size, int 
prot, Error **errp)
 int flags = MAP_PRIVATE | MAP_ANONYMOUS;
 void *buf;
 
+#ifdef CONFIG_DARWIN
+/* Applicable to both iOS and macOS (Apple Silicon). */
+flags |= MAP_JIT;
+#endif
+
 buf = mmap(NULL, size, prot, flags, -1, 0);
 if (buf == MAP_FAILED) {
 error_setg_errno(errp, errno,
@@ -1182,13 +1187,74 @@ static bool alloc_code_gen_buffer_mirror_memfd(size_t 
size, Error **errp)
 qemu_madvise(buf_rx, size, QEMU_MADV_HUGEPAGE);
 return true;
 }
-#endif
+#endif /* CONFIG_LINUX */
+
+#ifdef CONFIG_DARWIN
+#include 
+
+extern kern_return_t mach_vm_remap(vm_map_t target_task,
+   mach_vm_address_t *target_address,
+   mach_vm_size_t size,
+   mach_vm_offset_t mask,
+   int flags,
+   vm_map_t src_task,
+   mach_vm_address_t src_address,
+   boolean_t copy,
+   vm_prot_t *cur_protection,
+   vm_prot_t *max_protection,
+   vm_inherit_t inheritance);
+
+static bool alloc_code_gen_buffer_mirror_vmremap(size_t size, Error **errp)
+{
+kern_return_t ret;
+mach_vm_address_t buf_rw, buf_rx;
+vm_prot_t cur_prot, max_prot;
+
+/* Map the read-write portion via normal anon memory. */
+if (!alloc_code_gen_buffer_anon(size, PROT_READ | PROT_WRITE, errp)) {
+return false;
+}
+
+buf_rw = tcg_ctx->code_gen_buffer;
+buf_rx = 0;
+ret = mach_vm_remap(mach_task_self(),
+&buf_rx,
+size,
+0,
+VM_FLAGS_ANYWHERE | VM_FLAGS_RANDOM_ADDR,
+mach_task_self(),
+buf_rw,
+false,
+&cur_prot,
+&max_prot,
+VM_INHERIT_NONE);
+if (ret != KERN_SUCCESS) {
+/* TODO: Convert "ret" to a human readable error message. */
+error_setg(errp, "vm_remap for jit mirror failed");
+munmap((void *)buf_rw, size);
+return false;
+}
+
+if (mprotect((void *)buf_rx, size, PROT_READ | PROT_EXEC) != 0) {
+error_setg_errno(errp, errno, "mprotect for jit mirror");
+munmap((void *)buf_rx, size);
+munmap((void *)buf_rw, size);
+return false;
+}
+
+tcg_rx_mirror_diff = buf_rx - buf_rw;
+return true;
+}
+#endif /* CONFIG_DARWIN */
 
 static bool alloc_code_gen_buffer_mirror(size_t size, Error **errp)
 {
 if (TCG_TARGET_SUPPORT_MIRROR) {
 #ifdef CONFIG_LINUX
 return alloc_code_gen_buffer_mirror_memfd(size, errp);
+#endif
+#ifdef CONFIG_DARWIN
+return alloc_code_gen_buffer_mirror_vmremap(size, errp);
 #endif
 }
 error_setg(errp, "jit split-rwx not supported");
-- 
2.25.1




[PATCH v2 15/19] tcg: Return the rx mirror of TranslationBlock from exit_tb

2020-10-29 Thread Richard Henderson
This produces a small pc-relative displacement within the
generated code to the TB structure that precedes it.

Signed-off-by: Richard Henderson 
---
 accel/tcg/cpu-exec.c | 35 ++-
 tcg/tcg-op.c | 13 -
 2 files changed, 34 insertions(+), 14 deletions(-)

diff --git a/accel/tcg/cpu-exec.c b/accel/tcg/cpu-exec.c
index 4af3faba80..f3d17f28d0 100644
--- a/accel/tcg/cpu-exec.c
+++ b/accel/tcg/cpu-exec.c
@@ -144,12 +144,13 @@ static void init_delay_params(SyncClocks *sc, const 
CPUState *cpu)
 #endif /* CONFIG USER ONLY */
 
 /* Execute a TB, and fix up the CPU state afterwards if necessary */
-static inline tcg_target_ulong cpu_tb_exec(CPUState *cpu, TranslationBlock 
*itb)
+static inline TranslationBlock *cpu_tb_exec(CPUState *cpu,
+TranslationBlock *itb,
+int *tb_exit)
 {
 CPUArchState *env = cpu->env_ptr;
 uintptr_t ret;
 TranslationBlock *last_tb;
-int tb_exit;
 const void *tb_ptr = itb->tc.ptr;
 
 qemu_log_mask_and_addr(CPU_LOG_EXEC, itb->pc,
@@ -177,11 +178,20 @@ static inline tcg_target_ulong cpu_tb_exec(CPUState *cpu, 
TranslationBlock *itb)
 
 ret = tcg_qemu_tb_exec(env, tb_ptr);
 cpu->can_do_io = 1;
-last_tb = (TranslationBlock *)(ret & ~TB_EXIT_MASK);
-tb_exit = ret & TB_EXIT_MASK;
-trace_exec_tb_exit(last_tb, tb_exit);
+/*
+ * TODO: Delay swapping back to the read-write mirror of the TB
+ * until we actually need to modify the TB.  The read-only copy,
+ * coming from the rx mirror, shares the same host TLB entry as
+ * the code that executed the exit_tb opcode that arrived here.
+ * If we insist on touching both the RX and the RW pages, we
+ * double the host TLB pressure.
+ */
+last_tb = tcg_mirror_rx_to_rw((void *)(ret & ~TB_EXIT_MASK));
+*tb_exit = ret & TB_EXIT_MASK;
 
-if (tb_exit > TB_EXIT_IDX1) {
+trace_exec_tb_exit(last_tb, *tb_exit);
+
+if (*tb_exit > TB_EXIT_IDX1) {
 /* We didn't start executing this TB (eg because the instruction
  * counter hit zero); we must restore the guest PC to the address
  * of the start of the TB.
@@ -199,7 +209,7 @@ static inline tcg_target_ulong cpu_tb_exec(CPUState *cpu, 
TranslationBlock *itb)
 cc->set_pc(cpu, last_tb->pc);
 }
 }
-return ret;
+return last_tb;
 }
 
 #ifndef CONFIG_USER_ONLY
@@ -210,6 +220,7 @@ static void cpu_exec_nocache(CPUState *cpu, int max_cycles,
 {
 TranslationBlock *tb;
 uint32_t cflags = curr_cflags() | CF_NOCACHE;
+int tb_exit;
 
 if (ignore_icount) {
 cflags &= ~CF_USE_ICOUNT;
@@ -227,7 +238,7 @@ static void cpu_exec_nocache(CPUState *cpu, int max_cycles,
 
 /* execute the generated code */
 trace_exec_tb_nocache(tb, tb->pc);
-cpu_tb_exec(cpu, tb);
+cpu_tb_exec(cpu, tb, &tb_exit);
 
 mmap_lock();
 tb_phys_invalidate(tb, -1);
@@ -244,6 +255,7 @@ void cpu_exec_step_atomic(CPUState *cpu)
 uint32_t flags;
 uint32_t cflags = 1;
 uint32_t cf_mask = cflags & CF_HASH_MASK;
+int tb_exit;
 
 if (sigsetjmp(cpu->jmp_env, 0) == 0) {
 start_exclusive();
@@ -260,7 +272,7 @@ void cpu_exec_step_atomic(CPUState *cpu)
 cc->cpu_exec_enter(cpu);
 /* execute the generated code */
 trace_exec_tb(tb, pc);
-cpu_tb_exec(cpu, tb);
+cpu_tb_exec(cpu, tb, &tb_exit);
 cc->cpu_exec_exit(cpu);
 } else {
 /*
@@ -653,13 +665,10 @@ static inline bool cpu_handle_interrupt(CPUState *cpu,
 static inline void cpu_loop_exec_tb(CPUState *cpu, TranslationBlock *tb,
 TranslationBlock **last_tb, int *tb_exit)
 {
-uintptr_t ret;
 int32_t insns_left;
 
 trace_exec_tb(tb, tb->pc);
-ret = cpu_tb_exec(cpu, tb);
-tb = (TranslationBlock *)(ret & ~TB_EXIT_MASK);
-*tb_exit = ret & TB_EXIT_MASK;
+tb = cpu_tb_exec(cpu, tb, tb_exit);
 if (*tb_exit != TB_EXIT_REQUESTED) {
 *last_tb = tb;
 return;
diff --git a/tcg/tcg-op.c b/tcg/tcg-op.c
index e3dc0cb4cb..f0d22de3de 100644
--- a/tcg/tcg-op.c
+++ b/tcg/tcg-op.c
@@ -2666,7 +2666,18 @@ void tcg_gen_extr32_i64(TCGv_i64 lo, TCGv_i64 hi, 
TCGv_i64 arg)
 
 void tcg_gen_exit_tb(const TranslationBlock *tb, unsigned idx)
 {
-uintptr_t val = (uintptr_t)tb + idx;
+/*
+ * Let the jit code return the read-only version of the
+ * TranslationBlock, so that we minimize the pc-relative
+ * distance of the address of the exit_tb code to TB.
+ * This will improve utilization of pc-relative address loads.
+ *
+ * TODO: Move this to translator_loop, so that all const
+ * TranslationBlock pointers refer to read-only memory.
+ * This requires coordination with targets that do not use
+ * the translator_loop.
+ */
+uintptr_t val = (uintptr_t)tcg_mirror_rw_to_rx((void *)tb) + idx;
 
 if (tb == NULL) {
   

[RFC PATCH v5 30/33] Hexagon (linux-user/hexagon) Linux user emulation

2020-10-29 Thread Taylor Simpson
Implementation of Linux user emulation for Hexagon.
Some common files are modified, in addition to the new files in linux-user/hexagon.

Signed-off-by: Taylor Simpson 
Reviewed-by: Richard Henderson 
---
 linux-user/hexagon/sockbits.h   |  18 ++
 linux-user/hexagon/syscall_nr.h | 322 
 linux-user/hexagon/target_cpu.h |  44 +
 linux-user/hexagon/target_elf.h |  40 +
 linux-user/hexagon/target_fcntl.h   |  18 ++
 linux-user/hexagon/target_signal.h  |  34 
 linux-user/hexagon/target_structs.h |  46 ++
 linux-user/hexagon/target_syscall.h |  36 
 linux-user/hexagon/termbits.h   |  18 ++
 linux-user/qemu.h   |   2 +
 linux-user/syscall_defs.h   |  33 
 linux-user/elfload.c|  16 ++
 linux-user/hexagon/cpu_loop.c   |  99 +++
 linux-user/hexagon/signal.c | 276 +++
 scripts/gensyscalls.sh  |   1 +
 15 files changed, 1003 insertions(+)
 create mode 100644 linux-user/hexagon/sockbits.h
 create mode 100644 linux-user/hexagon/syscall_nr.h
 create mode 100644 linux-user/hexagon/target_cpu.h
 create mode 100644 linux-user/hexagon/target_elf.h
 create mode 100644 linux-user/hexagon/target_fcntl.h
 create mode 100644 linux-user/hexagon/target_signal.h
 create mode 100644 linux-user/hexagon/target_structs.h
 create mode 100644 linux-user/hexagon/target_syscall.h
 create mode 100644 linux-user/hexagon/termbits.h
 create mode 100644 linux-user/hexagon/cpu_loop.c
 create mode 100644 linux-user/hexagon/signal.c

diff --git a/linux-user/hexagon/sockbits.h b/linux-user/hexagon/sockbits.h
new file mode 100644
index 000..a6e8966
--- /dev/null
+++ b/linux-user/hexagon/sockbits.h
@@ -0,0 +1,18 @@
+/*
+ *  Copyright(c) 2019-2020 Qualcomm Innovation Center, Inc. All Rights 
Reserved.
+ *
+ *  This program is free software; you can redistribute it and/or modify
+ *  it under the terms of the GNU General Public License as published by
+ *  the Free Software Foundation; either version 2 of the License, or
+ *  (at your option) any later version.
+ *
+ *  This program is distributed in the hope that it will be useful,
+ *  but WITHOUT ANY WARRANTY; without even the implied warranty of
+ *  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ *  GNU General Public License for more details.
+ *
+ *  You should have received a copy of the GNU General Public License
+ *  along with this program; if not, see .
+ */
+
+#include "../generic/sockbits.h"
diff --git a/linux-user/hexagon/syscall_nr.h b/linux-user/hexagon/syscall_nr.h
new file mode 100644
index 000..da1314f
--- /dev/null
+++ b/linux-user/hexagon/syscall_nr.h
@@ -0,0 +1,322 @@
+/*
+ * This file contains the system call numbers.
+ * Do not modify.
+ * This file is generated by scripts/gensyscalls.sh
+ */
+#ifndef LINUX_USER_HEXAGON_SYSCALL_NR_H
+#define LINUX_USER_HEXAGON_SYSCALL_NR_H
+
+#define TARGET_NR_io_setup 0
+#define TARGET_NR_io_destroy 1
+#define TARGET_NR_io_submit 2
+#define TARGET_NR_io_cancel 3
+#define TARGET_NR_io_getevents 4
+#define TARGET_NR_setxattr 5
+#define TARGET_NR_lsetxattr 6
+#define TARGET_NR_fsetxattr 7
+#define TARGET_NR_getxattr 8
+#define TARGET_NR_lgetxattr 9
+#define TARGET_NR_fgetxattr 10
+#define TARGET_NR_listxattr 11
+#define TARGET_NR_llistxattr 12
+#define TARGET_NR_flistxattr 13
+#define TARGET_NR_removexattr 14
+#define TARGET_NR_lremovexattr 15
+#define TARGET_NR_fremovexattr 16
+#define TARGET_NR_getcwd 17
+#define TARGET_NR_lookup_dcookie 18
+#define TARGET_NR_eventfd2 19
+#define TARGET_NR_epoll_create1 20
+#define TARGET_NR_epoll_ctl 21
+#define TARGET_NR_epoll_pwait 22
+#define TARGET_NR_dup 23
+#define TARGET_NR_dup3 24
+#define TARGET_NR_fcntl64 25
+#define TARGET_NR_inotify_init1 26
+#define TARGET_NR_inotify_add_watch 27
+#define TARGET_NR_inotify_rm_watch 28
+#define TARGET_NR_ioctl 29
+#define TARGET_NR_ioprio_set 30
+#define TARGET_NR_ioprio_get 31
+#define TARGET_NR_flock 32
+#define TARGET_NR_mknodat 33
+#define TARGET_NR_mkdirat 34
+#define TARGET_NR_unlinkat 35
+#define TARGET_NR_symlinkat 36
+#define TARGET_NR_linkat 37
+#define TARGET_NR_renameat 38
+#define TARGET_NR_umount2 39
+#define TARGET_NR_mount 40
+#define TARGET_NR_pivot_root 41
+#define TARGET_NR_nfsservctl 42
+#define TARGET_NR_statfs64 43
+#define TARGET_NR_fstatfs64 44
+#define TARGET_NR_truncate64 45
+#define TARGET_NR_ftruncate64 46
+#define TARGET_NR_fallocate 47
+#define TARGET_NR_faccessat 48
+#define TARGET_NR_chdir 49
+#define TARGET_NR_fchdir 50
+#define TARGET_NR_chroot 51
+#define TARGET_NR_fchmod 52
+#define TARGET_NR_fchmodat 53
+#define TARGET_NR_fchownat 54
+#define TARGET_NR_fchown 55
+#define TARGET_NR_openat 56
+#define TARGET_NR_close 57
+#define TARGET_NR_vhangup 58
+#define TARGET_NR_pipe2 59
+#define TARGET_NR_quotactl 60
+#define TARGET_NR_getdents64 61
+#define TARGET_NR_llseek 62
+#define TARGET_NR_read 63
+#define TARGET_NR_

[PATCH v2 03/19] tcg: Move tcg epilogue pointer out of TCGContext

2020-10-29 Thread Richard Henderson
This value is constant across all thread-local copies of TCGContext,
so we might as well move it out of thread-local storage.

Signed-off-by: Richard Henderson 
---
 include/tcg/tcg.h| 2 +-
 accel/tcg/tcg-runtime.c  | 2 +-
 tcg/tcg.c| 3 ++-
 tcg/aarch64/tcg-target.c.inc | 4 ++--
 tcg/arm/tcg-target.c.inc | 2 +-
 tcg/i386/tcg-target.c.inc| 4 ++--
 tcg/mips/tcg-target.c.inc| 2 +-
 tcg/ppc/tcg-target.c.inc | 2 +-
 tcg/riscv/tcg-target.c.inc   | 4 ++--
 tcg/s390/tcg-target.c.inc| 4 ++--
 tcg/sparc/tcg-target.c.inc   | 2 +-
 11 files changed, 16 insertions(+), 15 deletions(-)

diff --git a/include/tcg/tcg.h b/include/tcg/tcg.h
index 5ff5bf2a73..3c56a90abc 100644
--- a/include/tcg/tcg.h
+++ b/include/tcg/tcg.h
@@ -621,7 +621,6 @@ struct TCGContext {
here, because there's too much arithmetic throughout that relies
on addition and subtraction working on bytes.  Rely on the GCC
extension that allows arithmetic on void*.  */
-void *code_gen_epilogue;
 void *code_gen_buffer;
 size_t code_gen_buffer_size;
 void *code_gen_ptr;
@@ -678,6 +677,7 @@ struct TCGContext {
 
 extern TCGContext tcg_init_ctx;
 extern __thread TCGContext *tcg_ctx;
+extern void *tcg_code_gen_epilogue;
 extern TCGv_env cpu_env;
 
 static inline size_t temp_idx(TCGTemp *ts)
diff --git a/accel/tcg/tcg-runtime.c b/accel/tcg/tcg-runtime.c
index 446465a09a..f85dfefeab 100644
--- a/accel/tcg/tcg-runtime.c
+++ b/accel/tcg/tcg-runtime.c
@@ -154,7 +154,7 @@ void *HELPER(lookup_tb_ptr)(CPUArchState *env)
 
 tb = tb_lookup__cpu_state(cpu, &pc, &cs_base, &flags, curr_cflags());
 if (tb == NULL) {
-return tcg_ctx->code_gen_epilogue;
+return tcg_code_gen_epilogue;
 }
 qemu_log_mask_and_addr(CPU_LOG_EXEC, pc,
"Chain %d: %p ["
diff --git a/tcg/tcg.c b/tcg/tcg.c
index 8d63c714fb..1916a818d9 100644
--- a/tcg/tcg.c
+++ b/tcg/tcg.c
@@ -160,6 +160,7 @@ static int tcg_out_ldst_finalize(TCGContext *s);
 static TCGContext **tcg_ctxs;
 static unsigned int n_tcg_ctxs;
 TCGv_env cpu_env = 0;
+void *tcg_code_gen_epilogue;
 
 #ifndef CONFIG_TCG_INTERPRETER
 tcg_prologue_fn *tcg_qemu_tb_exec;
@@ -1128,7 +1129,7 @@ void tcg_prologue_init(TCGContext *s)
 
 /* Assert that goto_ptr is implemented completely.  */
 if (TCG_TARGET_HAS_goto_ptr) {
-tcg_debug_assert(s->code_gen_epilogue != NULL);
+tcg_debug_assert(tcg_code_gen_epilogue != NULL);
 }
 }
 
diff --git a/tcg/aarch64/tcg-target.c.inc b/tcg/aarch64/tcg-target.c.inc
index 83af3108a4..76f8ae48ad 100644
--- a/tcg/aarch64/tcg-target.c.inc
+++ b/tcg/aarch64/tcg-target.c.inc
@@ -1873,7 +1873,7 @@ static void tcg_out_op(TCGContext *s, TCGOpcode opc,
 case INDEX_op_exit_tb:
 /* Reuse the zeroing that exists for goto_ptr.  */
 if (a0 == 0) {
-tcg_out_goto_long(s, s->code_gen_epilogue);
+tcg_out_goto_long(s, tcg_code_gen_epilogue);
 } else {
 tcg_out_movi(s, TCG_TYPE_I64, TCG_REG_X0, a0);
 tcg_out_goto_long(s, tb_ret_addr);
@@ -2894,7 +2894,7 @@ static void tcg_target_qemu_prologue(TCGContext *s)
  * Return path for goto_ptr. Set return value to 0, a-la exit_tb,
  * and fall through to the rest of the epilogue.
  */
-s->code_gen_epilogue = s->code_ptr;
+tcg_code_gen_epilogue = s->code_ptr;
 tcg_out_movi(s, TCG_TYPE_REG, TCG_REG_X0, 0);
 
 /* TB epilogue */
diff --git a/tcg/arm/tcg-target.c.inc b/tcg/arm/tcg-target.c.inc
index 62c37a954b..1e32bf42b8 100644
--- a/tcg/arm/tcg-target.c.inc
+++ b/tcg/arm/tcg-target.c.inc
@@ -2297,7 +2297,7 @@ static void tcg_target_qemu_prologue(TCGContext *s)
  * Return path for goto_ptr. Set return value to 0, a-la exit_tb,
  * and fall through to the rest of the epilogue.
  */
-s->code_gen_epilogue = s->code_ptr;
+tcg_code_gen_epilogue = s->code_ptr;
 tcg_out_movi(s, TCG_TYPE_PTR, TCG_REG_R0, 0);
 tcg_out_epilogue(s);
 }
diff --git a/tcg/i386/tcg-target.c.inc b/tcg/i386/tcg-target.c.inc
index d8797ed398..424dd1cdcf 100644
--- a/tcg/i386/tcg-target.c.inc
+++ b/tcg/i386/tcg-target.c.inc
@@ -2267,7 +2267,7 @@ static inline void tcg_out_op(TCGContext *s, TCGOpcode 
opc,
 case INDEX_op_exit_tb:
 /* Reuse the zeroing that exists for goto_ptr.  */
 if (a0 == 0) {
-tcg_out_jmp(s, s->code_gen_epilogue);
+tcg_out_jmp(s, tcg_code_gen_epilogue);
 } else {
 tcg_out_movi(s, TCG_TYPE_PTR, TCG_REG_EAX, a0);
 tcg_out_jmp(s, tb_ret_addr);
@@ -3825,7 +3825,7 @@ static void tcg_target_qemu_prologue(TCGContext *s)
  * Return path for goto_ptr. Set return value to 0, a-la exit_tb,
  * and fall through to the rest of the epilogue.
  */
-s->code_gen_epilogue = s->code_ptr;
+tcg_code_gen_epilogue = s->code_ptr;
 tcg_out_movi(s, TCG_TYPE_REG, TCG_REG_EAX, 0);
 
 /* TB epilogue */
diff --git a/tcg/mips/tcg-target.c.inc

[PATCH v2 12/19] tcg: Add --accel tcg,split-rwx property

2020-10-29 Thread Richard Henderson
Plumb the value through to alloc_code_gen_buffer.
This is not yet supported by any OS or TCG backend, so
for now enabling it will result in an error.

Signed-off-by: Richard Henderson 
---
 include/sysemu/tcg.h  |  2 +-
 tcg/aarch64/tcg-target.h  |  1 +
 tcg/arm/tcg-target.h  |  1 +
 tcg/i386/tcg-target.h |  1 +
 tcg/mips/tcg-target.h |  1 +
 tcg/ppc/tcg-target.h  |  1 +
 tcg/riscv/tcg-target.h|  1 +
 tcg/s390/tcg-target.h |  1 +
 tcg/sparc/tcg-target.h|  1 +
 tcg/tci/tcg-target.h  |  1 +
 accel/tcg/tcg-all.c   | 26 +-
 accel/tcg/translate-all.c | 35 +++
 bsd-user/main.c   |  2 +-
 linux-user/main.c |  2 +-
 14 files changed, 64 insertions(+), 12 deletions(-)

diff --git a/include/sysemu/tcg.h b/include/sysemu/tcg.h
index d9d3ca8559..5734dd92dc 100644
--- a/include/sysemu/tcg.h
+++ b/include/sysemu/tcg.h
@@ -8,7 +8,7 @@
 #ifndef SYSEMU_TCG_H
 #define SYSEMU_TCG_H
 
-void tcg_exec_init(unsigned long tb_size);
+void tcg_exec_init(unsigned long tb_size, int mirror);
 #ifdef CONFIG_TCG
 extern bool tcg_allowed;
 #define tcg_enabled() (tcg_allowed)
diff --git a/tcg/aarch64/tcg-target.h b/tcg/aarch64/tcg-target.h
index 91313d93be..fa64058d43 100644
--- a/tcg/aarch64/tcg-target.h
+++ b/tcg/aarch64/tcg-target.h
@@ -164,5 +164,6 @@ void tb_target_set_jmp_target(uintptr_t, uintptr_t, 
uintptr_t, uintptr_t);
 #define TCG_TARGET_NEED_LDST_LABELS
 #endif
 #define TCG_TARGET_NEED_POOL_LABELS
+#define TCG_TARGET_SUPPORT_MIRROR   0
 
 #endif /* AARCH64_TCG_TARGET_H */
diff --git a/tcg/arm/tcg-target.h b/tcg/arm/tcg-target.h
index b21a2fb6a1..e355d6a4b2 100644
--- a/tcg/arm/tcg-target.h
+++ b/tcg/arm/tcg-target.h
@@ -150,5 +150,6 @@ void tb_target_set_jmp_target(uintptr_t, uintptr_t, 
uintptr_t, uintptr_t);
 #define TCG_TARGET_NEED_LDST_LABELS
 #endif
 #define TCG_TARGET_NEED_POOL_LABELS
+#define TCG_TARGET_SUPPORT_MIRROR   0
 
 #endif
diff --git a/tcg/i386/tcg-target.h b/tcg/i386/tcg-target.h
index f52ba0ffec..1b9d41bd56 100644
--- a/tcg/i386/tcg-target.h
+++ b/tcg/i386/tcg-target.h
@@ -236,5 +236,6 @@ static inline void tb_target_set_jmp_target(uintptr_t 
tc_ptr, uintptr_t jmp_rx,
 #define TCG_TARGET_NEED_LDST_LABELS
 #endif
 #define TCG_TARGET_NEED_POOL_LABELS
+#define TCG_TARGET_SUPPORT_MIRROR   0
 
 #endif
diff --git a/tcg/mips/tcg-target.h b/tcg/mips/tcg-target.h
index cd548dacec..d231522dc9 100644
--- a/tcg/mips/tcg-target.h
+++ b/tcg/mips/tcg-target.h
@@ -206,6 +206,7 @@ extern bool use_mips32r2_instructions;
 
 #define TCG_TARGET_DEFAULT_MO (0)
 #define TCG_TARGET_HAS_MEMORY_BSWAP 1
+#define TCG_TARGET_SUPPORT_MIRROR   0
 
 /* Flush the dcache at RW, and the icache at RX, as necessary. */
 static inline void flush_idcache_range(uintptr_t rx, uintptr_t rw, size_t len)
diff --git a/tcg/ppc/tcg-target.h b/tcg/ppc/tcg-target.h
index 8f3e4c924a..78d6a5e96f 100644
--- a/tcg/ppc/tcg-target.h
+++ b/tcg/ppc/tcg-target.h
@@ -185,5 +185,6 @@ void tb_target_set_jmp_target(uintptr_t, uintptr_t, 
uintptr_t, uintptr_t);
 #define TCG_TARGET_NEED_LDST_LABELS
 #endif
 #define TCG_TARGET_NEED_POOL_LABELS
+#define TCG_TARGET_SUPPORT_MIRROR   0
 
 #endif
diff --git a/tcg/riscv/tcg-target.h b/tcg/riscv/tcg-target.h
index e03fd17427..3c2e8305b0 100644
--- a/tcg/riscv/tcg-target.h
+++ b/tcg/riscv/tcg-target.h
@@ -179,5 +179,6 @@ void tb_target_set_jmp_target(uintptr_t, uintptr_t, 
uintptr_t, uintptr_t);
 #define TCG_TARGET_NEED_POOL_LABELS
 
 #define TCG_TARGET_HAS_MEMORY_BSWAP 0
+#define TCG_TARGET_SUPPORT_MIRROR   0
 
 #endif
diff --git a/tcg/s390/tcg-target.h b/tcg/s390/tcg-target.h
index c5a749e425..8324197127 100644
--- a/tcg/s390/tcg-target.h
+++ b/tcg/s390/tcg-target.h
@@ -163,5 +163,6 @@ static inline void tb_target_set_jmp_target(uintptr_t 
tc_ptr, uintptr_t jmp_rx,
 #define TCG_TARGET_NEED_LDST_LABELS
 #endif
 #define TCG_TARGET_NEED_POOL_LABELS
+#define TCG_TARGET_SUPPORT_MIRROR   0
 
 #endif
diff --git a/tcg/sparc/tcg-target.h b/tcg/sparc/tcg-target.h
index 87e2be61e6..517840705f 100644
--- a/tcg/sparc/tcg-target.h
+++ b/tcg/sparc/tcg-target.h
@@ -181,5 +181,6 @@ static inline void flush_idcache_range(uintptr_t rx, 
uintptr_t rw, size_t len)
 void tb_target_set_jmp_target(uintptr_t, uintptr_t, uintptr_t, uintptr_t);
 
 #define TCG_TARGET_NEED_POOL_LABELS
+#define TCG_TARGET_SUPPORT_MIRROR   0
 
 #endif
diff --git a/tcg/tci/tcg-target.h b/tcg/tci/tcg-target.h
index a19a6b06e5..3653fef947 100644
--- a/tcg/tci/tcg-target.h
+++ b/tcg/tci/tcg-target.h
@@ -200,6 +200,7 @@ static inline void flush_idcache_range(uintptr_t rx, 
uintptr_t rw, size_t len)
 #define TCG_TARGET_DEFAULT_MO  (0)
 
 #define TCG_TARGET_HAS_MEMORY_BSWAP 1
+#define TCG_TARGET_SUPPORT_MIRROR   0
 
 static inline void tb_target_set_jmp_target(uintptr_t tc_ptr, uintptr_t jmp_rx,
 uintptr_t jmp_rw, uintptr_t addr)
diff --git a/accel/tcg/tcg-all.c b/accel/tcg/tcg-all.c
index fa1

[RFC PATCH v5 28/33] Hexagon (target/hexagon) TCG for floating point instructions

2020-10-29 Thread Taylor Simpson
The imported code uses host floating point.  We override these
instructions to use QEMU's softfloat.

Signed-off-by: Taylor Simpson 
---
 target/hexagon/gen_tcg.h | 121 +++
 1 file changed, 121 insertions(+)

diff --git a/target/hexagon/gen_tcg.h b/target/hexagon/gen_tcg.h
index 35568d1..d605b1e 100644
--- a/target/hexagon/gen_tcg.h
+++ b/target/hexagon/gen_tcg.h
@@ -195,4 +195,125 @@
 #define fGEN_TCG_S4_stored_locked(SHORTCODE) \
 do { SHORTCODE; READ_PREG(PdV, PdN); } while (0)
 
+/* Floating point */
+#define fGEN_TCG_F2_conv_sf2df(SHORTCODE) \
+gen_helper_conv_sf2df(RddV, cpu_env, RsV)
+#define fGEN_TCG_F2_conv_df2sf(SHORTCODE) \
+gen_helper_conv_df2sf(RdV, cpu_env, RssV)
+#define fGEN_TCG_F2_conv_uw2sf(SHORTCODE) \
+gen_helper_conv_uw2sf(RdV, cpu_env, RsV)
+#define fGEN_TCG_F2_conv_uw2df(SHORTCODE) \
+gen_helper_conv_uw2df(RddV, cpu_env, RsV)
+#define fGEN_TCG_F2_conv_w2sf(SHORTCODE) \
+gen_helper_conv_w2sf(RdV, cpu_env, RsV)
+#define fGEN_TCG_F2_conv_w2df(SHORTCODE) \
+gen_helper_conv_w2df(RddV, cpu_env, RsV)
+#define fGEN_TCG_F2_conv_ud2sf(SHORTCODE) \
+gen_helper_conv_ud2sf(RdV, cpu_env, RssV)
+#define fGEN_TCG_F2_conv_ud2df(SHORTCODE) \
+gen_helper_conv_ud2df(RddV, cpu_env, RssV)
+#define fGEN_TCG_F2_conv_d2sf(SHORTCODE) \
+gen_helper_conv_d2sf(RdV, cpu_env, RssV)
+#define fGEN_TCG_F2_conv_d2df(SHORTCODE) \
+gen_helper_conv_d2df(RddV, cpu_env, RssV)
+#define fGEN_TCG_F2_conv_sf2uw(SHORTCODE) \
+gen_helper_conv_sf2uw(RdV, cpu_env, RsV)
+#define fGEN_TCG_F2_conv_sf2w(SHORTCODE) \
+gen_helper_conv_sf2w(RdV, cpu_env, RsV)
+#define fGEN_TCG_F2_conv_sf2ud(SHORTCODE) \
+gen_helper_conv_sf2ud(RddV, cpu_env, RsV)
+#define fGEN_TCG_F2_conv_sf2d(SHORTCODE) \
+gen_helper_conv_sf2d(RddV, cpu_env, RsV)
+#define fGEN_TCG_F2_conv_df2uw(SHORTCODE) \
+gen_helper_conv_df2uw(RdV, cpu_env, RssV)
+#define fGEN_TCG_F2_conv_df2w(SHORTCODE) \
+gen_helper_conv_df2w(RdV, cpu_env, RssV)
+#define fGEN_TCG_F2_conv_df2ud(SHORTCODE) \
+gen_helper_conv_df2ud(RddV, cpu_env, RssV)
+#define fGEN_TCG_F2_conv_df2d(SHORTCODE) \
+gen_helper_conv_df2d(RddV, cpu_env, RssV)
+#define fGEN_TCG_F2_conv_sf2uw_chop(SHORTCODE) \
+gen_helper_conv_sf2uw_chop(RdV, cpu_env, RsV)
+#define fGEN_TCG_F2_conv_sf2w_chop(SHORTCODE) \
+gen_helper_conv_sf2w_chop(RdV, cpu_env, RsV)
+#define fGEN_TCG_F2_conv_sf2ud_chop(SHORTCODE) \
+gen_helper_conv_sf2ud_chop(RddV, cpu_env, RsV)
+#define fGEN_TCG_F2_conv_sf2d_chop(SHORTCODE) \
+gen_helper_conv_sf2d_chop(RddV, cpu_env, RsV)
+#define fGEN_TCG_F2_conv_df2uw_chop(SHORTCODE) \
+gen_helper_conv_df2uw_chop(RdV, cpu_env, RssV)
+#define fGEN_TCG_F2_conv_df2w_chop(SHORTCODE) \
+gen_helper_conv_df2w_chop(RdV, cpu_env, RssV)
+#define fGEN_TCG_F2_conv_df2ud_chop(SHORTCODE) \
+gen_helper_conv_df2ud_chop(RddV, cpu_env, RssV)
+#define fGEN_TCG_F2_conv_df2d_chop(SHORTCODE) \
+gen_helper_conv_df2d_chop(RddV, cpu_env, RssV)
+#define fGEN_TCG_F2_sfadd(SHORTCODE) \
+gen_helper_sfadd(RdV, cpu_env, RsV, RtV)
+#define fGEN_TCG_F2_sfsub(SHORTCODE) \
+gen_helper_sfsub(RdV, cpu_env, RsV, RtV)
+#define fGEN_TCG_F2_sfcmpeq(SHORTCODE) \
+gen_helper_sfcmpeq(PdV, cpu_env, RsV, RtV)
+#define fGEN_TCG_F2_sfcmpgt(SHORTCODE) \
+gen_helper_sfcmpgt(PdV, cpu_env, RsV, RtV)
+#define fGEN_TCG_F2_sfcmpge(SHORTCODE) \
+gen_helper_sfcmpge(PdV, cpu_env, RsV, RtV)
+#define fGEN_TCG_F2_sfcmpuo(SHORTCODE) \
+gen_helper_sfcmpuo(PdV, cpu_env, RsV, RtV)
+#define fGEN_TCG_F2_sfmax(SHORTCODE) \
+gen_helper_sfmax(RdV, cpu_env, RsV, RtV)
+#define fGEN_TCG_F2_sfmin(SHORTCODE) \
+gen_helper_sfmin(RdV, cpu_env, RsV, RtV)
+#define fGEN_TCG_F2_sfclass(SHORTCODE) \
+do { \
+TCGv imm = tcg_const_tl(uiV); \
+gen_helper_sfclass(PdV, cpu_env, RsV, imm); \
+tcg_temp_free(imm); \
+} while (0)
+#define fGEN_TCG_F2_sffixupn(SHORTCODE) \
+gen_helper_sffixupn(RdV, cpu_env, RsV, RtV)
+#define fGEN_TCG_F2_sffixupd(SHORTCODE) \
+gen_helper_sffixupd(RdV, cpu_env, RsV, RtV)
+#define fGEN_TCG_F2_sffixupr(SHORTCODE) \
+gen_helper_sffixupr(RdV, cpu_env, RsV)
+#define fGEN_TCG_F2_dfadd(SHORTCODE) \
+gen_helper_dfadd(RddV, cpu_env, RssV, RttV)
+#define fGEN_TCG_F2_dfsub(SHORTCODE) \
+gen_helper_dfsub(RddV, cpu_env, RssV, RttV)
+#define fGEN_TCG_F2_dfmax(SHORTCODE) \
+gen_helper_dfmax(RddV, cpu_env, RssV, RttV)
+#define fGEN_TCG_F2_dfmin(SHORTCODE) \
+gen_helper_dfmin(RddV, cpu_env, RssV, RttV)
+#define fGEN_TCG_F2_dfcmpeq(SHORTCODE) \
+gen_helper_dfcmpeq(PdV, cpu_env, RssV, RttV)
+#define fGEN_TCG_F2_dfcmpgt(SHORTCODE) \
+gen_helper_dfcmpgt(PdV, cpu_env, RssV, RttV)
+#define fGEN_TCG_F2_dfcmpge(SHORTCODE) \
+gen_helper_dfcmpge(PdV, cpu_env, RssV, RttV)
+#define fGEN_TCG_F2_dfcmpuo(SHORTCODE) \
+gen_helper_dfcmpuo(PdV, cpu_env, RssV, RttV)
+#define fGEN_TCG_F2_dfclass(SHORTCODE) \
+do { \
+TCGv imm = tcg_const_tl(uiV); \
+gen_helpe

[RFC PATCH v5 29/33] Hexagon (target/hexagon) translation

2020-10-29 Thread Taylor Simpson
Read the instruction memory
Create a packet data structure
Generate TCG code for the start of the packet
Invoke the generate function for each instruction
Generate TCG code for the end of the packet

Signed-off-by: Taylor Simpson 
---
 target/hexagon/translate.h |  85 ++
 target/hexagon/translate.c | 687 +
 2 files changed, 772 insertions(+)
 create mode 100644 target/hexagon/translate.h
 create mode 100644 target/hexagon/translate.c

diff --git a/target/hexagon/translate.h b/target/hexagon/translate.h
new file mode 100644
index 000..c91d89d
--- /dev/null
+++ b/target/hexagon/translate.h
@@ -0,0 +1,85 @@
+/*
+ *  Copyright(c) 2019-2020 Qualcomm Innovation Center, Inc. All Rights 
Reserved.
+ *
+ *  This program is free software; you can redistribute it and/or modify
+ *  it under the terms of the GNU General Public License as published by
+ *  the Free Software Foundation; either version 2 of the License, or
+ *  (at your option) any later version.
+ *
+ *  This program is distributed in the hope that it will be useful,
+ *  but WITHOUT ANY WARRANTY; without even the implied warranty of
+ *  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ *  GNU General Public License for more details.
+ *
+ *  You should have received a copy of the GNU General Public License
+ *  along with this program; if not, see .
+ */
+
+#ifndef HEXAGON_TRANSLATE_H
+#define HEXAGON_TRANSLATE_H
+
+#include "qemu/bitmap.h"
+#include "cpu.h"
+#include "exec/translator.h"
+#include "tcg/tcg-op.h"
+#include "internal.h"
+
+typedef struct DisasContext {
+DisasContextBase base;
+uint32_t mem_idx;
+int reg_log[REG_WRITES_MAX];
+int reg_log_idx;
+DECLARE_BITMAP(regs_written, TOTAL_PER_THREAD_REGS);
+int preg_log[PRED_WRITES_MAX];
+int preg_log_idx;
+uint8_t store_width[STORES_MAX];
+uint8_t s1_store_processed;
+} DisasContext;
+
+static inline void ctx_log_reg_write(DisasContext *ctx, int rnum)
+{
+#if HEX_DEBUG
+if (test_bit(rnum, ctx->regs_written)) {
+HEX_DEBUG_LOG("WARNING: Multiple writes to r%d\n", rnum);
+}
+#endif
+ctx->reg_log[ctx->reg_log_idx] = rnum;
+ctx->reg_log_idx++;
+set_bit(rnum, ctx->regs_written);
+}
+
+static inline void ctx_log_pred_write(DisasContext *ctx, int pnum)
+{
+ctx->preg_log[ctx->preg_log_idx] = pnum;
+ctx->preg_log_idx++;
+}
+
+static inline bool is_preloaded(DisasContext *ctx, int num)
+{
+return test_bit(num, ctx->regs_written);
+}
+
+extern TCGv hex_gpr[TOTAL_PER_THREAD_REGS];
+extern TCGv hex_pred[NUM_PREGS];
+extern TCGv hex_next_PC;
+extern TCGv hex_this_PC;
+extern TCGv hex_slot_cancelled;
+extern TCGv hex_branch_taken;
+extern TCGv hex_new_value[TOTAL_PER_THREAD_REGS];
+extern TCGv hex_reg_written[TOTAL_PER_THREAD_REGS];
+extern TCGv hex_new_pred_value[NUM_PREGS];
+extern TCGv hex_pred_written;
+extern TCGv hex_store_addr[STORES_MAX];
+extern TCGv hex_store_width[STORES_MAX];
+extern TCGv hex_store_val32[STORES_MAX];
+extern TCGv_i64 hex_store_val64[STORES_MAX];
+extern TCGv hex_dczero_addr;
+extern TCGv hex_llsc_addr;
+extern TCGv hex_llsc_val;
+extern TCGv_i64 hex_llsc_val_i64;
+
+extern void gen_exception(int excp);
+extern void gen_exception_debug(void);
+
+extern void process_store(DisasContext *ctx, int slot_num);
+#endif
diff --git a/target/hexagon/translate.c b/target/hexagon/translate.c
new file mode 100644
index 000..efc51ce
--- /dev/null
+++ b/target/hexagon/translate.c
@@ -0,0 +1,687 @@
+/*
+ *  Copyright(c) 2019-2020 Qualcomm Innovation Center, Inc. All Rights 
Reserved.
+ *
+ *  This program is free software; you can redistribute it and/or modify
+ *  it under the terms of the GNU General Public License as published by
+ *  the Free Software Foundation; either version 2 of the License, or
+ *  (at your option) any later version.
+ *
+ *  This program is distributed in the hope that it will be useful,
+ *  but WITHOUT ANY WARRANTY; without even the implied warranty of
+ *  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ *  GNU General Public License for more details.
+ *
+ *  You should have received a copy of the GNU General Public License
+ *  along with this program; if not, see .
+ */
+
+#define QEMU_GENERATE
+#include "qemu/osdep.h"
+#include "cpu.h"
+#include "tcg/tcg-op.h"
+#include "exec/cpu_ldst.h"
+#include "exec/log.h"
+#include "internal.h"
+#include "attribs.h"
+#include "insn.h"
+#include "decode.h"
+#include "translate.h"
+#include "printinsn.h"
+
+TCGv hex_gpr[TOTAL_PER_THREAD_REGS];
+TCGv hex_pred[NUM_PREGS];
+TCGv hex_next_PC;
+TCGv hex_this_PC;
+TCGv hex_slot_cancelled;
+TCGv hex_branch_taken;
+TCGv hex_new_value[TOTAL_PER_THREAD_REGS];
+#if HEX_DEBUG
+TCGv hex_reg_written[TOTAL_PER_THREAD_REGS];
+#endif
+TCGv hex_new_pred_value[NUM_PREGS];
+TCGv hex_pred_written;
+TCGv hex_store_addr[STORES_MAX];
+TCGv hex_store_width[STORES_MAX];
+TCGv hex_store_val3

[RFC PATCH v5 27/33] Hexagon (target/hexagon) TCG for instructions with multiple definitions

2020-10-29 Thread Taylor Simpson
Helpers won't work if there are multiple definitions, so we override these
instructions using #define fGEN_TCG_.

Signed-off-by: Taylor Simpson 
---
 target/hexagon/gen_tcg.h | 198 +++
 1 file changed, 198 insertions(+)
 create mode 100644 target/hexagon/gen_tcg.h

diff --git a/target/hexagon/gen_tcg.h b/target/hexagon/gen_tcg.h
new file mode 100644
index 000..35568d1
--- /dev/null
+++ b/target/hexagon/gen_tcg.h
@@ -0,0 +1,198 @@
+/*
+ *  Copyright(c) 2019-2020 Qualcomm Innovation Center, Inc. All Rights 
Reserved.
+ *
+ *  This program is free software; you can redistribute it and/or modify
+ *  it under the terms of the GNU General Public License as published by
+ *  the Free Software Foundation; either version 2 of the License, or
+ *  (at your option) any later version.
+ *
+ *  This program is distributed in the hope that it will be useful,
+ *  but WITHOUT ANY WARRANTY; without even the implied warranty of
+ *  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ *  GNU General Public License for more details.
+ *
+ *  You should have received a copy of the GNU General Public License
+ *  along with this program; if not, see .
+ */
+
+#ifndef HEXAGON_GEN_TCG_H
+#define HEXAGON_GEN_TCG_H
+
+/*
+ * Here is a primer to understand the tag names for load/store instructions
+ *
+ * Data types
+ *  bsigned byte   r0 = memb(r2+#0)
+ * ubunsigned byte r0 = memub(r2+#0)
+ *  hsigned half word (16 bits)r0 = memh(r2+#0)
+ * uhunsigned half wordr0 = memuh(r2+#0)
+ *  iinteger (32 bits) r0 = memw(r2+#0)
+ *  ddouble word (64 bits) r1:0 = memd(r2+#0)
+ *
+ * Addressing modes
+ * _io   indirect with offset  r0 = memw(r1+#4)
+ * _ur   absolute with register offset r0 = memw(r1<<#4+##variable)
+ * _rr   indirect with register offset r0 = memw(r1+r4<<#2)
+ * gpglobal pointer relative   r0 = memw(gp+#200)
+ * _sp   stack pointer relativer0 = memw(r29+#12)
+ * _ap   absolute set  r0 = memw(r1=##variable)
+ * _pr   post increment register   r0 = memw(r1++m1)
+ * _pi   post increment immediate  r0 = memb(r1++#1)
+ */
+
+/* Macros for complex addressing modes */
+#define GET_EA_ap \
+do { \
+fEA_IMM(UiV); \
+tcg_gen_movi_tl(ReV, UiV); \
+} while (0)
+#define GET_EA_pr \
+do { \
+fEA_REG(RxV); \
+fPM_M(RxV, MuV); \
+} while (0)
+#define GET_EA_pi \
+do { \
+fEA_REG(RxV); \
+fPM_I(RxV, siV); \
+} while (0)
+
+
+/* Instructions with multiple definitions */
+#define fGEN_TCG_LOAD_AP(RES, SIZE, SIGN) \
+do { \
+fMUST_IMMEXT(UiV); \
+fEA_IMM(UiV); \
+fLOAD(1, SIZE, SIGN, EA, RES); \
+tcg_gen_movi_tl(ReV, UiV); \
+} while (0)
+
+#define fGEN_TCG_L4_loadrub_ap(SHORTCODE) \
+fGEN_TCG_LOAD_AP(RdV, 1, u)
+#define fGEN_TCG_L4_loadrb_ap(SHORTCODE) \
+fGEN_TCG_LOAD_AP(RdV, 1, s)
+#define fGEN_TCG_L4_loadruh_ap(SHORTCODE) \
+fGEN_TCG_LOAD_AP(RdV, 2, u)
+#define fGEN_TCG_L4_loadrh_ap(SHORTCODE) \
+fGEN_TCG_LOAD_AP(RdV, 2, s)
+#define fGEN_TCG_L4_loadri_ap(SHORTCODE) \
+fGEN_TCG_LOAD_AP(RdV, 4, u)
+#define fGEN_TCG_L4_loadrd_ap(SHORTCODE) \
+fGEN_TCG_LOAD_AP(RddV, 8, u)
+
+#define fGEN_TCG_L2_loadrub_pr(SHORTCODE)  SHORTCODE
+#define fGEN_TCG_L2_loadrub_pi(SHORTCODE)  SHORTCODE
+#define fGEN_TCG_L2_loadrb_pr(SHORTCODE)   SHORTCODE
+#define fGEN_TCG_L2_loadrb_pi(SHORTCODE)   SHORTCODE;
+#define fGEN_TCG_L2_loadruh_pr(SHORTCODE)  SHORTCODE
+#define fGEN_TCG_L2_loadruh_pi(SHORTCODE)  SHORTCODE;
+#define fGEN_TCG_L2_loadrh_pr(SHORTCODE)   SHORTCODE
+#define fGEN_TCG_L2_loadrh_pi(SHORTCODE)   SHORTCODE
+#define fGEN_TCG_L2_loadri_pr(SHORTCODE)   SHORTCODE
+#define fGEN_TCG_L2_loadri_pi(SHORTCODE)   SHORTCODE
+#define fGEN_TCG_L2_loadrd_pr(SHORTCODE)   SHORTCODE
+#define fGEN_TCG_L2_loadrd_pi(SHORTCODE)   SHORTCODE
+
+/*
+ * Predicated loads
+ * Here is a primer to understand the tag names
+ *
+ * Predicate used
+ *  ttrue "old" value  if (p0) r0 = memb(r2+#0)
+ *  ffalse "old" value if (!p0) r0 = memb(r2+#0)
+ *  tnew true "new" value  if (p0.new) r0 = memb(r2+#0)
+ *  fnew false "new" value if (!p0.new) r0 = 
memb(r2+#0)
+ */
+#define fGEN_TCG_PRED_LOAD(GET_EA, PRED, SIZE, SIGN) \
+do { \
+TCGv LSB = tcg_temp_local_new(); \
+TCGLabel *label = gen_new_label(); \
+GET_EA; \
+PRED;  \
+PRED_LOAD_CANCEL(LSB, EA); \
+tcg_gen_movi_tl(RdV, 0); \
+tcg_gen_brcondi_tl(TCG_COND_EQ, LSB, 0, label); \
+fLOAD(1, SIZ

[PATCH] target/riscv/csr.c : add space before the open parenthesis '('

2020-10-29 Thread Xinhao Zhang
Fix code style. Space required before the open parenthesis '('.

Signed-off-by: Xinhao Zhang 
Signed-off-by: Kai Deng 
Reported-by: Euler Robot 
---
 target/riscv/csr.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/target/riscv/csr.c b/target/riscv/csr.c
index aaef6c6f20..e8b149f0d2 100644
--- a/target/riscv/csr.c
+++ b/target/riscv/csr.c
@@ -881,7 +881,7 @@ static int write_satp(CPURISCVState *env, int csrno, target_ulong val)
 if (env->priv == PRV_S && get_field(env->mstatus, MSTATUS_TVM)) {
 return -RISCV_EXCP_ILLEGAL_INST;
 } else {
-if((val ^ env->satp) & SATP_ASID) {
+if ((val ^ env->satp) & SATP_ASID) {
 tlb_flush(env_cpu(env));
 }
 env->satp = val;
-- 
2.29.0-rc1




[RFC PATCH v5 25/33] Hexagon (target/hexagon) instruction classes

2020-10-29 Thread Taylor Simpson
Determine legal VLIW slots for each instruction

Signed-off-by: Taylor Simpson 
---
 target/hexagon/iclass.h| 50 ++
 target/hexagon/iclass.c| 73 ++
 target/hexagon/imported/iclass.def | 51 ++
 3 files changed, 174 insertions(+)
 create mode 100644 target/hexagon/iclass.h
 create mode 100644 target/hexagon/iclass.c
 create mode 100644 target/hexagon/imported/iclass.def

diff --git a/target/hexagon/iclass.h b/target/hexagon/iclass.h
new file mode 100644
index 000..b57f11d
--- /dev/null
+++ b/target/hexagon/iclass.h
@@ -0,0 +1,50 @@
+/*
+ *  Copyright(c) 2019-2020 Qualcomm Innovation Center, Inc. All Rights Reserved.
+ *
+ *  This program is free software; you can redistribute it and/or modify
+ *  it under the terms of the GNU General Public License as published by
+ *  the Free Software Foundation; either version 2 of the License, or
+ *  (at your option) any later version.
+ *
+ *  This program is distributed in the hope that it will be useful,
+ *  but WITHOUT ANY WARRANTY; without even the implied warranty of
+ *  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ *  GNU General Public License for more details.
+ *
+ *  You should have received a copy of the GNU General Public License
+ *  along with this program; if not, see <http://www.gnu.org/licenses/>.
+ */
+
+#ifndef HEXAGON_ICLASS_H
+#define HEXAGON_ICLASS_H
+
+#include "opcodes.h"
+
+#define ICLASS_FROM_TYPE(TYPE) ICLASS_##TYPE
+
+enum {
+
+#define DEF_PP_ICLASS32(TYPE, SLOTS, UNITS)ICLASS_FROM_TYPE(TYPE),
+#define DEF_EE_ICLASS32(TYPE, SLOTS, UNITS)ICLASS_FROM_TYPE(TYPE),
+#include "imported/iclass.def"
+#undef DEF_PP_ICLASS32
+#undef DEF_EE_ICLASS32
+
+ICLASS_FROM_TYPE(COPROC_VX),
+ICLASS_FROM_TYPE(COPROC_VMEM),
+NUM_ICLASSES
+};
+
+typedef enum {
+SLOTS_0  = (1 << 0),
+SLOTS_1  = (1 << 1),
+SLOTS_2  = (1 << 2),
+SLOTS_3  = (1 << 3),
+SLOTS_01 = SLOTS_0 | SLOTS_1,
+SLOTS_23 = SLOTS_2 | SLOTS_3,
+SLOTS_0123   = SLOTS_0 | SLOTS_1 | SLOTS_2 | SLOTS_3,
+} SlotMask;
+
+extern SlotMask find_iclass_slots(Opcode opcode, int itype);
+
+#endif
diff --git a/target/hexagon/iclass.c b/target/hexagon/iclass.c
new file mode 100644
index 000..05117a9
--- /dev/null
+++ b/target/hexagon/iclass.c
@@ -0,0 +1,73 @@
+/*
+ *  Copyright(c) 2019-2020 Qualcomm Innovation Center, Inc. All Rights Reserved.
+ *
+ *  This program is free software; you can redistribute it and/or modify
+ *  it under the terms of the GNU General Public License as published by
+ *  the Free Software Foundation; either version 2 of the License, or
+ *  (at your option) any later version.
+ *
+ *  This program is distributed in the hope that it will be useful,
+ *  but WITHOUT ANY WARRANTY; without even the implied warranty of
+ *  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ *  GNU General Public License for more details.
+ *
+ *  You should have received a copy of the GNU General Public License
+ *  along with this program; if not, see <http://www.gnu.org/licenses/>.
+ */
+
+#include "qemu/osdep.h"
+#include "iclass.h"
+
+static const SlotMask iclass_info[] = {
+
+#define DEF_PP_ICLASS32(TYPE, SLOTS, UNITS) \
+[ICLASS_FROM_TYPE(TYPE)] = SLOTS_##SLOTS,
+#define DEF_EE_ICLASS32(TYPE, SLOTS, UNITS) \
+[ICLASS_FROM_TYPE(TYPE)] = SLOTS_##SLOTS,
+#include "imported/iclass.def"
+#undef DEF_PP_ICLASS32
+#undef DEF_EE_ICLASS32
+};
+
+SlotMask find_iclass_slots(Opcode opcode, int itype)
+{
+/* There are some exceptions to what the iclass dictates */
+if (GET_ATTRIB(opcode, A_ICOP)) {
+return SLOTS_2;
+} else if (GET_ATTRIB(opcode, A_RESTRICT_SLOT0ONLY)) {
+return SLOTS_0;
+} else if (GET_ATTRIB(opcode, A_RESTRICT_SLOT1ONLY)) {
+return SLOTS_1;
+} else if (GET_ATTRIB(opcode, A_RESTRICT_SLOT2ONLY)) {
+return SLOTS_2;
+} else if (GET_ATTRIB(opcode, A_RESTRICT_SLOT3ONLY)) {
+return SLOTS_3;
+} else if (GET_ATTRIB(opcode, A_COF) &&
+   GET_ATTRIB(opcode, A_INDIRECT) &&
+   !GET_ATTRIB(opcode, A_MEMLIKE) &&
+   !GET_ATTRIB(opcode, A_MEMLIKE_PACKET_RULES)) {
+return SLOTS_2;
+} else if (GET_ATTRIB(opcode, A_RESTRICT_NOSLOT1)) {
+return SLOTS_0;
+} else if ((opcode == J2_trap0) ||
+   (opcode == Y2_isync) ||
+   (opcode == J4_hintjumpr)) {
+return SLOTS_2;
+} else if ((itype == ICLASS_V2LDST) && (GET_ATTRIB(opcode, A_STORE))) {
+return SLOTS_01;
+} else if ((itype == ICLASS_V2LDST) && (!GET_ATTRIB(opcode, A_STORE))) {
+return SLOTS_01;
+} else if (GET_ATTRIB(opcode, A_CRSLOT23)) {
+return SLOTS_23;
+} else if (GET_ATTRIB(opcode, A_RESTRICT_PREFERSLOT0)) {
+return SLOTS_0;
+} else if (GET_ATTRIB(opcode, A_SUBINSN)) {
+return SLOTS_01;
+} e

[PATCH v2 11/19] tcg: Use Error with alloc_code_gen_buffer

2020-10-29 Thread Richard Henderson
Report better error messages than just "could not allocate".
Let alloc_code_gen_buffer set ctx->code_gen_buffer_size
and ctx->code_gen_buffer, and simply return bool.

Signed-off-by: Richard Henderson 
---
 accel/tcg/translate-all.c | 60 ++-
 1 file changed, 34 insertions(+), 26 deletions(-)

diff --git a/accel/tcg/translate-all.c b/accel/tcg/translate-all.c
index c3e35bdee6..fca632eefa 100644
--- a/accel/tcg/translate-all.c
+++ b/accel/tcg/translate-all.c
@@ -59,6 +59,7 @@
 #include "sysemu/cpus.h"
 #include "sysemu/cpu-timers.h"
 #include "sysemu/tcg.h"
+#include "qapi/error.h"
 
 /* #define DEBUG_TB_INVALIDATE */
 /* #define DEBUG_TB_FLUSH */
@@ -973,7 +974,7 @@ static void page_lock_pair(PageDesc **ret_p1, tb_page_addr_t phys1,
   (DEFAULT_CODE_GEN_BUFFER_SIZE_1 < MAX_CODE_GEN_BUFFER_SIZE \
? DEFAULT_CODE_GEN_BUFFER_SIZE_1 : MAX_CODE_GEN_BUFFER_SIZE)
 
-static inline size_t size_code_gen_buffer(size_t tb_size)
+static size_t size_code_gen_buffer(size_t tb_size)
 {
 /* Size the buffer.  */
 if (tb_size == 0) {
@@ -1024,7 +1025,7 @@ static inline void *split_cross_256mb(void *buf1, size_t size1)
 static uint8_t static_code_gen_buffer[DEFAULT_CODE_GEN_BUFFER_SIZE]
 __attribute__((aligned(CODE_GEN_ALIGN)));
 
-static inline void *alloc_code_gen_buffer(void)
+static bool alloc_code_gen_buffer(size_t tb_size, Error **errp)
 {
 void *buf = static_code_gen_buffer;
 void *end = static_code_gen_buffer + sizeof(static_code_gen_buffer);
@@ -1037,9 +1038,8 @@ static inline void *alloc_code_gen_buffer(void)
 size = end - buf;
 
 /* Honor a command-line option limiting the size of the buffer.  */
-if (size > tcg_ctx->code_gen_buffer_size) {
-size = QEMU_ALIGN_DOWN(tcg_ctx->code_gen_buffer_size,
-   qemu_real_host_page_size);
+if (size > tb_size) {
+size = QEMU_ALIGN_DOWN(tb_size, qemu_real_host_page_size);
 }
 tcg_ctx->code_gen_buffer_size = size;
 
@@ -1051,31 +1051,43 @@ static inline void *alloc_code_gen_buffer(void)
 #endif
 
 if (qemu_mprotect_rwx(buf, size)) {
-abort();
+error_setg_errno(errp, errno, "mprotect of jit buffer");
+return false;
 }
 qemu_madvise(buf, size, QEMU_MADV_HUGEPAGE);
 
-return buf;
+tcg_ctx->code_gen_buffer = buf;
+return true;
 }
 #elif defined(_WIN32)
-static inline void *alloc_code_gen_buffer(void)
+static bool alloc_code_gen_buffer(size_t size, Error **errp)
 {
-size_t size = tcg_ctx->code_gen_buffer_size;
-return VirtualAlloc(NULL, size, MEM_RESERVE | MEM_COMMIT,
-PAGE_EXECUTE_READWRITE);
+void *buf = VirtualAlloc(NULL, size, MEM_RESERVE | MEM_COMMIT,
+ PAGE_EXECUTE_READWRITE);
+if (buf == NULL) {
+error_setg_win32(errp, GetLastError(),
+ "allocate %zu bytes for jit buffer", size);
+return false;
+}
+
+tcg_ctx->code_gen_buffer = buf;
+tcg_ctx->code_gen_buffer_size = size;
+return true;
 }
 #else
-static inline void *alloc_code_gen_buffer(void)
+static bool alloc_code_gen_buffer(size_t size, Error **errp)
 {
 int prot = PROT_WRITE | PROT_READ | PROT_EXEC;
 int flags = MAP_PRIVATE | MAP_ANONYMOUS;
-size_t size = tcg_ctx->code_gen_buffer_size;
 void *buf;
 
 buf = mmap(NULL, size, prot, flags, -1, 0);
 if (buf == MAP_FAILED) {
-return NULL;
+error_setg_errno(errp, errno,
+ "allocate %zu bytes for jit buffer", size);
+return false;
 }
+tcg_ctx->code_gen_buffer_size = size;
 
 #ifdef __mips__
 if (cross_256mb(buf, size)) {
@@ -1114,20 +1126,11 @@ static inline void *alloc_code_gen_buffer(void)
 /* Request large pages for the buffer.  */
 qemu_madvise(buf, size, QEMU_MADV_HUGEPAGE);
 
-return buf;
+tcg_ctx->code_gen_buffer = buf;
+return true;
 }
 #endif /* USE_STATIC_CODE_GEN_BUFFER, WIN32, POSIX */
 
-static inline void code_gen_alloc(size_t tb_size)
-{
-tcg_ctx->code_gen_buffer_size = size_code_gen_buffer(tb_size);
-tcg_ctx->code_gen_buffer = alloc_code_gen_buffer();
-if (tcg_ctx->code_gen_buffer == NULL) {
-fprintf(stderr, "Could not allocate dynamic translator buffer\n");
-exit(1);
-}
-}
-
 static bool tb_cmp(const void *ap, const void *bp)
 {
 const TranslationBlock *a = ap;
@@ -1154,11 +1157,16 @@ static void tb_htable_init(void)
size. */
 void tcg_exec_init(unsigned long tb_size)
 {
+bool ok;
+
 tcg_allowed = true;
 cpu_gen_init();
 page_init();
 tb_htable_init();
-code_gen_alloc(tb_size);
+
+ok = alloc_code_gen_buffer(size_code_gen_buffer(tb_size), &error_fatal);
+assert(ok);
+
 #if defined(CONFIG_SOFTMMU)
 /* There's no guest base to take into account, so go ahead and
initialize the prologue now.  */
-- 
2.25.1




[RFC PATCH v5 11/33] Hexagon (target/hexagon) register fields

2020-10-29 Thread Taylor Simpson
Declare bitfields within registers such as user status register (USR)

Signed-off-by: Taylor Simpson 
---
 target/hexagon/reg_fields.h | 36 
 target/hexagon/reg_fields_def.h | 41 +
 target/hexagon/reg_fields.c | 27 +++
 3 files changed, 104 insertions(+)
 create mode 100644 target/hexagon/reg_fields.h
 create mode 100644 target/hexagon/reg_fields_def.h
 create mode 100644 target/hexagon/reg_fields.c

diff --git a/target/hexagon/reg_fields.h b/target/hexagon/reg_fields.h
new file mode 100644
index 000..4ec7d7c
--- /dev/null
+++ b/target/hexagon/reg_fields.h
@@ -0,0 +1,36 @@
+/*
+ *  Copyright(c) 2019-2020 Qualcomm Innovation Center, Inc. All Rights Reserved.
+ *
+ *  This program is free software; you can redistribute it and/or modify
+ *  it under the terms of the GNU General Public License as published by
+ *  the Free Software Foundation; either version 2 of the License, or
+ *  (at your option) any later version.
+ *
+ *  This program is distributed in the hope that it will be useful,
+ *  but WITHOUT ANY WARRANTY; without even the implied warranty of
+ *  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ *  GNU General Public License for more details.
+ *
+ *  You should have received a copy of the GNU General Public License
+ *  along with this program; if not, see <http://www.gnu.org/licenses/>.
+ */
+
+#ifndef HEXAGON_REG_FIELDS_H
+#define HEXAGON_REG_FIELDS_H
+
+typedef struct {
+int offset;
+int width;
+} RegField;
+
+extern const RegField reg_field_info[];
+
+enum {
+#define DEF_REG_FIELD(TAG, START, WIDTH) \
+TAG,
+#include "reg_fields_def.h"
+NUM_REG_FIELDS
+#undef DEF_REG_FIELD
+};
+
+#endif
diff --git a/target/hexagon/reg_fields_def.h b/target/hexagon/reg_fields_def.h
new file mode 100644
index 000..27b2231
--- /dev/null
+++ b/target/hexagon/reg_fields_def.h
@@ -0,0 +1,41 @@
+/*
+ *  Copyright(c) 2019-2020 Qualcomm Innovation Center, Inc. All Rights Reserved.
+ *
+ *  This program is free software; you can redistribute it and/or modify
+ *  it under the terms of the GNU General Public License as published by
+ *  the Free Software Foundation; either version 2 of the License, or
+ *  (at your option) any later version.
+ *
+ *  This program is distributed in the hope that it will be useful,
+ *  but WITHOUT ANY WARRANTY; without even the implied warranty of
+ *  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ *  GNU General Public License for more details.
+ *
+ *  You should have received a copy of the GNU General Public License
+ *  along with this program; if not, see <http://www.gnu.org/licenses/>.
+ */
+
+/*
+ * For registers that have individual fields, explain them here
+ *   DEF_REG_FIELD(tag,
+ * bit start offset,
+ * width
+ */
+
+/* USR fields */
+DEF_REG_FIELD(USR_OVF,0, 1)
+DEF_REG_FIELD(USR_FPINVF, 1, 1)
+DEF_REG_FIELD(USR_FPDBZF, 2, 1)
+DEF_REG_FIELD(USR_FPOVFF, 3, 1)
+DEF_REG_FIELD(USR_FPUNFF, 4, 1)
+DEF_REG_FIELD(USR_FPINPF, 5, 1)
+
+DEF_REG_FIELD(USR_LPCFG,  8, 2)
+
+DEF_REG_FIELD(USR_FPRND, 22, 2)
+
+DEF_REG_FIELD(USR_FPINVE,25, 1)
+DEF_REG_FIELD(USR_FPDBZE,26, 1)
+DEF_REG_FIELD(USR_FPOVFE,27, 1)
+DEF_REG_FIELD(USR_FPUNFE,28, 1)
+DEF_REG_FIELD(USR_FPINPE,29, 1)
diff --git a/target/hexagon/reg_fields.c b/target/hexagon/reg_fields.c
new file mode 100644
index 000..65905d5
--- /dev/null
+++ b/target/hexagon/reg_fields.c
@@ -0,0 +1,27 @@
+/*
+ *  Copyright(c) 2019-2020 Qualcomm Innovation Center, Inc. All Rights Reserved.
+ *
+ *  This program is free software; you can redistribute it and/or modify
+ *  it under the terms of the GNU General Public License as published by
+ *  the Free Software Foundation; either version 2 of the License, or
+ *  (at your option) any later version.
+ *
+ *  This program is distributed in the hope that it will be useful,
+ *  but WITHOUT ANY WARRANTY; without even the implied warranty of
+ *  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ *  GNU General Public License for more details.
+ *
+ *  You should have received a copy of the GNU General Public License
+ *  along with this program; if not, see <http://www.gnu.org/licenses/>.
+ */
+
+#include "qemu/osdep.h"
+#include "reg_fields.h"
+
+const RegField reg_field_info[] = {
+#define DEF_REG_FIELD(TAG, START, WIDTH)\
+  { START, WIDTH },
+#include "reg_fields_def.h"
+  { 0, 0 }
+#undef DEF_REG_FIELD
+};
-- 
2.7.4



[PATCH v2 09/19] tcg: Make DisasContextBase.tb const

2020-10-29 Thread Richard Henderson
There is nothing within the translators that ought to be
changing the TranslationBlock data, so make it const.

This does not actually use the read-only copy of the
data structure that exists within the rx mirror.

Signed-off-by: Richard Henderson 
---
 include/exec/gen-icount.h  | 4 ++--
 include/exec/translator.h  | 2 +-
 include/tcg/tcg-op.h   | 2 +-
 accel/tcg/translator.c | 4 ++--
 target/arm/translate-a64.c | 2 +-
 tcg/tcg-op.c   | 2 +-
 6 files changed, 8 insertions(+), 8 deletions(-)

diff --git a/include/exec/gen-icount.h b/include/exec/gen-icount.h
index 822c43cfd3..aa4b44354a 100644
--- a/include/exec/gen-icount.h
+++ b/include/exec/gen-icount.h
@@ -32,7 +32,7 @@ static inline void gen_io_end(void)
 tcg_temp_free_i32(tmp);
 }
 
-static inline void gen_tb_start(TranslationBlock *tb)
+static inline void gen_tb_start(const TranslationBlock *tb)
 {
 TCGv_i32 count, imm;
 
@@ -71,7 +71,7 @@ static inline void gen_tb_start(TranslationBlock *tb)
 tcg_temp_free_i32(count);
 }
 
-static inline void gen_tb_end(TranslationBlock *tb, int num_insns)
+static inline void gen_tb_end(const TranslationBlock *tb, int num_insns)
 {
 if (tb_cflags(tb) & CF_USE_ICOUNT) {
 /* Update the num_insn immediate parameter now that we know
diff --git a/include/exec/translator.h b/include/exec/translator.h
index 638e1529c5..24232ead41 100644
--- a/include/exec/translator.h
+++ b/include/exec/translator.h
@@ -67,7 +67,7 @@ typedef enum DisasJumpType {
  * Architecture-agnostic disassembly context.
  */
 typedef struct DisasContextBase {
-TranslationBlock *tb;
+const TranslationBlock *tb;
 target_ulong pc_first;
 target_ulong pc_next;
 DisasJumpType is_jmp;
diff --git a/include/tcg/tcg-op.h b/include/tcg/tcg-op.h
index 5abf17fecc..cbe39a3b95 100644
--- a/include/tcg/tcg-op.h
+++ b/include/tcg/tcg-op.h
@@ -805,7 +805,7 @@ static inline void tcg_gen_insn_start(target_ulong pc, 
target_ulong a1,
  * be NULL and @idx should be 0.  Otherwise, @tb should be valid and
  * @idx should be one of the TB_EXIT_ values.
  */
-void tcg_gen_exit_tb(TranslationBlock *tb, unsigned idx);
+void tcg_gen_exit_tb(const TranslationBlock *tb, unsigned idx);
 
 /**
  * tcg_gen_goto_tb() - output goto_tb TCG operation
diff --git a/accel/tcg/translator.c b/accel/tcg/translator.c
index fb1e19c585..a49a794065 100644
--- a/accel/tcg/translator.c
+++ b/accel/tcg/translator.c
@@ -133,8 +133,8 @@ void translator_loop(const TranslatorOps *ops, DisasContextBase *db,
 }
 
 /* The disas_log hook may use these values rather than recompute.  */
-db->tb->size = db->pc_next - db->pc_first;
-db->tb->icount = db->num_insns;
+tb->size = db->pc_next - db->pc_first;
+tb->icount = db->num_insns;
 
 #ifdef DEBUG_DISAS
 if (qemu_loglevel_mask(CPU_LOG_TB_IN_ASM)
diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c
index 072754fa24..297782e6ef 100644
--- a/target/arm/translate-a64.c
+++ b/target/arm/translate-a64.c
@@ -410,7 +410,7 @@ static inline bool use_goto_tb(DisasContext *s, int n, uint64_t dest)
 
 static inline void gen_goto_tb(DisasContext *s, int n, uint64_t dest)
 {
-TranslationBlock *tb;
+const TranslationBlock *tb;
 
 tb = s->base.tb;
 if (use_goto_tb(s, n, dest)) {
diff --git a/tcg/tcg-op.c b/tcg/tcg-op.c
index 4b8a473fad..e3dc0cb4cb 100644
--- a/tcg/tcg-op.c
+++ b/tcg/tcg-op.c
@@ -2664,7 +2664,7 @@ void tcg_gen_extr32_i64(TCGv_i64 lo, TCGv_i64 hi, TCGv_i64 arg)
 
 /* QEMU specific operations.  */
 
-void tcg_gen_exit_tb(TranslationBlock *tb, unsigned idx)
+void tcg_gen_exit_tb(const TranslationBlock *tb, unsigned idx)
 {
 uintptr_t val = (uintptr_t)tb + idx;
 
-- 
2.25.1




[PATCH v2 02/19] tcg: Move tcg prologue pointer out of TCGContext

2020-10-29 Thread Richard Henderson
This value is constant across all thread-local copies of TCGContext,
so we might as well move it out of thread-local storage.

Use the correct function pointer type, and name the variable
tcg_qemu_tb_exec, which means that we are able to remove the
macro that does the casting.

Replace HAVE_TCG_QEMU_TB_EXEC with CONFIG_TCG_INTERPRETER,
as this is somewhat clearer in intent.

Signed-off-by: Richard Henderson 
---
 include/tcg/tcg.h| 9 -
 tcg/tci/tcg-target.h | 2 --
 tcg/tcg.c| 9 -
 tcg/tci.c| 3 ++-
 4 files changed, 14 insertions(+), 9 deletions(-)

diff --git a/include/tcg/tcg.h b/include/tcg/tcg.h
index 8804a8c4a2..5ff5bf2a73 100644
--- a/include/tcg/tcg.h
+++ b/include/tcg/tcg.h
@@ -621,7 +621,6 @@ struct TCGContext {
here, because there's too much arithmetic throughout that relies
on addition and subtraction working on bytes.  Rely on the GCC
extension that allows arithmetic on void*.  */
-void *code_gen_prologue;
 void *code_gen_epilogue;
 void *code_gen_buffer;
 size_t code_gen_buffer_size;
@@ -1220,11 +1219,11 @@ static inline unsigned get_mmuidx(TCGMemOpIdx oi)
 #define TB_EXIT_IDXMAX1
 #define TB_EXIT_REQUESTED 3
 
-#ifdef HAVE_TCG_QEMU_TB_EXEC
-uintptr_t tcg_qemu_tb_exec(CPUArchState *env, uint8_t *tb_ptr);
+#ifdef CONFIG_TCG_INTERPRETER
+uintptr_t tcg_qemu_tb_exec(CPUArchState *env, void *tb_ptr);
 #else
-# define tcg_qemu_tb_exec(env, tb_ptr) \
-((uintptr_t (*)(void *, void *))tcg_ctx->code_gen_prologue)(env, tb_ptr)
+typedef uintptr_t tcg_prologue_fn(CPUArchState *env, void *tb_ptr);
+extern tcg_prologue_fn *tcg_qemu_tb_exec;
 #endif
 
 void tcg_register_jit(void *buf, size_t buf_size);
diff --git a/tcg/tci/tcg-target.h b/tcg/tci/tcg-target.h
index 6460449719..49f3291f8a 100644
--- a/tcg/tci/tcg-target.h
+++ b/tcg/tci/tcg-target.h
@@ -189,8 +189,6 @@ typedef enum {
 
 void tci_disas(uint8_t opc);
 
-#define HAVE_TCG_QEMU_TB_EXEC
-
 /* Flush the dcache at RW, and the icache at RX, as necessary. */
 static inline void flush_idcache_range(uintptr_t rx, uintptr_t rw, size_t len)
 {
diff --git a/tcg/tcg.c b/tcg/tcg.c
index 3bf36e0cfe..8d63c714fb 100644
--- a/tcg/tcg.c
+++ b/tcg/tcg.c
@@ -161,6 +161,10 @@ static TCGContext **tcg_ctxs;
 static unsigned int n_tcg_ctxs;
 TCGv_env cpu_env = 0;
 
+#ifndef CONFIG_TCG_INTERPRETER
+tcg_prologue_fn *tcg_qemu_tb_exec;
+#endif
+
 struct tcg_region_tree {
 QemuMutex lock;
 GTree *tree;
@@ -1053,7 +1057,10 @@ void tcg_prologue_init(TCGContext *s)
 s->code_ptr = buf0;
 s->code_buf = buf0;
 s->data_gen_ptr = NULL;
-s->code_gen_prologue = buf0;
+
+#ifndef CONFIG_TCG_INTERPRETER
+tcg_qemu_tb_exec = (tcg_prologue_fn *)buf0;
+#endif
 
 /* Compute a high-water mark, at which we voluntarily flush the buffer
and start over.  The size here is arbitrary, significantly larger
diff --git a/tcg/tci.c b/tcg/tci.c
index 82039fd163..d996eb7cf8 100644
--- a/tcg/tci.c
+++ b/tcg/tci.c
@@ -475,8 +475,9 @@ static bool tci_compare64(uint64_t u0, uint64_t u1, TCGCond 
condition)
 #endif
 
 /* Interpret pseudo code in tb. */
-uintptr_t tcg_qemu_tb_exec(CPUArchState *env, uint8_t *tb_ptr)
+uintptr_t tcg_qemu_tb_exec(CPUArchState *env, void *v_tb_ptr)
 {
+uint8_t *tb_ptr = v_tb_ptr;
 tcg_target_ulong regs[TCG_TARGET_NB_REGS];
 long tcg_temps[CPU_TEMP_BUF_NLONGS];
 uintptr_t sp_value = (uintptr_t)(tcg_temps + CPU_TEMP_BUF_NLONGS);
-- 
2.25.1




[RFC PATCH v5 26/33] Hexagon (target/hexagon) TCG generation

2020-10-29 Thread Taylor Simpson
Include the generated files and set up the data structures

Signed-off-by: Taylor Simpson 
---
 target/hexagon/genptr.h |  25 ++
 target/hexagon/genptr.c | 234 
 2 files changed, 259 insertions(+)
 create mode 100644 target/hexagon/genptr.h
 create mode 100644 target/hexagon/genptr.c

diff --git a/target/hexagon/genptr.h b/target/hexagon/genptr.h
new file mode 100644
index 000..4e8f903
--- /dev/null
+++ b/target/hexagon/genptr.h
@@ -0,0 +1,25 @@
+/*
+ *  Copyright(c) 2019-2020 Qualcomm Innovation Center, Inc. All Rights Reserved.
+ *
+ *  This program is free software; you can redistribute it and/or modify
+ *  it under the terms of the GNU General Public License as published by
+ *  the Free Software Foundation; either version 2 of the License, or
+ *  (at your option) any later version.
+ *
+ *  This program is distributed in the hope that it will be useful,
+ *  but WITHOUT ANY WARRANTY; without even the implied warranty of
+ *  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ *  GNU General Public License for more details.
+ *
+ *  You should have received a copy of the GNU General Public License
+ *  along with this program; if not, see <http://www.gnu.org/licenses/>.
+ */
+
+#ifndef HEXAGON_GENPTR_H
+#define HEXAGON_GENPTR_H
+
+#include "insn.h"
+
+extern const SemanticInsn opcode_genptr[];
+
+#endif
diff --git a/target/hexagon/genptr.c b/target/hexagon/genptr.c
new file mode 100644
index 000..ba233a4
--- /dev/null
+++ b/target/hexagon/genptr.c
@@ -0,0 +1,234 @@
+/*
+ *  Copyright(c) 2019-2020 Qualcomm Innovation Center, Inc. All Rights Reserved.
+ *
+ *  This program is free software; you can redistribute it and/or modify
+ *  it under the terms of the GNU General Public License as published by
+ *  the Free Software Foundation; either version 2 of the License, or
+ *  (at your option) any later version.
+ *
+ *  This program is distributed in the hope that it will be useful,
+ *  but WITHOUT ANY WARRANTY; without even the implied warranty of
+ *  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ *  GNU General Public License for more details.
+ *
+ *  You should have received a copy of the GNU General Public License
+ *  along with this program; if not, see <http://www.gnu.org/licenses/>.
+ */
+
+#define QEMU_GENERATE
+#include "qemu/osdep.h"
+#include "qemu/log.h"
+#include "cpu.h"
+#include "internal.h"
+#include "tcg/tcg-op.h"
+#include "insn.h"
+#include "opcodes.h"
+#include "translate.h"
+#include "macros.h"
+#include "gen_tcg.h"
+
+static inline TCGv gen_read_reg(TCGv result, int num)
+{
+tcg_gen_mov_tl(result, hex_gpr[num]);
+return result;
+}
+
+static inline TCGv gen_read_preg(TCGv pred, uint8_t num)
+{
+tcg_gen_mov_tl(pred, hex_pred[num]);
+return pred;
+}
+
+static inline void gen_log_predicated_reg_write(int rnum, TCGv val, int slot)
+{
+TCGv one = tcg_const_tl(1);
+TCGv zero = tcg_const_tl(0);
+TCGv slot_mask = tcg_temp_new();
+
+tcg_gen_andi_tl(slot_mask, hex_slot_cancelled, 1 << slot);
+tcg_gen_movcond_tl(TCG_COND_EQ, hex_new_value[rnum], slot_mask, zero,
+   val, hex_new_value[rnum]);
+#if HEX_DEBUG
+/* Do this so HELPER(debug_commit_end) will know */
+tcg_gen_movcond_tl(TCG_COND_EQ, hex_reg_written[rnum], slot_mask, zero,
+   one, hex_reg_written[rnum]);
+#endif
+
+tcg_temp_free(one);
+tcg_temp_free(zero);
+tcg_temp_free(slot_mask);
+}
+
+static inline void gen_log_reg_write(int rnum, TCGv val)
+{
+tcg_gen_mov_tl(hex_new_value[rnum], val);
+#if HEX_DEBUG
+/* Do this so HELPER(debug_commit_end) will know */
+tcg_gen_movi_tl(hex_reg_written[rnum], 1);
+#endif
+}
+
+static void gen_log_predicated_reg_write_pair(int rnum, TCGv_i64 val, int slot)
+{
+TCGv val32 = tcg_temp_new();
+TCGv one = tcg_const_tl(1);
+TCGv zero = tcg_const_tl(0);
+TCGv slot_mask = tcg_temp_new();
+
+tcg_gen_andi_tl(slot_mask, hex_slot_cancelled, 1 << slot);
+/* Low word */
+tcg_gen_extrl_i64_i32(val32, val);
+tcg_gen_movcond_tl(TCG_COND_EQ, hex_new_value[rnum], slot_mask, zero,
+   val32, hex_new_value[rnum]);
+#if HEX_DEBUG
+/* Do this so HELPER(debug_commit_end) will know */
+tcg_gen_movcond_tl(TCG_COND_EQ, hex_reg_written[rnum],
+   slot_mask, zero,
+   one, hex_reg_written[rnum]);
+#endif
+
+/* High word */
+tcg_gen_extrh_i64_i32(val32, val);
+tcg_gen_movcond_tl(TCG_COND_EQ, hex_new_value[rnum + 1],
+   slot_mask, zero,
+   val32, hex_new_value[rnum + 1]);
+#if HEX_DEBUG
+/* Do this so HELPER(debug_commit_end) will know */
+tcg_gen_movcond_tl(TCG_COND_EQ, hex_reg_written[rnum + 1],
+   slot_mask, zero,
+   one, hex_reg_written[rnum + 1]);
+#endif
+
+tcg_temp_free(val32);
+tcg_temp_free(one);
+tcg_temp

[PATCH v2 08/19] tcg: Adjust tb_target_set_jmp_target for split rwx

2020-10-29 Thread Richard Henderson
Pass both rx and rw addresses to tb_target_set_jmp_target.

Signed-off-by: Richard Henderson 
---
 tcg/aarch64/tcg-target.h |  2 +-
 tcg/arm/tcg-target.h |  2 +-
 tcg/i386/tcg-target.h|  6 +++---
 tcg/mips/tcg-target.h|  2 +-
 tcg/ppc/tcg-target.h |  2 +-
 tcg/riscv/tcg-target.h   |  2 +-
 tcg/s390/tcg-target.h|  8 
 tcg/sparc/tcg-target.h   |  2 +-
 tcg/tci/tcg-target.h |  6 +++---
 accel/tcg/cpu-exec.c |  4 +++-
 tcg/aarch64/tcg-target.c.inc | 12 ++--
 tcg/mips/tcg-target.c.inc|  8 
 tcg/ppc/tcg-target.c.inc | 16 
 tcg/sparc/tcg-target.c.inc   | 14 +++---
 14 files changed, 44 insertions(+), 42 deletions(-)

diff --git a/tcg/aarch64/tcg-target.h b/tcg/aarch64/tcg-target.h
index d0a6a059b7..91313d93be 100644
--- a/tcg/aarch64/tcg-target.h
+++ b/tcg/aarch64/tcg-target.h
@@ -158,7 +158,7 @@ static inline void flush_idcache_range(uintptr_t rx, uintptr_t rw, size_t len)
 __builtin___clear_cache((char *)rx, (char *)(rx + len));
 }
 
-void tb_target_set_jmp_target(uintptr_t, uintptr_t, uintptr_t);
+void tb_target_set_jmp_target(uintptr_t, uintptr_t, uintptr_t, uintptr_t);
 
 #ifdef CONFIG_SOFTMMU
 #define TCG_TARGET_NEED_LDST_LABELS
diff --git a/tcg/arm/tcg-target.h b/tcg/arm/tcg-target.h
index fa88b24e43..b21a2fb6a1 100644
--- a/tcg/arm/tcg-target.h
+++ b/tcg/arm/tcg-target.h
@@ -144,7 +144,7 @@ static inline void flush_idcache_range(uintptr_t rx, uintptr_t rw, size_t len)
 }
 
 /* not defined -- call should be eliminated at compile time */
-void tb_target_set_jmp_target(uintptr_t, uintptr_t, uintptr_t);
+void tb_target_set_jmp_target(uintptr_t, uintptr_t, uintptr_t, uintptr_t);
 
 #ifdef CONFIG_SOFTMMU
 #define TCG_TARGET_NEED_LDST_LABELS
diff --git a/tcg/i386/tcg-target.h b/tcg/i386/tcg-target.h
index 8323e72639..f52ba0ffec 100644
--- a/tcg/i386/tcg-target.h
+++ b/tcg/i386/tcg-target.h
@@ -211,11 +211,11 @@ static inline void flush_idcache_range(uintptr_t rx, uintptr_t rw, size_t len)
 {
 }
 
-static inline void tb_target_set_jmp_target(uintptr_t tc_ptr,
-uintptr_t jmp_addr, uintptr_t addr)
+static inline void tb_target_set_jmp_target(uintptr_t tc_ptr, uintptr_t jmp_rx,
+uintptr_t jmp_rw, uintptr_t addr)
 {
 /* patch the branch destination */
-qatomic_set((int32_t *)jmp_addr, addr - (jmp_addr + 4));
+qatomic_set((int32_t *)jmp_rw, addr - (jmp_rx + 4));
 /* no need to flush icache explicitly */
 }
 
diff --git a/tcg/mips/tcg-target.h b/tcg/mips/tcg-target.h
index 47b1226ee9..cd548dacec 100644
--- a/tcg/mips/tcg-target.h
+++ b/tcg/mips/tcg-target.h
@@ -216,7 +216,7 @@ static inline void flush_idcache_range(uintptr_t rx, uintptr_t rw, size_t len)
 cacheflush((void *)rx, len, ICACHE);
 }
 
-void tb_target_set_jmp_target(uintptr_t, uintptr_t, uintptr_t);
+void tb_target_set_jmp_target(uintptr_t, uintptr_t, uintptr_t, uintptr_t);
 
 #ifdef CONFIG_SOFTMMU
 #define TCG_TARGET_NEED_LDST_LABELS
diff --git a/tcg/ppc/tcg-target.h b/tcg/ppc/tcg-target.h
index fbb6dc1b47..8f3e4c924a 100644
--- a/tcg/ppc/tcg-target.h
+++ b/tcg/ppc/tcg-target.h
@@ -176,7 +176,7 @@ extern bool have_vsx;
 #define TCG_TARGET_HAS_cmpsel_vec   0
 
 void flush_idcache_range(uintptr_t rx, uintptr_t rw, size_t len);
-void tb_target_set_jmp_target(uintptr_t, uintptr_t, uintptr_t);
+void tb_target_set_jmp_target(uintptr_t, uintptr_t, uintptr_t, uintptr_t);
 
 #define TCG_TARGET_DEFAULT_MO (0)
 #define TCG_TARGET_HAS_MEMORY_BSWAP 1
diff --git a/tcg/riscv/tcg-target.h b/tcg/riscv/tcg-target.h
index 0fa6ae358e..e03fd17427 100644
--- a/tcg/riscv/tcg-target.h
+++ b/tcg/riscv/tcg-target.h
@@ -169,7 +169,7 @@ static inline void flush_idcache_range(uintptr_t rx, uintptr_t rw, size_t len)
 }
 
 /* not defined -- call should be eliminated at compile time */
-void tb_target_set_jmp_target(uintptr_t, uintptr_t, uintptr_t);
+void tb_target_set_jmp_target(uintptr_t, uintptr_t, uintptr_t, uintptr_t);
 
 #define TCG_TARGET_DEFAULT_MO (0)
 
diff --git a/tcg/s390/tcg-target.h b/tcg/s390/tcg-target.h
index c3dc2e8938..c5a749e425 100644
--- a/tcg/s390/tcg-target.h
+++ b/tcg/s390/tcg-target.h
@@ -150,12 +150,12 @@ static inline void flush_idcache_range(uintptr_t rx, uintptr_t rw, size_t len)
 {
 }
 
-static inline void tb_target_set_jmp_target(uintptr_t tc_ptr,
-uintptr_t jmp_addr, uintptr_t addr)
+static inline void tb_target_set_jmp_target(uintptr_t tc_ptr, uintptr_t jmp_rx,
+uintptr_t jmp_rw, uintptr_t addr)
 {
 /* patch the branch destination */
-intptr_t disp = addr - (jmp_addr - 2);
-qatomic_set((int32_t *)jmp_addr, disp / 2);
+intptr_t disp = addr - (jmp_rx - 2);
+qatomic_set((int32_t *)jmp_rw, disp / 2);
 /* no need to flush icache explicitly */
 }
 
diff --git a/tcg/sparc/tcg-target.h b/tcg/sparc/tcg-target

[PATCH v2 6/8] target/sparc/win_helper: silence the compiler warnings

2020-10-29 Thread Chen Qun
When using -Wimplicit-fallthrough in our CFLAGS, the compiler showed warning:
target/sparc/win_helper.c: In function ‘get_gregset’:
target/sparc/win_helper.c:304:9: warning: this statement may fall through 
[-Wimplicit-fallthrough=]
  304 | trace_win_helper_gregset_error(pstate);
  | ^~
target/sparc/win_helper.c:306:5: note: here
  306 | case 0:
  | ^~~~

Add the corresponding "fall through" comment to fix it.

Reported-by: Euler Robot 
Signed-off-by: Chen Qun 
Reviewed-by: Artyom Tarasenko 
---
v1->v2: Combine the /* fall through */ to the preceding comments
(Base on Philippe's comments).

Cc: Philippe Mathieu-Daudé 
Cc: Mark Cave-Ayland 
Cc: Artyom Tarasenko 
---
 target/sparc/win_helper.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/target/sparc/win_helper.c b/target/sparc/win_helper.c
index 8290a21142..e78660b60a 100644
--- a/target/sparc/win_helper.c
+++ b/target/sparc/win_helper.c
@@ -302,7 +302,7 @@ static inline uint64_t *get_gregset(CPUSPARCState *env, uint32_t pstate)
 switch (pstate) {
 default:
 trace_win_helper_gregset_error(pstate);
-/* pass through to normal set of global registers */
+/* fall through to normal set of global registers */
 case 0:
 return env->bgregs;
 case PS_AG:
-- 
2.27.0




[RFC PATCH v5 14/33] Hexagon (target/hexagon) instruction printing

2020-10-29 Thread Taylor Simpson
Signed-off-by: Taylor Simpson 
---
 target/hexagon/printinsn.h |  28 
 target/hexagon/printinsn.c | 158 +
 2 files changed, 186 insertions(+)
 create mode 100644 target/hexagon/printinsn.h
 create mode 100644 target/hexagon/printinsn.c

diff --git a/target/hexagon/printinsn.h b/target/hexagon/printinsn.h
new file mode 100644
index 000..0e629b2
--- /dev/null
+++ b/target/hexagon/printinsn.h
@@ -0,0 +1,28 @@
+/*
+ *  Copyright(c) 2019-2020 Qualcomm Innovation Center, Inc. All Rights Reserved.
+ *
+ *  This program is free software; you can redistribute it and/or modify
+ *  it under the terms of the GNU General Public License as published by
+ *  the Free Software Foundation; either version 2 of the License, or
+ *  (at your option) any later version.
+ *
+ *  This program is distributed in the hope that it will be useful,
+ *  but WITHOUT ANY WARRANTY; without even the implied warranty of
+ *  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ *  GNU General Public License for more details.
+ *
+ *  You should have received a copy of the GNU General Public License
+ *  along with this program; if not, see .
+ */
+
+#ifndef HEXAGON_PRINTINSN_H
+#define HEXAGON_PRINTINSN_H
+
+#include "qemu/osdep.h"
+#include "insn.h"
+
+extern void snprint_a_pkt_disas(char *buf, int n, Packet *pkt, uint32_t *words,
+target_ulong pc);
+extern void snprint_a_pkt_debug(char *buf, int n, Packet *pkt);
+
+#endif
diff --git a/target/hexagon/printinsn.c b/target/hexagon/printinsn.c
new file mode 100644
index 000..8315d56
--- /dev/null
+++ b/target/hexagon/printinsn.c
@@ -0,0 +1,158 @@
+/*
+ *  Copyright(c) 2019-2020 Qualcomm Innovation Center, Inc. All Rights Reserved.
+ *
+ *  This program is free software; you can redistribute it and/or modify
+ *  it under the terms of the GNU General Public License as published by
+ *  the Free Software Foundation; either version 2 of the License, or
+ *  (at your option) any later version.
+ *
+ *  This program is distributed in the hope that it will be useful,
+ *  but WITHOUT ANY WARRANTY; without even the implied warranty of
+ *  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ *  GNU General Public License for more details.
+ *
+ *  You should have received a copy of the GNU General Public License
+ *  along with this program; if not, see .
+ */
+
+#include "qemu/osdep.h"
+#include "opcodes.h"
+#include "printinsn.h"
+#include "insn.h"
+#include "reg_fields.h"
+#include "internal.h"
+
+static const char *sreg2str(unsigned int reg)
+{
+if (reg < TOTAL_PER_THREAD_REGS) {
+return hexagon_regnames[reg];
+} else {
+return "???";
+}
+}
+
+static const char *creg2str(unsigned int reg)
+{
+return sreg2str(reg + HEX_REG_SA0);
+}
+
+static void snprintinsn(char *buf, int n, Insn * insn)
+{
+switch (insn->opcode) {
+#define DEF_VECX_PRINTINFO(TAG, FMT, ...) DEF_PRINTINFO(TAG, FMT, __VA_ARGS__)
+#define DEF_PRINTINFO(TAG, FMT, ...) \
+case TAG: \
+snprintf(buf, n, FMT, __VA_ARGS__);\
+break;
+#include "printinsn_generated.h"
+#undef DEF_VECX_PRINTINFO
+#undef DEF_PRINTINFO
+}
+}
+
+void snprint_a_pkt_disas(char *buf, int n, Packet *pkt, uint32_t *words,
+ target_ulong pc)
+{
+char tmpbuf[128];
+buf[0] = '\0';
+bool has_endloop0 = false;
+bool has_endloop1 = false;
+bool has_endloop01 = false;
+
+for (int i = 0; i < pkt->num_insns; i++) {
+if (pkt->insn[i].part1) {
+continue;
+}
+
+/* We'll print the endloop's at the end of the packet */
+if (pkt->insn[i].opcode == J2_endloop0) {
+has_endloop0 = true;
+continue;
+}
+if (pkt->insn[i].opcode == J2_endloop1) {
+has_endloop1 = true;
+continue;
+}
+if (pkt->insn[i].opcode == J2_endloop01) {
+has_endloop01 = true;
+continue;
+}
+
+snprintf(tmpbuf, 127, "0x" TARGET_FMT_lx "\t", words[i]);
+strncat(buf, tmpbuf, n);
+
+if (i == 0) {
+strncat(buf, "{", n);
+}
+
+snprintinsn(tmpbuf, 127, &(pkt->insn[i]));
+strncat(buf, "\t", n);
+strncat(buf, tmpbuf, n);
+
+if (i < pkt->num_insns - 1) {
+/*
+ * Subinstructions are two instructions encoded
+ * in the same word. Print them on the same line.
+ */
+if (GET_ATTRIB(pkt->insn[i].opcode, A_SUBINSN)) {
+strncat(buf, "; ", n);
+snprintinsn(tmpbuf, 127, &(pkt->insn[i + 1]));
+strncat(buf, tmpbuf, n);
+i++;
+} else if (pkt->insn[i + 1].opcode != J2_endloop0 &&
+   pkt->insn[i + 1].opcode != J2_endloop1 &&
+   pkt->insn[i + 1].opcode != J2_e

[PATCH v2 01/19] tcg: Enhance flush_icache_range with separate data pointer

2020-10-29 Thread Richard Henderson
We are shortly going to have a split rw/rx jit buffer.  Depending
on the host, we need to flush the dcache at the rw data pointer and
flush the icache at the rx code pointer.

For now, the two passed pointers are identical, so there is no
effective change in behaviour.
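A minimal sketch (helper names hypothetical, not from this patch) of how a caller with a split mapping would derive the rx pointer and use the two-pointer interface:

```c
#include <stdint.h>
#include <stddef.h>

/*
 * Hypothetical sketch: with a split JIT buffer, the writable (rw) and
 * executable (rx) views map the same memory at a fixed offset.  Passing
 * both pointers lets the dcache be cleaned at the rw address and the
 * icache be invalidated at the rx address.
 */
static void flush_idcache_range(uintptr_t rx, uintptr_t rw, size_t len)
{
    /* Per-host implementation goes here; a no-op on hosts with
     * coherent instruction caches such as x86. */
    (void)rx;
    (void)rw;
    (void)len;
}

/* Derive the executable view of a buffer written through the rw view. */
static uintptr_t rx_from_rw(uintptr_t rw, uintptr_t split_offset)
{
    return rw + split_offset;
}
```

After emitting code through the rw mapping, a caller would invoke `flush_idcache_range(rx_from_rw(rw, off), rw, len)`; while the two mappings are identical (`off == 0`), this degenerates to the old single-pointer behaviour, matching the "no effective change" noted above.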

Signed-off-by: Richard Henderson 
---
 tcg/aarch64/tcg-target.h |  9 +++--
 tcg/arm/tcg-target.h |  8 ++--
 tcg/i386/tcg-target.h|  3 ++-
 tcg/mips/tcg-target.h|  8 ++--
 tcg/ppc/tcg-target.h |  2 +-
 tcg/riscv/tcg-target.h   |  8 ++--
 tcg/s390/tcg-target.h|  3 ++-
 tcg/sparc/tcg-target.h   |  8 +---
 tcg/tci/tcg-target.h |  3 ++-
 softmmu/physmem.c|  9 -
 tcg/tcg.c|  5 +++--
 tcg/aarch64/tcg-target.c.inc |  2 +-
 tcg/mips/tcg-target.c.inc|  2 +-
 tcg/ppc/tcg-target.c.inc | 21 +++--
 tcg/sparc/tcg-target.c.inc   |  4 ++--
 15 files changed, 63 insertions(+), 32 deletions(-)

diff --git a/tcg/aarch64/tcg-target.h b/tcg/aarch64/tcg-target.h
index 663dd0b95e..d0a6a059b7 100644
--- a/tcg/aarch64/tcg-target.h
+++ b/tcg/aarch64/tcg-target.h
@@ -148,9 +148,14 @@ typedef enum {
 #define TCG_TARGET_DEFAULT_MO (0)
 #define TCG_TARGET_HAS_MEMORY_BSWAP 1
 
-static inline void flush_icache_range(uintptr_t start, uintptr_t stop)
+/* Flush the dcache at RW, and the icache at RX, as necessary. */
+static inline void flush_idcache_range(uintptr_t rx, uintptr_t rw, size_t len)
 {
-__builtin___clear_cache((char *)start, (char *)stop);
+/* TODO: Copy this from gcc to avoid 4 loops instead of 2. */
+if (rw != rx) {
+__builtin___clear_cache((char *)rw, (char *)(rw + len));
+}
+__builtin___clear_cache((char *)rx, (char *)(rx + len));
 }
 
 void tb_target_set_jmp_target(uintptr_t, uintptr_t, uintptr_t);
diff --git a/tcg/arm/tcg-target.h b/tcg/arm/tcg-target.h
index 17e771374d..fa88b24e43 100644
--- a/tcg/arm/tcg-target.h
+++ b/tcg/arm/tcg-target.h
@@ -134,9 +134,13 @@ enum {
 #define TCG_TARGET_DEFAULT_MO (0)
 #define TCG_TARGET_HAS_MEMORY_BSWAP 1
 
-static inline void flush_icache_range(uintptr_t start, uintptr_t stop)
+/* Flush the dcache at RW, and the icache at RX, as necessary. */
+static inline void flush_idcache_range(uintptr_t rx, uintptr_t rw, size_t len)
 {
-__builtin___clear_cache((char *) start, (char *) stop);
+if (rw != rx) {
+__builtin___clear_cache((char *)rw, (char *)(rw + len));
+}
+__builtin___clear_cache((char *)rx, (char *)(rx + len));
 }
 
 /* not defined -- call should be eliminated at compile time */
diff --git a/tcg/i386/tcg-target.h b/tcg/i386/tcg-target.h
index abd4ac7fc0..8323e72639 100644
--- a/tcg/i386/tcg-target.h
+++ b/tcg/i386/tcg-target.h
@@ -206,7 +206,8 @@ extern bool have_avx2;
 #define TCG_TARGET_extract_i64_valid(ofs, len) \
 (((ofs) == 8 && (len) == 8) || ((ofs) + (len)) == 32)
 
-static inline void flush_icache_range(uintptr_t start, uintptr_t stop)
+/* Flush the dcache at RW, and the icache at RX, as necessary. */
+static inline void flush_idcache_range(uintptr_t rx, uintptr_t rw, size_t len)
 {
 }
 
diff --git a/tcg/mips/tcg-target.h b/tcg/mips/tcg-target.h
index c6b091d849..47b1226ee9 100644
--- a/tcg/mips/tcg-target.h
+++ b/tcg/mips/tcg-target.h
@@ -207,9 +207,13 @@ extern bool use_mips32r2_instructions;
 #define TCG_TARGET_DEFAULT_MO (0)
 #define TCG_TARGET_HAS_MEMORY_BSWAP 1
 
-static inline void flush_icache_range(uintptr_t start, uintptr_t stop)
+/* Flush the dcache at RW, and the icache at RX, as necessary. */
+static inline void flush_idcache_range(uintptr_t rx, uintptr_t rw, size_t len)
 {
-cacheflush ((void *)start, stop-start, ICACHE);
+if (rx != rw) {
+cacheflush((void *)rw, len, DCACHE);
+}
+cacheflush((void *)rx, len, ICACHE);
 }
 
 void tb_target_set_jmp_target(uintptr_t, uintptr_t, uintptr_t);
diff --git a/tcg/ppc/tcg-target.h b/tcg/ppc/tcg-target.h
index be10363956..fbb6dc1b47 100644
--- a/tcg/ppc/tcg-target.h
+++ b/tcg/ppc/tcg-target.h
@@ -175,7 +175,7 @@ extern bool have_vsx;
 #define TCG_TARGET_HAS_bitsel_vec   have_vsx
 #define TCG_TARGET_HAS_cmpsel_vec   0
 
-void flush_icache_range(uintptr_t start, uintptr_t stop);
+void flush_idcache_range(uintptr_t rx, uintptr_t rw, size_t len);
 void tb_target_set_jmp_target(uintptr_t, uintptr_t, uintptr_t);
 
 #define TCG_TARGET_DEFAULT_MO (0)
diff --git a/tcg/riscv/tcg-target.h b/tcg/riscv/tcg-target.h
index 032439d806..0fa6ae358e 100644
--- a/tcg/riscv/tcg-target.h
+++ b/tcg/riscv/tcg-target.h
@@ -159,9 +159,13 @@ typedef enum {
 #define TCG_TARGET_HAS_mulsh_i641
 #endif
 
-static inline void flush_icache_range(uintptr_t start, uintptr_t stop)
+/* Flush the dcache at RW, and the icache at RX, as necessary. */
+static inline void flush_idcache_range(uintptr_t rx, uintptr_t rw, size_t len)
 {
-__builtin___clear_cache((char *)start, (char *)stop);
+if (rx != rw) {
+__builtin___clear_cache((char *)rw

[PATCH v2 8/8] target/ppc: replaced the TODO with LOG_UNIMP and add break for silence warnings

2020-10-29 Thread Chen Qun
When using -Wimplicit-fallthrough in our CFLAGS, the compiler shows the following warning:
target/ppc/mmu_helper.c: In function ‘dump_mmu’:
target/ppc/mmu_helper.c:1351:12: warning: this statement may fall through [-Wimplicit-fallthrough=]
 1351 | if (ppc64_v3_radix(env_archcpu(env))) {
  |^
target/ppc/mmu_helper.c:1358:5: note: here
 1358 | default:
  | ^~~

Use a "qemu_log_mask(LOG_UNIMP, ...)" call instead of the TODO comment,
and add the missing break statement to fix it.

Reported-by: Euler Robot 
Signed-off-by: Chen Qun 
---
v1->v2: replace the TODO with a LOG_UNIMP call and add a break statement
(based on Philippe's comments).

Cc: Thomas Huth 
Cc: David Gibson 
Cc: Philippe Mathieu-Daudé 
---
 target/ppc/mmu_helper.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/target/ppc/mmu_helper.c b/target/ppc/mmu_helper.c
index 8972714775..12723362b7 100644
--- a/target/ppc/mmu_helper.c
+++ b/target/ppc/mmu_helper.c
@@ -1349,11 +1349,12 @@ void dump_mmu(CPUPPCState *env)
 break;
 case POWERPC_MMU_3_00:
 if (ppc64_v3_radix(env_archcpu(env))) {
-/* TODO - Unsupported */
+qemu_log_mask(LOG_UNIMP, "%s: the PPC64 MMU unsupported\n",
+  __func__);
 } else {
 dump_slb(env_archcpu(env));
-break;
 }
+break;
 #endif
 default:
 qemu_log_mask(LOG_UNIMP, "%s: unimplemented\n", __func__);
-- 
2.27.0




[RFC PATCH v5 23/33] Hexagon (target/hexagon) opcode data structures

2020-10-29 Thread Taylor Simpson
Signed-off-by: Taylor Simpson 
---
 target/hexagon/opcodes.h |  63 +
 target/hexagon/opcodes.c | 142 +++
 2 files changed, 205 insertions(+)
 create mode 100644 target/hexagon/opcodes.h
 create mode 100644 target/hexagon/opcodes.c

diff --git a/target/hexagon/opcodes.h b/target/hexagon/opcodes.h
new file mode 100644
index 000..1aa2074
--- /dev/null
+++ b/target/hexagon/opcodes.h
@@ -0,0 +1,63 @@
+/*
+ *  Copyright(c) 2019-2020 Qualcomm Innovation Center, Inc. All Rights Reserved.
+ *
+ *  This program is free software; you can redistribute it and/or modify
+ *  it under the terms of the GNU General Public License as published by
+ *  the Free Software Foundation; either version 2 of the License, or
+ *  (at your option) any later version.
+ *
+ *  This program is distributed in the hope that it will be useful,
+ *  but WITHOUT ANY WARRANTY; without even the implied warranty of
+ *  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ *  GNU General Public License for more details.
+ *
+ *  You should have received a copy of the GNU General Public License
+ *  along with this program; if not, see .
+ */
+
+#ifndef HEXAGON_OPCODES_H
+#define HEXAGON_OPCODES_H
+
+#include "qemu/bitmap.h"
+#include "attribs.h"
+
+typedef enum {
+#define OPCODE(IID) IID
+#include "opcodes_def_generated.h"
+XX_LAST_OPCODE
+#undef OPCODE
+} Opcode;
+
+typedef enum {
+NORMAL,
+HALF,
+SUBINSN_A,
+SUBINSN_L1,
+SUBINSN_L2,
+SUBINSN_S1,
+SUBINSN_S2,
+EXT_noext,
+EXT_mmvec,
+XX_LAST_ENC_CLASS
+} EncClass;
+
+extern const char * const opcode_names[];
+
+extern const char * const opcode_reginfo[];
+extern const char * const opcode_rregs[];
+extern const char * const opcode_wregs[];
+
+typedef struct {
+const char * const encoding;
+const EncClass enc_class;
+} OpcodeEncoding;
+
+extern const OpcodeEncoding opcode_encodings[XX_LAST_OPCODE];
+
+extern DECLARE_BITMAP(opcode_attribs[XX_LAST_OPCODE], A_ZZ_LASTATTRIB);
+
+extern void opcode_init(void);
+
+extern int opcode_which_immediate_is_extended(Opcode opcode);
+
+#endif
diff --git a/target/hexagon/opcodes.c b/target/hexagon/opcodes.c
new file mode 100644
index 000..20400f5
--- /dev/null
+++ b/target/hexagon/opcodes.c
@@ -0,0 +1,142 @@
+/*
+ *  Copyright(c) 2019-2020 Qualcomm Innovation Center, Inc. All Rights Reserved.
+ *
+ *  This program is free software; you can redistribute it and/or modify
+ *  it under the terms of the GNU General Public License as published by
+ *  the Free Software Foundation; either version 2 of the License, or
+ *  (at your option) any later version.
+ *
+ *  This program is distributed in the hope that it will be useful,
+ *  but WITHOUT ANY WARRANTY; without even the implied warranty of
+ *  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ *  GNU General Public License for more details.
+ *
+ *  You should have received a copy of the GNU General Public License
+ *  along with this program; if not, see .
+ */
+
+/*
+ * opcodes.c
+ *
+ * data tables generated automatically
+ * Maybe some functions too
+ */
+
+#include "qemu/osdep.h"
+#include "opcodes.h"
+#include "decode.h"
+
+#define VEC_DESCR(A, B, C) DESCR(A, B, C)
+#define DONAME(X) #X
+
+const char * const opcode_names[] = {
+#define OPCODE(IID) DONAME(IID)
+#include "opcodes_def_generated.h"
+NULL
+#undef OPCODE
+};
+
+const char * const opcode_reginfo[] = {
+#define IMMINFO(TAG, SIGN, SIZE, SHAMT, SIGN2, SIZE2, SHAMT2)/* nothing */
+#define REGINFO(TAG, REGINFO, RREGS, WREGS) REGINFO,
+#include "op_regs_generated.h"
+NULL
+#undef REGINFO
+#undef IMMINFO
+};
+
+
+const char * const opcode_rregs[] = {
+#define IMMINFO(TAG, SIGN, SIZE, SHAMT, SIGN2, SIZE2, SHAMT2)/* nothing */
+#define REGINFO(TAG, REGINFO, RREGS, WREGS) RREGS,
+#include "op_regs_generated.h"
+NULL
+#undef REGINFO
+#undef IMMINFO
+};
+
+
+const char * const opcode_wregs[] = {
+#define IMMINFO(TAG, SIGN, SIZE, SHAMT, SIGN2, SIZE2, SHAMT2)/* nothing */
+#define REGINFO(TAG, REGINFO, RREGS, WREGS) WREGS,
+#include "op_regs_generated.h"
+NULL
+#undef REGINFO
+#undef IMMINFO
+};
+
+const char * const opcode_short_semantics[] = {
+#define DEF_SHORTCODE(TAG, SHORTCODE)  [TAG] = #SHORTCODE,
+#include "shortcode_generated.h"
+#undef DEF_SHORTCODE
+NULL
+};
+
+DECLARE_BITMAP(opcode_attribs[XX_LAST_OPCODE], A_ZZ_LASTATTRIB);
+
+static void init_attribs(int tag, ...)
+{
+va_list ap;
+int attr;
+va_start(ap, tag);
+while ((attr = va_arg(ap, int)) != 0) {
+set_bit(attr, opcode_attribs[tag]);
+}
+}
+
+const OpcodeEncoding opcode_encodings[] = {
+#define DEF_ENC32(OPCODE, ENCSTR) \
+[OPCODE] = { .encoding = ENCSTR },
+
+#define DEF_ENC_SUBINSN(OPCODE, CLASS, ENCSTR) \
+[OPCODE] = { .encoding = ENCSTR, .enc_class = CLASS },
+
+#define DEF_EXT_ENC(OPCODE, CLASS, ENCSTR)

[RFC PATCH v5 06/33] Hexagon (target/hexagon) register names

2020-10-29 Thread Taylor Simpson
Signed-off-by: Taylor Simpson 
Reviewed-by: Richard Henderson 
---
 target/hexagon/hex_regs.h | 83 +++
 1 file changed, 83 insertions(+)
 create mode 100644 target/hexagon/hex_regs.h

diff --git a/target/hexagon/hex_regs.h b/target/hexagon/hex_regs.h
new file mode 100644
index 000..3b4249a
--- /dev/null
+++ b/target/hexagon/hex_regs.h
@@ -0,0 +1,83 @@
+/*
+ *  Copyright(c) 2019-2020 Qualcomm Innovation Center, Inc. All Rights Reserved.
+ *
+ *  This program is free software; you can redistribute it and/or modify
+ *  it under the terms of the GNU General Public License as published by
+ *  the Free Software Foundation; either version 2 of the License, or
+ *  (at your option) any later version.
+ *
+ *  This program is distributed in the hope that it will be useful,
+ *  but WITHOUT ANY WARRANTY; without even the implied warranty of
+ *  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ *  GNU General Public License for more details.
+ *
+ *  You should have received a copy of the GNU General Public License
+ *  along with this program; if not, see .
+ */
+
+#ifndef HEXAGON_REGS_H
+#define HEXAGON_REGS_H
+
+enum {
+HEX_REG_R00  = 0,
+HEX_REG_R01  = 1,
+HEX_REG_R02  = 2,
+HEX_REG_R03  = 3,
+HEX_REG_R04  = 4,
+HEX_REG_R05  = 5,
+HEX_REG_R06  = 6,
+HEX_REG_R07  = 7,
+HEX_REG_R08  = 8,
+HEX_REG_R09  = 9,
+HEX_REG_R10  = 10,
+HEX_REG_R11  = 11,
+HEX_REG_R12  = 12,
+HEX_REG_R13  = 13,
+HEX_REG_R14  = 14,
+HEX_REG_R15  = 15,
+HEX_REG_R16  = 16,
+HEX_REG_R17  = 17,
+HEX_REG_R18  = 18,
+HEX_REG_R19  = 19,
+HEX_REG_R20  = 20,
+HEX_REG_R21  = 21,
+HEX_REG_R22  = 22,
+HEX_REG_R23  = 23,
+HEX_REG_R24  = 24,
+HEX_REG_R25  = 25,
+HEX_REG_R26  = 26,
+HEX_REG_R27  = 27,
+HEX_REG_R28  = 28,
+HEX_REG_R29  = 29,
+HEX_REG_SP   = 29,
+HEX_REG_FP   = 30,
+HEX_REG_R30  = 30,
+HEX_REG_LR   = 31,
+HEX_REG_R31  = 31,
+HEX_REG_SA0  = 32,
+HEX_REG_LC0  = 33,
+HEX_REG_SA1  = 34,
+HEX_REG_LC1  = 35,
+HEX_REG_P3_0 = 36,
+HEX_REG_M0   = 38,
+HEX_REG_M1   = 39,
+HEX_REG_USR  = 40,
+HEX_REG_PC   = 41,
+HEX_REG_UGP  = 42,
+HEX_REG_GP   = 43,
+HEX_REG_CS0  = 44,
+HEX_REG_CS1  = 45,
+HEX_REG_UPCYCLELO= 46,
+HEX_REG_UPCYCLEHI= 47,
+HEX_REG_FRAMELIMIT   = 48,
+HEX_REG_FRAMEKEY = 49,
+HEX_REG_PKTCNTLO = 50,
+HEX_REG_PKTCNTHI = 51,
+/* Use reserved control registers for qemu execution counts */
+HEX_REG_QEMU_PKT_CNT  = 52,
+HEX_REG_QEMU_INSN_CNT = 53,
+HEX_REG_UTIMERLO  = 62,
+HEX_REG_UTIMERHI  = 63,
+};
+
+#endif
-- 
2.7.4



[RFC PATCH v5 03/33] Hexagon (include/elf.h) ELF machine definition

2020-10-29 Thread Taylor Simpson
Define EM_HEXAGON 164

Signed-off-by: Taylor Simpson 
Reviewed-by: Philippe Mathieu-Daudé 
Tested-by: Philippe Mathieu-Daudé 
Reviewed-by: Richard Henderson 
---
 include/elf.h | 1 +
 1 file changed, 1 insertion(+)

diff --git a/include/elf.h b/include/elf.h
index 7a418ee..f4fa3c1 100644
--- a/include/elf.h
+++ b/include/elf.h
@@ -176,6 +176,7 @@ typedef struct mips_elf_abiflags_v0 {
 
 #define EM_UNICORE32110 /* UniCore32 */
 
+#define EM_HEXAGON  164 /* Qualcomm Hexagon */
 #define EM_RX   173 /* Renesas RX family */
 
 #define EM_RISCV243 /* RISC-V */
-- 
2.7.4



[RFC PATCH v5 15/33] Hexagon (target/hexagon/arch.[ch]) utility functions

2020-10-29 Thread Taylor Simpson
Signed-off-by: Taylor Simpson 
---
 target/hexagon/arch.h |  35 ++
 target/hexagon/arch.c | 294 ++
 2 files changed, 329 insertions(+)
 create mode 100644 target/hexagon/arch.h
 create mode 100644 target/hexagon/arch.c

diff --git a/target/hexagon/arch.h b/target/hexagon/arch.h
new file mode 100644
index 000..cf14480
--- /dev/null
+++ b/target/hexagon/arch.h
@@ -0,0 +1,35 @@
+/*
+ *  Copyright(c) 2019-2020 Qualcomm Innovation Center, Inc. All Rights Reserved.
+ *
+ *  This program is free software; you can redistribute it and/or modify
+ *  it under the terms of the GNU General Public License as published by
+ *  the Free Software Foundation; either version 2 of the License, or
+ *  (at your option) any later version.
+ *
+ *  This program is distributed in the hope that it will be useful,
+ *  but WITHOUT ANY WARRANTY; without even the implied warranty of
+ *  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ *  GNU General Public License for more details.
+ *
+ *  You should have received a copy of the GNU General Public License
+ *  along with this program; if not, see .
+ */
+
+#ifndef HEXAGON_ARCH_H
+#define HEXAGON_ARCH_H
+
+#include "qemu/osdep.h"
+#include "qemu/int128.h"
+
+extern uint64_t interleave(uint32_t odd, uint32_t even);
+extern uint64_t deinterleave(uint64_t src);
+extern uint32_t carry_from_add64(uint64_t a, uint64_t b, uint32_t c);
+extern int32_t conv_round(int32_t a, int n);
+extern void arch_fpop_start(CPUHexagonState *env);
+extern void arch_fpop_end(CPUHexagonState *env);
+extern int arch_sf_recip_common(float32 *Rs, float32 *Rt, float32 *Rd,
+int *adjust, float_status *fp_status);
+extern int arch_sf_invsqrt_common(float32 *Rs, float32 *Rd, int *adjust,
+  float_status *fp_status);
+
+#endif
diff --git a/target/hexagon/arch.c b/target/hexagon/arch.c
new file mode 100644
index 000..16002bf
--- /dev/null
+++ b/target/hexagon/arch.c
@@ -0,0 +1,294 @@
+/*
+ *  Copyright(c) 2019-2020 Qualcomm Innovation Center, Inc. All Rights Reserved.
+ *
+ *  This program is free software; you can redistribute it and/or modify
+ *  it under the terms of the GNU General Public License as published by
+ *  the Free Software Foundation; either version 2 of the License, or
+ *  (at your option) any later version.
+ *
+ *  This program is distributed in the hope that it will be useful,
+ *  but WITHOUT ANY WARRANTY; without even the implied warranty of
+ *  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ *  GNU General Public License for more details.
+ *
+ *  You should have received a copy of the GNU General Public License
+ *  along with this program; if not, see .
+ */
+
+#include "qemu/osdep.h"
+#include "fpu/softfloat.h"
+#include "cpu.h"
+#include "fma_emu.h"
+#include "arch.h"
+#include "macros.h"
+
+#define SF_BIAS127
+#define SF_MAXEXP  254
+#define SF_MANTBITS23
+#define float32_nanmake_float32(0x)
+
+#define BITS_MASK_8 0xULL
+#define PAIR_MASK_8 0xULL
+#define NYBL_MASK_8 0x0f0f0f0f0f0f0f0fULL
+#define BYTE_MASK_8 0x00ff00ff00ff00ffULL
+#define HALF_MASK_8 0xULL
+#define WORD_MASK_8 0xULL
+
+uint64_t interleave(uint32_t odd, uint32_t even)
+{
+/* Convert to long long */
+uint64_t myodd = odd;
+uint64_t myeven = even;
+/* First, spread bits out */
+myodd = (myodd | (myodd << 16)) & HALF_MASK_8;
+myeven = (myeven | (myeven << 16)) & HALF_MASK_8;
+myodd = (myodd | (myodd << 8)) & BYTE_MASK_8;
+myeven = (myeven | (myeven << 8)) & BYTE_MASK_8;
+myodd = (myodd | (myodd << 4)) & NYBL_MASK_8;
+myeven = (myeven | (myeven << 4)) & NYBL_MASK_8;
+myodd = (myodd | (myodd << 2)) & PAIR_MASK_8;
+myeven = (myeven | (myeven << 2)) & PAIR_MASK_8;
+myodd = (myodd | (myodd << 1)) & BITS_MASK_8;
+myeven = (myeven | (myeven << 1)) & BITS_MASK_8;
+/* Now OR together */
+return myeven | (myodd << 1);
+}
+
+uint64_t deinterleave(uint64_t src)
+{
+/* Get odd and even bits */
+uint64_t myodd = ((src >> 1) & BITS_MASK_8);
+uint64_t myeven = (src & BITS_MASK_8);
+
+/* Unspread bits */
+myeven = (myeven | (myeven >> 1)) & PAIR_MASK_8;
+myodd = (myodd | (myodd >> 1)) & PAIR_MASK_8;
+myeven = (myeven | (myeven >> 2)) & NYBL_MASK_8;
+myodd = (myodd | (myodd >> 2)) & NYBL_MASK_8;
+myeven = (myeven | (myeven >> 4)) & BYTE_MASK_8;
+myodd = (myodd | (myodd >> 4)) & BYTE_MASK_8;
+myeven = (myeven | (myeven >> 8)) & HALF_MASK_8;
+myodd = (myodd | (myodd >> 8)) & HALF_MASK_8;
+myeven = (myeven | (myeven >> 16)) & WORD_MASK_8;
+myodd = (myodd | (myodd >> 16)) & WORD_MASK_8;
+
+/* Return odd bits in upper half */
+return myeven | (myodd << 32);
+}
+
+uint32_t carry_from_add64(uint64_t a, uint64_t b, uint

[PATCH v2 5/8] target/sparc/translate: silence the compiler warnings

2020-10-29 Thread Chen Qun
When using -Wimplicit-fallthrough in our CFLAGS, the compiler shows the following warning:
target/sparc/translate.c: In function ‘gen_st_asi’:
target/sparc/translate.c:2320:12: warning: this statement may fall through [-Wimplicit-fallthrough=]
 2320 | if (!(dc->def->features & CPU_FEATURE_HYPV)) {
  |^
target/sparc/translate.c:2329:5: note: here
 2329 | case GET_ASI_DIRECT:
  | ^~~~

The compiler fails to recognize the "fall through" comment because the #endif separates it from the case label; move the comment after the #endif so it immediately precedes the case label.

Reported-by: Euler Robot 
Signed-off-by: Chen Qun 
Reviewed-by: Artyom Tarasenko 
Reviewed-by: Philippe Mathieu-Daudé 
---
Cc: Mark Cave-Ayland 
Cc: Artyom Tarasenko 
---
 target/sparc/translate.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/target/sparc/translate.c b/target/sparc/translate.c
index 1a4efd4ed6..a3d9aaa46b 100644
--- a/target/sparc/translate.c
+++ b/target/sparc/translate.c
@@ -2324,8 +2324,8 @@ static void gen_st_asi(DisasContext *dc, TCGv src, TCGv addr,
 }
 /* in OpenSPARC T1+ CPUs TWINX ASIs in store instructions
  * are ST_BLKINIT_ ASIs */
-/* fall through */
 #endif
+/* fall through */
 case GET_ASI_DIRECT:
 gen_address_mask(dc, addr);
 tcg_gen_qemu_st_tl(src, addr, da.mem_idx, da.memop);
-- 
2.27.0




[RFC PATCH v5 02/33] Hexagon (target/hexagon) README

2020-10-29 Thread Taylor Simpson
Gives an introduction and overview of the Hexagon target

Signed-off-by: Taylor Simpson 
---
 target/hexagon/README | 235 ++
 1 file changed, 235 insertions(+)
 create mode 100644 target/hexagon/README

diff --git a/target/hexagon/README b/target/hexagon/README
new file mode 100644
index 000..1d48eee
--- /dev/null
+++ b/target/hexagon/README
@@ -0,0 +1,235 @@
+Hexagon is Qualcomm's very long instruction word (VLIW) digital signal
+processor (DSP).
+
+The following versions of the Hexagon core are supported
+Scalar core: v67
+
https://developer.qualcomm.com/downloads/qualcomm-hexagon-v67-programmer-s-reference-manual
+
+We presented an overview of the project at the 2019 KVM Forum.
+
https://kvmforum2019.sched.com/event/Tmwc/qemu-hexagon-automatic-translation-of-the-isa-manual-pseudcode-to-tiny-code-instructions-of-a-vliw-architecture-niccolo-izzo-revng-taylor-simpson-qualcomm-innovation-center
+
+*** Tour of the code ***
+
+The qemu-hexagon implementation is a combination of qemu and the Hexagon
+architecture library (aka archlib).  The three primary directories with
+Hexagon-specific code are
+
+qemu/target/hexagon
+This has all the instruction and packet semantics
+qemu/target/hexagon/imported
+These files are imported with very little modification from archlib
+*.idef  Instruction semantics definition
+macros.def  Mapping of macros to instruction attributes
+encode*.def Encoding patterns for each instruction
+iclass.def  Instruction class definitions used to determine
+legal VLIW slots for each instruction
+qemu/linux-user/hexagon
+Helpers for loading the ELF file and making Linux system calls,
+signals, etc
+
+We start with scripts that generate a bunch of include files.  This
+is a two step process.  The first step is to use the C preprocessor to expand
+macros inside the architecture definition files.  This is done in
+target/hexagon/gen_semantics.c.  This step produces
+/target/hexagon/semantics_generated.pyinc.
+That file is consumed by the following python scripts to produce the indicated
+header files in /target/hexagon
+gen_opcodes_def.py  -> opcodes_def_generated.h
+gen_op_regs.py  -> op_regs_generated.h
+gen_printinsn.py-> printinsn_generated.h
+gen_op_attribs.py   -> op_attribs_generated.h
+gen_helper_protos.py-> helper_protos_generated.h
+gen_shortcode.py-> shortcode_generated.h
+gen_tcg_funcs.py-> tcg_funcs_generated.h
+gen_tcg_func_table.py   -> tcg_func_table_generated.h
+gen_helper_funcs.py -> helper_funcs_generated.h
+
+QEMU helper functions have 3 parts
+DEF_HELPER declaration indicates the signature of the helper
+gen_helper_ will generate a TCG call to the helper function
+The helper implementation
+
+Here's an example of the A2_add instruction.
+Instruction tagA2_add
+Assembly syntax"Rd32=add(Rs32,Rt32)"
+Instruction semantics  "{ RdV=RsV+RtV;}"
+
+By convention, the operands are identified by letter
+RdV is the destination register
+RsV, RtV are source registers
+
+The generator uses the operand naming conventions (see large comment in
+hex_common.py) to determine the signature of the helper function.  Here are the
+results for A2_add
+
+helper_protos_generated.h
+DEF_HELPER_3(A2_add, s32, env, s32, s32)
+
+tcg_funcs_generated.h
+static void generate_A2_add(
+CPUHexagonState *env,
+DisasContext *ctx,
+Insn *insn,
+Packet *pkt)
+{
+TCGv RdV = tcg_temp_local_new();
+const int RdN = insn->regno[0];
+TCGv RsV = hex_gpr[insn->regno[1]];
+TCGv RtV = hex_gpr[insn->regno[2]];
+gen_helper_A2_add(RdV, cpu_env, RsV, RtV);
+gen_log_reg_write(RdN, RdV);
+ctx_log_reg_write(ctx, RdN);
+tcg_temp_free(RdV);
+}
+
+helper_funcs_generated.h
+int32_t HELPER(A2_add)(CPUHexagonState *env, int32_t RsV, int32_t RtV)
+{
+uint32_t slot __attribute__((unused)) = 4;
+int32_t RdV = 0;
+{ RdV=RsV+RtV;}
+return RdV;
+}
+
+Note that generate_A2_add updates the disassembly context to be processed
+when the packet commits (see "Packet Semantics" below).
+
+The generator checks for fGEN_TCG_ macro.  This allows us to generate
+TCG code instead of a call to the helper.  If defined, the macro takes 1
+argument.
+C semantics (aka short code)
+
+This allows the code generator to override the auto-generated code.  In some
+cases this is necessary for correct execution.  We can also override for
+faster emulation.  For example, calling a helper for add is more 

[RFC PATCH v5 08/33] Hexagon (target/hexagon) GDB Stub

2020-10-29 Thread Taylor Simpson
GDB register read and write routines

Signed-off-by: Taylor Simpson 
Reviewed-by: Richard Henderson 
---
 target/hexagon/internal.h |  3 +++
 target/hexagon/cpu.c  |  2 ++
 target/hexagon/gdbstub.c  | 47 +++
 3 files changed, 52 insertions(+)
 create mode 100644 target/hexagon/gdbstub.c

diff --git a/target/hexagon/internal.h b/target/hexagon/internal.h
index 327bad9..961318a 100644
--- a/target/hexagon/internal.h
+++ b/target/hexagon/internal.h
@@ -29,6 +29,9 @@
 } \
 } while (0)
 
+extern int hexagon_gdb_read_register(CPUState *cpu, GByteArray *buf, int reg);
+extern int hexagon_gdb_write_register(CPUState *cpu, uint8_t *buf, int reg);
+
 extern void hexagon_debug(CPUHexagonState *env);
 
 extern const char * const hexagon_regnames[TOTAL_PER_THREAD_REGS];
diff --git a/target/hexagon/cpu.c b/target/hexagon/cpu.c
index 5e0da3f..32aa982 100644
--- a/target/hexagon/cpu.c
+++ b/target/hexagon/cpu.c
@@ -280,6 +280,8 @@ static void hexagon_cpu_class_init(ObjectClass *c, void 
*data)
 cc->dump_state = hexagon_dump_state;
 cc->set_pc = hexagon_cpu_set_pc;
 cc->synchronize_from_tb = hexagon_cpu_synchronize_from_tb;
+cc->gdb_read_register = hexagon_gdb_read_register;
+cc->gdb_write_register = hexagon_gdb_write_register;
 cc->gdb_num_core_regs = TOTAL_PER_THREAD_REGS;
 cc->gdb_stop_before_watchpoint = true;
 cc->disas_set_info = hexagon_cpu_disas_set_info;
diff --git a/target/hexagon/gdbstub.c b/target/hexagon/gdbstub.c
new file mode 100644
index 000..e8c10b2
--- /dev/null
+++ b/target/hexagon/gdbstub.c
@@ -0,0 +1,47 @@
+/*
+ *  Copyright(c) 2019-2020 Qualcomm Innovation Center, Inc. All Rights 
Reserved.
+ *
+ *  This program is free software; you can redistribute it and/or modify
+ *  it under the terms of the GNU General Public License as published by
+ *  the Free Software Foundation; either version 2 of the License, or
+ *  (at your option) any later version.
+ *
+ *  This program is distributed in the hope that it will be useful,
+ *  but WITHOUT ANY WARRANTY; without even the implied warranty of
+ *  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ *  GNU General Public License for more details.
+ *
+ *  You should have received a copy of the GNU General Public License
+ *  along with this program; if not, see .
+ */
+
+#include "qemu/osdep.h"
+#include "qemu-common.h"
+#include "exec/gdbstub.h"
+#include "cpu.h"
+#include "internal.h"
+
+int hexagon_gdb_read_register(CPUState *cs, GByteArray *mem_buf, int n)
+{
+HexagonCPU *cpu = HEXAGON_CPU(cs);
+CPUHexagonState *env = &cpu->env;
+
+if (n < TOTAL_PER_THREAD_REGS) {
+return gdb_get_regl(mem_buf, env->gpr[n]);
+}
+
+g_assert_not_reached();
+}
+
+int hexagon_gdb_write_register(CPUState *cs, uint8_t *mem_buf, int n)
+{
+HexagonCPU *cpu = HEXAGON_CPU(cs);
+CPUHexagonState *env = &cpu->env;
+
+if (n < TOTAL_PER_THREAD_REGS) {
+env->gpr[n] = ldtul_p(mem_buf);
+return sizeof(target_ulong);
+}
+
+g_assert_not_reached();
+}
-- 
2.7.4



[PATCH v2 4/8] linux-user/mips/cpu_loop: silence the compiler warnings

2020-10-29 Thread Chen Qun
When -Wimplicit-fallthrough is enabled in our CFLAGS, the compiler shows these warnings:
linux-user/mips/cpu_loop.c: In function ‘cpu_loop’:
linux-user/mips/cpu_loop.c:104:24: warning: this statement may fall through [-Wimplicit-fallthrough=]
  104 | if ((ret = get_user_ual(arg8, sp_reg + 28)) != 0) {
  |^
linux-user/mips/cpu_loop.c:107:17: note: here
  107 | case 7:
  | ^~~~
linux-user/mips/cpu_loop.c:108:24: warning: this statement may fall through [-Wimplicit-fallthrough=]
  108 | if ((ret = get_user_ual(arg7, sp_reg + 24)) != 0) {
  |^
linux-user/mips/cpu_loop.c:111:17: note: here
  111 | case 6:
  | ^~~~
linux-user/mips/cpu_loop.c:112:24: warning: this statement may fall through [-Wimplicit-fallthrough=]
  112 | if ((ret = get_user_ual(arg6, sp_reg + 20)) != 0) {
  |^
linux-user/mips/cpu_loop.c:115:17: note: here
  115 | case 5:
  | ^~~~

Add the corresponding "fall through" comment to fix it.

Reported-by: Euler Robot 
Signed-off-by: Chen Qun 
Reviewed-by: Thomas Huth 
---
Cc: Laurent Vivier 
---
 linux-user/mips/cpu_loop.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/linux-user/mips/cpu_loop.c b/linux-user/mips/cpu_loop.c
index 553e8ca7f5..cfe7ba5c47 100644
--- a/linux-user/mips/cpu_loop.c
+++ b/linux-user/mips/cpu_loop.c
@@ -104,18 +104,22 @@ void cpu_loop(CPUMIPSState *env)
 if ((ret = get_user_ual(arg8, sp_reg + 28)) != 0) {
 goto done_syscall;
 }
+/* fall through */
 case 7:
 if ((ret = get_user_ual(arg7, sp_reg + 24)) != 0) {
 goto done_syscall;
 }
+/* fall through */
 case 6:
 if ((ret = get_user_ual(arg6, sp_reg + 20)) != 0) {
 goto done_syscall;
 }
+/* fall through */
 case 5:
 if ((ret = get_user_ual(arg5, sp_reg + 16)) != 0) {
 goto done_syscall;
 }
+/* fall through */
 default:
 break;
 }
-- 
2.27.0




[RFC PATCH v5 33/33] Add Dockerfile for hexagon

2020-10-29 Thread Taylor Simpson
Signed-off-by: Alessandro Di Federico 
Signed-off-by: Taylor Simpson 
---
 .../debian-hexagon-cross.build-toolchain.sh| 141 +
 .../docker/dockerfiles/debian-hexagon-cross.docker |  18 +++
 2 files changed, 159 insertions(+)
 create mode 100755 tests/docker/dockerfiles/debian-hexagon-cross.build-toolchain.sh
 create mode 100644 tests/docker/dockerfiles/debian-hexagon-cross.docker

diff --git a/tests/docker/dockerfiles/debian-hexagon-cross.build-toolchain.sh b/tests/docker/dockerfiles/debian-hexagon-cross.build-toolchain.sh
new file mode 100755
index 000..a08c6cd
--- /dev/null
+++ b/tests/docker/dockerfiles/debian-hexagon-cross.build-toolchain.sh
@@ -0,0 +1,141 @@
+#!/bin/bash
+
+set -e
+
+BASE=$(readlink -f ${PWD})
+
+TOOLCHAIN_INSTALL=$(readlink -f "$TOOLCHAIN_INSTALL")
+ROOTFS=$(readlink -f "$ROOTFS")
+
+TOOLCHAIN_BIN=${TOOLCHAIN_INSTALL}/bin
+HEX_SYSROOT=${TOOLCHAIN_INSTALL}/hexagon-unknown-linux-musl
+HEX_TOOLS_TARGET_BASE=${HEX_SYSROOT}/usr
+
+function cdp() {
+  DIR="$1"
+  mkdir -p "$DIR"
+  cd "$DIR"
+}
+
+function fetch() {
+  DIR="$1"
+  URL="$2"
+  TEMP="$(readlink -f "$PWD/tmp.tar.gz")"
+  wget --quiet "$URL" -O "$TEMP"
+  cdp "$DIR"
+  tar xaf "$TEMP" --strip-components=1
+  rm "$TEMP"
+  cd -
+}
+
+build_llvm_clang() {
+  fetch "$BASE/llvm-project" "$LLVM_URL"
+  cdp "$BASE/build-llvm"
+
+  cmake -G Ninja \
+-DCMAKE_BUILD_TYPE=Release \
+-DCMAKE_INSTALL_PREFIX=${TOOLCHAIN_INSTALL} \
+-DLLVM_ENABLE_LLD=ON \
+-DLLVM_TARGETS_TO_BUILD="X86;Hexagon" \
+-DLLVM_ENABLE_PROJECTS="clang;lld" \
+"$BASE/llvm-project/llvm"
+  ninja all install
+  cd ${TOOLCHAIN_BIN}
+  ln -sf clang hexagon-unknown-linux-musl-clang
+  ln -sf clang++ hexagon-unknown-linux-musl-clang++
+  ln -sf llvm-ar hexagon-unknown-linux-musl-ar
+  ln -sf llvm-objdump hexagon-unknown-linux-musl-objdump
+  ln -sf llvm-objcopy hexagon-unknown-linux-musl-objcopy
+  ln -sf llvm-readelf hexagon-unknown-linux-musl-readelf
+  ln -sf llvm-ranlib hexagon-unknown-linux-musl-ranlib
+
+  # workaround for now:
+  cat < hexagon-unknown-linux-musl.cfg
+-G0 --sysroot=${HEX_SYSROOT}
+EOF
+}
+
+build_clang_rt() {
+  cdp "$BASE/build-clang_rt"
+  cmake -G Ninja \
+-DCMAKE_BUILD_TYPE=Release \
+-DLLVM_CONFIG_PATH="$BASE/build-llvm/bin/llvm-config" \
+-DCMAKE_ASM_FLAGS="-G0 -mlong-calls -fno-pic --target=hexagon-unknown-linux-musl " \
+-DCMAKE_SYSTEM_NAME=Linux \
+-DCMAKE_C_COMPILER="${TOOLCHAIN_BIN}/hexagon-unknown-linux-musl-clang" \
+-DCMAKE_ASM_COMPILER="${TOOLCHAIN_BIN}/hexagon-unknown-linux-musl-clang" \
+-DCMAKE_INSTALL_PREFIX=${HEX_TOOLS_TARGET_BASE} \
+-DCMAKE_CROSSCOMPILING=ON \
+-DCMAKE_C_COMPILER_FORCED=ON \
+-DCMAKE_CXX_COMPILER_FORCED=ON \
+-DCOMPILER_RT_BUILD_BUILTINS=ON \
+-DCOMPILER_RT_BUILTINS_ENABLE_PIC=OFF \
+-DCMAKE_SIZEOF_VOID_P=4 \
+-DCOMPILER_RT_OS_DIR= \
+-DCAN_TARGET_hexagon=1 \
+-DCAN_TARGET_x86_64=0 \
+-DCOMPILER_RT_SUPPORTED_ARCH=hexagon \
+-DLLVM_ENABLE_PROJECTS="compiler-rt" \
+"$BASE/llvm-project/compiler-rt"
+  ninja install-compiler-rt
+}
+
+build_musl_headers() {
+  fetch "$BASE/musl" "$MUSL_URL"
+  cd "$BASE/musl"
+  make clean
+  CC=${TOOLCHAIN_BIN}/hexagon-unknown-linux-musl-clang \
+CROSS_COMPILE=hexagon-unknown-linux-musl \
+LIBCC=${HEX_TOOLS_TARGET_BASE}/lib/libclang_rt.builtins-hexagon.a \
+CROSS_CFLAGS="-G0 -O0 -mv65 -fno-builtin -fno-rounding-math --target=hexagon-unknown-linux-musl" \
+./configure --target=hexagon --prefix=${HEX_TOOLS_TARGET_BASE}
+  PATH=${TOOLCHAIN_BIN}:$PATH make CROSS_COMPILE= install-headers
+
+  cd ${HEX_SYSROOT}/..
+  ln -sf hexagon-unknown-linux-musl hexagon
+}
+
+build_kernel_headers() {
+  fetch "$BASE/linux" "$LINUX_URL"
+  mkdir -p "$BASE/build-linux"
+  cd "$BASE/linux"
+  make O=../build-linux ARCH=hexagon \
+   KBUILD_CFLAGS_KERNEL="-mlong-calls" \
+   CC=${TOOLCHAIN_BIN}/hexagon-unknown-linux-musl-clang \
+   LD=${TOOLCHAIN_BIN}/ld.lld \
+   KBUILD_VERBOSE=1 comet_defconfig
+  make mrproper
+
+  cd "$BASE/build-linux"
+  make \
+ARCH=hexagon \
+CC=${TOOLCHAIN_BIN}/clang \
+INSTALL_HDR_PATH=${HEX_TOOLS_TARGET_BASE} \
+V=1 \
+headers_install
+}
+
+build_musl() {
+  cd "$BASE/musl"
+  make clean
+  CROSS_COMPILE=hexagon-unknown-linux-musl- \
+AR=llvm-ar \
+RANLIB=llvm-ranlib \
+STRIP=llvm-strip \
+CC=clang \
+LIBCC=${HEX_TOOLS_TARGET_BASE}/lib/libclang_rt.builtins-hexagon.a \
+CFLAGS="-G0 -O0 -mv65 -fno-builtin -fno-rounding-math --target=hexagon-unknown-linux-musl" \
+./configure --target=hexagon --prefix=${HEX_TOOLS_TARGET_BASE}
+  PATH=${TOOLCHAIN_BIN}/:$PATH make -j CROSS_COMPILE= install
+  cd ${HEX_TOOLS_TARGET_BASE}/lib
+  ln -sf libc.so ld-musl-hexagon.so
+  ln -sf ld-musl-hexagon.so ld-musl-hexagon.so.1
+  cdp ${HEX_TOOLS_TARGET_BASE}/../lib
+  ln -sf ../usr/lib/ld-musl-hexagon.so.1
+}
+
+build_llvm_clang
+build_kernel_headers
+build_musl_headers
+build_clang_rt
+build_musl
diff 

[PATCH v2 0/8] silence the compiler warnings

2020-10-29 Thread Chen Qun
Since v1:
- Patch1: Add comments to explain the two cases of fall through. Addressed
Richard Henderson's and Thomas Huth's review comments.
- Patch2: Addressed Peter Maydell review comment.
- Patch3: Add QEMU_NORETURN to cpu_exit_tb_from_sighandler() function to avoid 
the compiler warnings.
- Patch4: Addressed Thomas Huth review comment.
- Patch5: Addressed Artyom Tarasenko and Philippe Mathieu-Daudé review comment.
- Patch6: Combine the /* fall through */ into the preceding comments. Addressed
Artyom Tarasenko's review comment.
- Patch7: Add a "break" statement here instead of /* fall through */ comments.
- Patch8: Replace the TODO by a LOG_UNIMP call and add break statement
- Patch9: Dropped this patch since a patch to fix this issue already
exists (https://lore.kernel.org/qemu-devel/20200711154242.41222-1-ysato@users)



Chen Qun (8):
  target/i386: silence the compiler warnings in gen_shiftd_rm_T1
  hw/intc/arm_gicv3_kvm: silence the compiler warnings
  accel/tcg/user-exec: silence the compiler warnings
  linux-user/mips/cpu_loop: silence the compiler warnings
  target/sparc/translate: silence the compiler warnings
  target/sparc/win_helper: silence the compiler warnings
  ppc: Add a missing break for PPC6xx_INPUT_TBEN
  target/ppc: replaced the TODO with LOG_UNIMP and add break for silence
warnings

 accel/tcg/user-exec.c  | 3 ++-
 hw/intc/arm_gicv3_kvm.c| 8 
 hw/ppc/ppc.c   | 1 +
 linux-user/mips/cpu_loop.c | 4 ++++
 target/i386/translate.c| 7 +--
 target/ppc/mmu_helper.c| 5 +++--
 target/sparc/translate.c   | 2 +-
 target/sparc/win_helper.c  | 2 +-
 8 files changed, 25 insertions(+), 7 deletions(-)

-- 
2.27.0


