Re: [PATCH v2] KVM: Specify byte order for KVM_EXIT_MMIO

2014-01-25 Thread Victor Kamensky
On 25 January 2014 19:46, Victor Kamensky  wrote:
> On 25 January 2014 10:31, Christoffer Dall  
> wrote:
>> On Sat, Jan 25, 2014 at 04:23:00PM +, Peter Maydell wrote:
>>> On 25 January 2014 02:15, Alexander Graf  wrote:
>>> > Ok, let's go through the combinations for a 32-bit write of 0x01020304 on 
>>> > PPC and what data[] looks like
>
> Alex, could you please for the clarification purpose, tell
> us what instructions BE guest and LE guest run in below
> cases. Suppose r0 register has device address, and r1
> register has word as 0x01020304. And suppose we have
> regular memory instead of io at r0 address. How do you
> see 'unsigned char data[4]' will look like if 'data'
> address is at r0.
>
> For this, for a moment, I think, we can forget about KVM,
> just consider BE guest and LE guest running on regular
> h/w.
>
> Next we will consider cases as below when the same
> BE guest and LE guest runs on top of BE host or LE
> host KVM in different proposals combinations. Effectively
> defining non-virtual examples first and consider how they
> will work under KVM.
>
> I was thinking that that you consider
>
> BE guest:
>stw r1, 0, r0
> just write r1 at r0 address
>
> LE guest
>stwbrx r1, 0, r0
> stwbrx - word byteswap r1 value and write it at r0 address
>
> which would produce at r0 address
> data[] = {0x01, 0x02, 0x03, 0x04}
> for both above cases, and have the same effect
> on device if its register is mapped at r0 address.
>
> But judging by your other response in previous email, I
> think you have something different in mind. Please
> describe it.

Alex, I reread your tables again. I think that just
'stw r1, 0, r0' for both LE and BE will match them. As
far as data[] concerned in BE it will be data[] = {0x01,
0x02, 0x03, 0x04}, and in LE it wil lbe data[] = {0x04,
0x03, 0x02, 0x01}.

Do we agree that these are two different memory
transactions, as far as device concerned?

> Thanks,
> Victor
>
>>> > your proposal:
>>> >
>>> >   BE guest, BE host: { 0x01, 0x02, 0x03, 0x04 }
>>> >   LE guest, BE host: { 0x04, 0x03, 0x02, 0x01 }
>>> >   BE guest, LE host:  { 0x01, 0x02, 0x03, 0x04 }
>>> >   LE guest, LE host:  { 0x04, 0x03, 0x02, 0x01 }
>>> >
>>> > -> ldw_p() will give us the correct value to work with
>>> >
>>> > current proposal:
>>> >
>>> >   BE guest, BE host: { 0x01, 0x02, 0x03, 0x04 }
>>> >   LE guest, BE host: { 0x04, 0x03, 0x02, 0x01 }
>>> >   BE guest, LE host:  { 0x04, 0x03, 0x02, 0x01 }
>>> >   LE guest, LE host:  { 0x01, 0x02, 0x03, 0x04 }
>>> >
>>> > -> *(uint32_t*)data will give us the correct value to work with
>>>
>>> For completeness, ditto, ARM:
>>> Scott's proposal (you end up with the same values in
>>> the data array as for PPC, so userspace has to know the
>>> "default" endianness so it can do a byteswap

Actually userland need to mange mismatch between
native internal data of simulated device and device
endianity. If native endianity mismatches device qemu
need to do byteswap. Otherwise it will be just copy it
into mmio.data as bytes array.
In qemu there is already infrastructure that handles
that. My ARM BE KVM patches use Scott's mmio.
data[] definition and I did not do anything in qemu
to make that work.

>>> if the
>>> process endianness doesn't match it; on QEMU
>>> ldl_p() handles this for us, as you say):
>>>
>>>BE guest, BE host: { 0x01, 0x02, 0x03, 0x04 }
>>>LE guest, BE host: { 0x04, 0x03, 0x02, 0x01 }
>>>BE guest, LE host:  { 0x01, 0x02, 0x03, 0x04 }
>>>LE guest, LE host:  { 0x04, 0x03, 0x02, 0x01 }

Peter, thank you for adding ARM case. It really
helps to see the difference between definitions.

Don't you agree that Scott's definition has advantage
because its interpretation does not depend on
CPU type?

It is always the same for the same 'device access
operation', guest endianity, host endianity. If one
look at 'stw r1, 0, r0' in BE guest, and 'stwbrx r1,
0, r0' in LE guest, which is the same memory
transaction as device concerned. Table using
Scott's definition will always look like

LE guest, BE host: { 0x01, 0x02, 0x03, 0x04 }
LE guest, BE host: { 0x01, 0x02, 0x03, 0x04 }
BE guest, LE host: { 0x01, 0x02, 0x03, 0x04 }
LE guest, LE host: { 0x01, 0x02, 0x03, 0x04 }

if ARM would work with this device to do the
same thing, its table will look exactly the same.

>>>
>>> current proposal:
>>>
>>>BE guest, BE host: { 0x04, 0x03, 0x02, 0x01 }
>>>LE guest, BE host: { 0x01, 0x02, 0x03, 0x04 }
>>>BE guest, LE host:  { 0x01, 0x02, 0x03, 0x04 }
>>>LE guest, LE host:  { 0x04, 0x03, 0x02, 0x01 }
>>>
>>> The values in the data array are different than on
>>> PPC, reflecting the fact that the "default" endianness
>>> is different; userspace does
>>>
>>>  -> *(uint32_t*)data will give us the correct value to work with
>>>
>>> regardless of what the guest CPU arch is.
>>>
>>> > There are pros and cons for both approaches.
>>> >
>>> > Pro approach 1 is that it fits the way data[] is read today,
>>> > so no QEMU changes are required. However, it

Re: [PATCH v2] KVM: Specify byte order for KVM_EXIT_MMIO

2014-01-25 Thread Victor Kamensky
On 25 January 2014 10:31, Christoffer Dall  wrote:
> On Sat, Jan 25, 2014 at 04:23:00PM +, Peter Maydell wrote:
>> On 25 January 2014 02:15, Alexander Graf  wrote:
>> > Ok, let's go through the combinations for a 32-bit write of 0x01020304 on 
>> > PPC and what data[] looks like

Alex, could you please for the clarification purpose, tell
us what instructions BE guest and LE guest run in below
cases. Suppose r0 register has device address, and r1
register has word as 0x01020304. And suppose we have
regular memory instead of io at r0 address. How do you
see 'unsigned char data[4]' will look like if 'data'
address is at r0.

For this, for a moment, I think, we can forget about KVM,
just consider BE guest and LE guest running on regular
h/w.

Next we will consider cases as below when the same
BE guest and LE guest runs on top of BE host or LE
host KVM in different proposals combinations. Effectively
defining non-virtual examples first and consider how they
will work under KVM.

I was thinking that that you consider

BE guest:
   stw r1, 0, r0
just write r1 at r0 address

LE guest
   stwbrx r1, 0, r0
stwbrx - word byteswap r1 value and write it at r0 address

which would produce at r0 address
data[] = {0x01, 0x02, 0x03, 0x04}
for both above cases, and have the same effect
on device if its register is mapped at r0 address.

But judging by your other response in previous email, I
think you have something different in mind. Please
describe it.

Thanks,
Victor

>> > your proposal:
>> >
>> >   BE guest, BE host: { 0x01, 0x02, 0x03, 0x04 }
>> >   LE guest, BE host: { 0x04, 0x03, 0x02, 0x01 }
>> >   BE guest, LE host:  { 0x01, 0x02, 0x03, 0x04 }
>> >   LE guest, LE host:  { 0x04, 0x03, 0x02, 0x01 }
>> >
>> > -> ldw_p() will give us the correct value to work with
>> >
>> > current proposal:
>> >
>> >   BE guest, BE host: { 0x01, 0x02, 0x03, 0x04 }
>> >   LE guest, BE host: { 0x04, 0x03, 0x02, 0x01 }
>> >   BE guest, LE host:  { 0x04, 0x03, 0x02, 0x01 }
>> >   LE guest, LE host:  { 0x01, 0x02, 0x03, 0x04 }
>> >
>> > -> *(uint32_t*)data will give us the correct value to work with
>>
>> For completeness, ditto, ARM:
>> Scott's proposal (you end up with the same values in
>> the data array as for PPC, so userspace has to know the
>> "default" endianness so it can do a byteswap if the
>> process endianness doesn't match it; on QEMU
>> ldl_p() handles this for us, as you say):
>>
>>BE guest, BE host: { 0x01, 0x02, 0x03, 0x04 }
>>LE guest, BE host: { 0x04, 0x03, 0x02, 0x01 }
>>BE guest, LE host:  { 0x01, 0x02, 0x03, 0x04 }
>>LE guest, LE host:  { 0x04, 0x03, 0x02, 0x01 }
>>
>> current proposal:
>>
>>BE guest, BE host: { 0x04, 0x03, 0x02, 0x01 }
>>LE guest, BE host: { 0x01, 0x02, 0x03, 0x04 }
>>BE guest, LE host:  { 0x01, 0x02, 0x03, 0x04 }
>>LE guest, LE host:  { 0x04, 0x03, 0x02, 0x01 }
>>
>> The values in the data array are different than on
>> PPC, reflecting the fact that the "default" endianness
>> is different; userspace does
>>
>>  -> *(uint32_t*)data will give us the correct value to work with
>>
>> regardless of what the guest CPU arch is.
>>
>> > There are pros and cons for both approaches.
>> >
>> > Pro approach 1 is that it fits the way data[] is read today,
>> > so no QEMU changes are required. However, it means
>> > that user space needs to have awareness of the
>> > "default endianness".
>> > With approach 2 you don't care about endianness at all
>> > anymore - you just get a payload that the host process
>> > can read in.
>> >
>> > Obviously both approaches would work as long as they're
>> > properly defined :).
>>
>> Agreed with all of that.
>>
> Agreed as well.
>
> How do we settle on one versus the other?
>
> And if we agree on the current proposal (from Alex/Peter/Me) is the
> current suggested wording ok, or can someone suggest a better wording?
>
> We could for example include the the matrices as above for all known
> architectures, but that does seem overly verbose.
>
> -Christoffer
> ___
> kvmarm mailing list
> kvm...@lists.cs.columbia.edu
> https://lists.cs.columbia.edu/cucslists/listinfo/kvmarm
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[Bug 69361] Host call trace and guest hang after create guest.

2014-01-25 Thread bugzilla-daemon
https://bugzilla.kernel.org/show_bug.cgi?id=69361

--- Comment #5 from Zhou, Chao  ---
Created attachment 123351
  --> https://bugzilla.kernel.org/attachment.cgi?id=123351&action=edit
host kernel config

-- 
You are receiving this mail because:
You are watching the assignee of the bug.
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v2] KVM: Specify byte order for KVM_EXIT_MMIO

2014-01-25 Thread Christoffer Dall
On Sat, Jan 25, 2014 at 04:23:00PM +, Peter Maydell wrote:
> On 25 January 2014 02:15, Alexander Graf  wrote:
> > Ok, let's go through the combinations for a 32-bit write of 0x01020304 on 
> > PPC and what data[] looks like
> >
> > your proposal:
> >
> >   BE guest, BE host: { 0x01, 0x02, 0x03, 0x04 }
> >   LE guest, BE host: { 0x04, 0x03, 0x02, 0x01 }
> >   BE guest, LE host:  { 0x01, 0x02, 0x03, 0x04 }
> >   LE guest, LE host:  { 0x04, 0x03, 0x02, 0x01 }
> >
> > -> ldw_p() will give us the correct value to work with
> >
> > current proposal:
> >
> >   BE guest, BE host: { 0x01, 0x02, 0x03, 0x04 }
> >   LE guest, BE host: { 0x04, 0x03, 0x02, 0x01 }
> >   BE guest, LE host:  { 0x04, 0x03, 0x02, 0x01 }
> >   LE guest, LE host:  { 0x01, 0x02, 0x03, 0x04 }
> >
> > -> *(uint32_t*)data will give us the correct value to work with
> 
> For completeness, ditto, ARM:
> Scott's proposal (you end up with the same values in
> the data array as for PPC, so userspace has to know the
> "default" endianness so it can do a byteswap if the
> process endianness doesn't match it; on QEMU
> ldl_p() handles this for us, as you say):
> 
>BE guest, BE host: { 0x01, 0x02, 0x03, 0x04 }
>LE guest, BE host: { 0x04, 0x03, 0x02, 0x01 }
>BE guest, LE host:  { 0x01, 0x02, 0x03, 0x04 }
>LE guest, LE host:  { 0x04, 0x03, 0x02, 0x01 }
> 
> current proposal:
> 
>BE guest, BE host: { 0x04, 0x03, 0x02, 0x01 }
>LE guest, BE host: { 0x01, 0x02, 0x03, 0x04 }
>BE guest, LE host:  { 0x01, 0x02, 0x03, 0x04 }
>LE guest, LE host:  { 0x04, 0x03, 0x02, 0x01 }
> 
> The values in the data array are different than on
> PPC, reflecting the fact that the "default" endianness
> is different; userspace does
> 
>  -> *(uint32_t*)data will give us the correct value to work with
> 
> regardless of what the guest CPU arch is.
> 
> > There are pros and cons for both approaches.
> >
> > Pro approach 1 is that it fits the way data[] is read today,
> > so no QEMU changes are required. However, it means
> > that user space needs to have awareness of the
> > "default endianness".
> > With approach 2 you don't care about endianness at all
> > anymore - you just get a payload that the host process
> > can read in.
> >
> > Obviously both approaches would work as long as they're
> > properly defined :).
> 
> Agreed with all of that.
> 
Agreed as well.

How do we settle on one versus the other?

And if we agree on the current proposal (from Alex/Peter/Me) is the
current suggested wording ok, or can someone suggest a better wording?

We could for example include the the matrices as above for all known
architectures, but that does seem overly verbose.

-Christoffer
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v2] KVM: Specify byte order for KVM_EXIT_MMIO

2014-01-25 Thread Peter Maydell
On 25 January 2014 02:15, Alexander Graf  wrote:
> Ok, let's go through the combinations for a 32-bit write of 0x01020304 on PPC 
> and what data[] looks like
>
> your proposal:
>
>   BE guest, BE host: { 0x01, 0x02, 0x03, 0x04 }
>   LE guest, BE host: { 0x04, 0x03, 0x02, 0x01 }
>   BE guest, LE host:  { 0x01, 0x02, 0x03, 0x04 }
>   LE guest, LE host:  { 0x04, 0x03, 0x02, 0x01 }
>
> -> ldw_p() will give us the correct value to work with
>
> current proposal:
>
>   BE guest, BE host: { 0x01, 0x02, 0x03, 0x04 }
>   LE guest, BE host: { 0x04, 0x03, 0x02, 0x01 }
>   BE guest, LE host:  { 0x04, 0x03, 0x02, 0x01 }
>   LE guest, LE host:  { 0x01, 0x02, 0x03, 0x04 }
>
> -> *(uint32_t*)data will give us the correct value to work with

For completeness, ditto, ARM:
Scott's proposal (you end up with the same values in
the data array as for PPC, so userspace has to know the
"default" endianness so it can do a byteswap if the
process endianness doesn't match it; on QEMU
ldl_p() handles this for us, as you say):

   BE guest, BE host: { 0x01, 0x02, 0x03, 0x04 }
   LE guest, BE host: { 0x04, 0x03, 0x02, 0x01 }
   BE guest, LE host:  { 0x01, 0x02, 0x03, 0x04 }
   LE guest, LE host:  { 0x04, 0x03, 0x02, 0x01 }

current proposal:

   BE guest, BE host: { 0x04, 0x03, 0x02, 0x01 }
   LE guest, BE host: { 0x01, 0x02, 0x03, 0x04 }
   BE guest, LE host:  { 0x01, 0x02, 0x03, 0x04 }
   LE guest, LE host:  { 0x04, 0x03, 0x02, 0x01 }

The values in the data array are different than on
PPC, reflecting the fact that the "default" endianness
is different; userspace does

 -> *(uint32_t*)data will give us the correct value to work with

regardless of what the guest CPU arch is.

> There are pros and cons for both approaches.
>
> Pro approach 1 is that it fits the way data[] is read today,
> so no QEMU changes are required. However, it means
> that user space needs to have awareness of the
> "default endianness".
> With approach 2 you don't care about endianness at all
> anymore - you just get a payload that the host process
> can read in.
>
> Obviously both approaches would work as long as they're
> properly defined :).

Agreed with all of that.

thanks
-- PMM
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v2] KVM: Specify byte order for KVM_EXIT_MMIO

2014-01-25 Thread Alexander Graf

On 25.01.2014, at 16:36, Victor Kamensky  wrote:

> On 25 January 2014 01:20, Alexander Graf  wrote:
> 
>> It even
>> does it on BE PPC if you access devices through swizzling
>> buses, but we don't care as hypervisor. All we know in kvm is:
>> 
>>  - instruction used for access
>>  -> originating register (value)
>>  -> access size (len)
>>  -> virtual address (which we translate to phys)
>>  - guest endianness mode
> 
> Do you agree that in above if you just change guest endianness
> then device will see different (byteswapped) value so as far as
> device concerned it is completely different transactions, different
> effect. Why do we consider them? So if you want to see the
> same transaction and have the same effect on device, then when
> guest switches endianness it will need to add/drop byteswap,
> in order to have the same effect.

Yes. We consider it because KVM has to do the counter-swap manually based on 
the guest CPU's endianness setting because the only thing KVM knows are the 4 
bits I mentioned above.


Alex

--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v2] KVM: Specify byte order for KVM_EXIT_MMIO

2014-01-25 Thread Victor Kamensky
On 25 January 2014 01:20, Alexander Graf  wrote:
>
>
>> Am 25.01.2014 um 03:37 schrieb Victor Kamensky :
>>
>>> On 24 January 2014 18:15, Alexander Graf  wrote:
>>>
 On 25.01.2014, at 02:58, Scott Wood  wrote:

> On Sat, 2014-01-25 at 00:24 +, Peter Maydell wrote:
>> On 24 January 2014 23:51, Scott Wood  wrote:
>>> On Fri, 2014-01-24 at 15:39 -0800, Christoffer Dall wrote:
>>> diff --git a/Documentation/virtual/kvm/api.txt 
>>> b/Documentation/virtual/kvm/api.txt
>>> index 366bf4b..6dbd68c 100644
>>> --- a/Documentation/virtual/kvm/api.txt
>>> +++ b/Documentation/virtual/kvm/api.txt
>>> @@ -2565,6 +2565,11 @@ executed a memory-mapped I/O instruction which 
>>> could not be satisfied
>>> by kvm.  The 'data' member contains the written data if 'is_write' is
>>> true, and should be filled by application code otherwise.
>>>
>>> +The 'data' member byte order is host kernel native endianness, 
>>> regardless of
>>> +the endianness of the guest, and represents the the value as it would 
>>> go on the
>>> +bus in real hardware.  The host user space should always be able to do:
>>> + val = *(( *)mmio.data).
>>
>> Host userspace should be able to do that with what results?  It would
>> only produce a directly usable value if host endianness is the same as
>> the emulated device's endianness.
>
> With the result that it gets the value the CPU has sent out on
> the bus as the memory transaction.

 Doesn't that assume the host kernel endianness is the same as the bus
 (or rather, that the host CPU would not swap such an access before it
 hits the bus)?

 If you take the same hardware and boot a little endian host kernel one
 day, and a big endian host kernel the next, the bus doesn't change, and
 neither should the bytewise (assuming address invariance) contents of
 data[].  How data[] would look when read as a larger integer would of
 course change -- but that's due to how you're reading it.

 It's clear to say that a value in memory has been stored there in host
 endianness when the value is as you would want to see it in a CPU
 register, but it's less clear when you talk about it relative to values
 on a bus.  It's harder to correlate that to something that is software
 visible.

 I don't think there's any actual technical difference between your
 wording and mine when each wording is properly interpreted, but I
 suspect my wording is less likely to be misinterpreted (I could be
 wrong).

> Obviously if what userspace
> is emulating is a bus which has a byteswapping bridge or if it's
> being helpful to device emulation by providing "here's the value
> even though you think you're wired up backwards" then it needs
> to byteswap.

 Whether the emulated bus has "a byteswapping bridge" doesn't sound like
 something that depends on the endianness that the host CPU is currently
 running in.

>> How about a wording like this:
>>
>> The 'data' member contains, in its first 'len' bytes, the value as it
>> would appear if the guest had accessed memory rather than I/O.
>
> I think this is confusing, because now userspace authors have
> to figure out how to get back to "value X of size Y at address Z"
> by interpreting this text... Can you write out the equivalent of
> Christoffer's text "here's how you get the memory transaction
> value" for what you want?

 Userspace swaps the value if and only if userspace's endianness differs
 from the endianness with which the device interprets the data
 (regardless of whether said interpretation is considered natural or
 swapped relative to the way the bus is documented).  It's similar to how
 userspace would handle emulating DMA.

 KVM swaps the value if and only if the endianness of the guest access
 differs from that of the host, i.e. if it would have done swapping when
 emulating an ordinary memory access.

> (Also, value as it would appear to who?)

 As it would appear to anyone.  It works because data[] actually is
 memory.  Any difference in how data appears based on the reader's
 context would already be reflected when the reader performs the load.

> I think your wording implies that the order of bytes in data[] depend
> on the guest CPU "usual byte order", ie the order which the CPU
> does not do a byte-lane-swap for (LE for ARM, BE for PPC),
> and it would mean it would come out differently from
> my/Alex/Christoffer's proposal if the host kernel was the opposite
> endianness from that "usual" order.

 It doesn't depend on "usual" anything.  The only thing it implicitly
 says about guest byte order is that it's KVM's job to implement any
 swapping if the endianness of the guest access is diffe

Re: 3.13 hangs when I tried to start a KVM at a 32 bit stable Gentoo

2014-01-25 Thread Toralf Förster
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA256

On 01/24/2014 01:36 PM, Paolo Bonzini wrote:
> Il 23/01/2014 20:50, Toralf Förster ha scritto: | What makes the
> situation really annyoing - sometimes I just can | restart my wlan
> device it the system works normal, but sometimes | the whole system
> hangs and for those cases then sometimes not even | sysrq buttons
> do work.
> 
> Can you reproduce it with the wlan driver disabled completely?
> 
yes - root cause is not the wlan - that's just a victim.
> Paolo
> 

- -- 
MfG/Sincerely
Toralf Förster
pgp finger print:1A37 6F99 4A9D 026F 13E2 4DCF C4EA CDDE 0076 E94E
-BEGIN PGP SIGNATURE-
Version: GnuPG v2.0.22 (GNU/Linux)
Comment: Using GnuPG with Thunderbird - http://www.enigmail.net/

iF4EAREIAAYFAlLjiF0ACgkQxOrN3gB26U64DwD/c60zqwUORstYkSD2I1AarWLO
/4QVwWo8hW8bUg6f3SEA/2Wv7jkJTtkfTUVEyZZxeEJaNP+RChsmGSoU77Dl1G7T
=VMnB
-END PGP SIGNATURE-
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v2] KVM: Specify byte order for KVM_EXIT_MMIO

2014-01-25 Thread Alexander Graf


> Am 25.01.2014 um 03:37 schrieb Victor Kamensky :
> 
>> On 24 January 2014 18:15, Alexander Graf  wrote:
>> 
>>> On 25.01.2014, at 02:58, Scott Wood  wrote:
>>> 
 On Sat, 2014-01-25 at 00:24 +, Peter Maydell wrote:
> On 24 January 2014 23:51, Scott Wood  wrote:
>> On Fri, 2014-01-24 at 15:39 -0800, Christoffer Dall wrote:
>> diff --git a/Documentation/virtual/kvm/api.txt 
>> b/Documentation/virtual/kvm/api.txt
>> index 366bf4b..6dbd68c 100644
>> --- a/Documentation/virtual/kvm/api.txt
>> +++ b/Documentation/virtual/kvm/api.txt
>> @@ -2565,6 +2565,11 @@ executed a memory-mapped I/O instruction which 
>> could not be satisfied
>> by kvm.  The 'data' member contains the written data if 'is_write' is
>> true, and should be filled by application code otherwise.
>> 
>> +The 'data' member byte order is host kernel native endianness, 
>> regardless of
>> +the endianness of the guest, and represents the the value as it would 
>> go on the
>> +bus in real hardware.  The host user space should always be able to do:
>> + val = *(( *)mmio.data).
> 
> Host userspace should be able to do that with what results?  It would
> only produce a directly usable value if host endianness is the same as
> the emulated device's endianness.
 
 With the result that it gets the value the CPU has sent out on
 the bus as the memory transaction.
>>> 
>>> Doesn't that assume the host kernel endianness is the same as the bus
>>> (or rather, that the host CPU would not swap such an access before it
>>> hits the bus)?
>>> 
>>> If you take the same hardware and boot a little endian host kernel one
>>> day, and a big endian host kernel the next, the bus doesn't change, and
>>> neither should the bytewise (assuming address invariance) contents of
>>> data[].  How data[] would look when read as a larger integer would of
>>> course change -- but that's due to how you're reading it.
>>> 
>>> It's clear to say that a value in memory has been stored there in host
>>> endianness when the value is as you would want to see it in a CPU
>>> register, but it's less clear when you talk about it relative to values
>>> on a bus.  It's harder to correlate that to something that is software
>>> visible.
>>> 
>>> I don't think there's any actual technical difference between your
>>> wording and mine when each wording is properly interpreted, but I
>>> suspect my wording is less likely to be misinterpreted (I could be
>>> wrong).
>>> 
 Obviously if what userspace
 is emulating is a bus which has a byteswapping bridge or if it's
 being helpful to device emulation by providing "here's the value
 even though you think you're wired up backwards" then it needs
 to byteswap.
>>> 
>>> Whether the emulated bus has "a byteswapping bridge" doesn't sound like
>>> something that depends on the endianness that the host CPU is currently
>>> running in.
>>> 
> How about a wording like this:
> 
> The 'data' member contains, in its first 'len' bytes, the value as it
> would appear if the guest had accessed memory rather than I/O.
 
 I think this is confusing, because now userspace authors have
 to figure out how to get back to "value X of size Y at address Z"
 by interpreting this text... Can you write out the equivalent of
 Christoffer's text "here's how you get the memory transaction
 value" for what you want?
>>> 
>>> Userspace swaps the value if and only if userspace's endianness differs
>>> from the endianness with which the device interprets the data
>>> (regardless of whether said interpretation is considered natural or
>>> swapped relative to the way the bus is documented).  It's similar to how
>>> userspace would handle emulating DMA.
>>> 
>>> KVM swaps the value if and only if the endianness of the guest access
>>> differs from that of the host, i.e. if it would have done swapping when
>>> emulating an ordinary memory access.
>>> 
 (Also, value as it would appear to who?)
>>> 
>>> As it would appear to anyone.  It works because data[] actually is
>>> memory.  Any difference in how data appears based on the reader's
>>> context would already be reflected when the reader performs the load.
>>> 
 I think your wording implies that the order of bytes in data[] depend
 on the guest CPU "usual byte order", ie the order which the CPU
 does not do a byte-lane-swap for (LE for ARM, BE for PPC),
 and it would mean it would come out differently from
 my/Alex/Christoffer's proposal if the host kernel was the opposite
 endianness from that "usual" order.
>>> 
>>> It doesn't depend on "usual" anything.  The only thing it implicitly
>>> says about guest byte order is that it's KVM's job to implement any
>>> swapping if the endianness of the guest access is different from the
>>> endianness of the host kernel access (whether it's due to the guest's
>>> mode, the way a page is mapped, the ins

Re: [PATCH v2] KVM: Specify byte order for KVM_EXIT_MMIO

2014-01-25 Thread Alexander Graf


> Am 25.01.2014 um 03:34 schrieb Christoffer Dall :
> 
>> On Sat, Jan 25, 2014 at 03:15:35AM +0100, Alexander Graf wrote:
>> 
>>> On 25.01.2014, at 02:58, Scott Wood  wrote:
>>> 
 On Sat, 2014-01-25 at 00:24 +, Peter Maydell wrote:
> On 24 January 2014 23:51, Scott Wood  wrote:
>> On Fri, 2014-01-24 at 15:39 -0800, Christoffer Dall wrote:
>> diff --git a/Documentation/virtual/kvm/api.txt 
>> b/Documentation/virtual/kvm/api.txt
>> index 366bf4b..6dbd68c 100644
>> --- a/Documentation/virtual/kvm/api.txt
>> +++ b/Documentation/virtual/kvm/api.txt
>> @@ -2565,6 +2565,11 @@ executed a memory-mapped I/O instruction which 
>> could not be satisfied
>> by kvm.  The 'data' member contains the written data if 'is_write' is
>> true, and should be filled by application code otherwise.
>> 
>> +The 'data' member byte order is host kernel native endianness, 
>> regardless of
>> +the endianness of the guest, and represents the the value as it would 
>> go on the
>> +bus in real hardware.  The host user space should always be able to do:
>> + val = *(( *)mmio.data).
> 
> Host userspace should be able to do that with what results?  It would
> only produce a directly usable value if host endianness is the same as
> the emulated device's endianness.
 
 With the result that it gets the value the CPU has sent out on
 the bus as the memory transaction.
>>> 
>>> Doesn't that assume the host kernel endianness is the same as the bus
>>> (or rather, that the host CPU would not swap such an access before it
>>> hits the bus)?
>>> 
>>> If you take the same hardware and boot a little endian host kernel one
>>> day, and a big endian host kernel the next, the bus doesn't change, and
>>> neither should the bytewise (assuming address invariance) contents of
>>> data[].  How data[] would look when read as a larger integer would of
>>> course change -- but that's due to how you're reading it.
>>> 
>>> It's clear to say that a value in memory has been stored there in host
>>> endianness when the value is as you would want to see it in a CPU
>>> register, but it's less clear when you talk about it relative to values
>>> on a bus.  It's harder to correlate that to something that is software
>>> visible.
>>> 
>>> I don't think there's any actual technical difference between your
>>> wording and mine when each wording is properly interpreted, but I
>>> suspect my wording is less likely to be misinterpreted (I could be
>>> wrong).
>>> 
 Obviously if what userspace
 is emulating is a bus which has a byteswapping bridge or if it's
 being helpful to device emulation by providing "here's the value
 even though you think you're wired up backwards" then it needs
 to byteswap.
>>> 
>>> Whether the emulated bus has "a byteswapping bridge" doesn't sound like
>>> something that depends on the endianness that the host CPU is currently
>>> running in.
>>> 
> How about a wording like this:
> 
> The 'data' member contains, in its first 'len' bytes, the value as it
> would appear if the guest had accessed memory rather than I/O.
 
 I think this is confusing, because now userspace authors have
 to figure out how to get back to "value X of size Y at address Z"
 by interpreting this text... Can you write out the equivalent of
 Christoffer's text "here's how you get the memory transaction
 value" for what you want?
>>> 
>>> Userspace swaps the value if and only if userspace's endianness differs
>>> from the endianness with which the device interprets the data
>>> (regardless of whether said interpretation is considered natural or
>>> swapped relative to the way the bus is documented).  It's similar to how
>>> userspace would handle emulating DMA.
>>> 
>>> KVM swaps the value if and only if the endianness of the guest access
>>> differs from that of the host, i.e. if it would have done swapping when
>>> emulating an ordinary memory access.
>>> 
 (Also, value as it would appear to who?)
>>> 
>>> As it would appear to anyone.  It works because data[] actually is
>>> memory.  Any difference in how data appears based on the reader's
>>> context would already be reflected when the reader performs the load.
>>> 
 I think your wording implies that the order of bytes in data[] depend
 on the guest CPU "usual byte order", ie the order which the CPU
 does not do a byte-lane-swap for (LE for ARM, BE for PPC),
 and it would mean it would come out differently from
 my/Alex/Christoffer's proposal if the host kernel was the opposite
 endianness from that "usual" order.
>>> 
>>> It doesn't depend on "usual" anything.  The only thing it implicitly
>>> says about guest byte order is that it's KVM's job to implement any
>>> swapping if the endianness of the guest access is different from the
>>> endianness of the host kernel access (whether it's due to the guest's
>>> mode, the way a page is