On 04.01.2012, at 21:08, Scott Wood wrote:

> On 01/03/2012 07:10 PM, Alexander Graf wrote:
>> Right now we transfer a static struct every time we want to get or set
>> registers. Unfortunately, over time we realize that there are more of
>> these than we thought of before and the extensibility and flexibility of
>> transferring a full struct every time is limited.
>> 
>> So this is a new approach to the problem. With these new ioctls, we can
>> get and set a single register that is identified by an ID. This allows for
>> very precise and limited transmittal of data. When we later realize that
>> it's a better idea to shove over multiple registers at once, we can reuse
>> most of the infrastructure and simply implement a GET_MANY_REGS / 
>> SET_MANY_REGS
>> interface.
> 
> If we end up replacing get/set sregs with this (either totally or just
> for new registers), we should consider a runtime facility for userspace
> to extract from the kernel a list of supported registers (with sizes)
> that are supported, so that userspace doesn't have to know about every
> register in order to migrate (this could also be useful if there is
> non-architecturally-readable state that needs to be preserved, such as a
> pending book3e doorbell exception, or the MCAR to be set when machine
> checks are unmasked).  It could also avoid the need for explicit
> capabilities in some situations when new registers are added (like
> KVM_CAP_PPC_HIOR).

Interesting idea. That doesn't quite match up with the way we treat migration 
today, where all fields are copied into the CPUState and then put on the wire 
from there, but I can see how a more generic approach feels more elegant.

Yes, if we want to use it extensively, the MANY_REGS interface would have to be 
really clever: it should let us write a byte stream of exactly the register 
sizes we care about - no need to waste 128 bytes on every register - and it 
needs proper enumeration of the available registers.

But to be honest, I'm unsure what the best interface really is. We have 3 
different possible ways of implementing register sharing between user space and 
kernel space:

  1) The current way: a new CAP plus struct enhancements or entirely new 
structs every time we find something we didn't think of. I count ONE_REG in on 
this one.
  2) Something fancy like GET_MANY_REGS that would give us a very flexible 
interface between kernel and user space for getting and setting registers. But 
we would still need an ioctl for it, so these registers shouldn't be read/set 
in the fast path anyway - at which point maybe getting/setting them 
individually with ONE_REG is still ok?
  3) Putting registers in the shared kvm_run struct so kernel and user space 
can both access them, with a "dirty" bitmap in there so the kernel knows when a 
register has been written from user space. This should be the preferred method 
for "fast path" registers - registers we need very often from user space.

Since we already have 1) today and 3) should be there for the fast path, I'm 
not sure how much we really need 2). But the nice thing is that we can cross 
that bridge when we get to it. This patch merely lays the groundwork.
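Just to sketch what 2) could look like: if the register ID itself encoded the 
register's size, a GET_MANY_REGS byte stream would need no per-register 
padding, and its total length could be computed from the ID list alone. 
Everything below (the bit positions, names, the log2-size encoding) is invented 
purely for illustration, not a proposed ABI:

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Illustration only -- not a proposed ABI. Assume bits 52..55 of a
 * register ID encode log2 of the register's size in bytes; then a
 * MANY_REGS byte stream carries each value back to back at exactly
 * its natural size.
 */
#define REG_SIZE_SHIFT 52
#define REG_SIZE_MASK  0xfULL

/* Size in bytes of the register named by 'id'. */
static size_t reg_size(uint64_t id)
{
	return (size_t)1 << ((id >> REG_SIZE_SHIFT) & REG_SIZE_MASK);
}

/* Length of the packed stream for a GET_MANY_REGS-style request. */
static size_t packed_len(const uint64_t *ids, size_t nregs)
{
	size_t len = 0;

	for (size_t i = 0; i < nregs; i++)
		len += reg_size(ids[i]);
	return len;
}
```

With such an encoding, a 64-bit SPR would cost 8 bytes on the wire and a 
128-bit Altivec register 16, instead of 128 bytes each.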

>> The only downpoint I see to this one is that it needs to pad to 1024 bits
>> (hardware is already on 512 bit registers, so I wanted to leave some room)
>> which is slightly too much for transmitting only 64 bits. But if that's all
>> the tradeoff we have to do for getting an extensible interface, I'd say go
>> for it nevertheless.
> 
> Does it really make sense to consider these large things as single
> registers (even if that's how hw documents it)?  Do they need to be set
> atomically?  How do you get/set them in hardware if GPRs aren't that large?

For x86 SSE registers, you usually populate them through memory accesses IIRC; 
for Altivec it's the same IIUC.

What would you suggest? Have 16 pseudo-registers for a single 1024-bit 
register?

> 
>> +4.65 KVM_SET_ONE_REG
>> +
>> +Capability: KVM_CAP_ONE_REG
>> +Architectures: all
>> +Type: vcpu ioctl
>> +Parameters: struct kvm_one_reg (in)
>> +Returns: 0 on success, negative value on failure
>> +
>> +struct kvm_one_reg {
>> +       __u64 id;
>> +       union {
>> +               __u8 reg8;
>> +               __u16 reg16;
>> +               __u32 reg32;
>> +               __u64 reg64;
>> +               __u8 reg128[16];
>> +               __u8 reg256[32];
>> +               __u8 reg512[64];
>> +               __u8 reg1024[128];
>> +       } u;
>> +};
> 
> Do we need reg8/16/32, or can we simplify (and reduce potential size
> mismatch errors) by using __u64 for everything that doesn't need to be
> an array?

Hrm, interesting idea. So you would basically reduce everything to __u64, 
padding smaller registers and splitting bigger ones into separate IDs? That 
really is appealing, but it might uglify the code quite a bit when remerging 
bigger registers. The padding should be fine though.
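To make the split-and-remerge concrete, here is a rough sketch (the sub-ID 
scheme and all names are made up) of how a 1024-bit register could be carried 
as 16 consecutive __u64 pseudo-registers:

```c
#include <stdint.h>
#include <string.h>

/*
 * Illustration only: carry a 128-byte register as 16 consecutive
 * 64-bit pseudo-registers. The sub-ID scheme below is invented.
 */
#define NCHUNKS 16

/* Hypothetical sub-ID: base register ID plus the chunk index. */
static uint64_t chunk_id(uint64_t base, unsigned int i)
{
	return base | i;
}

/* Extract chunk i of the 128-byte register as a __u64 payload. */
static uint64_t get_chunk(const uint8_t reg[128], unsigned int i)
{
	uint64_t v;

	memcpy(&v, reg + i * sizeof(v), sizeof(v));
	return v;
}

/* Remerge: write a __u64 payload back into chunk i of the register. */
static void set_chunk(uint8_t reg[128], unsigned int i, uint64_t v)
{
	memcpy(reg + i * sizeof(v), &v, sizeof(v));
}
```

The remerging side is indeed where it gets ugly: 16 get/set round trips plus 
memcpy per register, versus one ioctl with the union.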

Btw, any reason you're only bringing up these really great ideas on my 2nd pull 
request, after these patches sat uncommented on the ML for quite a while? :)

> 
>> +/* Available with KVM_CAP_ONE_REG */
>> +
>> +#define KVM_ONE_REG_GENERIC         0x0000000000000000ULL
> 
> Generic registers?  Is the idea to use this in place of dedicated ioctls
> for certain KVM knobs?

Well, it felt logical to reserve some range for "other use".

> 
>> +/*
>> + * Architecture specific registers are to be defined in arch headers and
>> + * ORed with the arch identifier.
>> + */
>> +#define KVM_ONE_REG_PPC                     0x1000000000000000ULL
>> +#define KVM_ONE_REG_X86                     0x2000000000000000ULL
>> +#define KVM_ONE_REG_IA64            0x3000000000000000ULL
>> +#define KVM_ONE_REG_ARM                     0x4000000000000000ULL
>> +#define KVM_ONE_REG_S390            0x5000000000000000ULL
> 
> Might want to allow space for more than 16 architectures -- there's room
> to spare.

I don't think we'll get to 16 KVM-enabled architectures anytime soon. And if we 
do, we can just declare the full high byte as the arch mask, and then we have 
plenty of space :).
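A quick sketch of that escape hatch (the shifts follow the patch's ID layout; 
the wider mask is hypothetical). Existing IDs keep their values because their 
second-highest nibble is zero - they would simply decode as arch 0x10, 0x20, 
... under the wider mask:

```c
#include <stdint.h>

/* Today: the top nibble selects the arch (15 usable values). */
#define ARCH_SHIFT_NIBBLE 60
/* Hypothetical escape hatch: the full high byte (255 values). */
#define ARCH_SHIFT_BYTE   56

/* Arch field under the current top-nibble scheme. */
static unsigned int arch_nibble(uint64_t id)
{
	return (unsigned int)(id >> ARCH_SHIFT_NIBBLE);
}

/* Arch field if the whole high byte were declared the arch mask. */
static unsigned int arch_byte(uint64_t id)
{
	return (unsigned int)(id >> ARCH_SHIFT_BYTE);
}
```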


Alex
