> -----Original Message----- > From: Marcelo Tosatti [mailto:mtosa...@redhat.com] > Sent: Friday, January 04, 2013 7:08 AM > To: Alexander Graf > Cc: jjhe...@linux.vnet.ibm.com; Christian Borntraeger; Anthony Liguori; qemu- > de...@nongnu.org qemu-devel; Bhushan Bharat-R65777 > Subject: Re: [Qemu-devel] [PATCH 7/7] KVM regsync: Fix > do_kvm_cpu_synchronize_state data integrity issue > > On Thu, Jan 03, 2013 at 08:09:22PM +0100, Alexander Graf wrote: > > > > On 03.01.2013, at 19:48, Jason J. Herne wrote: > > > > > On 01/03/2013 08:56 AM, Alexander Graf wrote: > > >>> static void do_kvm_cpu_synchronize_state(void *_args) > > >>> >{ > > >>> > struct kvm_cpu_syncstate_args *args = _args; > > >>> >+ CPUArchState *env = args->env; > > >>> >+ int register_level = args->register_level; > > >>> > > > >> This probably becomes more readable if we explicitly revert back to > unsynced state first: > > >> > > >> /* Write back local modifications at our current level */ if > > >> (register_level > env->kvm_vcpu_dirty) { > > >> kvm_arch_put_registers(...); > > >> env->kvm_vcpu_dirty = 0; > > >> } > > >> > > >> and then do the sync we are requested to do: > > >> > > >> if (!env->kvm_vcpu_dirty) { > > >> ... > > >> } > > > > > > I agree, but only if we add a second conditional to the if 1st statement > > > as > such: > > > > > > if (args->env->kvm_vcpu_dirty && register_level > > > > env->kvm_vcpu_dirty) > > > > > > This is to cover the case where the caller is asking for register level > > > "1" > and we're already dirty at level "2". In this case, nothing should happen and > we'll need the "args->env->kvm_vcpu_dirty" to ensure that is the case. > > > > As before, I'd prefer to make this explicit: > > > > > > > > static void do_kvm_cpu_synchronize_state(void *_args) { > > > struct kvm_cpu_syncstate_args *args = _args; > > > CPUArchState *env = args->env; > > > int register_level = args->register_level; > > > > if (register_level > env->kvm_vcpu_dirty) { > > /* We are more dirty than we need to - all is well */ > > return; > > } > > > > > > > > /* Write back local modifications at our current level */ > > > if (args->env->kvm_vcpu_dirty && register_level > env->kvm_vcpu_dirty) > > > { > > > kvm_arch_put_registers(env, env->kvm_vcpu_dirty); > > > env->kvm_vcpu_dirty = 0; > > > } > > > > > > if (!args->env->kvm_vcpu_dirty) { > > > kvm_arch_get_registers(env, register_level); > > > env->kvm_vcpu_dirty = register_level; > > > } > > > } > > > > > > Do you agree? Thanks for your time. :) > > > > Please also check out the discussions I've had with Bharat about his > > watchdog > patches. There we need a mechanism to synchronize registers only when we > actually need to, in order to avoid potential race conditions with a kernel > timer. > > > > That particular case doesn't work well with levels. We can have multiple > different potential race producers in the kernel that we need to avoid > individually, so we can't always synchronize all of them when only one of them > needs to be synchronized. > > > > The big question is what we should be doing about this. We basically have 3 > options: > > > > * implement levels, treat racy registers as "manually synchronized", as > Bharat's latest patch set does > > * implement levels, add a bitmap for additional special synchronization > > bits > > * replace levels by bitmap > > > > I'm quite frankly not sure which one of the 3 would be the best way forward. > > > > > > Alex > > Implement read/write accessors for individual registers, similarly to how its > done in the kernel. See kvm_cache_regs.h.
Read/write for individual registers can be heavy for cases where multiple register needed to be updated. Is not it? For cases where multiple register needed to be synchronized then I would like to elaborate the options as: Option 1: int timer_func(CPUArchState env) { cpu_synchronize_state(env); //update env->timer_registers env->updated_regs_type |= TIMER_REGS_TYPE_BIT; } Change the kvm_arch_put_registers() to add followings: int kvm_arch_put_registers(CPUArchState *env, int level) { If (env->updated_regs_type) { struct kvm_sregs sregs; If (env->updated_regs_type & TIMER_REGS_TYPE_BIT) { // update sregs->timer_registers; // may be we can have a bitmap to tell kernel for what actually updated } If (env->updated_regs_type & XX_CPU_REGS_TYPE) { // update respective registers in sregs } ioctl(env, KVM_GET_SREGS, &sregs); } } This mechanism will get all registers state but this is required only once per entry in QEMU. Option 2: Define read_regs_type() and write_regs_type() for cases where it requires multiple registers updates: int read_regs_type((CPUArchState env, void *regs_ptr, int reg_type) { -> reg_type are bitmap like: TIMER_REGS, DEBUG_REGS, PERFMON_REGS etc. Switch(reg_type) { Case TIMER_REGS: Struct *timer_regs = regs_ptr; Ioctl(KVM_GET_SREGS, timer_regs, CPU_reg_type); Break; Default: Printf(error); } Similarly will be the write function: int write_regs_type((CPUArchState env, int reg_type) { -> reg_type are bitmap like: TIMER_REGS, DEBUG_REGS, PERFMON_REGS etc. Switch(reg_type) { Case TIMER_REGS: Struct timer_regs; Timer_regs.reg1 = env->timer_regs->reg1; Timer_regs.reg2 = env->timer_regs->reg2; Ioctl(env, KVM_SET_SREGS, &timer_regs, CPU_reg_type); Break; Default: Printf(error); } Int timer_func(CPUxxState env) { Struct timer_regs; read_regs_type((env, &timer_regs,TIMER_REGS); //update env->timer_registers Write_regs_type(env, TIMER_REGS) } This will get one type of register_types and can cause multiple IOCTL per entry to QEMU. ------- Keep on using GET/SET_ONE_REG when only one registers needed to be updated. Thanks -Bharat > > This also avoids the programmer from mapping "register X,Y,Z" to "level M" > manually (which is quite bug prone). > > Can use KVM_GET/SET_ONEREG, then. >