On Fri, Apr 27, 2007 at 06:53:40PM +0200, Arnd Bergmann wrote:
> On Friday 27 April 2007, Carsten Otte wrote:
> > Add interface which allows a process to start a virtual machine.
> 
> Looks pretty good to me already.
> 
> It seems a lot closer to lguest than to kvm at the moment concerning
> the kernel interfaces (guest real address == host process, state
> is attached to the thread_info, ...).

Guess I should take a look at lguest then :)

> > +static DEFINE_MUTEX(s390host_init_mutex);
> > +
> > +static void s390host_get_data(struct s390host_data *data)
> > +{
> > +   atomic_inc(&data->count);
> > +}
> > +
> > +void s390host_put_data(struct s390host_data *data)
> > +{
> > +   int cpu;
> > +
> > +   if (atomic_dec_return(&data->count))
> > +           return;
> 
> These should probably use a struct kref instead of an atomic_t
> to count the references.

That makes sense.

> 
> > +static unsigned long
> > +s390host_create_io_area(unsigned long addr, unsigned long flags,
> > +                   unsigned long io_addr, struct s390host_data *data)
> > +{
[...]
> > +   down_write(&mm->mmap_sem);
> > +   ret = insert_vm_struct(mm, vma);
> > +   if (ret) {
> > +           kmem_cache_free(vm_area_cachep, vma);
> > +           goto out;
> > +   }
> > +   s390host_get_data(data);
> > +   mm->total_vm += 2;
> > +   vm_insert_page(vma, addr, virt_to_page(io_addr));
> > +
> > +   ret = split_vma(mm, vma, addr + PAGE_SIZE, 0);
> > +   if (ret)
> > +           goto out;
> 
> This open-coded mmap looks a bit scary. I can't see anything wrong in it,
> but it may conflict with a user application that tries to do other strange
> things with memory maps.

Well, there is at least already one bug in it. If the call to split_vma fails
there is no cleanup of the vm_insert_page that has already happened.

> Maybe you can either make this a real call to mmap(), or have it automatically
> mapped using the vdso mechanism.

So, yes we need to find something better, since this is the code that I
don't like at all.

> > +   __u16 cpu;
> 
> __u16 is a type for user space interfaces. In the kernel, use u16.

Yes.

> > +   if (current_thread_info()->sie_cpu != -1)
> > +           return -EINVAL;
> 
> -EBUSY?

Yes.

> > +   if (copy_from_user(&cpu, &sie_template->icpua, sizeof(u16)))
> > +           return -EFAULT;
> 
> get_user()?

Yes.

> > +   sca_block = data->sca_block;
> > +   if (sca_block->mcn & (1UL << (S390HOST_MAX_CPUS - 1 - cpu))) {
> 
> __test_bit()?

Yes, but I think we should probably add some function to access the mcn
bits. Each access looks a bit complicated.
 
> > +           ret = -EINVAL;
> 
> -EBUSY?

Yes.

> > +int sys_s390host_remove_cpu(void)
> > +{
> > +   struct sca_block *sca_block;
> > +   int cpu;
> > +
> > +   cpu = current_thread_info()->sie_cpu;
> > +   if (cpu == -1)
> > +           return -EINVAL;
> > +
> > +   mutex_lock(&s390host_init_mutex);
> > +   sca_block = current_thread_info()->s390host_data->sca_block;
> > +   sca_block->mcn &= ~(1UL << (S390HOST_MAX_CPUS - 1 - cpu));
> > +   current_thread_info()->sie_cpu = -1;
> > +   mutex_unlock(&s390host_init_mutex);
> > +   return 0;
> > +}
> 
> Shouldn't this free the resources that were allocated by add_cpu?

Yes, it was supposed to do that. But since it was only a prototype I was
too lazy to code it up and so I decided to free things on exit.

> If not, this syscall seems pretty useless and I'd simply not
> do it at all -- just wait until exit() to clean up.

This reveals that we are not clearing the mcn bit on exit without
pior call to this syscall. BUG().

> > +   if (signal_pending(current)) {
> > +           ret = -EINTR;
> > +           goto out;
> > +   }
> 
> Can't you handle interrupted system calls? I think it would be nicer
> to return -ERESTARTSYS here so you don't have to go back to your
> user space after a signal.

That should work and it is better.

> > +struct sie_kernel {
> > +   struct sie_block        sie_block;
> > +   s390_fp_regs    host_fpregs;
> > +   int             host_acrs[NUM_ACRS];
> > +} __attribute__((packed,aligned(4096)));
> 
> Why packed?

Not needed.

> > +#define SIE_UPDATE_PSW             (1UL << 0)
> > +#define SIE_FLUSH_TLB              (1UL << 1)
> > +#define SIE_ISKE           (1UL << 2)
> > +#define SIE_SSKE           (1UL << 3)
> > +#define SIE_BLOCK_UPDATE   (1UL << 4)
> > +#define SIE_VSMXM_LOCAL_UPDATE     (1UL << 5)
> > +#define SIE_VSMXM_DIST_UPDATE   (1UL << 6)
> 
> I find it much more readable to define these like
> 
> #define SIE_UPDATE_PSW                1
> #define SIE_FLUSH_TLB         2
> #define SIE_ISKE              4
> 
> especially when trying to interpret values in memory.

Hmm... don't know. Both make sense, but I have no preference on this.
 
> > +struct sie_user {
> > +   struct sie_block        sie_block;
> > +   psw_t                   psw;
> > +   unsigned long           gprs[NUM_GPRS];
> > +   s390_fp_regs            guest_fpregs;
> > +   int                     guest_acrs[NUM_ACRS];
> > +   struct sie_skey_parm    iske_parm;
> > +   struct sie_skey_parm    sske_parm;
> > +   int                     vsmxm_or_local;
> > +   int                     vsmxm_and_local;
> > +   int                     vsmxm_or;
> > +   int                     vsmxm_and;
> > +   int                     vsmxm_cpuid;
> > +} __attribute__((packed,aligned(4096)));
> 
> Is this data structure extensible? If it is, you probably need
> some sort of versioning information to make sure that user space
> doesn't rely on fields that the kernel doesn't know about.

I don't think we can put in some versioning information here. If
the kernel decides to increase the version then old userspace
code would break?
We rather need some mechanism so userpace can ask the kernel
"do you support feature x?" and dependent on the answer some
fields are used or unused.

> > +struct sca_entry {
> > +   atomic_t scn;
> > +   __u64   reserved;
> > +   __u64   sda;
> > +   __u64   reserved2[2];
> > +}__attribute__((packed));
> 
> Is this a hardware data structure? If not, you shouldn't pack it
> because the first word is only 32 bits and consequently all following
> ones are unaligned.

It's a hardware structure. Guess the unaligned u64s were generated when
this structure was extended to support 64bit.
But of course we shouldn't put an atomic_t in a hardware structure, since
it's implementation might change.
 
> > +/* function definitions */
> > +extern int sie64a(struct sie_kernel *);
> > +extern void s390host_put_data(struct s390host_data *);
> 
> These should be inside of '#ifdef __KERNEL__', like all declarations that user
> space doesn't need to see.

Yes, of course.

> 
> > @@ -52,6 +53,8 @@ struct thread_info {
> >     unsigned int            cpu;            /* current CPU */
> >     int                     preempt_count;  /* 0 => preemptable, <0 => BUG 
> > */
> >     struct restart_block    restart_block;
> > +   struct s390host_data    *s390host_data; /* s390host data */
> > +   int                     sie_cpu;        /* sie cpu number */
> >  };
> 
> What happens to these variables on execve?

They stay... We probably should do some cleanup here e.g. by adding some
code to init_new_context().

Thanks for reviewing!

-------------------------------------------------------------------------
This SF.net email is sponsored by DB2 Express
Download DB2 Express C - the FREE version of DB2 express and take
control of your XML. No limits. Just data. Click to get it now.
http://sourceforge.net/powerbar/db2/
_______________________________________________
kvm-devel mailing list
kvm-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/kvm-devel

Reply via email to