Re: [Xen-devel] [PATCH v3 1/2] Interface for grant copy operation in libs.

2016-07-13 Thread Wei Liu
On Fri, Jul 08, 2016 at 02:18:46PM +0100, Wei Liu wrote:
> To unblock Paulina on her series, I would be ok with the cast provided
> there is a compile-time check to ensure the user-space structure is
> identical to the ioctl structure.
> 
> That would involve:
> 1. Introducing BUILD_BUG_ON, offsetof, alignof to libs/ if they are not
>    already available.

I just checked all these.

BUILD_BUG_ON is not there. A patch for it is trivial. I will do that
soon.

offsetof can be found in stddef.h.

alignof is C11 (we require C99), but there is __alignof__ gcc extension,
which clang also supports. We've been using that for quite a long time,
so I don't think we need to do anything about it.

Wei.
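
A minimal sketch of what a C99-compatible BUILD_BUG_ON could look like,
used together with offsetof from stddef.h and the __alignof__ extension.
The macro body, struct example and example_alignment() are hypothetical
illustrations, not the patch that was written for libs/:

```c
#include <stddef.h>   /* offsetof, size_t */

/*
 * Hypothetical definition: when cond is true the array size is -1,
 * which the compiler must reject, so the build stops at that line.
 */
#define BUILD_BUG_ON(cond) ((void)sizeof(char[1 - 2 * !!(cond)]))

/* Illustrative structure, not taken from the Xen tree. */
struct example {
    char tag;
    unsigned int value;
};

/* Runs three compile-time checks, then reports the struct alignment. */
size_t example_alignment(void)
{
    /* The first member of a struct always sits at offset 0. */
    BUILD_BUG_ON(offsetof(struct example, tag) != 0);
    /* A struct is at least as large as any of its members. */
    BUILD_BUG_ON(sizeof(struct example) < sizeof(unsigned int));
    /* __alignof__ is a gcc extension that clang also accepts. */
    BUILD_BUG_ON(__alignof__(struct example) < __alignof__(char));

    return __alignof__(struct example);
}
```

If any of the three conditions were true, the array size would be
negative and compilation would fail at that line.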

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v3 1/2] Interface for grant copy operation in libs.

2016-07-08 Thread Wei Liu
To unblock Paulina on her series, I would be ok with the cast provided
there is a compile-time check to ensure the user-space structure is
identical to the ioctl structure.

That would involve:
1. Introducing BUILD_BUG_ON, offsetof, alignof to libs/ if they are not
   already available.
2. BUILD_BUG_ON(sizeof(A) != sizeof(B))
3. BUILD_BUG_ON(offsetof(A, f1) != offsetof(B, f1)) (enumerate all
   fields)
4. BUILD_BUG_ON(alignof(A) != alignof(B))
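
Steps 2-4 could be sketched as follows. struct lib_seg and struct
ioctl_seg are simplified, hypothetical stand-ins for the real segment
structures, CHECK_FIELD is a helper invented for this example, and
BUILD_BUG_ON is an assumed negative-array-size definition:

```c
#include <stddef.h>   /* offsetof */
#include <stdint.h>

/* Assumed definition: a true condition yields a negative array size. */
#define BUILD_BUG_ON(cond) ((void)sizeof(char[1 - 2 * !!(cond)]))

/* Hypothetical library-side structure ("A" in the list above). */
struct lib_seg {
    uint32_t ref;
    uint16_t domid;
    uint16_t offset;
    uint16_t len;
    uint16_t flags;
    int16_t  status;
};

/* Hypothetical ioctl-side structure ("B" in the list above). */
struct ioctl_seg {
    uint32_t ref;
    uint16_t domid;
    uint16_t offset;
    uint16_t len;
    uint16_t flags;
    int16_t  status;
};

/* Step 3 for a single field; it has to be repeated for every field. */
#define CHECK_FIELD(f) \
    BUILD_BUG_ON(offsetof(struct lib_seg, f) != offsetof(struct ioctl_seg, f))

/* Returns 1; the real work happens at compile time. */
int seg_layouts_match(void)
{
    /* Step 2: same size. */
    BUILD_BUG_ON(sizeof(struct lib_seg) != sizeof(struct ioctl_seg));

    /* Step 3: same offset for each field. */
    CHECK_FIELD(ref);
    CHECK_FIELD(domid);
    CHECK_FIELD(offset);
    CHECK_FIELD(len);
    CHECK_FIELD(flags);
    CHECK_FIELD(status);

    /* Step 4: same alignment (gcc/clang extension). */
    BUILD_BUG_ON(__alignof__(struct lib_seg) != __alignof__(struct ioctl_seg));

    return 1;
}
```

Reordering or resizing a field in only one of the two structures would
turn one of these checks into a build failure.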

Paulina, let me know if you would be interested in doing #1. Normally
this requires reading compiler manuals and some coding. I can give you
more details if you're up for the task, otherwise I will try to find
some time to do it myself.

Wei.



Re: [Xen-devel] [PATCH v3 1/2] Interface for grant copy operation in libs.

2016-07-06 Thread Roger Pau Monné
On Wed, Jun 22, 2016 at 05:49:59PM +0100, Wei Liu wrote:
> On Wed, Jun 22, 2016 at 03:52:43PM +0100, Wei Liu wrote:
> > On Wed, Jun 22, 2016 at 02:52:47PM +0100, David Vrabel wrote:
> > > On 22/06/16 14:29, Wei Liu wrote:
> > > > On Wed, Jun 22, 2016 at 01:37:50PM +0100, David Vrabel wrote:
> > > >> On 22/06/16 12:21, Wei Liu wrote:
> > > >>> On Wed, Jun 22, 2016 at 10:37:24AM +0100, David Vrabel wrote:
> > >  On 22/06/16 09:38, Paulina Szubarczyk wrote:
> > > > In a linux part an ioctl(gntdev, IOCTL_GNTDEV_GRANT_COPY, ..)
> > > > system call is invoked. In mini-os the operation is yet not
> > > > implemented. For other OSs there is a dummy implementation.
> > >  [...]
> > > > --- a/tools/libs/gnttab/linux.c
> > > > +++ b/tools/libs/gnttab/linux.c
> > > > @@ -235,6 +235,51 @@ int osdep_gnttab_unmap(xengnttab_handle *xgt,
> > > >  return 0;
> > > >  }
> > > >  
> > > > +int osdep_gnttab_grant_copy(xengnttab_handle *xgt,
> > > > +uint32_t count,
> > > > +xengnttab_grant_copy_segment_t *segs)
> > > > +{
> > > > +int i, rc;
> > > > +int fd = xgt->fd;
> > > > +struct ioctl_gntdev_grant_copy copy;
> > > > +
> > > > +copy.segments = calloc(count, sizeof(struct 
> > > > ioctl_gntdev_grant_copy_segment));
> > > > +copy.count = count;
> > > > +for (i = 0; i < count; i++)
> > > > +{
> > > > +copy.segments[i].flags = segs[i].flags;
> > > > +copy.segments[i].len = segs[i].len;
> > > > +if (segs[i].flags == GNTCOPY_dest_gref) 
> > > > +{
> > > > +copy.segments[i].dest.foreign.ref = 
> > > > segs[i].dest.foreign.ref;
> > > > +copy.segments[i].dest.foreign.domid = 
> > > > segs[i].dest.foreign.domid;
> > > > +copy.segments[i].dest.foreign.offset = 
> > > > segs[i].dest.foreign.offset;
> > > > +copy.segments[i].source.virt = segs[i].source.virt;
> > > > +} 
> > > > +else 
> > > > +{
> > > > +copy.segments[i].source.foreign.ref = 
> > > > segs[i].source.foreign.ref;
> > > > +copy.segments[i].source.foreign.domid = 
> > > > segs[i].source.foreign.domid;
> > > > +copy.segments[i].source.foreign.offset = 
> > > > segs[i].source.foreign.offset;
> > > > +copy.segments[i].dest.virt = segs[i].dest.virt;
> > > > +}
> > > > +}
> > > > +
> > > > > +rc = ioctl(fd, IOCTL_GNTDEV_GRANT_COPY, &copy);
> > > > +if (rc) 
> > > > +{
> > > > +GTERROR(xgt->logger, "ioctl GRANT COPY failed %d ", errno);
> > > > +}
> > > > +else 
> > > > +{
> > > > +for (i = 0; i < count; i++)
> > > > +segs[i].status = copy.segments[i].status;
> > > > +}
> > > > +
> > > > +free(copy.segments);
> > > > +return rc;
> > > > +}
> > > 
> > >  I know Wei asked for this but you've replaced what should be a single
> > >  pointer assignment with a memory allocation and two loops over all 
> > >  the
> > >  segments.
> > > 
> > >  This is a hot path and the two structures (the libxengnttab one and 
> > >  the
> > >  Linux kernel one) are both part of their respective ABIs and won't
> > >  change so Wei's concern that they might change in the future is 
> > >  unfounded.
> > > 
> > > >>>
> > > >>> The fundamental question is: will the ABI between the library and the
> > > >>> kernel ever go mismatch?
> > > >>>
> > > >>> My answer is "maybe".  My rationale is that everything goes across
> > > >>> boundary of components need to be considered with caution. And I tend 
> > > >>> to
> > > >>> assume the worst things will happen.
> > > >>>
> > > >>> To guarantee that they will never go mismatch is to have
> > > >>>
> > > >>>typedef ioctl_gntdev_grant_copy_segment 
> > > >>> xengnttab_grant_copy_segment_t;
> > > >>>
> > > >>> But that's not how the code is written.
> > > >>>
> > > >>> I would like to hear a third opinion. Is my concern unfounded? Am I 
> > > >>> too
> > > >>> cautious? Is there any compelling argument that I missed?
> > > >>>
> > > >>> Somewhat related, can we have some numbers please? It could well be 
> > > >>> the
> > > >>> cost of the two loops is much cheaper than whatever is going on inside
> > > >>> the kernel / hypervisor. And it could turn out that the numbers render
> > > >>> this issue moot.
> > > >>
> > > >> I did some (very) adhoc measurements and with the worst case of single
> > > >> short segments for each ioctl, the optimized version of
> > > >> osdep_gnttab_grant_copy() looks to be ~5% faster.
> > > >>
> > > >> This is enough of a difference that we should use the optimized 
> > > >> version.
> > > >>
> > > >> The unoptimized version also adds an 

Re: [Xen-devel] [PATCH v3 1/2] Interface for grant copy operation in libs.

2016-07-05 Thread George Dunlap
On Wed, Jun 22, 2016 at 3:52 PM, Wei Liu wrote:
>> I think the best solution is to allow the osdep code to provide the
>> implementation of xengnttab_grant_copy_segment_t, allowing the Linux
>> code to do:
>>
>> typedef ioctl_gntdev_grant_copy_segment xengnttab_grant_copy_segment_t
>>
>> You should still provide the generic structure as well, for those
>> platforms that don't provide their own optimized version.
>>
>
> We can't do that (yet). This means we open the door for divergence on
> different platforms.
>
> Basically this approach requires each platform to do the same thing
> (typedef) This implies any application that uses libxengnttab will need
> to test what platform it runs on. It is just pushing the issue somewhere
> else.
>
> Still, I think I would wait a bit for other people to weigh in because
> I'm not sure if my concern is wrong-headed.

I tend to be sympathetic to David's argument here.  The library has to
provide some ABI to callers; and it has to know the appropriate Linux
ABI in order to translate from the library ABI to the Linux ABI.  If
it happens to know these are the same, I don't see a reason not to
"translate" it just by casting the pointer.

If we want to declare the library ABI in a stand-alone fashion (i.e.,
instead of just doing a typedef, so that the library definition is the
same on all platforms), then having some compile-time checking to make
sure that the layouts of the two structures are identical makes sense.
Beyond that, I'm not sure what the extra copying really buys us.

 -George
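
A sketch of the cast-based "translation", assuming the layouts have been
verified at compile time. All types here are simplified stand-ins, and
fake_grant_copy_ioctl() is a hypothetical stub that plays the role of
the kernel; it writes through raw bytes, as code on the other side of
the ABI would, rather than through the second struct type:

```c
#include <stddef.h>   /* offsetof */
#include <stdint.h>
#include <string.h>   /* memcpy */

/* Assumed definition: a true condition yields a negative array size. */
#define BUILD_BUG_ON(cond) ((void)sizeof(char[1 - 2 * !!(cond)]))

/* Hypothetical, simplified library-side segment. */
struct lib_seg  { uint32_t ref; uint16_t len; int16_t status; };
/* Hypothetical, simplified kernel-side segment with the same layout. */
struct kern_seg { uint32_t ref; uint16_t len; int16_t status; };

struct kern_copy {
    uint32_t count;
    struct kern_seg *segments;
};

/*
 * Stub standing in for ioctl(fd, IOCTL_GNTDEV_GRANT_COPY, &copy).
 * It stores status 0 into every segment by byte offset.
 */
static int fake_grant_copy_ioctl(struct kern_copy *copy)
{
    unsigned char *base = (unsigned char *)copy->segments;
    const int16_t ok = 0;
    uint32_t i;

    for (i = 0; i < copy->count; i++)
        memcpy(base + i * sizeof(struct kern_seg)
                    + offsetof(struct kern_seg, status),
               &ok, sizeof(ok));
    return 0;
}

/* No allocation and no per-segment loop: the caller's array is passed on. */
int grant_copy_by_cast(uint32_t count, struct lib_seg *segs)
{
    struct kern_copy copy;

    /* Abbreviated guard; a full version would enumerate every field. */
    BUILD_BUG_ON(sizeof(struct lib_seg) != sizeof(struct kern_seg));
    BUILD_BUG_ON(offsetof(struct lib_seg, status) !=
                 offsetof(struct kern_seg, status));

    copy.count = count;
    copy.segments = (struct kern_seg *)segs;
    return fake_grant_copy_ioctl(&copy);
}
```

The status values land directly in the caller's array, so the second
loop that copies them back is not needed either.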



Re: [Xen-devel] [PATCH v3 1/2] Interface for grant copy operation in libs.

2016-06-22 Thread Wei Liu
On Wed, Jun 22, 2016 at 03:52:43PM +0100, Wei Liu wrote:
> On Wed, Jun 22, 2016 at 02:52:47PM +0100, David Vrabel wrote:
> > On 22/06/16 14:29, Wei Liu wrote:
> > > On Wed, Jun 22, 2016 at 01:37:50PM +0100, David Vrabel wrote:
> > >> On 22/06/16 12:21, Wei Liu wrote:
> > >>> On Wed, Jun 22, 2016 at 10:37:24AM +0100, David Vrabel wrote:
> >  On 22/06/16 09:38, Paulina Szubarczyk wrote:
> > > In a linux part an ioctl(gntdev, IOCTL_GNTDEV_GRANT_COPY, ..)
> > > system call is invoked. In mini-os the operation is yet not
> > > implemented. For other OSs there is a dummy implementation.
> >  [...]
> > > --- a/tools/libs/gnttab/linux.c
> > > +++ b/tools/libs/gnttab/linux.c
> > > @@ -235,6 +235,51 @@ int osdep_gnttab_unmap(xengnttab_handle *xgt,
> > >  return 0;
> > >  }
> > >  
> > > +int osdep_gnttab_grant_copy(xengnttab_handle *xgt,
> > > +uint32_t count,
> > > +xengnttab_grant_copy_segment_t *segs)
> > > +{
> > > +int i, rc;
> > > +int fd = xgt->fd;
> > > +struct ioctl_gntdev_grant_copy copy;
> > > +
> > > +copy.segments = calloc(count, sizeof(struct 
> > > ioctl_gntdev_grant_copy_segment));
> > > +copy.count = count;
> > > +for (i = 0; i < count; i++)
> > > +{
> > > +copy.segments[i].flags = segs[i].flags;
> > > +copy.segments[i].len = segs[i].len;
> > > +if (segs[i].flags == GNTCOPY_dest_gref) 
> > > +{
> > > +copy.segments[i].dest.foreign.ref = 
> > > segs[i].dest.foreign.ref;
> > > +copy.segments[i].dest.foreign.domid = 
> > > segs[i].dest.foreign.domid;
> > > +copy.segments[i].dest.foreign.offset = 
> > > segs[i].dest.foreign.offset;
> > > +copy.segments[i].source.virt = segs[i].source.virt;
> > > +} 
> > > +else 
> > > +{
> > > +copy.segments[i].source.foreign.ref = 
> > > segs[i].source.foreign.ref;
> > > +copy.segments[i].source.foreign.domid = 
> > > segs[i].source.foreign.domid;
> > > +copy.segments[i].source.foreign.offset = 
> > > segs[i].source.foreign.offset;
> > > +copy.segments[i].dest.virt = segs[i].dest.virt;
> > > +}
> > > +}
> > > +
> > > +rc = ioctl(fd, IOCTL_GNTDEV_GRANT_COPY, &copy);
> > > +if (rc) 
> > > +{
> > > +GTERROR(xgt->logger, "ioctl GRANT COPY failed %d ", errno);
> > > +}
> > > +else 
> > > +{
> > > +for (i = 0; i < count; i++)
> > > +segs[i].status = copy.segments[i].status;
> > > +}
> > > +
> > > +free(copy.segments);
> > > +return rc;
> > > +}
> > 
> >  I know Wei asked for this but you've replaced what should be a single
> >  pointer assignment with a memory allocation and two loops over all the
> >  segments.
> > 
> >  This is a hot path and the two structures (the libxengnttab one and the
> >  Linux kernel one) are both part of their respective ABIs and won't
> >  change so Wei's concern that they might change in the future is 
> >  unfounded.
> > 
> > >>>
> > >>> The fundamental question is: will the ABI between the library and the
> > >>> kernel ever go mismatch?
> > >>>
> > >>> My answer is "maybe".  My rationale is that everything goes across
> > >>> boundary of components need to be considered with caution. And I tend to
> > >>> assume the worst things will happen.
> > >>>
> > >>> To guarantee that they will never go mismatch is to have
> > >>>
> > >>>typedef ioctl_gntdev_grant_copy_segment 
> > >>> xengnttab_grant_copy_segment_t;
> > >>>
> > >>> But that's not how the code is written.
> > >>>
> > >>> I would like to hear a third opinion. Is my concern unfounded? Am I too
> > >>> cautious? Is there any compelling argument that I missed?
> > >>>
> > >>> Somewhat related, can we have some numbers please? It could well be the
> > >>> cost of the two loops is much cheaper than whatever is going on inside
> > >>> the kernel / hypervisor. And it could turn out that the numbers render
> > >>> this issue moot.
> > >>
> > >> I did some (very) adhoc measurements and with the worst case of single
> > >> short segments for each ioctl, the optimized version of
> > >> osdep_gnttab_grant_copy() looks to be ~5% faster.
> > >>
> > >> This is enough of a difference that we should use the optimized version.
> > >>
> > >> The unoptimized version also adds an additional failure path (the
> > >> calloc) which would be best avoided.
> > >>
> > > 
> > > Your test case includes a lot of  noise in libc allocator, so...
> > > 
> > > Can you give try the following patch (apply on top of Paulina's patch)?
> > > The basic idea is to provide scratch space for the structures. Note, the
> > > patch is 

Re: [Xen-devel] [PATCH v3 1/2] Interface for grant copy operation in libs.

2016-06-22 Thread Wei Liu
On Wed, Jun 22, 2016 at 02:52:47PM +0100, David Vrabel wrote:
> On 22/06/16 14:29, Wei Liu wrote:
> > On Wed, Jun 22, 2016 at 01:37:50PM +0100, David Vrabel wrote:
> >> On 22/06/16 12:21, Wei Liu wrote:
> >>> On Wed, Jun 22, 2016 at 10:37:24AM +0100, David Vrabel wrote:
>  On 22/06/16 09:38, Paulina Szubarczyk wrote:
> > In a linux part an ioctl(gntdev, IOCTL_GNTDEV_GRANT_COPY, ..)
> > system call is invoked. In mini-os the operation is yet not
> > implemented. For other OSs there is a dummy implementation.
>  [...]
> > --- a/tools/libs/gnttab/linux.c
> > +++ b/tools/libs/gnttab/linux.c
> > @@ -235,6 +235,51 @@ int osdep_gnttab_unmap(xengnttab_handle *xgt,
> >  return 0;
> >  }
> >  
> > +int osdep_gnttab_grant_copy(xengnttab_handle *xgt,
> > +uint32_t count,
> > +xengnttab_grant_copy_segment_t *segs)
> > +{
> > +int i, rc;
> > +int fd = xgt->fd;
> > +struct ioctl_gntdev_grant_copy copy;
> > +
> > +copy.segments = calloc(count, sizeof(struct 
> > ioctl_gntdev_grant_copy_segment));
> > +copy.count = count;
> > +for (i = 0; i < count; i++)
> > +{
> > +copy.segments[i].flags = segs[i].flags;
> > +copy.segments[i].len = segs[i].len;
> > +if (segs[i].flags == GNTCOPY_dest_gref) 
> > +{
> > +copy.segments[i].dest.foreign.ref = 
> > segs[i].dest.foreign.ref;
> > +copy.segments[i].dest.foreign.domid = 
> > segs[i].dest.foreign.domid;
> > +copy.segments[i].dest.foreign.offset = 
> > segs[i].dest.foreign.offset;
> > +copy.segments[i].source.virt = segs[i].source.virt;
> > +} 
> > +else 
> > +{
> > +copy.segments[i].source.foreign.ref = 
> > segs[i].source.foreign.ref;
> > +copy.segments[i].source.foreign.domid = 
> > segs[i].source.foreign.domid;
> > +copy.segments[i].source.foreign.offset = 
> > segs[i].source.foreign.offset;
> > +copy.segments[i].dest.virt = segs[i].dest.virt;
> > +}
> > +}
> > +
> > +rc = ioctl(fd, IOCTL_GNTDEV_GRANT_COPY, &copy);
> > +if (rc) 
> > +{
> > +GTERROR(xgt->logger, "ioctl GRANT COPY failed %d ", errno);
> > +}
> > +else 
> > +{
> > +for (i = 0; i < count; i++)
> > +segs[i].status = copy.segments[i].status;
> > +}
> > +
> > +free(copy.segments);
> > +return rc;
> > +}
> 
>  I know Wei asked for this but you've replaced what should be a single
>  pointer assignment with a memory allocation and two loops over all the
>  segments.
> 
>  This is a hot path and the two structures (the libxengnttab one and the
>  Linux kernel one) are both part of their respective ABIs and won't
>  change so Wei's concern that they might change in the future is 
>  unfounded.
> 
> >>>
> >>> The fundamental question is: will the ABI between the library and the
> >>> kernel ever go mismatch?
> >>>
> >>> My answer is "maybe".  My rationale is that everything goes across
> >>> boundary of components need to be considered with caution. And I tend to
> >>> assume the worst things will happen.
> >>>
> >>> To guarantee that they will never go mismatch is to have
> >>>
> >>>typedef ioctl_gntdev_grant_copy_segment xengnttab_grant_copy_segment_t;
> >>>
> >>> But that's not how the code is written.
> >>>
> >>> I would like to hear a third opinion. Is my concern unfounded? Am I too
> >>> cautious? Is there any compelling argument that I missed?
> >>>
> >>> Somewhat related, can we have some numbers please? It could well be the
> >>> cost of the two loops is much cheaper than whatever is going on inside
> >>> the kernel / hypervisor. And it could turn out that the numbers render
> >>> this issue moot.
> >>
> >> I did some (very) adhoc measurements and with the worst case of single
> >> short segments for each ioctl, the optimized version of
> >> osdep_gnttab_grant_copy() looks to be ~5% faster.
> >>
> >> This is enough of a difference that we should use the optimized version.
> >>
> >> The unoptimized version also adds an additional failure path (the
> >> calloc) which would be best avoided.
> >>
> > 
> > Your test case includes a lot of  noise in libc allocator, so...
> > 
> > Can you give try the following patch (apply on top of Paulina's patch)?
> > The basic idea is to provide scratch space for the structures. Note, the
> > patch is compile test only.
> [...]
> > +#define COPY_SEGMENT_CACHE_SIZE 1024
> 
> Arbitrary limit on number of segments.
> 
> > +copy.segments = xgt->osdep_data;
> 
> Not thread safe.
> 

Both issues are real, but this is just a gross hack to try to get some
numbers.

> I tried using alloca() which 

Re: [Xen-devel] [PATCH v3 1/2] Interface for grant copy operation in libs.

2016-06-22 Thread Paulina Szubarczyk
On Wed, 22 Jun 2016 12:24:16 +0100
Wei Liu wrote:

> On Wed, Jun 22, 2016 at 11:53:00AM +0200, Paulina Szubarczyk wrote:
> [...]
> > > I know Wei asked for this but you've replaced what should be a single
> > > pointer assignment with a memory allocation and two loops over all the
> > > segments.
> > > 
> > > This is a hot path and the two structures (the libxengnttab one and the
> > > Linux kernel one) are both part of their respective ABIs and won't
> > > change so Wei's concern that they might change in the future is unfounded.
> > > 
> > > This change makes xengnttab_grant_copy() useless for our (XenServer's)
> > > use case.
> > > 
> > > David
> > 
> > As Wei and Ian are maintainers of toolstack if they agree on the previous
> > cast that was here I will revert the changes.
> > 
> 
> Do you have the most up to date numbers? How do they compare to the
> numbers in previous version? If there is degradation, how big is that in
> terms of percentage?
> 
> Wei.
> 
The sheets named *-domU in the file [1] contain the results for the new
implementation, compared against the grant map implementation. I ran those
tests yesterday and today. I also rebased onto the newest staging before the
tests, and both the grant map and the grant copy implementations improved
compared to the previous results. That is why I would not compare against the
results from the previous test; instead I am going to rerun the test for the
implementation that casts the structures today.

However, a single test run takes around 90 minutes, since there is a 5 min
warm-up and then 1 min x 14 block sizes for each iodepth in [1, 4, 8, 64, 256].
And I usually ran them at least three times.

[1]https://docs.google.com/spreadsheets/d/1E6AMiB8ceJpExL6jWpH9u2yy6DZxzhmDUyFf-eUuJ0c/edit?usp=sharing

Paulina




Re: [Xen-devel] [PATCH v3 1/2] Interface for grant copy operation in libs.

2016-06-22 Thread David Vrabel
On 22/06/16 14:29, Wei Liu wrote:
> On Wed, Jun 22, 2016 at 01:37:50PM +0100, David Vrabel wrote:
>> On 22/06/16 12:21, Wei Liu wrote:
>>> On Wed, Jun 22, 2016 at 10:37:24AM +0100, David Vrabel wrote:
>>>> On 22/06/16 09:38, Paulina Szubarczyk wrote:
>>>>> In a linux part an ioctl(gntdev, IOCTL_GNTDEV_GRANT_COPY, ..)
>>>>> system call is invoked. In mini-os the operation is yet not
>>>>> implemented. For other OSs there is a dummy implementation.
>>>> [...]
>>>>> --- a/tools/libs/gnttab/linux.c
>>>>> +++ b/tools/libs/gnttab/linux.c
>>>>> @@ -235,6 +235,51 @@ int osdep_gnttab_unmap(xengnttab_handle *xgt,
>>>>>      return 0;
>>>>>  }
>>>>>  
>>>>> +int osdep_gnttab_grant_copy(xengnttab_handle *xgt,
>>>>> +                            uint32_t count,
>>>>> +                            xengnttab_grant_copy_segment_t *segs)
>>>>> +{
>>>>> +    int i, rc;
>>>>> +    int fd = xgt->fd;
>>>>> +    struct ioctl_gntdev_grant_copy copy;
>>>>> +
>>>>> +    copy.segments = calloc(count, sizeof(struct ioctl_gntdev_grant_copy_segment));
>>>>> +    copy.count = count;
>>>>> +    for (i = 0; i < count; i++)
>>>>> +    {
>>>>> +        copy.segments[i].flags = segs[i].flags;
>>>>> +        copy.segments[i].len = segs[i].len;
>>>>> +        if (segs[i].flags == GNTCOPY_dest_gref)
>>>>> +        {
>>>>> +            copy.segments[i].dest.foreign.ref = segs[i].dest.foreign.ref;
>>>>> +            copy.segments[i].dest.foreign.domid = segs[i].dest.foreign.domid;
>>>>> +            copy.segments[i].dest.foreign.offset = segs[i].dest.foreign.offset;
>>>>> +            copy.segments[i].source.virt = segs[i].source.virt;
>>>>> +        }
>>>>> +        else
>>>>> +        {
>>>>> +            copy.segments[i].source.foreign.ref = segs[i].source.foreign.ref;
>>>>> +            copy.segments[i].source.foreign.domid = segs[i].source.foreign.domid;
>>>>> +            copy.segments[i].source.foreign.offset = segs[i].source.foreign.offset;
>>>>> +            copy.segments[i].dest.virt = segs[i].dest.virt;
>>>>> +        }
>>>>> +    }
>>>>> +
>>>>> +    rc = ioctl(fd, IOCTL_GNTDEV_GRANT_COPY, &copy);
>>>>> +    if (rc)
>>>>> +    {
>>>>> +        GTERROR(xgt->logger, "ioctl GRANT COPY failed %d ", errno);
>>>>> +    }
>>>>> +    else
>>>>> +    {
>>>>> +        for (i = 0; i < count; i++)
>>>>> +            segs[i].status = copy.segments[i].status;
>>>>> +    }
>>>>> +
>>>>> +    free(copy.segments);
>>>>> +    return rc;
>>>>> +}
>>>>
>>>> I know Wei asked for this but you've replaced what should be a single
>>>> pointer assignment with a memory allocation and two loops over all the
>>>> segments.
>>>>
>>>> This is a hot path and the two structures (the libxengnttab one and the
>>>> Linux kernel one) are both part of their respective ABIs and won't
>>>> change so Wei's concern that they might change in the future is unfounded.

>>>
>>> The fundamental question is: will the ABI between the library and the
>>> kernel ever go mismatch?
>>>
>>> My answer is "maybe".  My rationale is that everything goes across
>>> boundary of components need to be considered with caution. And I tend to
>>> assume the worst things will happen.
>>>
>>> To guarantee that they will never go mismatch is to have
>>>
>>>typedef ioctl_gntdev_grant_copy_segment xengnttab_grant_copy_segment_t;
>>>
>>> But that's not how the code is written.
>>>
>>> I would like to hear a third opinion. Is my concern unfounded? Am I too
>>> cautious? Is there any compelling argument that I missed?
>>>
>>> Somewhat related, can we have some numbers please? It could well be the
>>> cost of the two loops is much cheaper than whatever is going on inside
>>> the kernel / hypervisor. And it could turn out that the numbers render
>>> this issue moot.
>>
>> I did some (very) adhoc measurements and with the worst case of single
>> short segments for each ioctl, the optimized version of
>> osdep_gnttab_grant_copy() looks to be ~5% faster.
>>
>> This is enough of a difference that we should use the optimized version.
>>
>> The unoptimized version also adds an additional failure path (the
>> calloc) which would be best avoided.
>>
> 
> Your test case includes a lot of  noise in libc allocator, so...
> 
> Can you give try the following patch (apply on top of Paulina's patch)?
> The basic idea is to provide scratch space for the structures. Note, the
> patch is compile test only.
[...]
> +#define COPY_SEGMENT_CACHE_SIZE 1024

Arbitrary limit on number of segments.

> +copy.segments = xgt->osdep_data;

Not thread safe.

I tried using alloca() which has <1% performance penalty but the failure
mode for alloca() is really bad so I would not recommend it.

I think the best solution is to allow the osdep code to provide the
implementation of xengnttab_grant_copy_segment_t, allowing the Linux
code to do:

typedef ioctl_gntdev_grant_copy_segment xengnttab_grant_copy_segment_t

You should still provide the generic structure as well, for those
platforms 

Re: [Xen-devel] [PATCH v3 1/2] Interface for grant copy operation in libs.

2016-06-22 Thread Wei Liu
On Wed, Jun 22, 2016 at 01:37:50PM +0100, David Vrabel wrote:
> On 22/06/16 12:21, Wei Liu wrote:
> > On Wed, Jun 22, 2016 at 10:37:24AM +0100, David Vrabel wrote:
> >> On 22/06/16 09:38, Paulina Szubarczyk wrote:
> >>> In a linux part an ioctl(gntdev, IOCTL_GNTDEV_GRANT_COPY, ..)
> >>> system call is invoked. In mini-os the operation is yet not
> >>> implemented. For other OSs there is a dummy implementation.
> >> [...]
> >>> --- a/tools/libs/gnttab/linux.c
> >>> +++ b/tools/libs/gnttab/linux.c
> >>> @@ -235,6 +235,51 @@ int osdep_gnttab_unmap(xengnttab_handle *xgt,
> >>>  return 0;
> >>>  }
> >>>  
> >>> +int osdep_gnttab_grant_copy(xengnttab_handle *xgt,
> >>> +uint32_t count,
> >>> +xengnttab_grant_copy_segment_t *segs)
> >>> +{
> >>> +int i, rc;
> >>> +int fd = xgt->fd;
> >>> +struct ioctl_gntdev_grant_copy copy;
> >>> +
> >>> +copy.segments = calloc(count, sizeof(struct 
> >>> ioctl_gntdev_grant_copy_segment));
> >>> +copy.count = count;
> >>> +for (i = 0; i < count; i++)
> >>> +{
> >>> +copy.segments[i].flags = segs[i].flags;
> >>> +copy.segments[i].len = segs[i].len;
> >>> +if (segs[i].flags == GNTCOPY_dest_gref) 
> >>> +{
> >>> +copy.segments[i].dest.foreign.ref = segs[i].dest.foreign.ref;
> >>> +copy.segments[i].dest.foreign.domid = 
> >>> segs[i].dest.foreign.domid;
> >>> +copy.segments[i].dest.foreign.offset = 
> >>> segs[i].dest.foreign.offset;
> >>> +copy.segments[i].source.virt = segs[i].source.virt;
> >>> +} 
> >>> +else 
> >>> +{
> >>> +copy.segments[i].source.foreign.ref = 
> >>> segs[i].source.foreign.ref;
> >>> +copy.segments[i].source.foreign.domid = 
> >>> segs[i].source.foreign.domid;
> >>> +copy.segments[i].source.foreign.offset = 
> >>> segs[i].source.foreign.offset;
> >>> +copy.segments[i].dest.virt = segs[i].dest.virt;
> >>> +}
> >>> +}
> >>> +
> >>> +rc = ioctl(fd, IOCTL_GNTDEV_GRANT_COPY, &copy);
> >>> +if (rc) 
> >>> +{
> >>> +GTERROR(xgt->logger, "ioctl GRANT COPY failed %d ", errno);
> >>> +}
> >>> +else 
> >>> +{
> >>> +for (i = 0; i < count; i++)
> >>> +segs[i].status = copy.segments[i].status;
> >>> +}
> >>> +
> >>> +free(copy.segments);
> >>> +return rc;
> >>> +}
> >>
> >> I know Wei asked for this but you've replaced what should be a single
> >> pointer assignment with a memory allocation and two loops over all the
> >> segments.
> >>
> >> This is a hot path and the two structures (the libxengnttab one and the
> >> Linux kernel one) are both part of their respective ABIs and won't
> >> change so Wei's concern that they might change in the future is unfounded.
> >>
> > 
> > The fundamental question is: will the ABI between the library and the
> > kernel ever go mismatch?
> > 
> > My answer is "maybe".  My rationale is that everything goes across
> > boundary of components need to be considered with caution. And I tend to
> > assume the worst things will happen.
> > 
> > To guarantee that they will never go mismatch is to have
> > 
> >typedef ioctl_gntdev_grant_copy_segment xengnttab_grant_copy_segment_t;
> > 
> > But that's not how the code is written.
> > 
> > I would like to hear a third opinion. Is my concern unfounded? Am I too
> > cautious? Is there any compelling argument that I missed?
> > 
> > Somewhat related, can we have some numbers please? It could well be the
> > cost of the two loops is much cheaper than whatever is going on inside
> > the kernel / hypervisor. And it could turn out that the numbers render
> > this issue moot.
> 
> I did some (very) adhoc measurements and with the worst case of single
> short segments for each ioctl, the optimized version of
> osdep_gnttab_grant_copy() looks to be ~5% faster.
> 
> This is enough of a difference that we should use the optimized version.
> 
> The unoptimized version also adds an additional failure path (the
> calloc) which would be best avoided.
> 

Your test case includes a lot of noise from the libc allocator, so...

Can you try the following patch (applied on top of Paulina's patch)?
The basic idea is to provide scratch space for the structures. Note, the
patch is compile-tested only.

---8<---
From e72c1abb9852f40db548ef208492c3283884 Mon Sep 17 00:00:00 2001
From: Wei Liu 
Date: Wed, 22 Jun 2016 14:22:48 +0100
Subject: [PATCH] xengnttab: provide osdep cache and use it in Linux grant copy

Signed-off-by: Wei Liu 
---
 tools/libs/gnttab/linux.c   | 35 +--
 tools/libs/gnttab/private.h |  2 ++
 2 files changed, 31 insertions(+), 6 deletions(-)

diff --git a/tools/libs/gnttab/linux.c b/tools/libs/gnttab/linux.c
index 62ad7bd..17d4d29 100644
--- a/tools/libs/gnttab/linux.c
+++ b/tools/libs/gnttab/linux.c
@@ -47,13 +47,28 @@
 

Re: [Xen-devel] [PATCH v3 1/2] Interface for grant copy operation in libs.

2016-06-22 Thread David Vrabel
On 22/06/16 12:21, Wei Liu wrote:
> On Wed, Jun 22, 2016 at 10:37:24AM +0100, David Vrabel wrote:
>> On 22/06/16 09:38, Paulina Szubarczyk wrote:
>>> In a linux part an ioctl(gntdev, IOCTL_GNTDEV_GRANT_COPY, ..)
>>> system call is invoked. In mini-os the operation is yet not
>>> implemented. For other OSs there is a dummy implementation.
>> [...]
>>> --- a/tools/libs/gnttab/linux.c
>>> +++ b/tools/libs/gnttab/linux.c
>>> @@ -235,6 +235,51 @@ int osdep_gnttab_unmap(xengnttab_handle *xgt,
>>>  return 0;
>>>  }
>>>  
>>> +int osdep_gnttab_grant_copy(xengnttab_handle *xgt,
>>> +uint32_t count,
>>> +xengnttab_grant_copy_segment_t *segs)
>>> +{
>>> +int i, rc;
>>> +int fd = xgt->fd;
>>> +struct ioctl_gntdev_grant_copy copy;
>>> +
>>> +copy.segments = calloc(count, sizeof(struct 
>>> ioctl_gntdev_grant_copy_segment));
>>> +copy.count = count;
>>> +for (i = 0; i < count; i++)
>>> +{
>>> +copy.segments[i].flags = segs[i].flags;
>>> +copy.segments[i].len = segs[i].len;
>>> +if (segs[i].flags == GNTCOPY_dest_gref) 
>>> +{
>>> +copy.segments[i].dest.foreign.ref = segs[i].dest.foreign.ref;
>>> +copy.segments[i].dest.foreign.domid = 
>>> segs[i].dest.foreign.domid;
>>> +copy.segments[i].dest.foreign.offset = 
>>> segs[i].dest.foreign.offset;
>>> +copy.segments[i].source.virt = segs[i].source.virt;
>>> +} 
>>> +else 
>>> +{
>>> +copy.segments[i].source.foreign.ref = 
>>> segs[i].source.foreign.ref;
>>> +copy.segments[i].source.foreign.domid = 
>>> segs[i].source.foreign.domid;
>>> +copy.segments[i].source.foreign.offset = 
>>> segs[i].source.foreign.offset;
>>> +copy.segments[i].dest.virt = segs[i].dest.virt;
>>> +}
>>> +}
>>> +
>>> +rc = ioctl(fd, IOCTL_GNTDEV_GRANT_COPY, &copy);
>>> +if (rc) 
>>> +{
>>> +GTERROR(xgt->logger, "ioctl GRANT COPY failed %d ", errno);
>>> +}
>>> +else 
>>> +{
>>> +for (i = 0; i < count; i++)
>>> +segs[i].status = copy.segments[i].status;
>>> +}
>>> +
>>> +free(copy.segments);
>>> +return rc;
>>> +}
>>
>> I know Wei asked for this but you've replaced what should be a single
>> pointer assignment with a memory allocation and two loops over all the
>> segments.
>>
>> This is a hot path and the two structures (the libxengnttab one and the
>> Linux kernel one) are both part of their respective ABIs and won't
>> change so Wei's concern that they might change in the future is unfounded.
>>
> 
> The fundamental question is: will the ABI between the library and the
> kernel ever go mismatch?
> 
> My answer is "maybe".  My rationale is that everything goes across
> boundary of components need to be considered with caution. And I tend to
> assume the worst things will happen.
> 
> To guarantee that they will never go mismatch is to have
> 
>typedef ioctl_gntdev_grant_copy_segment xengnttab_grant_copy_segment_t;
> 
> But that's not how the code is written.
> 
> I would like to hear a third opinion. Is my concern unfounded? Am I too
> cautious? Is there any compelling argument that I missed?
> 
> Somewhat related, can we have some numbers please? It could well be the
> cost of the two loops is much cheaper than whatever is going on inside
> the kernel / hypervisor. And it could turn out that the numbers render
> this issue moot.

I did some (very) adhoc measurements and with the worst case of single
short segments for each ioctl, the optimized version of
osdep_gnttab_grant_copy() looks to be ~5% faster.

This is enough of a difference that we should use the optimized version.

The unoptimized version also adds an additional failure path (the
calloc) which would be best avoided.

David



Re: [Xen-devel] [PATCH v3 1/2] Interface for grant copy operation in libs.

2016-06-22 Thread Wei Liu
On Wed, Jun 22, 2016 at 11:53:00AM +0200, Paulina Szubarczyk wrote:
[...]
> > I know Wei asked for this but you've replaced what should be a single
> > pointer assignment with a memory allocation and two loops over all the
> > segments.
> > 
> > This is a hot path and the two structures (the libxengnttab one and the
> > Linux kernel one) are both part of their respective ABIs and won't
> > change so Wei's concern that they might change in the future is unfounded.
> > 
> > This change makes xengnttab_grant_copy() useless for our (XenServer's)
> > use case.
> > 
> > David
> 
> As Wei and Ian are the toolstack maintainers, if they agree to the cast
> that was here previously I will revert the changes.
> 

Do you have the most up to date numbers? How do they compare to the
numbers in previous version? If there is degradation, how big is that in
terms of percentage?

Wei.

> Paulina



Re: [Xen-devel] [PATCH v3 1/2] Interface for grant copy operation in libs.

2016-06-22 Thread Wei Liu
On Wed, Jun 22, 2016 at 10:37:24AM +0100, David Vrabel wrote:
> On 22/06/16 09:38, Paulina Szubarczyk wrote:
> > On Linux, an ioctl(gntdev, IOCTL_GNTDEV_GRANT_COPY, ..)
> > system call is invoked. On mini-os the operation is not yet
> > implemented. For other OSs there is a dummy implementation.
> [...]
> > --- a/tools/libs/gnttab/linux.c
> > +++ b/tools/libs/gnttab/linux.c
> > @@ -235,6 +235,51 @@ int osdep_gnttab_unmap(xengnttab_handle *xgt,
> >  return 0;
> >  }
> >  
> > +int osdep_gnttab_grant_copy(xengnttab_handle *xgt,
> > +uint32_t count,
> > +xengnttab_grant_copy_segment_t *segs)
> > +{
> > +int i, rc;
> > +int fd = xgt->fd;
> > +struct ioctl_gntdev_grant_copy copy;
> > +
> > +copy.segments = calloc(count, sizeof(struct 
> > ioctl_gntdev_grant_copy_segment));
> > +copy.count = count;
> > +for (i = 0; i < count; i++)
> > +{
> > +copy.segments[i].flags = segs[i].flags;
> > +copy.segments[i].len = segs[i].len;
> > +if (segs[i].flags == GNTCOPY_dest_gref) 
> > +{
> > +copy.segments[i].dest.foreign.ref = segs[i].dest.foreign.ref;
> > +copy.segments[i].dest.foreign.domid = 
> > segs[i].dest.foreign.domid;
> > +copy.segments[i].dest.foreign.offset = 
> > segs[i].dest.foreign.offset;
> > +copy.segments[i].source.virt = segs[i].source.virt;
> > +} 
> > +else 
> > +{
> > +copy.segments[i].source.foreign.ref = 
> > segs[i].source.foreign.ref;
> > +copy.segments[i].source.foreign.domid = 
> > segs[i].source.foreign.domid;
> > +copy.segments[i].source.foreign.offset = 
> > segs[i].source.foreign.offset;
> > +copy.segments[i].dest.virt = segs[i].dest.virt;
> > +}
> > +}
> > +
> > > +rc = ioctl(fd, IOCTL_GNTDEV_GRANT_COPY, &copy);
> > +if (rc) 
> > +{
> > +GTERROR(xgt->logger, "ioctl GRANT COPY failed %d ", errno);
> > +}
> > +else 
> > +{
> > +for (i = 0; i < count; i++)
> > +segs[i].status = copy.segments[i].status;
> > +}
> > +
> > +free(copy.segments);
> > +return rc;
> > +}
> 
> I know Wei asked for this but you've replaced what should be a single
> pointer assignment with a memory allocation and two loops over all the
> segments.
> 
> This is a hot path and the two structures (the libxengnttab one and the
> Linux kernel one) are both part of their respective ABIs and won't
> change so Wei's concern that they might change in the future is unfounded.
> 

The fundamental question is: will the ABI between the library and the
kernel ever mismatch?

My answer is "maybe".  My rationale is that everything that crosses a
component boundary needs to be treated with caution. And I tend to
assume the worst will happen.

To guarantee that they never mismatch, we would need to have

   typedef ioctl_gntdev_grant_copy_segment xengnttab_grant_copy_segment_t;

But that's not how the code is written.

I would like to hear a third opinion. Is my concern unfounded? Am I too
cautious? Is there any compelling argument that I missed?

Somewhat related, can we have some numbers please? It could well be the
cost of the two loops is much cheaper than whatever is going on inside
the kernel / hypervisor. And it could turn out that the numbers render
this issue moot.
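
One possible middle ground is to keep two separate structures but have the
compiler verify that they are laid out identically. The following is only
a sketch under assumptions: `BUILD_BUG_ON` is defined locally here with
the negative-array-size trick, `kernel_seg` and `lib_seg` are illustrative
stand-ins for the two structures, and `__alignof__` is the gcc extension
(also supported by clang).

```c
#include <assert.h>
#include <stddef.h>   /* offsetof */
#include <stdint.h>

/* Hypothetical helper: compilation fails if 'cond' is true. */
#define BUILD_BUG_ON(cond) ((void)sizeof(char[1 - 2 * !!(cond)]))

/* Illustrative stand-in for the kernel ioctl structure. */
struct kernel_seg {
    union {
        void *virt;
        struct { uint32_t ref; uint16_t offset; uint16_t domid; } foreign;
    } source, dest;
    uint16_t len;
    uint16_t flags;
    int16_t status;
};

/* Illustrative stand-in for the library structure. */
struct lib_seg {
    union {
        void *virt;
        struct { uint32_t ref; uint16_t offset; uint16_t domid; } foreign;
    } source, dest;
    uint16_t len;
    uint16_t flags;
    int16_t status;
};

/* Returns 1; the real work happens at compile time. */
static inline int check_seg_layout(void)
{
    BUILD_BUG_ON(sizeof(struct kernel_seg) != sizeof(struct lib_seg));
    BUILD_BUG_ON(offsetof(struct kernel_seg, source) !=
                 offsetof(struct lib_seg, source));
    BUILD_BUG_ON(offsetof(struct kernel_seg, dest) !=
                 offsetof(struct lib_seg, dest));
    BUILD_BUG_ON(offsetof(struct kernel_seg, len) !=
                 offsetof(struct lib_seg, len));
    BUILD_BUG_ON(offsetof(struct kernel_seg, flags) !=
                 offsetof(struct lib_seg, flags));
    BUILD_BUG_ON(offsetof(struct kernel_seg, status) !=
                 offsetof(struct lib_seg, status));
    BUILD_BUG_ON(__alignof__(struct kernel_seg) !=
                 __alignof__(struct lib_seg));
    return 1;
}
```

If either structure gained, lost or reordered a field, one of these checks
would break the build instead of silently corrupting the ioctl argument.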

Wei.



Re: [Xen-devel] [PATCH v3 1/2] Interface for grant copy operation in libs.

2016-06-22 Thread Paulina Szubarczyk
On Wed, 22 Jun 2016 10:37:24 +0100
David Vrabel wrote:

> On 22/06/16 09:38, Paulina Szubarczyk wrote:
> > On Linux, an ioctl(gntdev, IOCTL_GNTDEV_GRANT_COPY, ..)
> > system call is invoked. On mini-os the operation is not yet
> > implemented. For other OSs there is a dummy implementation.
> [...]
> > --- a/tools/libs/gnttab/linux.c
> > +++ b/tools/libs/gnttab/linux.c
> > @@ -235,6 +235,51 @@ int osdep_gnttab_unmap(xengnttab_handle *xgt,
> >  return 0;
> >  }
> >  
> > +int osdep_gnttab_grant_copy(xengnttab_handle *xgt,
> > +uint32_t count,
> > +xengnttab_grant_copy_segment_t *segs)
> > +{
> > +int i, rc;
> > +int fd = xgt->fd;
> > +struct ioctl_gntdev_grant_copy copy;
> > +
> > +copy.segments = calloc(count, sizeof(struct
> > ioctl_gntdev_grant_copy_segment));
> > +copy.count = count;
> > +for (i = 0; i < count; i++)
> > +{
> > +copy.segments[i].flags = segs[i].flags;
> > +copy.segments[i].len = segs[i].len;
> > +if (segs[i].flags == GNTCOPY_dest_gref) 
> > +{
> > +copy.segments[i].dest.foreign.ref = segs[i].dest.foreign.ref;
> > +copy.segments[i].dest.foreign.domid =
> > segs[i].dest.foreign.domid;
> > +copy.segments[i].dest.foreign.offset =
> > segs[i].dest.foreign.offset;
> > +copy.segments[i].source.virt = segs[i].source.virt;
> > +} 
> > +else 
> > +{
> > +copy.segments[i].source.foreign.ref =
> > segs[i].source.foreign.ref;
> > +copy.segments[i].source.foreign.domid =
> > segs[i].source.foreign.domid;
> > +copy.segments[i].source.foreign.offset =
> > segs[i].source.foreign.offset;
> > +copy.segments[i].dest.virt = segs[i].dest.virt;
> > +}
> > +}
> > +
> > > +rc = ioctl(fd, IOCTL_GNTDEV_GRANT_COPY, &copy);
> > +if (rc) 
> > +{
> > +GTERROR(xgt->logger, "ioctl GRANT COPY failed %d ", errno);
> > +}
> > +else 
> > +{
> > +for (i = 0; i < count; i++)
> > +segs[i].status = copy.segments[i].status;
> > +}
> > +
> > +free(copy.segments);
> > +return rc;
> > +}
> 
> I know Wei asked for this but you've replaced what should be a single
> pointer assignment with a memory allocation and two loops over all the
> segments.
> 
> This is a hot path and the two structures (the libxengnttab one and the
> Linux kernel one) are both part of their respective ABIs and won't
> change so Wei's concern that they might change in the future is unfounded.
> 
> This change makes xengnttab_grant_copy() useless for our (XenServer's)
> use case.
> 
> David

As Wei and Ian are the toolstack maintainers, if they agree to the cast
that was here previously I will revert the changes.

Paulina



Re: [Xen-devel] [PATCH v3 1/2] Interface for grant copy operation in libs.

2016-06-22 Thread David Vrabel
On 22/06/16 09:38, Paulina Szubarczyk wrote:
> On Linux, an ioctl(gntdev, IOCTL_GNTDEV_GRANT_COPY, ..)
> system call is invoked. On mini-os the operation is not yet
> implemented. For other OSs there is a dummy implementation.
[...]
> --- a/tools/libs/gnttab/linux.c
> +++ b/tools/libs/gnttab/linux.c
> @@ -235,6 +235,51 @@ int osdep_gnttab_unmap(xengnttab_handle *xgt,
>  return 0;
>  }
>  
> +int osdep_gnttab_grant_copy(xengnttab_handle *xgt,
> +uint32_t count,
> +xengnttab_grant_copy_segment_t *segs)
> +{
> +int i, rc;
> +int fd = xgt->fd;
> +struct ioctl_gntdev_grant_copy copy;
> +
> +copy.segments = calloc(count, sizeof(struct 
> ioctl_gntdev_grant_copy_segment));
> +copy.count = count;
> +for (i = 0; i < count; i++)
> +{
> +copy.segments[i].flags = segs[i].flags;
> +copy.segments[i].len = segs[i].len;
> +if (segs[i].flags == GNTCOPY_dest_gref) 
> +{
> +copy.segments[i].dest.foreign.ref = segs[i].dest.foreign.ref;
> +copy.segments[i].dest.foreign.domid = segs[i].dest.foreign.domid;
> +copy.segments[i].dest.foreign.offset = 
> segs[i].dest.foreign.offset;
> +copy.segments[i].source.virt = segs[i].source.virt;
> +} 
> +else 
> +{
> +copy.segments[i].source.foreign.ref = segs[i].source.foreign.ref;
> +copy.segments[i].source.foreign.domid = 
> segs[i].source.foreign.domid;
> +copy.segments[i].source.foreign.offset = 
> segs[i].source.foreign.offset;
> +copy.segments[i].dest.virt = segs[i].dest.virt;
> +}
> +}
> +
> +rc = ioctl(fd, IOCTL_GNTDEV_GRANT_COPY, &copy);
> +if (rc) 
> +{
> +GTERROR(xgt->logger, "ioctl GRANT COPY failed %d ", errno);
> +}
> +else 
> +{
> +for (i = 0; i < count; i++)
> +segs[i].status = copy.segments[i].status;
> +}
> +
> +free(copy.segments);
> +return rc;
> +}

I know Wei asked for this but you've replaced what should be a single
pointer assignment with a memory allocation and two loops over all the
segments.

This is a hot path and the two structures (the libxengnttab one and the
Linux kernel one) are both part of their respective ABIs and won't
change so Wei's concern that they might change in the future is unfounded.

This change makes xengnttab_grant_copy() useless for our (XenServer's)
use case.

David



[Xen-devel] [PATCH v3 1/2] Interface for grant copy operation in libs.

2016-06-22 Thread Paulina Szubarczyk
On Linux, an ioctl(gntdev, IOCTL_GNTDEV_GRANT_COPY, ..)
system call is invoked. On mini-os the operation is not yet
implemented. For other OSs there is a dummy implementation.

Signed-off-by: Paulina Szubarczyk 
---
Changes since v2:
- dropped the changes in libxc/include/xenctrl_compat
- changed the MINOR version in Makefile
- replaced 'return -1' -> 'abort()' in libs/gnttab/gnttab_unimp.c
- moved the struct 'xengnttab_copy_grant_segment' to
  libs/gnttab/include/xengnttab.h
- added explicit assignment to ioctl_gntdev_grant_copy_segment
  to the linux part

 tools/include/xen-sys/Linux/gntdev.h  | 21 
 tools/libs/gnttab/Makefile|  2 +-
 tools/libs/gnttab/gnttab_core.c   |  6 +
 tools/libs/gnttab/gnttab_unimp.c  |  6 +
 tools/libs/gnttab/include/xengnttab.h | 28 ++
 tools/libs/gnttab/libxengnttab.map|  5 
 tools/libs/gnttab/linux.c | 45 +++
 tools/libs/gnttab/minios.c|  6 +
 tools/libs/gnttab/private.h   |  4 
 9 files changed, 122 insertions(+), 1 deletion(-)

diff --git a/tools/include/xen-sys/Linux/gntdev.h 
b/tools/include/xen-sys/Linux/gntdev.h
index caf6fb4..0ca07c9 100644
--- a/tools/include/xen-sys/Linux/gntdev.h
+++ b/tools/include/xen-sys/Linux/gntdev.h
@@ -147,4 +147,25 @@ struct ioctl_gntdev_unmap_notify {
 /* Send an interrupt on the indicated event channel */
 #define UNMAP_NOTIFY_SEND_EVENT 0x2
 
+struct ioctl_gntdev_grant_copy_segment {
+union {
+void *virt;
+struct {
+uint32_t ref;
+uint16_t offset;
+uint16_t domid;
+} foreign;
+} source, dest;
+uint16_t len;
+uint16_t flags;
+int16_t status;
+};
+
+#define IOCTL_GNTDEV_GRANT_COPY \
+_IOC(_IOC_NONE, 'G', 8, sizeof(struct ioctl_gntdev_grant_copy))
+struct ioctl_gntdev_grant_copy {
+unsigned int count;
+struct ioctl_gntdev_grant_copy_segment *segments;
+};
+
 #endif /* __LINUX_PUBLIC_GNTDEV_H__ */
diff --git a/tools/libs/gnttab/Makefile b/tools/libs/gnttab/Makefile
index af64542..95c2cd8 100644
--- a/tools/libs/gnttab/Makefile
+++ b/tools/libs/gnttab/Makefile
@@ -2,7 +2,7 @@ XEN_ROOT = $(CURDIR)/../../..
 include $(XEN_ROOT)/tools/Rules.mk
 
 MAJOR= 1
-MINOR= 0
+MINOR= 1
 SHLIB_LDFLAGS += -Wl,--version-script=libxengnttab.map
 
 CFLAGS   += -Werror -Wmissing-prototypes
diff --git a/tools/libs/gnttab/gnttab_core.c b/tools/libs/gnttab/gnttab_core.c
index 5d0474d..968c833 100644
--- a/tools/libs/gnttab/gnttab_core.c
+++ b/tools/libs/gnttab/gnttab_core.c
@@ -113,6 +113,12 @@ int xengnttab_unmap(xengnttab_handle *xgt, void 
*start_address, uint32_t count)
 return osdep_gnttab_unmap(xgt, start_address, count);
 }
 
+int xengnttab_grant_copy(xengnttab_handle *xgt,
+ uint32_t count,
+ xengnttab_grant_copy_segment_t *segs)
+{
+return osdep_gnttab_grant_copy(xgt, count, segs);
+}
 /*
  * Local variables:
  * mode: C
diff --git a/tools/libs/gnttab/gnttab_unimp.c b/tools/libs/gnttab/gnttab_unimp.c
index b3a4a20..829eced 100644
--- a/tools/libs/gnttab/gnttab_unimp.c
+++ b/tools/libs/gnttab/gnttab_unimp.c
@@ -78,6 +78,12 @@ int xengnttab_unmap(xengnttab_handle *xgt, void 
*start_address, uint32_t count)
 abort();
 }
 
+int xengnttab_grant_copy(xengnttab_handle *xgt,
+ uint32_t count,
+ xengnttab_grant_copy_segment_t *segs)
+{
+abort();
+}
 /*
  * Local variables:
  * mode: C
diff --git a/tools/libs/gnttab/include/xengnttab.h 
b/tools/libs/gnttab/include/xengnttab.h
index 0431dcf..949fd9e 100644
--- a/tools/libs/gnttab/include/xengnttab.h
+++ b/tools/libs/gnttab/include/xengnttab.h
@@ -258,6 +258,34 @@ int xengnttab_unmap(xengnttab_handle *xgt, void 
*start_address, uint32_t count);
 int xengnttab_set_max_grants(xengnttab_handle *xgt,
  uint32_t nr_grants);
 
+struct xengnttab_grant_copy_segment {
+union xengnttab_copy_ptr {
+void *virt;
+struct {
+uint32_t ref;
+uint16_t offset;
+uint16_t domid;
+} foreign;
+} source, dest;
+uint16_t len;
+uint16_t flags;
+int16_t status;
+};
+
+typedef struct xengnttab_grant_copy_segment xengnttab_grant_copy_segment_t;
+
+/**
+ * Copy memory from or to grant references. The information for each operation
+ * is contained in 'xengnttab_grant_copy_segment_t'. The @flags value indicates
+ * the direction of an operation (GNTCOPY_source_gref/GNTCOPY_dest_gref).
+ *
+ * The sum of the @offset and @len fields of each
+ * 'xengnttab_grant_copy_segment_t' should not exceed XEN_PAGE_SIZE.
+ */
+int xengnttab_grant_copy(xengnttab_handle *xgt,
+ uint32_t count,
+ xengnttab_grant_copy_segment_t *segs);
+
 /*
  * Grant Sharing Interface (allocating and granting pages to others)
  */
diff --git