[Qemu-block] [PATCH] block, migration: Use qemu_madvise inplace of madvise
To maintain consistency at all the places use qemu_madvise wrapper
in place of madvise call.

Signed-off-by: Pankaj Gupta
---
 block/qcow2-cache.c      | 2 +-
 migration/postcopy-ram.c | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/block/qcow2-cache.c b/block/qcow2-cache.c
index 1d25147..4991ca5 100644
--- a/block/qcow2-cache.c
+++ b/block/qcow2-cache.c
@@ -74,7 +74,7 @@ static void qcow2_cache_table_release(BlockDriverState *bs, Qcow2Cache *c,
     size_t offset = QEMU_ALIGN_UP((uintptr_t) t, align) - (uintptr_t) t;
     size_t length = QEMU_ALIGN_DOWN(mem_size - offset, align);
     if (length > 0) {
-        madvise((uint8_t *) t + offset, length, MADV_DONTNEED);
+        qemu_madvise((uint8_t *) t + offset, length, QEMU_MADV_DONTNEED);
     }
 #endif
 }
diff --git a/migration/postcopy-ram.c b/migration/postcopy-ram.c
index a40dddb..558fec1 100644
--- a/migration/postcopy-ram.c
+++ b/migration/postcopy-ram.c
@@ -213,7 +213,7 @@ int postcopy_ram_discard_range(MigrationIncomingState *mis, uint8_t *start,
                                size_t length)
 {
     trace_postcopy_ram_discard_range(start, length);
-    if (madvise(start, length, MADV_DONTNEED)) {
+    if (qemu_madvise(start, length, QEMU_MADV_DONTNEED)) {
         error_report("%s MADV_DONTNEED: %s", __func__, strerror(errno));
         return -1;
     }
--
2.7.4
Re: [Qemu-block] [PATCH] block, migration: Use qemu_madvise inplace of madvise
On 17.02.2017 at 09:06, Pankaj Gupta wrote:
> To maintain consistency at all the places use qemu_madvise wrapper
> in place of madvise call.
>
> Signed-off-by: Pankaj Gupta

Reviewed-by: Kevin Wolf

Juan/Dave, if one of you can give an Acked-by, I can take this through
my tree.

Kevin
Re: [Qemu-block] [PATCH] block, migration: Use qemu_madvise inplace of madvise
* Kevin Wolf (kw...@redhat.com) wrote:
> On 17.02.2017 at 09:06, Pankaj Gupta wrote:
> > To maintain consistency at all the places use qemu_madvise wrapper
> > in place of madvise call.
> >
> > Signed-off-by: Pankaj Gupta
>
> Reviewed-by: Kevin Wolf
>
> Juan/Dave, if one of you can give an Acked-by, I can take this through
> my tree.

NACK

That's wrong; qemu_madvise can end up going through posix_madvise and
using POSIX_MADV_DONTNEED, which has different semantics from
madvise(MADV_DONTNEED). We need the semantics of madvise: it is
guaranteed to throw away the pages, whereas posix_madvise *may* throw
away the pages if the kernel feels like it.

Dave

> Kevin
--
Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK
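The difference Dave describes can be observed on Linux with a small sketch. It assumes a private anonymous mapping; the helper name first_byte_after_discard() is hypothetical and does not come from the QEMU tree.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE /* exposes madvise() and MAP_ANONYMOUS on glibc */
#endif
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Hypothetical helper (not from QEMU): fill one private anonymous page
 * with 0xAB, advise the kernel with either posix_madvise() or madvise(),
 * then return the first byte of the page, or -1 on error.
 */
int first_byte_after_discard(int use_posix)
{
    long page = sysconf(_SC_PAGESIZE);
    if (page <= 0) {
        return -1;
    }
    size_t len = (size_t) page;

    void *map = mmap(NULL, len, PROT_READ | PROT_WRITE,
                     MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (map == MAP_FAILED) {
        return -1;
    }
    memset(map, 0xAB, len);

    /* posix_madvise() returns an error number; madvise() returns -1. */
    int rc = use_posix ? posix_madvise(map, len, POSIX_MADV_DONTNEED)
                       : madvise(map, len, MADV_DONTNEED);
    if (rc != 0) {
        munmap(map, len);
        return -1;
    }

    /* Read through a volatile pointer so the load is not optimised away. */
    volatile unsigned char *p = map;
    int first = p[0];

    munmap(map, len);
    return first;
}
```

On Linux, first_byte_after_discard(0) reads back 0, because the page was dropped and is refilled with zeroes on the next access. With use_posix set, the result depends on the C library: glibc has treated POSIX_MADV_DONTNEED as a no-op since version 2.6, so the 0xAB contents normally survive there, but that is not something callers should rely on either way.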
Re: [Qemu-block] [PATCH] block, migration: Use qemu_madvise inplace of madvise
On Fri 17 Feb 2017 09:06:04 AM CET, Pankaj Gupta wrote:
> To maintain consistency at all the places use qemu_madvise wrapper
> in place of madvise call.

>      if (length > 0) {
> -        madvise((uint8_t *) t + offset, length, MADV_DONTNEED);
> +        qemu_madvise((uint8_t *) t + offset, length, QEMU_MADV_DONTNEED);

This was changed two months ago from qemu_madvise() to madvise(). Is
there any reason why you want to revert that change? Those two calls are
not equivalent; please see commit 2f2c8d6b371cfc6689affb0b7e for an
explanation.

> -    if (madvise(start, length, MADV_DONTNEED)) {
> +    if (qemu_madvise(start, length, QEMU_MADV_DONTNEED)) {
>          error_report("%s MADV_DONTNEED: %s", __func__, strerror(errno));

And this is the same case.

Berto
Re: [Qemu-block] [PATCH] block, migration: Use qemu_madvise inplace of madvise
Thanks for your comments. I have a question below.

> On Fri 17 Feb 2017 09:06:04 AM CET, Pankaj Gupta wrote:
> > To maintain consistency at all the places use qemu_madvise wrapper
> > in place of madvise call.
>
> >      if (length > 0) {
> > -        madvise((uint8_t *) t + offset, length, MADV_DONTNEED);
> > +        qemu_madvise((uint8_t *) t + offset, length, QEMU_MADV_DONTNEED);
>
> This was changed two months ago from qemu_madvise() to madvise(), is
> there any reason why you want to revert that change? Those two calls are
> not equivalent, please see commit 2f2c8d6b371cfc6689affb0b7e for an
> explanation.
>
> > -    if (madvise(start, length, MADV_DONTNEED)) {
> > +    if (qemu_madvise(start, length, QEMU_MADV_DONTNEED)) {
> >          error_report("%s MADV_DONTNEED: %s", __func__, strerror(errno));

I only checked the history of the change related to 'postcopy'.

On my Linux machine, ./config-host.mak contains:

CONFIG_MADVISE=y
CONFIG_POSIX_MADVISE=y

As both of these options are set on Linux, every call to 'qemu_madvise'
is compiled to "madvise(addr, len, advice);". I don't understand why
'2f2c8d6b371cfc6689affb0b7e' explicitly changed this to use
"#ifdef CONFIG_LINUX". I think it's better to write a generic function,
maybe in a wrapper, than to conditionally set something in different
places.

int qemu_madvise(void *addr, size_t len, int advice)
{
    if (advice == QEMU_MADV_INVALID) {
        errno = EINVAL;
        return -1;
    }
#if defined(CONFIG_MADVISE)
    return madvise(addr, len, advice);
#elif defined(CONFIG_POSIX_MADVISE)
    return posix_madvise(addr, len, advice);
#else
    errno = EINVAL;
    return -1;
#endif
}

> And this is the same case.
>
> Berto
Re: [Qemu-block] [PATCH] block, migration: Use qemu_madvise inplace of madvise
* Pankaj Gupta (pagu...@redhat.com) wrote:
>
> Thanks for your comments. I have a question below.
>
> > On Fri 17 Feb 2017 09:06:04 AM CET, Pankaj Gupta wrote:
> > > To maintain consistency at all the places use qemu_madvise wrapper
> > > in place of madvise call.
> >
> > >      if (length > 0) {
> > > -        madvise((uint8_t *) t + offset, length, MADV_DONTNEED);
> > > +        qemu_madvise((uint8_t *) t + offset, length, QEMU_MADV_DONTNEED);
> >
> > This was changed two months ago from qemu_madvise() to madvise(), is
> > there any reason why you want to revert that change? Those two calls are
> > not equivalent, please see commit 2f2c8d6b371cfc6689affb0b7e for an
> > explanation.
> >
> > > -    if (madvise(start, length, MADV_DONTNEED)) {
> > > +    if (qemu_madvise(start, length, QEMU_MADV_DONTNEED)) {
> > >          error_report("%s MADV_DONTNEED: %s", __func__, strerror(errno));
>
> I only checked the history of the change related to 'postcopy'.
>
> On my Linux machine, ./config-host.mak contains:
>
> CONFIG_MADVISE=y
> CONFIG_POSIX_MADVISE=y
>
> As both of these options are set on Linux, every call to 'qemu_madvise'
> is compiled to "madvise(addr, len, advice);". I don't understand why
> '2f2c8d6b371cfc6689affb0b7e' explicitly changed this to use
> "#ifdef CONFIG_LINUX". I think it's better to write a generic function,
> maybe in a wrapper, than to conditionally set something in different
> places.

No; the problem is that the behaviours are different.

You're right that the current build on Linux defines CONFIG_MADVISE, and
thus we are safe because qemu_madvise takes the CONFIG_MADVISE/madvise
route. But we need to be explicit that it's only the madvise() route
that's safe, not any of the calls implemented by qemu_madvise, because
if in the future someone were to rearrange qemu_madvise to prefer
posix_madvise, postcopy would break in a very subtle way.

IMHO it might even be better to remove the definition of
QEMU_MADV_DONTNEED altogether and pick a name that isn't ambiguous
between the two, since the POSIX definition is so different.

Dave

> int qemu_madvise(void *addr, size_t len, int advice)
> {
>     if (advice == QEMU_MADV_INVALID) {
>         errno = EINVAL;
>         return -1;
>     }
> #if defined(CONFIG_MADVISE)
>     return madvise(addr, len, advice);
> #elif defined(CONFIG_POSIX_MADVISE)
>     return posix_madvise(addr, len, advice);
> #else
>     errno = EINVAL;
>     return -1;
> #endif
> }
>
> > And this is the same case.
> >
> > Berto
>
--
Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK
Re: [Qemu-block] [PATCH] block, migration: Use qemu_madvise inplace of madvise
On Fri 17 Feb 2017 12:30:28 PM CET, Pankaj Gupta wrote:
>> > To maintain consistency at all the places use qemu_madvise wrapper
>> > in place of madvise call.
>>
>> > -madvise((uint8_t *) t + offset, length, MADV_DONTNEED);
>> > +qemu_madvise((uint8_t *) t + offset, length, QEMU_MADV_DONTNEED);
>>
>> Those two calls are not equivalent, please see commit
>> 2f2c8d6b371cfc6689affb0b7e for an explanation.

> I don't understand why '2f2c8d6b371cfc6689affb0b7e' explicitly changed
> this to use "#ifdef CONFIG_LINUX". I think it's better to write a
> generic function, maybe in a wrapper, than to conditionally set
> something in different places.

The problem with qemu_madvise(QEMU_MADV_DONTNEED) is that it can mean
different things depending on the platform:

   posix_madvise(POSIX_MADV_DONTNEED)
   madvise(MADV_DONTNEED)

The first call is standard, but it doesn't do what we need, so we cannot
use it.

The second call, madvise(MADV_DONTNEED), is not standard, and it doesn't
behave the same way on all platforms. The only platform on which it does
what we need is Linux, hence the #ifdef CONFIG_LINUX and
#if defined(__linux__) that you see in the code.

I agree with David's comment that maybe it's better to remove
QEMU_MADV_DONTNEED altogether since it's not reliable.

Berto
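One way to picture the direction Dave and Berto suggest is a helper whose name and body leave no room for the posix_madvise route. This is only a sketch under assumptions: discard_pages_strict() is a made-up name, it is not the code from commit 2f2c8d6b371cfc6689affb0b7e, and returning ENOTSUP on other platforms is an arbitrary choice for illustration.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE /* exposes madvise() on glibc */
#endif
#include <errno.h>
#include <stddef.h>
#if defined(__linux__)
#include <sys/mman.h>
#endif

/*
 * Hypothetical helper (not part of QEMU): discard the pages in
 * [addr, addr + len) with the guaranteed Linux semantics, and fail
 * loudly on platforms where no such guarantee exists.
 * Returns 0 on success, -1 with errno set on failure.
 */
int discard_pages_strict(void *addr, size_t len)
{
#if defined(__linux__)
    /* Only Linux madvise(MADV_DONTNEED) is known to drop the pages. */
    return madvise(addr, len, MADV_DONTNEED);
#else
    /* Never fall back to posix_madvise(): it is only a hint. */
    (void) addr;
    (void) len;
    errno = ENOTSUP;
    return -1;
#endif
}
```

A caller such as a postcopy discard path would then check the return value and report the error, instead of silently getting hint-only behaviour on a platform where the wrapper picked a different call.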