On Thu, 2019-01-10 at 12:26 +1100, Dave Chinner wrote:
> On Wed, Jan 09, 2019 at 08:17:31PM +0530, Pankaj Gupta wrote:
> > This patch series has an implementation of "virtio pmem".
> > "virtio pmem" is fake persistent memory (nvdimm) in the guest
> > which allows bypassing the guest page cache. This
On Tue, 2017-11-21 at 10:26 -0800, Dan Williams wrote:
> On Tue, Nov 21, 2017 at 10:19 AM, Rik van Riel
> wrote:
> > On Fri, 2017-11-03 at 14:21 +0800, Xiao Guangrong wrote:
> > > On 11/03/2017 12:30 AM, Dan Williams wrote:
> > > >
> > > > Good point,
On Fri, 2017-11-03 at 14:21 +0800, Xiao Guangrong wrote:
> On 11/03/2017 12:30 AM, Dan Williams wrote:
> >
> > Good point, I was assuming that the mmio flush interface would be
> > discovered separately from the NFIT-defined memory range. Perhaps
> > via
> > PCI in the guest? This piece of the pro
On Thu, 2017-10-12 at 18:18 -0400, Pankaj Gupta wrote:
> >
> > On Thu, Oct 12, 2017 at 2:25 PM, Pankaj Gupta
> > wrote:
> > >
> > > > > This patch adds virtio-pmem driver for KVM guest.
> > > > > Guest reads the persistent memory range information
> > > > > over the virtio bus from Qemu and re
On Wed, 2017-07-26 at 14:40 -0700, Dan Williams wrote:
> On Wed, Jul 26, 2017 at 2:27 PM, Rik van Riel
> wrote:
> > On Wed, 2017-07-26 at 09:47 -0400, Pankaj Gupta wrote:
> > > >
> > >
> > > Just want to summarize here (high level):
> > >
On Wed, 2017-07-26 at 09:47 -0400, Pankaj Gupta wrote:
> >
> Just want to summarize here (high level):
>
> This will require implementing a new 'virtio-pmem' device which
> presents a DAX address range (like pmem) to the guest with
> read/write (direct access) & device flush functionality. Also, qemu
On Tue, 2017-07-25 at 07:46 -0700, Dan Williams wrote:
> On Tue, Jul 25, 2017 at 7:27 AM, Pankaj Gupta
> wrote:
> >
> > Looks like the only way to send a flush (blk dev) from guest to
> > host with nvdimm is using flush hint addresses. Is this the
> > correct interface I am looking at?
> >
> > blkdev_
On Sun, 2017-07-23 at 09:01 -0700, Dan Williams wrote:
> [ adding Ross and Jan ]
>
> On Sun, Jul 23, 2017 at 7:04 AM, Rik van Riel
> wrote:
> >
> > The goal is to increase the density of guests by moving the page
> > cache into the host (where it can be easily reclaim
On Sat, 2017-07-22 at 12:34 -0700, Dan Williams wrote:
> On Fri, Jul 21, 2017 at 8:58 AM, Stefan Hajnoczi wrote:
> >
> > Maybe the NVDIMM folks can comment on this idea.
>
> I think it's unworkable to use the flush hints as a guest-to-host
> fsync mechanism. That mechanism was designed to flush
On Fri, 2017-07-21 at 09:29 -0400, Pankaj Gupta wrote:
> > >
> > > - A flush hint address traps from guest to host and does an
> > > entire fsync
> > > on the backing file, which itself is costly.
> > >
> > > - Can be used to flush specific pages on the host backing disk.
> > > We can
> > >
On Tue, 2017-06-20 at 21:26 +0300, Michael S. Tsirkin wrote:
> On Tue, Jun 20, 2017 at 01:29:00PM -0400, Rik van Riel wrote:
> > I agree with that. Let me go into some more detail of
> > what Nitesh is implementing:
> >
> > 1) In arch_free_page, the being-freed page i
On Tue, 2017-06-20 at 18:49 +0200, David Hildenbrand wrote:
> On 20.06.2017 18:44, Rik van Riel wrote:
> > Nitesh Lal (on the CC list) is working on a way
> > to efficiently batch recently freed pages for
> > free page hinting to the hypervisor.
> >
> > If th
On Mon, 2017-06-12 at 07:10 -0700, Dave Hansen wrote:
> The hypervisor is going to throw away the contents of these pages,
> right? As soon as the spinlock is released, someone can allocate a
> page, and put good data in it. What keeps the hypervisor from
> throwing
> away good data?
That looks
On Thu, 2017-05-11 at 14:17 -0400, Stefan Hajnoczi wrote:
> On Wed, May 10, 2017 at 09:26:00PM +0530, Pankaj Gupta wrote:
> > * For live migration use case, if host side backing file is
> > shared storage, we need to flush the page cache for the disk
> > image at the destination (new fadvise
X" device flushing' project for feedback.
> > > Got the idea during discussion with 'Rik van Riel'.
> >
> > CCing NVDIMM folks.
> >
> > >
> > > Also, request answers to 'Questions' section.
> > >
> > > Abstr
On Mon, 2017-02-27 at 11:10 +, Stefan Hajnoczi wrote:
> On Thu, Feb 23, 2017 at 10:59:22AM +, Daniel P. Berrange wrote:
> > When using a memory-backend object with prealloc turned on, QEMU
> > will memset() the first byte in every memory page to zero. While
> > this might have been acceptab
On Wed, 2016-04-20 at 13:46 +0200, Kevin Wolf wrote:
> Am 20.04.2016 um 12:40 hat Ric Wheeler geschrieben:
> >
> > On 04/20/2016 05:24 AM, Kevin Wolf wrote:
> > >
> > > Am 20.04.2016 um 03:56 hat Ric Wheeler geschrieben:
> > > >
> > > > On 04/19/2016 10:09 AM, Jeff Cody wrote:
> > > > >
> > > >
> > > > > here at LSF (the kernel summit for file and storage people)
> > > > > and got
> > > > > a non-public confirmation that individual storage devices (s-
> > > > > ata
> > > > > drives or scsi) can also dump cache state w
On Tue, 2016-04-19 at 15:02 +, Li, Liang Z wrote:
> >
> > On Tue, 2016-04-19 at 22:34 +0800, Liang Li wrote:
> > >
> > > The free page bitmap will be sent to QEMU through virtio
> > > interface and
> > > used for live migration optimization.
> > > Dropping the cache before building the free page
On Tue, 2016-04-19 at 22:34 +0800, Liang Li wrote:
> The free page bitmap will be sent to QEMU through virtio interface
> and used for live migration optimization.
> Dropping the cache before building the free page bitmap can yield
> more free pages. Whether to drop the cache is decided by the user.
>
How
On Wed, 2016-03-09 at 20:04 +0300, Roman Kagan wrote:
> On Wed, Mar 09, 2016 at 05:41:39PM +0200, Michael S. Tsirkin wrote:
> > On Wed, Mar 09, 2016 at 05:28:54PM +0300, Roman Kagan wrote:
> > > For (1) I've been trying to make a point that skipping clean
> > > pages is
> > > much more likely to re
the same fd.
>
> Naturally, this makes the guard page bigger with hugetlbfs.
>
> Based on patch by Greg Kurz.
>
> Cc: Rik van Riel
> CC: Greg Kurz
> Signed-off-by: Michael S. Tsirkin
Acked-by: Rik van Riel
--
All rights reversed
ization in
11feeb498086a3a5907b8148bdf1786a9b18fc55. The cacheline was already
modified in order to set PG_tail so this won't affect the boot time of
large memory systems.
Reported-by: andy123
Signed-off-by: Andrea Arcangeli
Acked-by: Rik van Riel
possible that more pages than necessary are isolated but the check
still fails and I missed that this fix was not picked up before RC1. This
same problem has been identified in 3.7-RC1 by Tony Prisk and should be
addressed by the following patch.
Signed-off-by: Mel Gorman
Tested-by: Tony Prisk
On 09/21/2012 06:46 AM, Mel Gorman wrote:
Hi Andrew,
Richard Davies and Shaohua Li have both reported lock contention
problems in compaction on the zone and LRU locks as well as
significant amounts of time being spent in compaction. This series
aims to reduce lock contention and scanning rates t
e we cycle through more
slowly can continue, even when this particular
zone is experiencing problems, so I guess this
is desired behaviour...
Acked-by: Rik van Riel
en to ignore the cached
information?" If it's ignored too often, the scanning rates will still
be excessive. If the information is too stale then allocations will fail
that might have otherwise succeeded. In this patch
Big hammer, but I guess it is effective...
Acked-by: Rik van Riel
r next patches easier...
Acked-by: Rik van Riel
ere are no free pages in the pageblock then the lock will not be
acquired at all which reduces contention on zone->lock.
Signed-off-by: Mel Gorman
Acked-by: Rik van Riel
uge then the LRU lock will not be acquired at all
which reduces contention on zone->lru_lock.
Signed-off-by: Mel Gorman
Acked-by: Rik van Riel
ock is even contended.
[minc...@kernel.org: Putback pages isolated for migration if aborting]
[a...@linux-foundation.org: compact_zone_order requires non-NULL arg contended]
Signed-off-by: Andrea Arcangeli
Signed-off-by: Shaohua Li
Signed-off-by: Mel Gorman
Acked-by: Rik van Riel
On 09/15/2012 11:55 AM, Richard Davies wrote:
Hi Rik, Mel and Shaohua,
Thank you for your latest patches. I attach my latest perf report for a slow
boot with all of these applied.
Mel asked for timings of the slow boots. It's very hard to give anything
useful here! A normal boot would be a minu
ne. This can lead to less
efficient compaction when one thread has wrapped around to the
end of the zone, and another simultaneous compactor has not done
so yet. However, it should ensure that we do not suffer quadratic
behaviour any more.
Signed-off-by: Rik van Riel
Reported-by: Richard Davies
di
ppears to
have re-introduced quadratic behaviour in that the value
of zone->compact_cached_free_pfn is never advanced until
the compaction run wraps around the start of the zone.
This merely moved the starting point for the quadratic behaviour
further into the zone, but the behaviour has still been o
re that we do not suffer quadratic
behaviour any more.
Signed-off-by: Rik van Riel
Reported-by: Richard Davies
diff --git a/mm/compaction.c b/mm/compaction.c
index 771775d..0656759 100644
--- a/mm/compaction.c
+++ b/mm/compaction.c
@@ -431,6 +431,24 @@ static bool suitable_migration_target(s
On 08/25/2012 01:45 PM, Richard Davies wrote:
Are you talking about these patches?
http://git.kernel.org/?p=linux/kernel/git/torvalds/linux.git;a=commit;h=c67fe3752abe6ab47639e2f9b836900c3dc3da84
http://marc.info/?l=linux-mm&m=134521289221259
If so, I believe those are in 3.6.0-rc3, so I teste
On 08/22/2012 10:41 AM, Richard Davies wrote:
Avi Kivity wrote:
Richard Davies wrote:
I can trigger the slow boots without KSM and they have the same profile,
with _raw_spin_lock_irqsave and isolate_freepages_block at the top.
I reduced to 3x 20GB 8-core VMs on a 128GB host (rather than 3x 40G
On 03/09/2012 01:27 PM, Liu, Jinsong wrote:
As for exposing the 'tsc deadline' feature, my patch (as attached) just
follows qemu's general cpuid exposing method, and I think it also
satisfies your goal.
One question.
Why is TSC_DEADLINE not exposed in the cpuid allowed feature
bits in do_cpuid_ent() i
may be outdated.
Ohhh, nice catch.
Signed-off-by: Gleb Natapov
Acked-by: Rik van Riel
--
All rights reversed
On 12/01/2010 02:35 PM, Peter Zijlstra wrote:
On Wed, 2010-12-01 at 14:24 -0500, Rik van Riel wrote:
Even if we equalized the amount of CPU time each VCPU
ends up getting across some time interval, that is no
guarantee they get useful work done, or that the time
gets fairly divided to _user
On 12/01/2010 02:07 PM, Peter Zijlstra wrote:
On Wed, 2010-12-01 at 12:26 -0500, Rik van Riel wrote:
On 12/01/2010 12:22 PM, Peter Zijlstra wrote:
The pause loop exiting& directed yield patches I am working on
preserve inter-vcpu fairness by round robining among the vcpus
inside one
On 12/01/2010 12:22 PM, Peter Zijlstra wrote:
On Wed, 2010-12-01 at 09:17 -0800, Chris Wright wrote:
Directed yield and fairness don't mix well either. You can end up
feeding the other tasks more time than you'll ever get back.
If the directed yield is always to another task in your cgroup the
Paul Brook wrote:
On Saturday 29 July 2006 15:59, Rik van Riel wrote:
Fabrice Bellard wrote:
Hi,
Using O_SYNC for disk image access is not acceptable: QEMU relies on the
host OS to ensure that the data is written correctly.
This means that write ordering is not preserved, and on a power
Fabrice Bellard wrote:
Hi,
Using O_SYNC for disk image access is not acceptable: QEMU relies on the
host OS to ensure that the data is written correctly.
This means that write ordering is not preserved, and on a power
failure any data written by qemu (or Xen fully virt) guests may
not be pres
Paul Brook wrote:
With a proper async API, is there any reason why we would want this to be
tunable? I don't think there's much of a benefit of prematurely claiming
a write is complete especially once the SCSI emulation can support
multiple simultaneous requests.
You're right. This O_SYNC band
Anthony Liguori wrote:
Right now Fabrice is working on rewriting the block API to be
asynchronous. There's been quite a lot of discussion about why using
threads isn't a good idea for this
Agreed, AIO is the way to go in the long run.
With a proper async API, is there any reason why we woul
Rik van Riel wrote:
This is the simple approach to making sure that disk writes actually
hit disk before we tell the guest OS that IO has completed. Thanks
to DMA_MULTI_THREAD the performance still seems to be adequate.
Hah, and of course that bit is only found in Xen's qemu-dm. Doh!
I
on should make the performance overhead
of synchronous writes bearable, or at least comparable to native
hardware.
Signed-off-by: Rik van Riel <[EMAIL PROTECTED]>
--- xen-unstable-10712/tools/ioemu/block-bochs.c.osync 2006-07-28 02:15:56.0 -0400
+++ xen-unstable-10712/tools/ioemu/blo