Christoph Hellwig writes:
> Hi Ying,
>
>> Any update to this regression?
>
> Not really. We've optimized everything we could in XFS without
> dropping the architecture that we really want to move to. Now we're
> waiting for some MM behavior to be fixed that this uncovered. But
> in the end wi
Hi Ying,
> Any update to this regression?
Not really. We've optimized everything we could in XFS without
dropping the architecture that we really want to move to. Now we're
waiting for some MM behavior to be fixed that this uncovered. But
in the end will probably be stuck with a slight regressi
Hi, Christoph,
"Huang, Ying" writes:
> Hi, Christoph,
>
> "Huang, Ying" writes:
>
>> Christoph Hellwig writes:
>>
>>> Snipping the long context:
>>>
>>> I think there are three observations here:
>>>
>>> (1) removing the mark_page_accessed (which is the only significant
>>> change in the
Mel Gorman writes:
> On Fri, Sep 02, 2016 at 09:32:58AM +1000, Dave Chinner wrote:
>> On Fri, Aug 19, 2016 at 04:08:34PM +0100, Mel Gorman wrote:
>> > On Thu, Aug 18, 2016 at 05:11:11PM +1000, Dave Chinner wrote:
>> > > On Thu, Aug 18, 2016 at 01:45:17AM +0100, Mel Gorman wrote:
>> > > > On Wed,
On Fri, Sep 02, 2016 at 09:32:58AM +1000, Dave Chinner wrote:
> On Fri, Aug 19, 2016 at 04:08:34PM +0100, Mel Gorman wrote:
> > On Thu, Aug 18, 2016 at 05:11:11PM +1000, Dave Chinner wrote:
> > > On Thu, Aug 18, 2016 at 01:45:17AM +0100, Mel Gorman wrote:
> > > > On Wed, Aug 17, 2016 at 04:49:07PM
On Fri, Aug 19, 2016 at 04:08:34PM +0100, Mel Gorman wrote:
> On Thu, Aug 18, 2016 at 05:11:11PM +1000, Dave Chinner wrote:
> > On Thu, Aug 18, 2016 at 01:45:17AM +0100, Mel Gorman wrote:
> > > On Wed, Aug 17, 2016 at 04:49:07PM +0100, Mel Gorman wrote:
> > > > > Yes, we could try to batch the lock
On Wed, Aug 24, 2016 at 08:40:37AM -0700, Huang, Ying wrote:
> Mel Gorman writes:
>
> > On Wed, Aug 17, 2016 at 04:49:07PM +0100, Mel Gorman wrote:
> >> > Yes, we could try to batch the locking like DaveC already suggested
> >> > (ie we could move the locking to the caller, and then make
> >> > s
Hi, Mel,
Mel Gorman writes:
> On Wed, Aug 17, 2016 at 04:49:07PM +0100, Mel Gorman wrote:
>> > Yes, we could try to batch the locking like DaveC already suggested
>> > (ie we could move the locking to the caller, and then make
>> > shrink_page_list() just try to keep the lock held for a few page
Hi, Christoph,
"Huang, Ying" writes:
> Christoph Hellwig writes:
>
>> Snipping the long context:
>>
>> I think there are three observations here:
>>
>> (1) removing the mark_page_accessed (which is the only significant
>> change in the parent commit) hurts the
>> aim7/1BRD_48G-xfs-d
On Sat, Aug 20, 2016 at 09:48:39AM +1000, Dave Chinner wrote:
> On Fri, Aug 19, 2016 at 11:49:46AM +0100, Mel Gorman wrote:
> > On Thu, Aug 18, 2016 at 03:25:40PM -0700, Linus Torvalds wrote:
> > > It *could* be as simple/stupid as just saying "let's allocate the page
> > > cache for new pages from
On Fri, Aug 19, 2016 at 4:48 PM, Dave Chinner wrote:
>
> Well, it depends on the speed of the storage. The higher the speed
> of the storage, the less we care about stalling on dirty pages
> during reclaim
Actually, that's largely true independently of the speed of the storage, I feel.
On really
On Fri, Aug 19, 2016 at 11:49:46AM +0100, Mel Gorman wrote:
> On Thu, Aug 18, 2016 at 03:25:40PM -0700, Linus Torvalds wrote:
> > It *could* be as simple/stupid as just saying "let's allocate the page
> > cache for new pages from the current node" - and if the process that
> > dirties pages just st
On Thu, Aug 18, 2016 at 05:11:11PM +1000, Dave Chinner wrote:
> On Thu, Aug 18, 2016 at 01:45:17AM +0100, Mel Gorman wrote:
> > On Wed, Aug 17, 2016 at 04:49:07PM +0100, Mel Gorman wrote:
> > > > Yes, we could try to batch the locking like DaveC already suggested
> > > > (ie we could move the locki
On Thu, Aug 18, 2016 at 03:25:40PM -0700, Linus Torvalds wrote:
> >> In fact, looking at the __page_cache_alloc(), we already have that
> >> "spread pages out" logic. I'm assuming Dave doesn't actually have that
> >> bit set (I don't think it's the default), but I'm also envisioning
> >> that maybe
On Thu 18-08-16 15:25:40, Linus Torvalds wrote:
[...]
> So just for testing purposes, you could try changing that
>
> return alloc_pages(gfp, 0);
>
> in __page_cache_alloc() into something like
>
> return alloc_pages_node(cpu_to_node(raw_smp_processor_id()), gfp, 0);
That would
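The experiment Linus suggests above boils down to a one-line policy change: direct page-cache allocations to the NUMA node of the CPU doing the write. A user-space sketch of that policy, where `cpu_to_node()` and `alloc_pages_node()` are hypothetical stand-ins for the kernel functions of the same names, not the mainline code:

```c
#include <stdlib.h>

/* Fixed CPU->node map for illustration only: CPUs 0-3 on node 0,
 * CPUs 4-7 on node 1. The kernel derives this from firmware tables. */
static int cpu_to_node(int cpu) { return cpu / 4; }

static int last_alloc_node = -1;   /* lets the placement be observed */

/* Stand-in for the kernel's alloc_pages_node(): record the target
 * node, then hand back ordinary heap memory. */
static void *alloc_pages_node(int node, size_t size)
{
    last_alloc_node = node;
    return malloc(size);
}

/* The proposed policy change: allocate page-cache pages on the
 * current CPU's node instead of using the default placement. */
static void *page_cache_alloc_local(int current_cpu, size_t page_size)
{
    return alloc_pages_node(cpu_to_node(current_cpu), page_size);
}
```

The point of the experiment is locality: the process dirtying pages then tends to dirty memory on its own node, so reclaim of those pages stays node-local too.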
On Thu, Aug 18, 2016 at 10:55:01AM -0700, Linus Torvalds wrote:
> On Thu, Aug 18, 2016 at 6:24 AM, Mel Gorman
> wrote:
> > On Thu, Aug 18, 2016 at 05:11:11PM +1000, Dave Chinner wrote:
> >> FWIW, I just remembered about /proc/sys/vm/zone_reclaim_mode.
> >>
> >
> > That is a terrifying "fix" for t
On Thu, Aug 18, 2016 at 2:19 PM, Dave Chinner wrote:
>
> For streaming or use-once IO it makes a lot of sense to restrict the
> locality of the page cache. The faster the IO device, the less dirty
> page buffering we need to maintain full device bandwidth. And the
> larger the machine the greater
On Thu, Aug 18, 2016 at 6:24 AM, Mel Gorman wrote:
> On Thu, Aug 18, 2016 at 05:11:11PM +1000, Dave Chinner wrote:
>> FWIW, I just remembered about /proc/sys/vm/zone_reclaim_mode.
>>
>
> That is a terrifying "fix" for this problem. It just happens to work
> because there is no spillover to other n
On Thu, Aug 18, 2016 at 05:11:11PM +1000, Dave Chinner wrote:
> On Thu, Aug 18, 2016 at 01:45:17AM +0100, Mel Gorman wrote:
> > On Wed, Aug 17, 2016 at 04:49:07PM +0100, Mel Gorman wrote:
> > > > Yes, we could try to batch the locking like DaveC already suggested
> > > > (ie we could move the locki
On Thu, Aug 18, 2016 at 01:45:17AM +0100, Mel Gorman wrote:
> On Wed, Aug 17, 2016 at 04:49:07PM +0100, Mel Gorman wrote:
> > > Yes, we could try to batch the locking like DaveC already suggested
> > > (ie we could move the locking to the caller, and then make
> > > shrink_page_list() just try to k
On Wed, Aug 17, 2016 at 04:49:07PM +0100, Mel Gorman wrote:
> On Tue, Aug 16, 2016 at 10:47:36AM -0700, Linus Torvalds wrote:
> > I've always preferred to see direct reclaim as the primary model for
> > reclaim, partly in order to throttle the actual "bad" process, but
> > also because "kswapd uses
On Wed, Aug 17, 2016 at 04:49:07PM +0100, Mel Gorman wrote:
> > Yes, we could try to batch the locking like DaveC already suggested
> > (ie we could move the locking to the caller, and then make
> > shrink_page_list() just try to keep the lock held for a few pages if
> > the mapping doesn't change)
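The batching idea quoted above can be sketched in user-space C: instead of taking and dropping a mapping's tree lock once per page, keep it held across consecutive pages that share the same mapping, and swap locks only when the mapping changes. The structs here are illustrative stand-ins, not the kernel's `struct address_space` or `struct page`:

```c
#include <pthread.h>

/* Hypothetical stand-ins for the kernel types. */
struct mapping { pthread_mutex_t tree_lock; int removed; };
struct page   { struct mapping *mapping; };

/* Process a batch of pages, holding each mapping's lock across the
 * run of consecutive pages that belong to it, rather than locking
 * and unlocking once per page. */
static void shrink_page_batch(struct page **pages, int n)
{
    struct mapping *locked = NULL;

    for (int i = 0; i < n; i++) {
        struct mapping *m = pages[i]->mapping;

        if (m != locked) {              /* mapping changed: swap locks */
            if (locked)
                pthread_mutex_unlock(&locked->tree_lock);
            pthread_mutex_lock(&m->tree_lock);
            locked = m;
        }
        locked->removed++;   /* stand-in for __remove_mapping() work */
    }
    if (locked)
        pthread_mutex_unlock(&locked->tree_lock);
}
```

Since reclaim batches often contain long runs of pages from the same file, this trades one lock round-trip per page for one per run, which is exactly what helps when the tree lock is the contention point.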
On Wed 17-08-16 17:48:25, Michal Hocko wrote:
[...]
> I will try to catch up with the rest of the email thread but from a
> quick glance it just feels like we are doing more more work under the
> lock.
Hmm, so it doesn't seem to be more work in __remove_mapping as pointed
out in http://lkml.kernel
On Mon, Aug 15, 2016 at 07:03:00AM +0200, Ingo Molnar wrote:
>
> * Linus Torvalds wrote:
>
> > Make sure you actually use "perf record -e cycles:pp" or something
> > that uses PEBS to get real profiles using CPU performance counters.
>
> Btw., 'perf record -e cycles:pp' is the default now for m
On Tue, Aug 16, 2016 at 10:47:36AM -0700, Linus Torvalds wrote:
> I've always preferred to see direct reclaim as the primary model for
> reclaim, partly in order to throttle the actual "bad" process, but
> also because "kswapd uses lots of CPU time" is such a nasty thing to
> even begin guessing ab
On Tue 16-08-16 10:47:36, Linus Torvalds wrote:
> Mel,
> thanks for taking a look. Your theory sounds more complete than mine,
> and since Dave is able to see the problem with 4.7, it would be nice
> to hear about the 4.6 behavior and commit ede37713737 in particular.
>
> That one seems more like
On Tue, Aug 16, 2016 at 3:02 PM, Dave Chinner wrote:
>>
>> What does your profile show for when you actually dig into
>> __remove_mapping() itself?, Looking at your flat profile, I'm assuming
>> you get
>
> - 22.26% 0.93% [kernel] [k] __remove_mapping
> - 3.86% __remove_mapping
On Mon, Aug 15, 2016 at 06:51:42PM -0700, Linus Torvalds wrote:
> Anyway, including the direct reclaim call paths gets
> __remove_mapping() a bit higher, and _raw_spin_lock_irqsave climbs to
> 0.26%. But perhaps more importantly, looking at what __remove_mapping
> actually *does* (apart from the s
Mel,
thanks for taking a look. Your theory sounds more complete than mine,
and since Dave is able to see the problem with 4.7, it would be nice
to hear about the 4.6 behavior and commit ede37713737 in particular.
That one seems more likely to affect contention than the zone/node one
I found durin
On Mon, Aug 15, 2016 at 04:48:36PM -0700, Linus Torvalds wrote:
> On Mon, Aug 15, 2016 at 4:20 PM, Linus Torvalds
> wrote:
> >
> > None of this code is all that new, which is annoying. This must have
> > gone on forever,
>
> ... ooh.
>
> Wait, I take that back.
>
> We actually have some very re
On Sun, Aug 14, 2016 at 06:17:24PM +0200, Christoph Hellwig wrote:
Snipping the long context:
I think there are three observations here:
(1) removing the mark_page_accessed (which is the only significant
change in the parent commit) hurts the
aim7/1BRD_48G-xfs-disk_rr-3000-performance/
On Tue, Aug 16, 2016 at 07:22:40AM +1000, Dave Chinner wrote:
On Mon, Aug 15, 2016 at 10:14:55PM +0800, Fengguang Wu wrote:
Hi Christoph,
On Sun, Aug 14, 2016 at 06:17:24PM +0200, Christoph Hellwig wrote:
>Snipping the long context:
>
>I think there are three observations here:
>
>(1) removing
On Mon, Aug 15, 2016 at 5:19 PM, Dave Chinner wrote:
>
>> None of this code is all that new, which is annoying. This must have
>> gone on forever,
>
> Yes, it has been. Just worse than I've notice before, probably
> because of all the stuff put under the tree lock in the past couple
> of years.
S
On Mon, Aug 15, 2016 at 5:38 PM, Dave Chinner wrote:
>
> Same in 4.7 (flat profile numbers climbed higher after this
> snapshot was taken, as can be seen by the callgraph numbers):
Ok, so it's not the zone-vs-node thing. It's just that nobody has
looked at that load in recent times.
Where "rece
On Mon, Aug 15, 2016 at 5:17 PM, Dave Chinner wrote:
>
> Read the code, Linus?
I am. It's how I came up with my current pet theory.
But I don't actually have enough sane numbers to make it much more
than a cute pet theory. It *might* explain why you see tons of kswap
time and bad lock contention
On Mon, Aug 15, 2016 at 04:48:36PM -0700, Linus Torvalds wrote:
> On Mon, Aug 15, 2016 at 4:20 PM, Linus Torvalds
> wrote:
> >
> > None of this code is all that new, which is annoying. This must have
> > gone on forever,
>
> ... ooh.
>
> Wait, I take that back.
>
> We actually have some very re
On Mon, Aug 15, 2016 at 04:20:55PM -0700, Linus Torvalds wrote:
> On Mon, Aug 15, 2016 at 3:42 PM, Dave Chinner wrote:
> >
> > 31.18% [kernel] [k] __pv_queued_spin_lock_slowpath
> > 9.90% [kernel] [k] copy_user_generic_string
> > 3.65% [kernel] [k] __raw_callee_save___pv_queued_spin_
On Mon, Aug 15, 2016 at 05:15:47PM -0700, Linus Torvalds wrote:
> DaveC - does the spinlock contention go away if you just go back to
> 4.7? If so, I think it's the new zone thing. But it would be good to
> verify - maybe it's something entirely different and it goes back much
> further.
Same in 4
On Mon, Aug 15, 2016 at 04:01:00PM -0700, Linus Torvalds wrote:
> On Mon, Aug 15, 2016 at 3:22 PM, Dave Chinner wrote:
> >
> > Right, but that does not make the profile data useless,
>
> Yes it does. Because it basically hides everything that happens inside
> the lock, which is what causes the co
On Mon, Aug 15, 2016 at 10:22:43AM -0700, Huang, Ying wrote:
> Hi, Chinner,
>
> Dave Chinner writes:
>
> > On Wed, Aug 10, 2016 at 06:00:24PM -0700, Linus Torvalds wrote:
> >> On Wed, Aug 10, 2016 at 5:33 PM, Huang, Ying wrote:
> >> >
> >> > Here it is,
> >>
> >> Thanks.
> >>
> >> Appended is
On Mon, Aug 15, 2016 at 4:20 PM, Linus Torvalds
wrote:
>
> But I'll try to see what happens
> on my profile, even if I can't recreate the contention itself, just
> trying to see what happens inside of that region.
Yeah, since I run my machines on encrypted disks, my profile shows 60%
kthread, but
On Mon, Aug 15, 2016 at 4:20 PM, Linus Torvalds
wrote:
>
> None of this code is all that new, which is annoying. This must have
> gone on forever,
... ooh.
Wait, I take that back.
We actually have some very recent changes that I didn't even think
about that went into this very merge window.
In
On Mon, Aug 15, 2016 at 3:42 PM, Dave Chinner wrote:
>
> 31.18% [kernel] [k] __pv_queued_spin_lock_slowpath
> 9.90% [kernel] [k] copy_user_generic_string
> 3.65% [kernel] [k] __raw_callee_save___pv_queued_spin_unlock
> 2.62% [kernel] [k] __block_commit_write.isra.29
> 2.26%
On Mon, Aug 15, 2016 at 3:22 PM, Dave Chinner wrote:
>
> Right, but that does not make the profile data useless,
Yes it does. Because it basically hides everything that happens inside
the lock, which is what causes the contention in the first place.
So stop making inane and stupid arguments, Dav
On Tue, Aug 16, 2016 at 08:22:11AM +1000, Dave Chinner wrote:
> On Sun, Aug 14, 2016 at 10:12:20PM -0700, Linus Torvalds wrote:
> > On Aug 14, 2016 10:00 PM, "Dave Chinner" wrote:
> > >
> > > > What does it say if you annotate that _raw_spin_unlock_irqrestore()
> > function?
> > >
> > >
On Sun, Aug 14, 2016 at 10:12:20PM -0700, Linus Torvalds wrote:
> On Aug 14, 2016 10:00 PM, "Dave Chinner" wrote:
> >
> > > What does it say if you annotate that _raw_spin_unlock_irqrestore()
> function?
> >
> > │
> > │ Disassembly of section load0:
> > │
> > │
On Mon, Aug 15, 2016 at 10:14:55PM +0800, Fengguang Wu wrote:
> Hi Christoph,
>
> On Sun, Aug 14, 2016 at 06:17:24PM +0200, Christoph Hellwig wrote:
> >Snipping the long context:
> >
> >I think there are three observations here:
> >
> >(1) removing the mark_page_accessed (which is the only signifi
Christoph Hellwig writes:
> Snipping the long context:
>
> I think there are three observations here:
>
> (1) removing the mark_page_accessed (which is the only significant
> change in the parent commit) hurts the
> aim7/1BRD_48G-xfs-disk_rr-3000-performance/ivb44 test.
> I'd sti
Hi, Chinner,
Dave Chinner writes:
> On Wed, Aug 10, 2016 at 06:00:24PM -0700, Linus Torvalds wrote:
>> On Wed, Aug 10, 2016 at 5:33 PM, Huang, Ying wrote:
>> >
>> > Here it is,
>>
>> Thanks.
>>
>> Appended is a munged "after" list, with the "before" values in
>> parenthesis. It actually looks
Hi Christoph,
On Sun, Aug 14, 2016 at 06:17:24PM +0200, Christoph Hellwig wrote:
Snipping the long context:
I think there are three observations here:
(1) removing the mark_page_accessed (which is the only significant
change in the parent commit) hurts the
aim7/1BRD_48G-xfs-disk_rr-30
On Fri, Aug 12, 2016 at 11:03:33AM -0700, Linus Torvalds wrote:
On Thu, Aug 11, 2016 at 8:56 PM, Dave Chinner wrote:
On Thu, Aug 11, 2016 at 07:27:52PM -0700, Linus Torvalds wrote:
I don't recall having ever seen the mapping tree_lock as a contention
point before, but it's not like I've tried
* Linus Torvalds wrote:
> Make sure you actually use "perf record -e cycles:pp" or something
> that uses PEBS to get real profiles using CPU performance counters.
Btw., 'perf record -e cycles:pp' is the default now for modern versions
of perf tooling (on most x86 systems) - if you do 'perf reco
On Sun, Aug 14, 2016 at 07:53:40PM -0700, Linus Torvalds wrote:
> On Sun, Aug 14, 2016 at 7:28 PM, Dave Chinner wrote:
> >>
> >> Maybe your symbol table came from a old kernel, and functions moved
> >> around enough that the profile attributions ended up bogus.
> >
> > No, I don't think so. I don'
On Sun, Aug 14, 2016 at 7:28 PM, Dave Chinner wrote:
>>
>> Maybe your symbol table came from a old kernel, and functions moved
>> around enough that the profile attributions ended up bogus.
>
> No, I don't think so. I don't install symbol tables on my test VMs,
> I let /proc/kallsyms do that work
On Sun, Aug 14, 2016 at 06:37:33PM -0700, Linus Torvalds wrote:
> On Sun, Aug 14, 2016 at 5:48 PM, Dave Chinner wrote:
> >>
> >> Does this attached patch help your contention numbers?
> >
> > No. If anything, it makes it worse. Without the patch, I was
> > measuring 36-37% in _raw_spin_unlock_irqr
On Sun, Aug 14, 2016 at 5:48 PM, Dave Chinner wrote:
>>
>> Does this attached patch help your contention numbers?
>
> No. If anything, it makes it worse. Without the patch, I was
> measuring 36-37% in _raw_spin_unlock_irqrestore. With the patch, it
> is 42-43%. Write throughput is the same at ~50
On Fri, Aug 12, 2016 at 11:03:33AM -0700, Linus Torvalds wrote:
> On Thu, Aug 11, 2016 at 8:56 PM, Dave Chinner wrote:
> > On Thu, Aug 11, 2016 at 07:27:52PM -0700, Linus Torvalds wrote:
> >>
> >> I don't recall having ever seen the mapping tree_lock as a contention
> >> point before, but it's not
Hi Christoph,
On Sun, Aug 14, 2016 at 06:17:24PM +0200, Christoph Hellwig wrote:
Snipping the long context:
I think there are three observations here:
(1) removing the mark_page_accessed (which is the only significant
change in the parent commit) hurts the
aim7/1BRD_48G-xfs-disk_rr-30
On Sun, Aug 14, 2016 at 06:17:24PM +0200, Christoph Hellwig wrote:
> Snipping the long context:
>
> I think there are three observations here:
>
> (1) removing the mark_page_accessed (which is the only significant
> change in the parent commit) hurts the
> aim7/1BRD_48G-xfs-disk_rr-30
Snipping the long context:
I think there are three observations here:
(1) removing the mark_page_accessed (which is the only significant
change in the parent commit) hurts the
aim7/1BRD_48G-xfs-disk_rr-3000-performance/ivb44 test.
I'd still rather stick to the filemap version and
Hi Christoph,
On Sun, Aug 14, 2016 at 06:51:28AM +0800, Fengguang Wu wrote:
Hi Christoph,
On Sun, Aug 14, 2016 at 12:15:08AM +0200, Christoph Hellwig wrote:
Hi Fengguang,
feel free to try this git tree:
git://git.infradead.org/users/hch/vfs.git iomap-fixes
I just queued some test jobs fo
On Sat, Aug 13, 2016 at 02:30:54AM +0200, Christoph Hellwig wrote:
> On Fri, Aug 12, 2016 at 08:02:08PM +1000, Dave Chinner wrote:
> > Which says "no change". Oh well, back to the drawing board...
>
> I don't see how it would change thing much - for all relevant calculations
> we convert to block
Hi Christoph,
On Sun, Aug 14, 2016 at 12:15:08AM +0200, Christoph Hellwig wrote:
Hi Fengguang,
feel free to try this git tree:
git://git.infradead.org/users/hch/vfs.git iomap-fixes
I just queued some test jobs for it.
% queue -q vip -t ivb44 -b hch-vfs/iomap-fixes aim7-fs-1brd.yaml fs=xfs
Hi Fengguang,
feel free to try this git tree:
git://git.infradead.org/users/hch/vfs.git iomap-fixes
Hi Linus,
On Fri, Aug 12, 2016 at 11:03:33AM -0700, Linus Torvalds wrote:
On Thu, Aug 11, 2016 at 8:56 PM, Dave Chinner wrote:
On Thu, Aug 11, 2016 at 07:27:52PM -0700, Linus Torvalds wrote:
I don't recall having ever seen the mapping tree_lock as a contention
point before, but it's not like
Hi Christoph,
On Sat, Aug 13, 2016 at 11:48:25PM +0200, Christoph Hellwig wrote:
On Sat, Aug 13, 2016 at 02:30:54AM +0200, Christoph Hellwig wrote:
Below is a patch I hacked up this morning to do just that. It passes
xfstests, but I've not done any real benchmarking with it. If the
reduced lo
On Sat, Aug 13, 2016 at 02:30:54AM +0200, Christoph Hellwig wrote:
> Below is a patch I hacked up this morning to do just that. It passes
> xfstests, but I've not done any real benchmarking with it. If the
> reduced lookup overhead in it doesn't help enough we'll need to some
> sort of look aside
On Fri, Aug 12, 2016 at 08:02:08PM +1000, Dave Chinner wrote:
> Which says "no change". Oh well, back to the drawing board...
I don't see how it would change thing much - for all relevant calculations
we convert to block units first anyway.
But the whole xfs_iomap_write_delay is a giant mess anyw
On Thu, Aug 11, 2016 at 8:56 PM, Dave Chinner wrote:
> On Thu, Aug 11, 2016 at 07:27:52PM -0700, Linus Torvalds wrote:
>>
>> I don't recall having ever seen the mapping tree_lock as a contention
>> point before, but it's not like I've tried that load either. So it
>> might be a regression (going b
On Fri, Aug 12, 2016 at 04:51:24PM +0800, Ye Xiaolong wrote:
> On 08/12, Ye Xiaolong wrote:
> >On 08/12, Dave Chinner wrote:
>
> [snip]
>
> >>lkp-folk: the patch I've just tested it attached below - can you
> >>feed that through your test and see if it fixes the regression?
> >>
> >
> >Hi, Dave
>
On 08/12, Ye Xiaolong wrote:
>On 08/12, Dave Chinner wrote:
[snip]
>>lkp-folk: the patch I've just tested it attached below - can you
>>feed that through your test and see if it fixes the regression?
>>
>
>Hi, Dave
>
>I am verifying your fix patch in lkp environment now, will send the
>result onc
On 08/12, Dave Chinner wrote:
>On Thu, Aug 11, 2016 at 10:02:39PM -0700, Linus Torvalds wrote:
>> On Thu, Aug 11, 2016 at 9:16 PM, Dave Chinner wrote:
>> >
>> > That's why running aim7 as your "does the filesystem scale"
>> > benchmark is somewhat irrelevant to scaling applications on high
>> > pe
On Thu, Aug 11, 2016 at 10:02:39PM -0700, Linus Torvalds wrote:
> On Thu, Aug 11, 2016 at 9:16 PM, Dave Chinner wrote:
> >
> > That's why running aim7 as your "does the filesystem scale"
> > benchmark is somewhat irrelevant to scaling applications on high
> > performance systems these days
>
> Ye
On Thu, Aug 11, 2016 at 9:16 PM, Dave Chinner wrote:
>
> That's why running aim7 as your "does the filesystem scale"
> benchmark is somewhat irrelevant to scaling applications on high
> performance systems these days
Yes, don't get me wrong - I'm not at all trying to say that AIM7 is a
good bench
On Thu, Aug 11, 2016 at 08:20:53PM -0700, Linus Torvalds wrote:
> On Thu, Aug 11, 2016 at 7:52 PM, Christoph Hellwig wrote:
> >
> > I can look at that, but indeed optimizing this patch seems a bit
> > stupid.
>
> The "write less than a full block to the end of the file" is actually
> a reasonably
On Thu, Aug 11, 2016 at 07:27:52PM -0700, Linus Torvalds wrote:
> On Thu, Aug 11, 2016 at 5:54 PM, Dave Chinner wrote:
> >
> > So, removing mark_page_accessed() made the spinlock contention
> > *worse*.
> >
> > 36.51% [kernel] [k] _raw_spin_unlock_irqrestore
> > 6.27% [kernel] [k] copy_us
On Thu, Aug 11, 2016 at 7:52 PM, Christoph Hellwig wrote:
>
> I can look at that, but indeed optimizing this patch seems a bit
> stupid.
The "write less than a full block to the end of the file" is actually
a reasonably common case.
It may not make for a great filesystem benchmark, but it also i
On Fri, Aug 12, 2016 at 12:23:29PM +1000, Dave Chinner wrote:
> Christoph, maybe there's something we can do to only trigger
> speculative prealloc growth checks if the new file size crosses the end of
> the currently allocated block at the EOF. That would chop out a fair
> chunk of the xfs_bmapi_r
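The check Dave suggests above can be sketched as a small predicate: run the speculative-preallocation growth logic only when a write pushes the file size past the last currently allocated block. This is a hypothetical helper, not the XFS code, and it assumes a constant filesystem block size:

```c
#include <stdbool.h>
#include <stdint.h>

/* Return true only when the new file size needs more blocks than are
 * already allocated; writes that stay inside the allocated blocks can
 * skip the (comparatively expensive) block-mapping work entirely. */
static bool write_crosses_alloc_eof(uint64_t new_size,
                                    uint64_t alloc_blocks,
                                    uint64_t block_size)
{
    /* Blocks needed to cover the new size, rounded up. */
    uint64_t needed = (new_size + block_size - 1) / block_size;
    return needed > alloc_blocks;
}
```

For the AIM7 pattern discussed in this thread (many tiny appends into an already-allocated region), this predicate would be false on almost every write, which is exactly the xfs_bmapi work the suggestion aims to avoid.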
On Thu, Aug 11, 2016 at 7:23 PM, Dave Chinner wrote:
>
> And, as usual, that's the answer. Here's the reproducer:
>
> # sudo mkfs.xfs -f -m crc=0 /dev/pmem1
> # sudo mount -o noatime /dev/pmem1 /mnt/scratch
> # sudo xfs_io -f -c "pwrite 0 512m -b 1" /mnt/scratch/fooey
Heh. Ok, so 1 byte or 1kB at
On Thu, Aug 11, 2016 at 5:54 PM, Dave Chinner wrote:
>
> So, removing mark_page_accessed() made the spinlock contention
> *worse*.
>
> 36.51% [kernel] [k] _raw_spin_unlock_irqrestore
> 6.27% [kernel] [k] copy_user_generic_string
> 3.73% [kernel] [k] _raw_spin_unlock_irq
> 3.55% [
On Fri, Aug 12, 2016 at 10:54:42AM +1000, Dave Chinner wrote:
> I'm now going to test Christoph's theory that this is an "overwrite
> doing lots of block mapping" issue. More on that to follow.
Ok, so going back to the profiles, I can say it's not an overwrite
issue, because there is delayed alloc
On Thu, Aug 11, 2016 at 11:16:12AM +1000, Dave Chinner wrote:
> On Wed, Aug 10, 2016 at 05:33:20PM -0700, Huang, Ying wrote:
> We need to know what is happening that is different - there's a good
> chance the mapping trace events will tell us. Huang, can you get
> a raw event trace from the test?
>
On Thu, Aug 11, 2016 at 09:55:33AM -0700, Linus Torvalds wrote:
> On Thu, Aug 11, 2016 at 8:57 AM, Christoph Hellwig wrote:
> >
> > The one liner below (not tested yet) to simply remove it should fix that
> > up. I also noticed we have a spurious pagefault_disable/enable, I
> > need to dig into t
On Thu, Aug 11, 2016 at 3:16 PM, Al Viro wrote:
>
> Huh? The very first thing it does is
> char *kaddr = kmap_atomic(page), *p = kaddr + offset;
>
> If _that_ does not disable pagefaults, we are very deep in shit.
Right you are - it does, even with highmem disabled. Never mind, those
pag
On Thu, Aug 11, 2016 at 01:35:00PM -0700, Linus Torvalds wrote:
> The thing is, iov_iter_copy_from_user_atomic() doesn't itself enforce
> non-blocking user accesses, it depends on the caller blocking page
> faults.
Huh? The very first thing it does is
char *kaddr = kmap_atomic(page), *p
I'll need to dig into what AIM7 actually does in this benchmark, which
isn't too easy as I'm on a business trip currently, but from the list
below it looks like it keeps overwriting and overwriting a file that's
already been allocated. This is a pretty stupid workload, but fortunately
it should al
On Thu, Aug 11, 2016 at 2:16 PM, Huang, Ying wrote:
>
> Test result is as follow,
Thanks. No change.
> raw perf data:
I redid my munging, with the old (good) percentages in parenthesis:
intel_idle: 17.66 (16.88)
copy_user_enhanced_fast_string:
Christoph Hellwig writes:
> On Thu, Aug 11, 2016 at 12:51:31PM -0700, Linus Torvalds wrote:
>> Ok. It does seem to also reset the active file page counts back, so
>> that part did seem to be related, but yeah, from a performance
>> standpoint that was clearly not a major issue.
>>
>> Let's hope
On Thu, Aug 11, 2016 at 1:00 PM, Christoph Hellwig wrote:
>
> I can't really think of any reason why the pagefault_disable() would
> significantly change performance.
No, you're right, we prefault the page anyway.
And quite frankly, looking at it, I think the pagefault_disable/enable
is actually
On Thu, Aug 11, 2016 at 12:51:31PM -0700, Linus Torvalds wrote:
> Ok. It does seem to also reset the active file page counts back, so
> that part did seem to be related, but yeah, from a performance
> standpoint that was clearly not a major issue.
>
> Let's hope Dave can figure out something based
On Thu, Aug 11, 2016 at 10:51 AM, Huang, Ying wrote:
>>
>
> Here is the test result for the debug patch. It appears that the aim7
> score is a little better, but the regression is not recovered.
Ok. It does seem to also reset the active file page counts back, so
that part did seem to be related,
Linus Torvalds writes:
> On Thu, Aug 11, 2016 at 8:57 AM, Christoph Hellwig wrote:
>>
>> The one liner below (not tested yet) to simply remove it should fix that
>> up. I also noticed we have a spurious pagefault_disable/enable, I
>> need to dig into the history of that first, though.
>
> Hopef
On Thu, Aug 11, 2016 at 8:57 AM, Christoph Hellwig wrote:
>
> The one liner below (not tested yet) to simply remove it should fix that
> up. I also noticed we have a spurious pagefault_disable/enable, I
> need to dig into the history of that first, though.
Hopefully the pagefault_disable/enable
On Wed, Aug 10, 2016 at 06:00:24PM -0700, Linus Torvalds wrote:
> The biggest difference is that we have "mark_page_accessed()" show up
> after, and not before. There was also a lot of LRU noise in the
> non-profile data. I wonder if that is the reason here: the old model
> of using generic_perform
On Wed, Aug 10, 2016 at 06:00:24PM -0700, Linus Torvalds wrote:
> On Wed, Aug 10, 2016 at 5:33 PM, Huang, Ying wrote:
> >
> > Here it is,
>
> Thanks.
>
> Appended is a munged "after" list, with the "before" values in
> parenthesis. It actually looks fairly similar.
>
> The biggest difference is
On Thu, Aug 11, 2016 at 10:36:59AM +0800, Ye Xiaolong wrote:
> On 08/11, Dave Chinner wrote:
> >On Thu, Aug 11, 2016 at 11:16:12AM +1000, Dave Chinner wrote:
> >> I need to see these events:
> >>
> >>xfs_file*
> >>xfs_iomap*
> >>xfs_get_block*
> >>
> >> For both kernels. An example tr
On 08/11, Dave Chinner wrote:
>On Thu, Aug 11, 2016 at 11:16:12AM +1000, Dave Chinner wrote:
>> I need to see these events:
>>
>> xfs_file*
>> xfs_iomap*
>> xfs_get_block*
>>
>> For both kernels. An example trace from 4.8-rc1 running the command
>> `xfs_io -f -c 'pwrite 0 512k -b 1
On Thu, Aug 11, 2016 at 11:16:12AM +1000, Dave Chinner wrote:
> I need to see these events:
>
> xfs_file*
> xfs_iomap*
> xfs_get_block*
>
> For both kernels. An example trace from 4.8-rc1 running the command
> `xfs_io -f -c 'pwrite 0 512k -b 128k' /mnt/scratch/fooey doing an
> o
On Wed, Aug 10, 2016 at 05:33:20PM -0700, Huang, Ying wrote:
> Linus Torvalds writes:
>
> > On Wed, Aug 10, 2016 at 5:11 PM, Huang, Ying wrote:
> >>
> >> Here is the comparison result with perf-profile data.
> >
> > Heh. The diff is actually harder to read than just showing A/B
> > state. The fac
On Wed, Aug 10, 2016 at 5:33 PM, Huang, Ying wrote:
>
> Here it is,
Thanks.
Appended is a munged "after" list, with the "before" values in
parenthesis. It actually looks fairly similar.
The biggest difference is that we have "mark_page_accessed()" show up
after, and not before. There was also a