Re: Regression on linux-next (next-20240712)

2024-07-17 Thread Andrew Morton
On Wed, 17 Jul 2024 13:00:58 +0200 David Hildenbrand  wrote:

> > Could you please check why the patch causes this regression and provide a 
> > fix if necessary?
> 
> This is know.
> 
> There is a discussion along the original patch [1] on how to do it 
> differently. But likely we'll tackle it differently [2]. So this patch 
> should be dropped for -- which I think already happened because I cannot 
> spot that patch in mm-unstable anymore.

Yes, I dropped it on July 15.


Re: [PATCH 2/2] xfs: disable large folio support in xfile_create

2024-02-07 Thread Andrew Morton
On Thu, 11 Jan 2024 18:22:50 -0800 "Darrick J. Wong"  wrote:

> On Thu, Jan 11, 2024 at 10:45:53PM +, Matthew Wilcox wrote:
> > On Thu, Jan 11, 2024 at 02:00:53PM -0800, Andrew Morton wrote:
> > > On Wed, 10 Jan 2024 12:04:51 -0800 "Darrick J. Wong"  
> > > wrote:
> > > 
> > > > > > Fixing this will require a bit of an API change, and prefeably 
> > > > > > sorting out
> > > > > > the hwpoison story for pages vs folio and where it is placed in the 
> > > > > > shmem
> > > > > > API.  For now use this one liner to disable large folios.
> > > > > > 
> > > > > > Reported-by: Darrick J. Wong 
> > > > > > Signed-off-by: Christoph Hellwig 
> > > > > 
> > > > > Can someone who knows more about shmem.c than I do please review
> > > > > https://lore.kernel.org/linux-xfs/20240103084126.513354-4-...@lst.de/
> > > > > so that I can feel slightly more confident as hch and I sort through 
> > > > > the
> > > > > xfile.c issues?
> > > > > 
> > > > > For this patch,
> > > > > Reviewed-by: Darrick J. Wong 
> > > > 
> > > > ...except that I'm still getting 2M THPs even with this enabled, so I
> > > > guess either we get to fix it now, or create our own private tmpfs mount
> > > > so that we can pass in huge=never, similar to what i915 does. :(
> > > 
> > > What is "this"?  Are you saying that $Subject doesn't work, or that the
> > > above-linked please-review patch doesn't work?
> > 
> > shmem pays no attention to the mapping_large_folio_support() flag,
> > so the proposed fix doesn't work.  It ought to, but it has its own way
> > of doing it that predates mapping_large_folio_support existing.
> 
> Yep.  It turned out to be easier to fix xfile.c to deal with large
> folios than I thought it would be.  Or so I think.  We'll see what
> happens on fstestscloud overnight.

Where do we stand with this?  Should I merge these two patches into
6.8-rcX, cc:stable?


Re: [PATCH 2/2] xfs: disable large folio support in xfile_create

2024-01-11 Thread Andrew Morton
On Wed, 10 Jan 2024 12:04:51 -0800 "Darrick J. Wong"  wrote:

> > > Fixing this will require a bit of an API change, and prefeably sorting out
> > > the hwpoison story for pages vs folio and where it is placed in the shmem
> > > API.  For now use this one liner to disable large folios.
> > > 
> > > Reported-by: Darrick J. Wong 
> > > Signed-off-by: Christoph Hellwig 
> > 
> > Can someone who knows more about shmem.c than I do please review
> > https://lore.kernel.org/linux-xfs/20240103084126.513354-4-...@lst.de/
> > so that I can feel slightly more confident as hch and I sort through the
> > xfile.c issues?
> > 
> > For this patch,
> > Reviewed-by: Darrick J. Wong 
> 
> ...except that I'm still getting 2M THPs even with this enabled, so I
> guess either we get to fix it now, or create our own private tmpfs mount
> so that we can pass in huge=never, similar to what i915 does. :(

What is "this"?  Are you saying that $Subject doesn't work, or that the
above-linked please-review patch doesn't work?


Re: disable large folios for shmem file used by xfs xfile

2024-01-10 Thread Andrew Morton
On Wed, 10 Jan 2024 10:21:07 +0100 Christoph Hellwig  wrote:

> Hi all,
> 
> Darrick reported that the fairly new XFS xfile code blows up when force
> enabling large folio for shmem.  This series fixes this quickly by disabling
> large folios for this particular shmem file for now until it can be fixed
> properly, which will be a lot more invasive.
> 

Patches seems reasonable as a short-term only-affects-xfs thing.

I assume that kernels which contain 137db333b29186 ("xfs: teach xfile
to pass back direct-map pages to caller") want this, so a Fixes: that
and a cc:stable are appropriate?



Re: [Intel-gfx] [PATCH v4 1/1] drm/i915: Move abs_diff() to math.h

2023-08-03 Thread Andrew Morton
On Thu,  3 Aug 2023 16:19:18 +0300 Andy Shevchenko 
 wrote:

> abs_diff() belongs to math.h. Move it there.
> This will allow others to use it.
> 
> ...
>
> --- a/include/linux/math.h
> +++ b/include/linux/math.h
> @@ -155,6 +155,13 @@ __STRUCT_FRACT(u32)
>   __builtin_types_compatible_p(typeof(x), unsigned type), \
>   ({ signed type __x = (x); __x < 0 ? -__x : __x; }), other)
>  
> +#define abs_diff(a, b) ({\
> + typeof(a) __a = (a);\
> + typeof(b) __b = (b);\
> + (void)(&__a == &__b);   \
> + __a > __b ? (__a - __b) : (__b - __a);  \
> +})

Can we document it please?

Also, the open-coded type comparison could be replaced with __typecheck()?

And why the heck isn't __typecheck() in typecheck.h, to be included by
minmax.h.

etcetera.  Sigh.  I'll grab it, but please at least send along some
kerneldoc?




Re: [Intel-gfx] [PATCH 0/4] log2: make is_power_of_2() more generic

2023-03-30 Thread Andrew Morton
On Thu, 30 Mar 2023 21:53:03 + David Laight  wrote:

> > But wouldn't all these issues be addressed by simply doing
> > 
> > #define is_power_of_2(n) (n != 0 && ((n & (n - 1)) == 0))
> > 
> > ?
> > 
> > (With suitable tweaks to avoid evaluating `n' more than once)
> 
> I think you need to use the 'horrid tricks' from min() to get
> a constant expression from constant inputs.

This

--- a/include/linux/log2.h~a
+++ a/include/linux/log2.h
@@ -41,11 +41,11 @@ int __ilog2_u64(u64 n)
  * *not* considered a power of two.
  * Return: true if @n is a power of 2, otherwise false.
  */
-static inline __attribute__((const))
-bool is_power_of_2(unsigned long n)
-{
-   return (n != 0 && ((n & (n - 1)) == 0));
-}
+#define is_power_of_2(_n)  \
+   ({  \
+   typeof(_n) n = (_n);\
+   n != 0 && ((n & (n - 1)) == 0); \
+   })
 
 /**
  * __roundup_pow_of_two() - round up to nearest power of two
_

worked for me in a simple test.

--- a/fs/open.c~b
+++ a/fs/open.c
@@ -1564,3 +1564,10 @@ int stream_open(struct inode *inode, str
 }
 
 EXPORT_SYMBOL(stream_open);
+
+#include 
+
+int foo(void)
+{
+   return is_power_of_2(43);
+}
_


foo:
# fs/open.c:1573: }
xorl%eax, %eax  #
ret 


Is there some more tricky situation where it breaks?


Re: [Intel-gfx] [PATCH 0/4] log2: make is_power_of_2() more generic

2023-03-30 Thread Andrew Morton
On Thu, 30 Mar 2023 13:42:39 +0300 Jani Nikula  wrote:

> is_power_of_2() only works for types <= sizeof(unsigned long) and it's
> also not a constant expression. There are a number of places in kernel
> where is_power_of_2() is called on u64, which fails on 32-bit
> builds. Try to remedy that. While at it, make it a constant expression
> when possible.

Yes, the current `is_power_of_2(unsigned long n)' isn't very general.

But wouldn't all these issues be addressed by simply doing

#define is_power_of_2(n) (n != 0 && ((n & (n - 1)) == 0))

?

(With suitable tweaks to avoid evaluating `n' more than once)




Re: [Intel-gfx] [linux-next:master] BUILD REGRESSION 8232539f864ca60474e38eb42d451f5c26415856

2023-02-25 Thread Andrew Morton
On Sun, 26 Feb 2023 01:10:55 +0800 kernel test robot  wrote:

> tree/branch: 
> https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git master
> branch HEAD: 8232539f864ca60474e38eb42d451f5c26415856  Add linux-next 
> specific files for 20230225
> 
> Error/Warning reports:
> 
> mm/page_alloc.c:257:1: sparse: sparse: symbol 'check_pages_enabled' was not 
> declared. Should it be static?

It should!

--- a/mm/page_alloc.c~mm-page_alloc-reduce-page-alloc-free-sanity-checks-fix
+++ b/mm/page_alloc.c
@@ -254,7 +254,7 @@ DEFINE_STATIC_KEY_MAYBE(CONFIG_INIT_ON_FREE_DEFAULT_ON, 
init_on_free);
 EXPORT_SYMBOL(init_on_free);
 
 /* perform sanity checks on struct pages being allocated or freed */
-DEFINE_STATIC_KEY_MAYBE(CONFIG_DEBUG_VM, check_pages_enabled);
+static DEFINE_STATIC_KEY_MAYBE(CONFIG_DEBUG_VM, check_pages_enabled);
 
 static bool _init_on_alloc_enabled_early __read_mostly
= IS_ENABLED(CONFIG_INIT_ON_ALLOC_DEFAULT_ON);
_



Re: [Intel-gfx] [PATCH 00/19] Introduce __xchg, non-atomic xchg

2022-12-22 Thread Andrew Morton
On Thu, 22 Dec 2022 12:46:16 +0100 Andrzej Hajda  
wrote:

> Hi all,
> 
> I hope there will be place for such tiny helper in kernel.
> Quick cocci analyze shows there is probably few thousands places
> where it could be useful.

So to clarify, the intent here is a simple readability cleanup for
existing open-coded exchange operations.  The intent is *not* to
identify existing xchg() sites which are unnecessarily atomic and to
optimize them by using the non-atomic version.

Have you considered the latter?

> I am not sure who is good person to review/ack such patches,

I can take 'em.

> so I've used my intuition to construct to/cc lists, sorry for mistakes.
> This is the 2nd approach of the same idea, with comments addressed[0].
> 
> The helper is tiny and there are advices we can leave without it, so
> I want to present few arguments why it would be good to have it:
> 
> 1. Code readability/simplification/number of lines:
> 
> Real example from drivers/net/ethernet/mellanox/mlx5/core/esw/qos.c:
> -   previous_min_rate = evport->qos.min_rate;
> -   evport->qos.min_rate = min_rate;
> +   previous_min_rate = __xchg(evport->qos.min_rate, min_rate);
> 
> For sure the code is more compact, and IMHO more readable.
> 
> 2. Presence of similar helpers in other somehow related languages/libs:
> 
> a) Rust[1]: 'replace' from std::mem module, there is also 'take'
> helper (__xchg(&x, 0)), which is the same as private helper in
> i915 - fetch_and_zero, see latest patch.
> b) C++ [2]: 'exchange' from utility header.
> 
> If the idea is OK there are still 2 qestions to answer:
> 
> 1. Name of the helper, __xchg follows kernel conventions,
> but for me Rust names are also OK.

I like replace(), or, shockingly, exchange().

But...   Can we simply make swap() return the previous value?

previous_min_rate = swap(&evport->qos.min_rate, min_rate);




Re: [Intel-gfx] mm/huge_memory: do not clobber swp_entry_t during THP split

2022-10-25 Thread Andrew Morton
On Tue, 25 Oct 2022 11:03:38 +0100 Mel Gorman  
wrote:

> > If so I
> > can temporarily put it in until it arrives via the next rc - assuming that
> > would be the flow from upstream pov?
> > 
> 
> I expect it to. It's currently in the akpm/mm.git branch
> mm/mm-hotfixes-unstable where I expect it to flow to mm/mm-hotfixes-stable
> in due course before sending to Linus. I can't make promises about the
> timing as that's determined by Andrew.

This is now in mainline, 71e2d666ef85.


Re: [Intel-gfx] i915: crash with 5.19-rc2

2022-08-10 Thread Andrew Morton
On Wed, 10 Aug 2022 17:09:37 +0100 Matthew Wilcox  wrote:

> On Wed, Aug 10, 2022 at 08:55:32AM -0700, Hugh Dickins wrote:
> > This is not a bug in zram or i915, but what Matthew fixes in
> > https://lore.kernel.org/lkml/20220730042518.1264767-1-wi...@infradead.org/
> 
> Thanks for tracking that down, Hugh.  Nice to know it's a crash rather
> than a data corruption.  The fix is in Andrew's tree, so I think it's
> already destined for upstream soon.

Yes, that's in the hotfixes queue for sending upstream Fridayish.

> Andrew, I have two fixes that I don't see in your tree:

Is it expected to be in my tree?  It's a huge v1 patch series on which
I wasn't cc'ed?

> https://lore.kernel.org/linux-mm/20220808193430.3378317-2-wi...@infradead.org/T/#u
> https://lore.kernel.org/linux-mm/20220808193430.3378317-4-wi...@infradead.org/T/#u
> 
> The first is of minor importance, the second I believe Hugh has hit in
> his testing.

In that case the second patch should be pulled out of that series, have
its changelog made to describe the runtime effects and have a Cc:stable
added, please.



Re: [Intel-gfx] [PATCH RFC v2] mm: Add f_ops->populate()

2022-03-06 Thread Andrew Morton
On Sun,  6 Mar 2022 05:26:55 +0200 Jarkko Sakkinen  wrote:

> Sometimes you might want to use MAP_POPULATE to ask a device driver to
> initialize the device memory in some specific manner. SGX driver can use
> this to request more memory by issuing ENCLS[EAUG] x86 opcode for each
> page in the address range.

Why is this useful?  Please fully describe the benefit to kernel users.
Convince us that the benefit justifies the code churn, maintenance
cost and larger kernel footprint.

Do we know of any other drivers which might use this?

> Add f_ops->populate() with the same parameters as f_ops->mmap() and make
> it conditionally called inside call_mmap(). Update call sites
> accodingly.

spello

> -static inline int call_mmap(struct file *file, struct vm_area_struct *vma)
> +static inline int call_mmap(struct file *file, struct vm_area_struct *vma, 
> bool do_populate)
>  {
> - return file->f_op->mmap(file, vma);
> + int ret = file->f_op->mmap(file, vma);
> +
> + if (!ret && do_populate && file->f_op->populate)
> + ret = file->f_op->populate(file, vma);
> +
> + return ret;
>  }

Should this still be inlined?




Re: [Intel-gfx] remove alloc_vm_area v2

2020-09-25 Thread Andrew Morton
On Thu, 24 Sep 2020 15:58:42 +0200 Christoph Hellwig  wrote:

> this series removes alloc_vm_area, which was left over from the big
> vmalloc interface rework.  It is a rather arkane interface, basicaly
> the equivalent of get_vm_area + actually faulting in all PTEs in
> the allocated area.  It was originally addeds for Xen (which isn't
> modular to start with), and then grew users in zsmalloc and i915
> which seems to mostly qualify as abuses of the interface, especially
> for i915 as a random driver should not set up PTE bits directly.
> 
> Note that the i915 patches apply to the drm-tip branch of the drm-tip
> tree, as that tree has recent conflicting commits in the same area.

Is the drm-tip material in linux-next yet?  I'm still seeing a non-trivial
reject in there at present.

___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH v2] mm: Track page table modifications in __apply_to_page_range()

2020-08-21 Thread Andrew Morton
On Fri, 21 Aug 2020 14:37:46 +0200 Joerg Roedel  wrote:

> The __apply_to_page_range() function is also used to change and/or
> allocate page-table pages in the vmalloc area of the address space.
> Make sure these changes get synchronized to other page-tables in the
> system by calling arch_sync_kernel_mappings() when necessary.
> 
> Tested-by: Chris Wilson  #x86-32
> Cc:  # v5.8+

I'm trying to figure out how you figured out that this is 5.8+.  Has a
particular misbehaving commit been identified?

___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH v2] mm: Track page table modifications in __apply_to_page_range()

2020-08-21 Thread Andrew Morton
On Fri, 21 Aug 2020 14:37:46 +0200 Joerg Roedel  wrote:

> The __apply_to_page_range() function is also used to change and/or
> allocate page-table pages in the vmalloc area of the address space.
> Make sure these changes get synchronized to other page-tables in the
> system by calling arch_sync_kernel_mappings() when necessary.

There's no description here of the user-visible effects of the bug. 
Please always provide this, especially when proposing a -stable
backport.  Take pity upon all the downstream kernel maintainers who are
staring at this wondering whether they should risk adding it to their
kernels.


___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH hmm v2 1/5] mm/hmm: make CONFIG_DEVICE_PRIVATE into a select

2020-05-09 Thread Andrew Morton
On Fri,  1 May 2020 15:20:44 -0300 Jason Gunthorpe  wrote:

> From: Jason Gunthorpe 
> 
> There is no reason for a user to select this or not directly - it should
> be selected by drivers that are going to use the feature, similar to how
> CONFIG_HMM_MIRROR works.
> 
> Currently all drivers provide a feature kconfig that will disable use of
> DEVICE_PRIVATE in that driver, allowing users to avoid enabling this if
> they don't want the overhead.
> 

I'm not too sure what's going on here, but i386 allmodconfig broke.

kernel/resource.c: In function '__request_free_mem_region':
kernel/resource.c:1653:28: error: 'PA_SECTION_SHIFT' undeclared (first use in 
this function); did you mean 'SECTIONS_PGSHIFT'?
  size = ALIGN(size, 1UL << PA_SECTION_SHIFT);

because in current mainline, allmodconfig produces
CONFIG_DEVICE_PRIVATE=n but in current linux-next, allmodconfig
produces CONFIG_DEVICE_PRIVATE=y.  But CONFIG_SPARSEMEM=n so the build
breaks.

Bisection fingers this commit, but reverting it doesn't seem to fix
things.  Could you take a look please?

I'm seeing this from menuconfig:

WARNING: unmet direct dependencies detected for DEVICE_PRIVATE
  Depends on [n]: ZONE_DEVICE [=n]
  Selected by [m]:
  - DRM_NOUVEAU_SVM [=y] && HAS_IOMEM [=y] && DRM_NOUVEAU [=m] && MMU [=y] && 
STAGING [=y]
  - TEST_HMM [=m] && RUNTIME_TESTING_MENU [=y] && TRANSPARENT_HUGEPAGE [=y]

`select' rather sucks this way - easy to break dependencies.  Quite a
number of years ago the Kconfig gurus were saying "avoid", but I don't
recall the details.



___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH] list: Prevent compiler reloads inside 'safe' list iteration

2020-03-11 Thread Andrew Morton
On Tue, 10 Mar 2020 08:47:49 -0700 "Paul E. McKenney"  
wrote:

> On Tue, Mar 10, 2020 at 03:05:57PM +, David Laight wrote:
> > From: Marco Elver
> > > Sent: 10 March 2020 14:10
> > ...
> > > FWIW, for writes we're already being quite generous, in that plain
> > > aligned writes up to word-size are assumed to be "atomic" with the
> > > default (conservative) config, i.e. marking such writes is optional.
> > > Although, that's a generous assumption that is not always guaranteed
> > > to hold 
> > > (https://lore.kernel.org/lkml/20190821103200.kpufwtviqhpbuv2n@willie-the-truck/).
> > 
> > Remind me to start writing everything in assembler.
> 
> Been there, done that.  :-/
> 
> > That and to mark all structure members 'volatile'.
> 
> Indeed.  READ_ONCE() and WRITE_ONCE() get this same effect, but without
> pessimizing non-concurrent accesses to those same members.  Plus KCSAN
> knows about READ_ONCE(), WRITE_ONCE(), and also volatile members.
> 

So I take it from all the above that we should do this.

Did anyone actually review the code? :)
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH 01/52] mm/sl[uo]b: export __kmalloc_track(_node)_caller

2020-02-19 Thread Andrew Morton
On Wed, 19 Feb 2020 11:20:31 +0100 Daniel Vetter  wrote:

> tracker in drm for stuff that's tied to the lifetime of a drm_device,
> not the underlying struct device. Kinda like devres, but for drm.
> 
> ...
>
> Ack for merging through drm trees very much appreciated.

Acked-by: Andrew Morton 
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH v4] string-choice: add yesno(), onoff(), enableddisabled(), plural() helpers

2019-10-23 Thread Andrew Morton
On Wed, 23 Oct 2019 16:13:08 +0300 Jani Nikula  wrote:

> The kernel has plenty of ternary operators to choose between constant
> strings, such as condition ? "yes" : "no", as well as value == 1 ? "" :
> "s":
> 
> $ git grep '? "yes" : "no"' | wc -l
> 258
> $ git grep '? "on" : "off"' | wc -l
> 204
> $ git grep '? "enabled" : "disabled"' | wc -l
> 196
> $ git grep '? "" : "s"' | wc -l
> 25
> 
> Additionally, there are some occurences of the same in reverse order,
> split to multiple lines, or otherwise not caught by the simple grep.
> 
> Add helpers to return the constant strings. Remove existing equivalent
> and conflicting functions in i915, cxgb4, and USB core. Further
> conversion can be done incrementally.
> 
> The main goal here is to abstract recurring patterns, and slightly clean
> up the code base by not open coding the ternary operators.

Fair enough.

> --- /dev/null
> +++ b/include/linux/string-choice.h
> @@ -0,0 +1,31 @@
> +/* SPDX-License-Identifier: MIT */
> +/*
> + * Copyright © 2019 Intel Corporation
> + */
> +
> +#ifndef __STRING_CHOICE_H__
> +#define __STRING_CHOICE_H__
> +
> +#include 
> +
> +static inline const char *yesno(bool v)
> +{
> + return v ? "yes" : "no";
> +}
> +
> +static inline const char *onoff(bool v)
> +{
> + return v ? "on" : "off";
> +}
> +
> +static inline const char *enableddisabled(bool v)
> +{
> + return v ? "enabled" : "disabled";
> +}
> +
> +static inline const char *plural(long v)
> +{
> + return v == 1 ? "" : "s";
> +}
> +
> +#endif /* __STRING_CHOICE_H__ */

These aren't very good function names.  Better to create a kernel-style
namespace such as "choice_" and then add the expected underscores:

choice_yes_no()
choice_enabled_disabled()
choice_plural()

(Example: note that slabinfo.c already has an "onoff()").


Also, I worry that making these functions inline means that each .o
file will contain its own copy of the strings ("yes", "no", "enabled",
etc) if the .c file calls the relevant helper.  I'm not sure if the
linker is smart enough (yet) to fix this up.  If not, we will end up
with a smaller kernel by uninlining these functions. 
lib/string-choice.c would suit.

And doing this will cause additional savings: calling a single-arg
out-of-line function generates less .text than calling yesno().  When I
did this: 

--- 
a/include/linux/string-choice.h~string-choice-add-yesno-onoff-enableddisabled-plural-helpers-fix
+++ a/include/linux/string-choice.h
@@ -8,10 +8,7 @@
 
 #include 
 
-static inline const char *yesno(bool v)
-{
-   return v ? "yes" : "no";
-}
+const char *yesno(bool v);
 
 static inline const char *onoff(bool v)
 {

The text segment of drivers/net/ethernet/chelsio/cxgb4/cxgb4_debugfs.o
(78 callsites) shrunk by 118 bytes.

___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

Re: [Intel-gfx] [Bug 205065] New: workqueue: PF_MEMALLOC task 173(kswapd0) is flushing !WQ_MEM_RECLAIM events:gen6_pm_rps_work [i915]

2019-10-02 Thread Andrew Morton

(switched to email.  Please respond via emailed reply-to-all, not via the
bugzilla web interface).

On Tue, 01 Oct 2019 17:06:35 + bugzilla-dae...@bugzilla.kernel.org wrote:

> https://bugzilla.kernel.org/show_bug.cgi?id=205065
> 
> Bug ID: 205065
>Summary: workqueue: PF_MEMALLOC task 173(kswapd0) is flushing
> !WQ_MEM_RECLAIM events:gen6_pm_rps_work [i915]
>Product: Memory Management
>Version: 2.5
> Kernel Version: Ubuntu 5.3.0-13.14-generic 5.3.0
>   Hardware: All
> OS: Linux
>   Tree: Mainline
> Status: NEW
>   Severity: normal
>   Priority: P1
>  Component: Other
>   Assignee: a...@linux-foundation.org
>   Reporter: romanescu.2...@gmail.com
> Regression: No

Maybe Tejun can help interpret this ;)

> Open bug in launchpad.net
> https://bugs.launchpad.net/bugs/1846241
> 
> dmesg:
> [ 9998.518472] [ cut here ]
> [ 9998.518505] workqueue: PF_MEMALLOC task 173(kswapd0) is flushing
> !WQ_MEM_RECLAIM events:gen6_pm_rps_work [i915]
> [ 9998.518516] WARNING: CPU: 5 PID: 173 at kernel/workqueue.c:2598
> check_flush_dependency+0xa7/0x140
> [ 9998.518517] Modules linked in: usbhid rfcomm ccm cmac bnep
> snd_hda_codec_hdmi snd_hda_codec_realtek snd_hda_codec_generic ledtrig_audio
> nls_iso8859_1 snd_soc_skl snd_soc_hdac_hda snd_hda_ext_core snd_soc_skl_ipc
> snd_soc_sst_ipc snd_soc_sst_dsp snd_soc_acpi_intel_match snd_soc_acpi
> snd_soc_core snd_compress ac97_bus snd_pcm_dmaengine x86_pkg_temp_thermal
> intel_powerclamp coretemp snd_hda_intel snd_hda_codec kvm_intel snd_hda_core
> kvm irqbypass snd_hwdep snd_pcm snd_seq_midi snd_seq_midi_event intel_rapl_msr
> snd_rawmidi ath9k_htc mei_hdcp ath9k_common btusb uvcvideo crct10dif_pclmul
> ath9k_hw btrtl videobuf2_vmalloc crc32_pclmul videobuf2_memops btbcm ath
> snd_seq btintel videobuf2_v4l2 i915 ghash_clmulni_intel mac80211
> videobuf2_common hid_sensor_incl_3d bluetooth hid_sensor_magn_3d
> hid_sensor_gyro_3d hid_sensor_accel_3d hid_sensor_rotation videodev
> snd_seq_device hid_sensor_trigger snd_timer cfg80211 drm_kms_helper mc
> industrialio_triggered_buffer ecdh_generic kfifo_buf ecc
> [ 9998.518560] processor_thermal_device hid_sensor_iio_common libarc4
> aesni_intel spi_pxa2xx_platform drm snd intel_rapl_common industrialio
> aes_x86_64 crypto_simd i2c_algo_bit fb_sys_fops input_leds cryptd glue_helper
> joydev dw_dmac intel_cstate syscopyarea sysfillrect intel_xhci_usb_role_switch
> mei_me intel_rapl_perf serio_raw wmi_bmof intel_wmi_thunderbolt dw_dmac_core
> soundcore idma64 8250_dw cros_ec_ishtp mei hid_multitouch roles virt_dma
> cros_ec_core sysimgblt intel_pch_thermal intel_soc_dts_iosf hp_wmi
> soc_button_array int3403_thermal intel_vbtn int340x_thermal_zone sparse_keymap
> hp_accel acpi_pad lis3lv02d hp_wireless input_polldev int3400_thermal
> acpi_thermal_rel mac_hid sch_fq_codel parport_pc ppdev sunrpc lp parport
> ip_tables x_tables autofs4 btrfs xor zstd_compress raid6_pq libcrc32c
> hid_sensor_custom hid_sensor_hub intel_ishtp_loader intel_ishtp_hid 
> hid_generic
> psmouse sdhci_pci cqhci i2c_i801 ahci intel_lpss_pci intel_ish_ipc sdhci
> i2c_hid libahci intel_lpss intel_ishtp hid wmi
> [ 9998.518603] pinctrl_sunrisepoint video pinctrl_intel
> [ 9998.518609] CPU: 5 PID: 173 Comm: kswapd0 Not tainted 5.3.0-13-generic
> #14-Ubuntu
> [ 9998.518610] Hardware name: HP HP Pavilion x360 Convertible 14-cd0xxx/8486,
> BIOS F.34 04/29/2019
> [ 9998.518614] RIP: 0010:check_flush_dependency+0xa7/0x140
> [ 9998.518616] Code: 8d 8a 70 0a 00 00 4d 89 e0 48 8d 8b b0 00 00 00 4c 89 ca
> 48 c7 c7 e8 93 d3 a1 48 89 45 e0 c6 05 c4 29 75 01 01 e8 f4 13 fe ff <0f> 0b 
> 48
> 8b 45 e0 eb 0f 4c 89 ef e8 39 8e 00 00 41 f6 45 25 08 75
> [ 9998.518618] RSP: 0018:ba44c034f7f0 EFLAGS: 00010086
> [ 9998.518620] RAX:  RBX: 9d76a400ce00 RCX:
> 
> [ 9998.518621] RDX: 0063 RSI: a2581fe3 RDI:
> 0046
> [ 9998.518623] RBP: ba44c034f810 R08: a2581f80 R09:
> 0063
> [ 9998.518624] R10: a2582360 R11: a2581fcb R12:
> c1092710
> [ 9998.518625] R13: 9d76a30b5e00 R14: 0001 R15:
> 9d76a5df0700
> [ 9998.518627] FS: () GS:9d76a5d4()
> knlGS:
> [ 9998.518628] CS: 0010 DS:  ES:  CR0: 80050033
> [ 9998.518630] CR2: 7f877e503000 CR3: 00019be0a002 CR4:
> 003606e0
> [ 9998.518631] Call Trace:
> [ 9998.518636] __flush_work+0x97/0x1d0
> [ 9998.518641] ? enqueue_hrtimer+0x3d/0x90
> [ 9998.518644] __cancel_work_timer+0x10e/0x190
> [ 9998.518648] ? _cond_resched+0x19/0x30
> [ 9998.518651] ? synchronize_irq+0x3e/0xb0
> [ 9998.518654] cancel_work_sync+0x10/0x20
> [ 9998.518692] gen6_disable_rps_interrupts+0x95/0xc0 [i915]
> [ 9998.518732] gen6_rps_idle+0x1f/0xf0 [i915]
> [ 9998.518772] intel_gt_

Re: [Intel-gfx] [PATCH 3/4] kernel.h: Add non_block_start/end()

2019-08-22 Thread Andrew Morton
On Tue, 20 Aug 2019 22:24:40 +0200 Daniel Vetter  wrote:

> Hi Peter,
> 
> Iirc you've been involved at least somewhat in discussing this. -mm folks
> are a bit undecided whether these new non_block semantics are a good idea.
> Michal Hocko still is in support, but Andrew Morton and Jason Gunthorpe
> are less enthusiastic. Jason said he's ok with merging the hmm side of
> this if scheduler folks ack. If not, then I'll respin with the
> preempt_disable/enable instead like in v1.

I became mollified once Michel explained the rationale.  I think it's
OK.  It's very specific to the oom reaper and hopefully won't be used
more widely(?).

___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

Re: [Intel-gfx] [PATCH 2/5] kernel.h: Add non_block_start/end()

2019-08-15 Thread Andrew Morton
On Thu, 15 Aug 2019 10:44:29 +0200 Michal Hocko  wrote:

> > I continue to struggle with this.  It introduces a new kernel state
> > "running preemptibly but must not call schedule()".  How does this make
> > any sense?
> > 
> > Perhaps a much, much more detailed description of the oom_reaper
> > situation would help out.
>  
> The primary point here is that there is a demand of non blockable mmu
> notifiers to be called when the oom reaper tears down the address space.
> As the oom reaper is the primary guarantee of the oom handling forward
> progress it cannot be blocked on anything that might depend on blockable
> memory allocations. These are not really easy to track because they
> might be indirect - e.g. notifier blocks on a lock which other context
> holds while allocating memory or waiting for a flusher that needs memory
> to perform its work. If such a blocking state happens that we can end up
> in a silent hang with an unusable machine.
> Now we hope for reasonable implementations of mmu notifiers (strong
> words I know ;) and this should be relatively simple and effective catch
> all tool to detect something suspicious is going on.
> 
> Does that make the situation more clear?

Yes, thanks, much.  Maybe a code comment along the lines of

  This is on behalf of the oom reaper, specifically when it is
  calling the mmu notifiers.  The problem is that if the notifier were
  to block on, for example, mutex_lock() and if the process which holds
  that mutex were to perform a sleeping memory allocation, the oom
  reaper is now blocked on completion of that memory allocation.

btw, do we need task_struct.non_block_count?  Perhaps the oom reaper
thread could set a new PF_NONBLOCK (which would be more general than
PF_OOM_REAPER).  If we run out of PF_ flags, use (current == oom_reaper_th).

___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

Re: [Intel-gfx] [PATCH 1/5] mm: Check if mmu notifier callbacks are allowed to fail

2019-08-14 Thread Andrew Morton
On Wed, 14 Aug 2019 22:20:23 +0200 Daniel Vetter  wrote:

> Just a bit of paranoia, since if we start pushing this deep into
> callchains it's hard to spot all places where an mmu notifier
> implementation might fail when it's not allowed to.
> 
> Inspired by some confusion we had discussing i915 mmu notifiers and
> whether we could use the newly-introduced return value to handle some
> corner cases. Until we realized that these are only for when a task
> has been killed by the oom reaper.
> 
> An alternative approach would be to split the callback into two
> versions, one with the int return value, and the other with void
> return value like in older kernels. But that's a lot more churn for
> fairly little gain I think.
> 
> Summary from the m-l discussion on why we want something at warning
> level: This allows automated tooling in CI to catch bugs without
> humans having to look at everything. If we just upgrade the existing
> pr_info to a pr_warn, then we'll have false positives. And as-is, no
> one will ever spot the problem since it's lost in the massive amounts
> of overall dmesg noise.
> 
> ...
>
> --- a/mm/mmu_notifier.c
> +++ b/mm/mmu_notifier.c
> @@ -179,6 +179,8 @@ int __mmu_notifier_invalidate_range_start(struct 
> mmu_notifier_range *range)
>   pr_info("%pS callback failed with %d in 
> %sblockable context.\n",
>   mn->ops->invalidate_range_start, _ret,
>   !mmu_notifier_range_blockable(range) ? 
> "non-" : "");
> + WARN_ON(mmu_notifier_range_blockable(range) ||
> + ret != -EAGAIN);
>   ret = _ret;
>   }
>   }

A problem with WARN_ON(a || b) is that if it triggers, we don't know
whether it was because of a or because of b.  Or both.  So I'd suggest

WARN_ON(a);
WARN_ON(b);

___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

Re: [Intel-gfx] [PATCH 2/5] kernel.h: Add non_block_start/end()

2019-08-14 Thread Andrew Morton
On Wed, 14 Aug 2019 22:20:24 +0200 Daniel Vetter  wrote:

> In some special cases we must not block, but there's not a
> spinlock, preempt-off, irqs-off or similar critical section already
> that arms the might_sleep() debug checks. Add a non_block_start/end()
> pair to annotate these.
> 
> This will be used in the oom paths of mmu-notifiers, where blocking is
> not allowed to make sure there's forward progress. Quoting Michal:
> 
> "The notifier is called from quite a restricted context - oom_reaper -
> which shouldn't depend on any locks or sleepable conditionals. The code
> should be swift as well but we mostly do care about it to make a forward
> progress. Checking for sleepable context is the best thing we could come
> up with that would describe these demands at least partially."
> 
> Peter also asked whether we want to catch spinlocks on top, but Michal
> said those are less of a problem because spinlocks can't have an
> indirect dependency upon the page allocator and hence close the loop
> with the oom reaper.

I continue to struggle with this.  It introduces a new kernel state
"running preemptibly but must not call schedule()".  How does this make
any sense?

Perhaps a much, much more detailed description of the oom_reaper
situation would help out.

___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

Re: [Intel-gfx] mmotm 2019-07-04-15-01 uploaded (gpu/drm/i915/oa/)

2019-07-04 Thread Andrew Morton
On Thu, 04 Jul 2019 22:22:41 -0700 Joe Perches  wrote:

> > So when comparing a zero-length file with a non-existent file, diff
> > produces no output.
> 
> Why use the -N option ?
> 
> $ diff --help
> [...]
>   -N, --new-file  treat absent files as empty
> 
> otherwise
> 
> $ cd $(mktemp -d -p .)
> $ touch x
> $ diff -u x y
> diff: y: No such file or directory

Without -N diff fails and exits with an error.  -N does what's desired
as long as the non-missing file isn't empty.


___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

Re: [Intel-gfx] mmotm 2019-07-04-15-01 uploaded (gpu/drm/i915/oa/)

2019-07-04 Thread Andrew Morton
On Fri, 5 Jul 2019 13:14:35 +1000 Stephen Rothwell  
wrote:

> > I checked next-20190704 tag.
> > 
> > I see the empty file
> > drivers/gpu/drm/i915/oa/Makefile
> > 
> > Did someone delete it?
> 
> Commit
> 
>   5ed7a0cf3394 ("drm/i915: Move OA files to separate folder")
> 
> from the drm-intel tree seems to have created it as an empty file.

hrm.

diff(1) doesn't seem to know how to handle a zero-length file.

y:/home/akpm> mkdir foo
y:/home/akpm> cd foo
y:/home/akpm/foo> touch x
y:/home/akpm/foo> diff -uN x y
y:/home/akpm/foo> date > x
y:/home/akpm/foo> diff -uN x y
--- x   2019-07-04 21:58:37.815028211 -0700
+++ y   1969-12-31 16:00:00.0 -0800
@@ -1 +0,0 @@
-Thu Jul  4 21:58:37 PDT 2019

So when comparing a zero-length file with a non-existent file, diff
produces no output.


This'll make things happy.  And I guess it should be done to protect
all the valuable intellectual property in that file.

--- /dev/null
+++ a/drivers/gpu/drm/i915/oa/Makefile
@@ -0,0 +1 @@
+# SPDX-License-Identifier: GPL-2.0
_

___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

Re: [Intel-gfx] [PATCH V2] include: linux: Regularise the use of FIELD_SIZEOF macro

2019-06-11 Thread Andrew Morton
On Tue, 11 Jun 2019 15:00:10 -0600 Andreas Dilger  wrote:

> >> to FIELD_SIZEOF
> > 
> > As Alexey has pointed out, C structs and unions don't have fields -
> > they have members.  So this is an opportunity to switch everything to
> > a new member_sizeof().
> > 
> > What do people think of that and how does this impact the patch footprint?
> 
> I did a check, and FIELD_SIZEOF() is used about 350x, while sizeof_field()
> is about 30x, and SIZEOF_FIELD() is only about 5x.

Erk.  Sorry, I should have grepped.

> That said, I'm much more in favour of "sizeof_field()" or "sizeof_member()"
> than FIELD_SIZEOF().  Not only does that better match "offsetof()", with
> which it is closely related, but is also closer to the original "sizeof()".
> 
> Since this is a rather trivial change, it can be split into a number of
> patches to get approval/landing via subsystem maintainers, and there is no
> huge urgency to remove the original macros until the users are gone.  It
> would make sense to remove SIZEOF_FIELD() and sizeof_field() quickly so
> they don't gain more users, and the remaining FIELD_SIZEOF() users can be
> whittled away as the patches come through the maintainer trees.

In that case I'd say let's live with FIELD_SIZEOF() and remove
sizeof_field() and SIZEOF_FIELD().

I'm a bit surprised that the FIELD_SIZEOF() definition ends up in
stddef.h rather than in kernel.h where such things are normally
defined.  Why is that?

___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

Re: [Intel-gfx] [PATCH V2] include: linux: Regularise the use of FIELD_SIZEOF macro

2019-06-11 Thread Andrew Morton
On Wed, 12 Jun 2019 01:08:36 +0530 Shyam Saini 
 wrote:

> Currently, there are 3 different macros, namely sizeof_field, SIZEOF_FIELD
> and FIELD_SIZEOF which are used to calculate the size of a member of
> structure, so to bring uniformity in entire kernel source tree lets use
> FIELD_SIZEOF and replace all occurrences of other two macros with this.
> 
> For this purpose, redefine FIELD_SIZEOF in include/linux/stddef.h and
> tools/testing/selftests/bpf/bpf_util.h and remove its defination from
> include/linux/kernel.h
> 
> In favour of FIELD_SIZEOF, this patch also deprecates other two similar
> macros sizeof_field and SIZEOF_FIELD.
> 
> For code compatibility reason, retain sizeof_field macro as a wrapper macro
> to FIELD_SIZEOF

As Alexey has pointed out, C structs and unions don't have fields -
they have members.  So this is an opportunity to switch everything to
a new member_sizeof().

What do people think of that and how does this impact the patch footprint?
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

Re: [Intel-gfx] [PATCH v7] mm, drm/i915: mark pinned shmemfs pages as unevictable

2018-11-06 Thread Andrew Morton
On Tue,  6 Nov 2018 13:23:24 + Chris Wilson  
wrote:

> From: Kuo-Hsin Yang 
> 
> The i915 driver uses shmemfs to allocate backing storage for gem
> objects. These shmemfs pages can be pinned (increased ref count) by
> shmem_read_mapping_page_gfp(). When a lot of pages are pinned, vmscan
> wastes a lot of time scanning these pinned pages. In some extreme case,
> all pages in the inactive anon lru are pinned, and only the inactive
> anon lru is scanned due to inactive_ratio, the system cannot swap and
> invokes the oom-killer. Mark these pinned pages as unevictable to speed
> up vmscan.
> 
> Export pagevec API check_move_unevictable_pages().
> 
> This patch was inspired by Chris Wilson's change [1].
> 
> [1]: https://patchwork.kernel.org/patch/9768741/
> 
> ...
>
> --- a/drivers/gpu/drm/i915/i915_gem.c
> +++ b/drivers/gpu/drm/i915/i915_gem.c
> @@ -2382,12 +2382,26 @@ void __i915_gem_object_invalidate(struct 
> drm_i915_gem_object *obj)
>   invalidate_mapping_pages(mapping, 0, (loff_t)-1);
>  }
>  
> +/**

This token is used to introduce a kerneldoc comment.

> + * Move pages to appropriate lru and release the pagevec. Decrement the ref
> + * count of these pages.
> + */

But this isn't a kerneldoc comment.

At least, I don't think it is.  Maybe the parser got smarter when I
wasn't looking.

> +static inline void check_release_pagevec(struct pagevec *pvec)
> +{
> + if (pagevec_count(pvec)) {
> + check_move_unevictable_pages(pvec);
> + __pagevec_release(pvec);
> + cond_resched();
> +     }
> +}

This looks too large to be inlined and the compiler will ignore the
`inline' anyway.


Otherwise, Acked-by: Andrew Morton .  Please
go ahead and merge via the appropriate drm tree.

___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH] mm, oom: distinguish blockable mode for mmu notifiers

2018-07-24 Thread Andrew Morton
On Tue, 24 Jul 2018 16:17:47 +0200 Michal Hocko  wrote:

> On Fri 20-07-18 17:09:02, Andrew Morton wrote:
> [...]
> > - Undocumented return value.
> > 
> > - comment "failed to reap part..." is misleading - sounds like it's
> >   referring to something which happened in the past, is in fact
> >   referring to something which might happen in the future.
> > 
> > - fails to call trace_finish_task_reaping() in one case
> > 
> > - code duplication.
> > 
> > - Increases mmap_sem hold time a little by moving
> >   trace_finish_task_reaping() inside the locked region.  So sue me ;)
> > 
> > - Sharing the finish: path means that the trace event won't
> >   distinguish between the two sources of finishing.
> > 
> > Please take a look?
> 
> oom_reap_task_mm should return false when __oom_reap_task_mm return
> false. This is what my patch did but it seems this changed by
> http://www.ozlabs.org/~akpm/mmotm/broken-out/mm-oom-remove-oom_lock-from-oom_reaper.patch
> so that one should be fixed.
> 
> diff --git a/mm/oom_kill.c b/mm/oom_kill.c
> index 104ef4a01a55..88657e018714 100644
> --- a/mm/oom_kill.c
> +++ b/mm/oom_kill.c
> @@ -565,7 +565,7 @@ static bool oom_reap_task_mm(struct task_struct *tsk, 
> struct mm_struct *mm)
>   /* failed to reap part of the address space. Try again later */
>   if (!__oom_reap_task_mm(mm)) {
>   up_read(&mm->mmap_sem);
> - return true;
> + return false;
>   }
>  
>   pr_info("oom_reaper: reaped process %d (%s), now anon-rss:%lukB, 
> file-rss:%lukB, shmem-rss:%lukB\n",

OK, thanks, I added that.

> 
> On top of that the proposed cleanup looks as follows:
> 

Looks good to me.  Seems a bit strange that we omit the pr_info()
output if the mm was partially reaped - people would still want to know
this?   Not very important though.

___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH] mm, oom: distinguish blockable mode for mmu notifiers

2018-07-20 Thread Andrew Morton
On Mon, 16 Jul 2018 13:50:58 +0200 Michal Hocko  wrote:

> From: Michal Hocko 
> 
> There are several blockable mmu notifiers which might sleep in
> mmu_notifier_invalidate_range_start and that is a problem for the
> oom_reaper because it needs to guarantee a forward progress so it cannot
> depend on any sleepable locks.
> 
> ...
>
> @@ -571,7 +565,12 @@ static bool oom_reap_task_mm(struct task_struct *tsk, 
> struct mm_struct *mm)
>  
>   trace_start_task_reaping(tsk->pid);
>  
> - __oom_reap_task_mm(mm);
> + /* failed to reap part of the address space. Try again later */
> + if (!__oom_reap_task_mm(mm)) {
> + up_read(&mm->mmap_sem);
> + ret = false;
> + goto unlock_oom;
> + }

This function is starting to look a bit screwy.

: static bool oom_reap_task_mm(struct task_struct *tsk, struct mm_struct *mm)
: {
:   if (!down_read_trylock(&mm->mmap_sem)) {
:   trace_skip_task_reaping(tsk->pid);
:   return false;
:   }
: 
:   /*
:* MMF_OOM_SKIP is set by exit_mmap when the OOM reaper can't
:* work on the mm anymore. The check for MMF_OOM_SKIP must run
:* under mmap_sem for reading because it serializes against the
:* down_write();up_write() cycle in exit_mmap().
:*/
:   if (test_bit(MMF_OOM_SKIP, &mm->flags)) {
:   up_read(&mm->mmap_sem);
:   trace_skip_task_reaping(tsk->pid);
:   return true;
:   }
: 
:   trace_start_task_reaping(tsk->pid);
: 
:   /* failed to reap part of the address space. Try again later */
:   if (!__oom_reap_task_mm(mm)) {
:   up_read(&mm->mmap_sem);
:   return true;
:   }
: 
:   pr_info("oom_reaper: reaped process %d (%s), now anon-rss:%lukB, 
file-rss:%lukB, shmem-rss:%lukB\n",
:   task_pid_nr(tsk), tsk->comm,
:   K(get_mm_counter(mm, MM_ANONPAGES)),
:   K(get_mm_counter(mm, MM_FILEPAGES)),
:   K(get_mm_counter(mm, MM_SHMEMPAGES)));
:   up_read(&mm->mmap_sem);
: 
:   trace_finish_task_reaping(tsk->pid);
:   return true;
: }

- Undocumented return value.

- comment "failed to reap part..." is misleading - sounds like it's
  referring to something which happened in the past, is in fact
  referring to something which might happen in the future.

- fails to call trace_finish_task_reaping() in one case

- code duplication.


I'm thinking it wants to be something like this?

: /*
:  * Return true if we successfully acquired (then released) mmap_sem
:  */
: static bool oom_reap_task_mm(struct task_struct *tsk, struct mm_struct *mm)
: {
:   if (!down_read_trylock(&mm->mmap_sem)) {
:   trace_skip_task_reaping(tsk->pid);
:   return false;
:   }
: 
:   /*
:* MMF_OOM_SKIP is set by exit_mmap when the OOM reaper can't
:* work on the mm anymore. The check for MMF_OOM_SKIP must run
:* under mmap_sem for reading because it serializes against the
:* down_write();up_write() cycle in exit_mmap().
:*/
:   if (test_bit(MMF_OOM_SKIP, &mm->flags)) {
:   trace_skip_task_reaping(tsk->pid);
:   goto out;
:   }
: 
:   trace_start_task_reaping(tsk->pid);
: 
:   if (!__oom_reap_task_mm(mm)) {
:   /* Failed to reap part of the address space. Try again later */
:   goto finish;
:   }
: 
:   pr_info("oom_reaper: reaped process %d (%s), now anon-rss:%lukB, 
file-rss:%lukB, shmem-rss:%lukB\n",
:   task_pid_nr(tsk), tsk->comm,
:   K(get_mm_counter(mm, MM_ANONPAGES)),
:   K(get_mm_counter(mm, MM_FILEPAGES)),
:   K(get_mm_counter(mm, MM_SHMEMPAGES)));
: finish:
:   trace_finish_task_reaping(tsk->pid);
: out:
:   up_read(&mm->mmap_sem);
:   return true;
: }

- Increases mmap_sem hold time a little by moving
  trace_finish_task_reaping() inside the locked region.  So sue me ;)

- Sharing the finish: path means that the trace event won't
  distinguish between the two sources of finishing.

Please take a look?
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH] mm, oom: distinguish blockable mode for mmu notifiers

2018-07-20 Thread Andrew Morton
On Tue, 17 Jul 2018 10:12:01 +0200 Michal Hocko  wrote:

> > Any suggestions regarding how the driver developers can test this code
> > path?  I don't think we presently have a way to fake an oom-killing
> > event?  Perhaps we should add such a thing, given the problems we're
> > having with that feature.
> 
> The simplest way is to wrap an userspace code which uses these notifiers
> into a memcg and set the hard limit to hit the oom. This can be done
> e.g. after the test faults in all the mmu notifier managed memory and
> set the hard limit to something really small. Then we are looking for a
> proper process tear down.

Chances are, some of the intended audience don't know how to do this
and will either have to hunt down a lot of documentation or will just
not test it.  But we want them to test it, so a little worked step-by-step
example would help things along please.
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH] mm, oom: distinguish blockable mode for mmu notifiers

2018-07-16 Thread Andrew Morton
On Mon, 16 Jul 2018 13:50:58 +0200 Michal Hocko  wrote:

> From: Michal Hocko 
> 
> There are several blockable mmu notifiers which might sleep in
> mmu_notifier_invalidate_range_start and that is a problem for the
> oom_reaper because it needs to guarantee a forward progress so it cannot
> depend on any sleepable locks.
> 
> Currently we simply back off and mark an oom victim with blockable mmu
> notifiers as done after a short sleep. That can result in selecting a
> new oom victim prematurely because the previous one still hasn't torn
> its memory down yet.
> 
> We can do much better though. Even if mmu notifiers use sleepable locks
> there is no reason to automatically assume those locks are held.
> Moreover majority of notifiers only care about a portion of the address
> space and there is absolutely zero reason to fail when we are unmapping an
> unrelated range. Many notifiers do really block and wait for HW which is
> harder to handle and we have to bail out though.
> 
> This patch handles the low hanging fruid. 
> __mmu_notifier_invalidate_range_start
> gets a blockable flag and callbacks are not allowed to sleep if the
> flag is set to false. This is achieved by using trylock instead of the
> sleepable lock for most callbacks and continue as long as we do not
> block down the call chain.

I assume device driver developers are wondering "what does this mean
for me".  As I understand it, the only time they will see
blockable==false is when their driver is being called in response to an
out-of-memory condition, yes?  So it is a very rare thing.

Any suggestions regarding how the driver developers can test this code
path?  I don't think we presently have a way to fake an oom-killing
event?  Perhaps we should add such a thing, given the problems we're
having with that feature.

> I think we can improve that even further because there is a common
> pattern to do a range lookup first and then do something about that.
> The first part can be done without a sleeping lock in most cases AFAICS.
> 
> The oom_reaper end then simply retries if there is at least one notifier
> which couldn't make any progress in !blockable mode. A retry loop is
> already implemented to wait for the mmap_sem and this is basically the
> same thing.
> 
> ...
>
> +static inline int mmu_notifier_invalidate_range_start_nonblock(struct 
> mm_struct *mm,
> +   unsigned long start, unsigned long end)
> +{
> + int ret = 0;
> + if (mm_has_notifiers(mm))
> + ret = __mmu_notifier_invalidate_range_start(mm, start, end, 
> false);
> +
> + return ret;
>  }

nit,

{
if (mm_has_notifiers(mm))
return __mmu_notifier_invalidate_range_start(mm, start, end, 
false);
return 0;
}

would suffice.


> 
> ...
>
> --- a/mm/mmap.c
> +++ b/mm/mmap.c
> @@ -3074,7 +3074,7 @@ void exit_mmap(struct mm_struct *mm)
>* reliably test it.
>*/
>   mutex_lock(&oom_lock);
> - __oom_reap_task_mm(mm);
> + (void)__oom_reap_task_mm(mm);
>   mutex_unlock(&oom_lock);

What does this do?

>   set_bit(MMF_OOM_SKIP, &mm->flags);
> 
> ...
>

___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH] kernel.h: Add for_each_if()

2018-07-11 Thread Andrew Morton
On Wed, 11 Jul 2018 13:51:08 +0200 Daniel Vetter  wrote:

> But I still have the situation that a bunch of maintainers acked this
> and Andrew Morton defacto nacked it, which I guess means I'll keep the
> macro in drm? The common way to go about this seems to be to just push
> the patch series with the ack in some pull request to Linus and ignore
> the people who raised questions, but not really my thing.

Heh.

But, am I wrong?  Code which uses regular kernel style doesn't have
these issues.  We shouldn't be enabling irregular style - we should be
making such sites more regular.  The fact that the compiler generates a
nice warning in some cases simply helps us with that.



___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH] kernel.h: Add for_each_if()

2018-07-09 Thread Andrew Morton
On Mon,  9 Jul 2018 18:25:09 +0200 Daniel Vetter  wrote:

> To avoid compilers complainig about ambigious else blocks when putting
> an if condition into a for_each macro one needs to invert the
> condition and add a dummy else. We have a nice little convenience
> macro for that in drm headers, let's move it out. Subsequent patches
> will roll it out to other places.
> 
> The issue the compilers complain about are nested if with an else
> block and no {} to disambiguate which if the else belongs to. The C
> standard is clear, but in practice people forget:
> 
> if (foo)
>   if (bar)
>   /* something */
>   else
>   /* something else

um, yeah, don't do that.  Kernel coding style is very much to do

if (foo) {
if (bar)
/* something */
else
/* something else
}

And if not doing that generates a warning then, well, do that.

> The same can happen in a for_each macro when it also contains an if
> condition at the end, except the compiler message is now really
> confusing since there's only 1 if:
> 
> for_each_something()
>   if (bar)
>   /* something */
>   else
>   /* something else
> 
> The for_each_if() macro, by inverting the condition and adding an
> else, avoids the compiler warning.

Ditto.

> Motivated by a discussion with Andy and Yisheng, who want to add
> another for_each_macro which would benefit from for_each_if() instead
> of hand-rolling it.

Ditto.

> v2: Explain a bit better what this is good for, after the discussion
> with Peter Z.

Presumably the above was discussed in whatever-thread-that-was.

___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [patch 1/1] drivers/gpu/drm/i915/intel_guc_log.c: work around gcc-4.4.4 union initializer issue

2018-03-08 Thread Andrew Morton
On Thu, 08 Mar 2018 12:06:46 +0200 Jani Nikula  
wrote:

> On Wed, 07 Mar 2018, a...@linux-foundation.org wrote:
> > From: Andrew Morton 
> > Subject: drivers/gpu/drm/i915/intel_guc_log.c: work around gcc-4.4.4 union 
> > initializer issue
> >
> > gcc-4.4.4 has problems with initalizers of anon unions.
> >
> > drivers/gpu/drm/i915/intel_guc_log.c: In function 'guc_log_control':
> > drivers/gpu/drm/i915/intel_guc_log.c:64: error: unknown field 
> > 'logging_enabled' specified in initializer
> >
> > Work around this.
> 
> Thanks for the patch, pushed to drm-intel-next-queued.
> 
> That said, how long do we have to care about old compilers? I thought we
> were converging on at the very least GCC 4.5 being required.

Yes, I've seen some talk about that and it is about time for us to do
it.  I'm not sure what stage things are at though.
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH 1/2] mm: Track actual nr_scanned during shrink_slab()

2017-08-25 Thread Andrew Morton
On Thu, 24 Aug 2017 10:00:49 +0200 Vlastimil Babka  wrote:

> > Even if a
> > shrinker has a mistake, VM will have big trouble like infinite loop.
> 
> We could fake 0 as 1 or something, at least.

If the shrinker returns sc->nr_scanned==0 then that's a buggy shrinker
- it should return SHRINK_STOP in that case.  Only a single shrinker
(i915) presently uses sc->nr_scanned and that one gets it right.  I
think it's OK - there's a limit to how far we should go defending
against buggy kernel code, surely.



___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH 01/23] mm/shmem: introduce shmem_file_setup_with_mnt

2017-08-25 Thread Andrew Morton
On Thu, 24 Aug 2017 13:04:09 +0100 Matthew Auld 
 wrote:

> On 23 August 2017 at 23:34, Andrew Morton  wrote:
> > On Wed, 23 Aug 2017 12:31:28 +0300 Joonas Lahtinen 
> >  wrote:
> >
> >> This patch has been floating around for a while now Acked and without
> >> further comments. It is blocking us from merging huge page support to
> >> drm/i915.
> >>
> >> Would you mind merging it, or prodding the right people to get it in?
> >>
> >> Regards, Joonas
> >>
> >> On Mon, 2017-08-21 at 19:34 +0100, Matthew Auld wrote:
> >> > We are planning to use our own tmpfs mnt in i915 in place of the
> >> > shm_mnt, such that we can control the mount options, in particular
> >> > huge=, which we require to support huge-gtt-pages. So rather than roll
> >> > our own version of __shmem_file_setup, it would be preferred if we could
> >> > just give shmem our mnt, and let it do the rest.
> >
> > hm, it's a bit odd.  I'm having trouble locating the code which handles
> > huge=within_size (and any other options?).
> 
> See here https://patchwork.freedesktop.org/patch/172771/, currently we
> only care about huge=within_size.
> 
> > What other approaches were considered?
> 
> We also tried https://patchwork.freedesktop.org/patch/156528/, where
> it was suggested that we mount our own tmpfs instance.
> 
> Following from that we now have our own tmps mnt mounted with
> huge=within_size. With this patch we avoid having to roll our own
> __shmem_file_setup like in
> https://patchwork.freedesktop.org/patch/163024/.
> 
> > Was it not feasible to add i915-specific mount options to
> > mm/shmem.c (for example?).
> 
> Hmm, I think within_size should suffice for our needs.

hm, ok, well, unless someone can think of something cleaner, please add
my ack and include it in the appropriate drm tree.

___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH 01/23] mm/shmem: introduce shmem_file_setup_with_mnt

2017-08-23 Thread Andrew Morton
On Wed, 23 Aug 2017 12:31:28 +0300 Joonas Lahtinen 
 wrote:

> This patch has been floating around for a while now Acked and without
> further comments. It is blocking us from merging huge page support to
> drm/i915.
> 
> Would you mind merging it, or prodding the right people to get it in?
> 
> Regards, Joonas
> 
> On Mon, 2017-08-21 at 19:34 +0100, Matthew Auld wrote:
> > We are planning to use our own tmpfs mnt in i915 in place of the
> > shm_mnt, such that we can control the mount options, in particular
> > huge=, which we require to support huge-gtt-pages. So rather than roll
> > our own version of __shmem_file_setup, it would be preferred if we could
> > just give shmem our mnt, and let it do the rest.

hm, it's a bit odd.  I'm having trouble locating the code which handles
huge=within_size (and any other options?).  What other approaches were
considered?  Was it not feasible to add i915-specific mount options to
mm/shmem.c (for example?).

___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH 2/2] drm/i915: Wire up shrinkctl->nr_scanned

2017-08-22 Thread Andrew Morton
On Tue, 22 Aug 2017 14:53:25 +0100 Chris Wilson  
wrote:

> shrink_slab() allows us to report back the number of objects we
> successfully scanned (out of the target shrinkctl->nr_to_scan). As
> report the number of pages owned by each GEM object as a separate item
> to the shrinker, we cannot precisely control the number of shrinker
> objects we scan on each pass; and indeed may free more than requested.
> If we fail to tell the shrinker about the number of objects we process,
> it will continue to hold a grudge against us as any objects left
> unscanned are added to the next reclaim -- and so we will keep on
> "unfairly" shrinking our own slab in comparison to other slabs.

It's unclear which tree this is against but I think I got it all fixed
up.  Please check the changes to i915_gem_shrink().

From: Chris Wilson 
Subject: drm/i915: wire up shrinkctl->nr_scanned

shrink_slab() allows us to report back the number of objects we
successfully scanned (out of the target shrinkctl->nr_to_scan).  As report
the number of pages owned by each GEM object as a separate item to the
shrinker, we cannot precisely control the number of shrinker objects we
scan on each pass; and indeed may free more than requested.  If we fail to
tell the shrinker about the number of objects we process, it will continue
to hold a grudge against us as any objects left unscanned are added to the
next reclaim -- and so we will keep on "unfairly" shrinking our own slab
in comparison to other slabs.

Link: http://lkml.kernel.org/r/20170822135325.9191-2-ch...@chris-wilson.co.uk
Signed-off-by: Chris Wilson 
Cc: Joonas Lahtinen 
Cc: Michal Hocko 
Cc: Johannes Weiner 
Cc: Hillf Danton 
Cc: Minchan Kim 
Cc: Vlastimil Babka 
Cc: Mel Gorman 
Cc: Shaohua Li 
Cc: Christoph Lameter 
Cc: David Rientjes 
Cc: Joonsoo Kim 
Cc: Pekka Enberg 
Signed-off-by: Andrew Morton 
---

 drivers/gpu/drm/i915/i915_debugfs.c  |4 +--
 drivers/gpu/drm/i915/i915_drv.h  |1 
 drivers/gpu/drm/i915/i915_gem.c  |4 +--
 drivers/gpu/drm/i915/i915_gem_gtt.c  |2 -
 drivers/gpu/drm/i915/i915_gem_shrinker.c |   24 +++--
 5 files changed, 24 insertions(+), 11 deletions(-)

diff -puN 
drivers/gpu/drm/i915/i915_debugfs.c~drm-i915-wire-up-shrinkctl-nr_scanned 
drivers/gpu/drm/i915/i915_debugfs.c
--- a/drivers/gpu/drm/i915/i915_debugfs.c~drm-i915-wire-up-shrinkctl-nr_scanned
+++ a/drivers/gpu/drm/i915/i915_debugfs.c
@@ -4333,10 +4333,10 @@ i915_drop_caches_set(void *data, u64 val
 
lockdep_set_current_reclaim_state(GFP_KERNEL);
if (val & DROP_BOUND)
-   i915_gem_shrink(dev_priv, LONG_MAX, I915_SHRINK_BOUND);
+   i915_gem_shrink(dev_priv, LONG_MAX, NULL, I915_SHRINK_BOUND);
 
if (val & DROP_UNBOUND)
-   i915_gem_shrink(dev_priv, LONG_MAX, I915_SHRINK_UNBOUND);
+   i915_gem_shrink(dev_priv, LONG_MAX, NULL, I915_SHRINK_UNBOUND);
 
if (val & DROP_SHRINK_ALL)
i915_gem_shrink_all(dev_priv);
diff -puN drivers/gpu/drm/i915/i915_drv.h~drm-i915-wire-up-shrinkctl-nr_scanned 
drivers/gpu/drm/i915/i915_drv.h
--- a/drivers/gpu/drm/i915/i915_drv.h~drm-i915-wire-up-shrinkctl-nr_scanned
+++ a/drivers/gpu/drm/i915/i915_drv.h
@@ -3628,6 +3628,7 @@ i915_gem_object_create_internal(struct d
 /* i915_gem_shrinker.c */
 unsigned long i915_gem_shrink(struct drm_i915_private *dev_priv,
  unsigned long target,
+ unsigned long *nr_scanned,
  unsigned flags);
 #define I915_SHRINK_PURGEABLE 0x1
 #define I915_SHRINK_UNBOUND 0x2
diff -puN drivers/gpu/drm/i915/i915_gem.c~drm-i915-wire-up-shrinkctl-nr_scanned 
drivers/gpu/drm/i915/i915_gem.c
--- a/drivers/gpu/drm/i915/i915_gem.c~drm-i915-wire-up-shrinkctl-nr_scanned
+++ a/drivers/gpu/drm/i915/i915_gem.c
@@ -2408,7 +2408,7 @@ rebuild_st:
goto err_sg;
}
 
-   i915_gem_shrink(dev_priv, 2 * page_count, *s++);
+   i915_gem_shrink(dev_priv, 2 * page_count, NULL, *s++);
cond_resched();
 
/* We've tried hard to allocate the memory by reaping
@@ -5012,7 +5012,7 @@ int i915_gem_freeze_late(struct drm_i915
 * the objects as well, see i915_gem_freeze()
 */
 
-   i915_gem_shrink(dev_priv, -1UL, I915_SHRINK_UNBOUND);
+   i915_gem_shrink(dev_priv, -1UL, NULL, I915_SHRINK_UNBOUND);
i915_gem_drain_freed_objects(dev_priv);
 
mutex_lock(&dev_priv->drm.struct_mutex);
diff -puN 
drivers/gpu/drm/i915/i915_gem_gtt.c~drm-i915-wire-up-shrinkctl-nr_scanned 
drivers/gpu/drm/i915/i915_gem_gtt.c
--- a/drivers/gpu/drm/i915/i915_gem_gtt.c~drm-i915-wire-up-shrinkctl-nr_scanned
+++ a/drivers/gpu/drm/i915/i915_gem_gtt.c
@@ -2061,7 +2061,7 @@ int i915_gem_gtt_prepare_pages(struct dr
 */

Re: [Intel-gfx] [PATCH] mm: Reward slab shrinkers that reclaim more than they were asked

2017-08-15 Thread Andrew Morton
On Sat, 12 Aug 2017 12:34:37 +0100 Chris Wilson  
wrote:

> Some shrinkers may only be able to free a bunch of objects at a time, and
> so free more than the requested nr_to_scan in one pass. Account for the
> extra freed objects against the total number of objects we intend to
> free, otherwise we may end up penalising the slab far more than intended.
> 
> ...
>
> --- a/mm/vmscan.c
> +++ b/mm/vmscan.c
> @@ -398,6 +398,7 @@ static unsigned long do_shrink_slab(struct shrink_control 
> *shrinkctl,
>   break;
>   freed += ret;
>  
> + nr_to_scan = max(nr_to_scan, ret);
>   count_vm_events(SLABS_SCANNED, nr_to_scan);
>   total_scan -= nr_to_scan;
>   scanned += nr_to_scan;

Well...  kinda.  But what happens if the shrinker scanned more objects
than requested but failed to free many of them?  Of if the shrinker
scanned less than requested?

We really want to return nr_scanned from the shrinker invocation. 
Could we add a field to shrink_control for this?

--- a/mm/vmscan.c~a
+++ a/mm/vmscan.c
@@ -393,14 +393,15 @@ static unsigned long do_shrink_slab(stru
unsigned long nr_to_scan = min(batch_size, total_scan);
 
shrinkctl->nr_to_scan = nr_to_scan;
+   shrinkctl->nr_scanned = nr_to_scan;
ret = shrinker->scan_objects(shrinker, shrinkctl);
if (ret == SHRINK_STOP)
break;
freed += ret;
 
-   count_vm_events(SLABS_SCANNED, nr_to_scan);
-   total_scan -= nr_to_scan;
-   scanned += nr_to_scan;
+   count_vm_events(SLABS_SCANNED, shrinkctl->nr_scanned);
+   total_scan -= shrinkctl->nr_scanned;
+   scanned += shrinkctl->nr_scanned;
 
cond_resched();
}
_

___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH 3/4] lib/scatterlist: Introduce and export __sg_alloc_table_from_pages

2017-08-02 Thread Andrew Morton
On Wed, 2 Aug 2017 14:06:39 +0100 Tvrtko Ursulin 
 wrote:

> 
> Hi Andrew,
> 
> We have a couple of small lib/scatterlist.c tidies here, plus exporting 
> the new API which allows drivers to control the maximum coalesced entry 
> as created by __sg_alloc_table_from_pages.
> 
> I am looking for an ack to merge these three patches via the drm-intel tree.
> 
> 
> ...
>
> On 27/07/2017 10:05, Tvrtko Ursulin wrote:
> > From: Tvrtko Ursulin 
> > 
> > Drivers like i915 benefit from being able to control the maxium
> > size of the sg coallesced segment while building the scatter-

"coalesced"

> > gather list.
> > 
> > Introduce and export the __sg_alloc_table_from_pages function
> > which will allow it that control.
> > 
>
> ...
>
> > --- a/lib/scatterlist.c
> > +++ b/lib/scatterlist.c
> > @@ -370,14 +370,15 @@ int sg_alloc_table(struct sg_table *table, unsigned 
> > int nents, gfp_t gfp_mask)
> >   EXPORT_SYMBOL(sg_alloc_table);
> >   
> >   /**
> > - * sg_alloc_table_from_pages - Allocate and initialize an sg table from
> > - *an array of pages
> > - * @sgt:   The sg table header to use
> > - * @pages: Pointer to an array of page pointers
> > - * @n_pages:   Number of pages in the pages array
> > - * @offset: Offset from start of the first page to the start of a 
> > buffer
> > - * @size:   Number of valid bytes in the buffer (after offset)
> > - * @gfp_mask:  GFP allocation mask
> > + * __sg_alloc_table_from_pages - Allocate and initialize an sg table from
> > + *  an array of pages
> > + * @sgt:The sg table header to use
> > + * @pages:  Pointer to an array of page pointers
> > + * @n_pages:Number of pages in the pages array
> > + * @offset:  Offset from start of the first page to the start of a 
> > buffer
> > + * @size:Number of valid bytes in the buffer (after offset)
> > + * @max_segment: Maximum size of a scatterlist node in bytes (page aligned)
> > + * @gfp_mask:   GFP allocation mask
> >*
> >*  Description:
> >*Allocate and initialize an sg table from a list of pages. Contiguous

The Description doesn't describe how this differs from
sg_alloc_table_from_pages(), although it doesn't seem terribly
important.

> > +
> > +/**
> > + * sg_alloc_table_from_pages - Allocate and initialize an sg table from
> > + *an array of pages
> > + * @sgt:The sg table header to use
> > + * @pages:  Pointer to an array of page pointers
> > + * @n_pages:Number of pages in the pages array
> > + * @offset:  Offset from start of the first page to the start of a 
> > buffer
> > + * @size:Number of valid bytes in the buffer (after offset)
> > + * @gfp_mask:   GFP allocation mask
> > + *
> > + *  Description:
> > + *Allocate and initialize an sg table from a list of pages. Contiguous
> > + *ranges of the pages are squashed into a single scatterlist node. A 
> > user
> > + *may provide an offset at a start and a size of valid data in a buffer
> > + *specified by the page array. The returned sg table is released by
> > + *sg_free_table.
> > + *
> > + * Returns:
> > + *   0 on success, negative error on failure
> > + */
> > +int sg_alloc_table_from_pages(struct sg_table *sgt, struct page **pages,
> > + unsigned int n_pages, unsigned int offset,
> > + unsigned long size, gfp_t gfp_mask)
> > +{
> > +   return __sg_alloc_table_from_pages(sgt, pages, n_pages, offset, size,
> > +  SCATTERLIST_MAX_SEGMENT, gfp_mask);
> > +}

Making this an inline would save a bunch or stack space in the callers?

One could just add the new arg then run around and update the 10ish
callers but I guess the new interface is OK.

Otherwise it's OK by me, please go ahead as proposed.
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH] lib/debugobjects: Export for use in modules

2016-11-22 Thread Andrew Morton
On Tue, 22 Nov 2016 14:30:39 + Chris Wilson  
wrote:

> Drivers, or other modules, that use a mixture of objects (especially
> objects embedded within other objects) would like to take advantage of
> the debugobjects facilities to help catch misuse. Currently, the
> debugobjects interface is only available to builtin drivers and requires
> a set of EXPORT_SYMBOL_GPL for use by modules.

You didn't tell me this (and you should have!) but I'm going to assume
that this patch is needed for current development so I have queued it for
inclusion in 4.9-rcX.  As a general make-life-easier-for-everyone thing.

___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH] mm/vmalloc: Keep a separate lazy-free list

2016-04-22 Thread Andrew Morton
On Fri, 15 Apr 2016 12:14:31 +0100 Chris Wilson  
wrote:

> > > purge_fragmented_blocks() manages per-cpu lists, so that looks safe
> > > under its own rcu_read_lock.
> > >
> > > Yes, it looks feasible to remove the purge_lock if we can relax sync.
> > 
> > what is still left is waiting on vmap_area_lock for !sync mode.
> > but probably is not that bad.
> 
> Ok, that's bit beyond my comfort zone with a patch to change the free
> list handling. I'll chicken out for the time being, atm I am more
> concerned that i915.ko may call set_page_wb() frequently on individual
> pages.

Nick Piggin's vmap rewrite.  20x (or more) faster. 
https://lwn.net/Articles/285341/

10 years ago, never finished.
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH v2] mm/vmap: Add a notifier for when we run out of vmap address space

2016-03-28 Thread Andrew Morton
On Thu, 17 Mar 2016 13:41:56 + Chris Wilson  
wrote:

> On Thu, Mar 17, 2016 at 01:34:59PM +, Chris Wilson wrote:
> > vmaps are temporary kernel mappings that may be of long duration.
> > Reusing a vmap on an object is preferrable for a driver as the cost of
> > setting up the vmap can otherwise dominate the operation on the object.
> > However, the vmap address space is rather limited on 32bit systems and
> > so we add a notification for vmap pressure in order for the driver to
> > release any cached vmappings.
> > 
> > The interface is styled after the oom-notifier where the callees are
> > passed a pointer to an unsigned long counter for them to indicate if they
> > have freed any space.
> > 
> > v2: Guard the blocking notifier call with gfpflags_allow_blocking()
> > 
> > Signed-off-by: Chris Wilson 
> > Cc: Andrew Morton 
> > Cc: David Rientjes 
> > Cc: Roman Peniaev 
> > Cc: Mel Gorman 
> > Cc: linux...@kvack.org
> > Cc: linux-ker...@vger.kernel.org
> > ---
> >  include/linux/vmalloc.h |  4 
> >  mm/vmalloc.c| 27 +++
> >  2 files changed, 31 insertions(+)
> > 
> > diff --git a/include/linux/vmalloc.h b/include/linux/vmalloc.h
> > index d1f1d338af20..edd676b8e112 100644
> > --- a/include/linux/vmalloc.h
> > +++ b/include/linux/vmalloc.h
> > @@ -187,4 +187,8 @@ pcpu_free_vm_areas(struct vm_struct **vms, int nr_vms)
> >  #define VMALLOC_TOTAL 0UL
> >  #endif
> >  
> > +struct notitifer_block;
> Omg. /o\

Hah.

Please move the forward declaration to top-of-file.  This prevents
people from later adding the same thing at line 100 - this has happened
before.

Apart from that, all looks OK to me - please merge it via the DRM tree
if that is more convenient.

___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH v2 1/2] mm: Export nr_swap_pages

2015-12-23 Thread Andrew Morton
On Wed, 23 Dec 2015 17:04:27 -0500 Johannes Weiner  wrote:

> On Thu, Dec 10, 2015 at 10:32:42AM +0100, Daniel Vetter wrote:
> > On Fri, Dec 04, 2015 at 11:09:52AM -0500, Johannes Weiner wrote:
> > > On Fri, Dec 04, 2015 at 03:58:53PM +, Chris Wilson wrote:
> > > > Some modules, like i915.ko, use swappable objects and may try to swap
> > > > them out under memory pressure (via the shrinker). Before doing so, they
> > > > want to check using get_nr_swap_pages() to see if any swap space is
> > > > available as otherwise they will waste time purging the object from the
> > > > device without recovering any memory for the system. This requires the
> > > > nr_swap_pages counter to be exported to the modules.
> > > > 
> > > > Signed-off-by: Chris Wilson 
> > > > Cc: "Goel, Akash" 
> > > > Cc: Johannes Weiner 
> > > > Cc: linux...@kvack.org
> > > 
> > > Acked-by: Johannes Weiner 
> > 
> > Ack for merging this through drm-intel trees for 4.5? I'm a bit unclear
> > who's ack I need for that for linux-mm topics ...
> 
> Andrew would be the -mm maintainer. CC'd.

yup, please go ahead and merge that via the DRM tree.

nr_swap_pages is a crappy name.  That means "number of pages in swap",
which isn't the case.  Something like "swap_pages_available" would be
better.

And your swap_available() isn't good either ;) It can mean "is any swap
online" or "what is the amount of free swap space (in unknown units!)".
I'd call it "swap_is_full()" and put a ! in the caller.  But it's
hardly important for a wee little static helper.
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] drivers/gpu/drm/i915/i915_gem_gtt.c

2015-05-23 Thread Andrew Morton
On Sat, 23 May 2015 14:30:09 +0100 Damien Lespiau  
wrote:

> On Fri, May 22, 2015 at 02:17:32PM -0700, Andrew Morton wrote:
> > I'm not sure what's happened to the drm code in linux-next - it's
> > exploding all over the place.  Did someone turn on -Werror without
> > doing anywhere near enough testing?
> > 
> > Anyway, I don't know how to fix this i386 build error:
> 
> Seems like you have CONFIG_DRM_I915_WERROR set?

Yes.

> We explicitely made sure to not enable -Werror by default,

`make allmodconfig' enables CONFIG_DRM_I915_WERROR.

I'm not sure what is the approved way of fixing this.  Perhaps
disabling CONFIG_DRM_I915_WERROR when CONFIG_COMPILE_TEST=y.
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx


[Intel-gfx] drivers/gpu/drm/i915/i915_gem_gtt.c

2015-05-22 Thread Andrew Morton

I'm not sure what's happened to the drm code in linux-next - it's
exploding all over the place.  Did someone turn on -Werror without
doing anywhere near enough testing?

Anyway, I don't know how to fix this i386 build error:

drivers/gpu/drm/i915/i915_gem_gtt.c: In function 'gen8_ppgtt_init':
drivers/gpu/drm/i915/i915_gem_gtt.c:954:2: error: large integer implicitly 
truncated to unsigned type [-Werror=overflow]

ppgtt->base.total = 1ULL << 32;

i915_address_space.total is a ulong: 32-bit.

___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH 2/5] mm: Refactor remap_pfn_range()

2015-04-07 Thread Andrew Morton
On Tue,  7 Apr 2015 17:31:36 +0100 Chris Wilson  
wrote:

> In preparation for exporting very similar functionality through another
> interface, gut the current remap_pfn_range(). The motivating factor here
> is to reuse the PGB/PUD/PMD/PTE walker, but allow back progation of
> errors rather than BUG_ON.

I'm not on intel-gfx and for some reason these patches didn't show up on
linux-mm.  I wanted to comment on "mutex: Export an interface to wrap a
mutex lock" but
http://lists.freedesktop.org/archives/intel-gfx/2015-April/064063.html
doesn't tell me which mailing lists were cc'ed and I can't find that
patch on linux-kernel.

Can you please do something to make this easier for us??

And please fully document all the mutex interfaces which you just
added.
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH] mm: Throttle shrinkers harder

2014-04-18 Thread Andrew Morton
On Thu, 10 Apr 2014 08:05:06 +0100 Chris Wilson  
wrote:

> During testing of i915.ko with working texture sets larger than RAM, we
> encounter OOM with plenty of memory still trapped within writeback, e.g:
> 
> [   42.386039] active_anon:10134 inactive_anon:1900781 isolated_anon:32
>  active_file:33 inactive_file:39 isolated_file:0
>  unevictable:0 dirty:0 writeback:337627 unstable:0
>  free:11985 slab_reclaimable:9458 slab_unreclaimable:23614
>  mapped:41 shmem:1560769 pagetables:1276 bounce:0
> 
> If we throttle for writeback following shrink_slab, this gives us time
> to wait upon the writeback generated by the i915.ko shinker:
> 
> [ 4756.750808] active_anon:24386 inactive_anon:900793 isolated_anon:0
>  active_file:23 inactive_file:20 isolated_file:0
>  unevictable:0 dirty:0 writeback:0 unstable:0
>  free:5550 slab_reclaimable:5184 slab_unreclaimable:4888
>  mapped:3 shmem:472393 pagetables:1249 bounce:0
> 
> (Sadly though the test is still failing.)
> 
> Testcase: igt/gem_tiled_swapping
> References: https://bugs.freedesktop.org/show_bug.cgi?id=72742

i915_gem_object_get_pages_gtt() makes my head spin, but
https://bugs.freedesktop.org/attachment.cgi?id=90818 says
"gfp_mask=0x201da" which is 

___GFP_HARDWALL|___GFP_COLD|___GFP_FS|___GFP_IO|___GFP_WAIT|___GFP_MOVABLE|___GFP_HIGHMEM

so this allocation should work and it very bad if the page allocator is
declaring oom while there is so much writeback in flight, assuming the
writeback is to eligible zones.

Mel, Johannes: could you take a look please?
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH] [RFC] Taint the kernel for unsafe module options

2014-03-05 Thread Andrew Morton
On Wed,  5 Mar 2014 10:33:14 +0100 Daniel Vetter  wrote:

> Users just love to set random piles of options since surely enabling
> all the experimental stuff helps. Later on we get bug reports because
> it all fell apart.
> 
> Even more fun when it's labelled a regression when some change only
> just made the feature possible (e.g. stolen memory fixes suddenly
> making fbc possible).
> 
> Make it clear that users are playing with fire here. In drm/i915 all
> these options follow the same pattern of using -1 as the per-machine
> default, and any other value being used for force the parameter.
> 
> Adding a pile of cc's to solicit input and figure out whether this
> would be generally useful - this quick rfc is just for drm/i915.

Seems harmless and potentially useful to others so yes, I'd vote for
putting it in core kernel.

However this only handles integers.  Will we end up needed great gobs
of new code to detect unsafe setting of u8's, strings, etc?


> --- a/drivers/gpu/drm/i915/i915_params.c
> +++ b/drivers/gpu/drm/i915/i915_params.c

The patch adds lots of trailing whitespace.  checkpatch is ->thattaway.

> @@ -50,7 +50,46 @@ struct i915_params i915 __read_mostly = {
>   .disable_display = 0,
>  };
>  
> -module_param_named(modeset, i915.modeset, int, 0400);
> +int param_set_unsafe_int(const char *val, const struct kernel_param *kp)
> +{
> + long l;
> + int ret;
> +
> + ret = kstrtol(val, 0, &l);
> + if (ret < 0 || ((int)l != l))
> + return ret < 0 ? ret : -EINVAL;

That's a bit screwy.  Simpler:

if (ret < 0)
return ret;
if ((int)l != l)
return -EINVAL;


> + /* Taint if userspace overrides the kernel defaults. */
> + if (l != -1) {
> + printk(KERN_WARNING "Setting dangerous option %s to non-default 
> value!\n",
> +kp->name);

pr_warn() is nicer.

> + add_taint(TAINT_USER, LOCKDEP_STILL_OK);
> + }
> +
> + *((int *)kp->arg) = l;
> + return 0;
> +}
>
> ...
>
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH] lib: Export interval_tree

2014-01-27 Thread Andrew Morton
On Sun, 26 Jan 2014 12:24:33 + Chris Wilson  
wrote:

> Note for maintainers, this is being proposed for use by i915.ko, so it
> may make the most sense to merge it via the drm/i915 tree in the next
> cycle.

Yes, please proceed that way.
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [Bug 64521] New: BUG kmalloc-4096 (Tainted: G W ): Poison overwritten

2013-11-06 Thread Andrew Morton

(switched to email.  Please respond via emailed reply-to-all, not via the
bugzilla web interface).


On Wed, 06 Nov 2013 19:12:29 + bugzilla-dae...@bugzilla.kernel.org wrote:

> https://bugzilla.kernel.org/show_bug.cgi?id=64521
> 
> Bug ID: 64521
>Summary: BUG kmalloc-4096 (Tainted: GW   ): Poison
> overwritten
>Product: Memory Management
>Version: 2.5
> Kernel Version: 3.11.7
>   Hardware: All
> OS: Linux
>   Tree: Mainline
> Status: NEW
>   Severity: normal
>   Priority: P1
>  Component: Slab Allocator
>   Assignee: a...@linux-foundation.org
>   Reporter: mikhail.v.gavri...@gmail.com
> Regression: No
> 
> Created attachment 113671
>   --> https://bugzilla.kernel.org/attachment.cgi?id=113671&action=edit
> dmesg output
> 
> $ uname -a
> Linux localhost.localdomain 3.11.7-300.fc20.x86_64+debug #1 SMP Mon Nov 4
> 14:54:17 UTC 2013 x86_64 x86_64 x86_64 GNU/Linux
> 
> [14065.872585] perf samples too long (2552 > 2500), lowering
> kernel.perf_event_max_sample_rate to 5
> [23825.627427]
> =
> [23825.627438] BUG kmalloc-4096 (Tainted: GW   ): Poison overwritten
> [23825.627442]
> -
> 
> [23825.627449] INFO: 0x880311f7a5ee-0x880311f7a5ee. First byte 0x7b
> instead of 0x6b
> [23825.627495] INFO: Allocated in i915_gem_execbuffer2+0x51/0x290 [i915] 
> age=89
> cpu=6 pid=2995
> [23825.627503] __slab_alloc+0x45f/0x526
> [23825.627509] __kmalloc+0x322/0x3b0
> [23825.627534] i915_gem_execbuffer2+0x51/0x290 [i915]
> [23825.627554] drm_ioctl+0x542/0x680 [drm]
> [23825.627561] do_vfs_ioctl+0x305/0x530
> [23825.627566] SyS_ioctl+0x81/0xa0
> [23825.627576] tracesys+0xdd/0xe2
> [23825.627601] INFO: Freed in i915_gem_execbuffer2+0xed/0x290 [i915] age=89
> cpu=6 pid=2995
> [23825.627607] __slab_free+0x3a/0x382
> [23825.627611] kfree+0x2c0/0x2f0
> [23825.627632] i915_gem_execbuffer2+0xed/0x290 [i915]
> [23825.627649] drm_ioctl+0x542/0x680 [drm]
> [23825.627653] do_vfs_ioctl+0x305/0x530
> [23825.627657] SyS_ioctl+0x81/0xa0
> [23825.627663] tracesys+0xdd/0xe2
> [23825.627668] INFO: Slab 0xea000c47de00 objects=7 used=7 fp=0x 
> (null) flags=0x5ff0004080
> [23825.627672] INFO: Object 0x880311f7a290 @offset=8848
> fp=0x880311f79148
> 
> [23825.627679] Bytes b4 880311f7a280: d6 ea 65 01 01 00 00 00 5a 5a 5a 5a
> 5a 5a 5a 5a  ..e.
> [23825.627684] Object 880311f7a290: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
> 6b 6b 6b  
> [23825.627688] Object 880311f7a2a0: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
> 6b 6b 6b  
> [23825.627692] Object 880311f7a2b0: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
> 6b 6b 6b  
> [23825.627696] Object 880311f7a2c0: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
> 6b 6b 6b  
> [23825.627700] Object 880311f7a2d0: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
> 6b 6b 6b  
> [23825.627704] Object 880311f7a2e0: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
> 6b 6b 6b  
> [23825.627708] Object 880311f7a2f0: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
> 6b 6b 6b  
> [23825.627711] Object 880311f7a300: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
> 6b 6b 6b  
> [23825.627715] Object 880311f7a310: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
> 6b 6b 6b  
> [23825.627719] Object 880311f7a320: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
> 6b 6b 6b  
> [23825.627723] Object 880311f7a330: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
> 6b 6b 6b  
> [23825.627727] Object 880311f7a340: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
> 6b 6b 6b  
> [23825.627731] Object 880311f7a350: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
> 6b 6b 6b  
> [23825.627735] Object 880311f7a360: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
> 6b 6b 6b  
> [23825.627739] Object 880311f7a370: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
> 6b 6b 6b  
> [23825.627743] Object 880311f7a380: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
> 6b 6b 6b  
> [23825.627747] Object 880311f7a390: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
> 6b 6b 6b  
> [23825.627751] Object 880311f7a3a0: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
> 6b 6b 6b  
> [23825.627755] Object 880311f7a3b0: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
> 6b 6b 6b  
> [23825.627759] Object 880311f7a3c0: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
> 6b 6b 6b  
> [23825.627762] Object 880311f7a3d0: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
> 6b 6b 6b  

Re: [Intel-gfx] [PATCH 0/2] lib/scatterlist: sg_page_iter: support for memory w/o backing pages

2013-03-26 Thread Andrew Morton
On Tue, 26 Mar 2013 15:50:20 +0100 Daniel Vetter  wrote:

> On Tue, Mar 26, 2013 at 03:14:17PM +0200, Imre Deak wrote:
> > When adding sg_page_iter I haven't thought properly through the use case
> > for sg lists w/o backing pages - which is specific to the i915 driver -
> > so this patchset adds support for this.
> > 
> > It applies on the i915 tree [1], where the iterator is in use already.
> > 
> > [1] git://people.freedesktop.org/~danvet/drm-intel [nightly branch]
> 
> i915 patches are already included in linux-next, so should apply on top of
> that, too. So can this go in through -mm for 3.10 or should I slurp it in
> through drm-intel trees (once it passes review)? I'd like to ditch the
> dummy page hack we're currently using (i.e. patch 2).

Please slurp it - there's little benefit in spreading it across two trees.
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH] mm: extend prefault helpers to fault in more than PAGE_SIZE

2012-03-01 Thread Andrew Morton
On Thu,  1 Mar 2012 20:22:59 +0100
Daniel Vetter  wrote:

> drm/i915 wants to read/write more than one page in its fastpath
> and hence needs to prefault more than PAGE_SIZE bytes.
> 
> Add new functions in filemap.h to make that possible.
> 
> Also kill a copy&pasted spurious space in both functions while at it.
> 
>
> ...
>
> +/* Multipage variants of the above prefault helpers, useful if more than
> + * PAGE_SIZE of date needs to be prefaulted. These are separate from the 
> above
> + * functions (which only handle up to PAGE_SIZE) to avoid clobbering the
> + * filemap.c hotpaths. */

Like this please:

/*
 * Multipage variants of the above prefault helpers, useful if more than
 * PAGE_SIZE of date needs to be prefaulted. These are separate from the above
 * functions (which only handle up to PAGE_SIZE) to avoid clobbering the
 * filemap.c hotpaths.
 */

and s/date/data/

> +static inline int fault_in_multipages_writeable(char __user *uaddr, int size)
> +{
> + int ret;
> + const char __user *end = uaddr + size - 1;
> +
> + if (unlikely(size == 0))
> + return 0;
> +
> + /*
> +  * Writing zeroes into userspace here is OK, because we know that if
> +  * the zero gets there, we'll be overwriting it.
> +  */

Yeah, like that.

> + while (uaddr <= end) {
> + ret = __put_user(0, uaddr);
> + if (ret != 0)
> + return ret;
> + uaddr += PAGE_SIZE;
> + }
> +
> + /* Check whether the range spilled into the next page. */
> + if (((unsigned long)uaddr & PAGE_MASK) ==
> + ((unsigned long)end & PAGE_MASK))
> + ret = __put_user(0, end);
> +
> + return ret;
> +}
> +
> +static inline int fault_in_multipages_readable(const char __user *uaddr,
> +int size)
> +{
> + volatile char c;
> + int ret;
> + const char __user *end = uaddr + size - 1;
> +
> + if (unlikely(size == 0))
> + return 0;
> +
> + while (uaddr <= end) {
> + ret = __get_user(c, uaddr);
> + if (ret != 0)
> + return ret;
> + uaddr += PAGE_SIZE;
> + }
> +
> + /* Check whether the range spilled into the next page. */
> + if (((unsigned long)uaddr & PAGE_MASK) ==
> + ((unsigned long)end & PAGE_MASK)) {
> + ret = __get_user(c, end);
> + (void)c;
> + }
> +
> + return ret;
> +}

Please merge it via the DRI tree.
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH] mm: extend prefault helpers to fault in more than PAGE_SIZE

2012-02-29 Thread Andrew Morton
On Thu, 1 Mar 2012 00:14:53 +0100
Daniel Vetter  wrote:

> I'll redo this patch by adding _multipage versions of these 2 functions
> for i915.

OK, but I hope "for i915" doesn't mean "private to"!  Put 'em in
pagemap.h, for maintenance reasons and because they are generic.

Making them inline is a bit sad, but that's OK for now - they have few
callsites.
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH] mm: extend prefault helpers to fault in more than PAGE_SIZE

2012-02-29 Thread Andrew Morton
On Wed, 29 Feb 2012 15:03:31 +0100
Daniel Vetter  wrote:

> drm/i915 wants to read/write more than one page in its fastpath
> and hence needs to prefault more than PAGE_SIZE bytes.
> 
> I've checked the callsites and they all already clamp size when
> calling fault_in_pages_* to the same as for the subsequent
> __copy_to|from_user and hence don't rely on the implicit clamping
> to PAGE_SIZE.
> 
> Also kill a copy&pasted spurious space in both functions while at it.
> 
> v2: As suggested by Andrew Morton, add a multipage parameter to both
> functions to avoid the additional branch for the pagemap.c hotpath.
> My gcc 4.6 here seems to dtrt and indeed reap these branches where not
> needed.
> 

I don't think this produced a very good result :(

>
> ...
>
> -static inline int fault_in_pages_writeable(char __user *uaddr, int size)
> +static inline int fault_in_pages_writeable(char __user *uaddr, int size,
> +bool multipage)
>  {
>   int ret;
> + char __user *end = uaddr + size - 1;
>  
>   if (unlikely(size == 0))
>   return 0;
> @@ -416,36 +419,46 @@ static inline int fault_in_pages_writeable(char __user 
> *uaddr, int size)
>* Writing zeroes into userspace here is OK, because we know that if
>* the zero gets there, we'll be overwriting it.
>*/
> - ret = __put_user(0, uaddr);
> - if (ret == 0) {
> - char __user *end = uaddr + size - 1;
> + do {
> + ret = __put_user(0, uaddr);
> + if (ret != 0)
> + return ret;
> + uaddr += PAGE_SIZE;
> + } while (multipage && uaddr <= end);
>  
> + if (ret == 0) {
>   /*
>* If the page was already mapped, this will get a cache miss
>* for sure, so try to avoid doing it.
>*/
> - if (((unsigned long)uaddr & PAGE_MASK) !=
> + if (((unsigned long)uaddr & PAGE_MASK) ==
>   ((unsigned long)end & PAGE_MASK))
> - ret = __put_user(0, end);
> + ret = __put_user(0, end);
>   }
>   return ret;
>  }

One effect of this change for the filemap.c callsite is that `uaddr'
now gets incremented by PAGE_SIZE.  That happens to not break anything
because we then mask `uaddr' with PAGE_MASK, and if gcc were really
smart, it could remove that addition.  But it's a bit ugly.

Ideally the patch would have no effect upon filemap.o size, but with an
allmodconfig config I'm seeing

   textdata bss dec hex filename
  22876 1187344   303387682 mm/filemap.o(before)
  22925 1187392   3043576e3 mm/filemap.o(after)

so we are adding read()/write() overhead, and bss mysteriously got larger.

Can we improve on this?  Even if it's some dumb

static inline int fault_in_pages_writeable(char __user *uaddr, int size,
   bool multipage)
{
if (multipage) {
do-this
} else {
do-that
}
}

the code duplication between do-this and do-that is regrettable, but at
least it's all in the same place in the same file, so we won't
accidentally introduce skew later on.

Alternatively, add a separate fault_in_multi_pages_writeable() to
pagemap.h.  I have a bad feeling this is what your original patch did!

(But we *should* be able to make this work!  Why did this version of
the patch go so wrong?)

___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH] mm: extend prefault helpers to fault in more than PAGE_SIZE

2012-02-24 Thread Andrew Morton
On Fri, 24 Feb 2012 14:34:31 +0100
Daniel Vetter  wrote:

> > > --- a/include/linux/pagemap.h
> > > +++ b/include/linux/pagemap.h
> > > @@ -408,6 +408,7 @@ extern void add_page_wait_queue(struct page *page, 
> > > wait_queue_t *waiter);
> > >  static inline int fault_in_pages_writeable(char __user *uaddr, int size)
> > >  {
> > >   int ret;
> > > + char __user *end = uaddr + size - 1;
> > >  
> > >   if (unlikely(size == 0))
> > >   return 0;
> > > @@ -416,17 +417,20 @@ static inline int fault_in_pages_writeable(char 
> > > __user *uaddr, int size)
> > >* Writing zeroes into userspace here is OK, because we know that if
> > >* the zero gets there, we'll be overwriting it.
> > >*/
> > > - ret = __put_user(0, uaddr);
> > > + while (uaddr <= end) {
> > > + ret = __put_user(0, uaddr);
> > > + if (ret != 0)
> > > + return ret;
> > > + uaddr += PAGE_SIZE;
> > > + }
> > 
> > The callsites in filemap.c are pretty hot paths, which is why this
> > thing remains explicitly inlined.  I think it would be worth adding a
> > bit of code here to avoid adding a pointless test-n-branch and larger
> > cache footprint to read() and write().
> > 
> > A way of doing that is to add another argument to these functions, say
> > "bool multipage".  Change the code to do
> > 
> > if (multipage) {
> > while (uaddr <= end) {
> > ...
> > }
> > }
> > 
> > and change the callsites to pass in constant "true" or "false".  Then
> > compile it up and manually check that the compiler completely removed
> > the offending code from the filemap.c callsites.
> > 
> > Wanna have a think about that?  If it all looks OK then please be sure
> > to add code comments explaining why we did this.
> 
> I wasn't really happy with the added branch either, but failed to come up
> with a trick to avoid it. Imho adding new _multipage variants of these
> functions instead of adding a constant argument is simpler because the
> functions don't really share much thanks to the block below. I'll see what
> it looks like (and obviously add a comment explaining what's going on).

well... that's just syntactic sugar:

static inline int __fault_in_pages_writeable(char __user *uaddr, int size, bool 
multipage)
{
...
}

static inline int fault_in_pages_writeable(char __user *uaddr, int size)
{
return __fault_in_pages_writeable(uaddr, size, false);
}

static inline int fault_in_multipages_writeable(char __user *uaddr, int size)
{
return __fault_in_pages_writeable(uaddr, size, true);
}

which I don't think is worth bothering with given the very small number
of callsites.

___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH] mm: extend prefault helpers to fault in more than PAGE_SIZE

2012-02-23 Thread Andrew Morton
On Thu, 16 Feb 2012 13:01:36 +0100
Daniel Vetter  wrote:

> drm/i915 wants to read/write more than one page in its fastpath
> and hence needs to prefault more than PAGE_SIZE bytes.
> 
> I've checked the callsites and they all already clamp size when
> calling fault_in_pages_* to the same as for the subsequent
> __copy_to|from_user and hence don't rely on the implicit clamping
> to PAGE_SIZE.
> 
> Also kill a copy&pasted spurious space in both functions while at it.
>
> ...
>
> --- a/include/linux/pagemap.h
> +++ b/include/linux/pagemap.h
> @@ -408,6 +408,7 @@ extern void add_page_wait_queue(struct page *page, 
> wait_queue_t *waiter);
>  static inline int fault_in_pages_writeable(char __user *uaddr, int size)
>  {
>   int ret;
> + char __user *end = uaddr + size - 1;
>  
>   if (unlikely(size == 0))
>   return 0;
> @@ -416,17 +417,20 @@ static inline int fault_in_pages_writeable(char __user 
> *uaddr, int size)
>* Writing zeroes into userspace here is OK, because we know that if
>* the zero gets there, we'll be overwriting it.
>*/
> - ret = __put_user(0, uaddr);
> + while (uaddr <= end) {
> + ret = __put_user(0, uaddr);
> + if (ret != 0)
> + return ret;
> + uaddr += PAGE_SIZE;
> + }

The callsites in filemap.c are pretty hot paths, which is why this
thing remains explicitly inlined.  I think it would be worth adding a
bit of code here to avoid adding a pointless test-n-branch and larger
cache footprint to read() and write().

A way of doing that is to add another argument to these functions, say
"bool multipage".  Change the code to do

if (multipage) {
while (uaddr <= end) {
...
}
}

and change the callsites to pass in constant "true" or "false".  Then
compile it up and manually check that the compiler completely removed
the offending code from the filemap.c callsites.

Wanna have a think about that?  If it all looks OK then please be sure
to add code comments explaining why we did this.

>   if (ret == 0) {
> - char __user *end = uaddr + size - 1;
> -
>   /*
>* If the page was already mapped, this will get a cache miss
>* for sure, so try to avoid doing it.
>*/
> - if (((unsigned long)uaddr & PAGE_MASK) !=
> + if (((unsigned long)uaddr & PAGE_MASK) ==
>   ((unsigned long)end & PAGE_MASK))

Maybe I'm having a dim day, but I don't immediately see why != got
turned into ==.


Once we have this settled I'd suggest that the patch be carried in
whatever-git-tree-needs-it.
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] i915 native backlight never got merged

2011-08-10 Thread Andrew Morton
On Wed, 10 Aug 2011 07:53:18 -0700 Keith Packard  wrote:

> On Wed, 10 Aug 2011 15:14:09 +0200, Michel Alexandre Salim 
>  wrote:
> 
> > FYI, I'm using that on top of Linux 3.0 and 3.1-rc1 and it still works
> > fine.
> 
> Matthew -- shall I just merge the old patch? Or did you want to make
> some changes?

I have a note here that Ali Gholami Rudi  had issues with
the version I merged into -mm.
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH 2/5] i915: Add native backlight control

2011-01-20 Thread Andrew Morton
On Fri, 21 Jan 2011 06:36:54 +0100 Sedat Dilek  
wrote:

> ( Original posting from [1] )
> 
> I have the backlight-type patchset for months in my patch-series (and
> maintained them if necessary against daily linux-next).
> Also the last series from 14-Jan-2011 (see 1-5 patch from [2] and the
> following ones at dri-devel ML).
> 
> If you couldn't find the updated v2 radeon-backlight-type patch, here
> the extract from my patch-series:
> ...
> # Patch from 
> + 
> backlight-type/v2-drm-radeon-kms-Expose-backlight-class-device-for-legacy-LVDS-encoder.patch
> ...
> 
> I can only speak for the radeon(-KMS) part with
> CONFIG_BACKLIGHT_CLASS_DEVICE=y set and against linux-next: It worked,
> it works.
> 
> I am a bit angry that someone of the "big 5" gets immediately an
> answer where mine is ignored [3].

Well, who were they sent to?

If it was dri-devel then perhaps the people there felt it was
inappropriate to their tree, or they were all busy fixing
100 regressions.

If it was Richard then he's been distracted by other things for some
time, which is why I recently started handling backlight and leds
patches.

If it was me then kick me, but I don't think it was.

In general, if you think that patches aren't getting attention then let
me know and send them to me - harassing people for you is part of my
job description.

> So dear Mr. AKPM, if you can standstill for a few moments to answer
> only one of my questions, through which kernel-tree will this patchset
> walk trough?

leds and backlight patches are presently getting merged into my tree
and I'm sending them into Linus.  I make sure that Richard sees them
all and when he finds time he helps review them for us.

They should turn up in linux-next too - we're working on that.


This particular patch series is theoretically a bit late for 2.6.38,
but if anyone thinks that's a wrong decision, feel free to explain why.

___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH 2/5] i915: Add native backlight control

2011-01-20 Thread Andrew Morton
On Fri, 21 Jan 2011 00:43:59 +
Matthew Garrett  wrote:

> On Thu, Jan 20, 2011 at 03:13:49PM -0800, Andrew Morton wrote:
> > On Fri, 21 Jan 2011 00:43:46 +0330
> > Ali Gholami Rudi  wrote:
> > 
> > > Ali Gholami Rudi  wrote:
> > > > I tried to apply this patch but I get:
> > > > 
> > > > drivers/gpu/drm/i915/intel_panel.c: In function 
> > > > 'intel_panel_setup_backlight':
> > > > drivers/gpu/drm/i915/intel_panel.c:319: error: 'struct 
> > > > backlight_properties' has no member named 'type'
> > > > drivers/gpu/drm/i915/intel_panel.c:319: error: 'BACKLIGHT_RAW' 
> > > > undeclared (first use in this function)
> > > > drivers/gpu/drm/i915/intel_panel.c:319: error: (Each undeclared 
> > > > identifier is reported only once
> > > > drivers/gpu/drm/i915/intel_panel.c:319: error: for each 
> > > > function it appears in.)
> > > 
> > > After applying patch 1/5, the patch applied cleanly.
> > > Now I can change the brightness using
> > > /sys/class/backlight/intel_backlight/brightness.
> > > So it does fix my issue.
> > > 
> > 
> > So we have a patch ordering issue and the
> > radeon-expose-backlight-class-device-for-legacy-lvds-encoder.patch
> > build error.
> 
> He applied 2/5, it didn't build, he applied 1/5 and it built? I don't 
> think that's a patch ordering issue :)

What, you expect reading skills?

> I think Michel's patch should fix the Radeon one. If not, can you drop 
> that and keep the rest of the set? I'm travelling at the moment and 
> won't have proper build access until the weekend.

OK, I resurrected everything.

I updated the new drivers/video/backlight/adp5520_bl.c, but there's a
decent chance that unconverted drivers will sneak in.  I trust they
will still work OK?

___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH 2/5] i915: Add native backlight control

2011-01-20 Thread Andrew Morton
On Fri, 21 Jan 2011 00:43:46 +0330
Ali Gholami Rudi  wrote:

> Ali Gholami Rudi  wrote:
> > I tried to apply this patch but I get:
> > 
> > drivers/gpu/drm/i915/intel_panel.c: In function 
> > 'intel_panel_setup_backlight':
> > drivers/gpu/drm/i915/intel_panel.c:319: error: 'struct 
> > backlight_properties' has no member named 'type'
> > drivers/gpu/drm/i915/intel_panel.c:319: error: 'BACKLIGHT_RAW' 
> > undeclared (first use in this function)
> > drivers/gpu/drm/i915/intel_panel.c:319: error: (Each undeclared 
> > identifier is reported only once
> > drivers/gpu/drm/i915/intel_panel.c:319: error: for each function it 
> > appears in.)
> 
> After applying patch 1/5, the patch applied cleanly.
> Now I can change the brightness using
> /sys/class/backlight/intel_backlight/brightness.
> So it does fix my issue.
> 

So we have a patch ordering issue and the
radeon-expose-backlight-class-device-for-legacy-lvds-encoder.patch
build error.

Matthew, I'll drop 'em all for now.
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH 3/5] radeon: Expose backlight class device for legacy LVDS encoder

2011-01-19 Thread Andrew Morton
On Fri, 14 Jan 2011 14:24:23 -0500
Matthew Garrett  wrote:

> From: Michel D__nzer 
> 
> Allows e.g. power management daemons to control the backlight level. Inspired
> by the corresponding code in radeonfb.
> 
> (Updated to add backlight type and make the connector the parent device - mjg)
> 

x86_64 allmodconfig:

drivers/gpu/drm/radeon/radeon_legacy_encoders.c: In function 
'radeon_legacy_lvds_update':
drivers/gpu/drm/radeon/radeon_legacy_encoders.c:64: error: 'struct 
radeon_encoder_atom_dig' has no member named 'bl_dev'
drivers/gpu/drm/radeon/radeon_legacy_encoders.c:65: error: 'struct 
radeon_encoder_atom_dig' has no member named 'backlight_level'
drivers/gpu/drm/radeon/radeon_legacy_encoders.c:69: error: 'struct 
radeon_encoder_lvds' has no member named 'bl_dev'
drivers/gpu/drm/radeon/radeon_legacy_encoders.c:70: error: 'struct 
radeon_encoder_lvds' has no member named 'backlight_level'
drivers/gpu/drm/radeon/radeon_legacy_encoders.c: In function 
'radeon_legacy_lvds_dpms':
drivers/gpu/drm/radeon/radeon_legacy_encoders.c:144: error: 'struct 
radeon_encoder_atom_dig' has no member named 'dpms_mode'
drivers/gpu/drm/radeon/radeon_legacy_encoders.c:147: error: 'struct 
radeon_encoder_lvds' has no member named 'dpms_mode'

___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH 1/4] drm: fix headers to include linux/types.h

2010-12-01 Thread Andrew Morton
On Wed, 1 Dec 2010 17:54:18 +0100
Julien Cristau  wrote:

> On Wed, Dec  1, 2010 at 17:10:42 +0200, Alexander Shishkin wrote:
> 
> > For headers that get exported to userland and make use of u32 style
> > type names, it is advised to include linux/types.h.
> > 
> > This fixes 5 headers_check warnings.
> > 
> How many times does this need to be NAKed?

Until someone gets a clue and puts comments in there explaining this?
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH 1/5] Backlight: Add backlight type

2010-11-19 Thread Andrew Morton
On Fri, 19 Nov 2010 20:25:59 +
Matthew Garrett  wrote:

> On Fri, Nov 19, 2010 at 12:05:23PM -0800, Andrew Morton wrote:
> > On Fri, 19 Nov 2010 10:53:52 -0500
> > Matthew Garrett  wrote:
> > 
> > > There may be multiple ways of controlling the backlight on a given 
> > > machine.
> > > Allow drivers to expose the type of interface they are providing, making
> > > it possible for userspace to make appropriate policy decisions.
> > > 
> > > ...
> > >
> > >  60 files changed, 102 insertions(+), 0 deletions(-)
> > 
> > This patch has a pretty short half-life.
> 
> Well, ideally it would have landed in the backlight tree when I sent it 
> months ago. Then we'd have the opportunity to ensure that everything was 
> fixed up before it went in in the merge window.

Richard got distracted.  At present I'm grabbing the leds and backlight
patches and Richard is reviewing them as they fly past.

I don't see there's much point in me merging this patch series so if it
survives review, I'd suggest that you put it into an mjg tree and
thence into linux-next and mainline?

> > > @@ -62,6 +68,8 @@ struct backlight_properties {
> > >   /* FB Blanking active? (values as for power) */
> > >   /* Due to be removed, please use (state & BL_CORE_FBBLANK) */
> > >   int fb_blank;
> > > + /* Backlight type */
> > > + enum backlight_type type;
> > >   /* Flags used to signal drivers of state changes */
> > >   /* Upper 4 bits are reserved for driver internal use */
> > >   unsigned int state;
> > 
> > And if/when the half-life expires, we'll have drivers in-tree which
> > forget to set backlight_properties.type.  I haven't checked, but if
> > we're lucky they will default to "0".
> 
> Depends entirely on whether they kzalloc the structure or not before
> calling backlight_device_register(). 

Well.  Even if it's uninitialised, the chances of the value being 1, 2
or 3 for all users are pretty small, so we'll still get to hear about
it if the runtime check is appropriately implemented.

> > What will be the runtime effects upon such unconverted drivers? 
> > Ideally we'd like them to continue to work OK, and to emit a runtime
> > warning.  In which case you'll need BACKLIGHT_RAW=1 so the unconverted
> > driver can be detected, warned about and fixed up by the core code.
> 
> The worst case I can think of is that we walk off the array - I guess 
> there's an argument for sanity checking that in backlight_show_type().

OK, well please have a think about it, see what you can do to handle
unconverted (and possibly out-of-tree) drivers in a friendly fashion.

___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH 1/5] Backlight: Add backlight type

2010-11-19 Thread Andrew Morton
On Fri, 19 Nov 2010 10:53:52 -0500
Matthew Garrett  wrote:

> There may be multiple ways of controlling the backlight on a given machine.
> Allow drivers to expose the type of interface they are providing, making
> it possible for userspace to make appropriate policy decisions.
> 
> ...
>
>  60 files changed, 102 insertions(+), 0 deletions(-)

This patch has a pretty short half-life.

>
> ...
>
> --- a/include/linux/backlight.h
> +++ b/include/linux/backlight.h
> @@ -32,6 +32,12 @@ enum backlight_update_reason {
>   BACKLIGHT_UPDATE_SYSFS,
>  };
>  
> +enum backlight_type {
> + BACKLIGHT_RAW,
> + BACKLIGHT_PLATFORM,
> + BACKLIGHT_FIRMWARE,
> +};
> +
>  struct backlight_device;
>  struct fb_info;
>  
> @@ -62,6 +68,8 @@ struct backlight_properties {
>   /* FB Blanking active? (values as for power) */
>   /* Due to be removed, please use (state & BL_CORE_FBBLANK) */
>   int fb_blank;
> + /* Backlight type */
> + enum backlight_type type;
>   /* Flags used to signal drivers of state changes */
>   /* Upper 4 bits are reserved for driver internal use */
>   unsigned int state;

And if/when the half-life expires, we'll have drivers in-tree which
forget to set backlight_properties.type.  I haven't checked, but if
we're lucky they will default to "0".

What will be the runtime effects upon such unconverted drivers? 
Ideally we'd like them to continue to work OK, and to emit a runtime
warning.  In which case you'll need BACKLIGHT_RAW=1 so the unconverted
driver can be detected, warned about and fixed up by the core code.

___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH 2/2] x86 platform driver: intelligent power sharing driver

2010-05-11 Thread Andrew Morton
On Tue, 11 May 2010 11:18:54 -0700 Jesse Barnes  
wrote:

> > > > +#define thm_writeb(off, val) writeb((val), ips->regmap + (off))
> > > > +#define thm_writew(off, val) writew((val), ips->regmap + (off))
> > > > +#define thm_writel(off, val) writel((val), ips->regmap + (off))
> > > 
> > > ick.
> > > 
> > > static inline unsigned short thm_readw(struct ips_driver *ips, unsigned 
> > > offset)
> > > {
> > >   readw(ips->regmap + offset);
> > > }
> > > 
> > > would be nicer.
> > 
> > Yes, it would.
> 
> No, I take that back, it just means more typing.  This idiom of
> expecting a given variable to be declared for the IO routines to work
> is pretty common in drivers,

yeah, but it sucks there, too.

> and saves a bunch of redundant "(ips," everywhere...

It's not redundant - it's C.

grr.
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH] drm/i915: Record error batch buffers using iomem

2010-05-11 Thread Andrew Morton
On Tue, 11 May 2010 19:22:14 +0100 Chris Wilson  
wrote:

> + reloc_offset = src_priv->gtt_offset;
>   for (page = 0; page < page_count; page++) {
> - void *s, *d = kmalloc(PAGE_SIZE, GFP_ATOMIC);
> + void __iomem *s;
> + void *d;
> +
> + d = kmalloc(PAGE_SIZE, GFP_ATOMIC);
>   if (d == NULL)
>   goto unwind;
> - s = kmap_atomic(src_priv->pages[page], KM_USER0);
> - memcpy(d, s, PAGE_SIZE);
> - kunmap_atomic(s, KM_USER0);
> +
> + s = io_mapping_map_atomic_wc(dev_priv->mm.gtt_mapping,
> +  reloc_offset);
> + memcpy_fromio(d, s, PAGE_SIZE);
> + io_mapping_unmap_atomic(s);

As mentioned in the other email, this will still corrupt the KM_USER0
slot, and will generate a debug_kmap_atomic() warning.

___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH 2/2] x86 platform driver: intelligent power sharing driver

2010-05-10 Thread Andrew Morton
On Mon, 10 May 2010 14:26:52 -0700 Jesse Barnes  
wrote:

> Intel Core i3/5 platforms with integrated graphics support both CPU and
> GPU turbo mode.  CPU turbo mode is opportunistic: the CPU will use any
> available power to increase core frequencies if thermal headroom is
> available.  The GPU side is more manual however; the graphics driver
> must monitor GPU power and temperature and coordinate with a core
> thermal driver to take advantage of available thermal and power headroom
> in the package.
> 
> The intelligent power sharing (IPS) driver is intended to coordinate
> this activity by monitoring MCP (multi-chip package) temperature and
> power, allowing the CPU and/or GPU to increase their power consumption,
> and thus performance, when possible.  The goal is to maximize
> performance within a given platform's TDP (thermal design point).
> 
>
> ...
>
> +#define thm_readb(off) readb(ips->regmap + (off))
> +#define thm_readw(off) readw(ips->regmap + (off))
> +#define thm_readl(off) readl(ips->regmap + (off))
> +#define thm_readq(off) readq(ips->regmap + (off))
> +
> +#define thm_writeb(off, val) writeb((val), ips->regmap + (off))
> +#define thm_writew(off, val) writew((val), ips->regmap + (off))
> +#define thm_writel(off, val) writel((val), ips->regmap + (off))

ick.

static inline unsigned short thm_readw(struct ips_driver *ips, unsigned offset)
{
readw(ips->regmap + offset);
}

would be nicer.

>
> ...
>
> +static void ips_enable_cpu_turbo(struct ips_driver *ips)
> +{
> + /* Already on, no need to mess with MSRs */
> + if (ips->__cpu_turbo_on)
> + return;
> +
> + on_each_cpu(do_enable_cpu_turbo, ips, 1);
> +
> + ips->__cpu_turbo_on = true;
> +}

How does the code ensure that a hot-added CPU comes up in the correct
turbo state?

>
> ...
>
> +static bool cpu_exceeded(struct ips_driver *ips, int cpu)
> +{
> + unsigned long flags;
> + int avg;
> + bool ret = false;
> +
> + spin_lock_irqsave(&ips->turbo_status_lock, flags);
> + avg = cpu ? ips->ctv2_avg_temp : ips->ctv1_avg_temp;
> + if (avg > (ips->limits->core_temp_limit * 100))
> + ret = true;
> + if (ips->cpu_avg_power > ips->core_power_limit)
> + ret = true;
> + spin_unlock_irqrestore(&ips->turbo_status_lock, flags);
> +
> + if (ret)
> + printk(KERN_CRIT "CPU power or thermal limit exceeded\n");
> +
> + return ret;
> +}

afacit these messages might come out at one-per-five-seconds max?

I bet the driver blows up and someone's logs get spammed ;)

>
> ...
>
> +static int ips_adjust(void *data)
> +{
> + struct ips_driver *ips = data;
> + unsigned long flags;
> +
> + dev_dbg(&ips->dev->dev, "starting ips-adjust thread\n");
> +
> + /*
> +  * Adjust CPU and GPU clamps every 5s if needed.  Doing it more
> +  * often isn't recommended due to ME interaction.
> +  */
> + do {
> + bool cpu_busy = ips_cpu_busy(ips);
> + bool gpu_busy = ips_gpu_busy(ips);
> +
> + spin_lock_irqsave(&ips->turbo_status_lock, flags);
> + if (ips->poll_turbo_status)
> + update_turbo_limits(ips);
> + spin_unlock_irqrestore(&ips->turbo_status_lock, flags);
> +
> + /* Update turbo status if necessary */
> + if (ips->cpu_turbo_enabled)
> + ips_enable_cpu_turbo(ips);
> + else
> + ips_disable_cpu_turbo(ips);
> +
> + if (ips->gpu_turbo_enabled)
> + ips_enable_gpu_turbo(ips);
> + else
> + ips_disable_gpu_turbo(ips);
> +
> + /* We're outside our comfort zone, crank them down */
> + if (!mcp_exceeded(ips)) {
> + ips_cpu_lower(ips);
> + ips_gpu_lower(ips);
> + goto sleep;
> + }
> +
> + if (!cpu_exceeded(ips, 0) && cpu_busy)
> + ips_cpu_raise(ips);
> + else
> + ips_cpu_lower(ips);
> +
> + if (!mch_exceeded(ips) && gpu_busy)
> + ips_gpu_raise(ips);
> + else
> + ips_gpu_lower(ips);
> +
> +sleep:
> + 
> schedule_timeout_interruptible(msecs_to_jiffies(IPS_ADJUST_PERIOD));
> + } while(!kthread_should_stop());

please run checkpatch.

> + dev_dbg(&ips->dev->dev, "ips-adjust thread stopped\n");
> +
> + return 0;
> +}

Did we really need a new kernel thread for this?  Can't use
schedule_delayed_work() or something?  Maybe slow-work, or one of the
other just-like-workqueues-only-different things we seem to keep
adding?

> +/*
> + * Helpers for reading out temp/power values and calculating their
> + * averages for the decision making and monitoring functions.
> + */
> +
> +static u16 calc_avg_temp(struct ips_driver *ips, u16 *array)
> +{
> + u64 total = 0;
> + int i;
> + u16 avg;
> +
> +