[PATCH] documentation: Update ide-cd documentation to reflect CONFIG_BLK_DEV_HD_IDE removal
CONFIG_BLK_DEV_HD_IDE was removed in commit 80aa31cb460d ("ide: remove CONFIG_BLK_DEV_HD_IDE config option (take 2)") but the ide-cd documentation was not updated and still asks users to disable it, which is misleading and sends readers on a fruitless search.

Signed-off-by: Finn Thain

diff --git a/Documentation/cdrom/ide-cd b/Documentation/cdrom/ide-cd
index f4dc9de2694e..a5f2a7f1ff46 100644
--- a/Documentation/cdrom/ide-cd
+++ b/Documentation/cdrom/ide-cd
@@ -55,13 +55,9 @@ This driver provides the following features:
    (to compile support as a module which can be loaded and unloaded)
    to the options:

-   Enhanced IDE/MFM/RLL disk/cdrom/tape/floppy support
+   ATA/ATAPI/MFM/RLL support
    Include IDE/ATAPI CDROM support

-   and `no' to
-
-   Use old disk-only driver on primary interface
-
    Depending on what type of IDE interface you have, you may need to
    specify additional configuration options.  See Documentation/ide/ide.txt.
Re: [v11 3/6] mm, oom: cgroup-aware OOM killer
On Wed, Oct 11, 2017 at 01:21:47PM -0700, David Rientjes wrote: > On Tue, 10 Oct 2017, Roman Gushchin wrote: > > > > We don't need a better approximation, we need a fair comparison. The > > > heuristic that this patchset is implementing is based on the usage of > > > individual mem cgroups. For the root mem cgroup to be considered > > > eligible, we need to understand its usage. That usage is _not_ what is > > > implemented by this patchset, which is the largest rss of a single > > > attached process. This, in fact, is not an "approximation" at all. In > > > the example of 100 processes attached with 80MB rss each, the usage of > > > the root mem cgroup is _not_ 80MB. > > > > It's hard to imagine a "healthy" setup with 100 processes in the root > > memory cgroup, and even if we kill 1 process we will still have the > > remaining processes. I agree with you at some point, but it's not > > a real world example. > > > > It's an example that illustrates the problem with the unfair comparison > between the root mem cgroup and leaf mem cgroups. It's unfair to compare > [largest rss of a single process attached to a cgroup] to > [anon + unevictable + unreclaimable slab usage of a cgroup]. It's not an > approximation, as previously stated: the usage of the root mem cgroup is > not 80MB if there are 100 such processes attached to the root mem cgroup, > it's off by orders of magnitude. > > For the root mem cgroup to be treated equally as a leaf mem cgroup as this > patchset proposes, it must have a fair comparison. That can be done by > accounting memory to the root mem cgroup in the same way it is to leaf mem > cgroups. > > But let's move the discussion forward to fix it. To avoid necessarily > accounting memory to the root mem cgroup, have we considered if it is even > necessary to address the root mem cgroup? 
For the users who opt-in to > this heuristic, would it be possible to discount the root mem cgroup from > the heuristic entirely so that oom kills originate from leaf mem cgroups? > Or, perhaps better, oom kill from non-memory.oom_group cgroups only if > the victim rss is greater than an eligible victim rss attached to the root > mem cgroup? David, I'm not claiming to have implemented the best possible accounting for the root memory cgroup, and I'm sure there is a place for further enhancement. But if it's not leading to some obviously stupid victim selection (like ignoring a leaking task which consumes most of the memory), I don't see why it should be treated as a blocker for the whole patchset. I also doubt that any of us has these examples, and the best way to get them is to get some real usage feedback. Ignoring oom_score_adj, subtracting leaf usage sum from system usage etc.: these are all perfectly good ideas which can be implemented on top of this patchset. > > > > For these reasons: unfair comparison of root mem cgroup usage to bias > > > against that mem cgroup from oom kill in system oom conditions, the > > > ability of users to completely evade the oom killer by attaching all > > > processes to child cgroups either purposefully or unpurposefully, and the > > > inability of userspace to effectively control oom victim selection: > > > > > > Nacked-by: David Rientjes> > > > So, if we'll sum the oom_score of tasks belonging to the root memory cgroup, > > will it fix the problem? > > > > It might have some drawbacks as well (especially around oom_score_adj), > > but it's doable, if we'll ignore tasks which are not owners of their mm > > struct. > > > > You would be required to discount oom_score_adj because the heuristic > doesn't account for oom_score_adj when comparing the anon + unevictable + > unreclaimable slab of leaf mem cgroups. 
This wouldn't result in the > correct victim selection in real-world scenarios where processes attached > to the root mem cgroup are vital to the system and not part of any user > job, i.e. they are important system daemons and the "activity manager" > responsible for orchestrating the cgroup hierarchy. > > It's also still unfair because it now compares > [sum of rss of processes attached to a cgroup] to > [anon + unevictable + unreclaimable slab usage of a cgroup]. RSS isn't > going to be a solution, regardless of whether it's one process or all processes, if > it's being compared to more types of memory in leaf cgroups. > > If we really don't want root mem cgroup accounting so this is a fair > comparison, I think the heuristic needs to special case the root mem > cgroup either by discounting root oom kills if there are eligible oom > kills from leaf cgroups (the user would be opting-in to this behavior) or > comparing the badness of a victim from a leaf cgroup to the badness of a > victim from the root cgroup when deciding which to kill and allow the user > to protect root mem cgroup processes with oom_score_adj. > > That aside, all of this has only addressed one of the three concerns with
Re: [PATCH 04/24] media: v4l2-mediabus: convert flags to enums and document them
On Mon 2017-10-09 07:19:10, Mauro Carvalho Chehab wrote:
> There is a mess with media bus flags: there are two sets of
> flags, one used by parallel and ITU-R BT.656 outputs,
> and another one for CSI2.
>
> Depending on the type, the same bit has different meanings.
>
> @@ -86,11 +125,22 @@ enum v4l2_mbus_type {
>  /**
>   * struct v4l2_mbus_config - media bus configuration
>   * @type: in: interface type
> - * @flags: in / out: configuration flags, depending on @type
> + * @pb_flags: in / out: configuration flags, if @type is
> + *	%V4L2_MBUS_PARALLEL or %V4L2_MBUS_BT656.
> + * @csi2_flags: in / out: configuration flags, if @type is
> + *	%V4L2_MBUS_CSI2.
> + * @flag: access flags, no matter the @type.
> + *	Used just to avoid needing to rewrite the logic inside
> + *	soc_camera and pxa_camera drivers. Don't use on newer
> + *	drivers!
>   */
>  struct v4l2_mbus_config {
>  	enum v4l2_mbus_type type;
> -	unsigned int flags;
> +	union {
> +		enum v4l2_mbus_parallel_and_bt656_flags pb_flags;
> +		enum v4l2_mbus_csi2_flags csi2_flags;
> +		unsigned int flag;
> +	};
>  };
>
>  static inline void v4l2_fill_pix_format(struct v4l2_pix_format *pix_fmt,

The flags->flag conversion is quite subtle, and "flag" is confusing because there is more than one inside. What about something like __legacy_flags?

	Pavel
-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html
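The benefit of splitting one ambiguous flags word into per-bus-type enums can be sketched outside the kernel. The following is a hypothetical Python model, not the kernel code: the flag names, bus-type strings, and the type check are all invented to show why giving each bus type its own flag type catches the "same bit, different meanings" class of bug that the patch describes.

```python
from enum import IntFlag, auto

# Invented stand-ins for the per-type flag enums from the patch; the real
# bit values in v4l2-mediabus.h differ.
class ParallelBt656Flags(IntFlag):
    MASTER = auto()
    SLAVE = auto()
    HSYNC_ACTIVE_HIGH = auto()

class Csi2Flags(IntFlag):
    NONCONTINUOUS_CLOCK = auto()
    CONTINUOUS_CLOCK = auto()

class MbusConfig:
    """Tagged-union analogue of struct v4l2_mbus_config after the patch:
    the type field selects which flag enum is meaningful, and mixing them
    up is rejected instead of silently reinterpreting the same bits."""
    _expected = {"parallel": ParallelBt656Flags,
                 "bt656": ParallelBt656Flags,
                 "csi2": Csi2Flags}

    def __init__(self, bus_type, flags):
        expected = self._expected[bus_type]
        if not isinstance(flags, expected):
            raise TypeError(f"{bus_type} config requires {expected.__name__}")
        self.bus_type, self.flags = bus_type, flags

# A CSI2 config built with CSI2 flags is fine...
cfg = MbusConfig("csi2", Csi2Flags.CONTINUOUS_CLOCK)
```

With a plain `unsigned int flags` both enums collapse to integers and the mix-up above would be accepted silently, which is exactly the mess the patch is untangling.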
Re: [v11 3/6] mm, oom: cgroup-aware OOM killer
On Wed, 11 Oct 2017, Michal Hocko wrote: > > For these reasons: unfair comparison of root mem cgroup usage to bias > > against that mem cgroup from oom kill in system oom conditions, the > > ability of users to completely evade the oom killer by attaching all > > processes to child cgroups either purposefully or unpurposefully, and the > > inability of userspace to effectively control oom victim selection: > > > > Nacked-by: David Rientjes> > I consider this NACK rather dubious. Evading the heuristic as you > describe requires root privileges in default configuration because > normal users are not allowed to create subtrees. If you > really want to delegate subtree to an untrusted entity then you do not > have to opt-in for this oom strategy. We can work on an additional means > which would allow to cover those as well (e.g. priority based one which > is requested for other usecases). > You're missing the point that the user is trusted and it may be doing something to circumvent oom kill unknowingly. With a single unified hierarchy, the user is forced to attach its processes to subcontainers if it wants to constrain resources with other controllers. Doing so ends up completely avoiding oom kill because of this implementation detail. It has nothing to do with trust and the admin who is opting-in will not know a user has circumvented oom kill purely because it constrains its processes with controllers other than the memory controller. > A similar argument applies to the root memcg evaluation. While the > proposed behavior is not optimal it would work for general usecase > described here where the root memcg doesn't really run any large number > of tasks. If somebody who explicitly opts-in for the new strategy and it > doesn't work well for that usecase we can enhance the behavior. That > alone is not a reason to nack the whole thing. 
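The evasion scenario being argued about is easy to model. Below is a toy sketch (all numbers, names, and the scoring rule are invented for illustration, not taken from the patchset): if a cgroup's score counts only processes attached directly to it, splitting one job into per-process child cgroups shrinks every leaf score, while a hierarchical subtree sum is unaffected by the split.

```python
# Leaf-only scoring, as objected to in the thread: each cgroup is scored
# by the badness of the processes attached directly to it.
def leaf_scores(cgroups):
    """cgroups maps a cgroup path to a list of per-process badness values."""
    return {name: sum(procs) for name, procs in cgroups.items()}

# Hierarchical alternative: charge usage to every ancestor, so a subtree's
# score is the sum over the whole subtree.
def subtree_score(cgroups, prefix):
    return sum(sum(procs) for name, procs in cgroups.items()
               if name == prefix or name.startswith(prefix + "/"))

# One job with ten 100MB processes in a single cgroup...
flat = {"job": [100] * 10}
# ...versus the same job after the equivalent of
# `for i in $(cat cgroup.procs); do mkdir $i; echo $i > $i/cgroup.procs; done`
split = {f"job/{i}": [100] for i in range(10)}

assert max(leaf_scores(flat).values()) == 1000
assert max(leaf_scores(split).values()) == 100   # 10x smaller oom target
assert subtree_score(split, "job") == 1000       # subtree sum is invariant
```

The last assertion is the point of the "account usage up the hierarchy" suggestion: under a subtree sum, reorganizing processes into descendants cannot change how the job compares to its peers.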
> > I find it really disturbing that you keep nacking this approach just > because it doesn't suite your specific usecase while it doesn't break > it. Moreover it has been stated several times already that future > improvements are possible and cover what you have described already. This has nothing to do with my specific usecase. 
Re: [v11 3/6] mm, oom: cgroup-aware OOM killer
On Tue, 10 Oct 2017, Roman Gushchin wrote: > > We don't need a better approximation, we need a fair comparison. The > > heuristic that this patchset is implementing is based on the usage of > > individual mem cgroups. For the root mem cgroup to be considered > > eligible, we need to understand its usage. That usage is _not_ what is > > implemented by this patchset, which is the largest rss of a single > > attached process. This, in fact, is not an "approximation" at all. In > > the example of 100 processes attached with 80MB rss each, the usage of > > the root mem cgroup is _not_ 80MB. > > It's hard to imagine a "healthy" setup with 100 processes in the root > memory cgroup, and even if we kill 1 process we will still have the > remaining processes. I agree with you at some point, but it's not > a real world example. > It's an example that illustrates the problem with the unfair comparison between the root mem cgroup and leaf mem cgroups. It's unfair to compare [largest rss of a single process attached to a cgroup] to [anon + unevictable + unreclaimable slab usage of a cgroup]. It's not an approximation, as previously stated: the usage of the root mem cgroup is not 80MB if there are 100 such processes attached to the root mem cgroup, it's off by orders of magnitude. For the root mem cgroup to be treated equally as a leaf mem cgroup as this patchset proposes, it must have a fair comparison. That can be done by accounting memory to the root mem cgroup in the same way it is to leaf mem cgroups. But let's move the discussion forward to fix it. To avoid necessarily accounting memory to the root mem cgroup, have we considered if it is even necessary to address the root mem cgroup? For the users who opt-in to this heuristic, would it be possible to discount the root mem cgroup from the heuristic entirely so that oom kills originate from leaf mem cgroups? 
Or, perhaps better, oom kill from non-memory.oom_group cgroups only if the victim rss is greater than an eligible victim rss attached to the root mem cgroup? > > For these reasons: unfair comparison of root mem cgroup usage to bias > > against that mem cgroup from oom kill in system oom conditions, the > > ability of users to completely evade the oom killer by attaching all > > processes to child cgroups either purposefully or unpurposefully, and the > > inability of userspace to effectively control oom victim selection: > > > > Nacked-by: David Rientjes> > So, if we'll sum the oom_score of tasks belonging to the root memory cgroup, > will it fix the problem? > > It might have some drawbacks as well (especially around oom_score_adj), > but it's doable, if we'll ignore tasks which are not owners of their mm > struct. > You would be required to discount oom_score_adj because the heuristic doesn't account for oom_score_adj when comparing the anon + unevictable + unreclaimable slab of leaf mem cgroups. This wouldn't result in the correct victim selection in real-world scenarios where processes attached to the root mem cgroup are vital to the system and not part of any user job, i.e. they are important system daemons and the "activity manager" responsible for orchestrating the cgroup hierarchy. It's also still unfair because it now compares [sum of rss of processes attached to a cgroup] to [anon + unevictable + unreclaimable slab usage of a cgroup]. RSS isn't going to be a solution, regardless of whether it's one process or all processes, if it's being compared to more types of memory in leaf cgroups. 
If we really don't want root mem cgroup accounting so this is a fair comparison, I think the heuristic needs to special case the root mem cgroup either by discounting root oom kills if there are eligible oom kills from leaf cgroups (the user would be opting-in to this behavior) or comparing the badness of a victim from a leaf cgroup to the badness of a victim from the root cgroup when deciding which to kill and allow the user to protect root mem cgroup processes with oom_score_adj. That aside, all of this has only addressed one of the three concerns with the patchset. I believe the solution to avoid allowing users to circumvent oom kill is to account usage up the hierarchy as you have done in the past. Cgroup hierarchies can be managed by the user so they can create their own subcontainers, this is nothing new, and I would hope that you wouldn't limit your feature to only a very specific set of usecases. That may be your solution for the root mem cgroup itself: if the hierarchical usage of all top-level mem cgroups is known, it's possible to find the root mem cgroup usage by subtraction, you are using stats that are global vmstats in your heuristic. Accounting usage up the hierarchy avoids the first two concerns with the patchset. It allows you to implicitly understand the usage of the root mem cgroup itself, and does not allow users to circumvent oom kill by creating
Re: [PATCH v3 00/17] kernel-doc: add supported to document nested structs/
On Mon, 9 Oct 2017 10:19:55 -0300, Mauro Carvalho Chehab wrote: > > > I really don't want to add that much noise to the docs build; I think it > > > will defeat any hope of cleaning up the warnings we already have. I > > > wonder if we could suppress warnings about nested structs by default, and > > > add a flag or envar to turn them back on for those who want them? > > > > This is what I vote for: +1 > > > > > In truth, now that I look, the real problem is that the warnings are > > > repeated many times - as in, like 100 times: > > > > > >> ./include/net/cfg80211.h:4115: warning: Function parameter or member > > >> 'wext.ibss' not described in 'wireless_dev' > > >> ./include/net/cfg80211.h:4115: warning: Function parameter or member > > >> 'wext.ibss' not described in 'wireless_dev' > > > <107 instances deleted...> > > >> ./include/net/cfg80211.h:4115: warning: Function parameter or member > > >> 'wext.ibss' not described in 'wireless_dev' > > > > > > That's not something that used to happen, and is certainly not helpful. > > > Figuring that out would help a lot. Can I get you to take a look at > > > what's going on here? > > > > Hi Jon, if you grep for > > > > .. kernel-doc:: include/net/cfg80211.h > > > > e.g. by: grep cfg80211.h Documentation/driver-api/80211/cfg80211.rst | wc > > -l > > you will see, that this directive is used 110 times in one reST file. If you > > have 5 warnings about the kernel-doc markup in include/net/cfg80211.h you > > will > > get 550 warnings about kernel-doc markup for just one source code file. > > > > This is because the kernel-doc parser is an external process which is called > > (in this example) 110 times for one reST file and one source code file. If > > include/net/cfg80211.h is referred in other reST files, we accordingly get > > more and more warnings .. that's really noise. 
> > I guess the solution here is simple: if any output selection argument > is passed (-export, -internal, -function, -nofunction), only report > warnings for things that match the output criteria. > > Not sure how easy is to implement that. I'll take a look. It is actually very easy to suppress those warnings. See the enclosed proof of concept patch. Please notice that this patch is likely incomplete: all it does is that, if --function or --nofunction is used, it won't print any warnings for structs or enums. I didn't check yet whether this makes it too silent. If we're going to follow this way, IMHO, the best would be to define a warning function and move the needed checks for $output_selection and for $function into it, so that it does the right thing, printing warnings only for the stuff that will be output. Jon, please let me know if you prefer to go through this way or if, instead, you opt to replace the kernel-doc perl script by a Sphinx module. If you decide to keep the perl script for now, I can work on an improved version of this PoC. Regards, Mauro

scripts: kernel-doc: shut up enum/struct warnings if parsing only functions

There are two command line parameters that restrict the kernel-doc output to output only functions. If this is used, it doesn't make any sense to produce warnings about enums and structs. So, shut up such warnings.

Signed-off-by: Mauro Carvalho Chehab

diff --git a/scripts/kernel-doc b/scripts/kernel-doc
index 049044d95c0e..b4eebea6a8d6 100755
--- a/scripts/kernel-doc
+++ b/scripts/kernel-doc
@@ -1138,16 +1138,20 @@ sub dump_enum($$) {
 			push @parameterlist, $arg;
 			if (!$parameterdescs{$arg}) {
 				$parameterdescs{$arg} = $undescribed;
-				print STDERR "${file}:$.: warning: Enum value '$arg' ".
-					"not described in enum '$declaration_name'\n";
+				if ($output_selection != OUTPUT_INCLUDE &&
+				    $output_selection != OUTPUT_EXCLUDE) {
+					print STDERR "${file}:$.: warning: Enum value '$arg' not described in enum '$declaration_name'\n";
+				}
 			}
 			$_members{$arg} = 1;
 		}

 		while (my ($k, $v) = each %parameterdescs) {
 			if (!exists($_members{$k})) {
-				print STDERR "${file}:$.: warning: Excess enum value " .
-					"'$k' description in '$declaration_name'\n";
+				if ($output_selection != OUTPUT_INCLUDE &&
+				    $output_selection != OUTPUT_EXCLUDE) {
+					print STDERR "${file}:$.: warning: Excess enum value '$k' description in '$declaration_name'\n";
+				}
 			}
 		}

@@ -1353,9 +1357,12 @@ sub push_parameter() {
 	if (!defined $parameterdescs{$param} && $param !~ /^#/) {
 		$parameterdescs{$param} = $undescribed;
-		print STDERR
-		    "${file}:$.: warning: Function parameter or member '$param' not described in '$declaration_name'\n";
-
Re: [PATCH v2 1/2] kbuild: Add a cache for generated variables
Hi, On Tue, Oct 10, 2017 at 8:59 PM, Masahiro Yamada wrote: > Hi Douglas, > > > 2017-10-05 7:37 GMT+09:00 Douglas Anderson : >> While timing a "no-op" build of the kernel (incrementally building the >> kernel even though nothing changed) in the Chrome OS build system I >> found that it was much slower than I expected. >> >> Digging into things a bit, I found that quite a bit of the time was >> spent invoking the C compiler even though we weren't actually building >> anything. Currently in the Chrome OS build system the C compiler is >> called through a number of wrappers (one of which is written in >> python!) and can take upwards of 100 ms to invoke even if we're not >> doing anything difficult, so these invocations of the compiler were >> taking a lot of time. Worse, the invocations couldn't seem to take >> advantage of the multiple cores on my system. >> >> Certainly it seems like we could make the compiler invocations in the >> Chrome OS build system faster, but only to a point. Inherently >> invoking a program as big as a C compiler is a fairly heavy >> operation. Thus even if we can speed up the compiler calls it made sense >> to track down what was happening. >> >> It turned out that all the compiler invocations were coming from >> usages like this in the kernel's Makefile: >> >> KBUILD_CFLAGS += $(call cc-option,-fno-delete-null-pointer-checks,) >> >> Due to the way cc-option and similar statements work the above >> contains an implicit call to the C compiler. ...and due to the fact >> that we're storing the result in KBUILD_CFLAGS, a simply expanded >> variable, the call will happen every time the Makefile is parsed, even >> if there are no users of KBUILD_CFLAGS. >> >> Rather than redoing this computation every time, it makes a lot of >> sense to cache the result of all of the Makefile's compiler calls just >> like we do when we compile a ".c" file to a ".o" file. Conceptually >> this is quite a simple idea. 
...and since the calls to invoke the >> compiler and similar tools are centrally located in the Kbuild.include >> file this doesn't even need to be super invasive. >> >> Implementing the cache in a simple-to-use and efficient way is not >> quite as simple as it first sounds, though. To get maximum speed we >> really want the cache in a format that make can natively understand >> and make doesn't really have an ability to load/parse files. ...but >> make _can_ import other Makefiles, so the solution is to store the >> cache in Makefile format. This requires coming up with a valid/unique >> Makefile variable name for each value to be cached, but that's >> solvable with some cleverness. >> >> After this change, we'll automatically create a ".cache.mk" file that >> will contain our cached variables. We'll load this on each invocation >> of make and will avoid recomputing anything that's already in our >> cache. The cache is stored in a format that it shouldn't need any >> invalidation since anything that might change should affect the "key" >> and any old cached value won't be used. >> >> Signed-off-by: Douglas Anderson > > > I reviewed and tested this patch more closely. > > V2 is almost good, but > I see some problem and things that should be improved. > (including bike-shed) > > > [1] > > If you apply this patch and run "make clean" > on a machine without "sphinx-build" installed, > you will see a mysterious error message like follows: > > > $ make clean > Documentation/Makefile:24: The 'sphinx-build' command was not found. > Make sure you have Sphinx installed and in PATH, or set the > SPHINXBUILD make variable to point to the full path of the > 'sphinx-build' executable. > > Detected OS: Ubuntu 16.04.2 LTS. > Warning: better to also install "dot". > Warning: better to also install "rsvg-convert". > ERROR: please install "virtualenv", otherwise, build won't work. > You should run: > > sudo apt-get install graphviz librsvg2-bin virtualenv > virtualenv sphinx_1.4 > . 
sphinx_1.4/bin/activate > pip install -r Documentation/sphinx/requirements.txt > > Can't build as 2 mandatory dependencies are missing at > ./scripts/sphinx-pre-install line 566. > > > > This comes from the ".DEFAULT" target > when "make clean" descends into Documentation/ directory. > > > You can fix it by adding > > $(make-cache): ; > > to scripts/Kbuild.include > > > This will prevent Make from searching > a target that would generate $(make-cache). > > > (Of course, we can fix Documentation/Makefile > to not use '.DEFAULT', > but canceling $(make-cache) rule is a good thing.) > > > You will need this > https://patchwork.kernel.org/patch/9998651/ > > before adding the target to Kbuild.include Thanks! I will do that. > [2] Please clean up .cache.mk > > Adding .cache.mk pattern around line 1540 will be good. Done. > A few more comments below. > > > >> +--- 3.14 $(LD) support function cache >> + >> +One thing to realize about all the
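The ".cache.mk" design quoted above can be modeled briefly. The sketch below is a toy Python analogue, not the kbuild implementation: the cache file name, key scheme, and probe command are invented. The point it demonstrates is the patch's invalidation-free design — the cache key is derived from the exact tool invocation, so any change to the command (different compiler, different flag) yields a new key and stale entries are simply never looked up.

```python
import hashlib
import json
import os
import subprocess
import tempfile

# Invented cache location; kbuild uses a .cache.mk file in Makefile syntax.
CACHE = os.path.join(tempfile.gettempdir(), "toy-kbuild-cache.json")
if os.path.exists(CACHE):
    os.remove(CACHE)  # start fresh for the demo

def try_run(cmd):
    """Emulate cc-option: an expensive probe of whether a command succeeds."""
    return subprocess.run(cmd, capture_output=True).returncode == 0

def cached_try_run(cmd):
    # Key on the full command line, like kbuild derives a unique make
    # variable name per invocation: a changed command is a changed key.
    key = hashlib.sha1(" ".join(cmd).encode()).hexdigest()
    cache = {}
    if os.path.exists(CACHE):
        with open(CACHE) as f:
            cache = json.load(f)
    if key not in cache:          # cache miss: do the real work once
        cache[key] = try_run(cmd)
        with open(CACHE, "w") as f:
            json.dump(cache, f)
    return cache[key]

assert cached_try_run(["true"]) is True   # probed and stored
assert cached_try_run(["true"]) is True   # served from the cache file
```

The second call never spawns a process, which is exactly how the real patch avoids re-running `$(CC)` on every Makefile parse.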
Re: [v11 3/6] mm, oom: cgroup-aware OOM killer
On Tue, Oct 10, 2017 at 02:13:00PM -0700, David Rientjes wrote: > On Tue, 10 Oct 2017, Roman Gushchin wrote: > > > > This seems to unfairly bias the root mem cgroup depending on process > > > size. > > > It isn't treated fairly as a leaf mem cgroup if they are being compared > > > based on different criteria: the root mem cgroup as (mostly) the largest > > > rss of a single process vs leaf mem cgroups as all anon, unevictable, and > > > unreclaimable slab pages charged to it by all processes. > > > > > > I imagine a configuration where the root mem cgroup has 100 processes > > > attached each with rss of 80MB, compared to a leaf cgroup with 100 > > > processes of 1MB rss each. How does this logic prevent repeatedly oom > > > killing the processes of 1MB rss? > > > > > > In this case, "the root cgroup is treated as a leaf memory cgroup" isn't > > > quite fair, it can simply hide large processes from being selected. > > > Users > > > who configure cgroups in a unified hierarchy for other resource > > > constraints are penalized for this choice even though the mem cgroup with > > > 100 processes of 1MB rss each may not be limited itself. > > > > > > I think for this comparison to be fair, it requires accounting for the > > > root mem cgroup itself or for a different accounting methodology for leaf > > > memory cgroups. > > > > This is basically a workaround, because we don't have necessary stats for > > root > > memory cgroup. If we'll start gathering them at some point, we can change > > this > > and treat root memcg exactly as other leaf cgroups. > > > > I understand why it currently cannot be an apples vs apples comparison > without, as I suggest in the last paragraph, that the same accounting is > done for the root mem cgroup, which is intuitive if it is to be considered > on the same basis as leaf mem cgroups. 
> > I understand for the design to work that leaf mem cgroups and the root mem > cgroup must be compared if processes can be attached to the root mem > cgroup. My point is that it is currently completely unfair as I've > stated: you can have 100 processes attached to the root mem cgroup with > rss of 80MB each and a leaf mem cgroup with 100 processes of 1MB rss each > and the oom killer is going to target the leaf mem cgroup as a result of > this apples vs oranges comparison. > > In case it's not clear, the 100 processes of 80MB rss each are the most > likely contributor to a system-wide oom kill. Unfortunately, the > heuristic introduced by this patchset is broken wrt a fair comparison of > the root mem cgroup usage. > > > Or, if someone will come with an idea of a better approximation, it can be > > implemented as a separate enhancement on top of the initial implementation. > > This is more than welcome. > > > > We don't need a better approximation, we need a fair comparison. The > heuristic that this patchset is implementing is based on the usage of > individual mem cgroups. For the root mem cgroup to be considered > eligible, we need to understand its usage. That usage is _not_ what is > implemented by this patchset, which is the largest rss of a single > attached process. This, in fact, is not an "approximation" at all. In > the example of 100 processes attached with 80MB rss each, the usage of > the root mem cgroup is _not_ 80MB. > > I'll restate that oom killing a process is a last resort for the kernel, > but it also must be able to make a smart decision. Targeting dozens of > 1MB processes instead of 80MB processes because of a shortcoming in this > implementation is not the appropriate selection, it's the opposite of the > correct selection. 
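The apples-vs-oranges comparison in the example above reduces to a few lines of arithmetic. This is a toy sketch using the thread's numbers (100 root processes of 80MB vs. a leaf cgroup with 100 processes of 1MB); the scoring rules are paraphrased from the objection, not taken from the patchset's code.

```python
# Per-process rss, in MB, from the example in the thread.
root_rss = [80] * 100   # attached directly to the root mem cgroup
leaf_rss = [1] * 100    # attached to a leaf mem cgroup

# As objected to: the root is scored by its single largest process,
# while a leaf is scored by its total charged usage.
root_score = max(root_rss)   # 80
leaf_score = sum(leaf_rss)   # 100

victim = "leaf" if leaf_score > root_score else "root"
assert victim == "leaf"      # the small processes get targeted

# A like-for-like comparison (sum vs. sum) flips the outcome:
assert sum(root_rss) > sum(leaf_rss)   # 8000MB vs 100MB
```

The 8000MB-vs-100MB gap on the last line is the "off by orders of magnitude" claim made earlier in the thread.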
> > > > I'll reiterate what I did on the last version of the patchset: > > > considering only leaf memory cgroups easily allows users to defeat this heuristic and > > > bias against all of their memory usage up to the largest process size > > > amongst the set of processes attached. If the user creates N child mem > > > cgroups for their N processes and attaches one process to each child, the > > > _only_ thing this achieved is to defeat your heuristic and prefer other > > > leaf cgroups simply because those other leaf cgroups did not do this. > > > > > > Effectively: > > > > > > for i in $(cat cgroup.procs); do mkdir $i; echo $i > $i/cgroup.procs; done > > > > > > will radically shift the heuristic from a score of all anonymous + > > > unevictable memory for all processes to a score of the largest anonymous + > > > unevictable memory for a single process. There is no downside or > > > ramification for the end user in doing this. When comparing cgroups based > > > on usage, it only makes sense to compare the hierarchical usage of that > > > cgroup so that attaching processes to descendants or splitting the > > > implementation of a process into several smaller individual processes > > > does > > > not allow this heuristic to be defeated. > > > > To all
Re: [PATCH 1/4] iio: adc: ina2xx: Make max expected current configurable
On Wed, 11 Oct 2017 16:42:53 +0200 Maciej Purski wrote: > On 10/09/2017 03:35 PM, Jonathan Cameron wrote: > > On Mon, 9 Oct 2017 10:08:42 +0200 > > Maciej Purski wrote: > > > >> On 10/08/2017 11:47 AM, Jonathan Cameron wrote: > >>> On Wed, 04 Oct 2017 09:11:31 +0200 > >>> Maciej Purski wrote: > >>> > On 10/01/2017 12:29 PM, Jonathan Cameron wrote: > > On Thu, 28 Sep 2017 14:50:12 +0200 > > Maciej Purski wrote: > > > >> Max expected current is used for calculating calibration register > >> value, > >> Current LSB and Power LSB according to equations found in ina > >> datasheet. > >> Max expected current is now implicitly set to default value, > >> which is 2^15, thanks to which Current LSB is equal to 1 mA and > >> Power LSB is equal to 20000 uW or 25000 uW depending on ina model. > >> > >> Make max expected current configurable, just like it's already done > >> with shunt resistance: from device tree, platform_data or later > >> from sysfs. On each max_expected_current change, calculate new values > >> for Current LSB and Power LSB. According to datasheet Current LSB > >> should > >> be calculated by dividing max expected current by 2^15, as values read > >> from device registers are in this case 16-bit integers. Power LSB > >> is calculated by multiplying Current LSB by a factor, which is defined > >> in ina documentation. > > > > One odd bit of casting inline. Also this is new userspace ABI. > > It needs documenting in > > > > Documentation/ABI/testing/sysfs-bus-iio* as appropriate. > > I'm also unclear on one element about this - is it a value used only > > for calibration or are we talking about the actual 'range' of the > > device? > > > > This is used for calibration. The device measures directly only voltage. > However, it has also current and power registers. Their values are > calculated by the device using the calibration register which is > calculated > using max expected current. So I guess that it's not what you mean > by the actual 'range' of the device. 
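The LSB arithmetic under discussion can be written out explicitly. This is a sketch of my reading of the thread, not the driver code: the current LSB is the max expected current divided by 2^15 (the registers are 16-bit), and the power LSB is a fixed per-chip multiple of the current LSB. The 25x factor matches the 25000uW figure quoted above; treat the exact per-model factors as an assumption.

```python
def current_lsb_ma(max_expected_current_ma):
    """Current LSB = max expected current / 2^15 (16-bit registers)."""
    return max_expected_current_ma / 2**15

def power_lsb_uw(current_lsb, factor):
    """Power LSB = factor * Current LSB; current LSB in mA gives the
    power LSB in mW, so multiply by 1000 to express it in uW."""
    return current_lsb * factor * 1000

# The default max expected current of 2^15 mA yields the 1 mA LSB
# mentioned in the thread...
lsb = current_lsb_ma(2**15)
assert lsb == 1.0
# ...and with an assumed per-chip factor of 25, the quoted 25000 uW.
assert power_lsb_uw(lsb, 25) == 25000.0

# Halving the max expected current halves both LSBs, i.e. doubles the
# resolution at the cost of range -- the point of making it configurable.
assert current_lsb_ma(2**14) == 0.5
```

This also shows why the value feeds calibration rather than a raw-value range: it only changes the scale at which the chip's internal current and power registers are expressed.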
> > > The interpretation of this value isn't clear against the more general > > ABI. > > > > In particular, is it in raw units (adc counts) or mA? Docs say > > that but the naming of the attribute doesn't make this clear. > > > > It's in mA. I can make it clear in the attribute name. > > > Also I'm unconvinced this isn't better represented using the > > range specifications available for any IIO attribute on the raw > > value in combination with adjusting the scale value. > > Note not many drivers yet provide ranges on their raw outputs > > but we do have core support for it. I've been meaning to start > > pushing this out into drivers, but been busy since we introduced > > the core support. The dpot-dac driver does use it for example > > > > > I'm not sure if what I'm about to add is similar to what is done > in the mentioned dpot-dac driver. It seems that the callback read_avail > returns information on raw values which can be obtained from the device. > What I need is an adjustable value, which is then used by the device > internally > in order to calculate current with the requested precision. Max expected > current > is also used for calculating the scale value. > Tell me if I'm wrong. Maybe I misunderstood the 'range' concept in IIO > and > your solution fits in here. > > >>> > >>> I think I answered this in the other branch of the thread. > >>> _calibscale is what you want here as it's internal to the device. > >>> > >>> It's not one often used for ADCs but quite a few other types of > >>> device provide some front end analog adjustment (whilst it is digital > >>> here, it is applied within the device - so we don't need to care). > >>> > >>> Jonathan > >> > >> Thank you for your explanation. Calibscale seems suitable for me in this > >> case, > >> but what do you think I should do then with SCALE attribute? Should I get > >> rid of > >> it for current and use only calibscale? 
Or should I use both calibscale and > >> scale attributes and for current they will be the same value? > > > > You'll have to leave it as it is existing ABI. It won't have the same value > > as calibscale. Calibscale is for internal changes that don't effect the raw > > value. scale is to be applied by userspace to the raw value. As I > > understand it > > here the calibscale value should have no effect on scale. > > > Sorry, but now I'm a little bit confused. Which value would you > like me to substitute with calibscale - max_expected_current or current_lsb? As it is an internal
Re: [PATCH 1/4] iio: adc: ina2xx: Make max expected current configurable
On Wed, 11 Oct 2017 16:42:53 +0200, Maciej Purski wrote:
On 10/09/2017 03:35 PM, Jonathan Cameron wrote:
On Mon, 9 Oct 2017 10:08:42 +0200, Maciej Purski wrote:
On 10/08/2017 11:47 AM, Jonathan Cameron wrote:
On Wed, 04 Oct 2017 09:11:31 +0200, Maciej Purski wrote:
On 10/01/2017 12:29 PM, Jonathan Cameron wrote:
On Thu, 28 Sep 2017 14:50:12 +0200, Maciej Purski wrote:

Max expected current is used for calculating calibration register value,
Current LSB and Power LSB according to equations found in the ina datasheet.
Max expected current is now implicitly set to a default value, which is 2^15,
thanks to which Current LSB is equal to 1 mA and Power LSB is equal to
2 uW or 25000 uW depending on ina model.

Make max expected current configurable, just like it's already done with
shunt resistance: from device tree, platform_data or later from sysfs.
On each max_expected_current change, calculate new values for Current LSB
and Power LSB. According to the datasheet, Current LSB should be calculated
by dividing max expected current by 2^15, as values read from device
registers are in this case 16-bit integers. Power LSB is calculated by
multiplying Current LSB by a factor, which is defined in ina documentation.

One odd bit of casting inline. Also this is new userspace ABI. It needs
documenting in Documentation/ABI/testing/sysfs-bus-iio* as appropriate.
I'm also unclear on one element about this - is it a value used only for
calibration or are we talking about the actual 'range' of the device?

This is used for calibration. The device measures directly only voltage.
However, it also has current and power registers. Their values are
calculated by the device using the calibration register, which is
calculated using max expected current. So I guess that it's not what you
mean by the actual 'range' of the device.

The interpretation of this value isn't clear against the more general ABI.
In particular, is it in raw units (adc counts) or mA? Docs say that but
the naming of the attribute doesn't make this clear.

It's in mA. I can make it clear in the attribute name.

Also I'm unconvinced this isn't better represented using the range
specifications available for any IIO attribute on the raw value in
combination with adjusting the scale value. Note not many drivers yet
provide ranges on their raw outputs but we do have core support for it.
I've been meaning to start pushing this out into drivers, but been busy
since we introduced the core support. The dpot-dac driver does use it,
for example.

I'm not sure if what I'm about to add is similar to what is done in the
mentioned dpot-dac driver. It seems that the callback read_avail returns
information on raw values which can be obtained from the device. What I
need is an adjustable value, which is then used by the device internally
in order to calculate current with the requested precision. Max expected
current is also used for calculating the scale value. Tell me if I'm
wrong. Maybe I misunderstood the 'range' concept in IIO and your solution
fits in here.

I think I answered this in the other branch of the thread. _calibscale is
what you want here as it's internal to the device. It's not one often
used for ADCs, but quite a few other types of device provide some front
end analog adjustment (whilst it is digital here, it is applied within
the device - so we don't need to care).

Jonathan

Thank you for your explanation. Calibscale seems suitable for me in this
case, but what do you think I should do then with the SCALE attribute?
Should I get rid of it for current and use only calibscale? Or should I
use both calibscale and scale attributes, and for current they will be
the same value?

You'll have to leave it as it is existing ABI. It won't have the same
value as calibscale. Calibscale is for internal changes that don't affect
the raw value; scale is to be applied by userspace to the raw value. As I
understand it, here the calibscale value should have no effect on scale.

Sorry, but now I'm a little bit confused. Which value would you like me
to substitute with calibscale - max_expected_current or current_lsb?
Both values do have effect on the scale, as scale is basically
current_lsb / 1000, and on the raw value, as the current register is
calculated internally using current_lsb. The content of the current
register is always RAW unless we set current_lsb to 1. See the
documentation: http://www.ti.com/lit/ds/symlink/ina231.pdf page 15.

Thanks,
Maciej

I should mention that the current_lsb value is also used for calculating
power_lsb, as they have a fixed ratio (20 or 25) given in the
documentation.

Then expose it for power as well (appropriately adjusted). In IIO ABI it
is fine to have two elements addressing the same underlying hardware
value - the rule is you change something, you check everything else
hasn't changed.

Jonathan
Re: [v11 3/6] mm, oom: cgroup-aware OOM killer
On Tue 10-10-17 14:13:00, David Rientjes wrote:
[...]
> For these reasons: unfair comparison of root mem cgroup usage to bias
> against that mem cgroup from oom kill in system oom conditions, the
> ability of users to completely evade the oom killer by attaching all
> processes to child cgroups either purposefully or unpurposefully, and the
> inability of userspace to effectively control oom victim selection:
>
> Nacked-by: David Rientjes

I consider this NACK rather dubious. Evading the heuristic as you describe
requires root privileges in the default configuration because normal users
are not allowed to create subtrees. If you really want to delegate a
subtree to an untrusted entity then you do not have to opt in for this oom
strategy. We can work on additional means which would allow to cover those
as well (e.g. a priority based one, which is requested for other usecases).

A similar argument applies to the root memcg evaluation. While the
proposed behavior is not optimal, it would work for the general usecase
described here, where the root memcg doesn't really run any large number
of tasks. If somebody explicitly opts in for the new strategy and it
doesn't work well for that usecase, we can enhance the behavior. That
alone is not a reason to nack the whole thing.

I find it really disturbing that you keep nacking this approach just
because it doesn't suit your specific usecase while it doesn't break it.
Moreover, it has been stated several times already that future
improvements are possible and cover what you have described already.
--
Michal Hocko
SUSE Labs
--
To unsubscribe from this list: send the line "unsubscribe linux-doc" in
the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v7 1/7] media: add glossary.rst with a glossary of terms used at V4L2 spec
On Wed, 11 Oct 2017 01:18:58 +0300, Sakari Ailus wrote:

> > So, if you look, for example, at the chapter 1 name:
> > "common API elements"
> >
> > it implies that every single V4L2 device node supports what's there.
> > But that's not the case, for example, for what's described at
> > Documentation/media/uapi/v4l/querycap.rst (which is part of
> > chapter 1).
>
> Ah. I see what you mean.
>
> > There are a couple of possible alternatives:
> >
> > 1) define V4L2 device nodes excluding /dev/subdev, which is the
> >    current approach;
> >
> > 2) rewrite the entire V4L2 uAPI spec to explicitly state, in each
> >    section, whether or not it applies to sub-devices;
>
> There are exceptions in the common API elements section even now. For
> instance, it mentions that radio devices don't support video streaming
> related IOCTLs.

FYI, when this chapter was written, radio devices did support video
streaming ioctls :-)

As I explained in a previous email, only several years after the V4L2
spec was written, it was decided to nack using a radio device to control
a video stream.

The real issue is that radio sets frequencies in multiples of 62.5 Hz,
while TV sets frequencies in multiples of 62.5 kHz. Due to that, the core
needs to know if the device is in radio or TV mode, in order to handle
VIDIOC_G_FREQUENCY/VIDIOC_S_FREQUENCY. There was code there that tried
to identify the mode, but it was buggy, and there was no safe way to fix
it. So, the "common API elements" chapter was modified to forbid using a
radio device to do video streaming.

Later, a coarse-grained logic was added to the v4l2 core in order to
present a different set of ioctls, based on the V4L2 device type, and the
chapter was modified again. At that time, we didn't go as far as
rewriting chapter 1; we just added an exception "hack" to the chapter
noting that radio devices don't stream.
I don't see a strong reason not to add other specific exceptions there
where pertinent, but if we start having too many exceptions, then it's
time to rewrite it in a way that would make more sense.

> Under common API elements, also the first section (opening
> and closing devices) refers to the interfaces section which, as we know,
> contains sub-device documentation.

Yeah, exceptions could be added, but it means that someone should read
the entire chapter and mention what doesn't apply to subdevs.

> Do you see that something else is needed than telling which common API
> elements aren't relevant for sub-devices? (I didn't find explicit
> information in other sections, by a quick glance at least, telling which
> interfaces they apply to.)

No. I didn't read the entire chapter to see what doesn't apply to
sub-devices, but I'm pretty sure that there are many things. For example,
you can't read/write/mmap/stream a video on a subdev.

> Should we make such a change, this would be an additional argument for
> supporting VIDIOC_QUERYCAP on sub-devices.

Nah. We shouldn't add ioctls with the argument that it would be easier to
document. Also, if we added a subdev QUERYCAP, it would provide a
different set of information than a V4L2 device QUERYCAP.

> Further on, section 8, "Common definitions for V4L2 and V4L2 subdev
> interfaces", should probably be merged with the "common API elements"
> section.

Well, if we go with approach (2), we'll need to shrink the common API
definitions chapter and add another chapter for the "uncommon" features.

As an exercise: try to rewrite just open.rst to exclude subdevs where not
pertinent. Even a single definition like this:

    The main driver always exposes one or more :term:`V4L2 device nodes `
    (see :ref:`v4l2_device_naming`) which are responsible for implementing
    data streaming, if applicable.

will become a lot more complex to explain if we don't have any term to
refer to "all device node types created by the v4l2 core except for
/dev/v4l-subdev* and /dev/mediaXXX".

The "Related devices" and "Shared Data Streams" sections won't apply to
subdevs. Most of what's written in the "Multiple Opens" section also
doesn't apply, as subdevs don't support streaming, nor do they accept the
priority mechanisms via VIDIOC_S_PRIORITY. The notes in the "Functions"
section also don't apply to subdevs.

So, it is probably a lot easier to just create a new "open.rst" for the
subdev API and remove all that doesn't apply - as proposed in approach
(3) (see below) - keeping the V4L2 one as-is.

As a matter of fact, most of the things explained in the V4L2 part of the
spec don't apply to subdevs. So, it is easier to just tell what applies
there (controls, events, format negotiation, crop/selection) than the
opposite. The dev-subdev.rst does a good job describing it. If we move it
from Documentation/media/uapi/v4l to Documentation/media/uapi/subdev, all
we need to do is to add chapters for the syscalls it supports:
open/close/ioctl, which can be a simplified version of what is there at
v4l, removing all things that don't apply to subdevs.
Re: [PATCH v7 5/7] media: open.rst: Adjust some terms to match the glossary
On Wed, 11 Oct 2017 01:41:00 +0300, Sakari Ailus wrote:

> Hi Mauro,
>
> On Tue, Oct 10, 2017 at 08:37:43AM -0300, Mauro Carvalho Chehab wrote:
> > On Fri, 6 Oct 2017 15:48:22 +0300, Sakari Ailus wrote:
> > > Hi Mauro,
> > >
> > > On Wed, Sep 27, 2017 at 07:23:47PM -0300, Mauro Carvalho Chehab wrote:
> > > > As we now have a glossary, some terms used on open.rst
> > > > require adjustments.
> > > >
> > > > Acked-by: Hans Verkuil
> > > > Signed-off-by: Mauro Carvalho Chehab
> > > > ---
> > > >  Documentation/media/uapi/v4l/open.rst | 12 ++--
> > > >  1 file changed, 6 insertions(+), 6 deletions(-)
> > > >
> > > > diff --git a/Documentation/media/uapi/v4l/open.rst b/Documentation/media/uapi/v4l/open.rst
> > > > index f603bc9b49a0..0daf0c122c19 100644
> > > > --- a/Documentation/media/uapi/v4l/open.rst
> > > > +++ b/Documentation/media/uapi/v4l/open.rst
> > > > @@ -143,7 +143,7 @@ Related Devices
> > > >  Devices can support several functions. For example video capturing, VBI
> > > >  capturing and radio support.
> > > >
> > > > -The V4L2 API creates different nodes for each of these functions.
> > > > +The V4L2 API creates different V4L2 device nodes for each of these functions.
> > >
> > > A V4L2 device node is an instance of the V4L2 API. At the very least we
> > > should call them "V4L2 device node types", not device nodes only. This
> > > simply would suggest they're separate.
> > >
> > > s/creates/defines/ ?
> >
> > OK, I added "types" there.
> >
> > It is meant to say create.
> >
> > A device that supports both radio, video and VBI for the same V4L2
> > input will create three device nodes:
> >     /dev/video0
> >     /dev/radio0
> >     /dev/vbi0
> >
> > As all are associated with the same video input, an ioctl sent to one
> > device may affect the other devices too, as they are all associated
> > with the same hardware.
>
> Right. In this case I'd change the sentence. What would you think of this?
> "Each of these functions is available via separate V4L2 device node."

Maybe replacing "is" by "should be" (or "shall be"?) would express the
requirement.

> For it's not the V4L2 API that creates them. I failed to grasp what the
> original sentence meant. Was it about API, or framework, or were the
> devices nodes just separate or unlike as well?

Short answer:

From the uAPI PoV, it doesn't matter if it is the driver or the framework
that creates multiple device nodes. What matters is that, if multiple
types of capture/input are possible, multiple devnodes are created. Also,
if one wants to control a radio function, it should use /dev/radio,
instead of /dev/video.

Long answer:

At the kAPI level, a typical registration sequence, found on almost all
non-webcam drivers, is:

	ret = video_register_device(&dev->vdev, VFL_TYPE_GRABBER,
				    video_nr[dev->devno]);
	if (ret)
		goto error;

	if (vbi_supported(dev)) {
		ret = video_register_device(&dev->vbi_dev, VFL_TYPE_VBI,
					    vbi_nr[dev->devno]);
		if (ret)
			goto error;
	}

	if (radio_supported(dev)) {
		ret = video_register_device(&dev->radio_dev, VFL_TYPE_RADIO,
					    radio_nr[dev->devno]);
		if (ret)
			goto error;
	}

(note: it doesn't make sense to mention the above kAPI code at the uAPI
documentation)

All three devnodes are associated with the same capture or input device.
In the past, it was possible to use /dev/radio to watch a video, or
/dev/video to listen to radio, as they both correspond to the same V4L2
device, and the ioctl arguments used for radio and VBI are different. In
other words, "/dev/radio" and "/dev/vbi" worked like a sort of alias to
the same device.

Such support, however, required the V4L2 core to do some tricks to
identify the type of usage between radio and video modes when handling
tuner ioctls, like VIDIOC_G_FREQUENCY/VIDIOC_S_FREQUENCY, making it
harder to prevent simultaneous usage of the device by a radio and a video
application, and causing some bugs (like returning an FM frequency on TV,
or an UHF frequency on radio).
So, several years ago, the API spec was modified to forbid such usage,
and support for switching between radio/video mode based on the set of
ioctls used was removed.

Yet, the v4l2 core uses a coarse logic there: it only handles differently
the stuff that is required to be different (like frequencies). It won't
check if, for example, a CTRL belongs to radio or video. So, one can
change a control using any device node, no matter if the control belongs
to video or radio.

So, in summary, the goal of the "Related Devices" section at the uAPI is
to mention that, if a V4L2 device supports multiple types,
Re: [PATCH v8 01/20] crypto: change transient busy return code to -EAGAIN
On Sat, Oct 07, 2017 at 10:51:42AM +0300, Gilad Ben-Yossef wrote:
> On Sat, Oct 7, 2017 at 6:05 AM, Herbert Xu wrote:
> > On Tue, Sep 05, 2017 at 03:38:40PM +0300, Gilad Ben-Yossef wrote:
> >>
> >> diff --git a/crypto/algif_hash.c b/crypto/algif_hash.c
> >> index 5e92bd2..3b3c154 100644
> >> --- a/crypto/algif_hash.c
> >> +++ b/crypto/algif_hash.c
> >> @@ -39,6 +39,20 @@ struct algif_hash_tfm {
> >>  	bool has_key;
> >>  };
> >>
> >> +/* Previous versions of crypto_* ops used to return -EBUSY
> >> + * rather than -EAGAIN to indicate being tied up. The in
> >> + * kernel API changed but we don't want to break the user
> >> + * space API. As only the hash user interface exposed this
> >> + * error ever to the user, do the translation here.
> >> + */
> >> +static inline int crypto_user_err(int err)
> >> +{
> >> +	if (err == -EAGAIN)
> >> +		return -EBUSY;
> >> +
> >> +	return err;
> >
> > I don't see the need to carry along this baggage. Does anyone
> > in user-space actually rely on EBUSY?
>
> I am not aware of anyone who does. I was just trying to avoid
> changing the user ABI.
>
> Shall I roll a new revision without this patch?

Yes please. I'd rather not carry this around for eternity unless it was
actually required.

Thanks,
--
Email: Herbert Xu
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt