Re: [PATCH 0/12] PM / sleep: Driver flags for system suspend/resume

2017-10-20 Thread Rafael J. Wysocki
On Friday, October 20, 2017 10:46:07 PM CEST Bjorn Helgaas wrote:
> On Mon, Oct 16, 2017 at 03:12:35AM +0200, Rafael J. Wysocki wrote:
> > Hi All,
> > 
> > Well, this took more time than expected, as I tried to cover everything I had
> > in mind regarding PM flags for drivers.
> 
> For the parts that touch PCI,
> 
> Acked-by: Bjorn Helgaas 

Thank you!

> I doubt there'll be conflicts with changes in my tree, but let me know if
> you trip over any so I can watch for them when merging.

Well, if there are any conflicts, we'll see them in linux-next I guess. :-)

Thanks,
Rafael

--
To unsubscribe from this list: send the line "unsubscribe linux-doc" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 0/12] PM / sleep: Driver flags for system suspend/resume

2017-10-20 Thread Bjorn Helgaas
On Mon, Oct 16, 2017 at 03:12:35AM +0200, Rafael J. Wysocki wrote:
> Hi All,
> 
> Well, this took more time than expected, as I tried to cover everything I had
> in mind regarding PM flags for drivers.

For the parts that touch PCI,

Acked-by: Bjorn Helgaas 

I doubt there'll be conflicts with changes in my tree, but let me know if
you trip over any so I can watch for them when merging.

Bjorn


[PATCH] bug-hunting.rst: Fix an example and a typo in a Sphinx tag

2017-10-20 Thread Christophe JAILLET
- Use the same file name in the explanation and in the example (conex.c vs
sonixj.c)
- Add a missing ':' in a :ref: tag which leads to incorrect Sphinx output
- Add some missing ',' and ';'

Signed-off-by: Christophe JAILLET 
---
 Documentation/admin-guide/bug-hunting.rst | 8 
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/Documentation/admin-guide/bug-hunting.rst b/Documentation/admin-guide/bug-hunting.rst
index 08c4b1308189..f278b289e260 100644
--- a/Documentation/admin-guide/bug-hunting.rst
+++ b/Documentation/admin-guide/bug-hunting.rst
@@ -240,7 +240,7 @@ In order to report it upstream, you should identify the mailing list
 used for the development of the affected code. This can be done by using
 the ``get_maintainer.pl`` script.
 
-For example, if you find a bug at the gspca's conex.c file, you can get
+For example, if you find a bug at the gspca's sonixj.c file, you can get
 their maintainers with::
 
$ ./scripts/get_maintainer.pl -f drivers/media/usb/gspca/sonixj.c
@@ -257,7 +257,7 @@ Please notice that it will point to:
   Tejun and Bhaktipriya (in this specific case, none really envolved on the
   development of this file);
 - The driver maintainer (Hans Verkuil);
-- The subsystem maintainer (Mauro Carvalho Chehab)
+- The subsystem maintainer (Mauro Carvalho Chehab);
 - The driver and/or subsystem mailing list (linux-me...@vger.kernel.org);
 - the Linux Kernel mailing list (linux-ker...@vger.kernel.org).
 
@@ -274,14 +274,14 @@ Fixing the bug
 --
 
 If you know programming, you could help us by not only reporting the bug,
-but also providing us with a solution. After all open source is about
+but also providing us with a solution. After all, open source is about
 sharing what you do and don't you want to be recognised for your genius?
 
 If you decide to take this way, once you have worked out a fix please submit
 it upstream.
 
 Please do read
-ref:`Documentation/process/submitting-patches.rst ` though
+:ref:`Documentation/process/submitting-patches.rst ` though
 to help your code get accepted.
 
 
-- 
2.14.1



[PATCH v1 1/2] thunderbolt: Make pathname to force_power shorter

2017-10-20 Thread Andy Shevchenko
WMI is the bus inside kernel, so, we may access the GUID via
/sys/bus/wmi instead of doing this through /sys/devices path.

Signed-off-by: Andy Shevchenko 
---
 Documentation/admin-guide/thunderbolt.rst | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/Documentation/admin-guide/thunderbolt.rst b/Documentation/admin-guide/thunderbolt.rst
index de50a8561774..9b55952039a6 100644
--- a/Documentation/admin-guide/thunderbolt.rst
+++ b/Documentation/admin-guide/thunderbolt.rst
@@ -230,7 +230,7 @@ If supported by your machine this will be exposed by the WMI bus with
 a sysfs attribute called "force_power".
 
 For example the intel-wmi-thunderbolt driver exposes this attribute in:
-  /sys/devices/platform/PNP0C14:00/wmi_bus/wmi_bus-PNP0C14:00/86CCFD48-205E-4A77-9C48-2021CBEDE341/force_power
+  /sys/bus/wmi/devices/86CCFD48-205E-4A77-9C48-2021CBEDE341/force_power
 
   To force the power to on, write 1 to this attribute file.
   To disable force power, write 0 to this attribute file.
-- 
2.14.2



[PATCH v1 2/2] thunderbolt: Additional step for built-in module to power on

2017-10-20 Thread Andy Shevchenko
The device will not appear until we rescan the bus.

Signed-off-by: Andy Shevchenko 
---
 Documentation/admin-guide/thunderbolt.rst | 5 +
 1 file changed, 5 insertions(+)

diff --git a/Documentation/admin-guide/thunderbolt.rst b/Documentation/admin-guide/thunderbolt.rst
index 9b55952039a6..86987c566d6a 100644
--- a/Documentation/admin-guide/thunderbolt.rst
+++ b/Documentation/admin-guide/thunderbolt.rst
@@ -235,4 +235,9 @@ For example the intel-wmi-thunderbolt driver exposes this attribute in:
   To force the power to on, write 1 to this attribute file.
   To disable force power, write 0 to this attribute file.
 
+In some cases (usually when thunderbolt.ko is built-in) the additional
+step should be performed::
+
+  # echo 1 > /sys/bus/pci/rescan
+
 Note: it's currently not possible to query the force power state of a platform.
-- 
2.14.2



Re: [PATCH 1/3] printk: Introduce per-console loglevel setting

2017-10-20 Thread Calvin Owens

On 10/20/2017 01:05 AM, Petr Mladek wrote:
> On Thu 2017-10-19 16:40:45, Calvin Owens wrote:
> > On 09/28/2017 05:43 PM, Calvin Owens wrote:
> > > Not all consoles are created equal: depending on the actual hardware,
> > > the latency of a printk() call can vary dramatically. The worst examples
> > > are serial consoles, where it can spin for tens of milliseconds banging
> > > the UART to emit a message, which can cause application-level problems
> > > when the kernel spews onto the console.
> >
> > Any thoughts on this series? Happy to resend again, but if there are no
> > objections I'd love to see it merged sooner rather than later :)
> >
> > Happy to resend too, just let me know.
>
> There is no need to resend the patch. It is on my radar and I am
> going to look at it.
>
> Please, be patient, you hit conference, illness, after vacation
> season. We do not want to unnecessarily delay it but it is
> not a trivial change that might be accepted within minutes.

No worries, just wanted to make sure it hadn't been missed :)

Thanks,
Calvin

> Best Regards,
> Petr



[PATCH v9 10/10] sparc64: Add support for ADI (Application Data Integrity)

2017-10-20 Thread Khalid Aziz
ADI is a new feature supported on SPARC M7 and newer processors to allow
hardware to catch rogue accesses to memory. ADI is supported for data
fetches only and not instruction fetches. An app can enable ADI on its
data pages, set version tags on them and use versioned addresses to
access the data pages. Upper bits of the address contain the version
tag. On M7 processors, upper four bits (bits 63-60) contain the version
tag. If a rogue app attempts to access ADI enabled data pages, its
access is blocked and processor generates an exception. Please see
Documentation/sparc/adi.txt for further details.

This patch extends mprotect to enable ADI (TSTATE.mcde), enable/disable
MCD (Memory Corruption Detection) on selected memory ranges, enable
TTE.mcd in PTEs, return ADI parameters to userspace and save/restore ADI
version tags on page swap out/in or migration. ADI is not enabled by
default for any task. A task must explicitly enable ADI on a memory
range and set version tag for ADI to be effective for the task.

Signed-off-by: Khalid Aziz 
Cc: Khalid Aziz 
---
v9:
- Added code to migrate ADI tags to copy_highpage() to
  ensure tags get copied on page migration
- Improved code to detect underflow and overflow when allocating
  tag storage
v8:
- Added note to doc about non-faulting loads not triggering
  ADI tag mismatch and more details on special tag values
  of 0x0 and 0xf (as suggested by Anthony Yznaga)
- Added an IPI on mprotect(...PROT_ADI...) call to set
  TSTATE.MCDE on threads running on other processors and
  restore of TSTATE.MCDE on context switch (suggested by
  David Miller)
- Removed restriction on enabling ADI on read-only memory
  (suggested by Anthony Yznaga)
- Changed kzalloc() for tag storage to use GFP_NOWAIT
- Added code to handle overflow and underflow when allocating
  tag storage, as suggested by Anthony Yznaga
- Replaced sun_m7_patch_1insn_range() with sun4v_patch_1insn_range()
  which is functionally identical (suggested by Anthony Yznaga)
- Added membar after restoring ADI tags in copy_user_highpage(),
  as suggested by David Miller

v7:
- Enhanced arch_validate_prot() to enable ADI only on writable
  addresses backed by physical RAM
- Added support for saving/restoring ADI tags for each ADI
  block size address range on a page on swap in/out
- Added code to copy ADI tags on COW
- Updated values for auxiliary vectors to not conflict with
  values on other architectures to avoid conflict in glibc. glibc
  consolidates all auxiliary vectors into its headers and
  duplicate values in consolidated header are problematic
- Disable same page merging on ADI enabled pages since ADI tags
  may not match on pages with identical data
- Broke the patch up further into smaller patches

v6:
- Eliminated instructions to read and write PSTATE as well as
  MCDPER and PMCDPER on every access to userspace addresses
  by setting PSTATE and PMCDPER correctly upon entry into
  kernel. PSTATE.mcde and PMCDPER are set upon entry into
  kernel when running on an M7 processor. PSTATE.mcde being
  set only affects memory accesses that have TTE.mcd set.
  PMCDPER being set only affects writes to memory addresses
  that have TTE.mcd set. This ensures any faults caused by
  ADI tag mismatch on a write are exposed before kernel returns
  to userspace.

v5:
- Fixed indentation issues and instructions in assembly code
- Removed CONFIG_SPARC64 from mdesc.c
- Changed to maintain state of MCDPER register in thread info
  flags as opposed to in mm context. MCDPER is a per-thread
  state and belongs in thread info flag as opposed to mm context
  which is shared across threads. Added comments to clarify this
  is a lazily maintained state and must be updated on context
  switch and copy_process()
- Updated code to use the new arch_do_swap_page() and
  arch_unmap_one() functions

v4:
- Broke patch up into smaller patches

v3:
- Removed CONFIG_SPARC_ADI
- Replaced prctl commands with mprotect
- Added auxiliary vectors for ADI parameters
- Enabled ADI for swappable pages

v2:
- Fixed a build error

 Documentation/sparc/adi.txt             | 278 +
 arch/sparc/include/asm/mman.h           |  84 -
 arch/sparc/include/asm/mmu_64.h         |  17 ++
 arch/sparc/include/asm/mmu_context_64.h |  50 ++
 arch/sparc/include/asm/page_64.h        |   6 +
 arch/sparc/include/asm/pgtable_64.h     |  46 +
 arch/sparc/include/asm/thread_info_64.h |   2 +-
 arch/sparc/include/asm/trap_block.h     | 

[PATCH v9 00/10] Application Data Integrity feature introduced by SPARC M7

2017-10-20 Thread Khalid Aziz
SPARC M7 processor adds additional metadata for memory address space
that can be used to secure access to regions of memory. This additional
metadata is implemented as a 4-bit tag attached to each cacheline size
block of memory. A task can set a tag on any number of such blocks.
Access to such block is granted only if the virtual address used to
access that block of memory has the tag encoded in the uppermost 4 bits
of VA. Since sparc processor does not implement all 64 bits of VA, top 4
bits are available for ADI tags. Any mismatch between tag encoded in VA
and tag set on the memory block results in a trap. Tags are verified in
the VA presented to the MMU and tags are associated with the physical
page VA maps on to. If a memory page is swapped out and page frame gets
reused for another task, the tags are lost and hence must be saved when
swapping or migrating the page.
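
The tag encoding described above can be illustrated with a small user-space sketch. It only models the arithmetic of placing a 4-bit version tag in bits 63-60 of a virtual address and comparing it with the tag set on a memory block. The macro and function names are hypothetical and are not taken from the patch series; the real check is done by the MMU, not by software.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption for this sketch: the tag occupies bits 63-60, as on SPARC M7. */
#define ADI_TAG_SHIFT 60
#define ADI_TAG_MASK  (UINT64_C(0xF) << ADI_TAG_SHIFT)

/* Return va with its upper four bits replaced by the given 4-bit tag. */
static inline uint64_t adi_set_tag(uint64_t va, unsigned int tag)
{
	assert(tag <= 0xF);
	return (va & ~ADI_TAG_MASK) | ((uint64_t)tag << ADI_TAG_SHIFT);
}

/* Extract the 4-bit tag encoded in the upper bits of va. */
static inline unsigned int adi_get_tag(uint64_t va)
{
	return (unsigned int)((va & ADI_TAG_MASK) >> ADI_TAG_SHIFT);
}

/* Software model of the hardware check: access is granted only on a match. */
static inline int adi_access_ok(uint64_t va, unsigned int mem_tag)
{
	return adi_get_tag(va) == mem_tag;
}
```

In this model, a pointer tagged with 0xA only passes the check against a block whose tag is also 0xA; any other combination corresponds to the trap described above.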

A userspace task enables ADI through mprotect(). This patch series adds
a page protection bit PROT_ADI and a corresponding VMA flag
VM_SPARC_ADI. VM_SPARC_ADI is used to trigger setting TTE.mcd bit in the
sparc pte that enables ADI checking on the corresponding page. MMU
validates the tag embedded in VA for every page that has TTE.mcd bit set
in its pte. After enabling ADI on a memory range, the userspace task can
set ADI version tags using stxa instruction with ASI_MCD_PRIMARY or
ASI_MCD_ST_BLKINIT_PRIMARY ASI.

Once userspace task calls mprotect() with PROT_ADI, kernel takes
following overall steps:

1. Find the VMAs covering the address range passed in to mprotect and
set VM_SPARC_ADI flag. If address range covers a subset of a VMA, the
VMA will be split.

2. When a page is allocated for a VA and the VMA covering this VA has
VM_SPARC_ADI flag set, set the TTE.mcd bit so MMU will check the
version tag.

3. Userspace can now set version tags on the memory it has enabled ADI
on. Userspace accesses ADI enabled memory using a virtual address that
has the version tag embedded in the high bits. MMU validates this
version tag against the actual tag set on the memory. If tag matches,
MMU performs the VA->PA translation and access is granted. If there is a
mismatch, hypervisor sends a data access exception or precise memory
corruption detected exception depending upon whether precise exceptions
are enabled or not (controlled by MCDPER register). Kernel sends
SIGSEGV to the task with appropriate si_code.

4. If a page is being swapped out or migrated, kernel must save any ADI
tags set on the page. Kernel maintains a page worth of tag storage
descriptors. Each descriptor points to a tag storage space and the
address range it covers. If the page being swapped out or migrated has
ADI enabled on it, kernel finds a tag storage descriptor that covers the
address range for the page or allocates a new descriptor if none of the
existing descriptors cover the address range. Kernel saves tags from the
page into the tag storage space descriptor points to.

5. When the page is swapped back in or reinstantiated after migration,
kernel restores the version tags on the new physical page by retrieving
the original tag from tag storage pointed to by a tag storage descriptor
for the virtual address range for new page.

User task can disable ADI by calling mprotect() again on the memory
range with PROT_ADI bit unset. Kernel clears the VM_SPARC_ADI flag in
VMAs, merges adjacent VMAs if necessary, and clears TTE.mcd bit in the
corresponding ptes.

IOMMU does not support ADI checking. Any version tags embedded in the
top bits of VA meant for IOMMU, are cleared and replaced with sign
extension of the first non-version tag bit (bit 59 for SPARC M7) for
IOMMU addresses.
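
The IOMMU address handling described above amounts to a sign extension from bit 59. A minimal sketch of that arithmetic follows, assuming the SPARC M7 layout with the tag in bits 63-60. The names are hypothetical and this is not the code from the patch series.

```c
#include <stdint.h>

/* Assumption for this sketch: tag in bits 63-60, first non-tag bit is 59. */
#define ADI_TAG_MASK   (UINT64_C(0xF) << 60)
#define ADI_SIGN_BIT   (UINT64_C(1) << 59)

/*
 * Clear the version tag and replace it with the sign extension of bit 59,
 * so that the result is a canonical address without any tag bits.
 */
static inline uint64_t adi_strip_tag_for_iommu(uint64_t va)
{
	uint64_t untagged = va & ~ADI_TAG_MASK;

	if (va & ADI_SIGN_BIT)
		untagged |= ADI_TAG_MASK;	/* bit 59 set: fill the top with ones */
	return untagged;
}
```

An address whose bit 59 is clear ends up with zeros in bits 63-60, and one whose bit 59 is set ends up with ones there, regardless of the tag it carried.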

This patch series adds support for this feature in 10 patches:

Patch 1/10
  Tag mismatch on access by a task results in a trap from hypervisor as
  data access exception or a precise memory corruption detected
  exception. As part of handling these exceptions, kernel sends a
  SIGSEGV to user process with special si_code to indicate which fault
  occurred. This patch adds three new si_codes to differentiate between
  various mismatch errors.

Patch 2/10
  When a page is swapped or migrated, metadata associated with the page
  must be saved so it can be restored later. This patch adds a new
  function that saves/restores this metadata when updating pte upon a
  swap/migration.

Patch 3/10
  SPARC M7 processor adds new fields to control registers to support ADI
  feature. It also adds a new exception for precise traps on tag
  mismatch. This patch adds definitions for the new control register
  fields, new ASIs for ADI and an exception handler for the precise trap
  on tag mismatch.

Patch 4/10
  New hypervisor fault types were added by sparc M7 processor to support
  ADI feature. This patch adds code to handle these fault types for data
  access exception handler.

Patch 5/10
  When ADI is in use for a page and a tag mismatch occurs, processor
  raises "Memory corruption Detected" trap. This patch adds 

Re: [PATCH 7/8] Documentation: fix selftests related file refs

2017-10-20 Thread Jerry Hoemann
On Thu, Oct 12, 2017 at 03:24:10PM -0500, Tom Saeger wrote:
> Make refs to selftests files valid including:
>   - watchdog-test.c
>   - dnotify_test.c
> 
> Signed-off-by: Tom Saeger 
> ---
>  Documentation/filesystems/dnotify.txt| 2 +-
>  Documentation/watchdog/hpwdt.txt | 2 +-
>  Documentation/watchdog/pcwd-watchdog.txt | 2 +-
>  3 files changed, 3 insertions(+), 3 deletions(-)
> 
> diff --git a/Documentation/filesystems/dnotify.txt b/Documentation/filesystems/dnotify.txt
> index 6baf88f46859..15156883d321 100644
> --- a/Documentation/filesystems/dnotify.txt
> +++ b/Documentation/filesystems/dnotify.txt
> @@ -62,7 +62,7 @@ disabled, fcntl(fd, F_NOTIFY, ...) will return -EINVAL.
>  
>  Example
>  ---
> -See Documentation/filesystems/dnotify_test.c for an example.
> +See tools/testing/selftests/filesystems/dnotify_test.c for an example.
>  
>  NOTE
>  
> diff --git a/Documentation/watchdog/hpwdt.txt b/Documentation/watchdog/hpwdt.txt
> index 7a9f635d0258..6d866c537127 100644
> --- a/Documentation/watchdog/hpwdt.txt
> +++ b/Documentation/watchdog/hpwdt.txt
> @@ -15,7 +15,7 @@ Last reviewed: 05/20/2016
>  
>   Watchdog functionality is enabled like any other common watchdog driver. That
>   is, an application needs to be started that kicks off the watchdog timer. A
> - basic application exists in the Documentation/watchdog/src directory called
> + basic application exists in tools/testing/selftests/watchdog/ named
>   watchdog-test.c. Simply compile the C file and kick it off. If the system
>   gets into a bad state and hangs, the HPE ProLiant iLO timer register will
>   not be updated in a timely fashion and a hardware system reset (also known as

Taking over hpwdt for Jimmy Vance.

Signed-off-by: Jerry Hoemann 


> diff --git a/Documentation/watchdog/pcwd-watchdog.txt b/Documentation/watchdog/pcwd-watchdog.txt
> index 4f68052395c0..b8e60a441a43 100644
> --- a/Documentation/watchdog/pcwd-watchdog.txt
> +++ b/Documentation/watchdog/pcwd-watchdog.txt
> @@ -25,7 +25,7 @@ Last reviewed: 10/05/2007
>  
>   If you want to write a program to be compatible with the PC Watchdog
>   driver, simply use of modify the watchdog test program:
> - Documentation/watchdog/src/watchdog-test.c
> + tools/testing/selftests/watchdog/watchdog-test.c
>  
>  
>   Other IOCTL functions include:
> -- 
> 2.14.2
> 
> --
> To unsubscribe from this list: send the line "unsubscribe linux-watchdog" in
> the body of a message to majord...@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

-- 

-
Jerry Hoemann  Software Engineer   Hewlett Packard Enterprise
-


Re: rcu kernel-doc issues (4.14-rc1)

2017-10-20 Thread Randy Dunlap
On 10/20/17 09:42, Paul E. McKenney wrote:
> On Wed, Oct 18, 2017 at 10:36:47AM -0600, Jonathan Corbet wrote:
>> On Wed, 18 Oct 2017 09:27:01 -0700
>> "Paul E. McKenney"  wrote:
>>
>>> On a related topic...  Is there anything that test-builds docbook prior
>>> to patches hitting mainline?  My experience indicates that the answer is
>>> "no".
>>
>> The zero-day robot is said to be testing for new doc-build errors, but I
>> haven't actually seen much of that.
> 
> Well, on the good side, Linus did take the fixes.  I will leave it
> to you guys to sort things as needed with Fengguang.  ;-)

Thanks.

-- 
~Randy


Re: rcu kernel-doc issues (4.14-rc1)

2017-10-20 Thread Paul E. McKenney
On Wed, Oct 18, 2017 at 10:36:47AM -0600, Jonathan Corbet wrote:
> On Wed, 18 Oct 2017 09:27:01 -0700
> "Paul E. McKenney"  wrote:
> 
> > On a related topic...  Is there anything that test-builds docbook prior
> > to patches hitting mainline?  My experience indicates that the answer is
> > "no".
> 
> The zero-day robot is said to be testing for new doc-build errors, but I
> haven't actually seen much of that.

Well, on the good side, Linus did take the fixes.  I will leave it
to you guys to sort things as needed with Fengguang.  ;-)

Thanx, Paul



Re: [PATCH v7 1/4] arm64: kvm: route synchronous external abort exceptions to EL2

2017-10-20 Thread gengdongjiu
Hi James,
  Thanks for the mail, and sorry for my late response.


2017-10-19 1:21 GMT+08:00, James Morse :
> Hi Dongjiu Geng,
>
> On 17/10/17 15:14, Dongjiu Geng wrote:
>> ARMv8.2 adds a new bit HCR_EL2.TEA which controls to
>> route synchronous external aborts to EL2, and adds a
>> trap control bit HCR_EL2.TERR which controls to
>> trap all Non-secure EL1&0 error record accesses to EL2.
>
> The bulk of this patch is about trap-and-emulating these ERR registers, but
> that's not reflected in the title:
>> KVM: arm64: Emulate RAS error registers and set HCR_EL2's TERR & TEA
>
>
>> This patch enables the two bits for the guest OS.
>> when an synchronous abort is generated in the guest OS,
>> it will trap to EL3 firmware, EL3 firmware will check the
>
> *buzz*
> This depends on SCR_EL3.EA, which this patch doesn't touch and the normal-world
> can't even know about. This is what your system does, the commit message should
> be about the change to Linux.
>
> (I've said this before)

Thanks for pointing that out. I put this series together in a hurry (you were
waiting for this patch) and forgot to re-check your earlier comments.

>
>
>> HCR_EL2.TEA value to decide to jump to hypervisor or host
>> OS. Enabling HCR_EL2.TERR makes error record access
>> from guest trap to EL2.
>>
>> Add some minimal emulation for RAS-Error-Record registers.
>> In the emulation, ERRIDR_EL1 and ERRSELR_EL1 are zero.
>> Then, the others ERX* registers are RAZ/WI.
>
>> diff --git a/arch/arm64/include/asm/kvm_emulate.h b/arch/arm64/include/asm/kvm_emulate.h
>> index fe39e68..47983db 100644
>> --- a/arch/arm64/include/asm/kvm_emulate.h
>> +++ b/arch/arm64/include/asm/kvm_emulate.h
>> @@ -47,6 +47,13 @@ static inline void vcpu_reset_hcr(struct kvm_vcpu *vcpu)
>>  vcpu->arch.hcr_el2 = HCR_GUEST_FLAGS;
>>  if (is_kernel_in_hyp_mode())
>>  vcpu->arch.hcr_el2 |= HCR_E2H;
>> +if (cpus_have_const_cap(ARM64_HAS_RAS_EXTN)) {
>
> This ARM64_HAS_RAS_EXTN isn't in mainline, nor is it added by your series. I
> know where it comes from, but other reviewers may not. If you have dependencies
> on another series, please call them out in the cover letter.

Yes, thanks for pointing that out.

>
> This is the first cpus_have_const_cap() user in this header file, it
> probably needs:
> #include 

OK

>
>
>> +/* route synchronous external abort exceptions to EL2 */
>> +vcpu->arch.hcr_el2 |= HCR_TEA;
>> +/* trap error record accesses */
>> +vcpu->arch.hcr_el2 |= HCR_TERR;
>> +}
>> +
>>  if (test_bit(KVM_ARM_VCPU_EL1_32BIT, vcpu->arch.features))
>>  vcpu->arch.hcr_el2 &= ~HCR_RW;
>>  }
>> diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
>> index d686300..af55b3bc 100644
>> --- a/arch/arm64/include/asm/kvm_host.h
>> +++ b/arch/arm64/include/asm/kvm_host.h
>> @@ -105,6 +105,8 @@ enum vcpu_sysreg {
>>  TTBR1_EL1,  /* Translation Table Base Register 1 */
>>  TCR_EL1,/* Translation Control Register */
>>  ESR_EL1,/* Exception Syndrome Register */
>
>> +ERRIDR_EL1, /* Error Record ID Register */
>
> Page 39 of [0]: 'ERRIDR_EL1 is a 64-bit read-only ...'.

yes, it is read-only.

>
>
>> +ERRSELR_EL1,/* Error Record Select Register */
>
> We're emulating these as RAZ/WI, do we really need to allocate
> vcpu->arch.ctxt.sys_regs space for them? If we always return 0 for ERRIDR, then
> we don't need to keep ERRSELR as 'the value read back [..] is UNKNOWN'.

https://lists.cs.columbia.edu/pipermail/kvmarm/2017-September/027176.html

" 'If ERRSELR_EL1.SEL is
[>=]  ERRIDR_EL1.NUM' that makes the ERX* registers RAZ/WI"

I kept these entries because I wanted to implement the emulation you
suggested above. To do that, the values need to come from
"vcpu->arch.ctxt.sys_regs" (set to 0) instead of being read from the
system registers, so space is needed to store them.
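
The rule quoted above can be modelled in a few lines of plain C. This is only a sketch of the RAZ/WI (read-as-zero, writes-ignored) behaviour under the stated condition; the structure and function names are hypothetical and do not correspond to KVM code or to this patch.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of the state discussed in this thread. */
struct ras_regs_model {
	uint64_t erridr_num;	/* models ERRIDR_EL1.NUM */
	uint64_t errselr_sel;	/* models ERRSELR_EL1.SEL */
	uint64_t erx_backing;	/* value behind an ERX* register, if any */
};

/* ERX* accesses are RAZ/WI when SEL >= NUM. */
static bool erx_is_raz_wi(const struct ras_regs_model *r)
{
	return r->errselr_sel >= r->erridr_num;
}

/* Read of an ERX* register: zero while the RAZ/WI condition holds. */
static uint64_t erx_read(const struct ras_regs_model *r)
{
	return erx_is_raz_wi(r) ? 0 : r->erx_backing;
}

/* Write to an ERX* register: dropped while the RAZ/WI condition holds. */
static void erx_write(struct ras_regs_model *r, uint64_t val)
{
	if (!erx_is_raz_wi(r))
		r->erx_backing = val;
}
```

With NUM emulated as 0, every SEL value satisfies SEL >= NUM, so in this model all ERX* reads return 0 and all writes are dropped.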

>
> I think we only need space for these once their value needs to be migrated,
> user-space doesn't need to know they exist until then.
>
>
>>  AFSR0_EL1,  /* Auxiliary Fault Status Register 0 */
>>  AFSR1_EL1,  /* Auxiliary Fault Status Register 1 */
>>  FAR_EL1,/* Fault Address Register */
>
>> diff --git a/arch/arm64/kvm/sys_regs.c b/arch/arm64/kvm/sys_regs.c
>> index 2e070d3..a74617b 100644
>> --- a/arch/arm64/kvm/sys_regs.c
>> +++ b/arch/arm64/kvm/sys_regs.c
>> @@ -775,6 +775,36 @@ static bool access_pmovs(struct kvm_vcpu *vcpu, struct sys_reg_params *p,
>>  return true;
>>  }
>>
>> +static bool access_error_reg(struct kvm_vcpu *vcpu, struct sys_reg_params *p,
>> + const struct sys_reg_desc *r)
>> +{
>> +/* accessing ERRIDR_EL1 */
>> +if (r->CRm == 3 && r->Op2 == 0) {
>> +if (p->is_write)
>> +vcpu_sys_reg(vcpu, ERRIDR_EL1) = 0;
>
> As above, this register is read-only.

yes, it is my mistake.

>
>
>> +return trap_raz_wi(vcpu, p, r);
>
> If we can 

Re: [PATCH v2 1/2] mm, thp: introduce dedicated transparent huge page allocation interfaces

2017-10-20 Thread Christopher Lameter
On Fri, 20 Oct 2017, changbin...@intel.com wrote:

> diff --git a/mm/huge_memory.c b/mm/huge_memory.c
> index 269b5df..2a960fc 100644
> --- a/mm/huge_memory.c
> +++ b/mm/huge_memory.c
> @@ -501,6 +501,43 @@ void prep_transhuge_page(struct page *page)
>   set_compound_page_dtor(page, TRANSHUGE_PAGE_DTOR);
>  }
>
> +struct page *alloc_transhuge_page_vma(gfp_t gfp_mask,
> + struct vm_area_struct *vma, unsigned long addr)
> +{
> + struct page *page;
> +
> + page = alloc_pages_vma(gfp_mask | __GFP_COMP, HPAGE_PMD_ORDER,
> +vma, addr, numa_node_id(), true);
> + if (unlikely(!page))
> + return NULL;
> + prep_transhuge_page(page);
> + return page;
> +}
> +
> +struct page *alloc_transhuge_page_nodemask(gfp_t gfp_mask,
> + int preferred_nid, nodemask_t *nmask)
> +{
> + struct page *page;
> +
> + page = __alloc_pages_nodemask(gfp_mask | __GFP_COMP, HPAGE_PMD_ORDER,
> +   preferred_nid, nmask);
> + if (unlikely(!page))
> + return NULL;
> + prep_transhuge_page(page);
> + return page;
> +}
> +
> +struct page *alloc_transhuge_page(gfp_t gfp_mask)
> +{
> + struct page *page;
> +
> + page = alloc_pages(gfp_mask | __GFP_COMP, HPAGE_PMD_ORDER);
> + if (unlikely(!page))
> + return NULL;
> + prep_transhuge_page(page);
> + return page;
> +}
> +

These look pretty similar to the code used for huge pages (aside from the
call to prep_transhuge_page()). Maybe we can have common allocation
primitives for huge pages?
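
One possible reading of that suggestion, as a user-space sketch only: a single helper takes the allocation backend and an optional preparation step, so that callers differ only in what they pass in. All types and names below are hypothetical stand-ins, not kernel APIs, and the toy backends exist only so the sketch is self-contained.

```c
#include <stddef.h>

/* Hypothetical stand-in for struct page. */
struct page_model {
	unsigned int order;
	int prepared;
};

typedef struct page_model *(*alloc_fn)(unsigned int order, void *ctx);
typedef void (*prep_fn)(struct page_model *page);

/* Models the extra step that only transparent huge pages need. */
static void prep_transhuge_model(struct page_model *page)
{
	page->prepared = 1;
}

/* Common primitive: allocate, bail out on failure, then optionally prepare. */
static struct page_model *alloc_huge_common(alloc_fn alloc, void *ctx,
					     unsigned int order, prep_fn prep)
{
	struct page_model *page = alloc(order, ctx);

	if (!page)
		return NULL;
	if (prep)
		prep(page);
	return page;
}

/* Toy backend: "allocates" the page object handed in through ctx. */
static struct page_model *toy_alloc(unsigned int order, void *ctx)
{
	struct page_model *page = ctx;

	page->order = order;
	return page;
}

/* Toy backend that always fails, to exercise the NULL path. */
static struct page_model *toy_alloc_fail(unsigned int order, void *ctx)
{
	(void)order;
	(void)ctx;
	return NULL;
}
```

A caller that does not need the preparation step would pass NULL for the last argument, while the transparent huge page paths would pass the preparation callback.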



Re: [Update][PATCH v2 01/12] PM / core: Add NEVER_SKIP and SMART_PREPARE driver flags

2017-10-20 Thread Rafael J. Wysocki
On Friday, October 20, 2017 1:35:27 PM CEST Greg Kroah-Hartman wrote:
> On Fri, Oct 20, 2017 at 01:11:22PM +0200, Rafael J. Wysocki wrote:
> > On Thursday, October 19, 2017 9:33:15 AM CEST Greg Kroah-Hartman wrote:
> > > On Thu, Oct 19, 2017 at 01:17:31AM +0200, Rafael J. Wysocki wrote:
> > > > From: Rafael J. Wysocki 
> > > > 
> > > > The motivation for this change is to provide a way to work around
> > > > a problem with the direct-complete mechanism used for avoiding
> > > > system suspend/resume handling for devices in runtime suspend.
> > > > 
> > > > The problem is that some middle layer code (the PCI bus type and
> > > > the ACPI PM domain in particular) returns positive values from its
> > > > system suspend ->prepare callbacks regardless of whether the driver's
> > > > ->prepare returns a positive value or 0, which effectively prevents
> > > > drivers from being able to control the direct-complete feature.
> > > > Some drivers need that control, however, and the PCI bus type has
> > > > grown its own flag to deal with this issue, but since it is not
> > > > limited to PCI, it is better to address it by adding driver flags at
> > > > the core level.
> > > > 
> > > > To that end, add a driver_flags field to struct dev_pm_info for flags
> > > > that can be set by device drivers at the probe time to inform the PM
> > > > core and/or bus types, PM domains and so on about the capabilities and/or
> > > > preferences of device drivers.  Also add two static inline helpers
> > > > for setting that field and testing it against a given set of flags
> > > > and make the driver core clear it automatically on driver remove
> > > > and probe failures.
> > > > 
> > > > Define and document two PM driver flags related to the direct-
> > > > complete feature: NEVER_SKIP and SMART_PREPARE that can be used,
> > > > respectively, to indicate to the PM core that the direct-complete
> > > > mechanism should never be used for the device and to inform the
> > > > middle layer code (bus types, PM domains etc) that it can only
> > > > request the PM core to use the direct-complete mechanism for
> > > > the device (by returning a positive value from its ->prepare
> > > > callback) if it also has been requested by the driver.
> > > > 
> > > > While at it, make the core check pm_runtime_suspended() when
> > > > setting power.direct_complete so that it doesn't need to be
> > > > checked by ->prepare callbacks.
> > > > 
> > > > Signed-off-by: Rafael J. Wysocki 
> > > 
> > > Acked-by: Greg Kroah-Hartman 
> > 
> > Thanks!
> > 
> > Does it also apply to the other patches in the series?
> > 
> > I'd like to queue up the core patches for 4.15 as they are specifically
> > designed to only affect the drivers that actually set the flags, so there
> > shouldn't be any regression resulting from them, and I'd like to be
> > able to start using the flags in drivers going forward.
> 
> Yes, sorry, I thought I acked them, but you are right, I didn't:
> 
> Acked-by: Greg Kroah-Hartman 
> 
> for all of them please.

Thanks!
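
For readers following the changelog quoted above, here is a user-space sketch of how the two flags are described to interact. It is a model under assumptions, not the kernel implementation: the structure, the helper names and the bit values are hypothetical, and only the decision logic stated in the changelog is reproduced.

```c
#include <stdbool.h>

/* Bit values are assumptions made for this sketch only. */
#define DPM_FLAG_NEVER_SKIP	(1u << 0)
#define DPM_FLAG_SMART_PREPARE	(1u << 1)

/* Hypothetical stand-in for the relevant part of struct dev_pm_info. */
struct dev_pm_model {
	unsigned int driver_flags;
	bool runtime_suspended;
};

/* Models the helper a driver would call at probe time. */
static inline void pm_model_set_driver_flags(struct dev_pm_model *pm,
					     unsigned int flags)
{
	pm->driver_flags = flags;
}

/* Models the helper that tests the field against a set of flags. */
static inline bool pm_model_test_driver_flags(const struct dev_pm_model *pm,
					      unsigned int flags)
{
	return (pm->driver_flags & flags) != 0;
}

/*
 * Middle layer ->prepare: a positive return value requests direct-complete.
 * With SMART_PREPARE set, that request is made only if the driver's own
 * ->prepare also returned a positive value.
 */
static int middle_layer_prepare(const struct dev_pm_model *pm, int driver_ret)
{
	if (driver_ret < 0)
		return driver_ret;
	if (driver_ret == 0 &&
	    pm_model_test_driver_flags(pm, DPM_FLAG_SMART_PREPARE))
		return 0;
	return 1;
}

/* Core decision: NEVER_SKIP vetoes direct-complete unconditionally. */
static bool core_use_direct_complete(const struct dev_pm_model *pm,
				     int prepare_ret)
{
	return prepare_ret > 0 && pm->runtime_suspended &&
	       !pm_model_test_driver_flags(pm, DPM_FLAG_NEVER_SKIP);
}
```

Without any flags set, the middle layer in this model requests direct-complete even when the driver returned 0, which is the problem the changelog describes.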



Re: [Update][PATCH v2 01/12] PM / core: Add NEVER_SKIP and SMART_PREPARE driver flags

2017-10-20 Thread Greg Kroah-Hartman
On Fri, Oct 20, 2017 at 01:11:22PM +0200, Rafael J. Wysocki wrote:
> On Thursday, October 19, 2017 9:33:15 AM CEST Greg Kroah-Hartman wrote:
> > On Thu, Oct 19, 2017 at 01:17:31AM +0200, Rafael J. Wysocki wrote:
> > > From: Rafael J. Wysocki 
> > > 
> > > The motivation for this change is to provide a way to work around
> > > a problem with the direct-complete mechanism used for avoiding
> > > system suspend/resume handling for devices in runtime suspend.
> > > 
> > > The problem is that some middle layer code (the PCI bus type and
> > > the ACPI PM domain in particular) returns positive values from its
> > > system suspend ->prepare callbacks regardless of whether the driver's
> > > ->prepare returns a positive value or 0, which effectively prevents
> > > drivers from being able to control the direct-complete feature.
> > > Some drivers need that control, however, and the PCI bus type has
> > > grown its own flag to deal with this issue, but since it is not
> > > limited to PCI, it is better to address it by adding driver flags at
> > > the core level.
> > > 
> > > To that end, add a driver_flags field to struct dev_pm_info for flags
> > > that can be set by device drivers at the probe time to inform the PM
> > > core and/or bus types, PM domains and so on about the capabilities and/or
> > > preferences of device drivers.  Also add two static inline helpers
> > > for setting that field and testing it against a given set of flags
> > > and make the driver core clear it automatically on driver remove
> > > and probe failures.
> > > 
> > > Define and document two PM driver flags related to the direct-
> > > complete feature: NEVER_SKIP and SMART_PREPARE that can be used,
> > > respectively, to indicate to the PM core that the direct-complete
> > > mechanism should never be used for the device and to inform the
> > > middle layer code (bus types, PM domains etc) that it can only
> > > request the PM core to use the direct-complete mechanism for
> > > the device (by returning a positive value from its ->prepare
> > > callback) if it also has been requested by the driver.
> > > 
> > > While at it, make the core check pm_runtime_suspended() when
> > > setting power.direct_complete so that it doesn't need to be
> > > checked by ->prepare callbacks.
> > > 
> > > Signed-off-by: Rafael J. Wysocki 
> > 
> > Acked-by: Greg Kroah-Hartman 
> 
> Thanks!
> 
> Does it also apply to the other patches in the series?
> 
> I'd like to queue up the core patches for 4.15 as they are specifically
> designed to only affect the drivers that actually set the flags, so there
> shouldn't be any regression resulting from them, and I'd like to be
> able to start using the flags in drivers going forward.

Yes, sorry, I thought I acked them, but you are right, I didn't:

Acked-by: Greg Kroah-Hartman 

for all of them please.

thanks,

greg k-h


Re: [Update][PATCH v2 01/12] PM / core: Add NEVER_SKIP and SMART_PREPARE driver flags

2017-10-20 Thread Rafael J. Wysocki
On Thursday, October 19, 2017 9:33:15 AM CEST Greg Kroah-Hartman wrote:
> On Thu, Oct 19, 2017 at 01:17:31AM +0200, Rafael J. Wysocki wrote:
> > From: Rafael J. Wysocki 
> > 
> > The motivation for this change is to provide a way to work around
> > a problem with the direct-complete mechanism used for avoiding
> > system suspend/resume handling for devices in runtime suspend.
> > 
> > The problem is that some middle layer code (the PCI bus type and
> > the ACPI PM domain in particular) returns positive values from its
> > system suspend ->prepare callbacks regardless of whether the driver's
> > ->prepare returns a positive value or 0, which effectively prevents
> > drivers from being able to control the direct-complete feature.
> > Some drivers need that control, however, and the PCI bus type has
> > grown its own flag to deal with this issue, but since it is not
> > limited to PCI, it is better to address it by adding driver flags at
> > the core level.
> > 
> > To that end, add a driver_flags field to struct dev_pm_info for flags
> > that can be set by device drivers at the probe time to inform the PM
> > core and/or bus types, PM domains and so on about the capabilities and/or
> > preferences of device drivers.  Also add two static inline helpers
> > for setting that field and testing it against a given set of flags
> > and make the driver core clear it automatically on driver remove
> > and probe failures.
> > 
> > Define and document two PM driver flags related to the direct-
> > complete feature: NEVER_SKIP and SMART_PREPARE that can be used,
> > respectively, to indicate to the PM core that the direct-complete
> > mechanism should never be used for the device and to inform the
> > middle layer code (bus types, PM domains etc) that it can only
> > request the PM core to use the direct-complete mechanism for
> > the device (by returning a positive value from its ->prepare
> > callback) if it also has been requested by the driver.
> > 
> > While at it, make the core check pm_runtime_suspended() when
> > setting power.direct_complete so that it doesn't need to be
> > checked by ->prepare callbacks.
> > 
> > Signed-off-by: Rafael J. Wysocki 
> 
> Acked-by: Greg Kroah-Hartman 

Thanks!

Does it also apply to the other patches in the series?

I'd like to queue up the core patches for 4.15 as they are specifically
designed to only affect the drivers that actually set the flags, so there
shouldn't be any regression resulting from them, and I'd like to be
able to start using the flags in drivers going forward.
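
To make the flag semantics from the changelog concrete, here is a minimal
user-space sketch of the decision logic. It is not the kernel code: the
PM_FLAG_* names, struct pm_info and the helper names are made up for
illustration, and runtime_suspended merely stands in for
pm_runtime_suspended().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical flag names modelled on the changelog, not kernel definitions. */
#define PM_FLAG_NEVER_SKIP    (1u << 0)
#define PM_FLAG_SMART_PREPARE (1u << 1)

struct pm_info {
	unsigned int driver_flags; /* set by the driver at probe time */
	bool runtime_suspended;    /* stand-in for pm_runtime_suspended() */
	bool direct_complete;      /* decided by the "core" below */
};

/* The two helpers: set the field and test it against a set of flags. */
static inline void pm_set_driver_flags(struct pm_info *pm, unsigned int flags)
{
	pm->driver_flags = flags;
}

static inline bool pm_test_driver_flags(const struct pm_info *pm,
					unsigned int flags)
{
	return (pm->driver_flags & flags) != 0;
}

/*
 * Middle-layer ->prepare (bus type, PM domain). Without SMART_PREPARE it
 * asks for direct-complete regardless of what the driver returned, which
 * is the problem described above. With SMART_PREPARE it only does so if
 * the driver's ->prepare returned a positive value too.
 */
static int middle_layer_prepare(const struct pm_info *pm, int driver_ret)
{
	if (driver_ret < 0)
		return driver_ret;
	if (pm_test_driver_flags(pm, PM_FLAG_SMART_PREPARE) && driver_ret == 0)
		return 0;
	return 1;
}

/* Core side: NEVER_SKIP wins, and the runtime-suspend check lives here. */
static void core_prepare(struct pm_info *pm, int driver_ret)
{
	int ret;

	assert(pm != NULL);
	ret = middle_layer_prepare(pm, driver_ret);
	pm->direct_complete = ret > 0 && pm->runtime_suspended &&
			      !pm_test_driver_flags(pm, PM_FLAG_NEVER_SKIP);
}
```

A driver that sets neither flag sees the old behaviour, which is why only
drivers that opt in should be affected.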

Thanks,
Rafael




Re: [PATCH 0/2] Make squashfs fragments' cache size more configurable

2017-10-20 Thread Wuqixuan
Hi Phillip, 

Thank you for your fast reply. 

On Fri, Oct 20, 2017 at 2:18 PM, Phillip Lougher wrote:
> On Thu, Oct 19, 2017 at 12:50 AM, Qixuan Wu  wrote:
>> Hi All,
>>
>> Currently, squashfs fragments' cache size is only determined by
>> config option CONFIG_SQUASHFS_FRAGMENT_CACHE_SIZE. Users have
>> no way to change the value when they get the binary kernel.

> Thank-you for the patches, but they're both pointless and dangerous.
> Let's be clear here you're trying to change an "expert only" kernel
> configuration option into a user changeable option.  This is stupid
> because it is not meant for non-experts to change for good reason.
>
> The fragment cache size isn't  some small tweak to the operation of
> Squashfs, it fundamentally affects both the performance and memory
> overhead of Squashfs.  As such right from its introduction in 2003, it
> has been an "expert only" configuration option at build time.  Even
> then it is made clear that the default has been carefully chosen, and
> it should only be changed in exceptional circumstances.  This
> basically means don't change the default unless you really know what
> you're doing, and this means tracing of Squashfs against your use-case
> to determine caching behaviour.  There is absolutely no other reason
> why you'd want to change the default.  This also means it should be
> restricted to kernel configuration time only.
> 
> Let's be clear again, very few people should ever want to change the
> default, and for the "experts" that do want to do so, they can do so
> when configuring the kernel.  If you're not in a position to change it
> at kernel configuration time then by definition you're not an expert,
> and you shouldn't be able to change it anyway and certainly not as a
> user.
> 
> There is absolutely no use-case here to make this a user changeable
> option.  I can see no upsides in doing this, only downsides.
> 
> Frankly if you need to change this value at module insert time then
> there is something wrong with your system or build process.   If you
> want this because you want to build the kernel/modules once, and then
> post-facto configure them for various products then it is your build
> process that is broken.   If you want this because you want to
> dynamically change Squashfs memory usage/caching behaviour post kernel
> configuration time it suggests you're trying to adapt Squashfs's
> footprint based on available memory.   This is an abuse of the option
> as it's only meant to be used after detailed tracing/analysis and
> certainly not used to accommodate unforeseen dynamic low memory
> situations, and if that's the reason for needing this option, you
> should be looking to solve it elsewhere.
> 
> Ultimately this has been an "expert" kernel configuration only option
> since its introduction in 2003, and I have never been asked to change it,
> and I believe this is because people recognise it as such.  I suspect
> you're trying to change this for fundamentally bogus reasons.
> Moreover Squashfs is used in many different use-cases and
> distributions, and I'm not going to make this a user-changeable option
> allowing users to insert the Squashfs module in such a way that will
> break its performance.
> 
> So NACK.

I apologize for not describing the scenario clearly enough. Our company
may have a somewhat unusual kernel distribution model: we can only release
a single kernel binary to multiple customers. One of our customers has a
very strict kernel boot time requirement of 2~3s, including rootfs
(squashfs) decompression. We found that changing the value from 3 to 8
saves some hundreds of milliseconds, which brings the total boot time
within the requirement. But we were afraid of affecting other customers,
so we used a kernel boot parameter. There is currently no use case for the
module interface.

Maybe I'm wrong, but in my opinion a kernel boot parameter is almost the
same as a config option, a module parameter, /proc or sysfs: it gives the
administrator the chance to change some kernel variables for different
scenarios when they cannot change the config option. And the administrator
is typically root, not a normal user, so a value set through a kernel boot
parameter is set with enough understanding, testing and analysis. For
example, kernel boot parameters such as crashkernel=size[KMG] and
default_hugepagesz= do the same kind of work. So before setting the value
to 8, our customer's administrator understands the meaning and the memory
overhead of changing the fragment cache size.
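
To illustrate what I mean, here is a minimal user-space sketch of a boot
parameter that falls back to the build-time default. It is only a sketch,
not the patch: the parameter name squashfs.fragment_cache_size, the helper
name and the upper bound of 64 are all hypothetical. Only the default of 3
comes from our configuration.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define DEFAULT_FRAGMENT_CACHE_SIZE 3  /* build-time default in our config */
#define MAX_FRAGMENT_CACHE_SIZE     64 /* arbitrary cap, for illustration */

/*
 * Return the fragment cache size requested on the command line, or the
 * build-time default if the (hypothetical) parameter is absent or invalid.
 */
static int fragment_cache_size_from_cmdline(const char *cmdline)
{
	static const char key[] = "squashfs.fragment_cache_size=";
	const size_t key_len = sizeof(key) - 1;
	const char *p = cmdline;

	assert(cmdline != NULL);

	while ((p = strstr(p, key)) != NULL) {
		/* Only accept the key at the start of a word. */
		if (p == cmdline || p[-1] == ' ') {
			const char *val = p + key_len;
			char *end;
			long v = strtol(val, &end, 10);

			if (end != val && (*end == '\0' || *end == ' ') &&
			    v >= 1 && v <= MAX_FRAGMENT_CACHE_SIZE)
				return (int)v;
			/* Malformed or out of range: keep the default. */
			return DEFAULT_FRAGMENT_CACHE_SIZE;
		}
		p += key_len;
	}
	return DEFAULT_FRAGMENT_CACHE_SIZE;
}
```

Customers who pass nothing keep the default of 3, so only the one product
that adds the parameter is affected.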

Frankly, I admit our scenario may be a bit special to embedded systems, and
not as useful for others as it is for us. So it does look a bit
over-designed, and I can understand what you are worried about in accepting
the feature.

Anyway, thanks for your reply and your opinion. 

Thanks 
Qixuan

> Phillip Lougher (Squashfs maintainer)


Re: [RFC PATCH 3/5] gpio: gpiolib: Add chardev support for maintaining GPIO values on reset

2017-10-20 Thread Andrew Jeffery
On Fri, 2017-10-20 at 09:27 +0200, Linus Walleij wrote:
> I paged Bartosz and Michael on this, they are experts on the use cases for
> the character device and their opinions are likely more valuable than mine.
> 
> > On Fri, Oct 20, 2017 at 5:37 AM, Andrew Jeffery  wrote:
> > Similar to devicetree support, add flags and mappings to expose reset
> > tolerance configuration through the chardev interface.
> > 
> > Signed-off-by: Andrew Jeffery 
> 
> (...)
> 
> > +* Unconditionally configure reset tolerance, as it's possible
> > +* that the tolerance flag itself becomes tolerant to resets.
> > +* Thus it could remain set from a previous environment, but
> > +* the current environment may not expect it so.
> > +*/
> > +   ret = gpiod_set_reset_tolerant(desc,
> > +   !!(lflags & GPIOHANDLE_REQUEST_RESET_TOLERANT));
> > +   if (ret < 0)
> > +   goto out_free_descs;
> 
> First, as noted in the first patch, IMO we should just go for persistence,
> i.e. you want to flag to the system to keep the line persistent in any case,
> no matter if the system goes to sleep or resets.
> 
> So the use case is going to be a control system or similar, a makerspace
> project, an industrial product of some kind, driving GPIO from userspace.
> 
> I don't see it as helpful to give userspace control over whether the line
> is persistent or not. It is more reasonable to assume persistence for
> userspace use cases, don't you think? Whether the system goes to sleep
> or the gpiochip resets should not make a door suddenly close or the
> lights in the christmas tree go out, right? I think if the gpiochip supports
> persistence of any kind, we should try to use it and not have userspace
> provide flags for that.

Right. I guess the counter argument to your examples is if the gpio is
controlling any active process that we don't want to continue if we've
lost the capacity to monitor some other inputs (some kind of dead-man's 
switch). But maybe the argument is that should be implemented in the
kernel anyway?

Andrew



[PATCH v2 1/2] mm, thp: introduce dedicated transparent huge page allocation interfaces

2017-10-20 Thread changbin . du
From: Changbin Du 

This patch introduces 4 new interfaces to allocate a prepared
transparent huge page. The aim is to remove duplicated code and
simplify transparent huge page allocation. These are similar to
alloc_hugepage_xxx which are for hugetlbfs pages.
  - alloc_transhuge_page_vma
  - alloc_transhuge_page_nodemask
  - alloc_transhuge_page_node
  - alloc_transhuge_page

These interfaces implicitly add the __GFP_COMP gfp flag, which is
the minimum needed for huge page allocation. Any further flags
are left to the callers.

This patch makes the following changes:
  - define alloc_transhuge_page_xxx interfaces
  - apply them to all existing code
  - declare prep_transhuge_page as static since no others use it
  - remove alloc_hugepage_vma definition since it no longer has users

Signed-off-by: Changbin Du 

---
v2:
Anshuman Khandu:
  - Remove redundant 'VM_BUG_ON(!(gfp_mask & __GFP_COMP))'.
Andrew Morton:
  - Fix build error if thp is disabled.
---
 include/linux/gfp.h     |  4 
 include/linux/huge_mm.h | 18 --
 include/linux/migrate.h | 14 +-
 mm/huge_memory.c        | 48 +---
 mm/khugepaged.c         | 11 ++-
 mm/mempolicy.c          | 14 +++---
 mm/migrate.c            | 14 --
 mm/shmem.c              |  6 ++
 8 files changed, 73 insertions(+), 56 deletions(-)

diff --git a/include/linux/gfp.h b/include/linux/gfp.h
index f780718..855c72e 100644
--- a/include/linux/gfp.h
+++ b/include/linux/gfp.h
@@ -507,15 +507,11 @@ alloc_pages(gfp_t gfp_mask, unsigned int order)
 extern struct page *alloc_pages_vma(gfp_t gfp_mask, int order,
struct vm_area_struct *vma, unsigned long addr,
int node, bool hugepage);
-#define alloc_hugepage_vma(gfp_mask, vma, addr, order) \
-   alloc_pages_vma(gfp_mask, order, vma, addr, numa_node_id(), true)
 #else
 #define alloc_pages(gfp_mask, order) \
alloc_pages_node(numa_node_id(), gfp_mask, order)
 #define alloc_pages_vma(gfp_mask, order, vma, addr, node, false)\
alloc_pages(gfp_mask, order)
-#define alloc_hugepage_vma(gfp_mask, vma, addr, order) \
-   alloc_pages(gfp_mask, order)
 #endif
 #define alloc_page(gfp_mask) alloc_pages(gfp_mask, 0)
 #define alloc_page_vma(gfp_mask, vma, addr)\
diff --git a/include/linux/huge_mm.h b/include/linux/huge_mm.h
index 14bc21c..184eb38 100644
--- a/include/linux/huge_mm.h
+++ b/include/linux/huge_mm.h
@@ -130,9 +130,20 @@ extern unsigned long thp_get_unmapped_area(struct file *filp,
unsigned long addr, unsigned long len, unsigned long pgoff,
unsigned long flags);
 
-extern void prep_transhuge_page(struct page *page);
 extern void free_transhuge_page(struct page *page);
 
+extern struct page *alloc_transhuge_page_vma(gfp_t gfp_mask,
+   struct vm_area_struct *vma, unsigned long addr);
+extern struct page *alloc_transhuge_page_nodemask(gfp_t gfp_mask,
+   int preferred_nid, nodemask_t *nmask);
+
+static inline struct page *alloc_transhuge_page_node(int nid, gfp_t gfp_mask)
+{
+   return alloc_transhuge_page_nodemask(gfp_mask, nid, NULL);
+}
+
+extern struct page *alloc_transhuge_page(gfp_t gfp_mask);
+
 bool can_split_huge_page(struct page *page, int *pextra_pins);
 int split_huge_page_to_list(struct page *page, struct list_head *list);
 static inline int split_huge_page(struct page *page)
@@ -260,7 +271,10 @@ static inline bool transparent_hugepage_enabled(struct vm_area_struct *vma)
return false;
 }
 
-static inline void prep_transhuge_page(struct page *page) {}
+#define alloc_transhuge_page_vma(gfp_mask, vma, addr) NULL
+#define alloc_transhuge_page_nodemask(gfp_mask, preferred_nid, nmask) NULL
+#define alloc_transhuge_page_node(nid, gfp_mask) NULL
+#define alloc_transhuge_page(gfp_mask) NULL
 
 #define transparent_hugepage_flags 0UL
 
diff --git a/include/linux/migrate.h b/include/linux/migrate.h
index 643c7ae..70a00f3 100644
--- a/include/linux/migrate.h
+++ b/include/linux/migrate.h
@@ -42,19 +42,15 @@ static inline struct page *new_page_nodemask(struct page *page,
return alloc_huge_page_nodemask(page_hstate(compound_head(page)),
preferred_nid, nodemask);
 
-   if (thp_migration_supported() && PageTransHuge(page)) {
-   order = HPAGE_PMD_ORDER;
-   gfp_mask |= GFP_TRANSHUGE;
-   }
-
if (PageHighMem(page) || (zone_idx(page_zone(page)) == ZONE_MOVABLE))
gfp_mask |= __GFP_HIGHMEM;
 
-   new_page = __alloc_pages_nodemask(gfp_mask, order,
+   if (thp_migration_supported() && PageTransHuge(page))
+   return alloc_transhuge_page_nodemask(gfp_mask | GFP_TRANSHUGE,
+   preferred_nid, nodemask);
+   else
+   return __alloc_pages_nodemask(gfp_mask, order,
   

[PATCH v2 0/2] mm, thp: introduce dedicated transparent huge page allocation interfaces

2017-10-20 Thread changbin . du
From: Changbin Du 

The first patch introduces new interfaces; the second removes a naming confusion.
The aim is to simplify transparent huge page allocation and remove duplicated
code.

V2:
  - Coding improvement.
  - Fix build error if thp is disabled.

Changbin Du (2):
  mm, thp: introduce dedicated transparent huge page allocation
interfaces
  mm: rename page dtor functions to {compound,huge,transhuge}_page__dtor

 Documentation/vm/hugetlbfs_reserv.txt |  4 +--
 include/linux/gfp.h                   |  4 ---
 include/linux/huge_mm.h               | 20 --
 include/linux/hugetlb.h               |  2 +-
 include/linux/migrate.h               | 14 --
 include/linux/mm.h                    |  8 +++---
 mm/huge_memory.c                      | 52 +--
 mm/hugetlb.c                          | 14 +-
 mm/khugepaged.c                       | 11 ++--
 mm/mempolicy.c                        | 14 ++
 mm/migrate.c                          | 14 +++---
 mm/page_alloc.c                       | 10 +++
 mm/shmem.c                            |  6 ++--
 mm/swap.c                             |  2 +-
 mm/userfaultfd.c                      |  2 +-
 15 files changed, 97 insertions(+), 80 deletions(-)

-- 
2.7.4



[PATCH v2 2/2] mm: rename page dtor functions to {compound,huge,transhuge}_page__dtor

2017-10-20 Thread changbin . du
From: Changbin Du 

The current names free_{huge,transhuge}_page pair up with the
alloc_{huge,transhuge}_page functions, but the function that actually
frees a page is still free_page(), which indirectly calls
free_{huge,transhuge}_page.

So this patch removes the confusion by renaming all the
compound page dtors to {compound,huge,transhuge}_page_dtor.
And since there is already a typedef named compound_page_dtor,
rename it to compound_page_dtor_t to avoid a name conflict.

Signed-off-by: Changbin Du 

---
v2: Improve commit message.
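
For reviewers, a minimal user-space sketch of why the typedef has to be
renamed: once a destructor function is itself called compound_page_dtor,
the function-type typedef can no longer use that name. This is a
simplified model, not the kernel code. struct mock_page and its freed_by
field are invented here, and the dtor id is stored directly in the page
rather than in page[1].

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-in for struct page; freed_by records which dtor ran. */
struct mock_page {
	unsigned int compound_dtor;
	int freed_by;
};

/* Function type for destructors; the _t suffix frees up the plain name. */
typedef void compound_page_dtor_t(struct mock_page *);

enum compound_dtor_id {
	NULL_COMPOUND_DTOR,
	COMPOUND_PAGE_DTOR,
	HUGETLB_PAGE_DTOR,
	TRANSHUGE_PAGE_DTOR,
	NR_COMPOUND_DTORS,
};

/* With the typedef renamed, this function name no longer conflicts. */
static void compound_page_dtor(struct mock_page *page)
{
	page->freed_by = COMPOUND_PAGE_DTOR;
}

static void huge_page_dtor(struct mock_page *page)
{
	page->freed_by = HUGETLB_PAGE_DTOR;
}

static void transhuge_page_dtor(struct mock_page *page)
{
	page->freed_by = TRANSHUGE_PAGE_DTOR;
}

/* Keep this table in sync with enum compound_dtor_id. */
static compound_page_dtor_t * const compound_page_dtors[] = {
	NULL,
	compound_page_dtor,
	huge_page_dtor,
	transhuge_page_dtor,
};

static compound_page_dtor_t *get_compound_page_dtor(struct mock_page *page)
{
	assert(page->compound_dtor < NR_COMPOUND_DTORS);
	return compound_page_dtors[page->compound_dtor];
}
```

The generic free path looks the destructor up by id and calls it, so the
*_dtor names describe what these functions are rather than suggesting they
are the counterparts of alloc_*.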
---
 Documentation/vm/hugetlbfs_reserv.txt |  4 ++--
 include/linux/huge_mm.h               |  2 +-
 include/linux/hugetlb.h               |  2 +-
 include/linux/mm.h                    |  8 
 mm/huge_memory.c                      |  4 ++--
 mm/hugetlb.c                          | 14 +++---
 mm/page_alloc.c                       | 10 +-
 mm/swap.c                             |  2 +-
 mm/userfaultfd.c                      |  2 +-
 9 files changed, 24 insertions(+), 24 deletions(-)

diff --git a/Documentation/vm/hugetlbfs_reserv.txt b/Documentation/vm/hugetlbfs_reserv.txt
index 9aca09a..b3ffa3e 100644
--- a/Documentation/vm/hugetlbfs_reserv.txt
+++ b/Documentation/vm/hugetlbfs_reserv.txt
@@ -238,7 +238,7 @@ to the global reservation count (resv_huge_pages).
 
 Freeing Huge Pages
 ------------------
-Huge page freeing is performed by the routine free_huge_page().  This routine
+Huge page freeing is performed by the routine huge_page_dtor().  This routine
 is the destructor for hugetlbfs compound pages.  As a result, it is only
 passed a pointer to the page struct.  When a huge page is freed, reservation
 accounting may need to be performed.  This would be the case if the page was
@@ -468,7 +468,7 @@ However, there are several instances where errors are encountered after a huge
 page is allocated but before it is instantiated.  In this case, the page
 allocation has consumed the reservation and made the appropriate subpool,
 reservation map and global count adjustments.  If the page is freed at this
-time (before instantiation and clearing of PagePrivate), then free_huge_page
+time (before instantiation and clearing of PagePrivate), then huge_page_dtor
 will increment the global reservation count.  However, the reservation map
 indicates the reservation was consumed.  This resulting inconsistent state
 will cause the 'leak' of a reserved huge page.  The global reserve count will
diff --git a/include/linux/huge_mm.h b/include/linux/huge_mm.h
index 184eb38..bd05bc7 100644
--- a/include/linux/huge_mm.h
+++ b/include/linux/huge_mm.h
@@ -130,7 +130,7 @@ extern unsigned long thp_get_unmapped_area(struct file *filp,
unsigned long addr, unsigned long len, unsigned long pgoff,
unsigned long flags);
 
-extern void free_transhuge_page(struct page *page);
+extern void transhuge_page_dtor(struct page *page);
 
 extern struct page *alloc_transhuge_page_vma(gfp_t gfp_mask,
struct vm_area_struct *vma, unsigned long addr);
diff --git a/include/linux/hugetlb.h b/include/linux/hugetlb.h
index 8bbbd37..24492c5 100644
--- a/include/linux/hugetlb.h
+++ b/include/linux/hugetlb.h
@@ -118,7 +118,7 @@ long hugetlb_unreserve_pages(struct inode *inode, long start, long end,
long freed);
 bool isolate_huge_page(struct page *page, struct list_head *list);
 void putback_active_hugepage(struct page *page);
-void free_huge_page(struct page *page);
+void huge_page_dtor(struct page *page);
 void hugetlb_fix_reserve_counts(struct inode *inode);
 extern struct mutex *hugetlb_fault_mutex_table;
 u32 hugetlb_fault_mutex_hash(struct hstate *h, struct mm_struct *mm,
diff --git a/include/linux/mm.h b/include/linux/mm.h
index 065d99d..adfa906 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -616,7 +616,7 @@ void split_page(struct page *page, unsigned int order);
  * prototype for that function and accessor functions.
  * These are _only_ valid on the head of a compound page.
  */
-typedef void compound_page_dtor(struct page *);
+typedef void compound_page_dtor_t(struct page *);
 
 /* Keep the enum in sync with compound_page_dtors array in mm/page_alloc.c */
 enum compound_dtor_id {
@@ -630,7 +630,7 @@ enum compound_dtor_id {
 #endif
NR_COMPOUND_DTORS,
 };
-extern compound_page_dtor * const compound_page_dtors[];
+extern compound_page_dtor_t * const compound_page_dtors[];
 
 static inline void set_compound_page_dtor(struct page *page,
enum compound_dtor_id compound_dtor)
@@ -639,7 +639,7 @@ static inline void set_compound_page_dtor(struct page *page,
page[1].compound_dtor = compound_dtor;
 }
 
-static inline compound_page_dtor *get_compound_page_dtor(struct page *page)
+static inline compound_page_dtor_t *get_compound_page_dtor(struct page *page)
 {
VM_BUG_ON_PAGE(page[1].compound_dtor >= NR_COMPOUND_DTORS, page);

Re: [PATCH 1/2] mm, thp: introduce dedicated transparent huge page allocation interfaces

2017-10-20 Thread Du, Changbin
Hi Hocko,
On Thu, Oct 19, 2017 at 02:49:31PM +0200, Michal Hocko wrote:
> On Wed 18-10-17 19:00:26, Du, Changbin wrote:
> > Hi Hocko,
> > 
> > On Tue, Oct 17, 2017 at 12:20:52PM +0200, Michal Hocko wrote:
> > > [CC Kirill]
> > > 
> > > On Mon 16-10-17 17:19:16, changbin...@intel.com wrote:
> > > > From: Changbin Du 
> > > > 
> > > > This patch introduced 4 new interfaces to allocate a prepared
> > > > transparent huge page.
> > > >   - alloc_transhuge_page_vma
> > > >   - alloc_transhuge_page_nodemask
> > > >   - alloc_transhuge_page_node
> > > >   - alloc_transhuge_page
> > > > 
> > > > The aim is to remove duplicated code and simplify transparent
> > > > huge page allocation. These are similar to alloc_hugepage_xxx
> > > > which are for hugetlbfs pages. This patch does below changes:
> > > >   - define alloc_transhuge_page_xxx interfaces
> > > >   - apply them to all existing code
> > > >   - declare prep_transhuge_page as static since no others use it
> > > >   - remove alloc_hugepage_vma definition since it no longer has users
> > > 
> > > So what exactly is the advantage of the new API? The diffstat doesn't
> > > sound very convincing to me.
> > >
> > The caller needs only one step to allocate a THP. Several lines of code are
> > removed from every call site with this change, so it's a little more convenient.
> 
> Yeah, but the overall result is more code. So I am not really convinced. 
Yes, but some of the added code is only there to keep the compiler happy
(declarations). These are just simple, light wrappers, the same as other such
functions in the kernel. At least code readability is improved: the two-step
allocation is merged into one step, so the duplicated logic is removed.
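
To show what I mean by merging the two steps, here is a minimal user-space
sketch. It is not the patch itself: the mock_* names and MOCK_GFP_COMP are
stand-ins invented for this example, and the order value of 9 is only
illustrative.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

#define MOCK_GFP_COMP    (1u << 0) /* stand-in for __GFP_COMP */
#define MOCK_HPAGE_ORDER 9         /* illustrative huge page order */

/* Simplified stand-in for struct page. */
struct mock_page {
	unsigned int gfp_mask;
	unsigned int order;
	bool prepared;
};

/* Step 1 of the old pattern: the generic allocator. */
static struct mock_page *mock_alloc_pages(unsigned int gfp_mask,
					  unsigned int order)
{
	struct mock_page *page = calloc(1, sizeof(*page));

	if (!page)
		return NULL;
	page->gfp_mask = gfp_mask;
	page->order = order;
	return page;
}

/* Step 2 of the old pattern: every caller had to remember this call. */
static void mock_prep_transhuge_page(struct mock_page *page)
{
	assert(page != NULL);
	page->prepared = true;
}

/*
 * The wrapper: one call that adds the compound flag, picks the order and
 * prepares the page, so callers cannot forget any of the three.
 */
static struct mock_page *mock_alloc_transhuge_page(unsigned int gfp_mask)
{
	struct mock_page *page;

	page = mock_alloc_pages(gfp_mask | MOCK_GFP_COMP, MOCK_HPAGE_ORDER);
	if (page)
		mock_prep_transhuge_page(page);
	return page;
}
```

Each call site then shrinks to a single call plus a NULL check.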

> -- 
> Michal Hocko
> SUSE Labs

-- 
Thanks,
Changbin Du




Re: [RFC PATCH 1/5] gpio: gpiolib: Add core support for maintaining GPIO values on reset

2017-10-20 Thread Andrew Jeffery
On Fri, 2017-10-20 at 09:43 +0200, Linus Walleij wrote:
> On Fri, Oct 20, 2017 at 9:17 AM, Linus Walleij wrote:
> > > > On Fri, Oct 20, 2017 at 5:37 AM, Andrew Jeffery  wrote:
> > 
> > > GPIO state reset tolerance is implemented in gpiolib through the
> > > addition of a new pinconf parameter. With that, some renaming of helpers
> > > is done to clarify the scope of the already existing
> > > gpiochip_line_is_persistent(), as it's now ambiguous as to whether that
> > > means on suspend, reset or both.
> > 
> > > Isn't it most reasonable to say persistence covers both cases, reset
> > > and/or sleep? This seems a bit overdefined.
> 
> I should also add: right now persistence is defined in negative terms,
> you can supply the flag "may lose value", which means the subsystem
> by default, and driver by default, will try to keep values persistent across
> sleep.
> 
> Then it is possible to opt in for not doing so. (Usually to save power I
> think.)
> 
> I think that especially for userspace use cases, saving power should
> not really be the concern, but correct me if I'm wrong. I am thinking
> of a box with a DC plug wired up to a factory line here.
> 
> What we have in the Arizona driver is an opt-in where the DT can
> say "don't preserve the value of this line during system sleep" i.e. "may lose
> value" and we can extend that flag to mean "don't preserve this line
> during reset either" but by default assume that we should.

Yeah, the preserve polarity was another thing I debated given the
current example with the Arizona driver. Not preserving is the default
for the Aspeed hardware, so that ended up influencing my choice. Not
that implementation details should necessarily influence interface
design, but it was at least more than a coin toss.

I don't have anything specific against preserving by default, just my
gut instinct and the hardware went the other way. As long as we expose
the option to opt out, which the additions for the Arizona already do.

Cheers,

Andrew



Re: [RFC PATCH 1/5] gpio: gpiolib: Add core support for maintaining GPIO values on reset

2017-10-20 Thread Andrew Jeffery
On Fri, 2017-10-20 at 09:17 +0200, Linus Walleij wrote:
> > On Fri, Oct 20, 2017 at 5:37 AM, Andrew Jeffery  wrote:
> 
> > GPIO state reset tolerance is implemented in gpiolib through the
> > addition of a new pinconf parameter. With that, some renaming of helpers
> > is done to clarify the scope of the already existing
> > gpiochip_line_is_persistent(), as it's now ambiguous as to whether that
> > means on suspend, reset or both.
> 
> Isn't it most reasonable to say persistence covers both cases, reset
> and/or sleep? This seems a bit overdefined.

I definitely had some internal debate about that. I erred on the side of
avoiding potential change in expectations for the arizona. If you consider that
overdefined then I'm happy to go the other way.

> 
> So can we say that if this flag is set, the hardware and driver should
> do its best to preserve the value across any system disruptions.
> 
> We can change the wording of course, patches welcome for that.

Yep.

> 
> But do we really need to distinguish the cases of disruption and
> whether we cover up for them or not?
> 
> I would say we can deal with that the day we have a system with
> two register bits (or similar) where you can select to preserve across
> sleep, reset, one or the other, AND there is also a usecase such that
> a user wants to preserve the value across reset but not suspend or
> vice versa.
> 
> I suspect that will not happen.

A very reasonable approach.

Cheers for the feedback.

Andrew



Re: [PATCH 1/3] printk: Introduce per-console loglevel setting

2017-10-20 Thread Petr Mladek
On Thu 2017-10-19 16:40:45, Calvin Owens wrote:
> On 09/28/2017 05:43 PM, Calvin Owens wrote:
> >Not all consoles are created equal: depending on the actual hardware,
> >the latency of a printk() call can vary dramatically. The worst examples
> >are serial consoles, where it can spin for tens of milliseconds banging
> >the UART to emit a message, which can cause application-level problems
> >when the kernel spews onto the console.
> 
> Any thoughts on this series? Happy to resend again, but if there are no
> objections I'd love to see it merged sooner rather than later :)
> 
> Happy to resend too, just let me know.

There is no need to resend the patch. It is on my radar and I am
going to look at it.

Please be patient: you have hit the conference, illness, and after-vacation
season. We do not want to delay it unnecessarily, but it is
not a trivial change that can be accepted within minutes.

Best Regards,
Petr


Re: [RFC PATCH 1/5] gpio: gpiolib: Add core support for maintaining GPIO values on reset

2017-10-20 Thread Linus Walleij
On Fri, Oct 20, 2017 at 9:17 AM, Linus Walleij  wrote:
> On Fri, Oct 20, 2017 at 5:37 AM, Andrew Jeffery  wrote:
>
>> GPIO state reset tolerance is implemented in gpiolib through the
>> addition of a new pinconf parameter. With that, some renaming of helpers
>> is done to clarify the scope of the already existing
>> gpiochip_line_is_persistent(), as it's now ambiguous as to whether that
>> means on suspend, reset or both.
>
> Isn't it most reasonable to say persistence covers both cases, reset
> and/or sleep? This seems a bit overdefined.

I should also add: right now persistence is defined in negative terms,
you can supply the flag "may lose value", which means the subsystem
by default, and driver by default, will try to keep values persistent across
sleep.

Then it is possible to opt in for not doing so. (Usually to save power I
think.)

I think that especially for userspace use cases, saving power should
not really be the concern, but correct me if I'm wrong. I am thinking
of a box with a DC plug wired up to a factory line here.

What we have in the Arizona driver is an opt-in where the DT can
say "don't preserve the value of this line during system sleep", i.e. "may
lose value", and we can extend that flag to mean "don't preserve this line
during reset either", but by default assume that we should.
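
To make the "negative terms" point concrete, here is a minimal userspace
sketch, not gpiolib code. Every sketch_* name is hypothetical; only the
0x8 value mirrors OF_GPIO_SLEEP_MAY_LOSE_VALUE from the quoted patch
context. It assumes a single opt-out flag that covers both sleep and reset,
with persistence as the default.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical opt-out flag; value mirrors OF_GPIO_SLEEP_MAY_LOSE_VALUE. */
#define SKETCH_MAY_LOSE_VALUE 0x8u

static_assert((SKETCH_MAY_LOSE_VALUE & (SKETCH_MAY_LOSE_VALUE - 1)) == 0,
	      "the opt-out flag must be a single bit");

/* Hypothetical kinds of disruption a line may have to survive. */
enum sketch_disruption { SKETCH_SLEEP, SKETCH_RESET };

/*
 * Persistence is the default. A line gives it up only when the opt-out
 * flag is set, and the same flag is consulted for every disruption.
 */
static bool sketch_line_is_persistent(unsigned int flags,
				      enum sketch_disruption why)
{
	(void)why; /* deliberately unused: one flag covers sleep and reset */
	return !(flags & SKETCH_MAY_LOSE_VALUE);
}
```

With no flags set, the line is treated as persistent for both SKETCH_SLEEP
and SKETCH_RESET; setting the flag opts out of both at once.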

Yours,
Linus Walleij


Re: [RFC PATCH 4/5] gpio: gpiolib: Add sysfs support for maintaining GPIO values on reset

2017-10-20 Thread Andrew Jeffery
On Fri, 2017-10-20 at 09:29 +0200, Linus Walleij wrote:
> On Fri, Oct 20, 2017 at 5:37 AM, Andrew Jeffery wrote:
> 
> > Expose a new 'maintain' sysfs attribute to control both suspend and
> > reset tolerance.
> > 
> > Signed-off-by: Andrew Jeffery 
> 
> NAK. You will find the actual ABI documentation in
> Documentation/ABI/obsolete/sysfs-gpio

Right, I did a quick grep to find an attribute description in order to
judge what documentation to change. Unfortunately my grep didn't pick
up this file.

> that's why. This is being phased out and should not be extended.
> Everyone should use the character device, especially for new
> functionality.

Yeah, I expected this (and the NAK) would be the response but figured I
should ask the question.

Thanks,

Andrew



Re: [RFC PATCH 4/5] gpio: gpiolib: Add sysfs support for maintaining GPIO values on reset

2017-10-20 Thread Linus Walleij
On Fri, Oct 20, 2017 at 5:37 AM, Andrew Jeffery  wrote:

> Expose a new 'maintain' sysfs attribute to control both suspend and
> reset tolerance.
>
> Signed-off-by: Andrew Jeffery 

NAK. You will find the actual ABI documentation in
Documentation/ABI/obsolete/sysfs-gpio, and that location is why: the
sysfs ABI is being phased out and should not be extended.
Everyone should use the character device, especially for new
functionality.
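
For readers who have only used the sysfs files, a rough sketch of requesting
one output line through the character device follows. It assumes the
line-handle uAPI in <linux/gpio.h> as it existed around the time of this
thread (struct gpiohandle_request and GPIO_GET_LINEHANDLE_IOCTL). The function
name, the consumer label and any chip path passed in are made up for
illustration.

```c
#include <assert.h>
#include <fcntl.h>
#include <linux/gpio.h>
#include <string.h>
#include <sys/ioctl.h>
#include <unistd.h>

/*
 * Hypothetical helper: request a single line as an output with an initial
 * value. Returns a line-handle file descriptor on success, -1 on failure.
 * The caller closes the returned descriptor to release the line.
 */
static int sketch_request_output_line(const char *chip_path,
				      unsigned int offset, int value)
{
	struct gpiohandle_request req;
	int chip_fd, ret;

	assert(chip_path != NULL);

	chip_fd = open(chip_path, O_RDONLY);
	if (chip_fd < 0)
		return -1;

	memset(&req, 0, sizeof(req));
	req.lineoffsets[0] = offset;
	req.lines = 1;
	req.flags = GPIOHANDLE_REQUEST_OUTPUT;
	req.default_values[0] = value ? 1 : 0;
	/* req is zeroed, so the label stays NUL-terminated. */
	strncpy(req.consumer_label, "sketch-consumer",
		sizeof(req.consumer_label) - 1);

	ret = ioctl(chip_fd, GPIO_GET_LINEHANDLE_IOCTL, &req);
	close(chip_fd); /* the line handle outlives the chip descriptor */
	if (ret < 0)
		return -1;

	return req.fd;
}
```

A caller would pass something like a /dev/gpiochipN path and a line offset;
both depend entirely on the board, so none are suggested here.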

Yours,
Linus Walleij


Re: [RFC PATCH 2/5] gpio: gpiolib: Add OF support for maintaining GPIO values on reset

2017-10-20 Thread Andrew Jeffery
On Fri, 2017-10-20 at 09:18 +0200, Linus Walleij wrote:
> On Fri, Oct 20, 2017 at 5:37 AM, Andrew Jeffery wrote:
> 
> > @@ -32,6 +32,7 @@ enum of_gpio_flags {
> > OF_GPIO_SINGLE_ENDED = 0x2,
> > OF_GPIO_OPEN_DRAIN = 0x4,
> > OF_GPIO_SLEEP_MAY_LOSE_VALUE = 0x8,
> > +   OF_GPIO_RESET_TOLERANT = 0x16,
> 
> Now you're mixing up decimal and hex.

Ugh. Whoops.



Re: [RFC PATCH 3/5] gpio: gpiolib: Add chardev support for maintaining GPIO values on reset

2017-10-20 Thread Linus Walleij
I paged Bartosz and Michael on this, they are experts on the use cases for
the character device and their opinions are likely more valuable than mine.

On Fri, Oct 20, 2017 at 5:37 AM, Andrew Jeffery  wrote:
> Similar to devicetree support, add flags and mappings to expose reset
> tolerance configuration through the chardev interface.
>
> Signed-off-by: Andrew Jeffery 
(...)

> +* Unconditionally configure reset tolerance, as it's possible
> +* that the tolerance flag itself becomes tolerant to resets.
> +* Thus it could remain set from a previous environment, but
> +* the current environment may not expect it so.
> +*/
> +   ret = gpiod_set_reset_tolerant(desc,
> +   !!(lflags & GPIOHANDLE_REQUEST_RESET_TOLERANT));
> +   if (ret < 0)
> +   goto out_free_descs;

First, as noted in the first patch, IMO we should just go for persistence,
i.e. you want to flag to the system to keep the line persistent in any case,
no matter if the system goes to sleep or resets.

So the use case is going to be a control system or similar, a makerspace
project, an industrial product of some kind, driving GPIO from userspace.

I don't see it as helpful to give userspace control over whether the line
is persistent or not. It is more reasonable to assume persistence for
userspace use cases, don't you think? Whether the system goes to sleep
or the gpiochip resets should not make a door suddenly close or the
lights in the Christmas tree go out, right? I think if the gpiochip supports
persistence of any kind, we should try to use it and not have userspace
provide flags for that.

Yours,
Linus Walleij


Re: [RFC PATCH 2/5] gpio: gpiolib: Add OF support for maintaining GPIO values on reset

2017-10-20 Thread Linus Walleij
On Fri, Oct 20, 2017 at 5:37 AM, Andrew Jeffery  wrote:

> @@ -32,6 +32,7 @@ enum of_gpio_flags {
> OF_GPIO_SINGLE_ENDED = 0x2,
> OF_GPIO_OPEN_DRAIN = 0x4,
> OF_GPIO_SLEEP_MAY_LOSE_VALUE = 0x8,
> +   OF_GPIO_RESET_TOLERANT = 0x16,

Now you're mixing up decimal and hex.

Anyways, I do not think this is necessary.
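
On the hex/decimal mix-up itself, a small standalone sketch may help. This
is not the kernel header: the SKETCH_* names are hypothetical and only the
numeric values are taken from the quoted hunk. It assumes the intended value
was the next free bit, 0x10. As posted, 0x16 is 0x10 | 0x4 | 0x2, so it
collides with the single-ended and open-drain bits.

```c
#include <assert.h>

/* Values copied from the quoted hunk; names are hypothetical. */
enum sketch_gpio_flags {
	SKETCH_SINGLE_ENDED         = 0x2,
	SKETCH_OPEN_DRAIN           = 0x4,
	SKETCH_SLEEP_MAY_LOSE_VALUE = 0x8,
	SKETCH_RESET_TOLERANT_BAD   = 0x16, /* as posted: decimal 16 with a 0x prefix */
	SKETCH_RESET_TOLERANT       = 0x10, /* presumably intended: bit 4 */
};

/* Nonzero when v has exactly one bit set. */
#define SKETCH_IS_SINGLE_BIT(v) ((v) != 0 && (((v) & ((v) - 1)) == 0))

static_assert(SKETCH_IS_SINGLE_BIT(SKETCH_RESET_TOLERANT),
	      "a flag value must occupy exactly one bit");

/* Returns the already-defined flag bits that a candidate value overlaps. */
static unsigned int sketch_flag_overlap(unsigned int candidate)
{
	unsigned int existing = SKETCH_SINGLE_ENDED | SKETCH_OPEN_DRAIN |
				SKETCH_SLEEP_MAY_LOSE_VALUE;

	return candidate & existing;
}
```

A compile-time single-bit check like the static_assert above would have
rejected 0x16 before the patch was ever posted.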

Yours,
Linus Walleij


Re: [RFC PATCH 1/5] gpio: gpiolib: Add core support for maintaining GPIO values on reset

2017-10-20 Thread Linus Walleij
On Fri, Oct 20, 2017 at 5:37 AM, Andrew Jeffery  wrote:

> GPIO state reset tolerance is implemented in gpiolib through the
> addition of a new pinconf parameter. With that, some renaming of helpers
> is done to clarify the scope of the already existing
> gpiochip_line_is_persistent(), as it's now ambiguous as to whether that
> means on suspend, reset or both.

Isn't it most reasonable to say persistence covers both cases, reset
and/or sleep? This seems a bit overdefined.

So can we say that if this flag is set, the hardware and driver should
do their best to preserve the value across any system disruptions?

We can change the wording of course, patches welcome for that.

But do we really need to distinguish the cases of disruption and
whether we cover up for them or not?

I would say we can deal with that the day we have a system with
two register bits (or similar) where you can select to preserve across
sleep, reset, one or the other, AND there is also a usecase such that
a user wants to preserve the value across reset but not suspend or
vice versa.

I suspect that will not happen.

Yours,
Linus Walleij


Re: [PATCH 0/2] Make squashfs fragments' cache size more configurable

2017-10-20 Thread Phillip Lougher
On Thu, Oct 19, 2017 at 12:50 AM, Qixuan Wu  wrote:
> Hi All,
>
> Currently, squashfs fragments' cache size is only determined by
> config option CONFIG_SQUASHFS_FRAGMENT_CACHE_SIZE. Users have
> no way to change the value when they get the binary kernel.

Thank you for the patches, but they're both pointless and dangerous.
Let's be clear here: you're trying to change an "expert only" kernel
configuration option into a user-changeable option.  This is stupid
because it is not meant for non-experts to change, for good reason.

The fragment cache size isn't some small tweak to the operation of
Squashfs, it fundamentally affects both the performance and memory
overhead of Squashfs.  As such right from its introduction in 2003, it
has been an "expert only" configuration option at build time.  Even
then it is made clear that the default has been carefully chosen, and
it should only be changed in exceptional circumstances.  This
basically means don't change the default unless you really know what
you're doing, and this means tracing of Squashfs against your use-case
to determine caching behaviour.  There is absolutely no other reason
why you'd want to change the default.  This also means it should be
restricted to kernel configuration time only.

Let's be clear again, very few people should ever want to change the
default, and for the "experts" that do want to do so, they can do so
when configuring the kernel.  If you're not in a position to change it
at kernel configuration time then by definition you're not an expert,
and you shouldn't be able to change it anyway and certainly not as a
user.

There is absolutely no use-case here to make this a user changeable
option.  I can see no upsides in doing this, only downsides.

Frankly if you need to change this value at module insert time then
there is something wrong with your system or build process.   If you
want this because you want to build the kernel/modules once, and then
post-facto configure them for various products then it is your build
process that is broken.   If you want this because you want to
dynamically change Squashfs memory usage/caching behaviour post kernel
configuration time it suggests you're trying to adapt Squashfs's
footprint based on available memory.   This is an abuse of the option
as it's only meant to be used after detailed tracing/analysis and
certainly not used to accommodate unforeseen dynamic low memory
situations, and if that's the reason for needing this option, you
should be looking to solve it elsewhere.

Ultimately this has been an "expert" kernel-configuration-only option
since its introduction in 2003, and I have never been asked to change it,
and I believe this is because people recognise it as such.  I suspect
you're trying to change this for fundamentally bogus reasons.
Moreover Squashfs is used in many different use-cases and
distributions, and I'm not going to make this a user-changeable option
allowing users to insert the Squashfs module in such a way that will
break its performance.

So NACK.

Phillip Lougher (Squashfs maintainer)

> Now make it be configured when booting or inserting module.
> Actually, it's better that a config option in a number format
> in .config file can be reconfigured during booting or inserting
> module.
>
> Thanks
> Qixuan
>
> Qixuan Wu (2):
>   Squashfs: Let the number of fragments cached configurable
>   Documentation/kernel-parameters.txt: Add kernel parameter of squashfs
> fragments' cache size
>
>  Documentation/admin-guide/kernel-parameters.txt |  7 +++++++
>  fs/squashfs/super.c                             | 43 ++++++++++++++++++++++++++++++++++++++++++-
>  2 files changed, 49 insertions(+), 1 deletion(-)
>
> --
> 2.7.4
>


Re: [PATCH 0/12] PM / sleep: Driver flags for system suspend/resume

2017-10-20 Thread Ulf Hansson
[...]

>>>>> In this regards as we consider genpd being a trivial PM domain, those
>>>>> examples your bring up above is too me also examples of trivial PM
>>>>> domains. Especially because they don't deal with wakeups, as that is
>>>>> taken care of by the drivers, right!?
>>>>
>>>> Not directly, for example, omap device framework has noirq callback
>>>> implemented which forcibly disable all devices which are not PM runtime
>>>> suspended. while doing this it calls drivers PM .runtime_suspend() which
>>>> may return non 0 value and in this case device will be left enabled
>>>> (powered) at suspend for wake up purposes (see _od_suspend_noirq()).
>>>>
>>>
>>> Yeah, I had that feeling that omap has some trickyness going on. :-)
>>>
>>> I'm sure that can be fixed in the omap PM domain, although
>>
>> ...slipped with my fingers.. here is the rest of the reply...
>>
>> ..of course that requires us to use another way for drivers to signal
>> to the omap PM domain that it needs to stay powered so as to deal with
>> wakeup.
>>
>> I can have a look at that more closely, to see if it makes sense to change.
>>
>
> Also, an additional note here: some IPs are reused between
> OMAP/Davinci/Keystone. The OMAP PM domain has some code running at noirq
> time to deal with devices left in the PM runtime enabled state (OMAP is PM
> runtime centric), while Davinci/Keystone haven't (clock_ops.c), so the
> pm_runtime_force_* API is actually a possibility now to make the same
> driver work on all these platforms.

That sounds great!

Also, in the end it would be nice to also convert the OMAP PM domain
to genpd. I think most of the needed infrastructure is already there
to do that.

Kind regards
Uffe