[PATCH v3] mm: Fix phys_to_target_node() and memory_add_physaddr_to_nid() exports

2020-11-03 Thread Dan Williams
The core-mm has a default __weak implementation of phys_to_target_node()
to mirror the weak definition of memory_add_physaddr_to_nid(). That
symbol is exported for modules. However, while the export in
mm/memory_hotplug.c exported the symbol in the configuration cases of:

CONFIG_NUMA_KEEP_MEMINFO=y
CONFIG_MEMORY_HOTPLUG=y

...and:

CONFIG_NUMA_KEEP_MEMINFO=n
CONFIG_MEMORY_HOTPLUG=y

...it failed to export the symbol in the case of:

CONFIG_NUMA_KEEP_MEMINFO=y
CONFIG_MEMORY_HOTPLUG=n

Not only is that broken, but Christoph points out that the kernel should
not be exporting any __weak symbol, which means that the
memory_add_physaddr_to_nid() example that phys_to_target_node() copied
is broken too.

Rework the definition of phys_to_target_node() and
memory_add_physaddr_to_nid() to not require weak symbols. Move to the
common arch override design-pattern of an asm header defining a symbol
to replace the default implementation.
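
A minimal userspace sketch of that override pattern follows, for
illustration only. ARCH_HAS_OVERRIDE is a hypothetical switch that stands
in for "the arch's asm header was included", uint64_t stands in for the
kernel's u64 and phys_addr_t, and the silent "node 0" fallback is an
assumption of this sketch rather than a copy of the patch:

```c
#include <stdint.h>

/*
 * Stand-in for an arch header such as asm/sparsemem.h. An architecture
 * with its own implementation declares it and defines a macro of the
 * same name to claim the symbol.
 */
#ifdef ARCH_HAS_OVERRIDE
int memory_add_physaddr_to_nid(uint64_t start);
#define memory_add_physaddr_to_nid memory_add_physaddr_to_nid
int phys_to_target_node(uint64_t start);
#define phys_to_target_node phys_to_target_node
#endif

/*
 * Stand-in for the generic header: provide a static inline default only
 * when no arch header claimed the name. No __weak symbol is involved,
 * so there is nothing weak to export.
 */
#ifndef memory_add_physaddr_to_nid
static inline int memory_add_physaddr_to_nid(uint64_t start)
{
	(void)start;
	return 0;	/* assumed fallback: report node 0 */
}
#endif

#ifndef phys_to_target_node
static inline int phys_to_target_node(uint64_t start)
{
	(void)start;
	return 0;	/* assumed fallback: report node 0 */
}
#endif
```

Built without ARCH_HAS_OVERRIDE, callers get the inline defaults; built
with it, the same call sites resolve to the arch's out-of-line functions,
which the arch can then export as ordinary (non-weak) symbols.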

The only common header that all memory_add_physaddr_to_nid() producing
architectures implement is asm/sparsemem.h. In fact, powerpc already
defines its memory_add_physaddr_to_nid() helper in sparsemem.h.
Double-down on that observation and define phys_to_target_node() where
necessary in asm/sparsemem.h. An alternate consideration that was
discarded was to put this override in asm/numa.h, but that entangles
with the definition of MAX_NUMNODES relative to the inclusion of
linux/nodemask.h, and requires powerpc to grow a new header.

The dependency on NUMA_KEEP_MEMINFO for DEV_DAX_HMEM_DEVICES is invalid
now that the symbol is properly exported / stubbed in all combinations
of CONFIG_NUMA_KEEP_MEMINFO and CONFIG_MEMORY_HOTPLUG.

Reported-by: Randy Dunlap 
Tested-by: Randy Dunlap 
Reported-by: Thomas Gleixner 
Tested-by: Thomas Gleixner 
Reviewed-by: Thomas Gleixner 
Reported-by: kernel test robot 
Reported-by: Christoph Hellwig 
Reviewed-by: Christoph Hellwig 
Fixes: a035b6bf863e ("mm/memory_hotplug: introduce default phys_to_target_node() implementation")
Cc: Joao Martins 
Cc: Andrew Morton 
Cc: x...@kernel.org
Cc: Tony Luck 
Cc: Fenghua Yu 
Cc: Michael Ellerman 
Cc: Benjamin Herrenschmidt 
Cc: Paul Mackerras 
Cc: Vishal Verma 
Signed-off-by: Dan Williams 
---
Changes since v2 [1]:
- Fixed build on archs that don't have asm/sparsemem.h, but do use
  linux/numa.h (kbuild-robot)
- Fixed header include paths that need linux/printk.h for the
  pr_info_once() in the default implementations. (kbuild-robot)

[1]: http://lore.kernel.org/r/capcyv4gj9ibfuby1yt79cdkrgyaftdvext1ow4qvyrxri4j...@mail.gmail.com

 arch/ia64/include/asm/sparsemem.h    |    6 ++
 arch/powerpc/include/asm/sparsemem.h |    3 ++-
 arch/x86/include/asm/sparsemem.h     |   10 ++
 arch/x86/mm/numa.c                   |    2 ++
 drivers/dax/Kconfig                  |    1 -
 include/linux/memory_hotplug.h       |   14 --
 include/linux/numa.h                 |   30 +-
 mm/memory_hotplug.c                  |   18 --
 8 files changed, 49 insertions(+), 35 deletions(-)

diff --git a/arch/ia64/include/asm/sparsemem.h b/arch/ia64/include/asm/sparsemem.h
index 336d0570e1fa..dd8c166ffd7b 100644
--- a/arch/ia64/include/asm/sparsemem.h
+++ b/arch/ia64/include/asm/sparsemem.h
@@ -18,4 +18,10 @@
 #endif
 
 #endif /* CONFIG_SPARSEMEM */
+
+#ifdef CONFIG_MEMORY_HOTPLUG
+int memory_add_physaddr_to_nid(u64 addr);
+#define memory_add_physaddr_to_nid memory_add_physaddr_to_nid
+#endif
+
 #endif /* _ASM_IA64_SPARSEMEM_H */
diff --git a/arch/powerpc/include/asm/sparsemem.h b/arch/powerpc/include/asm/sparsemem.h
index 1e6fa371cc38..52519d2c5713 100644
--- a/arch/powerpc/include/asm/sparsemem.h
+++ b/arch/powerpc/include/asm/sparsemem.h
@@ -16,6 +16,8 @@
 extern int create_section_mapping(unsigned long start, unsigned long end,
  int nid, pgprot_t prot);
 extern int remove_section_mapping(unsigned long start, unsigned long end);
+extern int memory_add_physaddr_to_nid(u64 start);
+#define memory_add_physaddr_to_nid memory_add_physaddr_to_nid
 
 #ifdef CONFIG_NUMA
 extern int hot_add_scn_to_nid(unsigned long scn_addr);
@@ -26,6 +28,5 @@ static inline int hot_add_scn_to_nid(unsigned long scn_addr)
 }
 #endif /* CONFIG_NUMA */
 #endif /* CONFIG_MEMORY_HOTPLUG */
-
 #endif /* __KERNEL__ */
 #endif /* _ASM_POWERPC_SPARSEMEM_H */
diff --git a/arch/x86/include/asm/sparsemem.h b/arch/x86/include/asm/sparsemem.h
index 6bfc878f6771..6a9ccc1b2be5 100644
--- a/arch/x86/include/asm/sparsemem.h
+++ b/arch/x86/include/asm/sparsemem.h
@@ -28,4 +28,14 @@
 #endif
 
 #endif /* CONFIG_SPARSEMEM */
+
+#ifndef __ASSEMBLY__
+#ifdef CONFIG_NUMA_KEEP_MEMINFO
+extern int phys_to_target_node(phys_addr_t start);
+#define phys_to_target_node phys_to_target_node
+extern int memory_add_physaddr_to_nid(u64 start);
+#define memory_add_physaddr_to_nid memory_add_physaddr_to_nid
+#endif
+#endif /* __ASSEMBLY__ */
+
 #endif /* 

___
Linux-nvdimm mailing list -- linux-nvdimm@lists.01.org
To unsubscribe send an email to linux-nvdimm-le...@lists.01.org


Re: [PATCH] x86/mm: Fix phys_to_target_node() export

2020-11-03 Thread Dan Williams
On Tue, Nov 3, 2020 at 5:38 PM Andrew Morton  wrote:
>
> On Mon, 2 Nov 2020 15:52:39 -0800 Dan Williams wrote:
>
> > The attached patch is going through some kbuild-robot exposure to make
> > sure I did not break anything else.
>
> I'll duck this for now - please send it along formally if/when testing
> is successful.

Yeah, the robots are angry, some reworks needed.


Re: [PATCH] x86/mm: Fix phys_to_target_node() export

2020-11-03 Thread Andrew Morton
On Mon, 2 Nov 2020 15:52:39 -0800 Dan Williams  wrote:

> The attached patch is going through some kbuild-robot exposure to make
> sure I did not break anything else.

I'll duck this for now - please send it along formally if/when testing
is successful.


Re: [PATCH V2 05/10] x86/pks: Add PKS kernel API

2020-11-03 Thread Ira Weiny
On Tue, Nov 03, 2020 at 07:14:07PM +0100, Greg KH wrote:
> On Tue, Nov 03, 2020 at 09:53:36AM -0800, Ira Weiny wrote:
> > On Tue, Nov 03, 2020 at 07:50:24AM +0100, Greg KH wrote:
> > > On Mon, Nov 02, 2020 at 12:53:15PM -0800, ira.we...@intel.com wrote:
> > > > From: Fenghua Yu 
> > > > 
> > 
> > [snip]
> > 
> > > > diff --git a/include/linux/pkeys.h b/include/linux/pkeys.h
> > > > index 2955ba976048..0959a4c0ca64 100644
> > > > --- a/include/linux/pkeys.h
> > > > +++ b/include/linux/pkeys.h
> > > > @@ -50,4 +50,28 @@ static inline void copy_init_pkru_to_fpregs(void)
> > > >  
> > > >  #endif /* ! CONFIG_ARCH_HAS_PKEYS */
> > > >  
> > > > +#define PKS_FLAG_EXCLUSIVE 0x00
> > > > +
> > > > +#ifndef CONFIG_ARCH_HAS_SUPERVISOR_PKEYS
> > > > +static inline int pks_key_alloc(const char * const pkey_user, int flags)
> > > > +{
> > > > +   return -EOPNOTSUPP;
> > > > +}
> > > > +static inline void pks_key_free(int pkey)
> > > > +{
> > > > +}
> > > > +static inline void pks_mk_noaccess(int pkey)
> > > > +{
> > > > +   WARN_ON_ONCE(1);
> > > 
> > > So for panic-on-warn systems, this is ok to reboot the box?
> > 
> > I would not expect this to reboot the box no.  But it is a violation of the API
> > contract.  If pks_key_alloc() returns an error calling any of the other
> > functions is an error.
> > 
> > > 
> > > Are you sure, that feels odd...
> > 
> > It does feel odd and downright wrong...  But there are a lot of WARN_ON_ONCE's
> > out there to catch this type of internal programming error.  Is panic-on-warn
> > commonly used?
> 
> Yes it is, and we are trying to recover from that as it is something
> that you should recover from.  Properly handle the error and move on.

Sorry, I did not know that...  Ok I'll look at the series because I probably
have others I need to change.

Thanks,
Ira


Re: [PATCH V2 05/10] x86/pks: Add PKS kernel API

2020-11-03 Thread Greg KH
On Tue, Nov 03, 2020 at 09:53:36AM -0800, Ira Weiny wrote:
> On Tue, Nov 03, 2020 at 07:50:24AM +0100, Greg KH wrote:
> > On Mon, Nov 02, 2020 at 12:53:15PM -0800, ira.we...@intel.com wrote:
> > > From: Fenghua Yu 
> > > 
> 
> [snip]
> 
> > > diff --git a/include/linux/pkeys.h b/include/linux/pkeys.h
> > > index 2955ba976048..0959a4c0ca64 100644
> > > --- a/include/linux/pkeys.h
> > > +++ b/include/linux/pkeys.h
> > > @@ -50,4 +50,28 @@ static inline void copy_init_pkru_to_fpregs(void)
> > >  
> > >  #endif /* ! CONFIG_ARCH_HAS_PKEYS */
> > >  
> > > +#define PKS_FLAG_EXCLUSIVE 0x00
> > > +
> > > +#ifndef CONFIG_ARCH_HAS_SUPERVISOR_PKEYS
> > > +static inline int pks_key_alloc(const char * const pkey_user, int flags)
> > > +{
> > > + return -EOPNOTSUPP;
> > > +}
> > > +static inline void pks_key_free(int pkey)
> > > +{
> > > +}
> > > +static inline void pks_mk_noaccess(int pkey)
> > > +{
> > > + WARN_ON_ONCE(1);
> > 
> > So for panic-on-warn systems, this is ok to reboot the box?
> 
> I would not expect this to reboot the box no.  But it is a violation of the API
> contract.  If pks_key_alloc() returns an error calling any of the other
> functions is an error.
> 
> > 
> > Are you sure, that feels odd...
> 
> It does feel odd and downright wrong...  But there are a lot of WARN_ON_ONCE's
> out there to catch this type of internal programming error.  Is panic-on-warn
> commonly used?

Yes it is, and we are trying to recover from that as it is something
that you should recover from.  Properly handle the error and move on.

thanks,

greg k-h
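
For illustration, a userspace sketch of the "handle the error and move on"
approach is below. The three pks_* stubs mirror the quoted
!CONFIG_ARCH_HAS_SUPERVISOR_PKEYS case, except that pks_mk_noaccess() is a
quiet no-op instead of calling WARN_ON_ONCE(). protect_region() is a
hypothetical caller invented for this sketch, and none of this is claimed
to be what the series ended up doing:

```c
#include <errno.h>
#include <stdio.h>

/* Stubs for a build without supervisor pkeys support. */
static inline int pks_key_alloc(const char *pkey_user, int flags)
{
	(void)pkey_user;
	(void)flags;
	return -EOPNOTSUPP;
}

static inline void pks_key_free(int pkey)
{
	(void)pkey;
}

static inline void pks_mk_noaccess(int pkey)
{
	(void)pkey;	/* quiet no-op: nothing here can trip panic-on-warn */
}

/* Hypothetical caller: check the allocation result before using the key. */
static int protect_region(const char *name)
{
	int pkey = pks_key_alloc(name, 0);

	if (pkey < 0) {
		fprintf(stderr, "%s: no pkey available (%d)\n", name, pkey);
		return pkey;	/* report the error and carry on */
	}

	pks_mk_noaccess(pkey);
	pks_key_free(pkey);
	return 0;
}
```

With these stubs, protect_region() returns -EOPNOTSUPP and never reaches
pks_mk_noaccess(), so a panic-on-warn system has no warning to react to.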


Re: [PATCH V2 05/10] x86/pks: Add PKS kernel API

2020-11-03 Thread Ira Weiny
On Tue, Nov 03, 2020 at 07:50:24AM +0100, Greg KH wrote:
> On Mon, Nov 02, 2020 at 12:53:15PM -0800, ira.we...@intel.com wrote:
> > From: Fenghua Yu 
> > 

[snip]

> > diff --git a/include/linux/pkeys.h b/include/linux/pkeys.h
> > index 2955ba976048..0959a4c0ca64 100644
> > --- a/include/linux/pkeys.h
> > +++ b/include/linux/pkeys.h
> > @@ -50,4 +50,28 @@ static inline void copy_init_pkru_to_fpregs(void)
> >  
> >  #endif /* ! CONFIG_ARCH_HAS_PKEYS */
> >  
> > +#define PKS_FLAG_EXCLUSIVE 0x00
> > +
> > +#ifndef CONFIG_ARCH_HAS_SUPERVISOR_PKEYS
> > +static inline int pks_key_alloc(const char * const pkey_user, int flags)
> > +{
> > +   return -EOPNOTSUPP;
> > +}
> > +static inline void pks_key_free(int pkey)
> > +{
> > +}
> > +static inline void pks_mk_noaccess(int pkey)
> > +{
> > +   WARN_ON_ONCE(1);
> 
> So for panic-on-warn systems, this is ok to reboot the box?

I would not expect this to reboot the box no.  But it is a violation of the API
contract.  If pks_key_alloc() returns an error calling any of the other
functions is an error.

> 
> Are you sure, that feels odd...

It does feel odd and downright wrong...  But there are a lot of WARN_ON_ONCE's
out there to catch this type of internal programming error.  Is panic-on-warn
commonly used?

Ira


Re: [PATCH v6 0/6] mm: introduce memfd_secret system call to create "secret" memory areas

2020-11-03 Thread Mike Rapoport
On Tue, Nov 03, 2020 at 02:52:14PM +0100, Hagen Paul Pfeifer wrote:
> > On 11/02/2020 4:40 PM Mike Rapoport  wrote:
> 
> > > Isn't memfd_secret currently *unnecessarily* designed to be a "one task
> > > feature"? memfd_secret fulfills exactly two (generic) features:
> > > 
> > > - address space isolation from kernel (aka SECRET_EXCLUSIVE, not in kernel's
> > >   direct map) - hide from kernel, great
> > > - disabling processor's memory caches against speculative-execution vulnerabilities
> > >   (spectre and friends, aka SECRET_UNCACHED), also great
> > > 
> > > But, what about the following use-case: implementing a hardened IPC mechanism
> > > where even the kernel is not aware of any data and optionally via SECRET_UNCACHED
> > > even the hardware caches are bypassed! With the patches we are so close to
> > > achieving this.
> > > 
> > > How? Shared, SECRET_EXCLUSIVE and SECRET_UNCACHED mmaped pages for IPC
> > > involved tasks required to know this mapping (and memfd_secret fd). After IPC
> > > is done, tasks can copy sensitive data from IPC pages into memfd_secret()
> > > pages, un-sensitive data can be used/copied everywhere.
> > 
> > As long as the tasks share the file descriptor, they can share the
> > secretmem pages, pretty much like normal memfd.
> 
> Including process_vm_readv() and process_vm_writev()? Let's take a hypothetical
> "dbus-daemon-secure" service that receives data from process A and wants to
> copy/distribute it to data areas of N other processes. Much like dbus but without
> SOCK_DGRAM rather direct copy into secretmem/mmap pages (ring-buffer). Should be
> possible, right?

I'm not sure I follow you here.
For process_vm_readv() and process_vm_writev() secretmem will be only
accessible on the local part, but not on the remote.
So copying data to secretmem pages using process_vm_writev() wouldn't
work.

> > > One missing piece is still the secure zeroization of the page(s) if the
> > > mapping is closed by last process to guarantee a secure cleanup. This can
> > > probably be done as a general mmap feature, not coupled to memfd_secret() and
> > > can be done independently ("reverse" MAP_UNINITIALIZED feature).
> > 
> > There are "init_on_alloc" and "init_on_free" kernel parameters that
> > enable zeroing of the pages on alloc and on free globally.
> > Anyway, I'll add zeroing of the freed memory to secretmem.
> 
> Great, this allows page-specific (thus runtime-performance-optimized) zeroing
> of secured pages. init_on_free lowers the performance too much and is not precise
> enough.
> 
> Hagen

-- 
Sincerely yours,
Mike.


Re: [PATCH v6 0/6] mm: introduce memfd_secret system call to create "secret" memory areas

2020-11-03 Thread Hagen Paul Pfeifer
> On 11/02/2020 4:40 PM Mike Rapoport  wrote:

> > Isn't memfd_secret currently *unnecessarily* designed to be a "one task
> > feature"? memfd_secret fulfills exactly two (generic) features:
> > 
> > - address space isolation from kernel (aka SECRET_EXCLUSIVE, not in kernel's
> >   direct map) - hide from kernel, great
> > - disabling processor's memory caches against speculative-execution vulnerabilities
> >   (spectre and friends, aka SECRET_UNCACHED), also great
> > 
> > But, what about the following use-case: implementing a hardened IPC mechanism
> > where even the kernel is not aware of any data and optionally via SECRET_UNCACHED
> > even the hardware caches are bypassed! With the patches we are so close to
> > achieving this.
> > 
> > How? Shared, SECRET_EXCLUSIVE and SECRET_UNCACHED mmaped pages for IPC
> > involved tasks required to know this mapping (and memfd_secret fd). After IPC
> > is done, tasks can copy sensitive data from IPC pages into memfd_secret()
> > pages, un-sensitive data can be used/copied everywhere.
> 
> As long as the tasks share the file descriptor, they can share the
> secretmem pages, pretty much like normal memfd.

Including process_vm_readv() and process_vm_writev()? Let's take a hypothetical
"dbus-daemon-secure" service that receives data from process A and wants to
copy/distribute it to data areas of N other processes. Much like dbus but without
SOCK_DGRAM rather direct copy into secretmem/mmap pages (ring-buffer). Should be
possible, right?

> > One missing piece is still the secure zeroization of the page(s) if the
> > mapping is closed by last process to guarantee a secure cleanup. This can
> > probably be done as a general mmap feature, not coupled to memfd_secret() and
> > can be done independently ("reverse" MAP_UNINITIALIZED feature).
> 
> There are "init_on_alloc" and "init_on_free" kernel parameters that
> enable zeroing of the pages on alloc and on free globally.
> Anyway, I'll add zeroing of the freed memory to secretmem.

Great, this allows page-specific (thus runtime-performance-optimized) zeroing
of secured pages. init_on_free lowers the performance too much and is not precise
enough.

Hagen
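
As a loose userspace analogy of the per-object zeroing discussed above
(as opposed to a global init_on_free), the sketch below wipes only the
buffers that held sensitive data before releasing them. struct secret_buf
and its helpers are hypothetical names invented for this example; it says
nothing about how secretmem itself would implement zeroing:

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical container for one sensitive allocation. */
struct secret_buf {
	unsigned char *data;
	size_t len;
};

/* Allocate a zero-filled buffer; returns 0 on success, -1 on failure. */
static int secret_buf_alloc(struct secret_buf *b, size_t len)
{
	b->data = calloc(1, len);
	b->len = b->data ? len : 0;
	return b->data ? 0 : -1;
}

/* Overwrite through a volatile pointer so the stores are not optimized out. */
static void secret_buf_wipe(struct secret_buf *b)
{
	volatile unsigned char *p = b->data;

	for (size_t i = 0; i < b->len; i++)
		p[i] = 0;
}

/* Wipe first, then release: only this buffer pays the zeroing cost. */
static void secret_buf_free(struct secret_buf *b)
{
	secret_buf_wipe(b);
	free(b->data);
	b->data = NULL;
	b->len = 0;
}
```

The cost of the wipe is paid only for buffers freed through
secret_buf_free(); every other allocation in the program is released
without the extra pass.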


Re: [PATCH] x86/mm: Fix phys_to_target_node() export

2020-11-03 Thread Thomas Gleixner
On Mon, Nov 02 2020 at 15:52, Dan Williams wrote:
> On Sat, Oct 31, 2020 at 2:10 AM Christoph Hellwig  wrote:
> The dependency on NUMA_KEEP_MEMINFO for DEV_DAX_HMEM_DEVICES is invalid
> now that the symbol is properly exported / stubbed in all combinations
> of CONFIG_NUMA_KEEP_MEMINFO and CONFIG_MEMORY_HOTPLUG.
>
> Reported-by: Randy Dunlap 
> Reported-by: Thomas Gleixner 
> Reported-by: kernel test robot 
> Reported-by: Christoph Hellwig 
> Fixes: a035b6bf863e ("mm/memory_hotplug: introduce default phys_to_target_node() implementation")
> Cc: Joao Martins 
> Cc: Andrew Morton 
> Cc: x...@kernel.org
> Cc: Tony Luck 
> Cc: Fenghua Yu 
> Cc: Michael Ellerman 
> Cc: Benjamin Herrenschmidt 
> Cc: Paul Mackerras 
> Cc: Vishal Verma 
> Signed-off-by: Dan Williams 

Tested-by: Thomas Gleixner 

Reviewed-by: Thomas Gleixner 


Re: [PATCH v6 0/6] mm: introduce memfd_secret system call to create "secret" memory areas

2020-11-03 Thread David Hildenbrand

On 03.11.20 10:52, Mike Rapoport wrote:
> On Mon, Nov 02, 2020 at 06:51:09PM +0100, David Hildenbrand wrote:
> > > > Assume you have a system with quite some ZONE_MOVABLE memory (esp. in
> > > > virtualized environments), eating up a significant amount of !ZONE_MOVABLE
> > > > memory dynamically at runtime can lead to non-obvious issues. It looks like
> > > > you have plenty of free memory, but the kernel might still OOM when trying
> > > > to do kernel allocations e.g., for pagetables. With CMA we at least know
> > > > what we're dealing with - it behaves like ZONE_MOVABLE except for the owner
> > > > that can place unmovable pages there. We can use it to compute statically
> > > > the amount of ZONE_MOVABLE memory we can have in the system without doing
> > > > harm to the system.
> > >
> > > Why would you say that secretmem allocates from !ZONE_MOVABLE?
> > > If we put boot time reservations aside, the memory allocation for
> > > secretmem follows the same rules as the memory allocations for any file
> > > descriptor. That means we allocate memory with GFP_HIGHUSER_MOVABLE.
> >
> > Oh, okay - I missed that! I had the impression that pages are unmovable and
> > allocating from ZONE_MOVABLE would be a violation of that?
> >
> > > After the allocation the memory indeed becomes unmovable but it's not
> > > like we are eating memory from other zones here.
> >
> > ... and here you have your problem. That's a no-no. We only allow it in very
> > special cases where it can't be avoided - e.g., vfio having to pin guest
> > memory when passing through memory to VMs.
> >
> > Hotplug memory, online it to ZONE_MOVABLE. Allocate secretmem. Try to unplug
> > the memory again -> endless loop in offline_pages().
> >
> > Or have a CMA area that gets used with GFP_HIGHUSER_MOVABLE. Allocate
> > secretmem. The owner of the area tries to allocate memory - always fails.
> > Purpose of CMA destroyed.
> >
> > > > Ideally, we would want to support page migration/compaction and allow for
> > > > allocation from ZONE_MOVABLE as well. Would involve temporarily mapping,
> > > > copying, unmapping. Sounds feasible, but not sure which roadblocks we would
> > > > find on the way.
> > >
> > > We can support migration/compaction with temporary mapping. The first
> > > roadblock I've hit there was that migration allocates 4K destination
> > > page and if we use it in secret map we are back to scrambling the direct
> > > map into 4K pieces. It still sounds feasible but not as trivial :)
> >
> > That sounds like the proper way for me to do it then.
>
> Although migration of secretmem pages sounds feasible now, there may be
> other issues I didn't see because I'm not very familiar with
> migration/compaction code.


Migration of PMDs might also be feasible -  and it would be even 
cleaner. But I agree that that might require more work and starting with 
something simpler (!movable) is the right way to move forward.


--
Thanks,

David / dhildenb


Re: [PATCH v6 0/6] mm: introduce memfd_secret system call to create "secret" memory areas

2020-11-03 Thread Mike Rapoport
On Mon, Nov 02, 2020 at 06:51:09PM +0100, David Hildenbrand wrote:
> > > Assume you have a system with quite some ZONE_MOVABLE memory (esp. in
> > > virtualized environments), eating up a significant amount of !ZONE_MOVABLE
> > > memory dynamically at runtime can lead to non-obvious issues. It looks like
> > > you have plenty of free memory, but the kernel might still OOM when trying
> > > to do kernel allocations e.g., for pagetables. With CMA we at least know
> > > what we're dealing with - it behaves like ZONE_MOVABLE except for the owner
> > > that can place unmovable pages there. We can use it to compute statically
> > > the amount of ZONE_MOVABLE memory we can have in the system without doing
> > > harm to the system.
> > 
> > Why would you say that secretmem allocates from !ZONE_MOVABLE?
> > If we put boot time reservations aside, the memory allocation for
> > secretmem follows the same rules as the memory allocations for any file
> > descriptor. That means we allocate memory with GFP_HIGHUSER_MOVABLE.
> 
> Oh, okay - I missed that! I had the impression that pages are unmovable and
> allocating from ZONE_MOVABLE would be a violation of that?
> 
> > After the allocation the memory indeed becomes unmovable but it's not
> > like we are eating memory from other zones here.
> 
> ... and here you have your problem. That's a no-no. We only allow it in very
> special cases where it can't be avoided - e.g., vfio having to pin guest
> memory when passing through memory to VMs.
> 
> Hotplug memory, online it to ZONE_MOVABLE. Allocate secretmem. Try to unplug
> the memory again -> endless loop in offline_pages().
> 
> Or have a CMA area that gets used with GFP_HIGHUSER_MOVABLE. Allocate
> secretmem. The owner of the area tries to allocate memory - always fails.
> Purpose of CMA destroyed.
> 
> > 
> > > Ideally, we would want to support page migration/compaction and allow for
> > > allocation from ZONE_MOVABLE as well. Would involve temporarily mapping,
> > > copying, unmapping. Sounds feasible, but not sure which roadblocks we would
> > > find on the way.
> > 
> > We can support migration/compaction with temporary mapping. The first
> > roadblock I've hit there was that migration allocates 4K destination
> > page and if we use it in secret map we are back to scrambling the direct
> > map into 4K pieces. It still sounds feasible but not as trivial :)
> 
> That sounds like the proper way for me to do it then.
 
Although migration of secretmem pages sounds feasible now, there may be
other issues I didn't see because I'm not very familiar with
migration/compaction code.

I've looked again at CMA and I'm inclined to agree with you that using
CMA for secretmem allocations could be the right thing. 

-- 
Sincerely yours,
Mike.


Re: [PATCH] x86/mm: Fix phys_to_target_node() export

2020-11-03 Thread Christoph Hellwig
This version looks sensible to me:

Reviewed-by: Christoph Hellwig 