Re: [PATCH] libnvdimm/namespace: Fix label tracking error

2019-04-19 Thread Sasha Levin
Hi, [This is an automated email] This commit has been processed because it contains a "Fixes:" tag, fixing commit: bf9bccc14c05 libnvdimm: pmem label sets and namespace instantiation. The bot has tested the following trees: v5.0.8, v4.19.35, v4.14.112, v4.9.169, v4.4.178, v5.0.8: Build OK!

[PATCH] libnvdimm/namespace: Fix label tracking error

2019-04-19 Thread Dan Williams
Users have reported intermittent occurrences of DIMM initialization failures due to duplicate allocations of address capacity detected in the labels, or errors of the form below, both have the same root cause. nd namespace1.4: failed to track label: 0 WARNING: CPU: 17 PID: 1381 at

[PATCH 5.0 057/246] mm/resource: Return real error codes from walk failures

2019-04-04 Thread Greg Kroah-Hartman
5.0-stable review patch. If anyone has any objections, please let me know. -- [ Upstream commit 5cd401ace914dc68556c6d2fcae0c349444d5f86 ] walk_system_ram_range() can return an error code either because *it* failed, or because the 'func' that it calls returned an error

Returned mail: Data format error

2019-04-01 Thread Bounced mail
: Please reply to postmas...@lists.01.org if you feel this message to be in error. ___ Linux-nvdimm mailing list Linux-nvdimm@lists.01.org https://lists.01.org/mailman/listinfo/linux-nvdimm

Returned mail: Data format error

2019-03-27 Thread kyle

[PATCH AUTOSEL 5.0 061/262] mm/resource: Return real error codes from walk failures

2019-03-27 Thread Sasha Levin
From: Dave Hansen [ Upstream commit 5cd401ace914dc68556c6d2fcae0c349444d5f86 ] walk_system_ram_range() can return an error code either because *it* failed, or because the 'func' that it calls returned an error. The memory hotplug does the following: ret = walk_system_ram_range

Mail System Error - Returned Mail

2019-03-21 Thread Automatic Email Delivery Software
Message could not be delivered

MAIL SYSTEM ERROR - RETURNED MAIL

2019-03-20 Thread Bounced mail
The original message was received at Wed, 20 Mar 2019 18:25:25 +0800 from lists.01.org [178.210.32.99] - The following addresses had permanent fatal errors - - Transcript of session follows - while talking to lists.01.org.: >>> MAIL From:"Bounced mail" <<< 501 "Bounced mail"

Mail System Error - Returned Mail

2019-03-18 Thread jacob . jun . pan

Returned mail: Data format error

2019-03-17 Thread Returned mail
The original message was included as attachment

MAIL SYSTEM ERROR - RETURNED MAIL

2019-03-11 Thread Post Office

error

2019-03-10 Thread MAILER-DAEMON
The original message was received at Mon, 11 Mar 2019 11:44:43 +0800 from lists.01.org [32.243.43.235] - The following addresses had permanent fatal errors -

Re: [PATCH] update NFIT flags error message

2019-03-01 Thread Dan Williams
On Thu, Feb 28, 2019 at 12:14 PM Toshi Kani wrote: > > ACPI NFIT flags field reports major errors on NVDIMM, which need > user's attention. > > Update the current log to a proper error message with dev_err(). > The current message string is kept for grep-compatibility. >

error

2019-03-01 Thread Mail Administrator

[PATCH] update NFIT flags error message

2019-02-28 Thread Toshi Kani
ACPI NFIT flags field reports major errors on NVDIMM, which need user's attention. Update the current log to a proper error message with dev_err(). The current message string is kept for grep-compatibility. Signed-off-by: Toshi Kani Cc: Dan Williams Cc: "Rafael J. Wysocki" Cc: Robe

Returned mail: Data format error

2019-02-26 Thread Returned mail

Re: [PATCH 1/5] mm/resource: return real error codes from walk failures

2019-02-25 Thread Christophe Leroy
On 25/02/2019 at 19:57, Dave Hansen wrote: From: Dave Hansen walk_system_ram_range() can return an error code either because *it* failed, or because the 'func' that it calls returned an error. The memory hotplug does the following: ret = walk_system_ram_range(..., func

[PATCH 1/5] mm/resource: return real error codes from walk failures

2019-02-25 Thread Dave Hansen
From: Dave Hansen walk_system_ram_range() can return an error code either because *it* failed, or because the 'func' that it calls returned an error. The memory hotplug does the following: ret = walk_system_ram_range(..., func); if (ret) return ret; and 'ret
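
A minimal sketch of the ambiguity this patch describes, assuming a platform where EPERM is 1 (as on Linux), so that a bare -1 reads as -EPERM. The names `walk_ranges`, `struct range`, `range_fn`, and `accept_range` are hypothetical stand-ins for illustration and are not the kernel's walk_system_ram_range() implementation:

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-ins for illustration; this is not kernel code. */
struct range {
    unsigned long start_pfn;
    unsigned long nr_pages;
};

typedef int (*range_fn)(unsigned long start_pfn, unsigned long nr_pages,
                        void *arg);

/*
 * Call fn on each range and stop at the first non-zero result.
 * internal_err is returned unchanged when no range was visited,
 * i.e. when the walker itself, not fn, is the one that failed.
 */
int walk_ranges(const struct range *r, size_t n, void *arg, range_fn fn,
                int internal_err)
{
    int ret = internal_err;
    size_t i;

    for (i = 0; i < n; i++) {
        ret = fn(r[i].start_pfn, r[i].nr_pages, arg);
        if (ret)
            break;
    }
    return ret;
}

/* Callback that always succeeds. */
int accept_range(unsigned long start_pfn, unsigned long nr_pages, void *arg)
{
    (void)start_pfn;
    (void)nr_pages;
    (void)arg;
    return 0;
}
```

With `internal_err` set to -1, an empty walk looks to the caller like a permission error; with -EINVAL the caller can at least tell that the walk itself was invalid.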

Returned mail: Data format error

2019-02-24 Thread Bounced mail
: Please reply to postmas...@lists.01.org if you feel this message to be in error.

Returned mail: Data format error

2019-02-22 Thread Automatic Email Delivery Software

Returned mail: Data format error

2019-02-22 Thread Returned mail
: Please reply to postmas...@lists.01.org if you feel this message to be in error.

Re: [PATCH 1/5] libnvdimm, namespace: release labels properly on error

2019-01-31 Thread Wei Yang
On Mon, Jan 28, 2019 at 08:30:14AM +0800, Wei Yang wrote: >In init_active_labels(), it iterates on ndr_mappings to create its >corresponding labels. When there is an error, it is supposed to release >those labels created. But current implementation doesn't handle this >well in

Re: [PATCH 1/5] mm/resource: return real error codes from walk failures

2019-01-28 Thread Michael Ellerman
Dave Hansen writes: > On 1/25/19 1:02 PM, Bjorn Helgaas wrote: >>> @@ -453,7 +453,7 @@ int walk_system_ram_range(unsigned long >>> unsigned long flags; >>> struct resource res; >>> unsigned long pfn, end_pfn; >>> - int ret = -1; >>> + int ret = -EINVAL; >> Can

[PATCH 1/5] libnvdimm, namespace: release labels properly on error

2019-01-27 Thread Wei Yang
In init_active_labels(), it iterates on ndr_mappings to create its corresponding labels. When there is an error, it is supposed to release those labels created. But current implementation doesn't handle this well in two aspects: * when error happens during ndd check, labels are not released

Re: [PATCH 1/5] mm/resource: return real error codes from walk failures

2019-01-25 Thread Bjorn Helgaas
On Fri, Jan 25, 2019 at 3:10 PM Dave Hansen wrote: > > On 1/25/19 1:02 PM, Bjorn Helgaas wrote: > >> @@ -453,7 +453,7 @@ int walk_system_ram_range(unsigned long > >> unsigned long flags; > >> struct resource res; > >> unsigned long pfn, end_pfn; > >> - int ret = -1;

Re: [PATCH 1/5] mm/resource: return real error codes from walk failures

2019-01-25 Thread Dave Hansen
On 1/25/19 1:02 PM, Bjorn Helgaas wrote: >> @@ -453,7 +453,7 @@ int walk_system_ram_range(unsigned long >> unsigned long flags; >> struct resource res; >> unsigned long pfn, end_pfn; >> - int ret = -1; >> + int ret = -EINVAL; > Can you either make a similar

Re: [PATCH 1/5] mm/resource: return real error codes from walk failures

2019-01-25 Thread Bjorn Helgaas
On Thu, Jan 24, 2019 at 5:21 PM Dave Hansen wrote: > > > From: Dave Hansen > > walk_system_ram_range() can return an error code either because *it* > failed, or because the 'func' that it calls returned an error. The > memory hotplug does the following: > >

[PATCH 1/5] mm/resource: return real error codes from walk failures

2019-01-24 Thread Dave Hansen
From: Dave Hansen walk_system_ram_range() can return an error code either because *it* failed, or because the 'func' that it calls returned an error. The memory hotplug does the following: ret = walk_system_ram_range(..., func); if (ret) return ret; and 'ret

Re: [PATCH] mm: hwpoison: use do_send_sig_info() instead of force_sig() (Re: PMEM error-handling forces SIGKILL causes kernel panic)

2019-01-17 Thread William Kucharski
> On Jan 16, 2019, at 6:07 PM, Jane Chu wrote: > > It's just coding style I'm used to, no big deal. > Up to you to decide. :) Personally I like a (void) cast as it's pretty long-standing syntactic sugar to cast a call that returns a value we don't care about to (void) to show we know it

Re: [PATCH] mm: hwpoison: use do_send_sig_info() instead of force_sig() (Re: PMEM error-handling forces SIGKILL causes kernel panic)

2019-01-16 Thread Jane Chu
On 1/16/2019 3:32 PM, Naoya Horiguchi wrote: Hi Jane, On Wed, Jan 16, 2019 at 09:56:02AM -0800, Jane Chu wrote: Hi, Naoya, On 1/16/2019 1:30 AM, Naoya Horiguchi wrote: diff --git a/mm/memory-failure.c b/mm/memory-failure.c index 7c72f2a95785..831be5ff5f4d 100644 ---

Re: [PATCH] mm: hwpoison: use do_send_sig_info() instead of force_sig() (Re: PMEM error-handling forces SIGKILL causes kernel panic)

2019-01-16 Thread Naoya Horiguchi
Hi Jane, On Wed, Jan 16, 2019 at 09:56:02AM -0800, Jane Chu wrote: > Hi, Naoya, > > On 1/16/2019 1:30 AM, Naoya Horiguchi wrote: > > diff --git a/mm/memory-failure.c b/mm/memory-failure.c > index 7c72f2a95785..831be5ff5f4d 100644 > --- a/mm/memory-failure.c > +++

Re: [PATCH 1/4] mm/resource: return real error codes from walk failures

2019-01-16 Thread Bjorn Helgaas
On Wed, Jan 16, 2019 at 12:25 PM Dave Hansen wrote: > > > From: Dave Hansen > > walk_system_ram_range() can return an error code either because *it* > failed, or because the 'func' that it calls returned an error. The > memory hotplug does the following: > >

[PATCH 1/4] mm/resource: return real error codes from walk failures

2019-01-16 Thread Dave Hansen
From: Dave Hansen walk_system_ram_range() can return an error code either because *it* failed, or because the 'func' that it calls returned an error. The memory hotplug does the following: ret = walk_system_ram_range(..., func); if (ret) return ret; and 'ret

Re: [PATCH] mm: hwpoison: use do_send_sig_info() instead of force_sig() (Re: PMEM error-handling forces SIGKILL causes kernel panic)

2019-01-16 Thread Jane Chu
Hi, Naoya, On 1/16/2019 1:30 AM, Naoya Horiguchi wrote: diff --git a/mm/memory-failure.c b/mm/memory-failure.c index 7c72f2a95785..831be5ff5f4d 100644 --- a/mm/memory-failure.c +++ b/mm/memory-failure.c @@ -372,7 +372,8 @@ static void kill_procs(struct list_head *to_kill, int forcekill, bool

Re: [PATCH] mm: hwpoison: use do_send_sig_info() instead of force_sig() (Re: PMEM error-handling forces SIGKILL causes kernel panic)

2019-01-16 Thread Jane Chu
(like a few per 24hours). After swapping the CPU, the problem stopped reproducing. But one could argue that perhaps the faulty CPU exposed a small race window from collect_procs() to unmap_mapping_range() and to kill_procs(), hence caught the kernel PMEM error handler off guard. There's

Re: [PATCH] mm: hwpoison: use do_send_sig_info() instead of force_sig() (Re: PMEM error-handling forces SIGKILL causes kernel panic)

2019-01-16 Thread Dan Williams
ucing. > > > > > > > > But one could argue that perhaps the faulty CPU exposed a small race > > > > window > > > > from collect_procs() to unmap_mapping_range() and to kill_procs(), hence > > > > caug

[PATCH] mm: hwpoison: use do_send_sig_info() instead of force_sig() (Re: PMEM error-handling forces SIGKILL causes kernel panic)

2019-01-16 Thread Naoya Horiguchi
y reasonable NVDIMM (like a few per 24hours). > > > > > > After swapping the CPU, the problem stopped reproducing. > > > > > > But one could argue that perhaps the faulty CPU exposed a small race > > > window > > > from collect_procs() to unmap_mappin

Re: PMEM error-handling forces SIGKILL causes kernel panic

2019-01-11 Thread Naoya Horiguchi
rhaps the faulty CPU exposed a small race window > > from collect_procs() to unmap_mapping_range() and to kill_procs(), hence > > caught the kernel PMEM error handler off guard. > > There's definitely a race, and the implementation is buggy as can be > se

Re: PMEM error-handling forces SIGKILL causes kernel panic

2019-01-09 Thread Dan Williams
[ switch to text mail, add lkml and Naoya ] On Wed, Jan 9, 2019 at 12:19 PM Jane Chu wrote: > > Hi, Dan, > > Sorry for the late report. > We recently saw panics from PMEM error handling, here are the log messages > and stack trace. "<--" are added by me. >

PMEM error-handling forces SIGKILL causes kernel panic

2019-01-09 Thread Jane Chu
Hi, Dan, Sorry for the late report. We recently saw panics from PMEM error handling, here are the log messages and stack trace. "<--" are added by me. [ 4488.098830] mce: Uncorrected hardware memory error in user-access at a6ec46f8 <-- [ 4488.131625] Memory failure: 0xa

RE: Question on Error Injection

2019-01-08 Thread Elliott, Robert (Persistent Memory)
For mailing lists, please use plaintext rather than HTML emails, and don’t top-post. > Are these statements correct? > > 1) Reading from a memory location (mmaped) with uncorrectable AND unknown > error (also called as latent error) results in a machine-check (which >

Re: Question on Error Injection

2019-01-04 Thread Verma, Vishal L
ersistent Memory) > > Subject: Re: Question on Error Injection > > > > > > On Thu, 2019-01-03 at 22:30 +, Elliott, Robert (Persistent Memory) > > wrote: > > > > -Original Message- > > > > From: Linux-nvdimm On Behalf

RE: Question on Error Injection

2019-01-03 Thread Elliott, Robert (Persistent Memory)
cation uses mmap() of a file or device so it can do loads directly from persistent memory addresses and it tries to load from an address with an uncorrectable error, the CPU cannot complete that instruction without causing data corruption. There's no data value that means "this is bad data." So,

RE: Question on Error Injection

2019-01-03 Thread Elliott, Robert (Persistent Memory)
> -Original Message- > From: Verma, Vishal L > Sent: Thursday, January 3, 2019 6:03 PM > To: kamalkakri2...@yahoo.com; linux-nvdimm@lists.01.org; Elliott, Robert > (Persistent Memory) > Subject: Re: Question on Error Injection > > > On Thu, 2019-01-03 at 2

Re: Question on Error Injection

2019-01-03 Thread Verma, Vishal L
.org > > Subject: Re: Question on Error Injection > > > > > > On Thu, 2019-01-03 at 20:02 +, Kamal Kakri wrote: > > > My device has errors injected: > > > # ndctl inject-error --status namespace2.0 > > > {

Re: Question on Error Injection

2019-01-03 Thread Kamal Kakri
nt Memory) wrote: > -Original Message- > From: Linux-nvdimm On Behalf Of Verma, > Vishal L > Sent: Thursday, January 3, 2019 3:27 PM > To: kamalkakri2...@yahoo.com; linux-nvdimm@lists.01.org > Subject: Re: Question on Error Injection > > > On Thu, 2019-01

RE: Question on Error Injection

2019-01-03 Thread Elliott, Robert (Persistent Memory)
> -Original Message- > From: Linux-nvdimm On Behalf Of Verma, > Vishal L > Sent: Thursday, January 3, 2019 3:27 PM > To: kamalkakri2...@yahoo.com; linux-nvdimm@lists.01.org > Subject: Re: Question on Error Injection > > > On Thu, 2019-01-03 at 20:02 +

Re: Question on Error Injection

2019-01-03 Thread Verma, Vishal L
On Thu, 2019-01-03 at 20:02 +, Kamal Kakri wrote: > My device has errors injected: > # ndctl inject-error --status namespace2.0 > { > "badblocks":[ > { > "block":35000, > "count":10 > } > ] > } > > No

Re: Question on Error Injection

2019-01-03 Thread Kamal Kakri
My device has errors injected: # ndctl inject-error --status namespace2.0 { "badblocks":[ { "block":35000, "count":10 } ] } No problem reading from the bad offsets: # dd if=/dev/pmem2 of=/tmp/pmem_out bs=512 count=10 skip=35000 10+0 records

Re: Question on Error Injection

2019-01-03 Thread Verma, Vishal L
On Thu, 2019-01-03 at 17:13 +, Kamal Kakri wrote: > I am playing around with ndctl inject-error and have a few questions > around the behavior of the application when an error occurs. > After successfully injecting error with --no-notify, I am able to > read and write to the name

Question on Error Injection

2019-01-03 Thread Kamal Kakri
I am playing around with ndctl inject-error and have a few questions around the behavior of the application when an error occurs. After successfully injecting error with --no-notify, I am able to read and write to the namespace device with no problems. For e.g.: # ndctl inject-error --block

Re: fsdax memory error handling regression

2018-11-28 Thread Dan Williams
On Tue, Nov 13, 2018 at 6:25 AM Matthew Wilcox wrote: > > On Sat, Nov 10, 2018 at 09:08:10AM -0800, Dan Williams wrote: > > On Sat, Nov 10, 2018 at 12:29 AM Matthew Wilcox wrote: [..] > > > If we get an internal entry in this case, we know we were looking up > > > a PMD entry and found a PTE

Re: fsdax memory error handling regression

2018-11-13 Thread Matthew Wilcox
On Sat, Nov 10, 2018 at 09:08:10AM -0800, Dan Williams wrote: > On Sat, Nov 10, 2018 at 12:29 AM Matthew Wilcox wrote: > > On Wed, Nov 07, 2018 at 06:01:19AM +, Williams, Dan J wrote: > > > On Tue, 2018-11-06 at 06:48 -0800, Matthew Wilcox wrote: > > > > On Tue, Nov 06, 2018 at 03:44:47AM

Re: fsdax memory error handling regression

2018-11-10 Thread Dan Williams
On Sat, Nov 10, 2018 at 12:29 AM Matthew Wilcox wrote: > > On Wed, Nov 07, 2018 at 06:01:19AM +, Williams, Dan J wrote: > > On Tue, 2018-11-06 at 06:48 -0800, Matthew Wilcox wrote: > > > On Tue, Nov 06, 2018 at 03:44:47AM +, Williams, Dan J wrote: > > > > Hi Willy, > > > > > > > > I'm

[linux-nvdimm:for-5.0/nvdimm-security 2/17] arch/powerpc//platforms/pseries/papr_scm.c:219:14: error: too few arguments to function 'nvdimm_create'

2018-11-10 Thread kbuild test robot
arch/powerpc//platforms/pseries/papr_scm.c: In function 'papr_scm_nvdimm_init': >> arch/powerpc//platforms/pseries/papr_scm.c:219:14: error: too few arguments >> to function 'nvdimm_create' p->nvdimm = nvdimm_create(p->bus, p, papr_scm_dimm_groups, ^

Re: fsdax memory error handling regression

2018-11-10 Thread Matthew Wilcox
On Wed, Nov 07, 2018 at 06:01:19AM +, Williams, Dan J wrote: > On Tue, 2018-11-06 at 06:48 -0800, Matthew Wilcox wrote: > > On Tue, Nov 06, 2018 at 03:44:47AM +, Williams, Dan J wrote: > > > Hi Willy, > > > > > > I'm seeing the following warning with v4.20-rc1 and the "dax.sh" > > > test

Re: fsdax memory error handling regression

2018-11-09 Thread Dan Williams
On Tue, Nov 6, 2018 at 10:01 PM Williams, Dan J wrote: > > On Tue, 2018-11-06 at 06:48 -0800, Matthew Wilcox wrote: > > On Tue, Nov 06, 2018 at 03:44:47AM +, Williams, Dan J wrote: > > > Hi Willy, > > > > > > I'm seeing the following warning with v4.20-rc1 and the "dax.sh" > > > test > > >

Re: [PATCH] ndctl: fix zero-labels to handle firmware error properly

2018-11-08 Thread Dan Williams
s() sets a transfer size to > rc when FW status is non-zero. This transfer size gets mistreated as > zeroed nmems count in the end. > > Fix ndctl_dimm_zero_labels() to handle this FW error case properly. > > Reported-by: Robert Elliott > Signed-off-by: Toshi Kani > C

[PATCH] ndctl: fix zero-labels to handle firmware error properly

2018-11-08 Thread Toshi Kani
mistreated as zeroed nmems count in the end. Fix ndctl_dimm_zero_labels() to handle this FW error case properly. Reported-by: Robert Elliott Signed-off-by: Toshi Kani Cc: Vishal Verma Cc: Dan Williams --- ndctl/lib/dimm.c |6 +- 1 file changed, 5 insertions(+), 1 deletion(-) diff --git

Re: [PATCH] libnvdimm: Fix __nd_ioctl() to check error in cmd_rc

2018-11-07 Thread Kani, Toshi
ls nmem1 > > zeroed 65504 nmems > > > > When an ACPI call completes with error, xlat_status() called from > > acpi_nfit_ctl() sets error to *cmd_rc. __nd_ioctl(), however, does > > not check this error and returns with success. > > > > Fix __nd_i

Re: [PATCH] libnvdimm: Fix __nd_ioctl() to check error in cmd_rc

2018-11-07 Thread Dan Williams
On Wed, Nov 7, 2018 at 10:52 AM Toshi Kani wrote: > > ndctl zero-labels completes with a large number of zeroed nmems when > it fails to do zeroing on a protected NVDIMM. > > # ndctl zero-labels nmem1 > zeroed 65504 nmems > > When an ACPI call completes with error, x

[PATCH] libnvdimm: Fix __nd_ioctl() to check error in cmd_rc

2018-11-07 Thread Toshi Kani
ndctl zero-labels completes with a large number of zeroed nmems when it fails to do zeroing on a protected NVDIMM. # ndctl zero-labels nmem1 zeroed 65504 nmems When an ACPI call completes with error, xlat_status() called from acpi_nfit_ctl() sets error to *cmd_rc. __nd_ioctl(), however

Re: fsdax memory error handling regression

2018-11-06 Thread Williams, Dan J
On Tue, 2018-11-06 at 06:48 -0800, Matthew Wilcox wrote: > On Tue, Nov 06, 2018 at 03:44:47AM +, Williams, Dan J wrote: > > Hi Willy, > > > > I'm seeing the following warning with v4.20-rc1 and the "dax.sh" > > test > > from the ndctl repository: > > I'll try to run this myself later today.

Re: fsdax memory error handling regression

2018-11-06 Thread Matthew Wilcox
On Tue, Nov 06, 2018 at 03:44:47AM +, Williams, Dan J wrote: > Hi Willy, > > I'm seeing the following warning with v4.20-rc1 and the "dax.sh" test > from the ndctl repository: I'll try to run this myself later today. > I tried to get this test going on -next before the merge window, but >

fsdax memory error handling regression

2018-11-05 Thread Williams, Dan J
Hi Willy, I'm seeing the following warning with v4.20-rc1 and the "dax.sh" test from the ndctl repository: [ 69.962873] EXT4-fs (pmem0): DAX enabled. Warning: EXPERIMENTAL, use at your own risk [ 69.969522] EXT4-fs (pmem0): mounted filesystem with ordered data mode. Opts: dax [

[PATCH 1/9] mm/resource: return real error codes from walk failures

2018-10-22 Thread Dave Hansen
walk_system_ram_range() can return an error code either because *it* failed, or because the 'func' that it calls returned an error. The memory hotplug does the following: ret = walk_system_ram_range(..., func); if (ret) return ret; and 'ret' makes it out

Re: [ndctl PATCH v2] libndctl: set errno for routines that don't return an error status

2018-10-04 Thread Williams, Dan J
On Thu, 2018-10-04 at 17:17 -0600, Vishal Verma wrote: > For routines that return a UINT_MAX or UL{L}ONG_MAX, there isn't a > way > to get any information as to what went wrong. Set errno in such > routines > so that the callers can get some additional context about the error

[ndctl PATCH v2] libndctl: set errno for routines that don't return an error status

2018-10-04 Thread Vishal Verma
For routines that return a UINT_MAX or UL{L}ONG_MAX, there isn't a way to get any information as to what went wrong. Set errno in such routines so that the callers can get some additional context about the error. Reported-by: Lukasz Dorau Cc: Dan Williams Signed-off-by: Vishal Verma --- ndctl

Re: [ndctl PATCH] libndctl: set errno for routines that don't return an error status

2018-10-04 Thread Verma, Vishal L
that return a UINT_MAX or UL{L}ONG_MAX, there isn't > > > > a > > > > way > > > > to get any information as to what went wrong. Set errno in such > > > > routines > > > > so that the callers can get some additional context about the >

Re: [ndctl PATCH] libndctl: set errno for routines that don't return an error status

2018-10-04 Thread Williams, Dan J
> way > > > to get any information as to what went wrong. Set errno in such > > > routines > > > so that the callers can get some additional context about the > > > error. > > > > Looks ok, but why EOVERFLOW and not ENOMEM for the out of resource

Re: [ndctl PATCH] libndctl: set errno for routines that don't return an error status

2018-10-04 Thread Verma, Vishal L
nes > > so that the callers can get some additional context about the > > error. > > Looks ok, but why EOVERFLOW and not ENOMEM for the out of resource > conditions? I debated between that and also ENOSPC, but nothing seemed like an exact fit for a buffer too small.. Mai

Re: [ndctl PATCH] libndctl: set errno for routines that don't return an error status

2018-10-04 Thread Williams, Dan J
On Thu, 2018-10-04 at 16:34 -0600, Vishal Verma wrote: > For routines that return a UINT_MAX or UL{L}ONG_MAX, there isn't a > way > to get any information as to what went wrong. Set errno in such > routines > so that the callers can get some additional context about the error. Lo

[ndctl PATCH] libndctl: set errno for routines that don't return an error status

2018-10-04 Thread Vishal Verma
For routines that return a UINT_MAX or UL{L}ONG_MAX, there isn't a way to get any information as to what went wrong. Set errno in such routines so that the callers can get some additional context about the error. Reported-by: Lukasz Dorau Cc: Dan Williams Signed-off-by: Vishal Verma --- ndctl
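
A minimal sketch of the convention this patch describes: a getter whose only failure signal is a sentinel return value also sets errno, so the caller can learn why it failed. The type `struct example_region`, the function `example_region_get_resource`, and the particular errno values chosen here are hypothetical and are not taken from libndctl:

```c
#include <errno.h>
#include <limits.h>
#include <stddef.h>

/* Hypothetical object for illustration; not libndctl's struct ndctl_region. */
struct example_region {
    int has_resource;            /* non-zero when 'resource' is valid */
    unsigned long long resource; /* base address of the region */
};

/*
 * Return the region's base address, or ULLONG_MAX on failure.
 * On failure errno is set so the caller can tell the cases apart.
 */
unsigned long long example_region_get_resource(const struct example_region *r)
{
    if (r == NULL) {
        errno = EINVAL;   /* caller passed no region at all */
        return ULLONG_MAX;
    }
    if (!r->has_resource) {
        errno = ENXIO;    /* region exists but exposes no resource */
        return ULLONG_MAX;
    }
    return r->resource;
}
```

A caller would clear errno, call the getter, and consult errno only when the sentinel comes back.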

[ndctl PATCH v2 3/5] util/json: fix an error check for region resource

2018-10-03 Thread Vishal Verma
The return type of ndctl_region_get_resource() is 'unsigned long long', and therefore the error checking for it should be done against ULLONG_MAX. Fix an instance where we were checking against ULONG_MAX. Reviewed-by: Dan Williams Signed-off-by: Vishal Verma --- util/json.c | 2 +- 1 file

Re: [ndctl PATCH 3/5] util/json: fix an error check for region resource

2018-10-02 Thread Dan Williams
On Mon, Oct 1, 2018 at 8:38 PM Vishal Verma wrote: > > The return type of ndctl_region_get_resource() is 'unsigned long long', > and therefore the error checking for it should be done against > ULLONG_MAX. Fix an instance where we were checking against ULONG_MAX. > Reviewed-b

[ndctl PATCH 3/5] util/json: fix an error check for region resource

2018-10-01 Thread Vishal Verma
The return type of ndctl_region_get_resource() is 'unsigned long long', and therefore the error checking for it should be done against ULLONG_MAX. Fix an instance where we were checking against ULONG_MAX. Signed-off-by: Vishal Verma --- util/json.c | 2 +- 1 file changed, 1 insertion(+), 1
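
A small sketch of why the sentinel must match the return type. Fixed-width types are used here as an assumption, to model a platform where 'unsigned long' is 32 bits and 'unsigned long long' is 64 bits; `example_get_resource`, `failed_narrow`, and `failed_wide` are hypothetical names, not libndctl functions:

```c
#include <stdint.h>

/*
 * Hypothetical getter: returns UINT64_MAX on failure, standing in for
 * a function that returns 'unsigned long long' and ULLONG_MAX on error.
 */
uint64_t example_get_resource(int fail)
{
    return fail ? UINT64_MAX : 0x100000000ULL;
}

/*
 * Buggy check: compares against the 32-bit maximum, which is what
 * ULONG_MAX would be where 'unsigned long' is 32 bits wide.
 */
int failed_narrow(uint64_t value)
{
    return value == UINT32_MAX;
}

/* Correct check: compares against the maximum of the actual return type. */
int failed_wide(uint64_t value)
{
    return value == UINT64_MAX;
}
```

On platforms where both types are 64 bits the two checks coincide, which is how a mismatch like this can go unnoticed.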

Mail System Error - Returned Mail

2018-09-26 Thread Bounced mail
: Please reply to postmas...@lists.01.org if you feel this message to be in error.

[ndctl PATCH] ndctl, test: add a new unit test pfn metadata error clearing

2018-09-18 Thread Vishal Verma
The pfn driver lacked a way to clear badblocks in the volatile struct page area, but this is expected to be fixed for v4.20. Add a unit test that creates an fsdax namespace, forces it to raw mode, injects errors to the metadata area, and converts it back to fsdax. For a kernel with the error

Re: [PATCH] device-dax: avoid hang on error before devm_memremap_pages()

2018-09-11 Thread Dan Williams
> >>rc = devm_add_action_or_reset(dev, dax_pmem_percpu_exit, >> &dax_pmem->ref); >>if (rc) >> return rc; >> >>dax_pmem->pgmap.ref = &dax_pmem->ref; >>addr = devm_memremap_pages(dev, &dax_pmem->pgmap); >> >>

Returned mail: Data format error

2018-08-31 Thread Mail Delivery Subsystem
Message could not be delivered

Re: [ndctl PATCH 1/2] ndctl: fix potential null dereference in the smart error handler

2018-08-13 Thread Keith Busch
On Fri, Aug 10, 2018 at 06:40:52PM -0600, Vishal Verma wrote: > Static analysis reports that we can potentially dereference a NULL pointer > in the smart cmd error handler. This particular instance won't ever > be hit in practice as the handler is only registered for smart commands,

[ndctl PATCH 1/2] ndctl: fix potential null dereference in the smart error handler

2018-08-10 Thread Vishal Verma
Static analysis reports that we can potentially dereference a NULL pointer in the smart cmd error handler. This particular instance won't ever be hit in practice as the handler is only registered for smart commands, and smart commands are currently only DIMM commands, and will always have a dimm

Re: [PATCH] device-dax: avoid hang on error before devm_memremap_pages()

2018-07-31 Thread Dave Jiang
pgmap.ref = &dax_pmem->ref; addr = devm_memremap_pages(dev, &dax_pmem->pgmap); Avoid the hang by calling percpu_ref_exit() in the error paths instead of going through dax_pmem_percpu_exit(). Signed-off-by: Stefan Hajnoczi Applied --- Found by code inspection. Compile-tested only. --- dri

[PATCH] device-dax: avoid hang on error before devm_memremap_pages()

2018-07-31 Thread Stefan Hajnoczi
s(dev, &dax_pmem->pgmap); Avoid the hang by calling percpu_ref_exit() in the error paths instead of going through dax_pmem_percpu_exit(). Signed-off-by: Stefan Hajnoczi --- Found by code inspection. Compile-tested only. --- drivers/dax/pmem.c | 12 1 file changed, 8 insertions(

Re: [PATCH] acpi/nfit: queue issuing of ars when an uc error notification comes in

2018-07-27 Thread Verma, Vishal L
On Fri, 2018-07-27 at 09:04 -0700, Dave Jiang wrote: > When the ACPI UC error notifier gets called and ARS_REQ bit is set > with the passed in flag, we can receive -EBUSY if ARS_REQ bit is already > set for the nfit_spa->ars_state. When that happens, the ARS request is > dr

[PATCH] acpi/nfit: queue issuing of ars when an uc error notification comes in

2018-07-27 Thread Dave Jiang
When the ACPI UC error notifier gets called and ARS_REQ bit is set with the passed in flag, we can receive -EBUSY if ARS_REQ bit is already set for the nfit_spa->ars_state. When that happens, the ARS request is dropped. That can potentially cause us to miss the unreported errors that the on go

[ndctl PATCH 4/4] ndctl, monitor: improve error reporting throughout monitor.c

2018-07-19 Thread Vishal Verma
In several places in the ndctl monitor, we were losing useful error information (from 'errno' for example), and just returning a simple '1' or '-1'. Fix these to capture and propagate the correct errors everywhere. In the case of notify_dimm_event(), don't error out for failures
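
A minimal sketch of the error-propagation style this patch describes: return the negative errno value rather than collapsing every failure into a bare 1 or -1. The helper `example_probe_file` is hypothetical and does not appear in monitor.c:

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Hypothetical helper: check that 'path' can be opened read-only.
 * Returns 0 on success, or a negative errno value on failure.
 */
int example_probe_file(const char *path)
{
    int fd = open(path, O_RDONLY);

    if (fd < 0)
        return -errno;   /* keep the cause, e.g. -ENOENT, not just -1 */
    close(fd);
    return 0;
}
```

Callers can then pass the value up unchanged, so the final error message can name the actual cause.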

Re: [PATCH v2] mm/sparse.c: fix error path in sparse_add_one_section

2018-07-07 Thread Oscar Salvador
On Fri, Jul 06, 2018 at 04:33:58PM -0600, Ross Zwisler wrote: > The following commit in -next: > > commit 054620849110 ("mm/sparse.c: make sparse_init_one_section void and > remove check") > > changed how the error handling in sparse_add_one_section() works. >

[PATCH v2] mm/sparse.c: fix error path in sparse_add_one_section

2018-07-06 Thread Ross Zwisler
The following commit in -next: commit 054620849110 ("mm/sparse.c: make sparse_init_one_section void and remove check") changed how the error handling in sparse_add_one_section() works. Previously sparse_index_init() could return -EEXIST, and the function would continue on happily. '

Re: [PATCH] mm/sparse.c: fix error path in sparse_add_one_section

2018-07-06 Thread Andrew Morton
mit 054620849110 ("mm/sparse.c: make sparse_init_one_section void and > > > remove check") > > > > > > changed how the error handling in sparse_add_one_section() works. > > > > > > Previously sparse_index_init() could return -EEXIST, and the fu

Re: [PATCH] mm/sparse.c: fix error path in sparse_add_one_section

2018-07-06 Thread Ross Zwisler
On Fri, Jul 06, 2018 at 11:23:27PM +0200, Oscar Salvador wrote: > On Fri, Jul 06, 2018 at 01:06:58PM -0600, Ross Zwisler wrote: > > The following commit in -next: > > > > commit 054620849110 ("mm/sparse.c: make sparse_init_one_section void and > > remove check

Re: [PATCH] mm/sparse.c: fix error path in sparse_add_one_section

2018-07-06 Thread Andrew Morton
On Fri, 6 Jul 2018 13:06:58 -0600 Ross Zwisler wrote: > commit 054620849110 ("mm/sparse.c: make sparse_init_one_section void and > remove check") > > changed how the error handling in sparse_add_one_section() works. > > Previously sparse_index_init() could retu

Re: [PATCH] mm/sparse.c: fix error path in sparse_add_one_section

2018-07-06 Thread Oscar Salvador
On Fri, Jul 06, 2018 at 01:06:58PM -0600, Ross Zwisler wrote: > The following commit in -next: > > commit 054620849110 ("mm/sparse.c: make sparse_init_one_section void and > remove check") > > changed how the error handling in sparse_add_one_section() works. >

[PATCH] mm/sparse.c: fix error path in sparse_add_one_section

2018-07-06 Thread Ross Zwisler
The following commit in -next: commit 054620849110 ("mm/sparse.c: make sparse_init_one_section void and remove check") changed how the error handling in sparse_add_one_section() works. Previously sparse_index_init() could return -EEXIST, and the function would continue on happily. '

[ndctl PATCH] libndctl: fix the uninject-error API actually injecting errors

2018-07-06 Thread Vishal Verma
The addition of the v2 APIs introduced a bug when they rerouted the old APIs to call the new v2 ones. ndctl_namespace_uninject_error() called ndctl_namespace_inject_error2() instead of ndctl_namespace_uninject_error2(). Reported-by: Tomasz Rochumski Signed-off-by: Vishal Verma ---

Re: [PATCH 2/3] fs/ext2/inode: Fix a type cast error for fsdax

2018-07-02 Thread Jan Kara
truct iomap is loff_t, which represents > > > file offset of mapping. > > > > > > In ext2_iomap_begin, iomap->offset shall be given a type cast as > > > loff_t instead of u64. > > > > Why is it an error? loff_t is uniformly type

Re: [PATCH 2/3] fs/ext2/inode: Fix a type cast error for fsdax

2018-07-02 Thread Huaisheng Ye
> > > > In ext2_iomap_begin, iomap->offset shall be given a type cast as > > loff_t instead of u64. > > Why is it an error? loff_t is uniformly typedefed to long long. > In which case the second variant is different from the first one > *and* does n

Re: [PATCH 2/3] fs/ext2/inode: Fix a type cast error for fsdax

2018-07-01 Thread Al Viro
On Sun, Jul 01, 2018 at 02:18:47PM +0800, Huaisheng Ye wrote: > From: Huaisheng Ye > > The type of offset within struct iomap is loff_t, which represents > file offset of mapping. > > In ext2_iomap_begin, iomap->offset shall be given a type cast as > loff_t instead of

[PATCH 2/3] fs/ext2/inode: Fix a type cast error for fsdax

2018-07-01 Thread Huaisheng Ye
From: Huaisheng Ye The type of offset within struct iomap is loff_t, which represents file offset of mapping. In ext2_iomap_begin, iomap->offset shall be given a type cast as loff_t instead of u64. Signed-off-by: Huaisheng Ye --- fs/ext2/inode.c | 2 +- 1 file changed, 1 insertion(+), 1

Re: [fstests PATCH 1/2] src/: fix up mmap() error checking

2018-06-22 Thread Ross Zwisler
On Fri, Jun 22, 2018 at 10:28:38AM +0800, Eryu Guan wrote: > On Wed, Jun 20, 2018 at 04:51:46PM -0600, Ross Zwisler wrote: > > I noticed that in some of my C tests in src/ I was incorrectly checking for > > mmap() failure by looking for NULL instead of MAP_FAILED. Fix those and > > clean up some
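
A minimal sketch of the check this patch is about: mmap() reports failure by returning MAP_FAILED, not NULL. The wrapper `example_map_anon` is a hypothetical helper, and the use of MAP_ANONYMOUS assumes a Linux-like system:

```c
#define _DEFAULT_SOURCE   /* for MAP_ANONYMOUS under strict -std modes */
#include <stddef.h>
#include <sys/mman.h>

/*
 * Hypothetical helper: map 'len' bytes of anonymous, private memory.
 * Returns the mapping, or NULL if mmap() failed.
 */
void *example_map_anon(size_t len)
{
    void *addr = mmap(NULL, len, PROT_READ | PROT_WRITE,
                      MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);

    /* mmap() signals failure with MAP_FAILED; testing for NULL misses it. */
    if (addr == MAP_FAILED)
        return NULL;
    return addr;
}
```

A test that compares the raw mmap() result against NULL would treat a failed mapping as valid and fault on first access.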
