Hi Bors, thanks for your quick reply. On Mon, Oct 30, 2017 at 3:23 AM, Boris Brezillon <boris.brezil...@free-electrons.com> wrote: > Hi Brent, > > Subject should be prefixed by "mtd: nand: ", so > > "mtd: nand: Fix writing mtdoops to nand flash"
Oops, yes, will fix. > > On Sun, 29 Oct 2017 23:23:43 -0500 > moto...@gmail.com wrote: > >> From: Brent Taylor <moto...@gmail.com> >> >> When mtdoops calls mtd_panic_write, it eventually calls >> panic_nand_write in nand_base.c. In order to properly >> wait for the nand chip to be ready in panic_nand_wait, >> the chip must first be selected. >> >> When using the atmel nand flash controller, a panic >> would occur due to a NULL pointer exception. >> >> Signed-off-by: Brent Taylor <moto...@gmail.com> >> --- >> drivers/mtd/nand/nand_base.c | 5 +++++ >> 1 file changed, 5 insertions(+) >> >> diff --git a/drivers/mtd/nand/nand_base.c b/drivers/mtd/nand/nand_base.c >> index 12edaae17d81..0a8058a66d93 100644 >> --- a/drivers/mtd/nand/nand_base.c >> +++ b/drivers/mtd/nand/nand_base.c >> @@ -2802,9 +2802,14 @@ static int panic_nand_write(struct mtd_info *mtd, >> loff_t to, size_t len, >> struct mtd_oob_ops ops; >> int ret; >> >> + int chipnr = (int)(to >> chip->chip_shift); >> + chip->select_chip(mtd, chipnr); >> + >> /* Wait for the device to get ready */ >> panic_nand_wait(mtd, chip, 400); >> >> + chip->select_chip(mtd, -1); >> + > > Duh! Looks like a piece of code that is never tested. Did you face the > problem or did you find out by inspecting the code? I was playing with another driver on an atmel development board (at91sam9m10g45ek) and caused a panic with mtdoops enabled. While writing the mtdoops to nand, another panic occurred which caused a storm of panics to generated. > > Anyway, I fear the atmel driver is not the only one to rely on the chip > to be selected when ->dev_ready() is called so this should definitely > be fixed. Also, we should probably move the ->select_chip() and > panic_nand_wait() calls after panic_nand_get_device(), and I don't > think we need to unselect the chip (it will be taken care of by > nand_do_write_ops()). > >> /* Grab the device */ >> panic_nand_get_device(chip, mtd, FL_WRITING); >> > After looking at this closer, a panic could happen at any point correct? If that's the case, the kernel could have been in the middle of a nand read/write operation (which may or may not be on the same chip). Would the chip the mtdoops is being written to need to be reset? I haven't drilled down into the nand_reset function yet, but can that be called in a "panic" state?