On Mon, 19 Nov 2018 06:20:28 +0000
Naga Sureshkumar Relli <nagas...@xilinx.com> wrote:

> Hi Boris,
> 
> > -----Original Message-----
> > From: Boris Brezillon [mailto:boris.brezil...@bootlin.com]
> > Sent: Monday, November 19, 2018 1:13 AM
> > To: Naga Sureshkumar Relli <nagas...@xilinx.com>
> > Cc: miquel.ray...@bootlin.com; rich...@nod.at; dw...@infradead.org;
> > computersforpe...@gmail.com; marek.va...@gmail.com; 
> > linux-...@lists.infradead.org; linux-
> > ker...@vger.kernel.org; nagasures...@gmail.com; r...@kernel.org; Michal 
> > Simek
> > <mich...@xilinx.com>
> > Subject: Re: [LINUX PATCH v12 3/3] mtd: rawnand: arasan: Add support for 
> > Arasan NAND
> > Flash Controller
> > 
> > On Thu, 15 Nov 2018 09:34:16 +0000
> > Naga Sureshkumar Relli <nagas...@xilinx.com> wrote:
> >   
> > > Hi Boris & Miquel,
> > >
> > > I am updating the driver by addressing your comments, and I have one
> > > concern,  especially in anfc_read_page_hwecc(), there I am checking for 
> > > erased pages bit flips.
> > > Since the Arasan NAND controller doesn't have multibit error detection
> > > beyond 24-bit (it can correct up to 24 bits), there is no indication
> > > from the controller to detect uncorrectable errors beyond 24 bits.
> > 
> > Do you mean that you can't detect uncorrectable errors, or just that it's 
> > not 100% sure to detect
> > errors above max_strength?  
> Yes, in Arasan NAND controller there is no way to detect uncorrectable errors 
> beyond 24-bit.

So how do you detect uncorrectable errors when the strength is less than
24 bits?

> >   
> > > So I took some error count as default value(MULTI_BIT_ERR_CNT  16, I
> > > put this based on the error count that I got while reading erased page on 
> > > Micron device).
> > > And during a page read, will just read the error count register and
> > > compare this value with the default error count (16) and if it is more
> > > than default then I am checking for erased page bit flips.
> > 
> > Hm, that's wrong, especially if you set ecc_strength to something > 16.  
> Ok
> >   
> > > I am doubting that this will not work in all cases.  
> > 
> > It definitely doesn't.  
> Ok
> >   
> > > In my case it is just working because the error count that it got on an 
> > > erased page is 16.
> > > Could you please suggest a way to detect erased page bit flips when
> > > reading a page with HW-ECC?
> > 
> > I'm a bit lost. Is the problem only about bitflips in erased pages, or is it
> > also impacting reads of written pages that lead to uncorrectable errors?
> Yes, it is for both. But in case of read errors that we can't detect beyond
> 24-bit, the answer from the HW design team is that the flash part is bad.
> Unfortunately, until now we haven't run into that situation (read errors of
> written pages beyond 24-bit).

Can you please run nandbiterrs (available in mtd-utils)? I fear your
device won't pass the test.

> But we are hitting this because of erased page reading(needed in case of 
> ubifs).
> 
> > 
> > Don't you have a bit (or several bits) reporting when the ECC engine was
> > not able to correct data? If you do, you should base the "detect bitflips
> > in erased pages" logic on this information.
> Bit reporting for several bit errors is there only for Hamming(1bit 
> correction and 2bit detection) but not in BCH.
> 

Then I tend to agree with Miquel: your ECC engine is broken, and I'm
not even sure how to deal with that yet.
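
For reference, the erased-page check discussed above is normally run only
on a chunk the ECC engine has flagged as uncorrectable, with the chunk's ECC
strength as the bitflip threshold; in the kernel that is what
nand_check_erased_ecc_chunk() is for. Below is a minimal userspace sketch
of the same idea, not driver code: check_erased_chunk() and
count_zero_bits() are hypothetical names chosen for illustration, and the
sketch assumes the caller already knows the chunk failed correction, which
is exactly the information this controller does not seem to provide in BCH
mode.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Count the bits that read as 0 in buf. An erased NAND byte reads 0xff,
 * so every zero bit is a bitflip. Returns -1 as soon as the running count
 * exceeds limit, so heavily written chunks are rejected early.
 */
static int count_zero_bits(const uint8_t *buf, size_t len, int limit)
{
	int flips = 0;

	for (size_t i = 0; i < len; i++) {
		uint8_t v = (uint8_t)~buf[i];

		while (v) {
			v &= (uint8_t)(v - 1);	/* clear the lowest set bit */
			flips++;
		}
		if (flips > limit)
			return -1;
	}
	return flips;
}

/*
 * Hypothetical helper: decide whether a chunk (data + ECC bytes) that
 * failed correction is really an erased chunk with a few bitflips.
 * threshold is assumed to be the ECC strength of the chunk.
 * Returns the number of bitflips and resets the buffers to 0xff when the
 * chunk is considered erased, or -1 when it is a genuine uncorrectable
 * error (buffers are left untouched in that case).
 */
int check_erased_chunk(uint8_t *data, size_t datalen,
		       uint8_t *ecc, size_t ecclen, int threshold)
{
	int d, e;

	assert(threshold >= 0);

	d = count_zero_bits(data, datalen, threshold);
	if (d < 0)
		return -1;

	/* The ECC bytes share the remaining bitflip budget. */
	e = count_zero_bits(ecc, ecclen, threshold - d);
	if (e < 0)
		return -1;

	memset(data, 0xff, datalen);
	memset(ecc, 0xff, ecclen);
	return d + e;
}
```

A fixed threshold such as 16 that is unrelated to ecc_strength does not
fit this scheme: the threshold only makes sense once the engine has
reported the chunk as uncorrectable.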
