Hi Naga,

Boris Brezillon <boris.brezil...@bootlin.com> wrote on Tue, 20 Nov 2018
12:02:44 +0100:

> On Tue, 20 Nov 2018 07:02:08 +0000
> Naga Sureshkumar Relli <nagas...@xilinx.com> wrote:
> 
> 
> > > 
> > > > Can you please run nandbiterrs (available in mtd-utils). I fear your
> > > > device won't pass the test.
> > > Yes, the nandbiterrs test passes up to 24 bits; after that it fails.
> 
> Can you paste the output of nandbiterrs please?

Apparently 'nandbiterrs -i' just crashes the kernel with a
segmentation fault. Please run this test (from the mtd-utils package)
and fix this issue. Then we would like to see the output.

> 
> > >     
> > > > > But we are hitting this because of erased page reading (needed in
> > > > > case of ubifs).
> > > > >
> > > > > >
> > > > > > Don't you have a bit (or several bits) reporting when the ECC
> > > > > > engine was not able to correct data? If you do, you should base
> > > > > > the "detect bitflips in erase pages" logic on this information.
> > > > > Bit reporting for several bit errors is there only for Hamming
> > > > > (1bit correction and 2bit detection) but not in BCH.
> > > > >
> > > 
> > > Then I tend to agree with Miquel: your ECC engine is broken, and I'm
> > > not even sure how to deal with that yet.    
> > > So, as per Miquel's suggestion, can I proceed to add the below one?
> > "you should re-read the page in raw mode and check for the number of 
> > bitflips manually (thanks to the helpers in the core). Again, if the number 
> > of BF is above 16, we can assume the page is bad and increment ->ecc.failed 
> > accordingly."  
> 
> But that's just partially fixing the problem. And you didn't answer my
> previous question: what happens when you configure the ECC engine in,
> say 12bit/1024 and you end up with uncorrectable errors (more than 12
> > bitflips in a 1k block). What's the number reported in ECC_ERR_CNT? Is
> > it set to 13?

Please dump this register, and if possible also the value of the
Packet_bound_Err_count field ([0:7]), for each iteration of
nandbiterrs -i. If there is no way, when the status bit is set, to tell
whether the data is reliable or was not corrected at all, it is going to
be a real issue and I don't think we want to support such an engine.
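
To make the two checks discussed in this thread concrete, here is a rough
userspace sketch, not driver code. It assumes Packet_bound_Err_count sits
in the low 8 bits of the register value (my reading of "[0:7]", please
check the datasheet). The function names are made up for illustration. In
the driver, the erased-page part should go through the core helper
nand_check_erased_ecc_chunk() rather than an open-coded loop like this one.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: extract Packet_bound_Err_count from a dumped
 * register value. Assumption: the field occupies bits 0..7 (LSB first).
 */
static inline unsigned int pkt_bound_err_count(uint32_t reg)
{
	return reg & 0xffu;
}

/*
 * Hypothetical helper: count bitflips in a chunk read in raw mode from a
 * page that is expected to be erased (all 0xff). Every zero bit is a
 * bitflip. Returns the number of bitflips, or -1 as soon as the count
 * exceeds 'threshold', meaning the chunk cannot be treated as erased.
 */
static inline int erased_chunk_bitflips(const uint8_t *buf, size_t len,
					int threshold)
{
	int flips = 0;
	size_t i;

	assert(buf != NULL);

	for (i = 0; i < len; i++) {
		uint8_t zeros = (uint8_t)~buf[i];

		/* Clear the lowest set bit on each pass, one per bitflip. */
		while (zeros) {
			zeros &= (uint8_t)(zeros - 1);
			if (++flips > threshold)
				return -1;
		}
	}

	return flips;
}
```

With a threshold of 16, a raw chunk holding 2 zero bits would be reported
as 2 bitflips, while a chunk with more than 16 would return -1 and should
lead to ->ecc.failed being incremented.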


Thanks,
Miquèl
