On Tue, Mar 22, 2022 at 10:13 AM Tim Harvey <thar...@gateworks.com> wrote: > > On Mon, Mar 21, 2022 at 12:42 PM Tim Harvey <thar...@gateworks.com> wrote: > > > > On Mon, Mar 21, 2022 at 1:34 AM Frieder Schrempf > > <frieder.schre...@kontron.de> wrote: > > > > > > Hi Ye, > > > > > > Am 17.03.22 um 14:54 schrieb Frieder Schrempf: > > > > Hi Han, > > > > > > > > Am 17.03.22 um 14:33 schrieb Han Xu: > > > >> > > > >> > > > >> On Thu, Mar 17, 2022 at 8:27 AM Frieder Schrempf > > > >> <frieder.schre...@kontron.de <mailto:frieder.schre...@kontron.de>> > > > >> wrote: > > > >> > > > >> Hi Stefano, > > > >> > > > >> this old patch was delegated to you in patchwork. If you're not the > > > >> correct maintainer to address, please let me know. As the NAND > > > >> layer > > > >> seems to be unmaintained at the moment, I'm not sure whom to ask. > > > >> > > > >> This patch fixes a regression that was introduced by 616f03dabacb > > > >> (" > > > >> mtd: gpmi: change the BCH layout setting for large oob NAND") which > > > >> alters the BCH layout in a way that doesn't match with the > > > >> implementation in the Linux kernel. > > > >> > > > >> This causes failures when loading an UBI image in U-Boot that was > > > >> flashed by Linux or vice versa (see [1]). > > > >> > > > >> There has been an approach to fix this through an optional > > > >> devicetree > > > >> property in 51cdf83eea ("mtd: gpmi: provide the option to use > > > >> legacy bch > > > >> geometry"), but this is not acceptable. The "legacy" BCH layout > > > >> compatible with Linux should be used by default. > > > >> > > > >> The approach to upstream the "new" layout to the kernel [2] seems > > > >> to be > > > >> stalled and even if it would succeed, it would break systems that > > > >> use an > > > >> old U-Boot and a new kernel, which is again not really acceptable > > > >> in my > > > >> opinion. > > > >> > > > >> > > > >> Hi Frieder, > > > >> > > > >> I am not in office this week. I will send another patch set to change > > > >> in > > > >> both kernel and u-boot to fix the compatible issue. > > > > > > > > You already claimed that months ago, but nothing happened: > > > > > > > >> I will send patches for both kernel and u-boot to use legacy bch > > > >> scheme by default, and add some code to treat few MLC nand chips as > > > >> corner cases. > > > > > > > > Sean's U-Boot patch is effectively reverting the default behavior to use > > > > the "legacy" BCH scheme. So that's in line with what you want to do and > > > > you can base your work on top of this fix. But we should get the basic > > > > fix in regardless. > > > > > > > > Even more so because switching the layout in U-Boot by using > > > > fsl,legacy-bch-geometry in the devicetree requires CONFIG_DM_MTD=y, > > > > which causes the bootloader size to increase by around 250 KiB in my > > > > case which might not be an option for some boards. > > > > > > Ye's reply copied over to not break the thread: > > > > > > > The dt nand driver will check "fsl,legacy-bch-geometry" property to > > > > use legacy bch. If this can't work for you in case you don't use DM > > > > driver, I prefer adding a config to select the legacy bch not > > > > reverting the patch. > > > > > > I think you miss my point. I don't really care about your preferences. > > > But I do care about not introducing breaking changes. IMHO the "correct" > > > way to introduce the new BCH layout would have been: > > > > > > 1. Introduce the new layout in Linux and U-Boot behind a feature-flag > > > (e.g. DT property) > > > 2. Make all upstream board configurations and DTs in Linux and U-Boot > > > use the feature-flag. > > > 3. Drop the flag and make the new BCH layout the default. > > > > > > That way the change would be much more controlled and raise the > > > awareness of board maintainers that are affected. > > > > > > The current situation (layout changes applied in U-Boot without any > > > questioning, layout changes in Linux rejected for good reasons) leaves > > > everyone who uses the GPMI NAND on i.MX in a situation where they > > > probably will see failures and need to spend some hours of debugging > > > until they find out the reason is that someone carelessly introduced > > > breaking changes. > > > > > > And even with the procedure described above we break compatibility > > > between old and new versions of U-Boot and Linux and create a dependency > > > between the two which is what we should try to avoid wherever possible. > > > > > > Frieder > > > > Frieder, > > > > I agree that we need to get this fixed ASAP. Because of this IMX GPMI > > NAND UBI has been broken since v2020.07 (6 releases ago!) and it would > > be really nice to get this fixed in v2022.01 which releases in a > > couple of weeks. I'm surprised neither of us noticed this problem but > > likely we've both been so busy trying to keep up with forced DM > > migrations we haven't fully tested the 'little things' like booting to > > an actual OS. I know the boards I support using IMX GPMI NAND UBI are > > still using a 2017 U-Boot where quite honestly everything worked fine > > so there hasn't been a reason to push people to something new. > > > > That said, applying this patch does 'not' fix things for my boards. > > For my boards mxs_nand_set_geometry is called with: > > oobsize=224 ecc_str_ds=0 ecc_step_ds=0 modern=0 > > > > and I still get ecc errors when mounting a UBI that worked fine with a > > v2020.04 U-Boot: > > Ventana > ubi part rootfs && ubifsmount ubi0:boot && ubifsls && ubifsumount > > ubi0: attaching mtd3 > > ubi0 error: ubi_io_read: error -74 (ECC error) while reading 64 bytes > > from PEB 0:0, read 64 bytes > > ubi0 error: ubi_io_read: error -74 (ECC error) while reading 64 bytes > > from PEB 1:0, read 64 bytes > > ubi0 error: ubi_io_read: error -74 (ECC error) while reading 64 bytes > > from PEB 2:0, read 64 bytes > > ubi0 error: ubi_io_read: error -74 (ECC error) while reading 4096 > > bytes from PEB 2:4096, read 4096 bytes > > ubi0 error: ubi_io_read: error -74 (ECC error) while reading 64 bytes > > from PEB 3:0, read 64 bytes > > ubi0 error: ubi_io_read: error -74 (ECC error) while reading 4096 > > bytes from PEB 3:4096, read 4096 bytes > > > > The original offending commit 616f03dabacb ("mtd: gpmi: change the BCH > > layout setting for large oob NAND") no longer reverts and I haven't > > dug in to find out what I can do about that but i'm a little concerned > > I get different results than you do regarding the patch from this > > thread. > > > > Thank you for pursuing this issue! > > > > It looks like for my board this patch on top of commit 51cdf83eea > ("mtd: gpmi: provide the option to use legacy bch geometry") does > resolve the issue however on top of current 2022.04-rc4 there seems to > be a new problem for me. > > Adding some debugging I find that I used to see the following: > mxs_nand_set_geometry legacy_bch_geometry=1 ecc_strength_ds=0 > max_ecc_strength_supported=40 > mxs_nand_legacy_calc_ecc_layout oobsize=224 ecc_chunk_count=8 > ecc_chunk0_size=512 ecc_chunkn_size=512 ecc_strength=16 > > But now with this patch on top of 2022.04-rc4 I get: > mxs_nand_set_geometry legacy_bch_geometry=1 ecc_strength_ds=0 > max_ecc_strength_supported=40 > mxs_nand_legacy_calc_ecc_layout oobsize=128 ecc_chunk_count=8 > ecc_chunk0_size=512 ecc_chunkn_size=512 ecc_strength=8 > ^^^ oobsize is wrong causing the wrong ecc strength (8 instead of the > correct 16) > > I'm not sure yet what happened between 51cdf83eea and now that causes > oobsize to be wrong for my device. > > Frieder, are you really able to run ok on 2022.04-rc4 with legacy mode > selected? >
I discovered my issue regarding the wrong oobsize. When I converted to DM_MTD I removed CONFIG_SYS_NAND_ONFI_DETECTION which was needed. With that fixed, Sean's patch here resolves the issue and puts me in sync with your findings. I will take a look at Han's latest patch and respond to that thread. Best Regards, Tim