No malloc messages are printed even if I remove the _DEBUG macro check in assert(). Maybe the corruption can't be detected by do_check_inuse_chunk().
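A minimal sketch of the ordering issue described in the quoted reply below, assuming include/log.h derives _DEBUG from DEBUG at the moment the header is processed. This is a simplified stand-in, not the real U-Boot header: `sketch_assert`, `sketch_assert_reports` and `sketch_check_chunk` are hypothetical names used only for illustration.

```c
#include <stdio.h>

/* Model a source file that has NOT defined DEBUG before the header. */
#undef DEBUG

/* Stand-in for the header: _DEBUG is fixed here, once (assumed behaviour). */
#ifdef DEBUG
#define _DEBUG 1
#else
#define _DEBUG 0
#endif

/* Hypothetical counter used instead of a real assertion-failure handler. */
static int sketch_assert_reports;

/* Reports a failed condition only when _DEBUG is non-zero; never halts. */
#define sketch_assert(x) \
	do { \
		if (!(x) && _DEBUG) { \
			sketch_assert_reports++; \
			printf("assert failed: %s\n", #x); \
		} \
	} while (0)

/* Too late: _DEBUG was already set to 0 above, so this changes nothing. */
#define DEBUG 1

/* Hypothetical check: the low bit of a chunk size field should be set. */
static int sketch_check_chunk(unsigned long size_field)
{
	sketch_assert((size_field & 1) != 0);
	return sketch_assert_reports;
}
```

Calling `sketch_check_chunk(0)` fails the condition, yet nothing is reported, because _DEBUG was already fixed at 0 when the stand-in header section was processed. Moving `#define DEBUG 1` above that section (and dropping the `#undef`) makes the same call report the failure.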
> On March 23, 2022, at 18:12, Heinrich Schuchardt <xypron.g...@gmx.de> wrote:
>
> On 3/23/22 11:07, qianfan wrote:
>>
>> On 2022/3/23 17:51, Heinrich Schuchardt wrote:
>>> On 3/23/22 10:13, qianfan wrote:
>>>> On 2022/3/23 16:02, qianfan wrote:
>>>>> On 2022/3/23 15:45, qianfan wrote:
>>>>>> On 2022/3/23 10:28, qianfan wrote:
>>>>>>> Hi:
>>>>>>> I have a custom AM335X board connected to my computer via usbnet. It
>>>>>>> always reports a data abort when running 'dhcp'.
>>>>>>> Here is the log:
>>>>>>> U-Boot 2022.01-rc1-00183-gfa5b4e2d19-dirty (Feb 25 2022 - 15:45:02 +0800)
>>>>>>> CPU : AM335X-GP rev 2.1
>>>>>>> Model: WISDOM AM335X CCT
>>>>>>> DRAM: 512 MiB
>>>>>>> NAND: 256 MiB
>>>>>>> MMC: OMAP SD/MMC: 0
>>>>>>> Loading Environment from NAND... *** Warning - bad CRC, using default environment
>>>>>>> Net: Could not get PHY for ethernet@4a100000: addr 0
>>>>>>> eth2: ethernet@4a100000, eth3: usb_ether
>>>>>>> Hit any key to stop autoboot: 0
>>>>>>> => setenv autoload no
>>>>>>> => dhcp
>>>>>>> using musb-hdrc, OUT ep1out IN ep1in STATUS ep2in
>>>>>>> MAC de:ad:be:ef:00:01
>>>>>>> HOST MAC de:ad:be:ef:00:00
>>>>>>> RNDIS ready
>>>>>>> musb-hdrc: peripheral reset irq lost!
>>>>>>> high speed config #2: 2 mA, Ethernet Gadget, using RNDIS
>>>>>>> USB RNDIS network up!
>>>>>>> BOOTP broadcast 1
>>>>>>> BOOTP broadcast 2
>>>>>>> BOOTP broadcast 3
>>>>>>> DHCP client bound to address 192.168.200.4 (757 ms)
>>>>>>> data abort
>>>>>>> pc : [<9fe9b0a2>]    lr : [<9febbc3f>]
>>>>>>> reloc pc : [<808130a2>]    lr : [<80833c3f>]
>>>>>>> sp : 9de53410  ip : 9de53578  fp : 00000001
>>>>>>> r10: 9de5345c  r9 : 9de67e80  r8 : 9febbae5
>>>>>>> r7 : 9de72c30  r6 : 9feec710  r5 : 0000000d  r4 : 00000018
>>>>>>> r3 : 3fdd8e04  r2 : 00000002  r1 : 9feec728  r0 : 9feec700
>>>>>>> Flags: Nzcv  IRQs off  FIQs on  Mode SVC_32 (T)
>>>>>>> Code: f023 0303 60ca 4403 (6091) 685a
>>>>>>> Resetting CPU ...
>>>>>>> resetting ...
>>>>>>> Is there any doc about how to debug a data abort? Or is the bug
>>>>>>> already fixed?
>>>>>>> Thanks
>>>>>> This bug isn't fixed on master. I found that v2021.01 is good and
>>>>>> v2021.04-rc2 is bad.
>>>>>> I also tested this on a BeagleBone Black with am335x_evm_defconfig,
>>>>>> and it has a similar problem.
>>>>>> I looked for the first bug commit via 'git bisect': it told me that commit
>>>>>> e97eb638de0dc8f6e989e20eaeb0342f103cb917 broke it. But that is very
>>>>>> strange, because this commit doesn't touch any dhcp or network code.
>>>>>> ➜ u-boot-main git:(e97eb638de) ✗ git bisect bug
>>>>>> e97eb638de0dc8f6e989e20eaeb0342f103cb917 is the first bug commit
>>>>>> commit e97eb638de0dc8f6e989e20eaeb0342f103cb917
>>>>>> Author: Heinrich Schuchardt <xypron.g...@gmx.de>
>>>>>> Date: Wed Jan 20 22:21:53 2021 +0100
>>>>>>     fs: fat: consistent error handling for flush_dir()
>>>>>>     Provide function description for flush_dir().
>>>>>>     Move all error messages for flush_dir() from the callers to the
>>>>>>     function.
>>>>>>     Move mapping of errors to -EIO to the function.
>>>>>>     Always check return value of flush_dir() (Coverity CID 316362).
>>>>>>     In fat_unlink() return -EIO if flush_dirty_fat_buffer() fails.
>>>>>>     Signed-off-by: Heinrich Schuchardt <xypron.g...@gmx.de>
>>>>>> :040000 040000 2281a449f2d134078d7faa1ee735a367b55aad7e 77d188b1c99181fd71f2167fdeee3434a09db209 M fs
>>>>>> 184aa6504143b452132e28cd3ebecc7b941cdfa1 is the first commit before
>>>>>> e97eb638de0dc8f6e989e20eaeb0342f103cb917:
>>>>>> * e97eb638de0dc8f6e989e20eaeb0342f103cb917 fs: fat: consistent error handling for flush_dir()
>>>>>> * 184aa6504143b452132e28cd3ebecc7b941cdfa1 Merge tag 'u-boot-rockchip-20210121' of https://gitlab.denx.de/u-boot/custodians/u-boot-rockchip
>>>>>> |\
>>>>>> | * 9ddc0787bd660214366e386ce689dd78299ac9d0 pci: Add Rockchip dwc based PCIe controller driver
>>>>>> I checked that 184aa6504143b452132e28cd3ebecc7b941cdfa1 works fine.
>>>>>> U-Boot 2021.01-00688-g184aa65041-dirty (Mar 23 2022 - 15:07:56 +0800)
>>>>>> CPU : AM335X-GP rev 2.1
>>>>>> Model: TI AM335x BeagleBone Black
>>>>>> DRAM: 512 MiB
>>>>>> WDT: Started with servicing (60s timeout)
>>>>>> NAND: 0 MiB
>>>>>> MMC: OMAP SD/MMC: 0, OMAP SD/MMC: 1
>>>>>> Loading Environment from FAT... <ethaddr> not set. Validating first E-fuse MAC
>>>>>> Net: eth2: ethernet@4a100000, eth3: usb_ether
>>>>>> Hit any key to stop autoboot: 0
>>>>>> => dhcp
>>>>>> ethernet@4a100000 Waiting for PHY auto negotiation to complete......... TIMEOUT !
>>>>>> using musb-hdrc, OUT ep1out IN ep1in STATUS ep2in
>>>>>> MAC de:ad:be:ef:00:01
>>>>>> HOST MAC de:ad:be:ef:00:00
>>>>>> RNDIS ready
>>>>>> musb-hdrc: peripheral reset irq lost!
>>>>>> high speed config #2: 2 mA, Ethernet Gadget, using RNDIS
>>>>>> USB RNDIS network up!
>>>>>> BOOTP broadcast 1
>>>>>> BOOTP broadcast 2
>>>>>> BOOTP broadcast 3
>>>>>> DHCP client bound to address 192.168.200.157 (757 ms)
>>>>>> Using usb_ether device
>>>>>> TFTP from server 192.168.200.1; our IP address is 192.168.200.157
>>>>>> Filename 'u-boot.img'.
>>>>>> Load address: 0x82000000
>>>>>> Loading: #################################################################
>>>>>>          #################################################################
>>>>>>          #################################################################
>>>>>>          #########################
>>>>>>          2.5 MiB/s
>>>>>> done
>>>>>> Bytes transferred = 1123888 (112630 hex)
>>>>>> =>
>>>> "data abort" messages:
>>>> data abort
>>>> pc : [<9ff8196c>]    lr : [<9ffa1cd7>]
>>>> reloc pc : [<8081496c>]    lr : [<80834cd7>]
>>>> sp : 9df38e60  ip : 9df38fc8  fp : 00000001
>>>> r10: 9df38eac  r9 : 9df4ceb0  r8 : 9ffa1b7d
>>>> r7 : 9df52fd0  r6 : 9ffdbba8  r5 : 0000000d  r4 : 00000018
>>>> r3 : 3ff589e0  r2 : 9ffafa11  r1 : 9ffdbbc0  r0 : 9ffdbb00
>>>> Flags: Nzcv  IRQs off  FIQs on  Mode SVC_32 (T)
>>>> Code: 0303 60ca 4403 6091 (685a) f042
>>>> Resetting CPU ...
>>>> objdump of u-boot: pc is in malloc and lr is in env_attr_walk
>>>>     unlink(victim, bck, fwd);
>>>> 80814966:  60ca       str    r2, [r1, #12]
>>>>     set_inuse_bit_at_offset(victim, victim_size);
>>>> 80814968:  4403       add    r3, r0
>>>>     unlink(victim, bck, fwd);
>>>> 8081496a:  6091       str    r1, [r2, #8]
>>>>     set_inuse_bit_at_offset(victim, victim_size);
>>>> 8081496c:  685a       ldr    r2, [r3, #4]
>>>> 8081496e:  f042 0201  orr.w  r2, r2, #1
>>>> 80814972:  605a       str    r2, [r3, #4]
>>>> r3 is 3ff589e0, which is not a valid RAM address on AM335x.
>>> I have seen crashes in common/dlmalloc.c before after double free() or
>>> free() with an incorrect pointer.
>>> The assert() statements in do_check_inuse_chunk() are meant to catch
>>> this but assert() as defined in include/log.h does not stop the code and
>>> even does not print without _DEBUG=1.
>>> You should be able to get the assert output with
>>> #include <common.h>
>>> #define _DEBUG 1
>>> #include <log.h>
>>> at the top of common/dlmalloc.c.
>>> You should get full malloc debug output with
>>
>> Hi: I tried adding the DEBUG macro before <log.h> and no other malloc
>> message was printed.
>
> assert() checks for _DEBUG. Defining DEBUG after common.h will not
> define _DEBUG.
>
> Best regards
>
> Heinrich
>
>>
>>> #define DEBUG 1
>>> #include <common.h>
>>> #include <log.h>
>>> Best regards
>>> Heinrich
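The addresses in the second register dump above can be cross-checked with a few lines of C. This is only a rough sketch: the helper names are hypothetical, and the DRAM window is an assumption (base 0x80000000, consistent with the 0x82000000 load address in the log, and the 512 MiB size from the "DRAM:" boot line).

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed DRAM window: base 0x80000000, size taken from "DRAM: 512 MiB". */
#define SKETCH_DRAM_BASE 0x80000000u
#define SKETCH_DRAM_SIZE (512u * 1024u * 1024u)

/* True if addr lies inside the assumed DRAM window. */
static bool sketch_in_dram(uint32_t addr)
{
	return addr >= SKETCH_DRAM_BASE &&
	       addr - SKETCH_DRAM_BASE < SKETCH_DRAM_SIZE;
}

/* Relocation offset = runtime pc minus the "reloc pc" printed in the dump. */
static uint32_t sketch_reloc_offset(uint32_t pc, uint32_t reloc_pc)
{
	return pc - reloc_pc;
}

/* Map a runtime address back to the link-time address seen by objdump. */
static uint32_t sketch_link_addr(uint32_t runtime_addr, uint32_t reloc_off)
{
	return runtime_addr - reloc_off;
}
```

With the values from the dump, `sketch_reloc_offset(0x9ff8196c, 0x8081496c)` gives 0x1f76d000, and applying that offset to lr 0x9ffa1cd7 yields 0x80834cd7, which matches the printed "reloc" lr. `sketch_in_dram(0x3ff589e0)` is false for r3, while r0 (0x9ffdbb00) is inside the window, in line with the observation that r3 is not a valid RAM address.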