Re: Spurious ECC errors with mtd_subpagetest (OMAP3, NAND)
On 03/02/2012 06:17 PM, Grazvydas Ignotas wrote:
> IIRC NAND in mainline was broken for very long time on OMAP3, I think it
> was only fixed in 2.6.39.1.

That seems to be the case; the 2.6.39.1 diff contains the OMAP NAND sub page write fix (applied locally).

Can anyone else testify to the volatility of NAND ECC errors? I.e., are they expected to be more persistent?

Thanks,
Orjan

--
Orjan Friberg
FlatFrog Laboratories AB
--
To unsubscribe from this list: send the line unsubscribe linux-omap in the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Spurious ECC errors with mtd_subpagetest (OMAP3, NAND)
On 03/05/2012 09:56 AM, Matthieu CASTET wrote:
> Note that the omap driver is still broken :
> http://article.gmane.org/gmane.linux.drivers.mtd/36079/match=
> We detected this when stressing a board. Because all of these bugs in
> omap driver, I wonder how many people really use the mainline version.

Do you know any repo where this is working correctly (linux-omap, or one of the vendor trees etc)?

> Also if you use a nand that need 4-bit ECC, you need a better ecc than
> hamming. You can use the bch code (
> http://article.gmane.org/gmane.linux.drivers.mtd/37864/match=omap )

Yes, I've been looking at the BCH 4-bit code (both generic implementations and the OMAP GPMC-enabled one) in u-boot and linux.

--
Orjan Friberg
FlatFrog Laboratories AB
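For a rough idea of what 4-bit BCH costs in OOB space, here is a small sketch. It assumes the usual sizing bound for binary BCH codes (at most m*t parity bits, where m is the smallest field order whose codeword length 2^m - 1 holds data plus parity). The function names are invented for illustration and are not taken from any driver.

```c
#include <assert.h>
#include <stddef.h>

/* Smallest Galois field order m such that a BCH codeword of
 * 2^m - 1 bits can hold the data bits plus m*t parity bits. */
static unsigned bch_field_order(size_t data_bytes, unsigned t)
{
    unsigned m;

    assert(data_bytes > 0 && t > 0);
    for (m = 2; m < 32; m++) {
        unsigned long long codeword_bits = (1ULL << m) - 1;
        unsigned long long needed =
            8ULL * data_bytes + (unsigned long long)m * t;

        if (codeword_bits >= needed)
            return m;
    }
    return 0; /* not representable in this sketch */
}

/* Parity bytes per sector for a t-bit-correcting BCH code,
 * using the m*t upper bound on the number of parity bits. */
static unsigned bch_ecc_bytes(size_t data_bytes, unsigned t)
{
    unsigned m = bch_field_order(data_bytes, t);

    return (m * t + 7) / 8;
}
```

With 512-byte sectors, `bch_ecc_bytes(512, 4)` comes out at 7 bytes per sector (m = 13, 52 parity bits), so a 2048-byte page would need 28 ECC bytes under these assumptions.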
Spurious ECC errors with mtd_subpagetest (OMAP3, NAND)
Hi,

When running the mtd_subpagetest I'm seeing more or less spurious ECC corrections. I.e., one round may show 4 corrections and the next will show 7, only some of which are the same as the previous 4. Are the ECC errors expected to be that volatile and frequent?

I've seen various discussions regarding the OMAP sub page support, as well as problems with the GPMC prefetch engine. Disabling both made no difference. I've also tried two different sets of NAND timings (relaxed and optimized), again with no difference.

I'm using a Micron NAND that requires 4-bit ECC, but I'm running with only 1-bit (software) ECC. This is on an old kernel, 2.6.32.

Thanks,
Orjan

--
Orjan Friberg
FlatFrog Laboratories AB
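One way to put numbers on "volatile" is to compare a raw read-back (without ECC correction, however it is obtained) against the pattern that was written, and repeat that over several reads. The sketch below is a hypothetical user-space helper, not part of mtd-utils or the kernel tests. It counts flipped bits overall and the worst count in any ECC chunk, which shows whether 1-bit ECC could have coped.

```c
#include <assert.h>
#include <stddef.h>

/* Count the bits that differ between the reference data and a read-back. */
static size_t count_bitflips(const unsigned char *ref,
                             const unsigned char *got, size_t len)
{
    size_t i, flips = 0;

    assert(ref != NULL && got != NULL);
    for (i = 0; i < len; i++) {
        unsigned char diff = ref[i] ^ got[i];

        while (diff) {
            diff &= (unsigned char)(diff - 1); /* clear lowest set bit */
            flips++;
        }
    }
    return flips;
}

/* Largest number of flipped bits found in any single ECC chunk.
 * A result above 1 is more than 1-bit ECC can correct in that chunk. */
static size_t worst_chunk_flips(const unsigned char *ref,
                                const unsigned char *got,
                                size_t len, size_t chunk)
{
    size_t off, worst = 0;

    assert(chunk > 0);
    for (off = 0; off < len; off += chunk) {
        size_t n = (len - off < chunk) ? len - off : chunk;
        size_t flips = count_bitflips(ref + off, got + off, n);

        if (flips > worst)
            worst = flips;
    }
    return worst;
}
```

If the flipped positions change from one read to the next while the written data stays the same, that points at the read path (timings, controller, driver) rather than at worn cells.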
Re: Spurious ECC errors with mtd_subpagetest (OMAP3, NAND)
On 03/02/2012 05:17 PM, Orjan Friberg wrote:
> Hi, When running the mtd_subpagetest I'm seeing more or less spurious
> ECC corrections. I.e., one round may show 4 corrections and the next
> will show 7, only some of which are the same as the previous 4.

FWIW

* I'm seeing the same behaviour (i.e. transient ECC errors) when doing nanddump on a partition.
* mtd_oobtest fails with "verify failed" at varying addresses, and on reading past the end of the device.

--
Orjan Friberg
FlatFrog Laboratories AB
Re: CONFIG_PREEMPT and JFFS2 oops
On 01/25/2012 10:02 PM, Orjan Friberg wrote:
> That one-liner was boiled down from the following program, which still
> oopses instantly:

The C program seems to work fine with CONFIG_PREEMPT_NONE=y. If that is indeed the problem, I guess it's reasonable that it worked better with PREEMPT_VOLUNTARY than with PREEMPT, because there are fewer preemption points.

--
Orjan Friberg
FlatFrog Laboratories AB
Re: CONFIG_PREEMPT and JFFS2 oops
On 01/25/2012 10:18 PM, Paul Walmsley wrote:
> - If your oopses are consistently in the same places, add some debugging
> to that code to determine which line is actually causing the oops.

(CC:d linux-mtd.)

They are semi-consistent I'd say. The oops trace I posted is by far the most common.

> problem to mysteriously disappear. Doing this analysis should provide a
> good clue as to where to look next. I personally would be rather
> suspicious of that
> ri->data_crc = cpu_to_je32(crc32(0, comprbuf, cdatalen));
> in jffs2_write_inode_range().

That is indeed the place where crc32 is called from. I'll see if I can track the use of comprbuf.

> - Try turning on JFFS2 debugging and seeing if you can reproduce it. The
> output might provide a clue as to where the problem would be.

Here are two examples (immediately preceding the oops):

jffs2_reserve_space(): Requested 0x30 bytes
jffs2_reserve_space(): alloc sem got
[JFFS2 DBG] (1189) jffs2_do_reserve_space: minsize=48 , jeb->free=46852 ,summary->size=16586 , sumsize=29
jffs2_do_reserve_space(): Giving 0x75f4 bytes at 0x3d48fc
jffs2_write_dirent(ino #1, name at *0xdea7b93c file1->ino #111, name_crc 0x58c597f8)
jffs2_write_begin()
jffs2_read_inode_range: ino #12, range 0x-0x1000
Filling non-frag hole from 0-4096
end write_begin(). pg->flags 9
jffs2_write_end(): ino #12, page at 0x0, range 0-800, flags d
jffs2_write_inode_range(): Ino #12, ofs 0x0, len 0x320
jffs2_reserve_space(): Requested 0xc4 bytes
jffs2_reserve_space(): alloc sem got
[JFFS2 DBG] (1454) jffs2_do_reserve_space: minsize=196 , jeb->free=123148 ,summary->size=1567 , sumsize=18
jffs2_do_reserve_space(): Giving 0x1dab0 bytes at 0xf941ef4
calling deflate with avail_in 788, avail_out 788
deflate returned with avail_in 0, avail_out 428, total_in 788, total_out 360
calling deflate with avail_in 12, avail_out 428
deflate returned with avail_in 0, avail_out 414, total_in 800, total_out 374
zlib compressed 800 bytes into 380

I'll take a look at what jffs2_do_reserve_space is up to.

Thanks.

--
Orjan Friberg
FlatFrog Laboratories AB
Re: CONFIG_PREEMPT and JFFS2 oops
On 01/26/2012 11:15 AM, Orjan Friberg wrote:
>> problem to mysteriously disappear. Doing this analysis should provide a
>> good clue as to where to look next. I personally would be rather
>> suspicious of that
>> ri->data_crc = cpu_to_je32(crc32(0, comprbuf, cdatalen));
>> in jffs2_write_inode_range().
>
> That is indeed the place where crc32 is called from. I'll see if I can
> track the use of comprbuf.

Ok, so comprbuf comes from jffs2_compress and becomes NULL for some reason (hence the oops).

Initially I had CMODE_FAVOUR_LZO. With that, things only worked with PREEMPT_NONE. However, when changing to CMODE_PRIORITY or CMODE_NONE things do seem to work *with* PREEMPT.

For what it's worth (with PREEMPT on):

CMODE_FAVOUR_LZO with LZO disabled oopses.
CMODE_FAVOUR_LZO with only ZLIB enabled oopses.
CMODE_FAVOUR_LZO with ZLIB/LZO/RTIME/RUBIN disabled does not oops.

Thus, the bug seems to be in the *selection* of compression algorithm (when there is at least one algorithm in the list), rather than in the specific compression algorithms themselves.

--
Orjan Friberg
FlatFrog Laboratories AB
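To illustrate why a NULL comprbuf with a non-zero length ends in crc32_le, here is a user-space sketch with a defensive wrapper. It is not the JFFS2 code: `crc32_le_sketch` and `checked_data_crc` are hypothetical names, and the CRC is a plain bitwise version that takes the seed as an argument and does no final inversion, like the crc32(0, comprbuf, cdatalen) call quoted above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Bitwise little-endian CRC-32 (polynomial 0xEDB88320).
 * The seed is passed in and no final inversion is applied. */
static uint32_t crc32_le_sketch(uint32_t crc, const unsigned char *p,
                                size_t len)
{
    while (len--) {
        int i;

        crc ^= *p++;
        for (i = 0; i < 8; i++)
            crc = (crc >> 1) ^ ((crc & 1) ? 0xEDB88320u : 0);
    }
    return crc;
}

/* Returns 0 and stores the CRC in *out, or -1 if the data buffer is
 * missing. Without the check, the first "*p++" above is the NULL
 * pointer dereference. */
static int checked_data_crc(const unsigned char *comprbuf, size_t cdatalen,
                            uint32_t *out)
{
    assert(out != NULL);
    if (comprbuf == NULL && cdatalen != 0)
        return -1;
    *out = crc32_le_sketch(0, comprbuf, cdatalen);
    return 0;
}
```

A check like this only turns the oops into an error return. It does not explain why the compression step handed back no buffer in the first place.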
Re: CONFIG_PREEMPT and JFFS2 oops
Paul,

Your patch works fine in that it doesn't oops, and I'm not seeing any BUGs from CONFIG_DEBUG_SPINLOCK. I haven't verified *anything else* (performance etc).

We've had some discussions on the linux-mtd list during the day, starting at http://lists.infradead.org/pipermail/linux-mtd/2012-January/039442.html if you're interested (though that discussion didn't result in a patch).

Thanks,
Orjan

--
Orjan Friberg
FlatFrog Laboratories AB
Re: CONFIG_PREEMPT and JFFS2 oops
On 01/26/2012 05:57 PM, Paul Walmsley wrote:
>> You just throw away best_buf here, don't you?
>
> You're right. It's even worse than that. best_buf will contain the data
> from the last compressor used. And it will be prematurely freed. Here's
> a fixed version.

I've tested this version for a while now with the same result as before. No oopses, no spinlock violations. I copied a 2MB file from the SD/MMC partition to the two JFFS2 partitions and md5summ'ed it a bunch of times. After that I unmounted and remounted both partitions.

I do see a steady memory usage increase when doing continuous testing, but whether that's normal I don't know. I see at least some of it being reclaimed when unmounting the JFFS2 partitions (grep jffs2 /proc/slabinfo).

--
Orjan Friberg
FlatFrog Laboratories AB
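The patch itself is not reproduced in this thread, so the following is only a user-space sketch of the buffer-ownership pattern under discussion: try each compressor into its own temporary buffer, keep the smallest result, and free only the buffers that lost. The `compress_fn` type, the two toy compressors and `compress_best` are all hypothetical and unrelated to the real JFFS2 compressor API.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical compressor signature: returns 0 on success. On entry
 * *outlen is the capacity of 'out', on exit the compressed size. */
typedef int (*compress_fn)(const unsigned char *in, size_t inlen,
                           unsigned char *out, size_t *outlen);

/* Toy compressor 1: plain copy. */
static int toy_copy(const unsigned char *in, size_t inlen,
                    unsigned char *out, size_t *outlen)
{
    if (*outlen < inlen)
        return -1;
    memcpy(out, in, inlen);
    *outlen = inlen;
    return 0;
}

/* Toy compressor 2: run-length encoding as (count, byte) pairs. */
static int toy_rle(const unsigned char *in, size_t inlen,
                   unsigned char *out, size_t *outlen)
{
    size_t i = 0, o = 0;

    while (i < inlen) {
        size_t run = 1;

        while (i + run < inlen && in[i + run] == in[i] && run < 255)
            run++;
        if (o + 2 > *outlen)
            return -1;
        out[o++] = (unsigned char)run;
        out[o++] = in[i];
        i += run;
    }
    *outlen = o;
    return 0;
}

/* Try every compressor and return the smallest output in a buffer that
 * the caller owns and must free(). Returns NULL if none succeeded. */
static unsigned char *compress_best(const compress_fn *fns, size_t nfns,
                                    const unsigned char *in, size_t inlen,
                                    size_t *best_len)
{
    unsigned char *best_buf = NULL;
    size_t i;

    assert(best_len != NULL);
    for (i = 0; i < nfns; i++) {
        size_t len = inlen;              /* never accept expansion */
        unsigned char *tmp = malloc(inlen ? inlen : 1);

        if (tmp == NULL)
            continue;
        if (fns[i](in, inlen, tmp, &len) == 0 &&
            (best_buf == NULL || len < *best_len)) {
            free(best_buf);              /* drop the previous winner only */
            best_buf = tmp;
            *best_len = len;
        } else {
            free(tmp);                   /* loser: free its own buffer */
        }
    }
    return best_buf;
}
```

Each candidate gets a private buffer, so the winner never aliases whatever the last compressor wrote, and nothing is freed while it is still the best result. In the kernel the same idea would also have to hold up against preemption between compressors, which this single-threaded sketch does not model.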
CONFIG_PREEMPT and JFFS2 oops
Hi,

With CONFIG_PREEMPT=y and hammering away on two different JFFS2 partitions on a NAND flash I get an oops within ~10 seconds. This is on a BeagleBoard xM (rev A2, with NAND).

I've boiled it down to whether CONFIG_PREEMPT (bug happens) or CONFIG_PREEMPT_VOLUNTARY (bug doesn't happen) is selected. Of course, changing that affects other things, like inline spinlocking. Turning on CONFIG_DEBUG_SPINLOCK reveals nothing.

By changing this option, I've made the bug go away in a 2.6.32 and 2.6.37 setup where it previously happened, and I've made it appear in a 2.6.39 setup where it previously didn't happen.

Pointers on what to look at next are appreciated. (I've posted this on the mtd-utils mailing list too.) More details below.

Thanks,
Orjan

The setup is simply two JFFS2-formatted partitions, and launching a

while :; do dd if=/dev/zero of=file bs=800 count=1; done

on each of them. Sometimes the oops trace originates from the garbage collector, sometimes the result is a JFFS2 decompress error.
--
Orjan Friberg
FlatFrog Laboratories AB

[ 81.200805] Unable to handle kernel NULL pointer dereference at virtual address
[ 81.217529] pgd = ce13c000
[ 81.220855] [] *pgd=8e172031, *pte=, *ppte=
[ 81.236480] Internal error: Oops: 17 [#1] PREEMPT
[ 81.241210] last sysfs file: /sys/kernel/uevent_seqnum
[ 81.246368] Modules linked in: ftdi_sio usbserial
[ 81.251129] CPU: 0    Not tainted (2.6.32 #6)
[ 81.255584] PC is at crc32_le+0x6c/0xf4
[ 81.259460] LR is at jffs2_write_inode_range+0x2a0/0x420
[ 81.264801] pc : [c0211f28]    lr : [c01ae930]    psr: 2013
[ 81.264801] sp : ce24bcd0 ip : 0001 fp : ce11f840
[ 81.276336] r10: 000c r9 : ce5231d0 r8 : fffc
[ 81.281585] r7 : 0002 r6 : r5 : c03fcf9c r4 :
[ 81.288146] r3 : r2 : 0008 r1 : r0 :
[ 81.294677] Flags: nzCv IRQs on FIQs on Mode SVC_32 ISA ARM Segment user
[ 81.301849] Control: 10c5387d Table: 8e13c019 DAC: 0015
[ 81.307617] Process dd (pid: 5270, stack limit = 0xce24a2f0)
[ 81.313323] Stack: (0xce24bcd0 to 0xce24c000)
[ 81.317687] bcc0: 0002 0003
[ 81.325897] bce0: c01ae930 ce24bd1c ce24bd18 0008
[ 81.334136] bd00: 0002 cdca7000 ce1a8800 0008 0320
[ 81.342346] bd20: 0001326c 0320 ce11f840 ce523208 c07754e0
[ 81.350555] bd40: 0320 ce1a8800 c01a8ac4 0320 ce24bd74
[ 81.358764] bd60: 0320 0320 0320 0320
[ 81.367004] bd80: 0320 ce5232b0 c0097d1c
[ 81.375213] bda0: 0320 0320 c07754e0 ce523208 ce24a000 cebf4140 ce5232b0 1000
[ 81.383422] bdc0: c03efe38 ce24bf40 0001 0320 ce523208 c07754e0
[ 81.391632] bde0: 0320 0320 0320 ce523208
[ 81.399871] be00: c009846c ce24bf00 0320
[ 81.408081] be20: 0002 ce24bf00 ce24bf40 ce24beb0 cebf4140 ce5232b0 0320 0001
[ 81.416290] be40: ce24a000 ce523278 000ad008 c03dd658 0320 ce523278
[ 81.424530] be60: ce24bf40 ce24beb0 0001 cebf4140 000ad008 c009851c
[ 81.432739] be80: ce24beb0 ce24bf40 ce24beb0 cebf4140 ce24bf80 ce24a000
[ 81.440948] bea0: 000aad28 c00bf584 00020242 ce1ae000 0001
[ 81.449157] bec0: cebf4140 ce12d6c0 00020241
[ 81.457397] bee0: 0200 ce12d6c0 c0077028 ce24bef4 ce24bef4 0004
[ 81.465606] bf00: 000aad28 0300 0320 00100073
[ 81.473815] bf20: 000ad000 ce24a000 000ce000 0002 ceb450e0 ce4b0618 0001
[ 81.482025] bf40: 000ad008 0320 cebf4140 000ad008 ce24bf80 0320 0320 c00c01c8
[ 81.490264] bf60: cebf4140 000ad008 cebf4140 0320 000ad008 c00c036c
[ 81.498474] bf80: 0320 0320 0001 000ad008 0004
[ 81.506683] bfa0: c00390c4 c0038f40 0320 0001 0001 000ad008 0320 000acd34
[ 81.514923] bfc0: 0320 0001 000ad008 0004 0320 000ad008 000aad28 000ad008
[ 81.523132] bfe0: 4001e3e0 bece4b60 00010e34 40188abc 6010 0001
[ 81.531372] [c0211f28] (crc32_le+0x6c/0xf4) from [c01ae930] (jffs2_write_inode_range+0x2a0/0x420)
[ 81.540618] [c01ae930] (jffs2_write_inode_range+0x2a0/0x420) from [c01a8ac4] (jffs2_write_end+0x190/0x2d4)
[ 81.550689] [c01a8ac4] (jffs2_write_end+0x190/0x2d4) from [c0097d1c] (generic_file_buffered_write+0x180/0x264)
[ 81.561096] [c0097d1c
Re: CONFIG_PREEMPT and JFFS2 oops
On 01/25/2012 09:12 PM, Orjan Friberg wrote:
> I've boiled it down to whether CONFIG_PREEMPT (bug happens) or
> CONFIG_PREEMPT_VOLUNTARY (bug doesn't happen) is selected.

No, I haven't. The problem disappeared only for

while :; do dd if=/dev/zero of=file bs=800 count=1; done

That one-liner was boiled down from the following program, which still oopses instantly:

#include <stdio.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>

int main()
{
        int fd;
        struct stat st;
        char buf[800];  /* contents don't matter, only the 800-byte write */

        do {
                unlink("file2");
                fd = open("file1", O_RDWR|O_CREAT|O_TRUNC, 0666);
                stat("file1", &st);
                lseek(fd, 0, SEEK_SET);
                write(fd, buf, 800);
                close(fd);
                /* each pass leaves the data behind as file2 */
                rename("file1", "file2");
        } while (1);

        return 0;
}

(Apologies for spamming.)

--
Orjan Friberg
FlatFrog Laboratories AB
USB gadget unreliable on software reboot (BeagleBoard xM)
Hi,

On my BeagleBoard xM, configuring the MUSB controller in Linux to peripheral mode (i.e. not OTG mode) and using a built-in gadget driver, the gadget device sometimes does not appear after a software reboot. I've seen this with both 2.6.32 and 2.6.39 (Angstrom 2008.1 and 2010.x distros, respectively). Our own board exhibits the same behaviour.

However: configuring the MUSB controller in u-boot as a device and only booting as far as u-boot before a software reset, the device always appears. To me this suggests a MUSB driver issue in Linux (as opposed to, say, PHY initialization).

I checked with a USB analyzer what happens on the bus: when it doesn't show up there is a reset on the bus when we reboot, but it doesn't re-enter full speed mode. No SOFs are sent either.

I did a rudimentary check of the OTG registers in the TPS chip (over i2c) but saw nothing out of the ordinary. I set musb_debug = 5 in musb_core.c, but no errors are reported. I haven't looked at the MUSB controller registers yet; that's next.

Any ideas?

Thanks,
Orjan

--
Orjan Friberg
FlatFrog Laboratories AB
copy_to_user speed from dma_alloc_coherent vs. kmalloc buffer
Hi,

I have a driver where I do memory to memory DMA between GPMC and SDRAM. Adding a read function, I found that copy_to_user from a dma_alloc_coherent buffer is significantly slower than from a kmalloc'd one. Looking at arch/arm/include/asm/pgtable.h I suspect this difference in speed is because the dma_alloc_coherent buffer is unbuffered.

What are my options (besides using mmap)?

* Reserve a portion of memory at boot time to be used as the DMA destination buffer, and use ioremap_cached + manual cache flush as needed?
* Turn on buffering for the DMA destination buffer for the duration of the copy_to_user call, then turn it off again (and flush it from the cache)?
* Something else entirely?

This is on a 3730, on Linux 2.6.32.

Thanks,
Orjan

--
Orjan Friberg
FlatFrog Laboratories AB
Re: copy_to_user speed from dma_alloc_coherent vs. kmalloc buffer
On 2011-04-20 17:12, Orjan Friberg wrote:
> What are my options (besides using mmap)?

It looks like kmalloc + dma_map_single for the DMA destination buffer and then dma_sync_single_for_{cpu,device} around the call to copy_to_user pretty much does the trick. At least the %sys load measured with mpstat goes from 13% to 2%.

--
Orjan Friberg
FlatFrog Laboratories AB
OMAP 3730 200 MHz SDRAM config
Hi,

I'm looking at configuring an OMAP 3730 board for 200 MHz SDRAM. I've been looking at the kernel code (arch/arm/mach-omap2) the last couple of days to try and figure out what I need to do. We're basing ourselves off of the Beagleboard, so I tried copying the 200 MHz Hynix SDRAM entry for Beagleboard-xM but that didn't help: it still (re)programs the SDRC clock to 166 MHz.

* Does the kernel at all use or depend on the boot loader's SDRAM config? (I'm using u-boot with a prepended configuration header.)
* Does the SDRAM setup/clocking depend on the MPU rate at all? I.e. do I need to boot Linux at 1 GHz to be able to set a 200 MHz SDRC clock?

The clock config is a bit convoluted, so I'd appreciate any help.

Thanks,
Orjan

Appendix: I'm using a program (user-mode app) called 'bandwidth' (which has an ARM port): http://home.comcast.net/~fbui/bandwidth.html for measurements. With big (several MB) sequential writes I get ~1170 MB/s. The theoretical max at 166 MHz is 166 * 2 * 4 bytes = 1328 MB/s, so we're almost at 90%. We're not the only process accessing memory, and maybe there's some loss due to SDRAM refresh etc.

--
Orjan Friberg
FlatFrog Laboratories AB
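The arithmetic in the appendix can be written down as a small sketch. The helper names are made up; the only inputs are the figures from the message (166 MHz, double data rate, 4-byte bus, ~1170 MB/s measured).

```c
#include <assert.h>

/* Peak DDR bandwidth in MB/s: two transfers per clock cycle,
 * bus_bytes bytes per transfer. */
static unsigned ddr_peak_mb_s(unsigned clock_mhz, unsigned bus_bytes)
{
    assert(bus_bytes > 0);
    return clock_mhz * 2u * bus_bytes;
}

/* Measured throughput as a percentage of the peak, rounded down. */
static unsigned efficiency_percent(unsigned measured_mb_s, unsigned peak_mb_s)
{
    assert(peak_mb_s > 0);
    return measured_mb_s * 100u / peak_mb_s;
}
```

`ddr_peak_mb_s(166, 4)` gives 1328 MB/s and `efficiency_percent(1170, 1328)` gives 88. At 200 MHz the same formula gives a 1600 MB/s peak. Whether the measured figure would scale with it is an open question, not something this calculation shows.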
Re: OMAP 3730 200 MHz SDRAM config
On 2011-03-07 16:19, Elvis Dowson wrote:
> You probably need update x-loader. Try using the beagleboard x-loader
> project located at gitorious (v1.44) or the ti arago one (1.48, but not
> quite the latest in terms of support for beagleboard xm parts). Looking
> at board/omap3530beagle/omap3530beagle.c for the memory part
> definitions. For the XM, the Numonyx part is at 165Mhz, and the Micron
> part is at 200Mhz.

I'm using u-boot with a configuration header, and there I have set the new CTRLA, CTRLB and RFR values (and I did compare the values with the Micron data sheet; apart from the TCKE value they are all identical).

But are you saying that the values set by the boot loader are preserved by the kernel? (In that case I wonder what the sdram-micron header file is for :)

Thanks,
Orjan

--
Orjan Friberg
FlatFrog Laboratories AB