Re: Intermittent SD write corruption on DM355, kernel 2.6.36
Jon Povey wrote:
> Jon Povey wrote:
>> I am seeing rare SD card write corruption on DM355
>
> I think I have a fix for this now - just CC'd you on the patch RFC
> [RFC] mmc: davinci: fix corruption after surprise card eject
> (Which fixes it, but may not be the best way of going about it).

Tested on DM368, where this appears to fix the issue as well.

--
Gilles Chanteperdrix
NexVision, http://www.nexvision.fr
Office: 99, av Clot-Bey, 13008 Marseille, France
Phone: +33 (0)4 91 77 62 87 / Fax: +33 (0)4 91 77 64 10
___
Davinci-linux-open-source mailing list
Davinci-linux-open-source@linux.davincidsp.com
http://linux.davincidsp.com/mailman/listinfo/davinci-linux-open-source
RE: Intermittent SD write corruption on DM355, kernel 2.6.36
Jon Povey wrote:
> I am seeing rare SD card write corruption on DM355

I think I have a fix for this now - just CC'd you on the patch RFC
[RFC] mmc: davinci: fix corruption after surprise card eject
(Which fixes it, but may not be the best way of going about it).

--
Jon Povey
jon.po...@racelogic.co.uk

Racelogic is a limited company registered in England. Registered
number 2743719. Registered Office Unit 10, Swan Business Centre,
Osier Way, Buckingham, Bucks, MK18 1TB.

The information contained in this electronic mail transmission is
intended by Racelogic Ltd for the use of the named individual or
entity to which it is directed and may contain information that is
confidential or privileged. If you have received this electronic mail
transmission in error, please delete it from your system without
copying or forwarding it, and notify the sender of the error by reply
email so that the sender's address records can be corrected. The views
expressed by the sender of this communication do not necessarily
represent those of Racelogic Ltd. Please note that Racelogic reserves
the right to monitor e-mail communications passing through its network
Re: Intermittent SD write corruption on DM355, kernel 2.6.36
Jon Povey wrote:
> Jon Povey wrote:
>> I am seeing rare SD card write corruption on DM355
>
> I think I have a fix for this now - just CC'd you on the patch RFC
> [RFC] mmc: davinci: fix corruption after surprise card eject
> (Which fixes it, but may not be the best way of going about it).

If I read you correctly, it seems that the fix does things which are
already done, right? So, how come it fixes it? :-)

Also, it may make sense to post the patch on the linux-arm-kernel
mailing list too.

--
Gilles Chanteperdrix
RE: Intermittent SD write corruption on DM355, kernel 2.6.36
Gilles Chanteperdrix wrote:
> Jon Povey wrote:
>> I think I have a fix for this now - just CC'd you on the patch RFC
>> [RFC] mmc: davinci: fix corruption after surprise card eject
>> (Which fixes it, but may not be the best way of going about it).
>
> If I read you correctly, it seems that the fix does things which are
> already done, right? So, how come it fixes it? :-)

Not sure I understand. The existing code resets the controller core
when handling certain errors, but more MMC commands are run after
those error handlers. Presumably one of those puts the core into a bad
state, and a later reset (my patch) fixes it.

Feel free to try it out!

> Also, it may make sense to post the patch on the linux-arm-kernel
> mailing list too.

Hmm, should maybe have CC'd them. And David Brownell. Ho well.

--
Jon Povey
RE: Intermittent SD write corruption on DM355, kernel 2.6.36
Hi Jon,

On Mon, Jan 31, 2011 at 10:15:10, Jon Povey wrote:
> Gilles Chanteperdrix wrote:
>> Jon Povey wrote:
>>> I am seeing rare SD card write corruption on DM355 running 2.6.36.
>>> The system will get itself into a state where it appears all SD
>>> writes are offset by two bytes. This is using a vfat filesystem on
>>> the SD, and affects the FAT and directories at least.
>>
>> we observe something similar with the 2.6.32 kernel. In our case, it
>> happens almost systematically if we remove the SD card while a
>> transfer is occurring.
>
> Thanks for the report. That sounds like what we are seeing.
> I suspect some kind of incomplete DMA or controller state clearing
> on surprise removal.

It will also be useful to see if using journalling filesystems like
ext3 or ext4 helps the situation.

Thanks,
Sekhar

> Will see if I can debug it and post a patch if I work it out.
>
> --
> Jon Povey
Re: Intermittent SD write corruption on DM355, kernel 2.6.36
Jon Povey wrote:
> Gilles Chanteperdrix wrote:
>> Jon Povey wrote:
>>> I am seeing rare SD card write corruption on DM355 running 2.6.36.
>>> The system will get itself into a state where it appears all SD
>>> writes are offset by two bytes. This is using a vfat filesystem on
>>> the SD, and affects the FAT and directories at least.
>>
>> we observe something similar with the 2.6.32 kernel. In our case, it
>> happens almost systematically if we remove the SD card while a
>> transfer is occurring.
>
> Thanks for the report. That sounds like what we are seeing.
> I suspect some kind of incomplete DMA or controller state clearing
> on surprise removal.

We did not investigate this issue any further (and notably, we did not
hexdump the SD card contents when corrupted), but it certainly looks
like something like this. The issue is fairly deterministic with
regard to the timing of the SD card removal. When we remove the card,
we observe that the SD card controller, or its driver, has one of
three reactions:
- it continues to work normally
- it fails to detect later re-insertion of an SD card
- the next dosfsck reports an error in the available cluster count,
  and repairing the filesystem actually corrupts the FAT32 FS
  information sector, making it unmountable, and it fails to detect
  the superblock on any later SD card.

In our case, removing the SD card after recording for 45s triggers the
third case almost certainly. I assumed 45s is the time when the FS
information sector is first modified in our case, but this timing is
probably specific to our application.

Some other information: we have this issue on DM368 using the 2.6.32
kernel from the Arago tree. And our SD cards are correctly formatted,
i.e. the same way the SD Association's SD card formatter does it.

--
Gilles Chanteperdrix
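For anyone who wants to check a dumped FS information sector by hand
rather than rely on dosfsck: below is a minimal Python sketch. It
assumes the standard FAT32 FSINFO layout (lead signature at offset 0,
structure signature at offset 484 followed by the free cluster count
and next-free hint, trail signature at offset 508). The function names
are made up for this illustration and nothing here is taken from
dosfsck or from the driver.

```python
import struct

SECTOR_SIZE = 512

# Signature values from the FAT32 FSINFO layout (little-endian on disk).
FSINFO_LEAD_SIG = 0x41615252   # offset 0, reads "RRaA" in a hexdump
FSINFO_STRUC_SIG = 0x61417272  # offset 484, reads "rrAa" in a hexdump
FSINFO_TRAIL_SIG = 0xAA550000  # offset 508


def build_fsinfo(free_count, next_free):
    """Build a well-formed FSINFO sector (illustration helper only)."""
    sector = bytearray(SECTOR_SIZE)
    struct.pack_into("<I", sector, 0, FSINFO_LEAD_SIG)
    struct.pack_into("<III", sector, 484,
                     FSINFO_STRUC_SIG, free_count, next_free)
    struct.pack_into("<I", sector, 508, FSINFO_TRAIL_SIG)
    return bytes(sector)


def parse_fsinfo(sector):
    """Decode one 512-byte FSINFO sector and check its three signatures."""
    if len(sector) != SECTOR_SIZE:
        raise ValueError("expected exactly one 512-byte sector")
    (lead,) = struct.unpack_from("<I", sector, 0)
    struc, free_count, next_free = struct.unpack_from("<III", sector, 484)
    (trail,) = struct.unpack_from("<I", sector, 508)
    return {
        "valid": (lead == FSINFO_LEAD_SIG
                  and struc == FSINFO_STRUC_SIG
                  and trail == FSINFO_TRAIL_SIG),
        "free_count": free_count,
        "next_free": next_free,
    }
```

A sector written two bytes late would fail the signature check, since
every signature moves away from its expected offset. On a real card
the FSINFO sector is usually sector 1 of the FAT32 volume, but the
boot sector field that points to it should be checked rather than
assumed.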
RE: Intermittent SD write corruption on DM355, kernel 2.6.36
Gilles Chanteperdrix wrote:
> Jon Povey wrote:
>> I am seeing rare SD card write corruption on DM355
>
> We did not investigate this issue any further (and notably, we did
> not hexdump the SD card contents when corrupted), but it certainly
> looks like something like this. The issue is fairly deterministic
> with regard to the timing of the SD card removal. When we remove the
> card, we observe that the SD card controller, or its driver, has one
> of three reactions:
> - it continues to work normally
> - it fails to detect later re-insertion of an SD card
> - the next dosfsck reports an error in the available cluster count,
>   and repairing the filesystem actually corrupts the FAT32 FS
>   information sector, making it unmountable, and it fails to detect
>   the superblock on any later SD card.

Thanks again for the info.

I have been doing hexdumps of cards, using HxD on Windows
(http://mh-nexus.de/en/).

I have now been able to reproduce the state where the system corrupts
all writes, shifting the data by 2 bytes. Pulling the card during a
write seems to be the trigger, as you found.

Assuming you are seeing the same thing, the system will corrupt the
file contents, directories, FAT and FSINFO sectors the same way,
whenever it writes them. The last two bytes from a sector write show
up as the first two bytes of the next sector written. For example, the
last two bytes of a test file appear as the first two bytes of a
(corrupted) directory sector.

I have seen similar complaints about FSINFO from Linux and corruption
/ lost clusters from Windows.

> In our case, removing the SD card after recording for 45s triggers
> the third case almost certainly. I assumed 45s is the time when the
> FS information sector is first modified in our case, but this timing
> is probably specific to our application.

That sounds about right. Under Linux the OS will cache those writes
for some (tuneable) amount of time before flushing them to disk.

> Some other information: we have this issue on DM368 using the 2.6.32
> kernel from the Arago tree. And our SD cards are correctly formatted,
> i.e. the same way the SD Association's SD card formatter does it.

That is interesting. It suggests that, at least, it's not a silicon
erratum in the MMC/SD block that was found and fixed between DM355 and
DM368.

I am busy getting familiar with the sources and working out how best
to approach debugging this. Will send info when/if I learn something.

--
Jon Povey
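The two-byte carry-over described above can be modelled in a few lines
of Python, which may help when comparing hexdumps against what was
supposed to be written. This is only a sketch of the observed symptom,
not of the controller or driver mechanism, which was still unknown at
this point in the thread. The function names and the sample sector
contents are invented for the illustration.

```python
SECTOR_SIZE = 512
SHIFT = 2  # the corruption reported in this thread is a 2-byte offset


def write_shifted(sectors, carry=b"\x00" * SHIFT):
    """Model the symptom: each sector lands SHIFT bytes late, and the
    bytes pushed off its end lead the next sector written."""
    on_card = []
    for data in sectors:
        if len(data) != SECTOR_SIZE:
            raise ValueError("expected whole 512-byte sectors")
        on_card.append(carry + data[:-SHIFT])
        carry = data[-SHIFT:]
    return on_card


def looks_shifted(intended, found):
    """True if the sector found on the card is the intended data
    delayed by SHIFT bytes (the leading SHIFT bytes are ignored)."""
    return found[SHIFT:] == intended[:-SHIFT]


def carried_over(previous_intended, found):
    """True if the sector found on the card starts with the tail of
    the sector that was written just before it."""
    return found[:SHIFT] == previous_intended[-SHIFT:]


# A test file sector followed by a directory sector, as in the report.
file_sector = bytes(range(256)) * 2
dir_sector = b"TESTFILETXT".ljust(SECTOR_SIZE, b"\x00")
on_card = write_shifted([file_sector, dir_sector])
print(carried_over(file_sector, on_card[1]))  # prints True
```

Running `looks_shifted()` over pairs of intended and dumped sectors is
a quick way to tell this failure apart from ordinary torn writes
caused by the card being pulled mid-transfer.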
RE: Intermittent SD write corruption on DM355, kernel 2.6.36
Gilles Chanteperdrix wrote:
> Jon Povey wrote:
>> I am seeing rare SD card write corruption on DM355 running 2.6.36.
>> The system will get itself into a state where it appears all SD
>> writes are offset by two bytes. This is using a vfat filesystem on
>> the SD, and affects the FAT and directories at least.
>
> we observe something similar with the 2.6.32 kernel. In our case, it
> happens almost systematically if we remove the SD card while a
> transfer is occurring.

Thanks for the report. That sounds like what we are seeing.
I suspect some kind of incomplete DMA or controller state clearing
on surprise removal.

Will see if I can debug it and post a patch if I work it out.

--
Jon Povey
Re: Intermittent SD write corruption on DM355, kernel 2.6.36
Jon Povey wrote:
> I am seeing rare SD card write corruption on DM355 running 2.6.36.
> The system will get itself into a state where it appears all SD
> writes are offset by two bytes. This is using a vfat filesystem on
> the SD, and affects the FAT and directories at least.

Hi,

we observe something similar with the 2.6.32 kernel. In our case, it
happens almost systematically if we remove the SD card while a
transfer is occurring.

Regards.

--
Gilles Chanteperdrix