Re: Intermittent SD write corruption on DM355, kernel 2.6.36

2011-02-14 Thread Gilles Chanteperdrix
Jon Povey wrote:
 Jon Povey wrote:
 I am seeing rare SD card write corruption on DM355
 
 I think I have a fix for this now - just CC'd you on the patch RFC
 
 [RFC] mmc: davinci: fix corruption after surprise card eject
 
 (Which fixes it, but may not be the best way of going about it).

Tested on DM368, where this appears to fix the issue as well.

-- 
Gilles Chanteperdrix
NexVision, http://www.nexvision.fr
Office: 99, av Clot-Bey, 13008 Marseille, France
Phone: +33 (0)4 91 77 62 87 / Fax: +33 (0)4 91 77 64 10
___
Davinci-linux-open-source mailing list
Davinci-linux-open-source@linux.davincidsp.com
http://linux.davincidsp.com/mailman/listinfo/davinci-linux-open-source


RE: Intermittent SD write corruption on DM355, kernel 2.6.36

2011-02-03 Thread Jon Povey
 Jon Povey wrote:
 I am seeing rare SD card write corruption on DM355

I think I have a fix for this now - just CC'd you on the patch RFC

[RFC] mmc: davinci: fix corruption after surprise card eject

(Which fixes it, but may not be the best way of going about it).

--
Jon Povey
jon.po...@racelogic.co.uk

Racelogic is a limited company registered in England. Registered number 2743719.
Registered Office Unit 10, Swan Business Centre, Osier Way, Buckingham, Bucks, MK18 1TB.

The information contained in this electronic mail transmission is intended by 
Racelogic Ltd for the use of the named individual or entity to which it is 
directed and may contain information that is confidential or privileged. If you 
have received this electronic mail transmission in error, please delete it from 
your system without copying or forwarding it, and notify the sender of the 
error by reply email so that the sender's address records can be corrected. The 
views expressed by the sender of this communication do not necessarily 
represent those of Racelogic Ltd. Please note that Racelogic reserves the right 
to monitor e-mail communications passing through its network




Re: Intermittent SD write corruption on DM355, kernel 2.6.36

2011-02-03 Thread Gilles Chanteperdrix
Jon Povey wrote:
 Jon Povey wrote:
 I am seeing rare SD card write corruption on DM355
 
 I think I have a fix for this now - just CC'd you on the patch RFC
 
 [RFC] mmc: davinci: fix corruption after surprise card eject
 
 (Which fixes it, but may not be the best way of going about it).

If I read you correctly, it seems that the fix does things which are
already done, right? So how come it fixes the problem? :-)

That said, it may make sense to post the patch on the
linux-arm-kernel mailing list too.

-- 
Gilles Chanteperdrix
NexVision, http://www.nexvision.fr
Office: 99, av Clot-Bey, 13008 Marseille, France
Phone: +33 (0)4 91 77 62 87 / Fax: +33 (0)4 91 77 64 10


RE: Intermittent SD write corruption on DM355, kernel 2.6.36

2011-02-03 Thread Jon Povey
Gilles Chanteperdrix wrote:
 Jon Povey wrote:
 I think I have a fix for this now - just CC'd you on the patch RFC

 [RFC] mmc: davinci: fix corruption after surprise card eject

 (Which fixes it, but may not be the best way of going about it).

 If I read you correctly, it seems that the fix does things which are
 already done, right? So how come it fixes the problem? :-)

Not sure I understand.
The existing code resets the controller core when handling certain errors,
but more MMC commands are run after those error handlers. Presumably one
of those puts the core into a bad state, and a later reset (my patch) fixes it.

Feel free to try it out!

 That said, it may make sense to post the patch on the
 linux-arm-kernel mailing list too.

Hmm, I should maybe have CC'd them, and David Brownell. Oh well.

--
Jon Povey
jon.po...@racelogic.co.uk



RE: Intermittent SD write corruption on DM355, kernel 2.6.36

2011-01-31 Thread Nori, Sekhar
Hi Jon,

On Mon, Jan 31, 2011 at 10:15:10, Jon Povey wrote:
 Gilles Chanteperdrix wrote:
  Jon Povey wrote:
  I am seeing rare SD card write corruption on DM355 running 2.6.36.
  The system will get itself into a state where it appears all SD
  writes are offset by two bytes. This is using a vfat filesystem on
  the SD, and affects the FAT and directories at least.
 
  we observe something similar with the 2.6.32 kernel. In our case, it
  happens almost systematically if we remove the SD card while
  a transfer is occurring.
 
 Thanks for the report. That sounds like what we are seeing.
 I suspect some kind of incomplete DMA or controller state clearing
 on surprise removal.

It will also be useful to see if using journalling filesystems like ext3
or ext4 helps the situation.

Thanks,
Sekhar

 
 Will see if I can debug it and post a patch if I work it out.
 


Re: Intermittent SD write corruption on DM355, kernel 2.6.36

2011-01-31 Thread Gilles Chanteperdrix
Jon Povey wrote:
 Gilles Chanteperdrix wrote:
 Jon Povey wrote:
 I am seeing rare SD card write corruption on DM355 running
 2.6.36. The system will get itself into a state where it appears
 all SD writes are offset by two bytes. This is using a vfat
 filesystem on the SD, and affects the FAT and directories at
 least.
 we observe something similar with the 2.6.32 kernel. In our case,
 it happens almost systematically if we remove the SD card while a
 transfer is occurring.
 
 Thanks for the report. That sounds like what we are seeing. I suspect
 some kind of incomplete DMA or controller state clearing on surprise
 removal.

We did not investigate this issue any further (and notably, we did not
hexdump the SD card contents when corrupted), but it certainly looks
like something along those lines. The outcome is fairly deterministic
with regard to when we remove the SD card. When we do so, we observe
that the SD card controller, or its driver, reacts in one of three ways:
- it continues to work normally
- it fails to detect later re-insertion of an SD card
- the next dosfsck run reports an error in the available cluster count,
  and repairing the filesystem actually corrupts the FAT32 FS
  information sector, making the card unmountable; the system then
  fails to detect the superblock on any SD card inserted later.
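
To make the third case more concrete, here is a minimal Python sketch of
what a sanity check of the FAT32 FS information sector could look like.
The signature values and offsets are the ones from the FAT32 on-disk
layout; the helper names (make_fsinfo, parse_fsinfo) are made up for
this example, and this is not meant to describe what dosfsck does
internally. The 2-byte shift used in the demo is only an assumption
about what the corruption might look like.

```python
import struct

SECTOR_SIZE = 512
LEAD_SIG = 0x41615252    # "RRaA" at offset 0
STRUCT_SIG = 0x61417272  # "rrAa" at offset 484
TRAIL_SIG = 0xAA550000   # 00 00 55 AA at offset 508


def make_fsinfo(free_count, next_free):
    """Build a well-formed FSINFO sector (hypothetical helper, for testing)."""
    sector = bytearray(SECTOR_SIZE)
    struct.pack_into("<I", sector, 0, LEAD_SIG)
    # Struct signature, free cluster count, next free cluster hint.
    struct.pack_into("<III", sector, 484, STRUCT_SIG, free_count, next_free)
    struct.pack_into("<I", sector, 508, TRAIL_SIG)
    return bytes(sector)


def parse_fsinfo(sector):
    """Return (free_count, next_free), or None if a signature is wrong."""
    if len(sector) != SECTOR_SIZE:
        return None
    (lead,) = struct.unpack_from("<I", sector, 0)
    struc, free_count, next_free = struct.unpack_from("<III", sector, 484)
    (trail,) = struct.unpack_from("<I", sector, 508)
    if (lead, struc, trail) != (LEAD_SIG, STRUCT_SIG, TRAIL_SIG):
        return None
    return free_count, next_free


good = make_fsinfo(1000, 5)
# Assumed corruption: sector contents pushed back by two stale bytes.
shifted = b"\x00\x00" + good[:-2]
print(parse_fsinfo(good))
print(parse_fsinfo(shifted))
```

Any misalignment of the sector moves all three signatures, so a check
like this fails immediately, which would be consistent with a card that
can no longer be mounted.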

In our case, removing the SD card after recording for 45s almost always
triggers the third case. I assumed 45s is the time when the FS information
sector is first modified in our case, but this timing is probably
specific to our application.

Some other information: we have this issue on DM368 using the 2.6.32
kernel from the Arago tree. Our SD cards are correctly formatted,
i.e. the same way the SD Association's SD card formatter formats them.

-- 
Gilles Chanteperdrix
NexVision, http://www.nexvision.fr
Office: 99, av Clot-Bey, 13008 Marseille, France
Phone: +33 (0)4 91 77 62 87 / Fax: +33 (0)4 91 77 64 10


RE: Intermittent SD write corruption on DM355, kernel 2.6.36

2011-01-31 Thread Jon Povey
Gilles Chanteperdrix wrote:
 Jon Povey wrote:
 I am seeing rare SD card write corruption on DM355

 We did not investigate this issue any further (and notably, we did not
 hexdump the SD card contents when corrupted), but it certainly looks
 like something along those lines. The outcome is fairly deterministic
 with regard to when we remove the SD card. When we do so, we observe
 that the SD card controller, or its driver, reacts in one of three ways:
 - it continues to work normally
 - it fails to detect later re-insertion of an SD card
 - the next dosfsck run reports an error in the available cluster count,
   and repairing the filesystem actually corrupts the FAT32 FS
   information sector, making the card unmountable; the system then
   fails to detect the superblock on any SD card inserted later.

Thanks again for the info. I have been doing hexdumps of cards,
using HxD on Windows (http://mh-nexus.de/en/).
I have now been able to reproduce the state where the system corrupts
all writes, shifting the data by 2 bytes. Pulling the card during a
write seems to be the trigger, as you found.

Assuming you are seeing the same thing, the system will corrupt the
file contents, directories, FAT and FSINFO sectors the same way,
whenever it writes them. The last two bytes from a sector write show
up as the first two bytes of the next sector written. For example,
the last two bytes of a test file appear as the first two bytes of
a (corrupted) directory sector.
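
For anyone wanting to check their own hexdumps for this pattern, the
following is a small Python model of the behaviour described above,
plus a check for it. It is only a sketch: the function names, the
512-byte sector size and the two initial stale bytes are assumptions
made for the example, not something taken from the driver.

```python
SECTOR_SIZE = 512  # assumed sector size
SHIFT = 2          # observed offset, in bytes


def shifted_write(sector, carry):
    """Model one corrupted sector write.

    Returns (bytes_landing_on_card, new_carry): the two bytes left over
    from the previous write come first, and the last two bytes of this
    sector are held back for the next write.
    """
    assert len(sector) == SECTOR_SIZE and len(carry) == SHIFT
    return carry + sector[:-SHIFT], sector[-SHIFT:]


def looks_shifted(expected, on_card, shift=SHIFT):
    """True if on_card is expected delayed by `shift` bytes."""
    return len(expected) == len(on_card) and on_card[shift:] == expected[:-shift]


file_sector = bytes(range(256)) * 2              # test file data
dir_sector = b"DIRENTRY".ljust(SECTOR_SIZE, b"\x00")  # stand-in directory sector

carry = b"\x00\x00"  # assumed stale bytes before the first bad write
card_file, carry = shifted_write(file_sector, carry)
card_dir, carry = shifted_write(dir_sector, carry)

# The tail of the file sector shows up at the head of the directory sector.
print(card_dir[:SHIFT] == file_sector[-SHIFT:])
print(looks_shifted(dir_sector, card_dir))
```

Comparing a known test pattern against what is read back from the card
with looks_shifted() is a quick way to tell this failure apart from
ordinary filesystem damage.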

I have seen similar complaints about FSINFO from Linux, and about
corruption / lost clusters from Windows.

 In our case, removing the SD card after recording for 45s almost always
 triggers the third case. I assumed 45s is the time when the FS
 information sector is first modified in our case, but this timing is
 probably specific to our application.

That sounds about right. Under Linux the OS will cache those writes
for some (tuneable) amount of time before flushing them to disk.
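
As a rough illustration of those tunables: Linux exposes its writeback
timing under /proc/sys/vm, in hundredths of a second. The sketch below
reads two of them, dirty_expire_centisecs and dirty_writeback_centisecs.
That these are the knobs that matter for the timing seen here is an
assumption, and the helper names are made up for the example.

```python
from pathlib import Path

VM_DIR = Path("/proc/sys/vm")


def centisecs_to_seconds(text):
    """Convert the contents of a *_centisecs file to seconds."""
    return int(text.strip()) / 100.0


def read_writeback_tunable(name, base=VM_DIR):
    """Return the tunable in seconds, or None if it cannot be read."""
    try:
        return centisecs_to_seconds((base / name).read_text())
    except (OSError, ValueError):
        return None


for name in ("dirty_expire_centisecs", "dirty_writeback_centisecs"):
    # Prints None on systems where the file is missing or unreadable.
    print(name, read_writeback_tunable(name))
```

Lowering these values, or mounting with the sync option, would shorten
the window between a write and the point where it reaches the card, at
some cost in throughput.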

 Some other information: we have this issue on DM368 using the 2.6.32
 kernel from the Arago tree. Our SD cards are correctly formatted,
 i.e. the same way the SD Association's SD card formatter formats them.

That is interesting. It suggests that, at the least, this is not a silicon
erratum in the MMC/SD block that was found and fixed between DM355 and DM368.

I am busy getting familiar with the sources and working out how best to
approach debugging this. I will send more info when/if I learn something.

--
Jon Povey
jon.po...@racelogic.co.uk



RE: Intermittent SD write corruption on DM355, kernel 2.6.36

2011-01-30 Thread Jon Povey
Gilles Chanteperdrix wrote:
 Jon Povey wrote:
 I am seeing rare SD card write corruption on DM355 running 2.6.36.
 The system will get itself into a state where it appears all SD
 writes are offset by two bytes. This is using a vfat filesystem on
 the SD, and affects the FAT and directories at least.

 we observe something similar with the 2.6.32 kernel. In our case, it
 happens almost systematically if we remove the SD card while
 a transfer is occurring.

Thanks for the report. That sounds like what we are seeing.
I suspect some kind of incomplete DMA or controller state clearing
on surprise removal.

Will see if I can debug it and post a patch if I work it out.

--
Jon Povey
jon.po...@racelogic.co.uk



Re: Intermittent SD write corruption on DM355, kernel 2.6.36

2011-01-28 Thread Gilles Chanteperdrix
Jon Povey wrote:
 I am seeing rare SD card write corruption on DM355 running 2.6.36.
 The system will get itself into a state where it appears all SD writes are
 offset by two bytes. This is using a vfat filesystem on the SD, and
 affects the FAT and directories at least.

Hi,

we observe something similar with the 2.6.32 kernel. In our case, it
happens almost systematically if we remove the SD card while a transfer
is occurring.

Regards.

-- 
Gilles Chanteperdrix
NexVision, http://www.nexvision.fr
Office: 99, av Clot-Bey, 13008 Marseille, France
Phone: +33 (0)4 91 77 62 87 / Fax: +33 (0)4 91 77 64 10