On Fri, 1 Jun 2007, David Fox wrote:

> We've recently started experimenting with using ntfs-3g (1.417) in a 
> system where we pack a filesystem as full as we can with a set of 
> files with fairly unpredictable characteristics in terms of size and 
> content. We can control the file names and the directory structure of the 
> data we write to the filesystem. We've recently added the constraint that 
> it has to be an NTFS filesystem; previously we were simply using ext2.

Why NTFS? Why not XFS?

> With ntfs-3g, we find that unless we happen to be working with mostly 
> "large" files, we invariably run afoul of one of the two MFT 
> extension/expansion cases that seem not to be implemented in 
> libntfs-3g/mft.c. The errors occur long before the filesystem is 
> significantly populated if the files are small and numerous enough. The 
> particular one we hit is the one that says:
>
> > May 31 15:13:03 appl2 ntfs-3g[29281]: Not enough space to extended mft data 
> > attribute.
> > May 31 15:13:03 appl2 ntfs-3g[29281]: Could not allocate new MFT record

Yes, documented at http://ntfs-3g.org/support.html#filecreate

> This is uniquely emitted by ntfs_mft_data_extend_allocation() at line 
> 1010. There's a nice big tantalizing TODO right there.

Yes, there are about 300 TODOs all over the code, in notes, etc. They 
are planned to be categorized once the most important missing things are 
sorted out.

> I haven't really started looking carefully at this yet, but even before I 
> try to familiarize myself with the ntfs-3g code (and the filesystem 
> itself) I have questions that you folks might be able to answer:
>
> 1. Is anybody already actively working on implementing this particular
> operation?

No. I plan to do it in about 2-3 months, depending on how much spare time 
I have. It could happen sooner, but also much later.

Of course, a correct and well-tested patch is welcome anytime :)

> 2. Is there a good reason why it hasn't been implemented? Like, is it 
> pretty mysterious and impossibly difficult? :-)

No. It's a very rare corner case that most people never notice or complain 
about, so it's not the highest priority. The driver has several ways of 
trying to prevent this situation, though filling a fragmented volume with 
small files can defeat them at some point.

On the other hand, if the implementation is not perfect, one could lose 
many files, since $MFT stores the information about them.

> 3. Although we can't control file sizes, does anybody know of any 
> strategy we can undertake with the factors that we can control (directory 
> structure, file names, number of concurrent application threads writing 
> files to the filesystem, I guess) to try to ensure that this particular 
> kind of MFT extension doesn't need to occur when we're populating a 
> filesystem?

Every (16th) new file will extend $MFT. So you either need to keep 
compacting the volume and defragmenting $MFT, or create a lot of 0-byte 
temporary files on a freshly formatted volume (1 million files will take 
only about 1 GB of non-reclaimable disk space) and then remove them. After 
that, $MFT can't easily fragment during normal usage, since it's 
preallocated contiguously.

The more you preallocate, the lower the chance of hitting EOPNOTSUPP, but 
also the less disk space you can use to store data.
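
As a rough illustration of the preallocation trick described above, here is 
a minimal Python sketch. It is not part of ntfs-3g. The function names, the 
per-directory batch size and the mount point in the usage note are all 
hypothetical, and the 1 KiB MFT record size is an assumption chosen to 
match the "1 million files, 1 GB" figure; check the record size of your own 
volume. Note that the helper directories also consume a few MFT records.

```python
import os
import shutil
import tempfile

# Assumption: 1 KiB per MFT record (matches "1 million files ~ 1 GB").
MFT_RECORD_SIZE = 1024


def estimate_mft_bytes(num_files, record_size=MFT_RECORD_SIZE):
    """Approximate $MFT space kept reserved after creating num_files files."""
    return num_files * record_size


def preallocate_mft(mount_point, num_files, per_dir=10_000):
    """Create num_files empty files under mount_point, then delete them all.

    Intended for a freshly formatted volume, so that $MFT grows while free
    space is still contiguous. Returns the number of files created.
    """
    root = tempfile.mkdtemp(prefix="mft_prealloc_", dir=mount_point)
    created = 0
    try:
        while created < num_files:
            # Spread the files over subdirectories to keep each one small.
            subdir = os.path.join(root, f"d{created // per_dir:06d}")
            os.mkdir(subdir)
            for i in range(min(per_dir, num_files - created)):
                # Mode "x" creates a new zero-byte file and fails if it exists.
                with open(os.path.join(subdir, f"f{i:05d}"), "x"):
                    pass
                created += 1
    finally:
        # All files exist at the same time before this point; remove them now.
        shutil.rmtree(root)
    return created
```

For example, preallocate_mft("/mnt/ntfs", 1_000_000) would, under the 
assumptions above, trade roughly estimate_mft_bytes(1_000_000) bytes of 
data space for a contiguous $MFT.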

> Thanks! 

You're welcome.

        Szaka

_______________________________________________
ntfs-3g-devel mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/ntfs-3g-devel
