On 10/22/2013 12:18 PM, Dyer, Rodney wrote:
> This problem is affecting us as well.
> 
> I'm not entirely convinced this problem couldn't be fixed with some AFS 
> client trickery.

As you so often like to tell me, the application should attempt the
operation against the file system and let it fail if it is going to
fail.  The problem with the Explorer Shell is that it tries to be
smarter than the file system.  The Explorer Shell maintains a cache of
all path components, including which volume each path is in, what the
volume attributes are, and how much free space is present there.  When
that cache becomes corrupted, the Explorer Shell makes incorrect
decisions about which volume a path refers to and how much free space
there is.

To the best of my knowledge there is no way to reset the Explorer
cache other than killing the explorer process.  When the Explorer Shell
guesses that there is no free space at a particular path, it will never
even try the operation.  It assumes it is smarter.

> Our issue is that we have a drive mapped to the cell root.
> 
> Since our drive is mapped to the root, the application 'thinks' the entire 
> file system on that 'drive' is one 'volume'.

I'm not sure how you have come to this conclusion.  Let's say that your
cell layout looks like:

 \\afs\college.edu\user\username\windows\profile.V2\

where

  college.edu mp-> root.cell.readonly

  user        mp-> root.user.readonly

  username    mp-> user.username

  windows     = directory

  profile.V2  = directory

and you have mapped R: to \\afs\college.edu

If the application thinks that the entire tree under R: is one volume
that is not the fault of AFS.  AFS tells the application this is not the
case by setting the Reparse Point File Attribute on each mount point and
by setting the Surrogate Bit in the Reparse Point Tag value.  The RP
Surrogate bit tells the application that the reparse point is a place
holder for some other object which might not be in the same volume.
The application doesn't need to know anything about AFS.  It just needs
to follow the rules for parsing the output of Win32 Directory Queries
and File Information Queries.
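
As a minimal sketch of that rule (Python is used only for illustration;
the constants are the documented Win32 values, while the helper names
are made up for this example), an application can test the attribute
bit first and then the surrogate bit of the tag:

```python
# Documented Win32 values.
FILE_ATTRIBUTE_REPARSE_POINT = 0x0400
REPARSE_TAG_NAME_SURROGATE_BIT = 0x20000000  # bit tested by IsReparseTagNameSurrogate()
IO_REPARSE_TAG_MOUNT_POINT = 0xA0000003
IO_REPARSE_TAG_SYMLINK = 0xA000000C


def is_name_surrogate(reparse_tag):
    """True when the tag says the entry stands in for some other object."""
    return bool(reparse_tag & REPARSE_TAG_NAME_SURROGATE_BIT)


def may_leave_volume(file_attributes, reparse_tag):
    """Hypothetical helper: True when the entry is a reparse point whose
    tag has the surrogate bit set, so its target may be in another volume."""
    if not file_attributes & FILE_ATTRIBUTE_REPARSE_POINT:
        return False
    return is_name_surrogate(reparse_tag)
```

On Windows with Python 3.8 or later, os.lstat() exposes both inputs as
st_file_attributes and st_reparse_tag.  Nothing in the check is specific
to AFS.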

If you have an application that is incapable of processing reparse
points you should raise that issue with the application vendor because
that application is going to break on every Windows File System that
supports reparse points.

> Some of our applications try to be 'smart' and check the 'disk free' before 
> they perform their writes.

The applications do this because the Windows Cache (not the AFS cache)
will permit an application to write into the cache as much data as it
wants until the file system tells the Windows Cache that there is no
more room.  Unfortunately some network file system clients have a very
poor idea of the free space on the server (think SMB/CIFS).  As a result
the application can write hundreds of MBs or GBs of data into the
Windows cache before the Windows cache begins to lazy write the data to
the network.  If it turns out there is no room, the application is told
a network error occurred and the data is lost.  In XP a balloon would
appear with an error message indicating that data was lost.  In Vista
and above the error is written to the Windows Application Event Log but
the end user isn't even told.

So applications have a very good reason to check the free space or use
non-cached writes if they actually care about their data getting to the
file server when the disk is "remote" because aren't all "remote" disks
"SMB/CIFS"?
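
A rough sketch of the "make sure the data actually got there" half of
that choice, in Python for illustration.  This is not a true non-cached
write (that would be CreateFile with FILE_FLAG_NO_BUFFERING or
FILE_FLAG_WRITE_THROUGH); it is the nearest portable equivalent, which
is to flush and sync before treating the write as done.  The function
name is made up for this example:

```python
import os


def write_and_flush(path, data):
    """Write data and force it out of the OS cache before reporting success."""
    with open(path, "wb") as f:
        f.write(data)
        f.flush()             # drain Python's own buffer
        os.fsync(f.fileno())  # ask the OS to push cached data to the file system
    return len(data)
```

If the file system has no room, the OSError is raised while the caller
still holds the data, instead of during a later lazy write.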

> (Note: In my history as a programmer, we would just open the file for 'write' 
> and keep writing until we run out of space with a 'out of space' error occurs 
> from the 'C' calls.  I think the error was ENOSPC, 'No space left on device'. 
> )

I have just explained the problem with this approach.

> I understand that -if- the AFS client were to return an arbitrarily large 
> size number for the 'disk free' on them mapped drive, and the application 
> writes into a sub-folder, then that sub-folder may be an AFS volume that 
> doesn't have that 'disk space' available.  In that case the application would 
> lose data when going over quota for the volume.  So some decision was made to 
> return 'zero bytes available' on all drives mapped to the AFS root.

The AFS redirector reports the actual amount of free space for each
volume when asked.  If the AFS redirector is never asked, it never
tells.  Readonly volumes have 0 free space and therefore the reported
free space is 0 bytes.

No one is forcing applications to query the amount of free space on the
volume.  If an application is doing so, it's because the author felt
that it was necessary to protect the user.
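
An application that does want to ask can ask about the directory it is
going to write into rather than the root of the mapped drive.  A minimal
sketch in Python, for illustration only; the helper names are
hypothetical, and whether the answer reflects the volume at that exact
path depends on the file system client answering the query:

```python
import shutil


def free_bytes_at(target_dir):
    """Free space reported for the volume that holds target_dir."""
    return shutil.disk_usage(target_dir).free


def has_room(target_dir, needed_bytes):
    """True when the reported free space covers the planned write."""
    return free_bytes_at(target_dir) >= needed_bytes
```

Calling has_room() on the destination directory, not on R:\, avoids
asking about a readonly volume that will never be written to.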

Lying about the size of the partition, the size of the volume and the
amount of free space accessible to the user puts the application at
risk.  When the file system lies, the application developer who does
the right thing is punished in place of the application developer who
does the wrong thing.

> You get around the issue by mapping a drive directly to the AFS volume in 
> which you need to write to.

You could also contact the application author and file a bug report to
get the application fixed.

> If you don't care about losing data, one solution is to 'symlink' the cell 
> into the C: drive, as follows...
> 
>      C:\>mklink /d c:\afs \\afs\root.afs
> 
>      A DIR of c:\ will then return what is available on your C: drive.

Which is great until the free space on the C: drive drops to nothing
because of shadow volume snapshots, pagefile increases, or other
reasons.  Current free space on my C: disk:

         618,112 bytes free
              99.9 % in use

It was 5GB free about 12 hours ago.  During disk backups the volume
checkpoint and NTFS journal use up all of the free space.

All you are doing is trading one problem for another.

> We ran the AFS client for years on Windows XP and it always returns 2 TB as 
> free on a mapped drive.

This statement is incorrect.  When the AFS client acts as an SMB
server, it is not possible for the individual AFS volumes to be
reported as independent devices.  Therefore the size of the one volume
reported to the SMB client is that of the entire AFS name space.  That
is effectively infinite, but such a volume cannot be reported.  So the
volume size was always reported as 2TB and the free space as 1TB.

Then you would run Microsoft Office and find that sometimes your
PowerPoint, Word or Excel documents would get corrupted.  The
application believed the data was written to disk, and it was only
during the cache flush that occurs when the last handle is closed that
the data would be pushed to the file server, where there would be no
space.

There was also the problem that applications couldn't write a file
larger than 1TB, and some applications would get very confused when the
sum of all files in a directory tree was larger than the partition size.

Finally, because there was no mechanism for exposing volume boundaries,
when a user tried to delete a tree in the Explorer Shell it would walk
the entire tree to the leaves and start deleting everything.  What it
should have done is delete the mount point if the user had permission
and do nothing outside the current volume if the user didn't have
permission.
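
A sketch of a tree walk that respects such boundaries, in Python for
illustration.  It only builds a plan and deletes nothing; the function
names are hypothetical, and treating every reparse point or symlink as
a boundary is a simplifying assumption:

```python
import os

FILE_ATTRIBUTE_REPARSE_POINT = 0x0400  # documented Win32 value


def is_boundary(entry):
    """True for entries a tree delete should remove as a link, not descend into."""
    if entry.is_symlink():
        return True
    # st_file_attributes only exists on Windows; default to 0 elsewhere.
    attrs = getattr(entry.stat(follow_symlinks=False), "st_file_attributes", 0)
    return bool(attrs & FILE_ATTRIBUTE_REPARSE_POINT)


def plan_delete(root):
    """Return (paths inside this tree, boundary entries), children first."""
    inside, boundaries = [], []
    with os.scandir(root) as entries:
        for entry in entries:
            if is_boundary(entry):
                boundaries.append(entry.path)  # never walk past this point
            elif entry.is_dir(follow_symlinks=False):
                sub_inside, sub_boundaries = plan_delete(entry.path)
                inside.extend(sub_inside)
                boundaries.extend(sub_boundaries)
                inside.append(entry.path)      # directory after its contents
            else:
                inside.append(entry.path)
    return inside, boundaries
```

Anything reachable only through an entry in the second list is left
alone; at most the boundary entry itself is removed.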

> This is a VERY annoying problem.

Yes it is, and the correct way to get it fixed is to file bug reports
with the application vendors.  Asking the AFS redirector to break the
rules only makes matters worse.

> I wish AFS would prevent you from writing more into a volume than the quota 
> size that you were allocated.  In that case returning ENOSPC, 'No space left 
> on device' should be applicable.

The AFS file server does prevent you from writing more into a volume
than the quota size, and the AFS redirector queries the file server
after each 1MB of data is written to a given file to update its
estimates.  But the AFS redirector cannot return an error on a
WriteFile() request it has not yet been asked to perform.

Whether or not the AFS redirector returns out of space errors is
irrelevant.  The applications that have issues are having issues because
they are attempting to work around issues with other file systems by
querying the free space or the volume attributes and are doing so
incorrectly.

The AFS redirector returns two errors:  Out of space when the partition
is full and out of quota when the volume quota is reached.
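
An application that wants to tell the two apart could do something like
the following sketch (Python, for illustration).  The Win32 codes are
the documented values of ERROR_DISK_FULL and ERROR_DISK_QUOTA_EXCEEDED;
that these are the exact codes an application sees from the AFS
redirector is an assumption, and the function name is made up:

```python
import errno

ERROR_DISK_FULL = 112             # documented Win32 error code
ERROR_DISK_QUOTA_EXCEEDED = 1295  # documented Win32 error code
EDQUOT = getattr(errno, "EDQUOT", None)  # not defined on every platform


def classify_write_error(exc):
    """Map an OSError from a failed write to a coarse reason."""
    winerror = getattr(exc, "winerror", None)  # only set on Windows
    if exc.errno == errno.ENOSPC or winerror == ERROR_DISK_FULL:
        return "partition full"
    if (EDQUOT is not None and exc.errno == EDQUOT) or \
            winerror == ERROR_DISK_QUOTA_EXCEEDED:
        return "over quota"
    return "other"
```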

Jeffrey Altman


