RE: [OpenAFS] not enough space in target directory
-----Original Message-----
From: openafs-info-ad...@openafs.org [mailto:openafs-info-ad...@openafs.org] On Behalf Of Jeffrey Altman
Sent: Sunday, October 27, 2013 3:42 AM
To: openafs-info@openafs.org
Subject: Re: [OpenAFS] not enough space in target directory

> On 10/22/2013 12:18 PM, Dyer, Rodney wrote:
>> (Note: In my history as a programmer, we would just open the file for 'write' and keep writing until we ran out of space and an 'out of space' error came back from the 'C' calls. I think the error was ENOSPC, 'No space left on device'.)
>
> I have just explained the problem with this approach.

We have a problem then, as there is no way that I know of to check what space is left on the device first, then begin writing with a lock on the free space you want. Between the time you check the free space and the time you begin writing, you may have lost the space to another user. To solve this problem you would need to try and allocate all the space needed first, then write into it.

>> If you don't care about losing data, one solution is to 'symlink' the cell into the C: drive, as follows...
>>
>> C:\mklink /d c:\afs \\afs\root.afs
>>
>> A DIR of c:\ will then return what is available on your C: drive.
>
> Which is great until the free space on the C: drive drops to nothing because of shadow volume snapshots, pagefile increases, or other reasons.
>
> Current free space on my C: disk:
>
> 618,112 bytes free
> 99.9 % in use
>
> It was 5GB free about 12 hours ago. During disk backups the volume checkpoint and NTFS journal use up all of the free space. All you are doing is trading one problem for another.

That was not really the point. This was simply an alternative for someone who deliberately wants to live life on the edge (with possible failure). You could always symlink AFS into a spare drive or partition that you never write files to.

>> This is a VERY annoying problem.
>
> Yes it is, and the correct way to get it fixed is to file bug reports with the application vendors. Asking the AFS redirector to break the rules only makes matters worse.

So you must be saying this problem is also prevalent with other file systems, such as those that present themselves as CIFS like NetApp, and even Microsoft's own DFS?

>> I wish AFS would prevent you from writing more into a volume than the quota size that you were allocated. In that case returning ENOSPC, 'No space left on device' should be applicable.
>
> The AFS file server does prevent you from writing more into a volume than the quota size, and the AFS redirector queries the file server after each 1MB of data is written on a given file to update its estimates. But the AFS redirector cannot return an error on a WriteFile() request it has not yet been asked to perform.
>
> Whether or not the AFS redirector returns out of space errors is irrelevant. The applications that have issues are having issues because they are attempting to work around issues with other file systems by querying the free space or the volume attributes, and are doing so incorrectly.

That's certainly a huge list of vendors, and programmers everywhere. I don't know any graduates who get trained in such obscure file system level information.

Rodney
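For readers following the "just write until it fails" side of the argument, here is a minimal sketch in Python of what handling that failure might look like. Nothing here comes from the OpenAFS client: `copy_until_full` and `QuotaLimitedSink` are hypothetical names, and the sink is only a stand-in for a file on a volume that has run out of room. It assumes the operating system actually surfaces the failure to the writer as ENOSPC (or EDQUOT), which, as explained above for cached writes to network file systems, is not guaranteed.

```python
import errno
import io


class QuotaLimitedSink:
    """Hypothetical stand-in for a file on a volume with `capacity` bytes left."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.stored = 0

    def write(self, block):
        # Refuse the write once the pretend volume is full.
        if self.stored + len(block) > self.capacity:
            raise OSError(errno.ENOSPC, "No space left on device")
        self.stored += len(block)
        return len(block)

    def flush(self):
        pass


def copy_until_full(src, dst, chunk=1024 * 1024):
    """Copy src to dst in chunks; return (bytes_written, error_name_or_None)."""
    # EDQUOT is looked up defensively in case a platform does not define it.
    full_errors = {errno.ENOSPC, getattr(errno, "EDQUOT", errno.ENOSPC)}
    written = 0
    try:
        while True:
            block = src.read(chunk)
            if not block:
                break
            dst.write(block)
            written += len(block)
        dst.flush()
    except OSError as exc:
        if exc.errno in full_errors:
            # Report how far the copy got instead of losing that information.
            return written, errno.errorcode.get(exc.errno, "ENOSPC")
        raise
    return written, None
```

Calling `copy_until_full(io.BytesIO(b"x" * 10), QuotaLimitedSink(4), chunk=4)` stops after the first chunk and reports the error name. The caller still has to decide what to do with the partial file.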
Re: [OpenAFS] not enough space in target directory
On 10/29/2013 5:19 PM, Dyer, Rodney wrote:
> We have a problem then, as there is no way that I know of to check what space is left on the device first, then begin writing with a lock on the free space you want. Between the time you check the free space and the time you begin writing, you may have lost the space to another user.

This is why the AFS file server allows a user to go over quota by a small amount before it begins failing writes.

> To solve this problem you would need to try and allocate all the space needed first, then write into it.

This is a very common pattern for exactly this reason, especially for file copies. The target file is opened, the file size is allocated, and then the data is written into the target location. The Win32 API calls are:

CreateFile to create the file
SetFilePointerEx to advance the file pointer to the size to allocate
SetEndOfFile to commit that size to disk
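The open, allocate, then write sequence described above can be sketched in Python as follows. This is an illustration, not the Win32 code itself: `copy_with_preallocation` is a hypothetical helper, and it uses `truncate()` to set the final size before writing. Whether setting the size actually reserves space depends on the platform and file system; on many POSIX file systems it only produces a sparse file, so treat the early failure as an assumption about the target volume.

```python
import os
import tempfile


def copy_with_preallocation(data, path):
    """Set the target file to its final size first, then write the data into it."""
    try:
        with open(path, "wb") as f:
            f.truncate(len(data))  # claim the final size up front; may fail early
            f.seek(0)              # writing starts at the beginning of the file
            f.write(data)
    except OSError:
        # Do not leave a partial target behind if the size claim or write failed.
        if os.path.exists(path):
            os.remove(path)
        raise
    return os.path.getsize(path)
```

The point of the ordering is that a refusal shows up before any data has been written, rather than halfway through the copy.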
Re: [OpenAFS] not enough space in target directory
On 10/22/2013 12:18 PM, Dyer, Rodney wrote:
> This problem is affecting us as well. I'm not entirely convinced this problem couldn't be fixed with some AFS client trickery.

As you so often like to tell me, the application should attempt the operation on the file system and let it fail if it is going to fail. The problem with the Explorer Shell is that it tries to be smarter than the file system. The Explorer Shell maintains a cache of all path components, including which volume each path is in, what the volume attributes are, and how much free space is present there. When that cache becomes corrupted, the Explorer Shell makes incorrect decisions about which volume a path refers to and how much free space there is. To the best of my knowledge there is no way to reset the Explorer cache other than killing the explorer process. When the Explorer Shell guesses that there is no free space at a particular path, it will never even try the operation. It assumes it is smarter.

> Our issue is that we have a drive mapped to the cell root. Since our drive is mapped to the root, the application 'thinks' the entire file system on that 'drive' is one 'volume'.

I'm not sure how you have come to this conclusion. Let's say that your cell layout looks like:

\\afs\college.edu\user\username\windows\profile.V2\

where

college.edu   mp -> root.cell.readonly
user          mp -> root.user.readonly
username      mp -> user.username
windows       = directory
profile.V2    = directory

and you have mapped R: to \\afs\college.edu

If the application thinks that the entire tree under R: is one volume, that is not the fault of AFS. AFS tells the application this is not the case by setting the Reparse Point File Attribute on each mount point and by setting the Surrogate Bit in the Reparse Point Tag value. The RP Surrogate bit tells the application that the reparse point is a placeholder for some other object which might not be in the same volume. The application doesn't need to know anything about AFS.
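A minimal sketch of that check, written in Python purely as bit arithmetic. The constant values are the standard Win32 ones (the reparse point file attribute, and the bit tested by the `IsReparseTagNameSurrogate` macro); the function name `may_cross_volume` is hypothetical, and the value used for the AFS mount point's own tag is deliberately not shown because the thread does not give it.

```python
FILE_ATTRIBUTE_REPARSE_POINT = 0x0400  # Win32 file attribute flag
NAME_SURROGATE_BIT = 0x20000000        # bit tested by IsReparseTagNameSurrogate

# Two well-known Microsoft tags that carry the surrogate bit.
IO_REPARSE_TAG_MOUNT_POINT = 0xA0000003
IO_REPARSE_TAG_SYMLINK = 0xA000000C


def may_cross_volume(file_attributes, reparse_tag):
    """True if a directory entry is a reparse point standing in for another object.

    Such an entry may live in a different volume, so volume attributes and
    free space must be queried again on the target rather than inherited.
    """
    if not file_attributes & FILE_ATTRIBUTE_REPARSE_POINT:
        return False
    return bool(reparse_tag & NAME_SURROGATE_BIT)
```

On Windows, Python 3.8 and later expose the two inputs as `st_file_attributes` and `st_reparse_tag` on the result of `os.lstat()`, so a directory walk could apply this test to each entry before assuming it is still in the same volume.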
It just needs to follow the rules for parsing the output of Win32 Directory Queries and File Information Queries. If you have an application that is incapable of processing reparse points, you should raise that issue with the application vendor, because that application is going to break on every Windows file system that supports reparse points.

> Some of our applications try to be 'smart' and check the 'disk free' before they perform their writes.

The applications do this because the Windows Cache (not the AFS cache) will permit an application to write as much data as it wants into the cache until the file system tells the Windows Cache that there is no more room. Unfortunately, some network file system clients have a very poor idea of the free space on the server (think SMB/CIFS). As a result, the application can write hundreds of MBs or GBs of data into the Windows cache before the Windows cache begins to lazy write the data to the network. If it turns out there is no room, the application is told a network error occurred and the data is lost. In XP a balloon would appear with an error message indicating that data was lost. In Vista and above the error is written to the Windows Application Event Log, but the end user isn't even told.

So applications have a very good reason to check the free space, or to use non-cached writes, if they actually care about their data getting to the file server when the disk is remote. After all, aren't all remote disks SMB/CIFS?

> (Note: In my history as a programmer, we would just open the file for 'write' and keep writing until we ran out of space and an 'out of space' error came back from the 'C' calls. I think the error was ENOSPC, 'No space left on device'.)

I have just explained the problem with this approach.
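One way an application can avoid learning about a failed lazy write too late is to force its data out of the cache before it reports success. The sketch below uses Python's `flush()` plus `os.fsync()`; that is a different technique from the non-cached writes mentioned above (on Win32 those are requested with flags such as FILE_FLAG_NO_BUFFERING or FILE_FLAG_WRITE_THROUGH when the file is opened), and `write_durably` is a hypothetical helper name. How a particular network file system client reports a late failure through this path is an assumption that would need testing.

```python
import os
import tempfile


def write_durably(path, data):
    """Write data and push it past the OS cache before claiming success."""
    with open(path, "wb") as f:
        f.write(data)
        f.flush()             # hand Python's own buffer to the operating system
        os.fsync(f.fileno())  # ask the OS to commit its cached data for this file
    # Only reached if write, flush, and fsync all completed without an OSError.
    return len(data)
```

If the commit fails, the `OSError` is raised while the application still holds the data, so it can retry elsewhere or tell the user.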
> I understand that -if- the AFS client were to return an arbitrarily large number for the 'disk free' on the mapped drive, and the application writes into a sub-folder, then that sub-folder may be an AFS volume that doesn't have that 'disk space' available. In that case the application would lose data when going over quota for the volume. So some decision was made to return 'zero bytes available' on all drives mapped to the AFS root.

The AFS redirector reports the actual amount of free space for each volume when asked. If the AFS redirector is never asked, it never tells. Readonly volumes have 0 free space and therefore the reported free space is 0 bytes.

No one is forcing applications to query the amount of free space on the volume. If an application is doing so, it's because the author felt that it was necessary to protect the user. Lying about the size of the partition, the size of the volume, and the amount of free space accessible to the user puts the application at risk. By lying, the application developer that does the right thing is punished in place of the application developer that does the wrong thing.

> You get around the issue by
RE: [OpenAFS] not enough space in target directory
This problem is affecting us as well. I'm not entirely convinced this problem couldn't be fixed with some AFS client trickery.

Our issue is that we have a drive mapped to the cell root. Since our drive is mapped to the root, the application 'thinks' the entire file system on that 'drive' is one 'volume'. Some of our applications try to be 'smart' and check the 'disk free' before they perform their writes.

(Note: In my history as a programmer, we would just open the file for 'write' and keep writing until we ran out of space and an 'out of space' error came back from the 'C' calls. I think the error was ENOSPC, 'No space left on device'.)

I understand that -if- the AFS client were to return an arbitrarily large number for the 'disk free' on the mapped drive, and the application writes into a sub-folder, then that sub-folder may be an AFS volume that doesn't have that 'disk space' available. In that case the application would lose data when going over quota for the volume. So some decision was made to return 'zero bytes available' on all drives mapped to the AFS root.

You get around the issue by mapping a drive directly to the AFS volume to which you need to write.

If you don't care about losing data, one solution is to 'symlink' the cell into the C: drive, as follows...

C:\mklink /d c:\afs \\afs\root.afs

A DIR of c:\ will then return what is available on your C: drive.

We ran the AFS client for years on Windows XP and it always returned 2 TB as free on a mapped drive.

This is a VERY annoying problem. I wish AFS would prevent you from writing more into a volume than the quota size that you were allocated. In that case returning ENOSPC, 'No space left on device' should be applicable.

Rodney

Rodney M. Dyer
Operations and Systems (Specialist)
Mosaic Computing Group
William States Lee College of Engineering
University of North Carolina at Charlotte
Email: rmd...@uncc.edu
Phone: (704)687-1942
Help Desk Line: (704)687-5080
FAX: (704)687-2352
Office: Cameron Hall, Room 232

-----Original Message-----
From: openafs-info-ad...@openafs.org [mailto:openafs-info-ad...@openafs.org] On Behalf Of Jeffrey Altman
Sent: Saturday, October 19, 2013 10:27 PM
To: Ian Crowther; openafs-info@openafs.org
Subject: Re: [OpenAFS] not enough space in target directory

On 10/15/2013 7:07 PM, Ian Crowther wrote:
> One thing I noticed in my case was that process monitor reported that explorer checked free space on the root volume instead of the volume that was being written to.

That is the underlying problem. Instead of using GetVolumeInformationByHandleW() called on the destination directory, the Explorer Shell is using GetVolumeInformation(), which must be provided the root directory of the volume. When the Explorer Shell gets the root directory wrong and chooses, say, the root of a drive letter mapping, it gets the wrong volume attributes and free space.

> Would somehow ensuring that the AFS server's root volume reports sufficient free space be an adequate workaround? I have no idea, but if I were still affected I'd try it...

Readonly volumes have 0 bytes free. That is why this is a problem for AFS and not Windows File Shares. Windows doesn't support the concept of booting from a readonly volume and then accessing writable areas from a read/write volume. If it did, the Explorer Shell would be immune to this issue.

The problem doesn't occur all of the time, and Microsoft doesn't have internal AFS to test against, so it is a challenge for them.

Jeffrey Altman
Re: [OpenAFS] not enough space in target directory
Hi Jeff,

Thanks for the further clarification of this problem. Your description of the problem below, and on the MS tech forum back in March, seems to imply that until this issue is fixed by MS there are potential work-arounds that could be taken by the OpenAFS client. Possibly undesirable from a developer's perspective, but effective for end-users... thoughts?

Cheers,
Stephen

___
OpenAFS-info mailing list
OpenAFS-info@openafs.org
https://lists.openafs.org/mailman/listinfo/openafs-info
Re: [OpenAFS] not enough space in target directory
On 10/15/2013 12:58 PM, Jack Hill wrote:
> Also, since it appears that we won't be able to get it fixed,

I never said it won't be fixed. I said that, as with anything else, fixing it will be prioritized based upon the number of customers that are impacted and that believe it is important to them or their organization. One way of measuring importance is whether or not you or your organization is willing to spend money to get it fixed.

I apply the same measurement to determining when and whether to fix a bug in the Windows client. How many users are impacted? Will it result in data loss? How much time will it take to diagnose the cause? Is there someone willing to pay for that time? How disruptive will the eventual fix be?

If this community wants Microsoft to prioritize the interaction with the OpenAFS client on Windows, then it is going to have to communicate that to Microsoft's development organizations. The most common way for that to occur is by filing bug reports when preview releases are shipped or by paying for bug reports after the final release is out.

Jeffrey Altman
Re: [OpenAFS] not enough space in target directory
On 10/15/2013 7:07 PM, Ian Crowther wrote:
> One thing I noticed in my case was that process monitor reported that explorer checked free space on the root volume instead of the volume that was being written to.

That is the underlying problem. Instead of using GetVolumeInformationByHandleW() called on the destination directory, the Explorer Shell is using GetVolumeInformation(), which must be provided the root directory of the volume. When the Explorer Shell gets the root directory wrong and chooses, say, the root of a drive letter mapping, it gets the wrong volume attributes and free space.

> Would somehow ensuring that the AFS server's root volume reports sufficient free space be an adequate workaround? I have no idea, but if I were still affected I'd try it...

Readonly volumes have 0 bytes free. That is why this is a problem for AFS and not Windows File Shares. Windows doesn't support the concept of booting from a readonly volume and then accessing writable areas from a read/write volume. If it did, the Explorer Shell would be immune to this issue.

The problem doesn't occur all of the time, and Microsoft doesn't have internal AFS to test against, so it is a challenge for them.

Jeffrey Altman
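The distinction drawn above, asking about the destination directory rather than a guessed volume root, can be illustrated with a short Python sketch. This is not the Win32 call sequence from the message: `free_bytes_at` is a hypothetical helper built on `shutil.disk_usage()`, which accepts a directory path and leaves it to the operating system and file system client to work out which volume that directory lives in. Whether the answer differs from the drive root's depends on that client.

```python
import os
import shutil
import tempfile


def free_bytes_at(target):
    """Report free space for the place a file will actually be written.

    Walks up from `target` to the nearest existing directory and queries
    that directory, instead of assuming the drive root describes it.
    """
    probe = os.path.abspath(target)
    while not os.path.isdir(probe):
        parent = os.path.dirname(probe)
        if parent == probe:  # reached the top without finding a directory
            break
        probe = parent
    return shutil.disk_usage(probe).free
```

For a target such as a not-yet-created file a few directories deep, the query lands on the deepest directory that exists, which is the one whose volume will receive the data. The result is still only an estimate; as discussed elsewhere in this thread, the space can be gone by the time the write starts.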
Re: [OpenAFS] not enough space in target directory
Jeffrey,

thanks for the hint. I had been blaming this on myself, suspecting something was not correctly configured.

Now I have a really dumb question: is this one of the things you can only do with a support contract? Or via connect.microsoft.com? Is there any additional information I should submit? We are a University in Germany and run Windows 7 Enterprise...

Best,
Christian

On 05.10.2013 02:18, Jeffrey Altman wrote:
> File a bug report with Microsoft if the problem is experienced when using the explorer shell or applications relying upon the shell api for file access. This is a known bug in the explorer shell and Microsoft has been working on it for more than six months. As with all Windows bugs, a fix is prioritized based upon the number of complaints received from paying support customers.
>
> Jeffrey Altman
Re: [OpenAFS] not enough space in target directory
Christian,

Feel free to make noise wherever you wish, but the reality is that when Microsoft has the choice to make between developers spending time on the Shell and addressing bugs with its own tools (SkyDrive, ReFS, etc.) or those of third-party products, Microsoft is going to focus on its own stuff unless an entity (or an affected community) is paying them sufficient money to make it worthwhile.

In the end it requires multiple paid support contract reports to raise the profile of the bug enough that it will be fixed.

Jeffrey Altman

On 10/14/2013 9:33 AM, Christian wrote:
> Jeffrey,
>
> thanks for the hint. I had been blaming this on myself, suspecting something was not correctly configured.
>
> Now I have a really dumb question: is this one of the things you can only do with a support contract? Or via connect.microsoft.com? Is there any additional information I should submit? We are a University in Germany and run Windows 7 Enterprise...
>
> Best,
> Christian
Re: [OpenAFS] not enough space in target directory
Jeffrey,

Hm. Sorry if I wasn't clear. I am not sure if we have a support contract or not. I am just a part-time sysadmin at a University institute. My main job is running a research group in physics. I haven't been able to find out from the central IT people at the University level whether we have a support contract for Windows :-( so we probably don't...

Is there a way to find out whether one has a support contract?

Christian

On 14.10.2013 15:59, Jeffrey Altman wrote:
> Christian,
>
> Feel free to make noise wherever you wish, but the reality is that when Microsoft has the choice to make between developers spending time on the Shell and addressing bugs with its own tools (SkyDrive, ReFS, etc.) or those of third-party products, Microsoft is going to focus on its own stuff unless an entity (or an affected community) is paying them sufficient money to make it worthwhile.
>
> In the end it requires multiple paid support contract reports to raise the profile of the bug enough that it will be fixed.
>
> Jeffrey Altman
Re: [OpenAFS] not enough space in target directory
File a bug report with Microsoft if the problem is experienced when using the explorer shell or applications relying upon the shell api for file access. This is a known bug in the explorer shell and Microsoft has been working on it for more than six months. As with all Windows bugs, a fix is prioritized based upon the number of complaints received from paying support customers.

Jeffrey Altman

On 10/4/2013 6:36 PM, Christian wrote:
> All,
>
> we are seeing some weird issues with the Windows client (1.7.26, but we had also seen this with previous 1.7 versions). Often, when attempting to write data, my users get a popup box complaining about insufficient space in the target directory. In those cases, writing the data to the RW path (.cell.name) instead works just fine.
>
> Note that the volumes which are being accessed in those cases do NOT have RO replicas, just some of the volumes from which they are mounted. Write access just fails intermittently when accessed through a path which contains OTHER replicated volumes.
>
> So, for example, say that the volume users containing the mount points for the individual user volumes is replicated. Then write access to /afs/our.cell/users/joe.user will fail intermittently, while writing to /afs/.our.cell/users/joe.user always works.
>
> We use dynroot and SRV records. I have read the debugging instructions, but I am a little unsure about how we should proceed here. What should I do? Try fs trace?
>
> Thanks,
> Christian
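The workaround mentioned in the original report, going through the read/write path by putting a dot in front of the cell name, is mechanical enough to sketch. The helper below is hypothetical (`to_readwrite_path` is not an OpenAFS tool) and only rewrites the string; it assumes paths of the /afs/cell/... form used in the report and that the dotted cell entry is actually available in the client's root, which depends on the local configuration.

```python
def to_readwrite_path(path, afs_root="/afs"):
    """Rewrite /afs/<cell>/... to /afs/.<cell>/... and leave other paths alone."""
    prefix = afs_root.rstrip("/") + "/"
    if not path.startswith(prefix):
        return path  # not under the AFS root at all
    cell, sep, rest = path[len(prefix):].partition("/")
    if not cell or cell.startswith("."):
        return path  # nothing to rewrite, or already the read/write form
    return prefix + "." + cell + sep + rest
```

With the example from the report, `/afs/our.cell/users/joe.user` becomes `/afs/.our.cell/users/joe.user`. This only sidesteps the readonly volumes in the path; it does not fix the free space check in the Explorer Shell.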