On Thu, 4 Jul 2002, Terry Lambert wrote:

> Richard Sharpe wrote:
> > [1] Samba, because it has to support the Windows case insensitive file
> > system, must do some pretty ugly things :-) When a client asks that a file
> > be opened, for example, Samba tries with exactly the case that was
> > presented. If that fails, it must do a readdir scan of the directory so it
> > can do a case-insensitive match. So, even negative caching does not buy us
> > much in the case of Samba. What would help is a case insensitive
> > filesystem.
> 
> It is useful to be able to do "case sensitive on storage, case insensitive
> on lookup" on a per process basis.  The easiest is if you wire this in
> as a flag on the proc itself.  The normal way this is done is a flag to
> sfork, but... it should also be possible to have the proc open itself
> in procfs, and then ioctl() down a flag setting for this.

I have come to the conclusion that Terry is right.

Having watched a cygwin-based build of a package, the behavio[u]r is just 
too ugly. When an include file is looked for, it causes Samba to do 
readdir scans for every directory in the -I chain that the include file is 
not in until it is found. If we could eliminate all those readdir scans 
performance would improve dramatically.

Fundamentally, what I want to support is both UNIX clients (say, via NFS 
etc) and Windows clients to be able to share files in the same directory.

Samba already does case-preserving file name creation, and indeed, the 
problem does not go away even if Samba always case-folds all names to 
lower case, because a UNIX-user or -client might still create two files 
that differ only by the case of one or more characters in their names.

This means that Terry is right when he says I need an IOCTL. Basically, 
normal users get the normal case-sensitive file system, while Windows 
clients, via an IOCTL which says "give ME case-independent lookups", get a 
slightly different file system.

To support that, however, I need to change the name cache hash function to 
be case-insensitive (there's more--see below). This means that name cache 
hash chains could get longer. In the worst case, if a file system contains 
large numbers of files with long names that use the same characters and 
differ only in the case of individual characters, the hash chain becomes a 
linear search. However, UNIX file systems generally don't get like that. I 
imagine that the hash chains will grow to no more than twice their current 
size, but will probably grow by a factor close to one.
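
A minimal sketch of what a case-insensitive hash could look like. FNV-1a
is used here only as a stand-in and namehash_fold() is a made-up name; the
hash the name cache really uses may differ. The point is just that
folding case before hashing puts "Makefile" and "makefile" on the same
chain.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Illustrative only: FNV-1a over the lower-cased bytes of a name.
 * Names that differ only by case always produce the same hash value.
 */
uint32_t
namehash_fold(const char *name, size_t len)
{
	uint32_t h = 2166136261u;	/* FNV-1a 32-bit offset basis */
	size_t i;

	assert(name != NULL);
	for (i = 0; i < len; i++) {
		h ^= (uint32_t)tolower((unsigned char)name[i]);
		h *= 16777619u;		/* FNV-1a 32-bit prime */
	}
	return (h);
}
```

Since every process would share the one cache, the folded hash would have
to be used for all insertions and lookups, with the case-sensitive
comparison done on the chain entries themselves.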

Another problem is the extra complexity required in cache_lookup. When we 
want case-insensitive lookups, we have to do extra work, even if we find 
a match in the cache. The problem is with files that differ by only the 
case of one or more characters. When this occurs, my view is that we 
should return the file with the longest string of exactly matching 
characters. However, we might allow the sys admin to set policy, at the 
expense of complicating things.

When we search a hash chain, if we get an exact match, we are done, but if 
we don't get an exact match, we still have to do a readdir scan to find a 
better match, and to ensure that we return consistent results. Similarly,
when we do a readdir scan, if we get an exact match, we are done, but if 
we don't, we need to keep going.
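
The search rule can be sketched as follows. This is only one reading of
"longest string of exactly matching characters" (it counts the leading
run of identical bytes), and best_ci_match() and exact_prefix_len() are
hypothetical names, not existing kernel functions.

```c
#include <assert.h>
#include <stddef.h>
#include <strings.h>

/* Length of the leading run of bytes that match exactly, case included. */
static size_t
exact_prefix_len(const char *a, const char *b)
{
	size_t n = 0;

	while (a[n] != '\0' && a[n] == b[n])
		n++;
	return (n);
}

/*
 * Hypothetical policy sketch: pick the entry in names[0..count) that
 * matches `want`. An exact match ends the search at once; otherwise
 * every entry must be examined, and the case-insensitive candidate
 * sharing the longest exactly matching prefix with `want` is kept.
 * Returns an index into names[], or -1 if there is no candidate.
 */
int
best_ci_match(const char *const names[], int count, const char *want)
{
	size_t best_len = 0, len;
	int best = -1, i;

	assert(names != NULL && want != NULL);
	for (i = 0; i < count; i++) {
		if (strcasecmp(names[i], want) != 0)
			continue;		/* not a candidate at all */
		len = exact_prefix_len(names[i], want);
		if (want[len] == '\0')
			return (i);		/* exact match: done */
		if (best == -1 || len > best_len) {
			best = i;
			best_len = len;
		}
	}
	return (best);
}
```

Ties go to the first candidate seen, so a result that is stable across
calls would need a further rule that does not depend on directory order.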

Another aspect that needs consideration is the effect on negative caching. 
Getting a negative result on the exact name match is no good any longer, 
since there may be a case-insensitive match in the directory. This seems 
to make negative cache entries useless for case-insensitive matching.

Finally, I think that pursuing this subject some more is very important 
from the point of view of constructing high-performance CIFS servers, 
based on Samba or other software, so I would appreciate comments.

Regards
-----
Richard Sharpe, [EMAIL PROTECTED], [EMAIL PROTECTED], 
[EMAIL PROTECTED]

