Re: hardlink vs. symlink

Dave Ihnat Sat, 27 Jun 1998 01:32:08 -0400
Igmar Palsenberg asked:

 >How do hard links and sym links differ?

Several people--particularly James H. G. Redekop--answered with
explanations that were simply mistaken.  Now, allowing for the
evolution of Unix/Linux since I last taught internals, I stopped long
enough to look at the source before posting; and I waited until I'd
read Andreas Prohaska's post, which is in fact correct.

So why am I posting?  Simply to reinforce Andreas' explanation,
and to stress the difference.  I am not dissing James, or anyone else
who responded; but this is an important concept that should be
understood, and is essential to Unix-based operating systems.  The
following is something on the order of a tutorial; if you don't need
it, please just skip this.  OR, if you know about this, it's quite a
bit of fun to catch me out in errors originating from cosmic ray memory
damage, lack of sleep, etc...

In common operating systems at the time Unix was developed, the
dataset, or file, was usually monolithic--everything about the file was
contained in the directory (or sometimes "partitioned data set") entry,
including allocations, ownership, date/time stamps, permissions, etc.
Pretty much like a FAT directory entry today.  CP/M also forced the
data storage block allocations to be part of that entry, if I recall
correctly.  A "link", as we think of it today, was not a fundamental
concept, and not easy to implement if you could think of a reason to do
all that work.

In Unix, they broke the directory out from the structure that actually
holds all the information about a file.  A directory entry consisted
simply of an integer value pointing to an information node, or "inode"
(if I've got the actual origin of "inode" wrong through memory cruft--I
KNOW I'll be corrected!), and a text string giving the file the name
that humans need to identify it.  The inode
actually stored all the gritty information needed to maintain the
file--including all the items described above, such as owner,
permissions, etc.  An additional field was added--the reference count.
This tells the system how _many_ references there still are to this
inode and its attendant data.  (But not _where_ those references lie;
correction of any corruption of this linkage is part of the job of fsck.)
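
For anyone who wants to poke at this split on a live system, here is a
minimal sketch in Python, assuming a POSIX-style filesystem.  The helper
names describe_entries and describe_inode are made up for illustration;
the os.scandir and os.stat calls are standard library.  The first
function shows what the directory holds (a name and an inode number),
the second shows what lives in the inode, including the reference count.

```python
import os
import stat
import tempfile


def describe_entries(directory):
    """Return {name: inode_number} for every entry in `directory`.

    That pair is essentially all a directory entry records.
    """
    with os.scandir(directory) as entries:
        return {entry.name: entry.inode() for entry in entries}


def describe_inode(path):
    """Return the per-file details that are kept in the inode."""
    info = os.stat(path)
    return {
        "inode": info.st_ino,
        "links": info.st_nlink,  # the reference count
        "owner_uid": info.st_uid,
        "permissions": stat.filemode(info.st_mode),
        "size": info.st_size,
        "modified": info.st_mtime,
    }


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as workdir:
        path = os.path.join(workdir, "example.txt")
        with open(path, "w") as handle:
            handle.write("hello\n")
        print(describe_entries(workdir))
        print(describe_inode(path))
```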

Because of this segregation, it was easy to implement the concept of a
"link"--another name for the same dataset.
As usual, before this, few people saw much advantage in the feature.
Once it was available and easy to use, it became a strength of
Unix--for instance, a set of features with very much the same
underlying code but very different behavior could be implemented as a
linked set (as cp/ln/mv were on some systems).  Or you could provide access to a program
without opening its directory path.  Or so on...
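
The "same code, different behavior" trick usually works by having the
program look at the name it was invoked under.  A rough sketch of the
idea in Python follows; the names "shout" and "whisper" and the helper
personality are hypothetical, chosen only for illustration, and this is
not how any particular Unix utility is written.

```python
#!/usr/bin/env python3
import os
import sys


def personality(argv0):
    """Pick a behaviour from the name the program was invoked under."""
    name = os.path.basename(argv0)
    actions = {
        "shout": str.upper,    # hypothetical name: upper-case the text
        "whisper": str.lower,  # hypothetical name: lower-case the text
    }
    # Any other name falls back to passing the text through unchanged.
    return actions.get(name, lambda text: text)


if __name__ == "__main__":
    transform = personality(sys.argv[0])
    print(transform(" ".join(sys.argv[1:])))
```

Saved as an executable file named shout, with a second name created by
"ln shout whisper", the one file on disk would answer to both names and
behave differently under each.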

[Note that this explains why Andreas' explanation of a hard link is
right, and the others are wrong.  The file is *not* copied; there is
only *1* copy of the actual data, and in fact, of the inode itself.
The reference count is bumped, and an additional slot lives in some
directory.]
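
This is easy to check for yourself.  The sketch below, in Python and
assuming a filesystem that supports hard links, creates a file, adds a
second name with os.link, and reports what changed.  The function name
hard_link_demo and the file names are made up for the example.

```python
import os
import tempfile


def hard_link_demo(workdir):
    """Create a file, hard-link it, and report what the filesystem shows."""
    original = os.path.join(workdir, "original.txt")
    alias = os.path.join(workdir, "alias.txt")

    with open(original, "w") as handle:
        handle.write("one copy of the data\n")

    count_before = os.stat(original).st_nlink
    os.link(original, alias)  # new directory slot, same inode
    count_after = os.stat(original).st_nlink
    same_inode = os.stat(original).st_ino == os.stat(alias).st_ino

    # Appending through one name is visible through the other,
    # because both names lead to the same inode and data.
    with open(alias, "a") as handle:
        handle.write("written via the alias\n")
    with open(original) as handle:
        seen_via_original = handle.read()

    # Removing the first name only lowers the count; the data stays
    # reachable through the remaining name.
    os.remove(original)
    count_after_remove = os.stat(alias).st_nlink
    with open(alias) as handle:
        survived = handle.read() == seen_via_original

    return {
        "count_before": count_before,
        "count_after": count_after,
        "same_inode": same_inode,
        "seen_via_original": seen_via_original,
        "count_after_remove": count_after_remove,
        "survived": survived,
    }


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as workdir:
        for key, value in hard_link_demo(workdir).items():
            print(key, "=", repr(value))
```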

Kernighan, Ritchie, Thompson, et al. originally implemented the
concept of a hard link only.  They considered the equivalent of what
would eventually become a symlink too hard to maintain reliably, and
felt it would lend itself to abuse--in fact, the symlink, eventually
implemented at Berkeley, has come to be called in some circles the
"goto of the filesystem".

Berkeley's implementation was simply to create another special file, with its
primary data element a path to the referenced file.  And yes, K&R and
Dennis were right--it does get "stale", is harder to maintain, etc.

But damn, it's been useful, too.
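
The "stale" problem is simple to reproduce.  Here is a small sketch in
Python, assuming a POSIX system where creating symlinks is permitted;
symlink_demo and the file names are invented for the example.  It shows
that the symlink is a separate file whose content is a path, that it
does not touch the target's reference count, and that it dangles once
the target is removed.

```python
import os
import stat
import tempfile


def symlink_demo(workdir):
    """Create a symlink, inspect it, then remove its target."""
    target = os.path.join(workdir, "target.txt")
    link = os.path.join(workdir, "link.txt")

    with open(target, "w") as handle:
        handle.write("data\n")
    os.symlink(target, link)

    link_info = os.lstat(link)    # examines the link itself
    target_info = os.stat(target)

    report = {
        "is_symlink": stat.S_ISLNK(link_info.st_mode),
        "stored_path": os.readlink(link),  # the link's data is a path
        "own_inode": link_info.st_ino != target_info.st_ino,
        # Unlike a hard link, a symlink leaves the target's count alone.
        "target_link_count": target_info.st_nlink,
    }

    # Remove the target: the link is still there, but it no longer
    # resolves to anything.
    os.remove(target)
    report["link_still_present"] = os.path.lexists(link)
    report["link_resolves"] = os.path.exists(link)
    return report


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as workdir:
        for key, value in symlink_demo(workdir).items():
            print(key, "=", repr(value))
```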

Hope this bit of background helps put things in perspective...
-- 
        Dave "Getting ready for the old man's stump" Ihnat
        [EMAIL PROTECTED]       || [EMAIL PROTECTED]
        312/315.1075 [home office]      || 312/443.5860 [office]


-- 
  PLEASE read the Red Hat FAQ, Tips, Errata and the MAILING LIST ARCHIVES!
http://www.redhat.com/RedHat-FAQ /RedHat-Errata /RedHat-Tips /mailing-lists
         To unsubscribe: mail [EMAIL PROTECTED] with 
                       "unsubscribe" as the Subject.