On Friday 13 November 2009 15:47, McKown, John wrote:
> This goes back to the person who wanted some way to emulate DD concatenation
> of multiple datasets so that they are read as if they were one. Everybody
> agrees that there isn't an easy way. Now, I don't know filesystem
> internals. But what about a new type of symlink? Normally, a symlink
> contains the real name of the file. Sometimes a symlink will point to
> another symlink, and so on (I don't know how deep). What about a
> multi-symlink. That's where a symlink points to multiple files in a
> specific order. When the symlink is opened and read, each file in the
> symlink is opened and read in order. I know this would require some changes
> to open() as well, in order to make sure that each file in the symlink
> chain is readable by the process.
>
> What think? Or is this just alien to the UNIX mindset?

An interesting idea, and yes, it is weird and rather alien to UNIX minds.
You're implementing something at the filesystem level which is trivially 
implemented at the process level.  And all to avoid some IPC via pipes?  Has 
anyone calculated how much overhead there is in using cat to pipe some files 
into a process instead of having the process read the files itself?
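
To make that concrete: reading several files as if they were one is a few
lines of user-space code.  Here's a rough sketch, essentially a stripped-down
cat (the buffer size and error handling are just illustrative):

/* Read each named file in turn, copying it to stdout.  The files are
 * consumed in argument order, which is exactly the DD-style
 * concatenation being asked for. */
#include <stdio.h>
#include <stdlib.h>

int main(int argc, char **argv)
{
    char buf[8192];
    size_t n;

    for (int i = 1; i < argc; i++) {
        FILE *fp = fopen(argv[i], "rb");
        if (fp == NULL) {
            perror(argv[i]);    /* a missing file aborts the whole sequence */
            return EXIT_FAILURE;
        }
        while ((n = fread(buf, 1, sizeof buf, fp)) > 0)
            fwrite(buf, 1, n, stdout);
        fclose(fp);
    }
    return EXIT_SUCCESS;
}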

The more I think about this, the less it seems like a symlink.  I'm thinking 
of it as a meta-file: a file of files.  This introduces the idea of a new 
type of file whose contents are known to and interpreted by the system, in 
the way a directory-file's contents are known.  Does this really have any 
value?

Regardless of its value, in thinking of how to implement this, I see a few 
problems:

- What happens if one of the files is missing?
- How do you seek() in such a file?
- Similarly, how do you implement locks on byte ranges within such a file?
- What happens if another process appends to one of the files while you are 
reading a later one in the sequence?  Does your read position change?

You can solve those, perhaps, by requiring an open() of a meta-file to open 
all of the listed files.  If any file open fails, the meta-file open fails 
and closes all the others.  A meta-file's file descriptor would have to refer 
to a new kernel data structure that is a list of the open file descriptors of 
the listed files (or rather pointers to the data structures referenced by 
those file descriptors).  This structure would be used to map an offset 
within the meta-file to an offset within one of the list of files, using the 
files' lengths.  This solves the seek and lock problems.  I'm still not sure 
about the append problem, though.
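
To illustrate, here's a rough sketch of that structure and the offset
mapping, written as ordinary user-space C just to show the shape of the thing
(the names are mine, and none of this is real kernel code):

/* What a meta-file descriptor might point at: the component files and
 * their lengths, snapshotted at open() time. */
#include <stddef.h>
#include <sys/types.h>

struct metafile {
    size_t nfiles;    /* number of component files               */
    int   *fds;       /* their open file descriptors, in order   */
    off_t *lengths;   /* their lengths, recorded at open() time  */
};

/* Map an offset within the meta-file to a component file and an offset
 * within that component.  Returns -1 if the offset lies past the end of
 * the last component. */
static int metafile_map(const struct metafile *mf, off_t off,
                        size_t *idx, off_t *sub)
{
    for (size_t i = 0; i < mf->nfiles; i++) {
        if (off < mf->lengths[i]) {
            *idx = i;
            *sub = off;
            return 0;
        }
        off -= mf->lengths[i];
    }
    return -1;
}

Snapshotting the lengths at open() time is, incidentally, one possible answer
to the append question: read positions would never shift, but readers would
also never see data appended after the open.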

Another possible implementation would be entirely within the filesystem, where 
the meta-file would have direct access to the data-blocks of the underlying 
files.  I think that opens up too many cans-o-worms to be a good solution, 
though.

Of course, once you have this kind of file, you have meta-files of meta-files 
of meta-files of ...  Isn't it better to represent such structures in 
user-space instead of kernel-space?

> ln -s symlink realfile1 realfile2 /etc/fstab /tmp/somefile

This command-line syntax is already taken: ln uses it (the third form in the 
manpage synopsis, ln [OPTION]... TARGET... DIRECTORY) to create several 
symlinks in the directory given as the final argument.

It's an interesting idea, but I'm not convinced of its utility.  I'd like to 
know what percentage of the I/O time (or CPU cycles) is used by piping files 
via cat.  Anyone have any measurements?
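
Failing real numbers, a crude harness would do: drain the input, count the
bytes, and report the elapsed time.  Something like the sketch below (drain
is just a name I made up; on older glibc, compile with -lrt for
clock_gettime), run once each way:

    cat file1 file2 | ./drain
    ./drain file1 file2

/* Drain stdin, or the named files in order, and report how many bytes
 * were read and how long it took. */
#include <stdio.h>
#include <time.h>

static long long drain(FILE *fp)
{
    char buf[65536];
    long long total = 0;
    size_t n;

    while ((n = fread(buf, 1, sizeof buf, fp)) > 0)
        total += n;
    return total;
}

int main(int argc, char **argv)
{
    struct timespec t0, t1;
    long long total = 0;

    clock_gettime(CLOCK_MONOTONIC, &t0);
    if (argc < 2) {
        total = drain(stdin);              /* piped through cat */
    } else {
        for (int i = 1; i < argc; i++) {   /* reading directly  */
            FILE *fp = fopen(argv[i], "rb");
            if (fp == NULL) {
                perror(argv[i]);
                return 1;
            }
            total += drain(fp);
            fclose(fp);
        }
    }
    clock_gettime(CLOCK_MONOTONIC, &t1);

    double secs = (t1.tv_sec - t0.tv_sec)
                + (t1.tv_nsec - t0.tv_nsec) / 1e9;
    fprintf(stderr, "%lld bytes in %.3f s\n", total, secs);
    return 0;
}

Repeat the runs and mind the page cache: the first read of a file is not
comparable to the second.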
        - MacK.
-----
Edmund R. MacKenty
Software Architect
Rocket Software
275 Grove Street · Newton, MA 02466-2272 · USA
Tel: +1.617.614.4321
Email: m...@rs.com
Web: www.rocketsoftware.com  
