/dev/fd/0 Considered Harmful
Patrick ("No good deed goes unpunished") Powell
<papowell _AT_ astart _dot_ com>
Preface:
After writing the first draft of this article, I started looking at various
archives for discussion about /dev/fd/0 support in various operating systems.
In the foomatic-developer mailing lists there was a discussion
about the differences between using /dev/fd/0 and '-' as input
file specifications for GhostScript and other programs.
This issue strikes again. Read on and weep, gnash your teeth, or laugh
at me as you find suitable. And then look at the code that you have
been writing and weep, gnash your teeth, or scream in rage that you have
fallen into the trap of /dev/fd/0.
Introduction:
I have just spent the last couple of days trying to discover
why part of the LPRng printing system died after an upgrade to
some of the components. The cause was finally traced to the
use of /dev/fd/0 for a pathname instead of the classic '-' for
some utilities. To say that this was a surprise is, shall we
say, a tremendous understatement, and it led to exploring the
depths of the source code of various programs, including
GhostScript, a2ps, enscript, and others where a '-' argument
for a pathname means 'read from fd 0' (i.e., stdin).
What is the purpose of the /dev/fd/0 stuff?
Let's look at a typical example, say, the 'file' utility. It expects a
command line such as:
file /tmp/file
It will open /tmp/file, read its contents, and determine the file
type. Sometimes you would like to determine the file type of the
output of a pipe:
perl run_script | file
But file, bless its little heart, may REQUIRE a path. So, /dev/fd/0
to the rescue:
perl run_script | file /dev/fd/0
Through the magic of the Operating System,
fd = open("/dev/fd/0",...)
will have the same effect (broadly speaking) as
fd = dup(0)
and fd will be a 'duplicate' of fd 0.
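
A minimal sketch of that claim, assuming a system that provides /dev/fd
(Linux and FreeBSD both do). The helper name dev_fd0_matches_stdin is made
up for illustration. It opens /dev/fd/0 and uses fstat() to check that the
new descriptor and fd 0 refer to the same underlying file. It does not check
whether the two descriptors share a file offset, which may differ between
systems.

```c
#include <fcntl.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper, not part of any library.
 * Returns 1 if open("/dev/fd/0") yields a descriptor that refers to the
 * same file as fd 0, 0 if it refers to a different file, -1 on error. */
int dev_fd0_matches_stdin(void)
{
    struct stat orig_st, new_st;
    int fd, same;

    fd = open("/dev/fd/0", O_RDONLY);
    if (fd == -1) {
        return -1;
    }
    if (fstat(0, &orig_st) == -1 || fstat(fd, &new_st) == -1) {
        close(fd);
        return -1;
    }
    /* same device and same inode means the same underlying object */
    same = (orig_st.st_dev == new_st.st_dev &&
            orig_st.st_ino == new_st.st_ino);
    close(fd);
    return same ? 1 : 0;
}
```

Called while stdin is open, this should return 1. Called after close(0),
it returns -1, which is the failure explored in the rest of this article.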
The Dark Side of /dev/fd/0
On the surface, /dev/fd/0 appears to be harmless and a Good Idea.
But let's look at another convention. If no input files are
specified, then input will be taken from STDIN (fd 0). If there
is a path specified, then the file will be opened and
input read from it. By convention, '-' will ALSO stand for
reading from stdin.
/* Good and reasonable implementation but screws up when /dev/fd/0 passed */
if( path && strcmp(path,"-") ){
    close(0); /* covers the case where fd 0 is not open, note the
                 lack of error checking, which is deliberate here */
    if( (fd = open(path,....)) == -1 ){
        Die("cannot open '%s' - %s", path, strerror(errno) );
    } else if( fd ){
        Die("open '%s' returned FD %d", path, fd );
    }
}
Let's see what happens here. FD 0 is closed. The Operating
System takes the usual actions associated with closing the file
descriptor, probably updating the internal process structure.
Then we open 'path'. If 'path' is /dev/fd/0, we are now asking
to open a descriptor that is no longer open, so the open should
fail, and this implementation reports the error. But some
programs never notice the failure. Why?
/* Evil and poor implementation */
if( path && strcmp(path,"-") ){
    close(0);
    open(path,....);   /* return value is never checked */
}
...
if( read(0,...) < 0 ){
    /* treat as EOF */
}
As you see, the brutal assumption is made that the file open
will always succeed, or that if it fails, fd 0 simply stays closed
and the later read() returns -1, which is then treated as EOF. You would
suspect that nobody would write code like that, right? Ummm...
let's change the subject really fast. And stop using grep
to look at my code, you suspicious person, you.
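
To make the failure concrete, here is a small sketch of the careless
pattern as a function. The name read_after_blind_reopen is hypothetical and
exists only for this illustration. It assumes, as the experiments later in
this article show for Linux and FreeBSD, that opening /dev/fd/0 fails once
fd 0 has been closed.

```c
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper that mirrors the careless pattern: close fd 0,
 * open 'path' without checking the result, then read from fd 0.
 * Returns whatever read() returns. */
ssize_t read_after_blind_reopen(const char *path, char *buf, size_t len)
{
    close(0);
    (void)open(path, O_RDONLY);   /* result deliberately ignored */
    /* if the open failed, fd 0 is still closed and read() returns -1 */
    return read(0, buf, len);
}
```

With an ordinary path the open lands on fd 0 and the read works. With
/dev/fd/0 the open fails, read() returns -1, and a caller that treats -1 as
EOF quietly sees zero-length input even though data was waiting on stdin.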
Say you have a simple PostScript file:
%!PS-Adobe-3.0
/Courier findfont 200 scalefont setfont
72 300 moveto
(1) show showpage
% gs --help
AFPL Ghostscript 8.13 (2003-12-31)
% gs /tmp/one.ps
<you get page output>
% gs /dev/fd/0 </tmp/one.ps
AFPL Ghostscript 8.14 (2004-02-20)
Copyright (C) 2004 artofcode LLC, Benicia, CA. All rights reserved.
This software comes with NO WARRANTY: see the file PUBLIC for details.
Loading NimbusMonL-Regu font from
/usr/local/share/ghostscript/fonts/n022003l.pfb... 2267236 903936 1456484 168258 1
done.
>>showpage, press <return> to continue<<
<you get no page output>
% gs - </tmp/one.ps
<you get page output>
Umm... interesting. Very interesting. It appears that AFPL ghostscript
has a /dev/fd/0 related problem.
So let's try GNU Ghostscript:
# gs /tmp/one.ps
GNU Ghostscript 7.07 (2003-05-17)
Copyright (C) 2003 artofcode LLC, Benicia, CA. All rights reserved.
This software comes with NO WARRANTY: see the file PUBLIC for details.
Loading NimbusMonL-Regu font from
/usr/local/share/ghostscript/fonts/n022003l.pfb... 2079048 716830 1622424 335985 0
done.
>>showpage, press <return> to continue<<
<page displayed>
GS>quit
# gs - </tmp/one.ps
GNU Ghostscript 7.07 (2003-05-17)
Copyright (C) 2003 artofcode LLC, Benicia, CA. All rights reserved.
This software comes with NO WARRANTY: see the file PUBLIC for details.
Loading NimbusMonL-Regu font from
/usr/local/share/ghostscript/fonts/n022003l.pfb... 2079048 716823 1622424 333651 0
done.
<NO page displayed>
# gs /dev/fd/0 </tmp/one.ps
GNU Ghostscript 7.07 (2003-05-17)
Copyright (C) 2003 artofcode LLC, Benicia, CA. All rights reserved.
This software comes with NO WARRANTY: see the file PUBLIC for details.
Loading NimbusMonL-Regu font from
/usr/local/share/ghostscript/fonts/n022003l.pfb... 2079048 716830 1622424 335979 0
done.
>>showpage, press <return> to continue<<
<NO page displayed>
So it appears that this (and most likely previous) version(s) suffer from this
problem.
What is the right way to do this?
/* Pure in spirit and thinks only clean thoughts Method
 * This handles '-' and /dev/fd/0 but at the cost of having to do a bit
 * more work. Note that this handles /dev/stdin as well.
 */
if( path && strcmp(path,"-") ){
    /* fd 0 is still open here, so open() normally returns a nonzero fd */
    if( (fd = open(path,....)) == -1 ){
        Die("cannot open '%s' - %s", path, strerror(errno) );
    }
    if( fd != 0 ){
        /* dup2() closes the old fd 0 and makes it refer to the new file */
        if( dup2(fd,0) == -1 ){
            Die("dup2 failed - %s", strerror(errno) );
        }
        if( close(fd) == -1 ){
            Die("close failed - %s", strerror(errno) );
        }
    }
}
This last example is the robust way of dealing with this problem.
Note that you do the open() first, then you dup2() the new file
descriptor onto fd 0.
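
For anyone who wants something they can compile, here is one possible
self-contained version of the open-first approach. The function name
redirect_stdin and its return convention (0 on success, -1 with errno set)
are my own choices for this sketch, and it returns an error instead of
calling Die().

```c
#include <errno.h>
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical helper: make fd 0 read from 'path'.
 * NULL or "-" means "keep the current stdin".
 * Returns 0 on success, -1 on failure with errno set. */
int redirect_stdin(const char *path)
{
    int fd, saved;

    if (path == NULL || strcmp(path, "-") == 0) {
        return 0;               /* read from the existing fd 0 */
    }
    /* open first, while fd 0 is still intact, so that /dev/fd/0 and
     * /dev/stdin still refer to an open descriptor */
    fd = open(path, O_RDONLY);
    if (fd == -1) {
        return -1;
    }
    if (fd != 0) {
        /* dup2() closes the old fd 0 and points it at the new file */
        if (dup2(fd, 0) == -1) {
            saved = errno;
            close(fd);
            errno = saved;
            return -1;
        }
        if (close(fd) == -1) {
            return -1;
        }
    }
    return 0;
}
```

The only case where open() returns 0 here is when the program was started
with fd 0 already closed, and then there is nothing left to duplicate.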
Why did you close fd 0?
The problem really starts when fd 0 is closed. These problems could
be avoided if fd 0 is never closed or is closed only after reading it.
However, this now opens a whole slew of issues. What if you have
lpr /dev/fd/0 /dev/fd/0
i.e., you expect to open a path multiple times and get multiple copies
of the document printed?
The best answer to this is 'undefined behavior'.
Perhaps avoiding using '-' is the best approach.
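
A rough sketch of why the double /dev/fd/0 case is a trap. The helper
drain_path below is hypothetical and simply opens a path, reads it to EOF,
and reports how many bytes it saw. When stdin is a pipe, the first pass
consumes the data and a second pass over the same path finds nothing left.
When stdin is a regular file, what the second pass sees depends on how the
system implements /dev/fd, so I would not rely on either outcome.

```c
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: open 'path', read it to EOF, close it.
 * Returns the number of bytes read, or -1 on error. */
long drain_path(const char *path)
{
    char buf[4096];
    long total = 0;
    ssize_t n;
    int fd = open(path, O_RDONLY);

    if (fd == -1) {
        return -1;
    }
    while ((n = read(fd, buf, sizeof buf)) > 0) {
        total += n;
    }
    close(fd);
    return n == -1 ? -1 : total;
}
```

Calling drain_path("/dev/fd/0") twice in a row with a pipe on stdin is the
programmatic equivalent of the lpr command line above: only the first call
gets the document.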
Why didn't you RTFM?
Let's see what the man pages say:
FD(4) FreeBSD Kernel Interfaces Manual FD(4)
NAME
fd, stdin, stdout, stderr -- file descriptor files
DESCRIPTION
The files /dev/fd/0 through /dev/fd/# refer to file descriptors which can
be accessed through the file system. If the file descriptor is open and
the mode the file is being opened with is a subset of the mode of the
existing descriptor, the call:
fd = open("/dev/fd/0", mode);
and the call:
fd = fcntl(0, F_DUPFD, 0);
are equivalent.
Opening the files /dev/stdin, /dev/stdout and /dev/stderr is equivalent
to the following calls:
fd = fcntl(STDIN_FILENO, F_DUPFD, 0);
fd = fcntl(STDOUT_FILENO, F_DUPFD, 0);
fd = fcntl(STDERR_FILENO, F_DUPFD, 0);
Flags to the open(2) call other than O_RDONLY, O_WRONLY and O_RDWR are
ignored.
FILES
/dev/fd/#
/dev/stdin
/dev/stdout
/dev/stderr
SEE ALSO
tty(4)
FCNTL(2) FreeBSD System Calls Manual FCNTL(2)
NAME
fcntl -- file control
LIBRARY
Standard C Library (libc, -lc)
SYNOPSIS
#include <fcntl.h>
int
fcntl(int fd, int cmd, ...);
DESCRIPTION
The fcntl() system call provides for control over descriptors. The argu-
ment fd is a descriptor to be operated on by cmd as described below.
Depending on the value of cmd, fcntl() can take an additional third argu-
ment int arg.
F_DUPFD Return a new descriptor as follows:
o Lowest numbered available descriptor greater than or
equal to arg.
o Same object references as the original descriptor.
o New descriptor shares the same file offset if the
object was a file.
o Same access mode (read, write or read/write).
o Same file status flags (i.e., both file descriptors
share the same file status flags).
o The close-on-exec flag associated with the new file
descriptor is set to remain open across execve(2) sys-
tem calls.
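
The "shares the same file offset" item is the one that matters most for
this discussion, so here is a small sketch that checks it. The helper name
dup_shares_offset is made up for this example. It duplicates a descriptor
with F_DUPFD, moves the duplicate, and looks at whether the original moved
too. It only makes sense for seekable files.

```c
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: returns 1 if a descriptor created with F_DUPFD
 * shares its file offset with 'fd', 0 if not, -1 on error (for example
 * when 'fd' is not seekable). The offset of 'fd' is restored on exit. */
int dup_shares_offset(int fd)
{
    int copy, shared;
    off_t before, target;

    before = lseek(fd, 0, SEEK_CUR);      /* remember current position */
    if (before == (off_t)-1) {
        return -1;
    }
    copy = fcntl(fd, F_DUPFD, 0);         /* lowest free descriptor >= 0 */
    if (copy == -1) {
        return -1;
    }
    target = before + 1;
    if (lseek(copy, target, SEEK_SET) == (off_t)-1) {
        close(copy);
        return -1;
    }
    /* if the offset is shared, the original has moved as well */
    shared = (lseek(fd, 0, SEEK_CUR) == target);
    lseek(fd, before, SEEK_SET);          /* put things back */
    close(copy);
    return shared ? 1 : 0;
}
```

For a regular file this should return 1, which is exactly what the manual
page promises for F_DUPFD.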
So, it appears that when the open("/dev/fd/0") is done after fd 0 has been
closed, you should get -1 returned, and experimentally, you do:
% cat opentest.c
/*
  test open on /dev/fd/0
*/
#include <stdio.h>
#include <stdlib.h>   /* exit() */
#include <string.h>   /* strerror() */
#include <unistd.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <errno.h>
int main( int argc, char *argv[], char *envp[] )
{
    int fd;
    /* close stdin first, so /dev/fd/0 now names a closed descriptor */
    close(0);
    fd = open("/dev/fd/0",O_RDONLY);
    if( fd == -1 ){
        fprintf(stderr,"open /dev/fd/0 failed - %s\n", strerror(errno) );
        exit(1);
    }
    return(0);
}
Linux and FreeBSD:
% make opentest
% opentest </etc/hosts
open /dev/fd/0 failed - Bad file descriptor
So, this looks like the same behavior.
As for Linux, here is the relevant text from 'man proc' (RedHat 9 release):
fd This is a subdirectory containing one entry for each file
which the process has open, named by its file descriptor,
and which is a symbolic link to the actual file (as the
exe entry does). Thus, 0 is standard input, 1 standard
output, 2 standard error, etc.
Programs that will take a filename, but will not take the
standard input, and which write to a file, but will not
send their output to standard output, can be effectively
foiled this way, assuming that -i is the flag designating
an input file and -o is the flag designating an output
file:
foobar -i /proc/self/fd/0 -o /proc/self/fd/1 ...
>> and you have a working filter. Note that this will not
>> work for programs that seek on their files, as the files
>> in the fd directory are not seekable.
/proc/self/fd/N is approximately the same as /dev/fd/N in
some UNIX and UNIX-like systems. Most Linux MAKEDEV
scripts symbolically link /dev/fd to /proc/self/fd, in
fact.
OK, so seek should fail. Right? Says so in the documentation. So, unbeliever that
I am, I will try an experiment.
% cat seektest.c
/*
  test seek on /dev/fd/0
*/
#include <stdio.h>
#include <stdlib.h>   /* exit() */
#include <string.h>   /* strerror() */
#include <unistd.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <errno.h>
int main( int argc, char *argv[], char *envp[] )
{
    int fd;
    fd = open("/dev/fd/0",O_RDONLY);
    if( fd == -1 ){
        fprintf(stderr,"open /dev/fd/0 failed - %s\n", strerror(errno) );
        exit(1);
    }
    /* seeking works or fails depending on what stdin is attached to */
    if( lseek(fd,0,SEEK_SET) == (off_t)(-1) ){
        fprintf(stderr,"lseek /dev/fd/0 failed - %s\n", strerror(errno) );
        exit(1);
    }
    fprintf(stderr,"lseek /dev/fd/0 succeeded\n" );
    return(0);
}
% make seektest
% cat /etc/hosts |seektest
lseek /dev/fd/0 failed - Illegal seek
% seektest </etc/hosts
lseek /dev/fd/0 succeeded
We try this on FreeBSD and we get:
% make seektest
% cat /etc/hosts |seektest
lseek /dev/fd/0 failed - Illegal seek
% seektest </etc/hosts
lseek /dev/fd/0 succeeded
So at least the FreeBSD and Linux systems agree in this behavior.
Ummm... so much for believing the documentation.
Even when you RTFM the FM may not be correct.
Summary:
Many of the existing utilities that expect to have input on FD 0 (stdin)
and are passed /dev/fd/0 as a file parameter appear to fail or have 0 length
input. This appears to be caused by the application closing fd 0 and then
opening /dev/fd/0.
Avoid the use of /dev/fd/0 as a command line path unless you are sure that the
implementors of the software will open and close the file descriptors correctly.
Also, perhaps avoiding the use of '-' and /dev/fd/0 and sticking to the default
for reading from stdin is the best method.
Patrick Powell Astart Technologies
[EMAIL PROTECTED] 6741 Convoy Court
Network and System San Diego, CA 92111
Consulting 858-874-6543 FAX 858-751-2435
LPRng - Print Spooler (http://www.lprng.com)