Re: easy-to-fix bug in /dev/null driver
Okay, okay, I didn't really make my point persuasively enough. The file linux/drivers/char/mem.c contains this: === static ssize_t write_null(struct file * file, const char * buf, size_t count, loff_t *ppos) { return count; } === Now try this little program: --- #include #include #include main() { char buf[1]; int fd = open("/dev/null", O_RDWR); int i; for (i = 1; i <= 10; ++i) { int ret = write(fd, buf, 429496729 * i); if (ret < 0) { fprintf(stderr, "i = %d, errno = %d\n", i, errno); perror("write"); } } } --- Result is this: --- i = 6, errno = 0 write: Success i = 7, errno = 0 write: Success i = 8, errno = 0 write: Success i = 9, errno = 0 write: Success i = 10, errno = 6 write: Device not configured --- To me, it's pretty clear that an error has occurred here. The error "Device not configured" has _not_ occurred. So it is an error for the kernel to say that it has! The cause is obviously that fact that the people who worked out the input/output types for write(2) didn't allow for the possibility that someone might really be able to write 2 GBytes or more at a time. But this is now becoming credible in some people's computers (not mine). I.e. whenever anyone tries to write 2 GBytes or more to a device, they're going to get a negative return value and possibly a positive errno value - _if_ the device permits such a big chunk to be written at once, which /dev/null does. The device might be a 622 Mbit/sec ATM card or something. That's only about 25 seconds of transmission time. So it's not unrealistic. Questions: Should device drivers in general be written to truncate the user-space request down to 2 GByte - 1 (2^31 - 1) or less? Or should the device driver flag such excessive write-calls as erroneous? I still think that write_null() should be rewritten as: === static ssize_t write_null(struct file * file, const char * buf, size_t count, loff_t *ppos) { return (count <= 2147483647) ? count : 2147483647; } === This would fix the problem without introducing any new errors. (Unless someone change the definitions of ssize_t and size_t!!) Cheers, Alan Kennington. -------- name: Dr. Alan Kennington e-mail: [EMAIL PROTECTED] website: http://topology.org/ city: Adelaide, South Australia coords: 34.88051 S, 138.59334 E timezone: UTC+1030 http://topology.org/timezone.html pgp-key: http://topology.org/key_ak2.asc - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] Please read the FAQ at http://www.tux.org/lkml/
Re: easy-to-fix bug in /dev/null driver
Okay, okay, I didn't really make my point persuasively enough. The file linux/drivers/char/mem.c contains this: === static ssize_t write_null(struct file * file, const char * buf, size_t count, loff_t *ppos) { return count; } === Now try this little program: --- #include stdio.h #include errno.h #include sys/fcntl.h main() { char buf[1]; int fd = open("/dev/null", O_RDWR); int i; for (i = 1; i = 10; ++i) { int ret = write(fd, buf, 429496729 * i); if (ret 0) { fprintf(stderr, "i = %d, errno = %d\n", i, errno); perror("write"); } } } --- Result is this: --- i = 6, errno = 0 write: Success i = 7, errno = 0 write: Success i = 8, errno = 0 write: Success i = 9, errno = 0 write: Success i = 10, errno = 6 write: Device not configured --- To me, it's pretty clear that an error has occurred here. The error "Device not configured" has _not_ occurred. So it is an error for the kernel to say that it has! The cause is obviously that fact that the people who worked out the input/output types for write(2) didn't allow for the possibility that someone might really be able to write 2 GBytes or more at a time. But this is now becoming credible in some people's computers (not mine). I.e. whenever anyone tries to write 2 GBytes or more to a device, they're going to get a negative return value and possibly a positive errno value - _if_ the device permits such a big chunk to be written at once, which /dev/null does. The device might be a 622 Mbit/sec ATM card or something. That's only about 25 seconds of transmission time. So it's not unrealistic. Questions: Should device drivers in general be written to truncate the user-space request down to 2 GByte - 1 (2^31 - 1) or less? Or should the device driver flag such excessive write-calls as erroneous? I still think that write_null() should be rewritten as: === static ssize_t write_null(struct file * file, const char * buf, size_t count, loff_t *ppos) { return (count = 2147483647) ? count : 2147483647; } === This would fix the problem without introducing any new errors. (Unless someone change the definitions of ssize_t and size_t!!) Cheers, Alan Kennington. -------- name: Dr. Alan Kennington e-mail: [EMAIL PROTECTED] website: http://topology.org/ city: Adelaide, South Australia coords: 34.88051 S, 138.59334 E timezone: UTC+1030 http://topology.org/timezone.html pgp-key: http://topology.org/key_ak2.asc - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] Please read the FAQ at http://www.tux.org/lkml/
easy-to-fix bug in /dev/null driver
It seems to me that this code in linux/drivers/char/mem.c is a bug: === static ssize_t write_null(struct file * file, const char * buf, size_t count, loff_t *ppos) { return count; } === To activate the bug try this little program: --- #include #include #include main() { char buf[1]; int fd = open("/dev/null", O_RDWR); int i; printf("fd = %d\n", fd); for (i = 1; i <= 10; ++i) { int ret = write(fd, buf, -i); if (ret < 0) { fprintf(stderr, "i = %d, errno = %d\n", i, errno); perror("write"); } } } --- On my legacy 2.4.0-test1-ac21 system, I get this: --- fd = 3 i = 1, errno = 1 write: Operation not permitted i = 2, errno = 2 write: No such file or directory i = 3, errno = 3 write: No such process i = 4, errno = 4 write: Interrupted system call i = 5, errno = 5 write: Input/output error i = 6, errno = 6 write: Device not configured i = 7, errno = 7 write: Argument list too long i = 8, errno = 8 write: Exec format error i = 9, errno = 9 write: Bad file descriptor i = 10, errno = 10 write: No child processes --- You could argue that user-space users shouldn't do such stupid things, but in some contexts, the bug might be hard to find, and having wrong error messages just makes such bugs hard to find. Arguably, write() should not be defined to return the count of written bytes when this is impossible for very large writes. I.e. it is the definition of the user-space write() function which is meaningless for large counts - so why bother to permit this if a negative error code is returned when this is attempted? Perhaps more correct code for the write_null function would be: === static ssize_t write_null(struct file * file, const char * buf, size_t count, loff_t *ppos) { if ((ssize_t)count >= 0) return (ssize_t)count; else return 0x7fff; } === with preferably a symbol instead of 0x7fff. Cheers, Alan Kennington. -------- name: Dr. Alan Kennington e-mail: [EMAIL PROTECTED] website: http://topology.org/ city: Adelaide, South Australia coords: 34.88051 S, 138.59334 E timezone: UTC+1030 http://topology.org/timezone.html pgp-key: http://topology.org/key_ak2.asc - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] Please read the FAQ at http://www.tux.org/lkml/
easy-to-fix bug in /dev/null driver
It seems to me that this code in linux/drivers/char/mem.c is a bug: === static ssize_t write_null(struct file * file, const char * buf, size_t count, loff_t *ppos) { return count; } === To activate the bug try this little program: --- #include stdio.h #include errno.h #include sys/fcntl.h main() { char buf[1]; int fd = open("/dev/null", O_RDWR); int i; printf("fd = %d\n", fd); for (i = 1; i = 10; ++i) { int ret = write(fd, buf, -i); if (ret 0) { fprintf(stderr, "i = %d, errno = %d\n", i, errno); perror("write"); } } } --- On my legacy 2.4.0-test1-ac21 system, I get this: --- fd = 3 i = 1, errno = 1 write: Operation not permitted i = 2, errno = 2 write: No such file or directory i = 3, errno = 3 write: No such process i = 4, errno = 4 write: Interrupted system call i = 5, errno = 5 write: Input/output error i = 6, errno = 6 write: Device not configured i = 7, errno = 7 write: Argument list too long i = 8, errno = 8 write: Exec format error i = 9, errno = 9 write: Bad file descriptor i = 10, errno = 10 write: No child processes --- You could argue that user-space users shouldn't do such stupid things, but in some contexts, the bug might be hard to find, and having wrong error messages just makes such bugs hard to find. Arguably, write() should not be defined to return the count of written bytes when this is impossible for very large writes. I.e. it is the definition of the user-space write() function which is meaningless for large counts - so why bother to permit this if a negative error code is returned when this is attempted? Perhaps more correct code for the write_null function would be: === static ssize_t write_null(struct file * file, const char * buf, size_t count, loff_t *ppos) { if ((ssize_t)count = 0) return (ssize_t)count; else return 0x7fff; } === with preferably a symbol instead of 0x7fff. Cheers, Alan Kennington. -------- name: Dr. Alan Kennington e-mail: [EMAIL PROTECTED] website: http://topology.org/ city: Adelaide, South Australia coords: 34.88051 S, 138.59334 E timezone: UTC+1030 http://topology.org/timezone.html pgp-key: http://topology.org/key_ak2.asc - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] Please read the FAQ at http://www.tux.org/lkml/