URL: <http://savannah.gnu.org/bugs/?27983>
Summary: Windows text mode mis-interpreted
Project: findutils
Submitted by: ilgiz
Submitted on: Mon 09 Nov 2009 11:04:14 PM EST
Category: xargs
Severity: 3 - Normal
Item Group: None
Status: None
Privacy: Public
Assigned to: None
Originator Name:
Originator Email:
Open/Closed: Open
Discussion Lock: Any
Release: None
Fixed Release: None

_______________________________________________________

Details:

When xargs receives its input from a pipe or redirect, it uses the stdin file descriptor prepared for it by the OS environment.

Emulation of Unix on Windows introduces an extra entity into file operations: text vs. binary mode processing. One has to set it explicitly with the "r" vs. "rb" (O_BINARY) open mode. The Unix-on-Windows OS environment has to rely on heuristics, such as the mount flag "binmode" and the "binmode" element of the CYGWIN variable, to decide on the default mode for the stdin and stdout file descriptors.

I am not sure whether the implementation of the "r" and "w" modes in fopen() calls also relies on these heuristics, but this could be verified by running my script before and after the change.

It appears that there is no such thing as "rt" and "wt" modes in the Open Group standard. These would be useful for turning on text mode processing explicitly.

http://www.opengroup.org/onlinepubs/000095399/functions/fopen.html

I believe text processing utilities that care about the Unix-on-Windows emulations need an option to specify input/output modes explicitly or, at least, need to declare a certain mode of operation explicitly. The "sed" utility has a "--binary" option, and other utilities such as "wc" and "tr" have switched to a fixed binary mode.

With xargs, the heuristic rules for choosing the text processing mode of the input stream work in step with the other tools that produce such streams. There is a corner case where a text stream is obtained from a pipe.
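For readers unfamiliar with the distinction, the following is a minimal C sketch of the input-side translation that text mode is generally understood to perform on Windows-style runtimes: each CR LF pair is collapsed to a single LF, which binary mode would pass through unchanged. The function name crlf_to_lf is hypothetical and is not taken from xargs, Cygwin, or any C library; it only emulates the effect on a buffer and does not change the mode of any stream.

```c
#include <stddef.h>

/* Hypothetical helper, for illustration only: emulate the input
   translation usually associated with text mode by collapsing every
   CR LF pair in buf[0..len) to a single LF, in place.  A CR that is
   not followed by LF is left untouched.  Returns the new length. */
size_t crlf_to_lf(char *buf, size_t len)
{
    size_t out = 0;

    for (size_t in = 0; in < len; in++) {
        if (buf[in] == '\r' && in + 1 < len && buf[in + 1] == '\n')
            continue;   /* drop the CR; the LF is copied on the next pass */
        buf[out++] = buf[in];
    }
    return out;
}
```

If the input reaches a reader in binary mode, the CR characters that such a translation would have removed stay in the data, so a line-oriented consumer may end up with a trailing CR on each item it reads.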
I believe pipes are always opened in binary mode by the Unix-on-Windows environment as it is difficult to predict the kind of processing that will consume the pipe.

I am attaching a test script that shows how xargs fails to set the text processing mode on standard input obtained from a pipe. I believe this could be done in a way similar to the "tr" implementation but with the opposite intention,

    stdin = freopen( NULL, "r", stdin );

http://git.savannah.gnu.org/gitweb/?p=coreutils.git;a=blob;f=gl/lib/xfreopen.c;h=32e68fa35c8f8cc5c95641a0fc3b761d23d6bf8d;hb=HEAD
http://git.savannah.gnu.org/gitweb/?p=coreutils.git;a=blob;f=src/tr.c;h=c5f18f98384e10ff871a4e4daac42270e96743d6;hb=HEAD
http://www.opengroup.org/onlinepubs/000095399/functions/freopen.html

(The xfreopen wrapper ignores the possibility that freopen may return a different pointer, so I would not recommend using this wrapper).

_______________________________________________________

File Attachments:

-------------------------------------------------------
Date: Mon 09 Nov 2009 11:04:14 PM EST
Name: text-mode.sh
Size: 3kB
By: ilgiz

A script showing how xargs 4.4.0 accepts the existing text processing mode of stdin
<http://savannah.gnu.org/bugs/download.php?file_id=19022>

_______________________________________________________

Reply to this item at:

  <http://savannah.gnu.org/bugs/?27983>

_______________________________________________
  Message sent via/by Savannah
  http://savannah.gnu.org/