[bug #27983] Windows text mode mis-interpreted
Follow-up Comment #9, bug #27983 (project findutils):

    app | d2u | xargs

That is a one-size-fits-all solution - it works no matter the Windows app on the left and no matter the consumer on the right.

This means that developers should be aware of the newline subtlety that may be exposed by the use of Windows native applications and Cygwin's text-mode mounts. Some developers may test their changes on binary-only mounts, forgetting about other users who are not even aware of the kind of mount they have.

> to make it easier to change an existing stream to binary-only

I do not understand your point. Earlier you argued that using the 'b' option in the file open mode makes no sense on POSIX, and this is confirmed by the Open Group specs. Now you are saying that, since the option should be harmless on POSIX-compliant systems, we could use it so that it makes a difference on some specific systems. But this hides the fact that Cygwin cannot be POSIX-compliant when processing arbitrary input streams or generating output streams, because Cygwin has to take the notion of native newlines into account.

As for the suggested change to either the use of freopen() or the xfreopen() wrapper with regard to preserving a possible append mode, I agree that all uses of freopen() with 'w' may need reviewing. I am afraid that modifying the implementation of freopen() itself to preserve the append mode would be too drastic, as it could change existing behavior in places where the append mode was not supposed to be preserved. (In the case of xargs, only the mode of the input stream needs to be changed, so preserving the append mode does not seem to be an issue to me.)

___
Reply to this item at: http://savannah.gnu.org/bugs/?27983
___
Message sent via/by Savannah http://savannah.gnu.org/
[bug #27983] Windows text mode mis-interpreted
Follow-up Comment #10, bug #27983 (project findutils):

> Cygwin cannot be POSIX-compliant

I meant that Cygwin applications cannot rely on POSIX compliance of the system features they consume.
[bug #27983] Windows text mode mis-interpreted
Follow-up Comment #6, bug #27983 (project findutils):

In the POSIX world, fopen() with 'b' has no effect, but with 't' has undefined behavior ('t' happens to be a cygwin extension, and you have no business trying to use it in an application trying to be portable to POSIX).

I still argue that the problem is not in findutils, but in your (mis)use of cygwin text mounts. Cygwin recommends using binary mounts, not text mounts, for a reason: apps written for POSIX expect POSIX behavior. When you violate that assumption, the burden is on you, not on the apps. If \r through pipes is a problem, then add d2u into the pipeline. That way, you have a single tool and a paradigm that can fix the issue for every situation, rather than having to patch every downstream tool to understand a new paradigm.

So findutils' testsuite won't pass on a cygwin text mount. I don't care, as long as it continues to pass on a cygwin binary mount.
[bug #27983] Windows text mode mis-interpreted
Follow-up Comment #5, bug #27983 (project findutils):

> Then you use cat, which preserves binary mode, through a pipe which also preserves binary mode.

But this is a valid sequence of commands, and Cygwin should preserve the behavior of this sequence regardless of the input's low-level line endings, as long as the latter are in accord with the text mode detection rules. Pipes are the corner case, and this is why I am asking for explicit input conversion to be forced in Cygwin's xargs.

The attached patch made the CRs disappear on reading the pipe. I also realized that the 't' and 'b' options in the fopen() mode argument are supposed to be irrelevant in the POSIX world.

Here are the outputs of the improved test script.

Before the patch:

    $ PATH=/usr/src/findutils-4.5.5-1/build/xargs:${PATH} ./text-mode.sh
    umount: /textmode: Invalid argument
    mount: warning - /textmode does not exist.
    CR found in /textmode/file1.txt.
    *** Unexpectedly, CR found in /textmode/file2.txt.
    CR found in /textmode/file3.txt.
    *** Unexpectedly, CR not found in /textmode/file4.txt.
    CR not found in /textmode/file5.txt.
    CR not found in /textmode/file6.txt.

After the patch:

    $ PATH=/usr/src/findutils-4.5.5-1/build/xargs:${PATH} ./text-mode.sh
    umount: /textmode: Invalid argument
    mount: warning - /textmode does not exist.
    CR found in /textmode/file1.txt.
    CR not found in /textmode/file2.txt.
    CR found in /textmode/file3.txt.
    CR found in /textmode/file4.txt.
    CR not found in /textmode/file5.txt.
    CR not found in /textmode/file6.txt.

(file #20165, file #20166)

Additional Item Attachment:
File name: xargs-4.5.5-1-explicit-text-or-binary-mode.txt Size:0 KB
File name: text-mode.sh Size:5 KB
[bug #27983] Windows text mode mis-interpreted
Follow-up Comment #3, bug #27983 (project findutils):

Is the expectation in the test script valid?

    mount -t $(cygpath -w ${dir}) /textmode
    [..]
    tf1=/textmode/file1.txt
    tf2=/textmode/file2.txt
    [..]
    echo foo > ${tf1}
    echo bar >> ${tf1}
    [..]
    cat ${tf1} | xargs -i echo -n =={}== > ${tf2}
    [..]
    find_crs ${tf2} # CR found (Savannah bug 27983 in xargs 4.4.0)
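The body of find_crs is not shown anywhere in this thread. As a rough idea of what such a check could look like, here is a minimal sketch; the implementation below is an assumption written for illustration and is not taken from text-mode.sh. Only the helper name and the message wording mirror the script excerpt and the outputs reported in the thread.

```shell
#!/bin/sh
# Hypothetical stand-in for the find_crs helper named in the test script;
# the real helper in text-mode.sh is not shown in this thread.
find_crs() {
    # Command substitution strips trailing newlines only, so the CR survives.
    cr=$(printf '\r')
    # LC_ALL=C keeps grep byte-oriented; the pattern is one literal CR byte.
    if LC_ALL=C grep -q "$cr" "$1"; then
        echo "CR found in $1."
    else
        echo "CR not found in $1."
    fi
}

# Usage: build one CRLF file and one LF-only file, then check both.
tmpdir=$(mktemp -d)
printf 'foo\r\nbar\r\n' > "$tmpdir/crlf.txt"
printf 'foo\nbar\n' > "$tmpdir/lf.txt"
find_crs "$tmpdir/crlf.txt"
find_crs "$tmpdir/lf.txt"
rm -rf "$tmpdir"
```

The sample files are written with printf rather than echo so that their line endings do not depend on the mount mode of the directory they live in.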
[bug #27983] Windows text mode mis-interpreted
Follow-up Comment #4, bug #27983 (project findutils):

In your example, $tf1 is created in text mode, so it contains file names listed with carriage returns. Then you use cat, which preserves binary mode, through a pipe which also preserves binary mode.

But this is an example of a useless use of cat. You could have instead used:

    xargs -i echo -n =={}== < $tf1 > $tf2

which would have opened tf1 in text mode (per the mount point), and avoided the wasted cat process in the first place. If you insist on using cat, then you should also be able to use d2u:

    cat $tf1 | d2u | xargs -i echo -n =={}== > $tf2

I still see no reason for findutils to work around this issue, especially given the fact that cygwin developers highly recommend binary mounts rather than text mounts, for the very reason that text mounts are non-POSIX and receive less testing.
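The filter-in-the-pipeline approach can be tried on systems without d2u. The sketch below uses tr -d '\r' as a stand-in, which is an assumption made for illustration: unlike a real DOS-to-Unix converter, it deletes every CR byte, not only those in CRLF pairs. The function names strip_cr and wrap_items are hypothetical. The sketch also uses xargs -I '{}' and a plain echo instead of the thread's -i and echo -n, so that each item lands on its own line.

```shell
#!/bin/sh
# strip_cr: stand-in for d2u (assumption); deletes every CR byte from stdin.
strip_cr() {
    tr -d '\r'
}

# wrap_items: read one item per line from stdin, drop CRs, and wrap each
# item in == markers, similar to the xargs call discussed in the thread.
wrap_items() {
    strip_cr | xargs -I '{}' echo "=={}=="
}

# Simulate data that went through a text-mode mount: CRLF line endings.
input=$(mktemp)
printf 'foo\r\nbar\r\n' > "$input"

# The filter sits between the producer and xargs, so xargs itself needs
# no knowledge of text mode.
cat "$input" | wrap_items

rm -f "$input"
```

Because the conversion happens in a separate pipeline stage, the same filter works in front of any consumer, which is the argument made in the comment for not patching each downstream tool.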
[bug #27983] Windows text mode mis-interpreted
Follow-up Comment #1, bug #27983 (project findutils):

Correction of my phrase on the extra file entity: one has to explicitly set it with the r vs. rb (O_BINARY) open mode. There is no explicit text mode setting. The r mode is the default and will be interpreted by the emulation layer as text or binary mode using a heuristic.
[bug #27983] Windows text mode mis-interpreted
Additional Item Attachment, bug #27983 (project findutils):

File name: text-mode.sh Size:2 KB