2011/5/24 Reini Urban:
> 2011/5/18 Sven Severus:
>> Let me report a strange behaviour with Cygwin Perl (I'm using cygwin1.dll
>> 1.7.9-1, full installation 2 weeks ago).
>>
>> File foo.h is an ordinary text file; all lines are terminated with DOS
>> style line endings <cr> <lf> (hex: 0d 0a).
>> It is located in a directory with textmode mounting in cygwin.
>> One <cr> <lf> sequence of foo.h is split by a 4096 byte boundary within
>> the file: "od -c -Ax foo.h" shows a <cr> (='\r') at byte offset 4095
>> (0xfff)
>> and a <lf> (='\n') at offset 4096 (0x1000):
>> ...
>> 000ff0   /   /   /   /   /   /  \r  \n   /   /   X   X   X   X   X  \r
>> 001000  \n   /   /  \r  \n   /   /  \r  \n
>> 001009
>>
>> Now I issued the command "perl -pe 's/12345/54321/' foo.h >foomod.h"
>> to produce foomod.h, located in the same directory as foo.h, thus with
>> textmode mounting too.
>> When I examined the result, I noticed that foomod.h was one byte bigger
>> than foo.h. I expected identical size, and "od -c -Ax foomod.h" reports:
>> ...
>> 000ff0   /   /   /   /   /   /  \r  \n   /   /   X   X   X   X   X  \r
>> 001000  \r  \n   /   /  \r  \n   /   /  \r  \n
>> 00100a
>>
>> Oops! The original <cr> <lf> sequence starting at offset 4095 (0xfff)
>> became a three character sequence <cr> <cr> <lf>! The <cr> is duplicated!
>>
>> In other files created by Perl with output redirection I observed this
>> behaviour with every <cr> <lf> line ending that is split by a 4096 byte
>> boundary (even multiple times in one output file). Line endings not
>> split by a 4096 byte boundary do not show this behaviour.
>>
>> The behaviour does not occur when the destination file is located
>> in a directory with binmode mounting. It does not occur either when
>> I use sed instead of Perl ("sed -e 's/12345/54321/' foo.h >foomod.h"),
>> so I think the problem is specific to Cygwin Perl, not to Cygwin in
>> general.
>>
>> Is this a bug in the output buffering mechanism of Cygwin Perl?
>> Or am I doing anything wrong?
>> Any answer is highly appreciated. Thanks in advance.
>
> Yes, this looks like a PerlIO buffering bug for MSWin32 and cygwin.
> The last char of the buffer is not stored when checking the first char
> of the new buffer.
> I think first we have to provide a sample test case to perl core.

I could not reproduce it in perl core with the PerlIO :crlf layer; see
the attached test. I'm investigating cygwin buffer edge-case handling now.
-- 
Reini Urban
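
The buffer-edge hypothesis from the quoted reply (the last character of one buffer is not remembered when the first character of the next buffer is checked) can be illustrated with a small model. This is a Python sketch, not PerlIO or Cygwin code: the function names, the `remember_last` switch, and the sample data are all assumptions made for illustration, and the thread does not establish that Cygwin Perl actually works this way. The model only shows that forgetting the trailing <cr> at a buffer edge would produce exactly the reported <cr> <cr> <lf>.

```python
def translate_chunk(chunk, prev_was_cr):
    """Turn each bare LF into CR LF; an LF already preceded by CR is kept as is."""
    out = bytearray()
    for byte in chunk:
        if byte == 0x0A and not prev_was_cr:
            out.append(0x0D)          # insert the missing CR
        out.append(byte)
        prev_was_cr = (byte == 0x0D)
    return bytes(out)


def write_text_mode(data, bufsize=4096, remember_last=True):
    """Translate data chunk by chunk, as a buffered text-mode writer might.

    remember_last=False models the hypothesized bug: the state
    "previous byte was CR" is dropped at every buffer boundary.
    """
    out = bytearray()
    prev_was_cr = False
    for start in range(0, len(data), bufsize):
        chunk = data[start:start + bufsize]
        if not remember_last:
            prev_was_cr = False       # hypothetical bug: edge state forgotten
        out += translate_chunk(chunk, prev_was_cr)
        prev_was_cr = chunk.endswith(b"\r")
    return bytes(out)


# Illustrative data: CR at offset 4095, LF at offset 4096, as in the report.
data = b"/" * 4095 + b"\r\n//\r\n"
buggy = write_text_mode(data, remember_last=False)
fixed = write_text_mode(data, remember_last=True)
```

In this model `fixed` equals the input, while `buggy` is one byte longer and contains a single <cr> <cr> <lf> starting at offset 4095. Line endings that do not straddle a buffer boundary are unaffected in both cases, which matches the observation in the original report.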
crlf-bufedge.patch
Description: Binary data
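
The contents of the attachment are not shown in this archive. Independently of it, the following Python sketch shows one way to build input with the layout from the original report and to check an output file for the symptom. The helper names (`make_sample`, `split_crlf_offsets`, `doubled_cr_offsets`) are hypothetical, and the 4096-byte buffer size is taken from the report rather than from any Perl or Cygwin source.

```python
def make_sample(bufsize=4096):
    """Build CRLF-terminated text whose CR sits at offset bufsize - 1."""
    line = b"//XXXXX\r\n"                      # 9 bytes per full line
    body = line * ((bufsize - 2) // len(line))
    pad = (bufsize - 1) - len(body)            # filler bytes before the CR
    # Filler, then CR at bufsize - 1 and LF at bufsize, then two more lines.
    return body + b"/" * pad + b"\r\n" + b"//\r\n//\r\n"


def split_crlf_offsets(data, bufsize=4096):
    """Offsets of each CR that ends a bufsize block and is followed by LF."""
    return [k - 1 for k in range(bufsize, len(data), bufsize)
            if data[k - 1:k + 1] == b"\r\n"]


def doubled_cr_offsets(data):
    """Offsets at which a CR CR LF sequence (the reported symptom) starts."""
    hits = []
    pos = data.find(b"\r\r\n")
    while pos != -1:
        hits.append(pos)
        pos = data.find(b"\r\r\n", pos + 1)
    return hits


sample = make_sample()
```

To try this against the report, `sample` could be written to foo.h in binary mode (`open("foo.h", "wb")`), the `perl -pe` command from the report run on a textmode mount, and `doubled_cr_offsets` applied to the bytes of foomod.h. If the reported behaviour is present, the offsets it returns should line up with those from `split_crlf_offsets` on the input.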