Re: Wrong file position after writing 65537 bytes to block device
On 18.12.2017 16:27, Steven Penny wrote:
> On Mon, 18 Dec 2017 14:10:35, Corinna Vinschen wrote:
>> In general, writes to disk devices are sector-oriented.  However,
>> in this case ftell should have returned 65536.  The problem here is
>> that the newlib implementation of ftell/ftello performs an fflush
>> when called on a write stream since about 2008 to adjust for appending
>> streams.  Given your example (thanks for the testcase!) this seems
>> pretty wrong.  Looking further it turns out that neither glibc nor BSD
>> actually calls fflush in this case.  There's only a special case for
>> appending streams, but this calls lseek, not fflush.
>>
>> Looks like a patch is required.  Stay tuned.
>
> is it though? he says "write 65536 + 1 bytes", but as far as i can tell, you
> can't do that. quoting myself:
>
>> Seeking, reading and writing must all be done in multiples of sector size,
>> in my case 512 bytes
>
> http://web.archive.org/web/stackoverflow.com/questions/37228874/how-to-fwrite-to-removable-volume
>
> so it would make sense that the position becomes "65536 + 512"

You can do that on a "block" device. It's "raw" devices that have
transfer unit restrictions.

A block device creates an abstraction over a disk, dividing it into
blocks. Those blocks are not related to the underlying sector size;
they could be larger (e.g. 4096 byte block size, versus 512 byte
sectors) or even smaller (e.g. 4096 byte block size, versus 65536 byte
flash erase block size).

Unix block devices let you read, write and seek using byte offsets and
sizes. The bytes which are affected by a write operation map to one or
more ranges in one or more blocks. All of the blocks have to be read
into memory (if they aren't already). The bytes are updated, and then
the blocks are marked dirty and written out (eventually). More changes
can take place before that happens.

So for instance if we have a block device (4096 bytes) over a flash
device with 64 kB erase blocks, we can write just one byte somewhere in
a block. When this change is flushed, the entire erase block has to be
erased and rewritten.

Because of the abstract nature of block devices, it's largely pointless
to use the "dd" utility; you can use "cp" to copy them. "dd" is
required when you need to control the exact size of the read and write
calls.

Thus "cat /dev/zero > /dev/blockdevice" has the same effect as
"dd if=/dev/zero of=/dev/blockdevice".

--
Problem reports:       http://cygwin.com/problems.html
FAQ:                   http://cygwin.com/faq/
Documentation:         http://cygwin.com/docs.html
Unsubscribe info:      http://cygwin.com/ml/#unsubscribe-simple
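The byte-to-block mapping described in the message above can be sketched as follows. This is a minimal model, not Cygwin or Linux kernel code: the function names `blocks_touched` and `partial_blocks` are hypothetical, the 4096-byte block size is the example figure from the message, and the idea that only partly covered blocks need a prior read is an assumption the message does not spell out.

```python
def blocks_touched(offset, length, block_size=4096):
    """Indices of every block that a write of `length` bytes at `offset` lands in."""
    if length <= 0:
        return []
    first = offset // block_size
    last = (offset + length - 1) // block_size
    return list(range(first, last + 1))


def partial_blocks(offset, length, block_size=4096):
    """Blocks only partly covered by the write.

    Assumption: these are the blocks that must be read before they can be
    updated, because some of their old bytes have to survive the write.
    """
    if length <= 0:
        return []
    end = offset + length
    partial = set()
    if offset % block_size:          # write starts in the middle of a block
        partial.add(offset // block_size)
    if end % block_size:             # write ends in the middle of a block
        partial.add((end - 1) // block_size)
    return sorted(partial)


print(blocks_touched(65536, 1))      # [16]
print(len(blocks_touched(0, 65537))) # 17
print(partial_blocks(0, 65537))      # [16]
```

Under these assumptions, the 65537-byte write from the test case covers 16 blocks completely and touches one extra byte in block 16, which is the block that would need a read-modify-write.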
Re: Wrong file position after writing 65537 bytes to block device
On Tue, Dec 19, 2017 at 6:19 PM, Corinna Vinschen wrote:
> On Dec 19 16:35, Ivan Kozik wrote:
>> From what I observe on Linux, it supports writing at any offset to the
>> block device because it does a read-modify-write behind the scenes,
>> with accompanying nasty overhead (e.g. writes going at 64MB/s instead
>> of an "expected" 180MB/s).
>
> That's what Cygwin was trying to emulate as well.  Debugging showed
> that it only works for reading, not for writing, because the latter
> neglected to fix up buffer pointers.  Those are used in lseek to report
> the Linux-like byte-exact file position.
>
> I pushed a patch and uploaded new developer snapshots to
> https://cygwin.com/snapshots/
>
> Please give them a test.

Hi Corinna,

It is writing correctly now, thank you for the fix!

Ivan
Re: Wrong file position after writing 65537 bytes to block device
On Dec 19 16:35, Ivan Kozik wrote:
> On Tue, Dec 19, 2017 at 4:13 PM, Eric Blake wrote:
> > Can block devices report an unaligned offset to lseek()?  If not, then when
> > writing an unaligned value to a block device, don't we have to do a
> > read-modify-write of the larger aligned cluster, and then put lseek() back
> > to the unaligned boundary, and have extra magic in ftell() to track that we
> > are at an unaligned position within the block device?  But that sounds like
> > a lot of nasty overhead; and that it would be better to make sure that block
> > devices can report unaligned lseek() locations (caveat: I haven't tested
> > what Linux does in that regard).
>
> From what I observe on Linux, it supports writing at any offset to the
> block device because it does a read-modify-write behind the scenes,
> with accompanying nasty overhead (e.g. writes going at 64MB/s instead
> of an "expected" 180MB/s).

That's what Cygwin was trying to emulate as well.  Debugging showed
that it only works for reading, not for writing, because the latter
neglected to fix up buffer pointers.  Those are used in lseek to report
the Linux-like byte-exact file position.

I pushed a patch and uploaded new developer snapshots to
https://cygwin.com/snapshots/

Please give them a test.

Corinna

--
Corinna Vinschen                  Please, send mails regarding Cygwin to
Cygwin Maintainer                 cygwin AT cygwin DOT com
Red Hat
Re: Wrong file position after writing 65537 bytes to block device
On Tue, Dec 19, 2017 at 4:13 PM, Eric Blake wrote:
> Can block devices report an unaligned offset to lseek()?  If not, then when
> writing an unaligned value to a block device, don't we have to do a
> read-modify-write of the larger aligned cluster, and then put lseek() back
> to the unaligned boundary, and have extra magic in ftell() to track that we
> are at an unaligned position within the block device?  But that sounds like
> a lot of nasty overhead; and that it would be better to make sure that block
> devices can report unaligned lseek() locations (caveat: I haven't tested
> what Linux does in that regard).

From what I observe on Linux, it supports writing at any offset to the
block device because it does a read-modify-write behind the scenes,
with accompanying nasty overhead (e.g. writes going at 64MB/s instead
of an "expected" 180MB/s).

I think you can observe this behavior on Linux by piping this
program's stdout to a block device (note: must be python3, not
python2):

    #!/usr/bin/python3

    import sys

    # Alternate a block-sized write with a 1-byte write so that
    # most writes start at an unaligned offset.
    block = b" " * 4096
    while True:
        sys.stdout.buffer.write(block)
        sys.stdout.buffer.write(b" ")

and watching the block device activity with `dstat -d -D sdN` - you
should see a lot of reads.

Ivan
Re: Wrong file position after writing 65537 bytes to block device
On 12/19/2017 09:46 AM, Ivan Kozik wrote:
> Thanks, I can confirm that the 2017-12-18 snapshot fixed the test
> program I posted.  What about the harder case where the program calls
> fflush, though?
>
> #include <stdio.h>
>
> int main(int argc, char *argv[]) {
>     FILE *f = fopen(argv[1], "w");
>     char x[65536 + 1];
>     fwrite(x, 1, 65536 + 1, f);
>     fflush(f);
>     printf("%ld", ftell(f));

Can block devices report an unaligned offset to lseek()?  If not, then
when writing an unaligned value to a block device, don't we have to do a
read-modify-write of the larger aligned cluster, and then put lseek()
back to the unaligned boundary, and have extra magic in ftell() to track
that we are at an unaligned position within the block device?  But that
sounds like a lot of nasty overhead; and that it would be better to make
sure that block devices can report unaligned lseek() locations (caveat:
I haven't tested what Linux does in that regard).

--
Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3266
Virtualization:  qemu.org | libvirt.org
Re: Wrong file position after writing 65537 bytes to block device
On Tue, Dec 19, 2017 at 9:14 AM, Corinna Vinschen wrote:
> Neither glibc nor FreeBSD show this behaviour.  Keep in mind that stdio
> is designed for buffered I/O.  What should happen, basically, is that a
> multiple of the stdio buffer size is written and the remainder is kept in
> the stdio buffer:
>
>   fwrite(65537)
>   -> write(65536)
>   -> store 1 byte in FILE._buf
>
> ftell calls lseek which returns 65536.  It adds the number of bytes
> still in the buffer, so it should return 65537.  Further fwrites
> seamlessly append to the bytes already written, as expected.  ftell
> calling fflush and thus setting the current file position to the next
> sector boundary breaks this expectation.
>
> I pushed a patch yesterday and uploaded new developer snapshots to
> https://cygwin.com/snapshots/
>
> Please test.

Thanks, I can confirm that the 2017-12-18 snapshot fixed the test
program I posted.  What about the harder case where the program calls
fflush, though?

    #include <stdio.h>

    int main(int argc, char *argv[]) {
        FILE *f = fopen(argv[1], "w");
        char x[65536 + 1];
        fwrite(x, 1, 65536 + 1, f);
        fflush(f);
        printf("%ld", ftell(f));
        return 0;
    }

cygwin reports 66048, while Linux reports 65537.  In cygwin, if such a
write is done in a loop, for example, you can get garbled output on
disk.

The explicit fflush may look unnecessary in C, but python3 (where I
originally observed the problem) appears to do implicit flushing.

If this is annoying to fix and I am the only one who notices, please
don't worry about it, I can just write in proper block sizes to block
devices.

Best regards,

Ivan
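The two numbers reported above differ by exactly the padding up to the next sector boundary. A minimal sketch of that arithmetic, assuming the 512-byte sector size quoted elsewhere in the thread; `round_up_to_sector` and `positions_after_flushes` are hypothetical helper names, and the loop is only a guess at how the drift would accumulate if every flush left the position sector-aligned:

```python
SECTOR_SIZE = 512  # sector size quoted in the thread


def round_up_to_sector(pos, sector_size=SECTOR_SIZE):
    """Round a byte position up to the next multiple of sector_size."""
    return -(-pos // sector_size) * sector_size


def positions_after_flushes(count, chunk=65536 + 1):
    """Assumed pre-fix model: each flushed chunk leaves the position sector-aligned."""
    pos, seen = 0, []
    for _ in range(count):
        pos = round_up_to_sector(pos + chunk)
        seen.append(pos)
    return seen


print(round_up_to_sector(65537))    # 66048
print(positions_after_flushes(2))   # [66048, 132096]
print([65537 * k for k in (1, 2)])  # [65537, 131074]
```

In this model the gap between the reported and the byte-exact position grows with every iteration, which would be consistent with the garbled output described above.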
Re: Wrong file position after writing 65537 bytes to block device
On Dec 18 16:27, Steven Penny wrote:
> On Mon, 18 Dec 2017 14:10:35, Corinna Vinschen wrote:
> > In general, writes to disk devices are sector-oriented.  However,
> > in this case ftell should have returned 65536.  The problem here is
> > that the newlib implementation of ftell/ftello performs an fflush
> > when called on a write stream since about 2008 to adjust for appending
> > streams.  Given your example (thanks for the testcase!) this seems
> > pretty wrong.  Looking further it turns out that neither glibc nor BSD
> > actually calls fflush in this case.  There's only a special case for
> > appending streams, but this calls lseek, not fflush.
> >
> > Looks like a patch is required.  Stay tuned.
>
> is it though? he says "write 65536 + 1 bytes", but as far as i can tell, you
> can't do that. quoting myself:
>
> > Seeking, reading and writing must all be done in multiples of sector size,
> > in my case 512 bytes
>
> http://web.archive.org/web/stackoverflow.com/questions/37228874/how-to-fwrite-to-removable-volume
>
> so it would make sense that the position becomes "65536 + 512"

Neither glibc nor FreeBSD show this behaviour.  Keep in mind that stdio
is designed for buffered I/O.  What should happen, basically, is that a
multiple of the stdio buffer size is written and the remainder is kept in
the stdio buffer:

  fwrite(65537)
  -> write(65536)
  -> store 1 byte in FILE._buf

ftell calls lseek which returns 65536.  It adds the number of bytes
still in the buffer, so it should return 65537.  Further fwrites
seamlessly append to the bytes already written, as expected.  ftell
calling fflush and thus setting the current file position to the next
sector boundary breaks this expectation.

I pushed a patch yesterday and uploaded new developer snapshots to
https://cygwin.com/snapshots/

Please test.

Thanks,
Corinna

--
Corinna Vinschen                  Please, send mails regarding Cygwin to
Cygwin Maintainer                 cygwin AT cygwin DOT com
Red Hat
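The expected behaviour described above (write out whole buffers, keep the remainder, and have ftell add the two) can be modelled in a few lines. This is a toy sketch, not newlib's implementation: the class `BufferedStream`, its attribute names, and the default buffer size of 65536 (chosen only because the message shows a 65536-byte write) are all assumptions.

```python
class BufferedStream:
    """Toy model of a stdio write stream sitting on top of a file descriptor."""

    def __init__(self, bufsize=65536):
        self.bufsize = bufsize
        self.fd_pos = 0    # what lseek(fd, 0, SEEK_CUR) would return
        self.buffered = 0  # bytes still held in the stdio buffer

    def fwrite(self, nbytes):
        """Write out whole multiples of the buffer size and keep the remainder."""
        total = self.buffered + nbytes
        written = (total // self.bufsize) * self.bufsize
        self.fd_pos += written
        self.buffered = total - written

    def ftell(self):
        """Descriptor position plus unflushed bytes; no flush is involved."""
        return self.fd_pos + self.buffered


stream = BufferedStream()
stream.fwrite(65536 + 1)
print(stream.fd_pos, stream.buffered, stream.ftell())  # 65536 1 65537
```

Because `ftell` in this model never flushes, a later `fwrite` continues from byte 65537 rather than from a sector boundary.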
Re: Wrong file position after writing 65537 bytes to block device
On Mon, 18 Dec 2017 14:10:35, Corinna Vinschen wrote:
> In general, writes to disk devices are sector-oriented.  However,
> in this case ftell should have returned 65536.  The problem here is
> that the newlib implementation of ftell/ftello performs an fflush
> when called on a write stream since about 2008 to adjust for appending
> streams.  Given your example (thanks for the testcase!) this seems
> pretty wrong.  Looking further it turns out that neither glibc nor BSD
> actually calls fflush in this case.  There's only a special case for
> appending streams, but this calls lseek, not fflush.
>
> Looks like a patch is required.  Stay tuned.

is it though? he says "write 65536 + 1 bytes", but as far as i can tell, you
can't do that. quoting myself:

> Seeking, reading and writing must all be done in multiples of sector size,
> in my case 512 bytes

http://web.archive.org/web/stackoverflow.com/questions/37228874/how-to-fwrite-to-removable-volume

so it would make sense that the position becomes "65536 + 512"
Re: Wrong file position after writing 65537 bytes to block device
On Dec 16 02:07, Ivan Kozik wrote:
> Hello,
>
> I have discovered that if you write 65536 + 1 bytes to a block device
> in cygwin, the file position can become 65536 + 512.
>
> With /dev/sdc as a throwaway USB block device:
>
> (cygwin_write.c is pasted below)
> # gcc -O2 -Wall -o cygwin_write cygwin_write.c
> # ./cygwin_write /dev/sdc
> 66048
>
> I am running 64-bit cygwin 2.9.0 on an updated Windows 8.1.  I saw the
> same results with an 8TB drive and a 512MB USB stick.

In general, writes to disk devices are sector-oriented.  However,
in this case ftell should have returned 65536.  The problem here is
that the newlib implementation of ftell/ftello performs an fflush
when called on a write stream since about 2008 to adjust for appending
streams.  Given your example (thanks for the testcase!) this seems
pretty wrong.  Looking further it turns out that neither glibc nor BSD
actually calls fflush in this case.  There's only a special case for
appending streams, but this calls lseek, not fflush.

Looks like a patch is required.  Stay tuned.

Thanks,
Corinna

--
Corinna Vinschen                  Please, send mails regarding Cygwin to
Cygwin Maintainer                 cygwin AT cygwin DOT com
Red Hat
Wrong file position after writing 65537 bytes to block device
Hello,

I have discovered that if you write 65536 + 1 bytes to a block device
in cygwin, the file position can become 65536 + 512.

With /dev/sdc as a throwaway USB block device:

(cygwin_write.c is pasted below)
# gcc -O2 -Wall -o cygwin_write cygwin_write.c
# ./cygwin_write /dev/sdc
66048

I am running 64-bit cygwin 2.9.0 on an updated Windows 8.1.  I saw the
same results with an 8TB drive and a 512MB USB stick.

Best regards,

Ivan

cygwin_write.c:

    #include <stdio.h>

    int main(int argc, char *argv[]) {
        /* argv[1] is the block device to write to, e.g. /dev/sdc */
        FILE *f = fopen(argv[1], "w");
        char x[65536 + 1];
        /* write one byte more than 64 KiB, then report the position */
        fwrite(x, 1, 65536 + 1, f);
        printf("%ld", ftell(f));
        return 0;
    }
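For comparison, the same check can be written in Python, where the problem was first noticed according to a later message in the thread. This is only a sketch: `position_after_write` is a hypothetical helper, and it is run here against a regular temporary file rather than a block device, so it shows the byte-exact position one would expect, not the Cygwin behaviour reported above.

```python
import os
import tempfile


def position_after_write(path, nbytes=65536 + 1, flush=False):
    """Write nbytes zero bytes to path and return the reported file position."""
    with open(path, "wb") as f:
        f.write(b"\0" * nbytes)
        if flush:
            f.flush()
        return f.tell()


if __name__ == "__main__":
    fd, tmp = tempfile.mkstemp()
    os.close(fd)
    try:
        print(position_after_write(tmp))              # 65537
        print(position_after_write(tmp, flush=True))  # 65537
    finally:
        os.remove(tmp)
```

Pointing `path` at a block device instead would overwrite its contents, so that should only be done with a throwaway device, as in the original report.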