Re: stat file -- cygwin vs. Windows size?
At 01:06 PM 6/24/2005, you wrote:

> This buffer is being built for SpamAssassin, which later gives an error saying (to the effect):
>
>     Content-Length mismatch: Expected 818 bytes, got 798 bytes
>
> My suspicion is that stat is counting cr-lf as two characters but the input routines are treating these as one. If the file has about 20 lines, then that's 20 missing characters???

Yes, this is right. And yes, this could be the cause of the situation you're noticing.

--
Larry Hall http://www.rfk.com
RFK Partners, Inc. (508) 893-9779 - RFK Office
838 Washington Street (508) 893-9889 - FAX
Holliston, MA 01746

--
Unsubscribe info: http://cygwin.com/ml/#unsubscribe-simple
Problem reports: http://cygwin.com/problems.html
Documentation: http://cygwin.com/docs.html
FAQ: http://cygwin.com/faq/
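The arithmetic behind that suspicion can be sketched in a few lines of C. This is only an illustration, not code from SpamAssassin or Cygwin: translated_length() is a hypothetical helper that counts how many bytes a reader would deliver if it collapsed every CR LF pair into a single LF, which is what a text-mode read is assumed to do here. For a file of 20 CRLF-terminated lines the result is 20 less than the on-disk size, matching the 818 vs. 798 report.

```c
#include <stddef.h>

/* Hypothetical helper: number of bytes a reader would deliver if it
 * collapsed every CR LF pair into a single LF (the assumed text-mode
 * behavior). The on-disk size, as reported by stat, is simply len. */
size_t translated_length(const char *buf, size_t len)
{
    size_t out = 0;

    for (size_t i = 0; i < len; i++) {
        /* Drop a CR only when it is immediately followed by LF. */
        if (buf[i] == '\r' && i + 1 < len && buf[i + 1] == '\n')
            continue;
        out++;
    }
    return out;
}
```

The difference len - translated_length(buf, len) is the number of CRLF line endings in the buffer.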
RE: stat file -- cygwin vs. Windows size?
>> My suspicion is that stat is counting cr-lf as two characters but the input routines are treating these as one. If the file has about 20 lines, then that's 20 missing characters???
>
> Yes, this is right. And yes, this could be the cause of the situation you're noticing.

Is there a standard Cygwin 'idiom' or function for dealing with this mismatch, or should I just re-invent the wheel?

It seems like I read (skimmed) something related to this in the Cygwin manual, probably near the back in the programming introduction. I know I picked up the concept somewhere (somewhere recent, that is, as I have dealt with this across at least five different OS conventions, but not recently and not specifically on Cygwin).

--
Thanks,
Herb
RE: stat file -- cygwin vs. Windows size?
On Fri, 24 Jun 2005, Herb Martin wrote:

> Is there a standard Cygwin 'idiom' or function for dealing with this mismatch, or should I just re-invent the wheel.

Sure -- just force binary mode on the file (i.e., open it using O_BINARY for the open() call, or the "rb" mode for the fopen() call). The mount type only applies if the mode is unspecified.

HTH,
Igor
--
http://cs.nyu.edu/~pechtcha/
Igor Pechtchanski, Ph.D.

The Sun will pass between the Earth and the Moon tonight for a total Lunar eclipse...
  -- WCBS Radio Newsbrief, Oct 27 2004, 12:01 pm EDT
RE: stat file -- cygwin vs. Windows size?
On Fri, 24 Jun 2005, Igor Pechtchanski wrote:

> Sure -- just force binary mode on the file (i.e., open it using O_BINARY for the open() call, or the "rb" mode for the fopen() call). The mount type only applies if the mode is unspecified.

A clarification: force binary mode on the opened file in your program, not on the actual on-disk data. Note that if you do that, you'd also need to handle the CR ('\r', or 0x0d) characters explicitly in your program.

HTH,
Igor
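As a rough sketch of what "handle the CR characters explicitly" could look like, the hypothetical helper below trims the line ending from a line that was read with fgets() from a stream opened in binary mode. The name chomp_eol() is an assumption made for illustration; it is not a Cygwin or libc function.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: strip a trailing "\n" or "\r\n" from a
 * NUL-terminated line and return the new length. Note that a lone
 * trailing CR (with no LF after it) is stripped as well. */
size_t chomp_eol(char *line)
{
    size_t n = strlen(line);

    if (n > 0 && line[n - 1] == '\n')
        line[--n] = '\0';
    if (n > 0 && line[n - 1] == '\r')
        line[--n] = '\0';
    return n;
}
```

With this in place the same parsing code sees identical line contents whether the file on disk uses LF or CRLF endings, while the byte counts taken from the raw read still agree with stat.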
Re: stat file -- cygwin vs. Windows size?
Herb Martin wrote:

> Is there a standard Cygwin 'idiom' or function for dealing with this mismatch, or should I just re-invent the wheel.

Sure. Don't use text mode. Open the file in binary mode (O_BINARY with open(), "b" with fopen()), or call setmode(fd, O_BINARY) once open, or link against binmode.o. Or just don't use text mode mounts.

Brian
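A minimal sketch of the first option, assuming nothing beyond the open() and O_BINARY names mentioned above: the hypothetical function read_matches_stat() opens a file in binary mode, reads it to the end, and checks that the number of bytes read equals the size reported by fstat(). The fallback definition of O_BINARY is an assumption added so the same source also compiles on systems that do not define the flag.

```c
#include <fcntl.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

#ifndef O_BINARY
#define O_BINARY 0 /* assumption: treat as a no-op where undefined */
#endif

/* Hypothetical check: returns 1 if a binary-mode read delivers exactly
 * st_size bytes, 0 if the counts differ, and -1 on any I/O error. */
int read_matches_stat(const char *path)
{
    char buf[4096];
    struct stat st;
    off_t total = 0;
    ssize_t n;

    int fd = open(path, O_RDONLY | O_BINARY);
    if (fd < 0)
        return -1;
    if (fstat(fd, &st) != 0) {
        close(fd);
        return -1;
    }
    /* No CR/LF translation happens here, so every byte is counted. */
    while ((n = read(fd, buf, sizeof buf)) > 0)
        total += n;
    close(fd);
    if (n < 0)
        return -1;
    return total == st.st_size;
}
```

The stdio equivalent would be fopen(path, "rb") followed by an fread() loop.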
RE: stat file -- cygwin vs. Windows size?
At 02:08 PM 6/24/2005, you wrote:

> Is there a standard Cygwin 'idiom' or function for dealing with this mismatch, or should I just re-invent the wheel.

If you actually believe that you want the file without cr/nl conversion during a read, then you want to open it in binary mode (fopen() with "rb" instead of "r", or open() with '| O_BINARY' appended to the flags). This *may* be the solution in this case. Since the default mode for opening files is always text but there is no difference in format/behavior between text and binary on UNIX/Linux, you wouldn't see an issue there.

--
Larry Hall
RE: stat file -- cygwin vs. Windows size?
-----Original Message-----
From: [EMAIL PROTECTED] On Behalf Of Herb Martin
Sent: Friday, June 24, 2005 1:08 PM
To: 'Cygwin List'
Subject: RE: stat file -- cygwin vs. Windows size?

> Is there a standard Cygwin 'idiom' or function for dealing with this mismatch, or should I just re-invent the wheel.

As to the former, no, not Cygwin specifically.

The problem appears to be that SpamAssassin is making the incorrect but all-too-common assumption that text file == file of 8-bit ASCII characters with '\n' EOL characters. This is as incorrect as thinking picture file == JPEG file. Cygwin does have a number of features to band-aid many such broken Unix codes, primarily the text mode mount feature, but these are just that, a band-aid, not a fix of the root problem (and in your case (and in fact in a similar case in mutt), it can't solve the problem).

As others have indicated, the real and true solution here is to open the file in binary mode and handle the various EOL character combinations in the SpamAssassin code. Which, yeah, is unfortunately reinventing a wheel which should have been permanently reinvented in the last century.

But hey, it's only the first few years of the 21st century, maybe by the 22nd we'll have this whole CRLF/LF/CR/LFCR thing sorted out.

--
Gary R. Van Sickle
RE: stat file -- cygwin vs. Windows size?
At 02:46 PM 6/24/2005, Gary R. Van Sickle wrote:

> But hey, it's only the first few years of the 21st century, maybe by the 22nd we'll have this whole CRLF/LF/CR/LFCR thing sorted out.

Yeah, I'm guessing this will be solved just after the advent of practical fusion reactors and the development of warp drive. So we have a ways to go yet. ;-)

--
Larry Hall
RE: stat file -- cygwin vs. Windows size?
> If you actually believe that you want the file without cr/nl conversion during a read, then you want to open it in binary mode (fopen() with rb instead of r or open() with '| O_BINARY' appended). This *may* be the solution in this case. Since the default mode for opening files is always text but there is no difference in format/behavior between text and binary on UNIX/Linux, you wouldn't see an issue there.

Actually I am between a rock and a hard place -- email server on one side and SpamD on the other.

Apparently the SpamD 'protocol' requires passing the size to SpamD. I don't want to start re-writing code all over either program -- I just want to talk the source email system into telling spamd whatever it needs to know to be happy.

Currently, I am accumulating bytes, and will use that, but I am missing something and not getting the write count (YET.)
RE: stat file -- cygwin vs. Windows size?
The binary size is accurate; the text size, by its nature, may never be correct on any operating system, since it is buffered, parsed, etc. by the OS in an OS-dependent way. If you use binary mode then you will be fine.

On Fri, 24 Jun 2005, Herb Martin wrote:

> Actually I am between a rock and hard place -- email server on one side and SpamD on the other. Apparently the SpamD 'protocol' requires passing the size to SpamD. I don't want to start re-writing code all over either program -- I just want to talk the source email system into telling spamd whatever it needs to know to be happy. Currently, I am accumulating bytes, and will use that, but I am missing something and not getting the write count (YET.)

--
Jason Pyeron, Partner Sr. Manager
PD Inc. http://www.pdinc.us
7 West 24th Street #100, Baltimore, Maryland 21218
+1 (410) 808-6646 (c)
RE: stat file -- cygwin vs. Windows size?
Thanks folks -- the confirmation that I was on the right path was a big help.

The suggestions to do it right were well intentioned but impractical, since I didn't want to take over support for TWO major software packages (or either one, for that matter).

A small patch seems to work: keep a count of the bytes spooled and send that number rather than whatever stat was showing. Since the bytes spooled to the file are what gets sent to spamd, that seems to be an accurate number.

I had a little trouble at first since it seemed the file was cached (if it was written more than once, it really was only written to disk ONE TIME), so zeroing the counter in the wrong place was hosing my first naive attempt.

--
Herb
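The patch itself was not posted, so the following is only a guess at its general shape, with every name (struct spool, spool_begin(), spool_write()) invented for illustration. It keeps a running count of the bytes handed to the spool writer and resets that count in exactly one place, at the start of a message, which is the spot the counter-zeroing problem described above suggests getting right. The count is of bytes passed to fwrite(), not of bytes that end up on disk.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical spool wrapper: pairs the spool stream with a running
 * count of the bytes written to it for the current message. */
struct spool {
    FILE  *fp;
    size_t bytes;
};

/* Start a new message: the only place where the counter is zeroed. */
void spool_begin(struct spool *s, FILE *fp)
{
    s->fp = fp;
    s->bytes = 0;
}

/* Write a chunk and add what was actually accepted to the counter.
 * Returns 0 on success, -1 on a short write. */
int spool_write(struct spool *s, const void *data, size_t len)
{
    size_t written = fwrite(data, 1, len, s->fp);

    s->bytes += written;
    return written == len ? 0 : -1;
}
```

After the last chunk, s.bytes would be the value reported as the message size in place of the figure from stat.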
RE: stat file -- cygwin vs. Windows size?
-----Original Message-----
From: [EMAIL PROTECTED] On Behalf Of Jason Pyeron
Sent: Friday, June 24, 2005 2:40 PM
To: cygwin@cygwin.com
Subject: RE: stat file -- cygwin vs. Windows size?

> The binary size is accurate, text, by its nature may never be correct on any operating system, since it is buffered, parsed, etc by the OS in an OS dependent way.

Actually I am not sure that's correct. I am unaware of any *OS* that does anything like that (maybe the DOS INT13 stuff did, but we're talking ancient history there). The ones I can think of are sane enough to treat files as what they are, i.e. a string of bytes, at the system call level, and do no inspection of any kind on the contents (none that you're supposed to have to care about anyway).

The culprit in this confusion is not the OSes but the C runtimes, and the fact that on different OSes, some text file formats are more common than others. The C runtime essentially assumes that all files are text files, when of course this is not and has never been the case.

What really should be done is the deprecation of all texty features of the FILE object (e.g. stuff like fprintf()), and the creation of a new FILE_TEXT object which inherits from FILE and adds all the texty operations such as fprintf(), fscanf(), etc., in addition to being able to handle any of the myriad text file formats in existence. But that would make too much sense, so I for one shall not hold my breath.

--
Gary R. Van Sickle