Re: stat file -- cygwin vs. Windows size?

2005-06-24 Thread Larry Hall
At 01:06 PM 6/24/2005, you wrote:
> This buffer is being built for SpamAssassin which later
> gives an error saying (to the effect)
>
> Content-Length mismatch: Expected 818 bytes, got 798 bytes
>
> My suspicion is that stat is counting cr-lf as two characters
> but the input routines are treating these as one.
>
> If the file has about 20 lines, then that's 20 missing
> characters???


Yes, this is right.  And yes, this could be the cause of the 
situation you're noticing.
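
The arithmetic can be checked with a small sketch. It assumes every line of
the file ends in CR-LF and that a text-mode read hands back one '\n' per
pair; count_crlf() and text_mode_size() are hypothetical helper names, not
Cygwin functions.

```c
#include <stddef.h>

/* Count CR-LF pairs in a raw buffer. Each pair occupies two bytes on disk
 * (which is what stat reports) but arrives as a single '\n' when the file
 * is read in text mode. */
size_t count_crlf(const char *buf, size_t len)
{
    size_t pairs = 0;
    for (size_t i = 0; i + 1 < len; i++) {
        if (buf[i] == '\r' && buf[i + 1] == '\n') {
            pairs++;
            i++; /* skip the LF that belongs to this pair */
        }
    }
    return pairs;
}

/* Size a text-mode reader would see: on-disk size minus one byte per pair. */
size_t text_mode_size(const char *buf, size_t len)
{
    return len - count_crlf(buf, len);
}
```

With 20 CR-LF line endings, a file that stat reports as 818 bytes would come
back as 798 bytes from a text-mode read, which matches the numbers in the
error message.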


--
Larry Hall  http://www.rfk.com
RFK Partners, Inc.  (508) 893-9779 - RFK Office
838 Washington Street   (508) 893-9889 - FAX
Holliston, MA 01746 


--
Unsubscribe info:  http://cygwin.com/ml/#unsubscribe-simple
Problem reports:   http://cygwin.com/problems.html
Documentation: http://cygwin.com/docs.html
FAQ:   http://cygwin.com/faq/



RE: stat file -- cygwin vs. Windows size?

2005-06-24 Thread Herb Martin
 My suspicion is that stat is counting cr-lf as two 
 characters but the 
 input routines are treating these as one.
 
 If the file has about 20 lines, then that's 20 missing characters???
 
 
 Yes, this is right.  And yes, this could be the cause of the 
 situation you're noticing.

Is there a standard Cygwin 'idiom' or function for
dealing with this mismatch, or should I just re-invent
the wheel.

Seems like I read (skimmed) something related to this
in the Cygwin manual, probably near the back in the 
programming introduction

I know I picked up the concept somewhere (somewhere 
recent that is, as I have dealt with this across at
least five different OS conventions but not recently
and specifically on Cygwin.)

--
Thanks,
Herb





RE: stat file -- cygwin vs. Windows size?

2005-06-24 Thread Igor Pechtchanski
On Fri, 24 Jun 2005, Herb Martin wrote:

  My suspicion is that stat is counting cr-lf as two characters but the
  input routines are treating these as one.
  
  If the file has about 20 lines, then that's 20 missing characters???
 
  Yes, this is right.  And yes, this could be the cause of the
  situation you're noticing.

 Is there a standard Cygwin 'idiom' or function for dealing with this
 mismatch, or should I just re-invent the wheel.

Sure -- just force binary mode on the file (i.e., open it using O_BINARY
for the open() call, or the rb mode for the fopen() call).  The mount
type only applies if the mode is unspecified.
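
A minimal sketch of the fopen() variant, for comparison with what stat
reports. The helper names binary_read_size() and stat_size() are made up for
illustration; only fopen(), fread() and stat() are real calls.

```c
#include <stdio.h>
#include <sys/stat.h>

/* Return the number of bytes fread() delivers when the file is opened in
 * binary mode, or -1 on error. With "rb" no CR-LF translation takes place,
 * so the result should match st_size from stat(). */
long binary_read_size(const char *path)
{
    FILE *fp = fopen(path, "rb");
    if (fp == NULL)
        return -1;

    char buf[4096];
    long total = 0;
    size_t n;
    while ((n = fread(buf, 1, sizeof buf, fp)) > 0)
        total += (long)n;

    int failed = ferror(fp);
    fclose(fp);
    return failed ? -1 : total;
}

/* Return the on-disk size reported by stat(), or -1 on error. */
long stat_size(const char *path)
{
    struct stat st;
    if (stat(path, &st) != 0)
        return -1;
    return (long)st.st_size;
}
```

If the two numbers agree for the message file, the mismatch comes from
text-mode translation rather than from stat itself.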

 Seems like I read (skimmed) something related to this in the Cygwin
 manual, probably near the back in the programming introduction

 I know I picked up the concept somewhere (somewhere recent that is, as I
 have dealt with this across at least five different OS conventions but
 not recently and specifically on Cygwin.)

HTH,
Igor
-- 
http://cs.nyu.edu/~pechtcha/
  |\  _,,,---,,_[EMAIL PROTECTED]
ZZZzz /,`.-'`'-.  ;-;;,_[EMAIL PROTECTED]
 |,4-  ) )-,_. ,\ (  `'-'   Igor Pechtchanski, Ph.D.
'---''(_/--'  `-'\_) fL a.k.a JaguaR-R-R-r-r-r-.-.-.  Meow!

The Sun will pass between the Earth and the Moon tonight for a total
Lunar eclipse... -- WCBS Radio Newsbrief, Oct 27 2004, 12:01 pm EDT




RE: stat file -- cygwin vs. Windows size?

2005-06-24 Thread Igor Pechtchanski
On Fri, 24 Jun 2005, Igor Pechtchanski wrote:

 On Fri, 24 Jun 2005, Herb Martin wrote:

   My suspicion is that stat is counting cr-lf as two characters but the
   input routines are treating these as one.
   
   If the file has about 20 lines, then that's 20 missing characters???
  
   Yes, this is right.  And yes, this could be the cause of the
   situation you're noticing.
 
  Is there a standard Cygwin 'idiom' or function for dealing with this
  mismatch, or should I just re-invent the wheel.

 Sure -- just force binary mode on the file (i.e., open it using O_BINARY
 for the open() call, or the rb mode for the fopen() call).  The mount
 type only applies if the mode is unspecified.

A clarification: force binary mode on the opened file in your program, not
the actual on-disk data.

Note that if you do that, you'd also need to handle the CR ('\r', or 0x0d)
characters explicitly in your program.
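
One way to handle the CR explicitly, sketched for a line that was read with
fgets() from a stream opened with "rb". The name chomp_crlf() is
hypothetical.

```c
#include <string.h>

/* Remove a trailing LF from a line read in binary mode, then a trailing CR
 * if one remains. Returns the new length of the line. */
size_t chomp_crlf(char *line)
{
    size_t len = strlen(line);
    if (len > 0 && line[len - 1] == '\n')
        line[--len] = '\0';
    if (len > 0 && line[len - 1] == '\r')
        line[--len] = '\0';
    return len;
}
```

This only changes the in-memory copy of the line. The bytes on disk, and
therefore the size stat reports, stay as they were.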

HTH,
Igor




Re: stat file -- cygwin vs. Windows size?

2005-06-24 Thread Brian Dessent
Herb Martin wrote:

 Is there a standard Cygwin 'idiom' or function for
 dealing with this mismatch, or should I just re-invent
 the wheel.

Sure.  Don't use text mode.  Open the file in binary mode (O_BINARY with
open(), b with fopen()), or call setmode(fd, O_BINARY) once open, or
link against binmode.o.  Or just don't use text mode mounts.
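
A sketch of the setmode() route for a descriptor that is already open. It
assumes Cygwin declares setmode() in <io.h>. On systems with no text/binary
distinction the call is skipped and O_BINARY is defined as 0 so the same
source still compiles. force_binary() is a hypothetical wrapper name.

```c
#include <fcntl.h>
#ifdef __CYGWIN__
#include <io.h>      /* assumed location of the setmode() declaration */
#endif

/* Platforms without a text/binary distinction do not define O_BINARY. */
#ifndef O_BINARY
#define O_BINARY 0
#endif

/* Switch an already-open descriptor to binary mode.
 * Returns 0 on success (or where no switch is needed), -1 on failure. */
int force_binary(int fd)
{
#ifdef __CYGWIN__
    return setmode(fd, O_BINARY) == -1 ? -1 : 0;
#else
    (void)fd;        /* nothing to do: I/O is always binary here */
    return 0;
#endif
}
```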

Brian




RE: stat file -- cygwin vs. Windows size?

2005-06-24 Thread Larry Hall
At 02:08 PM 6/24/2005, you wrote:
> > My suspicion is that stat is counting cr-lf as two
> > characters but the
> > input routines are treating these as one.
> >
> > If the file has about 20 lines, then that's 20 missing characters???
> >
> >
> > Yes, this is right.  And yes, this could be the cause of the
> > situation you're noticing.
>
> Is there a standard Cygwin 'idiom' or function for
> dealing with this mismatch, or should I just re-invent
> the wheel.


If you actually believe that you want the file without cr/nl conversion
during a read, then you want to open it in binary mode (fopen() with rb
instead of r or open() with '| O_BINARY' appended).  This *may* be the 
solution in this case.  Since the default mode for opening files is 
always text but there is no difference in format/behavior between 
text and binary on UNIX/Linux, you wouldn't see an issue there.







RE: stat file -- cygwin vs. Windows size?

2005-06-24 Thread Gary R. Van Sickle
 -Original Message-
 From: [EMAIL PROTECTED] 
 [mailto:[EMAIL PROTECTED] On Behalf Of Herb Martin
 Sent: Friday, June 24, 2005 1:08 PM
 To: 'Cygwin List'
 Subject: RE: stat file -- cygwin vs. Windows size?
 
  My suspicion is that stat is counting cr-lf as two
  characters but the
  input routines are treating these as one.
  
  If the file has about 20 lines, then that's 20 missing 
 characters???
  
  
  Yes, this is right.  And yes, this could be the cause of 
 the situation 
  you're noticing.
 
 Is there a standard Cygwin 'idiom' or function for dealing 
 with this mismatch, or should I just re-invent the wheel.
 

As to the former, no, not Cygwin specifically.  The problem appears to be
that SpamAssassin is making the incorrect but all-too-common assumption that
text file == file of 8-bit ASCII characters with '\n' EOL characters.
This is as incorrect as thinking picture file == JPEG file.

Cygwin does have a number of features to band-aid many such broken Unix
codes, primarily the text mode mount feature, but these are just that, a
band-aid, not a fix of the root problem (and in your case (and in fact in a
similar case in mutt), it can't solve the problem).  As others have
indicated, the real and true solution here is to open the file in binary
mode and handle the various EOL character combinations in the SpamAssassin
code.  Which, yeah, is unfortunately reinventing a wheel which should have
been permanently reinvented in the last century.  But hey, it's only the
first few years of the 21st century, maybe by the 22nd we'll have this whole
CRLF/LF/CR/LFCR thing sorted out.
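
For what it's worth, handling the EOL combinations after a binary-mode read
does not take much code. The sketch below is one possible approach, not
anything taken from SpamAssassin: the hypothetical normalize_eol() rewrites
a buffer in place so that CRLF, LFCR, lone CR and lone LF all become a
single '\n'.

```c
#include <stddef.h>

/* Rewrite buf in place so that every line ending becomes a single '\n'.
 * CRLF and LFCR pairs collapse to one '\n'; a lone CR becomes '\n'.
 * Returns the new length. */
size_t normalize_eol(char *buf, size_t len)
{
    size_t out = 0;
    for (size_t in = 0; in < len; in++) {
        char c = buf[in];
        if (c == '\r' || c == '\n') {
            /* Swallow the partner byte of a two-byte line ending. */
            if (in + 1 < len) {
                char next = buf[in + 1];
                if ((next == '\r' || next == '\n') && next != c)
                    in++;
            }
            buf[out++] = '\n';
        } else {
            buf[out++] = c;
        }
    }
    return out;
}
```

Note that the returned length is the normalized size, so it will not match
what stat reports for a file that contained two-byte line endings.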

-- 
Gary R. Van Sickle





RE: stat file -- cygwin vs. Windows size?

2005-06-24 Thread Larry Hall
At 02:46 PM 6/24/2005, Gary R. Van Sickle wrote:

> But hey, it's only the
> first few years of the 21st century, maybe by the 22nd we'll have this whole
> CRLF/LF/CR/LFCR thing sorted out.

Yeah, I'm guessing this will be solved just after the advent of practical 
fusion reactors and the development of warp drive.  So we have a ways to 
go yet. ;-)





RE: stat file -- cygwin vs. Windows size?

2005-06-24 Thread Herb Martin
 Is there a standard Cygwin 'idiom' or function for dealing with this 
 mismatch, or should I just re-invent the wheel.
 
 
 If you actually believe that you want the file without cr/nl 
 conversion during a read, then you want to open it in binary 
 mode (fopen() with rb
 instead of r or open() with '| O_BINARY' appended).  This 
 *may* be the solution in this case.  Since the default mode 
 for opening files is always text but there is no difference 
 in format/behavior between text and binary on UNIX/Linux, 
 you wouldn't see an issue there.


Actually I am between a rock and hard place -- 
email server on one side and SpamD on the
other.

Apparently the SpamD 'protocol' requires passing the
size to SpamD.

I don't want to start re-writing code all over either
program -- I just want to talk the source email system
into telling spamd whatever it needs to know to be happy.

Currently, I am accumulating bytes, and will use that,
but I am missing something and not getting the write count
(YET.)





RE: stat file -- cygwin vs. Windows size?

2005-06-24 Thread Jason Pyeron


The binary size is accurate, text, by its nature may never be correct on 
any operating system, since it is buffered, parsed, etc by the OS in an OS 
dependent way.


If you use a binary mode then you will be fine.


On Fri, 24 Jun 2005, Herb Martin wrote:


Is there a standard Cygwin 'idiom' or function for dealing with this
mismatch, or should I just re-invent the wheel.



If you actually believe that you want the file without cr/nl
conversion during a read, then you want to open it in binary
mode (fopen() with rb
instead of r or open() with '| O_BINARY' appended).  This
*may* be the solution in this case.  Since the default mode
for opening files is always text but there is no difference
in format/behavior between text and binary on UNIX/Linux,
you wouldn't see an issue there.



Actually I am between a rock and hard place --
email server on one side and SpamD on the
other.

Apparently the SpamD 'protocol' requires passing the
size to SpamD.

I don't want to start re-writing code all over either
program -- I just want to talk the source email system
into telling spamd whatever it needs to know to be happy.

Currently, I am accumulating bytes, and will use that,
but I am missing something and not getting the write count
(YET.)



--
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
-   -
- Jason Pyeron  PD Inc. http://www.pdinc.us -
- Partner  Sr. Manager 7 West 24th Street #100 -
- +1 (410) 808-6646 (c) Baltimore, Maryland 21218   -
-   -
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-






RE: stat file -- cygwin vs. Windows size?

2005-06-24 Thread Herb Martin
Thanks folks -- the confirmation that I was on the right
path was a big help.

The suggestions to do it right were well intentioned 
but impractical since I didn't want to take over support
for TWO major software packages (or either one for that
matter.)

A small patch seems to work.  (Keep the bytes spooled
and send that number rather than whatever stat was 
showing.)

Since the bytes spooled to the file are what gets sent
to spamd that seems to be an accurate number.

I had a little trouble at first since it seemed the
file was cached (if it was written more than once
it really was only written to disk ONE TIME), so
zeroing the counter in the wrong place was hosing
my first naive attempt.
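
The idea of the patch, roughly sketched. This is not the actual patch: the
struct and function names are hypothetical, and the header spelling in
spool_header() is an assumption that should be checked against the spamd
protocol documentation.

```c
#include <stdio.h>

/* Hypothetical spool wrapper: remembers how many bytes were actually
 * written, so the caller need not ask stat() afterwards. */
struct spool {
    FILE  *fp;
    size_t bytes;   /* running total of bytes written to fp */
};

/* Start (or restart) a spool. The counter is zeroed here and nowhere else,
 * so it cannot be reset between writes to the same message. */
void spool_begin(struct spool *s, FILE *fp)
{
    s->fp = fp;
    s->bytes = 0;
}

/* Write len bytes and add what was really written to the total.
 * Returns 0 on success, -1 on a short write. */
int spool_write(struct spool *s, const void *data, size_t len)
{
    size_t n = fwrite(data, 1, len, s->fp);
    s->bytes += n;
    return n == len ? 0 : -1;
}

/* Format the length header from the spooled count (assumed spelling). */
int spool_header(const struct spool *s, char *out, size_t outlen)
{
    return snprintf(out, outlen, "Content-length: %zu\r\n", s->bytes);
}
```

Zeroing the counter only in spool_begin() is the design point: the total
then covers every write for one message, no matter how many times the
spooling code is entered.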

--
Herb







RE: stat file -- cygwin vs. Windows size?

2005-06-24 Thread Gary R. Van Sickle
 -Original Message-
 From: [EMAIL PROTECTED] 
 [mailto:[EMAIL PROTECTED] On Behalf Of Jason Pyeron
 Sent: Friday, June 24, 2005 2:40 PM
 To: cygwin@cygwin.com
 Subject: RE: stat file -- cygwin vs. Windows size?
 
 
 The binary size is accurate, text, by its nature may never be 
 correct on any operating system, since it is buffered, 
 parsed, etc by the OS in an OS dependent way.
 

Actually I am not sure that's correct.  I am unaware of any *OS* that does
anything like that (maybe the DOS INT13 stuff did, but we're talking ancient
history there).  The ones I can think of are sane enough to treat files as
what they are, i.e. a string of bytes, at the system call level, and do no
inspection of any kind on the contents (none that you're supposed to have to
care about anyway).

The culprit in this confusion is not the OSes but the C runtimes, and the
fact that on different OSes, some text file formats are more common than
others.  The C runtime essentially assumes that all files are text files,
when of course this is not and has never been the case.  What really should
be done is the deprecation of all texty features of the FILE object (e.g.
stuff like fprintf()), and create a new FILE_TEXT object which inherits
from FILE and adds all the texty operations such as fprintf(), fscanf(),
etc, in addition to being able to handle any of the myriad text file formats
in existence.

But that would make too much sense, so I for one shall not hold my breath.

-- 
Gary R. Van Sickle

