Bug#448418: radioclkd segfaults on amd64

2007-10-30 Thread Jonathan Buzzard
Mark Scott wrote:
 Seems I spoke too soon.  While the segfault was prevented in test mode
 by applying Paul's patch, one occurred at a different place when not
 running in test mode.  The backtrace doesn't look that helpful -
 presumably the ?? in non-radioclkd code indicates library calls for
 which debugging info isn't available?  The binary used is one compiled
 with the patch applied.
 
 Core was generated by `/usr/sbin/radioclkd ttyS0'.
 Program terminated with signal 11, Segmentation fault.
 #0  0xf52508b3 in ?? ()
 (gdb) backtrace
 #0  0xf52508b3 in ?? ()
 #1  0x08cb08d6 in ?? ()
 #2  0xf8500963 in ?? ()
 #3  0x0f8e098e in ?? ()
 #4  0xf34a09cc in ?? ()
 #5  0xe0c80a71 in ?? ()
 #6  0xfdf803b6e03a in ?? ()
 #7  0xfc6a0ef2 in ?? ()
 #8  0x2ba81128 in ?? ()
 #9  0x40061128 in ?? ()
 #10 0xf5df15b2 in ?? ()
 #11 0x07b02c34 in ?? ()
 #12 0x3053 in ?? ()
 #13 0x4006 in ?? ()
 #14 0x4006 in ?? ()
 #15 0x00400e1b in _init ()
 #16 0x00402010 in ProcessTimeCode (c=0xb7606fd, radio=<value optimized out>) at radioclkd.c:709
 #17 0x7fffeef10cb8 in ?? ()
 #18 0x0002 in ?? ()
 #19 0x004024e0 in ProcessStatusChange () at radioclkd.c:845
 #20 0x in ?? ()
 
 
 Line 709 in the patched source is:
 
 if (CalculatePPSAverage(c, average)0) {
 
 and line 845 is:
 
 c->correct = 0;
 
 (which seems strange - how can that lead, 3 frames later, to line 845?).
 
 Commenting out the if block surrounding the call to CalculatePPSAverage
 on line 709 and always using the else block instead prevents the
 segfault.  The preceding comment "if possible use an averaged offset"
 suggests this is OK (but perhaps less accurate or less efficient?  I'm
 making changes now without understanding the implications...).
 

I guess the problem is in the libc qsort call: I pass in an array of
timediff values, which are ints, but tell qsort that each element is the
size of a long. That is not a problem on an ia32 machine, where the two
are the same size. Where they differ, qsort works on the wrong element
width and runs past the end of the array.
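
A minimal sketch of the fix implied above, assuming the offsets are held
in a plain int array. The names sort_timediffs and compare_int are
hypothetical and are not taken from radioclkd.c. The point is to let
sizeof follow the array's real element type instead of naming a type by
hand:

```c
#include <stdlib.h>

/* Comparison callback for qsort(): orders two ints ascending. */
static int compare_int(const void *a, const void *b)
{
    int x = *(const int *)a;
    int y = *(const int *)b;
    return (x > y) - (x < y);   /* avoids the overflow risk of x - y */
}

/* Sort an array of int offsets in place.
 * Hypothetical helper, not a radioclkd function. */
static void sort_timediffs(int *timediff, size_t count)
{
    /* sizeof *timediff is the size of one array element, so the call
     * is right whether long is 8 bytes (amd64) or 4 bytes (ia32).
     * Passing sizeof(long) here would make qsort treat the buffer as
     * twice its real length on amd64. */
    qsort(timediff, count, sizeof *timediff, compare_int);
}
```

Called as sort_timediffs(timediff, n) with n the number of elements,
this never asks qsort to touch memory beyond the array.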


JAB.

-- 
Jonathan A. Buzzard Email: jonathan (at) buzzard.me.uk
Northumberland, United Kingdom.   Tel: +44 1661-832195



-- 
To UNSUBSCRIBE, email to [EMAIL PROTECTED]
with a subject of unsubscribe. Trouble? Contact [EMAIL PROTECTED]



Bug#448418: radioclkd segfaults on amd64

2007-10-30 Thread Paul Martin
On Tue, Oct 30, 2007 at 07:31:58AM +, Jonathan Buzzard wrote:

 I guess the problem is in the libc qsort call, I pass in an array of
 timediff which is int's but tell qsort that it is of size long. Not a
 problem on an ia32 machine as the two are the same size. However if they
 are different it will cause a problem.

On amd64

#include <stdio.h>
int main(void) {
 /* sizeof yields a size_t, so print it with %zu rather than %d */
 printf("int=%zu long=%zu\n", sizeof(int), sizeof(long));
 return 0;
}

yields:

int=4 long=8

-- 
Paul Martin [EMAIL PROTECTED]






Bug#448418: radioclkd segfaults on amd64

2007-10-29 Thread Mark Scott

Paul Martin wrote:


radioclkd tries a binary chop across the full range of time_t, starting with
the middle value that the type can hold. Unfortunately, gmtime() doesn't
like handling the value 0x4000

What happens if you add the following line at line 213 in radioclkd.c?

/* calculate the number of magnitude bits in a time_t */
for (bits=0,timep=1;timep>0;bits++,timep<<=1)
;

+   if (bits > 48) bits = 48;

/* if time_t is signed, 0 is the median value else 1<<bits is median */
timep = (timep<0) ? 0 : ((time_t) 1<<bits);


This prevents the segfault.


This limits the range of time_t values to years 1970 to 8921556. If we're
still using longwave radio clocks in eight million years time, we've got
major problems.


So your fix is a practical one although I guess Jonathan's suggestions 
might provide a 'better' solution - I'll leave it to you to decide.  I'm 
no C coder so trying Jonathan's suggestions out would require some 
research on my part (that would have to wait until next weekend).  I'm 
more than happy to test other patches out sooner though.


Many thanks to both for the prompt response.

--
Mark Scott
[EMAIL PROTECTED]






Bug#448418: radioclkd segfaults on amd64

2007-10-29 Thread Jonathan Buzzard
On Mon, 2007-10-29 at 00:24 +, Paul Martin wrote:
 On Sun, Oct 28, 2007 at 10:07:59PM +, Mark Scott wrote:
  I have a home-built radio clock receiver for the DCF77 time signal
  that has been working fine for a year while attached to an i386
  machine.  I moved it to an amd64 machine and find that radioclkd
  segfaults when decoding the DCF77 signal.
 
 radioclkd tries a binary chop across the full range of time_t, starting with
 the middle value that the type can hold. Unfortunately, gmtime() doesn't
 like handling the value 0x4000
 
 What happens if you add the following line at line 213 in radioclkd.c?
 
   /* calculate the number of magnitude bits in a time_t */
   for (bits=0,timep=1;timep>0;bits++,timep<<=1)
   ;
 
 + if (bits > 48) bits = 48;
 
   /* if time_t is signed, 0 is the median value else 1<<bits is median */
   timep = (timep<0) ? 0 : ((time_t) 1<<bits);
 
 
 This limits the range of time_t values to years 1970 to 8921556. If we're
 still using longwave radio clocks in eight million years time, we've got
 major problems.

Given that all the longwave radio clocks only broadcast two digit year
numbers, eight million years is not a problem.

However surely this is a libc bug and not a radioclkd bug. It is not in
my opinion appropriate in an open source system to patch radioclkd to
work around a bug in libc. We have access to the source of libc and the
gmtime function should be fixed in libc. It has the benefit of fixing
bugs in other programs potentially before they even manifest themselves.
I guess the problem is that libc is not 64bit clean :-(


JAB.

-- 
Jonathan A. Buzzard Email: jonathan (at) buzzard.me.uk
Northumberland, United Kingdom.   Tel: +44 1661-832195









Bug#448418: radioclkd segfaults on amd64

2007-10-29 Thread Mark Scott

Jonathan Buzzard wrote:


However surely this is a libc bug and not a radioclkd bug.


Presumably the same one reported (albeit for alpha) in :

glibc gmtime() broken for 64bit time_t (in arch=alpha)
http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=200641

(which is merged with:

gmtime returns NULL for time_ts larger than 40 bits on Alpha
http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=238786)?

These have been open for over 4 years and over 3 years respectively :-(

The pragmatist in me says the radioclkd workaround is needed, if not ideal.
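
The merged bugs above describe gmtime returning NULL for large time_t
values. A minimal sketch of a defensive caller follows, assuming (the
thread does not confirm this) that an unchecked NULL return is what ends
up being dereferenced. The wrapper name safe_gmtime is hypothetical and
is not part of radioclkd:

```c
#include <stddef.h>
#include <time.h>

/* Convert t to a broken-down UTC time.
 * Returns 0 on success, or -1 if gmtime() cannot represent t
 * (for example when the year does not fit in tm_year).
 * Hypothetical wrapper, not a radioclkd function. */
static int safe_gmtime(time_t t, struct tm *out)
{
    struct tm *tm = gmtime(&t);

    if (tm == NULL)
        return -1;      /* caller must treat this probe as out of range */

    *out = *tm;         /* copy out of gmtime's static buffer */
    return 0;
}
```

A search loop that probes time_t values could then step away from any
value for which this returns -1 instead of crashing.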

--
Mark Scott
[EMAIL PROTECTED]






Bug#448418: radioclkd segfaults on amd64

2007-10-28 Thread Paul Martin
On Sun, Oct 28, 2007 at 10:07:59PM +, Mark Scott wrote:
 I have a home-built radio clock receiver for the DCF77 time signal
 that has been working fine for a year while attached to an i386
 machine.  I moved it to an amd64 machine and find that radioclkd
 segfaults when decoding the DCF77 signal.
 
 I rebuilt the binary package locally to preserve debug info, and
 attach the executable and a core file.

Looks like it's not 64-bit clean.

I have an amd64 machine of my own, but never connected my MSF receiver to
it. Time for some digging.

-- 
Paul Martin [EMAIL PROTECTED]






Bug#448418: radioclkd segfaults on amd64

2007-10-28 Thread Jonathan Buzzard
Paul Martin wrote:
 On Sun, Oct 28, 2007 at 10:07:59PM +, Mark Scott wrote:
 I have a home-built radio clock receiver for the DCF77 time signal
 that has been working fine for a year while attached to an i386
 machine.  I moved it to an amd64 machine and find that radioclkd
 segfaults when decoding the DCF77 signal.

 I rebuilt the binary package locally to preserve debug info, and
 attach the executable and a core file.
 
 Looks like it's not 64-bit clean.
 
 I have an amd64 machine of my own, but never connected my MSF receiver to
 it. Time for some digging.
 

I guess the UTCtime function is the culprit. The algorithm (though not
the code) was lifted from the mktime routine in libntp. I would check a
current version of this against UTCtime. You could also test by
replacing UTCtime with mktime from libc, and setting the timezone to
UTC. If that makes it go away it is definitely the problem.
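
One way to run that comparison could look like the sketch below. It
assumes a POSIX system (setenv and tzset are not ISO C), and the helper
name utc_mktime is hypothetical. The signature of UTCtime is not shown
in this thread, so the sketch only covers the libc side:

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200112L   /* for setenv() and tzset() */
#endif
#include <stdlib.h>
#include <time.h>

/* Convert a broken-down UTC time to time_t with libc's mktime().
 * Hypothetical helper. It changes TZ for the whole process, which is
 * acceptable in a test build but not something a library should do. */
static time_t utc_mktime(struct tm *tm)
{
    setenv("TZ", "UTC", 1);   /* make mktime() interpret tm as UTC */
    tzset();
    tm->tm_isdst = 0;         /* UTC has no daylight saving time */
    return mktime(tm);
}
```

Feeding the same decoded date and time to both routines and comparing
the results would show whether the hand-written conversion is at fault.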

I don't have any 64bit machines that I can hook a receiver up to, to
test. All the ones I have access to are either running in 32bit mode or
are servers that I am not going to mess with.


JAB.

-- 
Jonathan A. Buzzard Email: jonathan (at) buzzard.me.uk
Northumberland, United Kingdom.   Tel: +44 1661-832195






Bug#448418: radioclkd segfaults on amd64

2007-10-28 Thread Paul Martin
On Sun, Oct 28, 2007 at 10:07:59PM +, Mark Scott wrote:
 I have a home-built radio clock receiver for the DCF77 time signal
 that has been working fine for a year while attached to an i386
 machine.  I moved it to an amd64 machine and find that radioclkd
 segfaults when decoding the DCF77 signal.

radioclkd tries a binary chop across the full range of time_t, starting with
the middle value that the type can hold. Unfortunately, gmtime() doesn't
like handling the value 0x4000

What happens if you add the following line at line 213 in radioclkd.c?

/* calculate the number of magnitude bits in a time_t */
for (bits=0,timep=1;timep>0;bits++,timep<<=1)
;

+   if (bits > 48) bits = 48;

/* if time_t is signed, 0 is the median value else 1<<bits is median */
timep = (timep<0) ? 0 : ((time_t) 1<<bits);


This limits the range of time_t values to years 1970 to 8921556. If we're
still using longwave radio clocks in eight million years time, we've got
major problems.
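
The arithmetic behind that upper year can be checked with a short
sketch. It assumes a signed integer time_t (as on Linux) and a mean
Gregorian year of 31556952 seconds. The names search_bits, last_year
and MAX_SEARCH_BITS are hypothetical. Unlike the loop in radioclkd.c,
it derives the bit count from sizeof, since shifting a signed value
into its sign bit is undefined behaviour in C:

```c
#include <limits.h>
#include <time.h>

enum { MAX_SEARCH_BITS = 48 };   /* the cap proposed in the patch */

/* Magnitude bits of time_t to search over, capped at MAX_SEARCH_BITS.
 * Assumes time_t is a signed integer type. */
static int search_bits(void)
{
    int bits = (int)(sizeof(time_t) * CHAR_BIT) - 1;

    if (bits > MAX_SEARCH_BITS)
        bits = MAX_SEARCH_BITS;
    return bits;
}

/* Calendar year reached 2^bits seconds after the 1970 epoch, using
 * the mean Gregorian year of 365.2425 days (31556952 seconds). */
static long long last_year(int bits)
{
    return 1970 + (1LL << bits) / 31556952LL;
}
```

With bits capped at 48, last_year gives 8921556, which matches the
range quoted above. With 31 bits it gives 2038, the familiar limit of a
32-bit time_t.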

-- 
Paul Martin [EMAIL PROTECTED]


