re: NetBSD Bad address failure (was Re: [HACKERS] Third call for platform testing)

2001-04-16 Thread matthew green



yes, this is a bug in netbsd-current that was introduced about 5 months
ago with the new unified buffer cache system.  it has been fixed.


thanks.



From: Chuck Silvers [EMAIL PROTECTED]
To: [EMAIL PROTECTED]
Subject: CVS commit: syssrc
Date: Mon, 16 Apr 2001 17:37:44 +0300 (EEST)


Module Name:    syssrc
Committed By:   chs
Date:   Mon Apr 16 14:37:44 UTC 2001

Modified Files:
syssrc/sys/nfs: nfs_bio.c

Log Message:
reads at or after EOF should "succeed".


To generate a diff of this commit:
cvs rdiff -r1.65 -r1.66 syssrc/sys/nfs/nfs_bio.c

Please note that diffs are not public domain; they are subject to the
copyright notices on the relevant files.


---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
subscribe-nomail command to [EMAIL PROTECTED] so that your
message can get through to the mailing list cleanly



Re: NetBSD Bad address failure (was Re: [HACKERS] Third call for platform testing)

2001-04-14 Thread Tom Ivar Helbekkmo

Tom Lane [EMAIL PROTECTED] writes:

  I think this is indisputably a bug in (some versions of) NetBSD.
 
 I forgot to mention a possible contributing factor: the files involved
 were NFS-mounted, in the case I was looking at.  So this may be an NFS
 problem more than a NetBSD problem.  Anyone want to try the given test
 case on NFS-mounted files on other systems?

I can verify, that with NetBSD-current on sparc, your test code works
the way you want it to on local disk, but fails (in the way you've
observed), if the target file is on an NFS-mounted file system.

-tih
-- 
The basic difference is this: hackers build things, crackers break them.




Re: NetBSD Bad address failure (was Re: [HACKERS] Third call for platform testing)

2001-04-14 Thread Tom Lane

Tom Ivar Helbekkmo [EMAIL PROTECTED] writes:
 I can verify, that with NetBSD-current on sparc, your test code works
 the way you want it to on local disk, but fails (in the way you've
 observed), if the target file is on an NFS-mounted file system.

FWIW, the test program succeeds (no error) using HPUX 10.20 and a couple
different Linux flavors as either client or server.  So I'm still
thinking that it's NetBSD-specific.  It would be useful to try it on
some other BSD derivatives though ...

regards, tom lane




NetBSD Bad address failure (was Re: [HACKERS] Third call for platform testing)

2001-04-13 Thread Tom Lane

Tom Ivar Helbekkmo [EMAIL PROTECTED] writes:
 Tom Lane [EMAIL PROTECTED] writes:
 CREATE INDEX hash_i4_index ON hash_i4_heap USING hash (random int4_ops);
 + ERROR:  cannot read block 3 of hash_i4_index: Bad address
 
 "Bad address"?  That seems pretty bizarre.

 This is obviously something that shows up on _some_ NetBSD platforms.
 The above was on sparc64, but that same problem is the only one I see
 in the regression testing on NetBSD/vax that isn't just different
 floating point (the VAX doesn't have IEEE), different ordering of
 (unordered) collections or different wording of strerror() output.

 NetBSD/i386 doesn't have the "Bad address" problem.

After looking into it, I find that the problem is this: Postgres, or at
least the hash-index part of it, expects to be able to lseek() to a
position past the end of a file and then get a non-failure return from
read().  (This happens indirectly because it uses ReadBuffer for blocks
that it has never yet written.)  Given the attached test program, I get
this result on my own machine:

$ touch z   -- create an empty file
$ ./a.out z 0   -- read at offset 0
Read 0 bytes
$ ./a.out z 1   -- read at offset 8K
Read 0 bytes

Presumably, the same result appears everywhere else that the regress
tests pass.  But NetBSD 1.5T gives

$ touch z
$ ./a.out z 0
Read 0 bytes
$ ./a.out z 1
read: Bad address
$ uname -a
NetBSD varg.i.eunet.no 1.5T NetBSD 1.5T (VARG) #4: Thu Apr  5 23:38:04 CEST 2001 
[EMAIL PROTECTED]:/usr/src/sys/arch/vax/compile/VARG vax

I think this is indisputably a bug in (some versions of) NetBSD.  If I
can seek past the end of file, read() shouldn't consider it a hard error
to read there --- and in any case, EFAULT isn't a very reasonable error
code to return.  Since it seems not to be a widespread problem, I'm not
eager to change the hash code to try to avoid it.

regards, tom lane


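#include <stdio.h>
#include <stdlib.h>
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

int main (int argc, char** argv)
{
    char *fname;
    int fd, readres;
    long seekres;
    char buf[8192];

    /* expect a file name and a block number */
    if (argc < 3)
    {
        fprintf(stderr, "usage: %s file blocknumber\n", argv[0]);
        exit(1);
    }
    fname = argv[1];

    fd = open(fname, O_RDONLY, 0);
    if (fd < 0)
    {
        perror(fname);
        exit(1);
    }
    /* seek to the start of the requested 8K block, possibly past EOF */
    seekres = lseek(fd, atoi(argv[2]) * 8192, SEEK_SET);
    if (seekres < 0)
    {
        perror("seek");
        exit(1);
    }
    /* at or after EOF this should return 0, not fail */
    readres = read(fd, buf, sizeof(buf));
    if (readres < 0)
    {
        perror("read");
        exit(1);
    }
    printf("Read %d bytes\n", readres);

    exit(0);
}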




Re: NetBSD Bad address failure (was Re: [HACKERS] Third call for platform testing)

2001-04-13 Thread Tom Lane

I wrote:
 I think this is indisputably a bug in (some versions of) NetBSD.  If I
 can seek past the end of file, read() shouldn't consider it a hard error
 to read there --- and in any case, EFAULT isn't a very reasonable error
 code to return.  Since it seems not to be a widespread problem, I'm not
 eager to change the hash code to try to avoid it.

I forgot to mention a possible contributing factor: the files involved
were NFS-mounted, in the case I was looking at.  So this may be an NFS
problem more than a NetBSD problem.  Anyone want to try the given test
case on NFS-mounted files on other systems?

regards, tom lane




Re: [lockhart@alumni.caltech.edu: [HACKERS] Third call for platform testing]

2001-04-08 Thread Tom Ivar Helbekkmo

matthew green [EMAIL PROTECTED] writes:

 i also believe the `Bad address' errors were caused when the test
 was run in an NFS mounted directory.

You may have something, there.  My test run on the VAX was over NFS.
I set up NetBSD on a VAX specifically to test PostgreSQL 7.1, but I
didn't have any disk available that it could use, so I went for NFS.

-tih
-- 
The basic difference is this: hackers build things, crackers break them.




re: [lockhart@alumni.caltech.edu: [HACKERS] Third call for platform testing]

2001-04-07 Thread matthew green

   
i will be reinstalling this SS20 with a full installation sometime in
the next few days.  i will re-run the testsuite after this to see if
that is causing any of the lossage.
   
   Please let us know.


actually, i had a classic i could test with -- all except horology passed,
so if there were two expected failures there, all is fine on NetBSD/sparc.




re: [lockhart@alumni.caltech.edu: [HACKERS] Third call for platform testing]

2001-04-07 Thread matthew green

   
digging into the regression.diffs, i can see that:
- reltime failed because it just had:
! psql: Backend startup failed
   
   The postmaster log file should have more info, but a first thought is
   that you ran up against process or swap-space limitations.  The parallel
   check has fifty-odd processes going at its peak, which is more than the
   default per-user process limit on many Unixen.

hmm, maxproc=80 on this system currently and i wasn't really doing anything
else.  it has 256MB ram and 280MB swap (unused).  exactly what am i looking
for in the postmaster.log file?  it is 65kb long...




re: [lockhart@alumni.caltech.edu: [HACKERS] Third call for platform testing]

2001-04-07 Thread matthew green


   matthew green [EMAIL PROTECTED] writes:
digging into the regression.diffs, i can see that:
- reltime failed because it just had:
! psql: Backend startup failed
  
   The postmaster log file should have more info, but a first thought is
   that you ran up against process or swap-space limitations.  The parallel
   check has fifty-odd processes going at its peak, which is more than the
   default per-user process limit on many Unixen.
   
hmm, maxproc=80 on this system currently and i wasn't really doing anything
else.  it has 256MB ram and 280MB swap (unused).  exactly what am i looking
for in the postmaster.log file?  it is 65kb long...
   
   Look for messages about "fork failed".  They should give a kernel error
   message too.


after running `unlimit' (tcsh) before `make check', the only failures i have
are the horology (expected) and the inherit sorted failures, on NetBSD/sparc64.


i also believe the `Bad address' errors were caused when the test was run in
an NFS mounted directory.


.mrg.




re: [lockhart@alumni.caltech.edu: [HACKERS] Third call for platform testing]

2001-04-07 Thread matthew green

   
 CREATE INDEX hash_i4_index ON hash_i4_heap USING hash (random int4_ops);
 + ERROR:  cannot read block 3 of hash_i4_index: Bad address

"Bad address"?  That seems pretty bizarre.
   
   This is obviously something that shows up on _some_ NetBSD platforms.
   The above was on sparc64, but that same problem is the only one I see

that Bad address message was actually from sparc.




Re: [lockhart@alumni.caltech.edu: [HACKERS] Third call for platform testing]

2001-04-05 Thread Tom Ivar Helbekkmo

Tom Lane [EMAIL PROTECTED] writes:

  CREATE INDEX hash_i4_index ON hash_i4_heap USING hash (random int4_ops);
  + ERROR:  cannot read block 3 of hash_i4_index: Bad address
 
 "Bad address"?  That seems pretty bizarre.

This is obviously something that shows up on _some_ NetBSD platforms.
The above was on sparc64, but that same problem is the only one I see
in the regression testing on NetBSD/vax that isn't just different
floating point (the VAX doesn't have IEEE), different ordering of
(unordered) collections or different wording of strerror() output.

NetBSD/i386 doesn't have the "Bad address" problem.

-tih
-- 
The basic difference is this: hackers build things, crackers break them.




Re: [lockhart@alumni.caltech.edu: [HACKERS] Third call for platform testing]

2001-04-05 Thread Tom Lane

Tom Ivar Helbekkmo [EMAIL PROTECTED] writes:
 Tom Lane [EMAIL PROTECTED] writes:
 CREATE INDEX hash_i4_index ON hash_i4_heap USING hash (random int4_ops);
 + ERROR:  cannot read block 3 of hash_i4_index: Bad address
 
 "Bad address"?  That seems pretty bizarre.

 This is obviously something that shows up on _some_ NetBSD platforms.

If it's reproducible on more than one box then we should look into it.
Am I right to guess that "Bad address" means a bogus pointer handed to
a kernel call?  If so, it'll probably take some digging with gdb to find
out the cause.  I'd be happy to do the digging if anyone can give me an
account reachable via telnet or ssh on one of these machines.

regards, tom lane




Re: [lockhart@alumni.caltech.edu: [HACKERS] Third call for platform testing]

2001-04-04 Thread Tom Lane

Thomas Lockhart [EMAIL PROTECTED] writes:
 Anyone have suggestions for Matthew?

 for postgresql-7.1RC2.tar.gz, here is my `make check' for NetBSD/sparc64:

 digging into the regression.diffs, i can see that:
 - reltime failed because it just had:
 ! psql: Backend startup failed

The postmaster log file should have more info, but a first thought is
that you ran up against process or swap-space limitations.  The parallel
check has fifty-odd processes going at its peak, which is more than the
default per-user process limit on many Unixen.
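
As a rough illustration of the limit being described, the following sketch
asks the kernel for the current per-user process limit before a parallel run.
It assumes RLIMIT_NPROC is available, which is true on the BSDs and Linux but
is an extension rather than part of POSIX. The helper name nproc_limit_allows
is hypothetical, and any threshold passed to it (60, say, as a stand-in for
"fifty-odd") is an assumption, not a number taken from the regression suite.

```c
#include <sys/types.h>
#include <sys/time.h>
#include <sys/resource.h>

/*
 * Hypothetical helper: reports whether the soft per-user process limit
 * permits at least `needed` processes.
 * Returns 1 if it does (or the limit is unlimited), 0 if the limit is
 * lower than `needed`, and -1 if the limit could not be queried.
 */
int nproc_limit_allows(unsigned long needed)
{
    struct rlimit rl;

    if (getrlimit(RLIMIT_NPROC, &rl) != 0)
        return -1;

    if (rl.rlim_cur == RLIM_INFINITY)
        return 1;

    return rl.rlim_cur >= (rlim_t) needed ? 1 : 0;
}
```

The soft limit counts every process the user owns, not only the ones started
by the test run, so a result of 1 does not guarantee that all forks succeed.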

 - inherit fails because the ordering is invalid, eg:

Ordering issues are not really bugs (cf documentation about interpreting
regression results), although it'd be interesting to know if these diffs
still occur after you resolve the other failures.

 - create_index failed because of some weird error that may
 have more to do with the quick-n-dirty installation i have
 on the SS20 i'm doing the test on:
 
 CREATE INDEX hash_i4_index ON hash_i4_heap USING hash (random int4_ops);
 + ERROR:  cannot read block 3 of hash_i4_index: Bad address

"Bad address"?  That seems pretty bizarre.

 i will be reinstalling this SS20 with a full installation sometime in
 the next few days.  i will re-run the testsuite after this to see if
 that is causing any of the lossage.

Please let us know.

regards, tom lane




Re: [HACKERS] Third call for platform testing

2001-04-03 Thread Tatsuo Ishii

 Unreported or problem platforms:
 
 Linux 2.0.x MIPS   7.0 2000-04-13 (Tatsuo has lost machine)
 mklinux PPC750 7.0 2000-04-13, Tatsuo Ishii
 NetBSD m68k    7.0 2000-04-10 (Henry has lost machine)
 NetBSD Sparc   7.0 2000-04-13, Tom I. Helbekkmo
 QNX 4.25 x86   7.0 2000-04-01, Dr. Andreas Kardos
 Ultrix MIPS    7.1 2001-??-??, Alexander Klimov
 
 mklinux has failed Tatsuo's testing afaicr. Demote to unsupported?

Yes. But you'd better change "mklinux" to "MkLinux DR1".  There may be
a chance that the latest MkLinux or gcc successfully runs 7.1...

 
 Any NetBSD partisans who can do testing or solicit testing from the
 NetBSD crowd? Same for OpenBSD?
 
 QNX is known to have problems with 7.1. Any hope of fixing for 7.1.1? Is
 there anyone able to work on it? If not, I'll move to the unsupported
 list.
 
 
 And here are the up-to-date platforms; thanks for the reports:
 
 AIX 4.3.3 RS6000   7.1 2001-03-21, Gilles Darold
 BeOS 5.0.3 x86 7.1 2000-12-18, Cyril Velter
 BSDI 4.01  x86 7.1 2001-03-19, Bruce Momjian
 Compaq Tru64 4.0g Alpha 7.1 2001-03-19, Brent Verner
 FreeBSD 4.3 x86    7.1 2001-03-19, Vince Vielhaber
 HPUX PA-RISC   7.1 2001-03-19, 10.20 Tom Lane, 11.00 Giles Lean
 IRIX 6.5.11 MIPS   7.1 2001-03-22, Robert Bruccoleri
 Linux 2.2.x Alpha  7.1 2001-01-23, Ryan Kirkpatrick
 Linux 2.2.x armv4l 7.1 2001-03-22, Mark Knox
 Linux 2.2.18 PPC750 7.1 2001-03-19, Tom Lane
 Linux 2.2.x S/390  7.1 2000-11-17, Neale Ferguson
 Linux 2.2.15 Sparc 7.1 2001-01-30, Ryan Kirkpatrick
 Linux 2.2.16 x86   7.1 2001-03-19, Thomas Lockhart
 MacOS X Darwin PPC 7.1 2000-12-11, Peter Bierman
 NetBSD 1.5 alpha   7.1 2001-03-22, Giles Lean
 NetBSD 1.5E arm32  7.1 2001-03-21, Patrick Welche
 NetBSD 1.5S x86    7.1 2001-03-21, Patrick Welche
 OpenBSD 2.8 x86    7.1 2001-03-22, Brandon Palmer
 SCO OpenServer 5 x86   7.1 2001-03-13, Billy Allie
 SCO UnixWare 7.1.1 x86 7.1 2001-03-19, Larry Rosenman
 Solaris 2.7 Sparc  7.1 2001-03-22, Marc Fournier
 Solaris x86    7.1 2001-03-27, Mathijs Brands
 SunOS 4.1.4 Sparc  7.1 2001-03-23, Tatsuo Ishii
 Windows/Win32 x86  7.1 2001-03-26, Magnus Hagander (clients only)
 WinNT/Cygwin x86   7.1 2001-03-16, Jason Tishler
 



[HACKERS] Third call for platform testing

2001-03-30 Thread Thomas Lockhart

Unreported or problem platforms:

Linux 2.0.x MIPS  7.0 2000-04-13 (Tatsuo has lost machine)
mklinux PPC750    7.0 2000-04-13, Tatsuo Ishii
NetBSD m68k       7.0 2000-04-10 (Henry has lost machine)
NetBSD Sparc      7.0 2000-04-13, Tom I. Helbekkmo
QNX 4.25 x86      7.0 2000-04-01, Dr. Andreas Kardos
Ultrix MIPS       7.1 2001-??-??, Alexander Klimov

mklinux has failed Tatsuo's testing afaicr. Demote to unsupported?

Any NetBSD partisans who can do testing or solicit testing from the
NetBSD crowd? Same for OpenBSD?

QNX is known to have problems with 7.1. Any hope of fixing for 7.1.1? Is
there anyone able to work on it? If not, I'll move to the unsupported
list.


And here are the up-to-date platforms; thanks for the reports:

AIX 4.3.3 RS6000        7.1 2001-03-21, Gilles Darold
BeOS 5.0.3 x86          7.1 2000-12-18, Cyril Velter
BSDI 4.01 x86           7.1 2001-03-19, Bruce Momjian
Compaq Tru64 4.0g Alpha 7.1 2001-03-19, Brent Verner
FreeBSD 4.3 x86         7.1 2001-03-19, Vince Vielhaber
HPUX PA-RISC            7.1 2001-03-19, 10.20 Tom Lane, 11.00 Giles Lean
IRIX 6.5.11 MIPS        7.1 2001-03-22, Robert Bruccoleri
Linux 2.2.x Alpha       7.1 2001-01-23, Ryan Kirkpatrick
Linux 2.2.x armv4l      7.1 2001-03-22, Mark Knox
Linux 2.2.18 PPC750     7.1 2001-03-19, Tom Lane
Linux 2.2.x S/390       7.1 2000-11-17, Neale Ferguson
Linux 2.2.15 Sparc      7.1 2001-01-30, Ryan Kirkpatrick
Linux 2.2.16 x86        7.1 2001-03-19, Thomas Lockhart
MacOS X Darwin PPC      7.1 2000-12-11, Peter Bierman
NetBSD 1.5 alpha        7.1 2001-03-22, Giles Lean
NetBSD 1.5E arm32       7.1 2001-03-21, Patrick Welche
NetBSD 1.5S x86         7.1 2001-03-21, Patrick Welche
OpenBSD 2.8 x86         7.1 2001-03-22, Brandon Palmer
SCO OpenServer 5 x86    7.1 2001-03-13, Billy Allie
SCO UnixWare 7.1.1 x86  7.1 2001-03-19, Larry Rosenman
Solaris 2.7 Sparc       7.1 2001-03-22, Marc Fournier
Solaris x86             7.1 2001-03-27, Mathijs Brands
SunOS 4.1.4 Sparc       7.1 2001-03-23, Tatsuo Ishii
Windows/Win32 x86       7.1 2001-03-26, Magnus Hagander (clients only)
WinNT/Cygwin x86        7.1 2001-03-16, Jason Tishler




Re: [HACKERS] Third call for platform testing

2001-03-30 Thread The Hermit Hacker

On Fri, 30 Mar 2001, Mathijs Brands wrote:

 On Fri, Mar 30, 2001 at 03:17:06PM +, Thomas Lockhart allegedly wrote:
  And here are the up-to-date platforms; thanks for the reports:

 SNIP

  Solaris 2.7 Sparc  7.1 2001-03-22, Marc Fournier

 Marc, was this done without unix sockets?

nope, purely default ... it was only the x86 platform that I had a bugger
with getting a clean regress working on ...






Re: [HACKERS] Third call for platform testing (linux 2.4.x)

2001-03-30 Thread Franck Martin

I still don't see an entry for Linux 2.4.x

Cheers.

Thomas Lockhart wrote:

 Unreported or problem platforms:

 Linux 2.0.x MIPS   7.0 2000-04-13 (Tatsuo has lost machine)
 mklinux PPC750 7.0 2000-04-13, Tatsuo Ishii
 NetBSD m68k    7.0 2000-04-10 (Henry has lost machine)
 NetBSD Sparc   7.0 2000-04-13, Tom I. Helbekkmo
 QNX 4.25 x86   7.0 2000-04-01, Dr. Andreas Kardos
 Ultrix MIPS    7.1 2001-??-??, Alexander Klimov

 mklinux has failed Tatsuo's testing afaicr. Demote to unsupported?

 Any NetBSD partisans who can do testing or solicit testing from the
 NetBSD crowd? Same for OpenBSD?

 QNX is known to have problems with 7.1. Any hope of fixing for 7.1.1? Is
 there anyone able to work on it? If not, I'll move to the unsupported
 list.

 And here are the up-to-date platforms; thanks for the reports:

 AIX 4.3.3 RS6000   7.1 2001-03-21, Gilles Darold
 BeOS 5.0.3 x86 7.1 2000-12-18, Cyril Velter
 BSDI 4.01  x86 7.1 2001-03-19, Bruce Momjian
 Compaq Tru64 4.0g Alpha 7.1 2001-03-19, Brent Verner
 FreeBSD 4.3 x86    7.1 2001-03-19, Vince Vielhaber
 HPUX PA-RISC   7.1 2001-03-19, 10.20 Tom Lane, 11.00 Giles Lean
 IRIX 6.5.11 MIPS   7.1 2001-03-22, Robert Bruccoleri
 Linux 2.2.x Alpha  7.1 2001-01-23, Ryan Kirkpatrick
 Linux 2.2.x armv4l 7.1 2001-03-22, Mark Knox
 Linux 2.2.18 PPC750 7.1 2001-03-19, Tom Lane
 Linux 2.2.x S/390  7.1 2000-11-17, Neale Ferguson
 Linux 2.2.15 Sparc 7.1 2001-01-30, Ryan Kirkpatrick
 Linux 2.2.16 x86   7.1 2001-03-19, Thomas Lockhart
 MacOS X Darwin PPC 7.1 2000-12-11, Peter Bierman
 NetBSD 1.5 alpha   7.1 2001-03-22, Giles Lean
 NetBSD 1.5E arm32  7.1 2001-03-21, Patrick Welche
 NetBSD 1.5S x86    7.1 2001-03-21, Patrick Welche
 OpenBSD 2.8 x86    7.1 2001-03-22, Brandon Palmer
 SCO OpenServer 5 x86   7.1 2001-03-13, Billy Allie
 SCO UnixWare 7.1.1 x86 7.1 2001-03-19, Larry Rosenman
 Solaris 2.7 Sparc  7.1 2001-03-22, Marc Fournier
 Solaris x86    7.1 2001-03-27, Mathijs Brands
 SunOS 4.1.4 Sparc  7.1 2001-03-23, Tatsuo Ishii
 Windows/Win32 x86  7.1 2001-03-26, Magnus Hagander (clients only)
 WinNT/Cygwin x86   7.1 2001-03-16, Jason Tishler
