Re: Problem reports for version control systems

2021-05-02 Thread Johnny Billquist

On 2021-05-02 18:21, Michael van Elst wrote:

b...@update.uu.se (Johnny Billquist) writes:


And as a "fun" fact. On my 4000/90, it takes about 3h after I start a
cvs update until I actually start having any network traffic...


A SCSI SSD could help. :)


Definitely. Because the disk is 100% busy all the time. Not a very 
modern disk either. RZ29 if I remember right.
SSD could make a big difference. And that kind of capacity isn't that 
much today. 4G or so...


  Johnny

--
Johnny Billquist  || "I'm on a bus
  ||  on a psychedelic trip
email: b...@softjar.se ||  Reading murder books
pdp is alive! ||  tryin' to stay hip" - B. Idol


Re: Problem reports for version control systems

2021-05-02 Thread Johnny Billquist

On 2021-05-02 16:32, Anders Magnusson wrote:

On 2021-05-02 at 15:57, Johnny Billquist wrote:

On 2021-05-02 13:51, Anders Magnusson wrote:

On 2021-05-02 at 13:44, Johnny Billquist wrote:


I suspect what is commonly the problem here is related to the fact
that cvs has a phase at the beginning where it scans through the
file system, which can take quite a while. Some NAT devices along
the path have timeouts on existing connections: if no traffic
happens for a while, the connection is dropped, even though there
haven't been any FINs on it.
So a connection that just doesn't have any traffic for a while is hit
by this, which is exactly the pattern you have with cvs.


I've seen the same effect on a simple telnet session, where ssh
survives fine. There it's just that when the connection is idle,
telnet does not create any traffic at all, while ssh does generate a
bit of traffic even if there is no activity.


So one obvious solution is to use something like ssh as a carrier
for the cvs traffic, if possible, or else see if some kind of
keepalives can be enabled on the connection, to defeat NAT and similar
devices which aggressively drop connections on which there is no
traffic for a while.
(Or, of course, if there is a NAT you have control over, you might 
be able to change how it behaves...)

This is quite common, yes.
I usually add ssh keepalive to ssh_config for all hosts to avoid this
problem (which may occur, as written, when doing cvs update).


And as a "fun" fact. On my 4000/90, it takes about 3h after I start a 
cvs update until I actually start having any network traffic... In 
total it takes something like 8h to do a cvs update on /usr/src.
(I guess I'm a bit masochistic in still insisting on trying to do
things on my VAXen...)

Have you tried it on the 8650? :-)


Last time I did manage, I think it was close to a day. That was with RA73
drives. But you know there have been problems with having two unibuses
lately, and so on. So it's been quite a while since I actually managed
to do much of anything natively on that machine. :-(


Another reason I'd like to get gcc working native again. Also trying to 
get a chance to fix things on the 8650. Not sure how much longer we will 
be able to keep that machine around ready to run. We might need to 
shrink our premises soon...


  Johnny

--
Johnny Billquist  || "I'm on a bus
  ||  on a psychedelic trip
email: b...@softjar.se ||  Reading murder books
pdp is alive! ||  tryin' to stay hip" - B. Idol


Re: Problem reports for version control systems

2021-05-02 Thread Johnny Billquist

On 2021-05-02 13:51, Anders Magnusson wrote:

On 2021-05-02 at 13:44, Johnny Billquist wrote:


I suspect what is commonly the problem here is related to the fact
that cvs has a phase at the beginning where it scans through the
file system, which can take quite a while. Some NAT devices along
the path have timeouts on existing connections: if no traffic
happens for a while, the connection is dropped, even though there
haven't been any FINs on it.
So a connection that just doesn't have any traffic for a while is hit
by this, which is exactly the pattern you have with cvs.


I've seen the same effect on a simple telnet session, where ssh
survives fine. There it's just that when the connection is idle,
telnet does not create any traffic at all, while ssh does generate a bit
of traffic even if there is no activity.


So one obvious solution is to use something like ssh as a carrier for
the cvs traffic, if possible, or else see if some kind of keepalives
can be enabled on the connection, to defeat NAT and similar devices
which aggressively drop connections on which there is no traffic for a
while.
(Or, of course, if there is a NAT you have control over, you might be 
able to change how it behaves...)

This is quite common, yes.
I usually add ssh keepalive to ssh_config for all hosts to avoid this
problem (which may occur, as written, when doing cvs update).


And as a "fun" fact. On my 4000/90, it takes about 3h after I start a 
cvs update until I actually start having any network traffic... In total 
it takes something like 8h to do a cvs update on /usr/src.
(I guess I'm a bit masochistic in still insisting on trying to do things
on my VAXen...)


  Johnny

--
Johnny Billquist  || "I'm on a bus
  ||  on a psychedelic trip
email: b...@softjar.se ||  Reading murder books
pdp is alive! ||  tryin' to stay hip" - B. Idol


Re: Problem reports for version control systems

2021-05-02 Thread Johnny Billquist

On 2021-05-01 23:54, Brett Lymn wrote:

On Sat, May 01, 2021 at 12:58:50PM +1200, Lloyd Parkes wrote:


Germany is pretty much the opposite of New Zealand. It's close to
everywhere, but its last mile access speeds are a bit infamous.



Just for your info... there are a few NetBSD developers in .au, myself
included.  I haven't had any issues with cvs disconnects.  Not to deny
you have an issue, just letting you know it works ok for people near you.


Not anywhere near such a location, but just adding that cvs works fine 
for me too, but yes, there is a lot of disk activity on the local 
machine before anything even starts downloading, and a lot of activity 
at the end where it updates file metadata as well as cleans out empty
directories (if you added pruning).



I'm running some tests on other local clients and against other CVS mirrors
in the hope that I come up with a better characterisation of the problem than
"it doesn't work".



If you have the space, a tcpdump from both sides of your firewall may provide a 
clue.


I suspect what is commonly the problem here is related to the fact that
cvs has a phase at the beginning where it scans through the file
system, which can take quite a while. Some NAT devices along the path
have timeouts on existing connections: if no traffic happens for a
while, the connection is dropped, even though there haven't been any
FINs on it.
So a connection that just doesn't have any traffic for a while is hit by
this, which is exactly the pattern you have with cvs.


I've seen the same effect on a simple telnet session, where ssh survives
fine. There it's just that when the connection is idle, telnet does
not create any traffic at all, while ssh does generate a bit of traffic
even if there is no activity.


So one obvious solution is to use something like ssh as a carrier for
the cvs traffic, if possible, or else see if some kind of keepalives can
be enabled on the connection, to defeat NAT and similar devices which
aggressively drop connections on which there is no traffic for a while.
(Or, of course, if there is a NAT you have control over, you might be 
able to change how it behaves...)
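
For a program that owns its own TCP socket, the keepalive idea might look roughly like the sketch below. It only shows the standard SO_KEEPALIVE socket option; the helper name enable_keepalive is made up for illustration, and nothing here is taken from the cvs sources. Note also that the idle time before the first probe is often two hours by default, so on its own this may be too slow to beat an aggressive NAT timeout; tuning that interval is platform-specific.

```c
#include <sys/types.h>
#include <sys/socket.h>

/*
 * Hypothetical helper: ask the kernel to send TCP keepalive probes on an
 * otherwise idle connection. Returns 0 on success, -1 on failure (errno set).
 */
int enable_keepalive(int fd)
{
    int on = 1;

    /* SO_KEEPALIVE is the portable on/off switch; probe timing is set elsewhere. */
    return setsockopt(fd, SOL_SOCKET, SO_KEEPALIVE, &on, sizeof(on));
}
```

Calling enable_keepalive(fd) right after the socket is connected would be the natural place for it.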


  Johnny

--
Johnny Billquist  || "I'm on a bus
  ||  on a psychedelic trip
email: b...@softjar.se ||  Reading murder books
pdp is alive! ||  tryin' to stay hip" - B. Idol


Re: HEADS UP: GCC 10 now default on several ports

2021-04-18 Thread Johnny Billquist

New one:

In the process of running build.sh for a full distribution:

#   compile  tmux/cmd-display-menu.o
/usr/src/obj/tooldir.NetBSD-8.0-amd64/bin/vax--netbsdelf-gcc -O2 
-std=gnu99-Wall -Wstrict-prototypes -Wmissing-prototypes 
-Wpointer-arith -Wno-sign-compare  -Wsystem-headers   -Wno-traditional 
 -Wa,--fatal-warnings  -Wreturn-type -Wswitch -Wshadow -Wcast-qual 
-Wwrite-strings -Wextra -Wno-unused-parameter -Wno-sign-compare 
-Wsign-compare -Wformat=2  -Wno-format-zero-length  -Werror 
--sysroot=/usr/src/obj/destdir.vax -DSUPPORT_UTMP -DSUPPORT_UTMPX 
-I/usr/src/external/bsd/tmux/dist 
-I/usr/src/external/bsd/tmux/usr.bin/tmux -DHAVE_ASPRINTF=1 
-DHAVE_B64_NTOP=1  -DHAVE_BITSTRING_H=1  -DHAVE_BSD_GETOPT=1 
-DHAVE_CFMAKERAW=1  -DHAVE_CLOCK_GETTIME=1  -DHAVE_CLOSEFROM=1 
-DHAVE_CURSES_H=1  -DHAVE_DAEMON=1  -DHAVE_DIRENT_H=1 
-DHAVE_EVENT2_EVENT_H=1  -DHAVE_FCNTL_CLOSEM=1  -DHAVE_FCNTL_H=1 
-DHAVE_FGETLN=1  -DHAVE_FLOCK=1  -DHAVE_FORKPTY=1  -DHAVE_FPARSELN=1 
-DHAVE_GETDTABLESIZE=1  -DHAVE_GETLINE=1  -DHAVE_GETOPT=1 
-DHAVE_GETPROGNAME=1  -DHAVE_INTTYPES_H=1  -DHAVE_LIBM=1 
-DHAVE_LIBPROC_H=1  -DHAVE_MEMMEM=1  -DHAVE_MEMORY_H=1  -DHAVE_PATHS_H=1 
 -DHAVE_PROC_PID=1  -DHAVE_QUEUE_H=1  -DHAVE_REALLOCARRAY=1 
-DHAVE_SETENV=1  -DHAVE_SETPROCTITLE=1  -DHAVE_STDINT_H=1 
-DHAVE_STDLIB_H=1  -DHAVE_STRCASESTR=1  -DHAVE_STRINGS_H=1 
-DHAVE_STRING_H=1  -DHAVE_STRLCAT=1  -DHAVE_STRLCPY=1  -DHAVE_STRNDUP=1 
 -DHAVE_STRSEP=1  -DHAVE_STRTONUM=1  -DHAVE_SYSCONF=1 
-DHAVE_SYS_DIR_H=1  -DHAVE_SYS_SIGNAME=1  -DHAVE_SYS_STAT_H=1 
-DHAVE_SYS_TREE_H=1  -DHAVE_SYS_TYPES_H=1  -DHAVE_TREE_H=1 
-DHAVE_UNISTD_H=1  -DHAVE_UTEMPTER=1  -DHAVE_UTIL_H=1  -DHAVE_VIS=1 
-DHAVE___PROGNAME=1  -DPACKAGE=\"tmux\"  -DPACKAGE_BUGREPORT=\"\" 
-DPACKAGE_NAME=\"tmux\"  -DPACKAGE_STRING=\"tmux\ 3.2\" 
-DPACKAGE_TARNAME=\"tmux\"  -DPACKAGE_URL=\"\" 
-DPACKAGE_VERSION=\"3.2\"  -DSTDC_HEADERS=1 
-DTMUX_CONF="\"/etc/tmux.conf:~/.tmux.conf:~/.config/tmux/tmux.conf\"" 
-DTMUX_VERSION='"3.2"'  -DVERSION=\"3.2\"  -D_ALL_SOURCE=1 
-D_GNU_SOURCE=1  -D_NETBSD_SOURCE  -D_OPENBSD_SOURCE 
-D_POSIX_PTHREAD_SEMANTICS=1  -D_TANDEM_SOURCE=1  -D_XOPEN_SOURCE=500 
-D__EXTENSIONS__=1  -c 
/usr/src/external/bsd/tmux/dist/cmd-display-menu.c -o cmd-display-menu.o
/usr/src/external/bsd/tmux/dist/cmd-display-menu.c: In function 
'cmd_display_menu_get_position':
/usr/src/external/bsd/tmux/dist/cmd-display-menu.c:158:8: error: 
comparison of integer expressions of different signedness: 'long int' 
and 'u_int' {aka 'unsigned int'} [-Werror=sign-compare]

  158 |  if (n >= tty->sy)
  |^~
/usr/src/external/bsd/tmux/dist/cmd-display-menu.c:191:8: error: 
comparison of integer expressions of different signedness: 'long int' 
and 'u_int' {aka 'unsigned int'} [-Werror=sign-compare]

  191 |  if (n >= tty->sy)
  |^~
/usr/src/external/bsd/tmux/dist/cmd-display-menu.c:239:8: error: 
comparison of integer expressions of different signedness: 'long int' 
and 'u_int' {aka 'unsigned int'} [-Werror=sign-compare]

  239 |  if (n < h)
  |^
cc1: all warnings being treated as errors

*** Failed target:  cmd-display-menu.o


Anyone have any idea about that one?
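
For reference, a minimal sketch of the kind of comparison GCC is complaining about, and one common way to write it without mixing signedness. On a target where long and unsigned int have the same width (VAX being one), the long operand gets converted to an unsigned type for the comparison, so a negative n would compare as a huge positive value. The function name at_or_past is hypothetical, and this is not necessarily how tmux or NetBSD resolved it.

```c
/*
 * Hypothetical helper: returns 1 when n is at or beyond limit, 0 otherwise.
 * A negative n is never considered to be beyond the limit.
 */
int at_or_past(long n, unsigned int limit)
{
    if (n < 0)
        return 0;
    /* n is known to be non-negative here, so the cast preserves its value. */
    return (unsigned long)n >= limit;
}
```

With the sign handled explicitly, both operands of the final comparison are unsigned, so -Wsign-compare has nothing to flag.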

  Johnny

--
Johnny Billquist  || "I'm on a bus
  ||  on a psychedelic trip
email: b...@softjar.se ||  Reading murder books
pdp is alive! ||  tryin' to stay hip" - B. Idol


Re: HEADS UP: GCC 10 now default on several ports

2021-04-18 Thread Johnny Billquist

On 2021-04-18 17:55, Paul Goyette wrote:

Something in your environment? /etc/mk.conf ?


Dang. You're right. I wonder when and why I got that in there...?

Ok. Nothing to see here. Sorry for the noise.

  Johnny



On Sun, 18 Apr 2021, Johnny Billquist wrote:


On 2021-04-18 17:49, Martin Husemann wrote:

On Sun, Apr 18, 2021 at 05:46:39PM +0200, Johnny Billquist wrote:

I said in my original mail:

"Building from NetBSD-8 does not work. Unsure if this is a known
limitation."

So, not building from current, but am trying to build current.


Yes, but you are overriding HAVE_GCC somehow.


Not that I am aware of. But obviously it has been set by something 
already before it comes to those lines in bsd.own.mk


But I checked, and my whole source tree is pretty much unmodified, 
except for some vax specific files, where I have my own development 
stuff going on. Nothing that affects this.


 Johnny

--
Johnny Billquist  || "I'm on a bus
 ||  on a psychedelic trip
email: b...@softjar.se ||  Reading murder books
pdp is alive! ||  tryin' to stay hip" - B. Idol





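+--------------------+--------------------------+-----------------------+
| Paul Goyette       | PGP Key fingerprint:     | E-mail addresses:     |
| (Retired)          | FA29 0E3B 35AF E8AE 6651 | p...@whooppee.com     |
| Software Developer | 0786 F758 55DE 53BA 7731 | pgoye...@netbsd.org   |
+--------------------+--------------------------+-----------------------+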



--
Johnny Billquist  || "I'm on a bus
  ||  on a psychedelic trip
email: b...@softjar.se ||  Reading murder books
pdp is alive! ||  tryin' to stay hip" - B. Idol


Re: HEADS UP: GCC 10 now default on several ports

2021-04-18 Thread Johnny Billquist

On 2021-04-18 17:49, Martin Husemann wrote:

On Sun, Apr 18, 2021 at 05:46:39PM +0200, Johnny Billquist wrote:

I said in my original mail:

"Building from NetBSD-8 does not work. Unsure if this is a known
limitation."

So, not building from current, but am trying to build current.


Yes, but you are overriding HAVE_GCC somehow.


Not that I am aware of. But obviously it has been set by something 
already before it comes to those lines in bsd.own.mk


But I checked, and my whole source tree is pretty much unmodified, 
except for some vax specific files, where I have my own development 
stuff going on. Nothing that affects this.


  Johnny

--
Johnny Billquist  || "I'm on a bus
  ||  on a psychedelic trip
email: b...@softjar.se ||  Reading murder books
pdp is alive! ||  tryin' to stay hip" - B. Idol


Re: HEADS UP: GCC 10 now default on several ports

2021-04-18 Thread Johnny Billquist

On 2021-04-18 17:48, Martin Husemann wrote:

On Sun, Apr 18, 2021 at 05:44:36PM +0200, Johnny Billquist wrote:

On 2021-04-18 17:42, Martin Husemann wrote:

On Sun, Apr 18, 2021 at 03:09:17PM +0200, Johnny Billquist wrote:

Basically, the problem is that HAVE_GCC is there set to 8,


Where and why? This should not happen - HAVE_GCC for -current is either 9
or 10, no matter on what host / which tools you compile with.


GW:/usr/src# uname -a
NetBSD GW.SoftJAR.SE 8.0 NetBSD 8.0 (GENERIC) #0: Tue Jul 17 14:59:51 UTC
2018 mkre...@mkrepro.netbsd.org:/usr/src/sys/arch/amd64/compile/GENERIC
amd64

Cross compiling for vax with a current checkout from a couple of hours ago.


No, I mean where the HAVE_GCC=8 comes from.

The only place where we set it is in bsd.own.mk, in this code:

#
# What GCC is used?
#
.if ${MACHINE} == "alpha" || \
 ${MACHINE_ARCH} == "x86_64" || \
 ${MACHINE} == "ia64" || \
 ${MACHINE} == "sparc" || \
 ${MACHINE} == "sparc64" || \
 ${MACHINE} == "vax" || \
 ${MACHINE_ARCH} == "riscv32" || \
 ${MACHINE_ARCH} == "riscv64"
HAVE_GCC?=  10
.else
HAVE_GCC?=  9
.endif


... so if you end up with "8" here, something is wrong in your setup.


bsd.own.mk only conditionally sets it if it is unset.
I have no idea where it comes from. I certainly have not touched it.
I only added a .info to print out what the actual value is, and it's 
already set to 8 by something previous, obviously, since it's not being 
set there.


  Johnny

--
Johnny Billquist  || "I'm on a bus
  ||  on a psychedelic trip
email: b...@softjar.se ||  Reading murder books
pdp is alive! ||  tryin' to stay hip" - B. Idol


Re: HEADS UP: GCC 10 now default on several ports

2021-04-18 Thread Johnny Billquist

On 2021-04-18 17:44, Johnny Billquist wrote:

On 2021-04-18 17:42, Martin Husemann wrote:

On Sun, Apr 18, 2021 at 03:09:17PM +0200, Johnny Billquist wrote:

Basically, the problem is that HAVE_GCC is there set to 8,


Where and why? This should not happen - HAVE_GCC for -current is either 9
or 10, no matter on what host / which tools you compile with.


GW:/usr/src# uname -a
NetBSD GW.SoftJAR.SE 8.0 NetBSD 8.0 (GENERIC) #0: Tue Jul 17 14:59:51 
UTC 2018 
mkre...@mkrepro.netbsd.org:/usr/src/sys/arch/amd64/compile/GENERIC amd64


Cross compiling for vax with a current checkout from a couple of hours ago.


I said in my original mail:

"Building from NetBSD-8 does not work. Unsure if this is a known 
limitation."


So, not building from current, but am trying to build current.

  Johnny

--
Johnny Billquist  || "I'm on a bus
  ||  on a psychedelic trip
email: b...@softjar.se ||  Reading murder books
pdp is alive! ||  tryin' to stay hip" - B. Idol


Re: HEADS UP: GCC 10 now default on several ports

2021-04-18 Thread Johnny Billquist

On 2021-04-18 17:42, Martin Husemann wrote:

On Sun, Apr 18, 2021 at 03:09:17PM +0200, Johnny Billquist wrote:

Basically, the problem is that HAVE_GCC is there set to 8,


Where and why? This should not happen - HAVE_GCC for -current is either 9
or 10, no matter on what host / which tools you compile with.


GW:/usr/src# uname -a
NetBSD GW.SoftJAR.SE 8.0 NetBSD 8.0 (GENERIC) #0: Tue Jul 17 14:59:51 
UTC 2018 
mkre...@mkrepro.netbsd.org:/usr/src/sys/arch/amd64/compile/GENERIC amd64


Cross compiling for vax with a current checkout from a couple of hours ago.

  Johnny

--
Johnny Billquist  || "I'm on a bus
  ||  on a psychedelic trip
email: b...@softjar.se ||  Reading murder books
pdp is alive! ||  tryin' to stay hip" - B. Idol


Re: HEADS UP: GCC 10 now default on several ports

2021-04-18 Thread Johnny Billquist

On 2021-04-17 10:01, John Paul Adrian Glaubitz wrote:

On 4/17/21 6:15 AM, matthew green wrote:

the alpha, amd64, sparc*, riscv*, ia64, and vax ports
have all been switched to GCC 10.

please send-pr or send email here about problems you encounter.


And please report compiler-specific issues such as miscompilations to the
GCC bug tracker so that a broader audience becomes aware of these problems.


Building from NetBSD-8 does not work. Unsure if this is a known limitation.

Basically, the problem is that HAVE_GCC is there set to 8, and in 
bsd.own.mk there is this piece:


.if ${HAVE_GCC} == 9
EXTERNAL_GCC_SUBDIR?=   gcc.old
.elif ${HAVE_GCC} == 10
EXTERNAL_GCC_SUBDIR?=   gcc
.else
EXTERNAL_GCC_SUBDIR?=   /does/not/exist
.endif

Which means EXTERNAL_GCC_SUBDIR gets set to /does/not/exist
and then build.sh borks out with non-existing directory pretty quickly 
when building the tools.


  Johnny

--
Johnny Billquist  || "I'm on a bus
  ||  on a psychedelic trip
email: b...@softjar.se ||  Reading murder books
pdp is alive! ||  tryin' to stay hip" - B. Idol


Re: panic when removing a file in current

2018-07-19 Thread Johnny Billquist

On 2018-07-19 13:18, J. Hannken-Illjes wrote:

On Thu, Jul 19, 2018 at 01:08:22PM +0200, Johnny Billquist wrote:

Hmm. That means I need to update user land, which can be a bit scary since it 
can make a rollback really hard.
And there is also a chicken and egg thing here. Installing a new user land can 
potentially mean removing files, which will trigger the panic.

Is that panic really justified? The system is running without issues on
that same file system with NetBSD 7.


You could backport this change to -7 fsck_ffs, the patch (attached) is small.


Some more updates. I decided to remove the panic call and do some tests 
with NetBSD 7 userland.


Ran the system in single user mode. File system clean and nice.
cp /netbsd /foo
rm /foo

Rebooting system, again into single user mode:

# fsck -f -y /
** /dev/rra0a
** File system is already clean
** Last Mounted on /
** Root file system
** Phase 1 - Check Blocks and Sizes
PARTIALLY ALLOCATED INODE I=47
CLEAR? yes

** Phase 2 - Check Pathnames
** Phase 3 - Check Connectivity
** Phase 4 - Check Reference Counts
** Phase 5 - Check Cyl groups
FREE BLK COUNT(S) WRONG IN SUPERBLK
SALVAGE? yes

SUMMARY INFORMATION BAD
SALVAGE? yes

BLK(S) MISSING IN BIT MAPS
SALVAGE? yes

5097 files, 131676 used, 122387 free (315 frags, 15259 blocks, 0.1% 
fragmentation)


* FILE SYSTEM WAS MODIFIED *


So it is an error created with NetBSD-8 and even detected with fsck from 
NetBSD-7 without any patching. Doing the same operation with NetBSD-7 
does not cause this corruption.
If I understand things right, this means that the disk blocks allocated to
the file have not been released when the file is deleted, and the inode
has not been cleared out?


Something has obviously changed here since NetBSD-7. I was trying to
look a little at the code, and there is a truncate function called to 
trim the file down to 0 before deleting it, but I haven't had time to 
try and understand the code any better yet.


But maybe you, or someone else knows the innards of this better already 
and might know what the problem might be, or where.


Creating and working on files is fine. It's explicitly the deletion of
files that causes the problem. And I've restored the panic call in
ufs_inactive, since this was obviously a "good" panic.
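
Going only by the wording of the panic message ("unlinked ino ... has non zero size ... or blocks ..."), the condition being asserted seems to be roughly the one sketched below. This is a guess at the invariant, not the actual ufs_inactive code: struct toy_inode and unlinked_inode_is_clean are made-up names, and the real in-core inode looks nothing like this.

```c
#include <stdint.h>

/* Hypothetical, heavily simplified stand-in for an inode; not the NetBSD struct. */
struct toy_inode {
    int      nlink;   /* remaining directory links */
    uint64_t size;    /* file size in bytes */
    uint64_t blocks;  /* disk blocks still charged to the inode */
};

/*
 * Returns 1 if the inode is in a consistent state for this check:
 * either it still has links, or it is unlinked and holds no size and no blocks.
 * Returns 0 for an unlinked inode that still has size or blocks.
 */
int unlinked_inode_is_clean(const struct toy_inode *ip)
{
    if (ip->nlink > 0)
        return 1;
    return ip->size == 0 && ip->blocks == 0;
}
```

The reported case (size 0, blocks 1ac0 hex) would fail such a check, which fits with fsck afterwards finding blocks missing in the bit maps.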


  Johnny

--
Johnny Billquist  || "I'm on a bus
  ||  on a psychedelic trip
email: b...@softjar.se ||  Reading murder books
pdp is alive! ||  tryin' to stay hip" - B. Idol


Re: panic when removing a file in current

2018-07-19 Thread Johnny Billquist
Hmm. That means I need to update user land, which can be a bit scary since it 
can make a rollback really hard.
And there is also a chicken and egg thing here. Installing a new user land can 
potentially mean removing files, which will trigger the panic.

Is that panic really justified? The system is running without issues on
that same file system with NetBSD 7.

Like I said, these disks and file systems have been around a rather long time.

As for short files, I believed that the data area in the inode was also used in
a normal file for the first few bytes. Either way, when testing around this
panic I can tell that creating a file with just a few bytes in it is not a
problem. I can delete the file after creating it. However, creating a larger file
then triggers the panic when I delete the file again. So something is different
on a really short file.

  Johnny 


"J. Hannken-Illjes" wrote: (19 July 2018 09:50:40 CEST)
>
>
>> On 19. Jul 2018, at 03:54, Johnny Billquist  wrote:
>> 
>> Anyone seen this, or know what it's about?
>
>Great, it took 6 months to trigger my assertion ...
>
>This panic probably means the file contains unallocated inodes that
>were only partially zeroed.
>
>Please run "fsck -f" on this file system and look for messages
>like "PARTIALLY ALLOCATED INODE".
>
>> On NetBSD/vax, with 8.99.22 from today.
>> 
>> Removing any file that has disk blocks allocated to it:
>> 
>> [ 653.3285523] ufs_inactive: unlinked ino 50313 on "/home" has non
>zero size 0 or blocks 1ac0 with allerror 0
>> [ 653.3484633] panic: ufs_inactive: dirty filesystem?
>> [ 653.3788284] cpu0: Begin traceback...
>> [ 653.3984724] panic: ufs_inactive: dirty filesystem?
>> [ 653.4090004] Stack traceback :
>> [ 653.4231115]   Process is executing in user space.
>> [ 653.4286045] cpu0: End traceback...
>> Stopped in pid 39.1 (rm) at netbsd:vpanic+0xc5: pushl   $0
>> 
>> 
>> If a file is small enough to have all the data in the inode itself,
>rm survives fine.
>
>We never hold file data in inodes, only short symlinks.
>
>> 
>>  Johnny
>> 
>> -- 
>> Johnny Billquist  || "I'm on a bus
>>  ||  on a psychedelic trip
>> email: b...@softjar.se ||  Reading murder books
>> pdp is alive! ||  tryin' to stay hip" - B. Idol
>
>--
>J. Hannken-Illjes - hann...@eis.cs.tu-bs.de - TU Braunschweig (Germany)

-- 
Sent from my Android device with K-9 Mail. Please excuse my brevity.

Re: panic when removing a file in current

2018-07-19 Thread Johnny Billquist

On 2018-07-19 06:12, m...@netbsd.org wrote:

On Thu, Jul 19, 2018 at 03:54:17AM +0200, Johnny Billquist wrote:

Anyone seen this, or know what it's about?

On NetBSD/vax, with 8.99.22 from today.

Removing any file that has disk blocks allocated to it:

[ 653.3285523] ufs_inactive: unlinked ino 50313 on "/home" has non zero size
0 or blocks 1ac0 with allerror 0
[ 653.3484633] panic: ufs_inactive: dirty filesystem?
[ 653.3788284] cpu0: Begin traceback...
[ 653.3984724] panic: ufs_inactive: dirty filesystem?
[ 653.4090004] Stack traceback :
[ 653.4231115]   Process is executing in user space.
[ 653.4286045] cpu0: End traceback...
Stopped in pid 39.1 (rm) at netbsd:vpanic+0xc5: pushl   $0


If a file is small enough to have all the data in the inode itself, rm
survives fine.

   Johnny

--
Johnny Billquist  || "I'm on a bus
   ||  on a psychedelic trip
email: b...@softjar.se ||  Reading murder books
pdp is alive! ||  tryin' to stay hip" - B. Idol


The only recent relevant-looking change is
https://mail-index.netbsd.org/source-changes-hg/2018/07/19/msg002880.html


I'll check around this. But the panic is happening in
ufs/ufs/ufs_inode.c, and that change was in ufs/ffs/ffs_vfsops.c. But it
was also suggested by Paul, so... I'll get back with results later.



Cool that someone is very up-to-date on vax :-)


It's even worse. This is on a real VAX 8650... Very possibly the
physically largest thing that anyone is trying to run NetBSD
on... :-)


  Johnny
--
Johnny Billquist  || "I'm on a bus
  ||  on a psychedelic trip
email: b...@softjar.se ||  Reading murder books
pdp is alive! ||  tryin' to stay hip" - B. Idol


Re: panic when removing a file in current

2018-07-19 Thread Johnny Billquist

Hi.

On 2018-07-19 05:21, Paul Goyette wrote:

Let me update my source tree, re-build, and check.

What port are you using?  i386? amd64? other?


NetBSD/vax.

A couple more points. I upgraded the machine from 7.99.something, so
I have not run any version 8 on this machine previously. The file 
systems have also been around forever, so no newfs done around this.


  Johnny



On Thu, 19 Jul 2018, Johnny Billquist wrote:


Anyone seen this, or know what it's about?

On NetBSD/vax, with 8.99.22 from today.

Removing any file that has disk blocks allocated to it:

[ 653.3285523] ufs_inactive: unlinked ino 50313 on "/home" has non 
zero size 0 or blocks 1ac0 with allerror 0

[ 653.3484633] panic: ufs_inactive: dirty filesystem?
[ 653.3788284] cpu0: Begin traceback...
[ 653.3984724] panic: ufs_inactive: dirty filesystem?
[ 653.4090004] Stack traceback :
[ 653.4231115]   Process is executing in user space.
[ 653.4286045] cpu0: End traceback...
Stopped in pid 39.1 (rm) at netbsd:vpanic+0xc5: pushl   $0


If a file is small enough to have all the data in the inode itself, rm 
survives fine.


 Johnny

--
Johnny Billquist  || "I'm on a bus
 ||  on a psychedelic trip
email: b...@softjar.se ||  Reading murder books
pdp is alive! ||  tryin' to stay hip" - B. Idol





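+------------------+--------------------------+----------------------------+
| Paul Goyette     | PGP Key fingerprint:     | E-mail addresses:          |
| (Retired)        | FA29 0E3B 35AF E8AE 6651 | paul at whooppee dot com   |
| Kernel Developer | 0786 F758 55DE 53BA 7731 | pgoyette at netbsd dot org |
+------------------+--------------------------+----------------------------+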



--
Johnny Billquist  || "I'm on a bus
  ||  on a psychedelic trip
email: b...@softjar.se ||  Reading murder books
pdp is alive! ||  tryin' to stay hip" - B. Idol


panic when removing a file in current

2018-07-18 Thread Johnny Billquist

Anyone seen this, or know what it's about?

On NetBSD/vax, with 8.99.22 from today.

Removing any file that has disk blocks allocated to it:

[ 653.3285523] ufs_inactive: unlinked ino 50313 on "/home" has non zero 
size 0 or blocks 1ac0 with allerror 0

[ 653.3484633] panic: ufs_inactive: dirty filesystem?
[ 653.3788284] cpu0: Begin traceback...
[ 653.3984724] panic: ufs_inactive: dirty filesystem?
[ 653.4090004] Stack traceback :
[ 653.4231115]   Process is executing in user space.
[ 653.4286045] cpu0: End traceback...
Stopped in pid 39.1 (rm) at netbsd:vpanic+0xc5: pushl   $0


If a file is small enough to have all the data in the inode itself, rm 
survives fine.


  Johnny

--
Johnny Billquist  || "I'm on a bus
  ||  on a psychedelic trip
email: b...@softjar.se ||  Reading murder books
pdp is alive! ||  tryin' to stay hip" - B. Idol


Re: Annoying problem with "cvs update -Pd" in /usr/src building current

2017-03-22 Thread Johnny Billquist
Yes.

And that is because your build is leaving files behind in the source
directories, which then conflict with cvs checkouts.

The solution is to build with a separate object directory.

  Johnny 


pimin inwa <pimini...@gmail.com> wrote: (21 March 2017 15:55:47 CET)
>Should have included the commands:
>
>cd /usr/src
>cvs update -Pd
> ./build.sh -U -u -o -O BUILD_OBJ -T BUILD_TOOL release
>
>
>This will occasionally blow up in the cvs command with the problem.
>
>Paul N.
>
>On Tue, Mar 21, 2017 at 5:15 AM, Johnny Billquist <b...@softjar.se>
>wrote:
>
>> Run your builds with an object directory set up. Looks like you get
>your
>> output files in the same directory as the source.
>>
>> Johnny
>>
>>
>> pimin inwa <pimini...@gmail.com> wrote: (21 March 2017 05:08:55 CET)
>>
>>> I'm seeing this annoying situation with cvs, especially after a
>failed
>>> build (as described in another thread).
>>>
>>> cvs update: Updating usr.bin/ktrace/kdump
>>> cvs [update aborted]: could not chdir to usr.bin/ktrace/ktrace: Not
>a
>>> directory
>>>
>>>
>>> I find I have to remove these files in order to proceed:
>>>
>>>  rm -fr  usr.bin/ktrace/ktrace  usr.bin/man/man
>usr.sbin/mrouted/mrouted
>>>  usr.sbin/racoon/racoon
>>>
>>>
>>> Where am I going wrong?
>>>
>>> Paul
>>>
>>
>> --
>> Sent from my Android device with K-9 Mail. Please excuse my brevity.
>>

-- 
Sent from my Android device with K-9 Mail. Please excuse my brevity.

Re: Annoying problem with "cvs update -Pd" in /usr/src building current

2017-03-21 Thread Johnny Billquist
Run your builds with an object directory set up. Looks like you get your output 
files in the same directory as the source.

  Johnny 


pimin inwa wrote: (21 March 2017 05:08:55 CET)
>I'm seeing this annoying situation with cvs, especially after a failed
>build (as described in another thread).
>
>cvs update: Updating usr.bin/ktrace/kdump
>cvs [update aborted]: could not chdir to usr.bin/ktrace/ktrace: Not a
>directory
>
>
>I find I have to remove these files in order to proceed:
>
>rm -fr  usr.bin/ktrace/ktrace  usr.bin/man/man usr.sbin/mrouted/mrouted
> usr.sbin/racoon/racoon
>
>
>Where am I going wrong?
>
>Paul

-- 
Sent from my Android device with K-9 Mail. Please excuse my brevity.

Re: PulseAudio and OSS audio of recent NetBSD-current

2017-03-12 Thread Johnny Billquist

On 2017-03-12 15:33, Tom Ivar Helbekkmo wrote:

Tom Ivar Helbekkmo <t...@hamartun.priv.no> writes:


After the latest upgrade (from 7.99.59 to 7.99.64), it stopped working
for me, as well.  I've gone back to using OSS directly instead.  :)


...and *man*, I'd forgotten how much 16-bit audio sucks.  :)


I guess that means you detest CDs like nothing else. :-)

Johnny

--
Johnny Billquist  || "I'm on a bus
  ||  on a psychedelic trip
email: b...@softjar.se ||  Reading murder books
pdp is alive! ||  tryin' to stay hip" - B. Idol


Re: build fails due to intrctl on ARM

2016-10-15 Thread Johnny Billquist

On 2016-10-15 22:49, Rin Okuyama wrote:

Build fails due to intrctl(8) on ARM, whose char is unsigned. Please
apply the attached patch.


Well, considering that getopt() returns an int, whoever wrote that
code did it wrong. So this is a general bug that might affect everyone,
not just arm.
(With signed chars, the code cannot distinguish between -1 and 0xff once
the value is stuffed into a char, and 0xff might show up as well.)


Johnny

--
Johnny Billquist  || "I'm on a bus
  ||  on a psychedelic trip
email: b...@softjar.se ||  Reading murder books
pdp is alive! ||  tryin' to stay hip" - B. Idol


Re: gdb.old is broken

2016-10-13 Thread Johnny Billquist

On 2016-10-13 11:51, Roy Marples wrote:

On 13/10/2016 10:07, Johnny Billquist wrote:

CVS does not explicitly manage directories. It's not a bug, but the way
CVS works. If you do an update, you need to give -d for CVS to create
new directories needed. But I would assume everyone here knows this.


Assumption is the mother of all evils.


Good point.


I for one did not know this and my recent struggles with cvs show just
how poor the available documentation and help is.


I assumed that anyone involved in the deep end of NetBSD would know CVS, 
as CVS has been around for more than 20 years, and has been the 
revision control system used by NetBSD the whole time NetBSD has 
existed. But yes, I guess I should not make that assumption.


That said, doing a "cvs --help update" will also tell you about the -d switch. 
And I would have hoped people knew, and had read "Version management 
with CVS" 
(https://ftp.gnu.org/non-gnu/cvs/source/stable/1.11.22/cederqvist-1.11.22.pdf), 
which is a good document to read, and not too heavy.


And while I'm on a roll I might as well promote -P too. I think that 
unless you know what you are doing, -d and -P are probably switches you 
always want to apply when you do cvs update.


And to repeat what I said before, cvs does not actually keep track of 
directories. cvs only keeps track of files. However, files have paths, 
and in Unix, this means that those directories must exist for cvs to be 
able to update the files. Adding a file implicitly then means that any 
intermediate directories will also come into existence. And if you 
delete all files in a directory, cvs normally leaves an empty directory 
around, which is why -P might be useful, as it removes any empty 
directories left around after an update.


Johnny



Re: gdb.old is broken

2016-10-13 Thread Johnny Billquist

On 2016-10-13 03:20, Christos Zoulas wrote:

On Oct 13,  8:45am, rokuy...@rk.phys.keio.ac.jp (Rin Okuyama) wrote:
-- Subject: gdb.old is broken

| Hi,
|
| Some files are missing in gdb.old. FYI, I listed files imported by this commit
|http://mail-index.netbsd.org/source-changes/2016/10/12/msg078360.html
| but not added by this one
|http://mail-index.netbsd.org/source-changes/2016/10/12/msg078361.html
| Please find the attached file.

I think I got them all now... Bug in CVS, adds CVS directories when you
add files, but does not add the directories.


CVS does not explicitly manage directories. It's not a bug, but the way 
CVS works. If you do an update, you need to give -d for CVS to create 
new directories needed. But I would assume everyone here knows this.


Johnny

--
Johnny Billquist  || "I'm on a bus
  ||  on a psychedelic trip
email: b...@softjar.se ||  Reading murder books
pdp is alive! ||  tryin' to stay hip" - B. Idol


Re: CCD device

2016-03-28 Thread Johnny Billquist
I'm happy to report that Michael van Elst found the problem and provided 
me with a fix that I have verified solves the problem.
The problem arose because ccd changed from using DIOCGPART to using 
DIOCGPARTINFO. That, in combination with an ancient ra device driver that 
didn't populate the disk geometry correctly, caused my failure.


I'm hoping Michael will submit the fix into the tree. Meanwhile I'm now 
updating the 8650 with the rest of -current. However, as I've already 
reported on port-vax, native builds still do not work.


Johnny

--
Johnny Billquist  || "I'm on a bus
  ||  on a psychedelic trip
email: b...@softjar.se ||  Reading murder books
pdp is alive! ||  tryin' to stay hip" - B. Idol


Re: CCD device

2016-03-28 Thread Johnny Billquist

On 2016-03-28 16:39, Johnny Billquist wrote:

On 2016-03-28 15:17, Johnny Billquist wrote:

On 2016-03-28 15:08, Johnny Billquist wrote:

On 2016-03-28 10:54, Martin Husemann wrote:

On Mon, Mar 28, 2016 at 01:42:23AM +0200, Johnny Billquist wrote:

The error is:
Configuring CCD devices.
ccdconfig: ioctl (CCDIOCSET): /dev/ccd0c: Invalid argument
/etc/rc.d/ccd exited with code 1

Did we do some incompatible change recently? Not too happy with this
one.


Nothing obvious, can you add printfs to the two EINVAL returns in the
case CCDIOCSET: section?


Yech! It's more complicated than that.

After trying to just do that and getting nothing, I added:
 printf("ccd ioctl: %lu\n",cmd);

as the first line of  ccdioctl(), and the output is:

Configuring CCD devices.
ccd ioctl: 3223078416
ccdconfig: ioctl (CCDIOCSET): /dev/ccd0c: Invalid argument
/etc/rc.d/ccd exited with code 1
ccd ioctl: 2149606504

So something is broken in the syscall handling, the new compiler (this is
on a VAX), or something else that I don't get yet... I should point out
that apart from ccd, I've successfully booted that same kernel (another
on another machine where I was not using ccd).


And to make it clear: It never comes to the CCDIOCSET handling where it
could return EINVAL. I have not really looked much deeper into the code.


Ok. Decided to dig a little deeper into this.

It's actually ccdinit() that fails. And the line where it fails is:

 error = getdisksize(vpp[ix], , );

Anyone know anything more about any newly introduced changes around the
struct vnode (which is what vpp is), or getdisksize()?

I now also realize that another error I was seeing at boot time, which I
assumed was harmless, is probably also related:

Found ra0 at mscpbus0 drive 0: RA73
Found ra1 at mscpbus0 drive 1: RA73
Found ra2 at mscpbus0 drive 2: RA73
Found ra3 at mscpbus0 drive 3: RA73
Found ra4 at mscpbus1 drive 4: RA73
Found ra5 at mscpbus1 drive 5: RA73
Found ra6 at mscpbus1 drive 6: RA73
Found ra7 at mscpbus1 drive 7: RA73
ra0: size 3920490 sectors
RAIDframe: can't get disk size for dev ra0 (22)
ra1: size 3920490 sectors
RAIDframe: can't get disk size for dev ra1 (22)
ra2: size 3920490 sectors
RAIDframe: can't get disk size for dev ra2 (22)
ra3: no disk label: size 3920490 sectors
RAIDframe: can't get disk size for dev ra3 (22)
ra4: size 3920490 sectors
RAIDframe: can't get disk size for dev ra4 (22)
ra5: size 3920490 sectors
RAIDframe: can't get disk size for dev ra5 (22)
ra6: size 3920490 sectors
RAIDframe: can't get disk size for dev ra6 (22)
ra7: size 3920490 sectors
RAIDframe: can't get disk size for dev ra7 (22)
boot device: ra0


Some more information, as I turned on CCDDBUG anyway:

Configuring CCD devices.
ccdopen(0x1102, 0x3)
ccdioctl: component 0: 0x7f506060
ccdioctl: component 1: 0x7f506070
ccdioctl: component 2: 0x7f506080
ccdioctl: component 3: 0x7f506090
ccdioctl: lookedup = 0
ccdioctl: lookedup = 1
ccdioctl: lookedup = 2
ccdioctl: lookedup = 3
ccd cp11: 0
ccd0: ccdinit
ccd0: /dev/ra4d: disksize failed, error = 22
ccd cp12: 22
ccdclose(0x1102, 0x3)
ccdconfig: ioctl (CCDIOCSET): /dev/ccd0c: Invalid argument


(The ccd cp11 and cp12 lines are my own debug printouts.)

And to document things a bit more for people who might need 
the information, my ccd.conf contains (only):

ccd0 16 none /dev/ra4d /dev/ra5d /dev/ra6d /dev/ra7d

And the disklabels on the disks look good, and as I said at the start, 
it works OK on 7.99.23. So something happened between 7.99.23 and 7.99.26.


Johnny

--
Johnny Billquist  || "I'm on a bus
  ||  on a psychedelic trip
email: b...@softjar.se ||  Reading murder books
pdp is alive! ||  tryin' to stay hip" - B. Idol


Re: CCD device

2016-03-28 Thread Johnny Billquist

On 2016-03-28 15:17, Johnny Billquist wrote:

On 2016-03-28 15:08, Johnny Billquist wrote:

On 2016-03-28 10:54, Martin Husemann wrote:

On Mon, Mar 28, 2016 at 01:42:23AM +0200, Johnny Billquist wrote:

The error is:
Configuring CCD devices.
ccdconfig: ioctl (CCDIOCSET): /dev/ccd0c: Invalid argument
/etc/rc.d/ccd exited with code 1

Did we do some incompatible change recently? Not too happy with this
one.


Nothing obvious, can you add printfs to the two EINVAL returns in the
case CCDIOCSET: section?


Yech! It's more complicated than that.

After trying to just do that and getting nothing, I added:
 printf("ccd ioctl: %lu\n",cmd);

as the first line of  ccdioctl(), and the output is:

Configuring CCD devices.
ccd ioctl: 3223078416
ccdconfig: ioctl (CCDIOCSET): /dev/ccd0c: Invalid argument
/etc/rc.d/ccd exited with code 1
ccd ioctl: 2149606504

So something is broken in the syscall handling, the new compiler (this is
on a VAX), or something else that I don't get yet... I should point out
that apart from ccd, I've successfully booted that same kernel (another
on another machine where I was not using ccd).


And to make it clear: It never comes to the CCDIOCSET handling where it
could return EINVAL. I have not really looked much deeper into the code.


Ok. Decided to dig a little deeper into this.

It's actually ccdinit() that fails. And the line where it fails is:

error = getdisksize(vpp[ix], , );

Anyone know anything more about any newly introduced changes around the 
struct vnode (which is what vpp is), or getdisksize()?


I now also realize that another error I was seeing at boot time, which I 
assumed was harmless, is probably also related:


Found ra0 at mscpbus0 drive 0: RA73
Found ra1 at mscpbus0 drive 1: RA73
Found ra2 at mscpbus0 drive 2: RA73
Found ra3 at mscpbus0 drive 3: RA73
Found ra4 at mscpbus1 drive 4: RA73
Found ra5 at mscpbus1 drive 5: RA73
Found ra6 at mscpbus1 drive 6: RA73
Found ra7 at mscpbus1 drive 7: RA73
ra0: size 3920490 sectors
RAIDframe: can't get disk size for dev ra0 (22)
ra1: size 3920490 sectors
RAIDframe: can't get disk size for dev ra1 (22)
ra2: size 3920490 sectors
RAIDframe: can't get disk size for dev ra2 (22)
ra3: no disk label: size 3920490 sectors
RAIDframe: can't get disk size for dev ra3 (22)
ra4: size 3920490 sectors
RAIDframe: can't get disk size for dev ra4 (22)
ra5: size 3920490 sectors
RAIDframe: can't get disk size for dev ra5 (22)
ra6: size 3920490 sectors
RAIDframe: can't get disk size for dev ra6 (22)
ra7: size 3920490 sectors
RAIDframe: can't get disk size for dev ra7 (22)
boot device: ra0
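The "(22)" in the RAIDframe lines is an errno value, and it is the same code that ccdconfig prints as "Invalid argument". A small sketch for checking that from userland, assuming the host numbers errno the way the kernel in question does (22 is EINVAL on the BSDs and on Linux, but errno numbers are platform-specific); the helper names are hypothetical:

```c
#include <errno.h>
#include <string.h>

/* Nonzero if the numeric code from a kernel message is EINVAL on this host. */
int is_einval(int code)
{
    return code == EINVAL;
}

/* Text the C library associates with a numeric errno value. */
const char *errno_text(int code)
{
    return strerror(code);
}
```

So both messages point at the same kind of failure: something on the disk-size path is rejecting its arguments.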

Johnny

--
Johnny Billquist  || "I'm on a bus
  ||  on a psychedelic trip
email: b...@softjar.se ||  Reading murder books
pdp is alive! ||  tryin' to stay hip" - B. Idol


Re: CCD device

2016-03-28 Thread Johnny Billquist

On 2016-03-28 15:08, Johnny Billquist wrote:

On 2016-03-28 10:54, Martin Husemann wrote:

On Mon, Mar 28, 2016 at 01:42:23AM +0200, Johnny Billquist wrote:

The error is:
Configuring CCD devices.
ccdconfig: ioctl (CCDIOCSET): /dev/ccd0c: Invalid argument
/etc/rc.d/ccd exited with code 1

Did we do some incompatible change recently? Not too happy with this
one.


Nothing obvious, can you add printfs to the two EINVAL returns in the
case CCDIOCSET: section?


Yech! It's more complicated than that.

After trying to just do that and getting nothing, I added:
 printf("ccd ioctl: %lu\n",cmd);

as the first line of  ccdioctl(), and the output is:

Configuring CCD devices.
ccd ioctl: 3223078416
ccdconfig: ioctl (CCDIOCSET): /dev/ccd0c: Invalid argument
/etc/rc.d/ccd exited with code 1
ccd ioctl: 2149606504

So something is broken in the syscall handling, the new compiler (this is
on a VAX), or something else that I don't get yet... I should point out
that apart from ccd, I've successfully booted that same kernel (another
on another machine where I was not using ccd).


And to make it clear: It never comes to the CCDIOCSET handling where it 
could return EINVAL. I have not really looked much deeper into the code.


Johnny

--
Johnny Billquist  || "I'm on a bus
  ||  on a psychedelic trip
email: b...@softjar.se ||  Reading murder books
pdp is alive! ||  tryin' to stay hip" - B. Idol


Re: CCD device

2016-03-28 Thread Johnny Billquist

On 2016-03-28 10:54, Martin Husemann wrote:

On Mon, Mar 28, 2016 at 01:42:23AM +0200, Johnny Billquist wrote:

The error is:
Configuring CCD devices.
ccdconfig: ioctl (CCDIOCSET): /dev/ccd0c: Invalid argument
/etc/rc.d/ccd exited with code 1

Did we do some incompatible change recently? Not too happy with this one.


Nothing obvious, can you add printfs to the two EINVAL returns in the
case CCDIOCSET: section?


Yech! It's more complicated than that.

After trying to just do that and getting nothing, I added:
printf("ccd ioctl: %lu\n",cmd);

as the first line of  ccdioctl(), and the output is:

Configuring CCD devices.
ccd ioctl: 3223078416
ccdconfig: ioctl (CCDIOCSET): /dev/ccd0c: Invalid argument
/etc/rc.d/ccd exited with code 1
ccd ioctl: 2149606504

So something is broken in the syscall handling, the new compiler (this is 
on a VAX), or something else that I don't get yet... I should point out 
that apart from ccd, I've successfully booted that same kernel (another 
on another machine where I was not using ccd).
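The two numbers printed above are ioctl command words, and they can be taken apart by hand. A minimal sketch in C, assuming the BSD-style layout from <sys/ioccom.h> (direction in the top three bits, argument size in bits 16-28, group character in bits 8-15, command number in bits 0-7). The X_ constants and ioc_* helper names are made up for this illustration and are not kernel API:

```c
#include <stdio.h>

/* Restated here so the sketch builds anywhere; assumed to mirror ioccom.h. */
#define X_IOCPARM_MASK 0x1fffUL        /* argument length, 13 bits */
#define X_IOC_OUT      0x40000000UL    /* kernel copies data out to userland */
#define X_IOC_IN       0x80000000UL    /* kernel copies data in from userland */

unsigned long ioc_number(unsigned long cmd) { return cmd & 0xffUL; }
int ioc_group(unsigned long cmd) { return (int)((cmd >> 8) & 0xffUL); }
unsigned long ioc_length(unsigned long cmd) { return (cmd >> 16) & X_IOCPARM_MASK; }
int ioc_copies_in(unsigned long cmd) { return (cmd & X_IOC_IN) != 0; }
int ioc_copies_out(unsigned long cmd) { return (cmd & X_IOC_OUT) != 0; }

/* Print one decoded command word, e.g. for the values from the debug printf. */
void ioc_print(unsigned long cmd)
{
    printf("0x%08lx: group '%c' number %lu length %lu%s%s\n",
        cmd, ioc_group(cmd), ioc_number(cmd), ioc_length(cmd),
        ioc_copies_in(cmd) ? " in" : "",
        ioc_copies_out(cmd) ? " out" : "");
}
```

Under that layout 3223078416 (0xc01c4610) comes out as group 'F', number 16, 28 bytes copied in and out, and 2149606504 (0x80206468) as group 'd', number 104, 32 bytes copied in. Comparing those against the ioctl definitions in the headers would show whether the command word arrives in ccdioctl() intact.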


Johnny

--
Johnny Billquist  || "I'm on a bus
  ||  on a psychedelic trip
email: b...@softjar.se ||  Reading murder books
pdp is alive! ||  tryin' to stay hip" - B. Idol


CCD device

2016-03-27 Thread Johnny Billquist
Hi. I installed the latest current (7.99.26) on a machine that was 
previously running 7.99.23.
Booting the new kernel fails. Still using the old userland. The problem 
is the CCD device.


The error is:
Configuring CCD devices.
ccdconfig: ioctl (CCDIOCSET): /dev/ccd0c: Invalid argument
/etc/rc.d/ccd exited with code 1

Did we do some incompatible change recently? Not too happy with this one.

Johnny

--
Johnny Billquist  || "I'm on a bus
  ||  on a psychedelic trip
email: b...@softjar.se ||  Reading murder books
pdp is alive! ||  tryin' to stay hip" - B. Idol


Re: vax switched to gcc-5.3

2016-03-27 Thread Johnny Billquist

On 2016-03-27 18:44, Johnny Billquist wrote:


Hmm. I seem to remember that I've reported something that might be this
same thing a year or so ago. Should be in the archives...


Nah. That was in December, related to ntp, and the problem was a 
log10(0) call. So my memory was playing tricks on me.


Johnny

--
Johnny Billquist  || "I'm on a bus
  ||  on a psychedelic trip
email: b...@softjar.se ||  Reading murder books
pdp is alive! ||  tryin' to stay hip" - B. Idol


Re: vax switched to gcc-5.3

2016-03-27 Thread Johnny Billquist

On 2016-03-27 16:12, Christos Zoulas wrote:

On Mar 27, 12:11pm, b...@update.uu.se (Johnny Billquist) wrote:
-- Subject: Re: vax switched to gcc-5.3

| On 2016-03-24 02:46, Johnny Billquist wrote:
| > On 2016-03-24 01:08, Christos Zoulas wrote:
| >> Tested with simh... The most important arch switches first!
| >
| > Nice! Now I need to try and do a new native build and see if I have any
| > more luck... On a real 8650 of all things... :-)
|
| No luck. :-(
| With current from yesterday:
|
| libtool: compile:  cc -DHAVE_CONFIG_H -I.
| -I/usr/src/tools/gmp/../../external/lgpl3/gmp/dist -D__GMP_WITHIN_GMP
| -O2 -pedantic -fomit-frame-pointer -c
| /usr/src/tools/gmp/../../external/lgpl3/gmp/dist/tal-reent.c -o tal-reent.o
| /bin/sh ./libtool --tag=CC --mode=compile cc -DHAVE_CONFIG_H -I.
| -I/usr/src/tools/gmp/../../external/lgpl3/gmp/dist -D__GMP_WITHIN_GMP
|-O2 -pedantic -fomit-frame-pointer -c -o mpn/toom63_mul.lo
| mpn/toom63_mul.c
| libtool: compile:  cc -DHAVE_CONFIG_H -I.
| -I/usr/src/tools/gmp/../../external/lgpl3/gmp/dist -D__GMP_WITHIN_GMP
| -O2 -pedantic -fomit-frame-pointer -c mpn/toom63_mul.c -o mpn/toom63_mul.o
| cc: internal compiler error: Illegal instruction (program cc1 received
| signal 4)
| Please submit a full bug report,
| with preprocessed source if appropriate.
| See <http://www.NetBSD.org/support/send-pr.html> for instructions.

Can you run with -dH and see with gdb where it core-dumps?


Huff:gmp/build# gdb /usr/libexec/cc1
GNU gdb (GDB) 7.10.1
Copyright (C) 2015 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later 
<http://gnu.org/licenses/gpl.html>

This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "vax--netbsdelf".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
<http://www.gnu.org/software/gdb/documentation/>.
For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from /usr/libexec/cc1...(no debugging symbols found)...done.
(gdb) core cc1.core
[New process 1]
warning: Corrupted shared library list: 0x0 != 0xb5ec6439
warning: Error reading shared library list entry at 0x3
Core was generated by `cc1'.
Program terminated with signal SIGILL, Illegal instruction.
#0  0x006b9561 in fibonacci_heap<sreal, 
cgraph_edge>::delete_node(fibonacci_node<sreal, cgraph_edge>*, bool) 
(2147473112, 2133887200, 0)

(gdb) bt
#0  0x006b9561 in fibonacci_heap<sreal, 
cgraph_edge>::delete_node(fibonacci_node<sreal, cgraph_edge>*, bool) 
(2147473112, 2133887200, 0)

Backtrace stopped: Cannot access memory at address 0x70b53a03
(gdb)


Hmm. I seem to remember that I've reported something that might be this 
same thing a year or so ago. Should be in the archives...


Johnny


--
Johnny Billquist  || "I'm on a bus
  ||  on a psychedelic trip
email: b...@softjar.se ||  Reading murder books
pdp is alive! ||  tryin' to stay hip" - B. Idol


Re: vax switched to gcc-5.3

2016-03-27 Thread Johnny Billquist

On 2016-03-27 13:23, David Brownlee wrote:

On 27/03/2016, Johnny Billquist <b...@update.uu.se> wrote:

On 2016-03-24 02:46, Johnny Billquist wrote:

On 2016-03-24 01:08, Christos Zoulas wrote:

Tested with simh... The most important arch switches first!


Nice! Now I need to try and do a new native build and see if I have any
more luck... On a real 8650 of all things... :-)


No luck. :-(
With current from yesterday:

libtool: compile:  cc -DHAVE_CONFIG_H -I.
-I/usr/src/tools/gmp/../../external/lgpl3/gmp/dist -D__GMP_WITHIN_GMP
-O2 -pedantic -fomit-frame-pointer -c
/usr/src/tools/gmp/../../external/lgpl3/gmp/dist/tal-reent.c -o tal-reent.o
/bin/sh ./libtool --tag=CC --mode=compile cc -DHAVE_CONFIG_H -I.
-I/usr/src/tools/gmp/../../external/lgpl3/gmp/dist -D__GMP_WITHIN_GMP
-O2 -pedantic -fomit-frame-pointer -c -o mpn/toom63_mul.lo
mpn/toom63_mul.c
libtool: compile:  cc -DHAVE_CONFIG_H -I.
-I/usr/src/tools/gmp/../../external/lgpl3/gmp/dist -D__GMP_WITHIN_GMP
-O2 -pedantic -fomit-frame-pointer -c mpn/toom63_mul.c -o mpn/toom63_mul.o
cc: internal compiler error: Illegal instruction (program cc1 received
signal 4)
Please submit a full bug report,
with preprocessed source if appropriate.
See <http://www.NetBSD.org/support/send-pr.html> for instructions.


Just checking - it's not memory related? Do you get the same error if
you run it unlimited?


Unfortunately yes.

Johnny


--
Johnny Billquist  || "I'm on a bus
  ||  on a psychedelic trip
email: b...@softjar.se ||  Reading murder books
pdp is alive! ||  tryin' to stay hip" - B. Idol


Re: vax switched to gcc-5.3

2016-03-27 Thread Johnny Billquist

On 2016-03-24 02:46, Johnny Billquist wrote:

On 2016-03-24 01:08, Christos Zoulas wrote:

Tested with simh... The most important arch switches first!


Nice! Now I need to try and do a new native build and see if I have any
more luck... On a real 8650 of all things... :-)


No luck. :-(
With current from yesterday:

libtool: compile:  cc -DHAVE_CONFIG_H -I. 
-I/usr/src/tools/gmp/../../external/lgpl3/gmp/dist -D__GMP_WITHIN_GMP 
-O2 -pedantic -fomit-frame-pointer -c 
/usr/src/tools/gmp/../../external/lgpl3/gmp/dist/tal-reent.c -o tal-reent.o
/bin/sh ./libtool --tag=CC --mode=compile cc -DHAVE_CONFIG_H -I. 
-I/usr/src/tools/gmp/../../external/lgpl3/gmp/dist -D__GMP_WITHIN_GMP 
  -O2 -pedantic -fomit-frame-pointer -c -o mpn/toom63_mul.lo 
mpn/toom63_mul.c
libtool: compile:  cc -DHAVE_CONFIG_H -I. 
-I/usr/src/tools/gmp/../../external/lgpl3/gmp/dist -D__GMP_WITHIN_GMP 
-O2 -pedantic -fomit-frame-pointer -c mpn/toom63_mul.c -o mpn/toom63_mul.o
cc: internal compiler error: Illegal instruction (program cc1 received 
signal 4)

Please submit a full bug report,
with preprocessed source if appropriate.
See <http://www.NetBSD.org/support/send-pr.html> for instructions.

Johnny

--
Johnny Billquist  || "I'm on a bus
  ||  on a psychedelic trip
email: b...@softjar.se ||  Reading murder books
pdp is alive! ||  tryin' to stay hip" - B. Idol


Re: vax switched to gcc-5.3

2016-03-23 Thread Johnny Billquist

On 2016-03-24 01:08, Christos Zoulas wrote:

Tested with simh... The most important arch switches first!


Nice! Now I need to try and do a new native build and see if I have any 
more luck... On a real 8650 of all things... :-)


Johnny

--
Johnny Billquist  || "I'm on a bus
  ||  on a psychedelic trip
email: b...@softjar.se ||  Reading murder books
pdp is alive! ||  tryin' to stay hip" - B. Idol


Test files in etc.tgz

2015-11-29 Thread Johnny Billquist

Why are there now numerous test files in the etc.tgz tarball?

Looking, I see
/usr/libdata/debug/usr/tests/...
/usr/tests/...
/var/db/obsolete/tests/...

Shouldn't these be in tests.tgz ?
Or what is the point of tests.tgz ?

Johnny

--
Johnny Billquist  || "I'm on a bus
  ||  on a psychedelic trip
email: b...@softjar.se ||  Reading murder books
pdp is alive! ||  tryin' to stay hip" - B. Idol


Re: Test files in etc.tgz

2015-11-29 Thread Johnny Billquist

On 2015-11-29 16:18, Greg Troxel wrote:


Johnny Billquist <b...@softjar.se> writes:


Why are there now numerous test files in the etc.tgz tarball?

Looking, I see
/usr/libdata/debug/usr/tests/...
/usr/tests/...
/var/db/obsolete/tests/...

Shouldn't these be in tests.tgz ?
Or what is the point of tests.tgz ?


Which release?  On recent -7 and -current builds from source, tests are
in tests.tgz and not in etc.tgz.  Are you using non-default options?


Did a straight build.sh build distribution sets from current yesterday. 
The only option was that I did a cross compile for vax, so the full 
command line was:


"./build.sh -m vax build distribution sets >& m.log &", and that is what 
I ended up with.


Installed those sets on a VAX today (well, not etc.tgz actually), and 
ran etcupdate, which also started wanting to create all these test 
directories, which is when I noted this. Went back and checked etc.tgz, 
and the files were in there too, which is when I wrote the mail.


Johnny

--
Johnny Billquist  || "I'm on a bus
  ||  on a psychedelic trip
email: b...@softjar.se ||  Reading murder books
pdp is alive! ||  tryin' to stay hip" - B. Idol


Re: Bad sleep time resolution of nanosleep(2)

2015-11-23 Thread Johnny Billquist

On 2015-11-24 01:58, Rhialto wrote:

On Tue 24 Nov 2015 at 00:41:42 +0100, Joerg Sonnenberger wrote:

On Tue, Nov 24, 2015 at 12:25:45AM +0100, Rhialto wrote:

In the context of the machine simulator simh, which needs some accurate
timing now and then, I have come across an example of rather bad time
resolution of the nanosleep() system call.  The minimal sleep time seems
to be 20 ms, even if you ask for just 1 ms delay. If you ask for longer
sleeps, the discrepancy becomes relatively less but remains substantial:
20 ms becomes 30 ms, and 40 ms becomes 50.


Well, it is rounded up first to whole ticks, that's the easy part. Next
the callout is scheduled at the tick boundary and then the LWP is
unblocked and scheduled again. It will run in the next scheduling cycle
unless nothing else is running?


I tried it on some fairly idle machines, and the result was quite
consistent. It really looks like there is something in there that
inadvertently always causes an extra tick delay.


Are you sure it's not just a case of the (re)scheduling only happening 
at the next clock tick after the timer runs out?


We do not have a real-time system here. Sleeps only guarantee a minimum 
time. There is no upper bound on how long a sleep can take.
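The numbers reported in this thread fit a very simple model: round the request up to whole ticks, then wait one more tick before the LWP actually runs. A sketch of that arithmetic in C, together with a way to measure a real sleep. The 10 ms tick (HZ=100) is an assumption, since the thread does not state it, and the function names are hypothetical:

```c
#define _POSIX_C_SOURCE 200809L
#include <time.h>

/* Toy model: round up to whole ticks, plus one tick until rescheduling. */
long modeled_sleep_ms(long requested_ms, long tick_ms)
{
    long ticks = (requested_ms + tick_ms - 1) / tick_ms;   /* round up */
    return (ticks + 1) * tick_ms;
}

/* Measure how long nanosleep() really takes for a given request. */
long measured_sleep_ns(long requested_ns)
{
    struct timespec req = { requested_ns / 1000000000L,
                            requested_ns % 1000000000L };
    struct timespec start, end;

    clock_gettime(CLOCK_MONOTONIC, &start);
    nanosleep(&req, NULL);
    clock_gettime(CLOCK_MONOTONIC, &end);
    return (long)(end.tv_sec - start.tv_sec) * 1000000000L
        + (end.tv_nsec - start.tv_nsec);
}
```

With a 10 ms tick the model gives 20 ms for a 1 ms request, 30 ms for 20 ms and 50 ms for 40 ms, which is what was observed. That only shows the figures are consistent with "next tick after the timer runs out"; it does not prove that this is what the kernel does.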


Johnny

--
Johnny Billquist  || "I'm on a bus
  ||  on a psychedelic trip
email: b...@softjar.se ||  Reading murder books
pdp is alive! ||  tryin' to stay hip" - B. Idol


Re: DoS attack against TCP services

2015-02-04 Thread Johnny Billquist
Are you *sure* the same connections stay around forever, or might it 
just be that you get new ones at a higher rate than old ones go away?
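A back-of-the-envelope way to tell the two cases apart: in steady state the number of entries equals the arrival rate times the time each entry stays (Little's law). A sketch in C; the 120 second hold time used in the note below is only a guess for illustration (the thread does not say what units the mslt sysctls use), and the function names are hypothetical:

```c
/* Little's law: population = arrival rate x time each entry stays. */
double steady_time_wait_count(double closes_per_sec, double hold_sec)
{
    return closes_per_sec * hold_sec;
}

/* Close rate needed to sustain an observed TIME_WAIT population. */
double closes_per_sec_needed(double observed_count, double hold_sec)
{
    return observed_count / hold_sec;
}
```

With a 120 second hold time, 5000 entries would need roughly 42 connections closing per second. If the real close rate is far below whatever this gives for the actual hold time, the entries are not expiring; if it is in that range, they are simply being replaced as fast as they go away.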


Johnny

On 2015-02-04 19:44, 6b...@6bone.informatik.uni-leipzig.de wrote:

Now the server has over 5000 TIME_WAIT connections.

netstat -a -n | grep TIME_WAIT
tcp        0      0  139.18.25.33.59256     198.6.1.83.53          TIME_WAIT
tcp        0      0  139.18.25.33.59257     77.222.50.250.53       TIME_WAIT
tcp        0      0  139.18.25.33.59258     193.232.128.6.53       TIME_WAIT
tcp        0      0  139.18.25.33.59259     78.104.145.37.53       TIME_WAIT
tcp        0      0  139.18.25.33.59260     192.5.6.30.53          TIME_WAIT
tcp        0      0  139.18.25.33.59261     192.41.162.30.53       TIME_WAIT
tcp        0      0  139.18.25.33.59262     192.35.51.30.53        TIME_WAIT
tcp        0      0  139.18.25.33.59263     192.43.172.30.53       TIME_WAIT
tcp        0      0  139.18.25.33.59264     202.12.27.33.53        TIME_WAIT
...

It seems to be a result of the named. lsof shows that the connections
are not owned by named. lsof doesn't show any of the TIME_WAIT
connections. So stopping and restarting named doesn't delete the
connections.

Is there anything else that could be interesting for a problem report?


Regards
Uwe


On Wed, 4 Feb 2015, Christos Zoulas wrote:


Date: Wed, 4 Feb 2015 15:40:00 +0000 (UTC)
From: Christos Zoulas chris...@astron.com
To: current-users@netbsd.org
Subject: Re: DoS attack against TCP services

In article
pine.neb.4.64.1502041602460@6bone.informatik.uni-leipzig.de,
6b...@6bone.informatik.uni-leipzig.de wrote:

Hello,

The problem occurred again. The kernel has over 3,000 connections in
TIME_WAIT state. Even after an hour's wait, the connections have not
disappeared. There are more and more connections in the TIME_WAIT state.
My settings are:

net.inet.tcp.mslt.enable = 1
net.inet.tcp.mslt.loopback = 2
net.inet.tcp.mslt.local = 10
net.inet.tcp.mslt.remote = 60
net.inet.tcp.mslt.remote_threshold = 6

The last few times I have restarted the server in order to solve the
problem. But frequent reboots are very inconvenient for a server.

Can anyone tell me what further information I should gather in order to
post a bug report? The statement that connections in the TIME_WAIT state
are not being removed is probably not sufficient to find the problem.


Thank you for your efforts


Can you find what daemon/process is being connected to and from where?

christos