Re: 800+ tests failing

2016-08-10 Thread Robert Elz
Date:        Thu, 11 Aug 2016 01:51:54 +0700
From:        Robert Elz
Message-ID:  <27517.1470855...@andromeda.noi.kre.to>

  | I'll start poking them in an hour or so, but anyone else should feel free...

I think that (with Roy's help) all of the -lrumpdev related test failures
might be fixed now (not impossible that I missed one somewhere, especially
if it was expected to fail for some other reason.)

There are three remaining groups of new problems that are breaking tests that
I have not attempted to fix...

1) link numbering seems to have changed - tests looking for a route
   on link#2 are now seeing it on link#3 and failing (and similar.)

2) rumpdev_tap seems to be broken, creating a tap interface fails
   (rump.ifconfig -C lists tap as an existing cloner interface)

3) ext2fs tests are failing all over the place ... since ext2fs just
   had an upgrade (parts of which did not look very finished to me)
   these problems might not really be surprising.

Could someone who understands each of these take a look, please?

kre



Re: Building on OS X - how?

2016-08-10 Thread coypu
There shouldn't be anything special about building from OS X.
You just ran into a setup that happens to not work with it.

It'd be good to file a bug report for it, or at least mention where it
fails.


Re: multipath fibre channel

2016-08-10 Thread Swift Griggs
On Wed, 10 Aug 2016, 6b...@6bone.informatik.uni-leipzig.de wrote:
> I want to configure multipath for a fibre channel storage. I need only the
> availability, not the performance.

As others have already said, you can't. However, ... 

RAID1 on the same LUN wouldn't work or make sense. However, to mitigate 
switch/path failures you could do the following to gain path-resiliency:

1. Create two identically sized LUNs on the SAN.

2. On the SAN unit, present the LUNs through separate
   fiber-adapters/paths/fabrics/switches.

3. Obviously you'll need to connect your HBAs to the two independent
   paths.

4. Use LVM or RAIDframe to mirror the two LUNs into a RAID1 configuration.

5. Celebrate. You've used 2x the storage, but gained resiliency and can
   withstand a complete path failure.
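
The reasoning behind the steps above can be checked with a toy model. This is only a sketch: it assumes each LUN is reachable over exactly one path, ignores everything about real LVM or RAIDframe behaviour, and the function names are made up for illustration.

```python
from itertools import product

def mirror_available(path_a_up, path_b_up):
    """A two-way mirror stays usable while at least one half is reachable."""
    return path_a_up or path_b_up

def single_lun_available(path_a_up, path_b_up):
    """One LUN reached over path A only (no multipath): path B does not help."""
    return path_a_up

# Enumerate every combination of path states.
rows = []
for a_up, b_up in product([True, False], repeat=2):
    rows.append((a_up, b_up,
                 single_lun_available(a_up, b_up),
                 mirror_available(a_up, b_up)))

print("path A  path B  single LUN  RAID1 of two LUNs")
for a_up, b_up, single, mirror in rows:
    a_txt = "up" if a_up else "down"
    b_txt = "up" if b_up else "down"
    single_txt = "ok" if single else "LOST"
    mirror_txt = "ok" if mirror else "LOST"
    print(f"{a_txt:6}  {b_txt:6}  {single_txt:10}  {mirror_txt}")
```

The mirror is lost only when both paths are down, which is the path resiliency the list describes, paid for with twice the storage.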

Thanks,
  Swift


Re: 800+ tests failing

2016-08-10 Thread Robert Elz
Date:        Wed, 10 Aug 2016 18:39:02 +
From:        Martin Husemann
Message-ID:  <20160810183902.ga23...@homeworld.netbsd.org>

  | The ones I looked at all said:
  | 
  | stderr:
  | /usr/lib/librumpnet_net.so: Undefined PLT symbol "rumpns_config_cfdriver_attach" (symnum = 299)

Yes, just saw those too - I was expecting something like that.

They just need -lrumpdev added to whatever command is failing.  There's
been a lot of this fixed already, but there's obviously a lot more to come.
Something tells me there really ought to be a better way...
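
A rough triage aid, assuming the stderr of the failing tests has already been collected into a list of lines (how it gets collected is left out). It counts which library/symbol pairs show up in "Undefined PLT symbol" messages. The pattern is derived only from the single sample quoted above, and the helper name `missing_symbols` is made up for illustration.

```python
import re
from collections import Counter

# Pattern for the runtime-linker message quoted above. The wording is taken
# from that one sample; other messages may be formatted differently.
PLT_RE = re.compile(
    r'^(?P<lib>\S+): Undefined PLT symbol "(?P<sym>[^"]+)" '
    r'\(symnum = (?P<num>\d+)\)$'
)

def missing_symbols(stderr_lines):
    """Count (library, symbol) pairs named in 'Undefined PLT symbol' lines."""
    counts = Counter()
    for line in stderr_lines:
        m = PLT_RE.match(line.strip())
        if m:
            counts[(m.group("lib"), m.group("sym"))] += 1
    return counts

# Sample input: the quoted message twice, plus one unrelated line.
sample = [
    '/usr/lib/librumpnet_net.so: Undefined PLT symbol '
    '"rumpns_config_cfdriver_attach" (symnum = 299)',
    "some unrelated stderr line",
    '/usr/lib/librumpnet_net.so: Undefined PLT symbol '
    '"rumpns_config_cfdriver_attach" (symnum = 299)',
]

counts = missing_symbols(sample)
for (lib, sym), n in counts.most_common():
    print(f"{n:3d}  {lib}  {sym}")
```

Failures that do not match the pattern are simply ignored, so whatever is left after this pass would still need looking at by hand.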

I'll start poking them in an hour or so, but anyone else should feel free...

kre



Re: 800+ tests failing

2016-08-10 Thread Martin Husemann
On Wed, Aug 10, 2016 at 05:28:48PM +0700, Robert Elz wrote:
> After the fix just committed, I am now seeing (testing in an amd64 xen DomU,
> not qemu, so the results might differ - also this was not the cleanest
> environ possible - the installation system has been used a few times before
> which could also affect things, possibly)
> 
> Summary for 657 test programs:
> 4442 passed test cases.
> 128 failed test cases.
> 38 expected failed test cases.
> 90 skipped test cases.

On one of my "official" test systems I got 

Summary for 648 test programs:
 4103 passed test cases.
 108 failed test cases.
 36 expected failed test cases.
 102 skipped test cases.

so 108 up from 15 failures last week.
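
For comparing runs like these, a small parser can help. This is a minimal sketch that assumes the summary always has the shape shown above; `parse_summary` is a made-up helper, not part of any test tool.

```python
import re

def parse_summary(text):
    """Return a dict of counts from a test-run summary like the ones quoted here."""
    counts = {}
    m = re.search(r"Summary for (\d+) test programs", text)
    if m:
        counts["programs"] = int(m.group(1))
    # Lines look like "4103 passed test cases." or "36 expected failed test cases."
    pattern = r"^[ \t]*(\d+) ([a-z ]+?) test cases\.$"
    for n, kind in re.findall(pattern, text, re.M):
        counts[kind] = int(n)
    return counts

summary = """\
Summary for 648 test programs:
 4103 passed test cases.
 108 failed test cases.
 36 expected failed test cases.
 102 skipped test cases.
"""

now = parse_summary(summary)
last_week_failed = 15  # figure quoted in the message above
total = sum(v for k, v in now.items() if k != "programs")
increase = now["failed"] - last_week_failed
print(f"failed: {now['failed']} of {total} cases "
      f"({100 * now['failed'] / total:.1f}%), "
      f"{increase} more than last week")
```

Keeping the category names as dictionary keys means a summary with extra categories would still parse, as long as each line ends in "test cases."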

The ones I looked at all said:

stderr:
/usr/lib/librumpnet_net.so: Undefined PLT symbol "rumpns_config_cfdriver_attach" (symnum = 299)


Martin


Re: 800+ tests failing

2016-08-10 Thread Paul Goyette

On Wed, 10 Aug 2016, Martin Husemann wrote:

> On Tue, Aug 09, 2016 at 09:12:57AM +, Martin Husemann wrote:
> > This is a call to in_control() with ifp = NULL.
>
> Here is a trace with symbols:
>
> tc-se:#5  0x7f7ff68637db in panic (fmt=) at /usr/src/lib/librump/../../sys/rump/../kern/subr_prf.c:258
> tc-se:#6  0x7f7ff7085d3a in in_control0 (ifp=0x0, data=0x7f7fd4e0, cmd=2151704858, so=) at /usr/src/sys/rump/net/lib/libnet/../../../../netinet/in.c:466
> tc-se:#7  in_control (so=, cmd=2151704858, data=0x7f7fd4e0, ifp=0x0) at /usr/src/sys/rump/net/lib/libnet/../../../../netinet/in.c:743
> tc-se:#8  0x7f7ff7400c81 in rumpcompinitRUMP_COMPONENT_NET_IFCFG () at /usr/src/sys/rump/net/lib/libnetinet/netinet_component.c:90
> tc-se:#9  0x7f7ff689f0b8 in rump_component_init (type=RUMP_COMPONENT_NET_IFCFG) at /usr/src/lib/librump/../../sys/rump/librump/rumpkern/rump.c:606
> tc-se:#10 0x7f7ff689f0b8 in rump_component_init (type=type@entry=RUMP__FACTION_NET) at /usr/src/lib/librump/../../sys/rump/librump/rumpkern/rump.c:606
> tc-se:#11 0x7f7ff689f604 in rump_init () at /usr/src/lib/librump/../../sys/rump/librump/rumpkern/rump.c:433
>
> So: lo0ifp is NULL, since loopinit() has never been called (or if_loop.c
> is not part of the rump libnetinet component?)
>
> Paul? Christos?
>
> How is this supposed to happen? Mark all ${cloner}init() functions as
> constructors in the rump case?


I'm really off-line, with only Email access, so cannot look very 
closely.


However, I don't think the suggestion will work.  All the ${cloner}init 
routines will already be called as part of rump's module initialization, 
so it doesn't make sense to me to call them again as a rump constructor.


I will have normal access in early Sept. but I'm pretty sure you'd like 
a fix before then.  I hope Christos can dive in deeper.


If not, then perhaps all of these changes should be reverted.  That's a 
rather drastic solution, but might be necessary.




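+------------------+--------------------------+------------------------+
| Paul Goyette     | PGP Key fingerprint:     | E-mail addresses:      |
| (Retired)        | FA29 0E3B 35AF E8AE 6651 | paul at whooppee.com   |
| Kernel Developer | 0786 F758 55DE 53BA 7731 | pgoyette at netbsd.org |
+------------------+--------------------------+------------------------+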


Re: 800+ tests failing

2016-08-10 Thread Robert Elz
After the fix just committed, I am now seeing (testing in an amd64 xen DomU,
not qemu, so the results might differ - also this was not the cleanest
environ possible - the installation system has been used a few times before
which could also affect things, possibly)

Summary for 657 test programs:
4442 passed test cases.
128 failed test cases.
38 expected failed test cases.
90 skipped test cases.

That's not quite back to normal yet, but from what I watched scroll
past as the tests were running, I suspect that most of the tests still
failing are probably more symptoms of missing -lrumpdev (or similar).

kre



Building on OS X - how?

2016-08-10 Thread Hubert Feyrer


Hi,

For a long time I've cross-built -current/amd64 from OS X. After following
the advice to install Command Line Tools[1] as a second compiler besides
Xcode, things went downhill and I see different build errors with no
special build flags, or with various settings like
"-V HOST_CC=/Developer/usr/bin/cc -V HOST_CXX=/Developer/usr/bin/c++" (LLVM)
and "-V HOST_CC=/usr/bin/cc -V HOST_CXX=/usr/bin/g++" (clang).

I've added a bit more information below[2], but to cut a long story short:
what exact build.sh options does one use these days to cross-build
-current/amd64 from OS X? Did I miss any documentation[3]?


Thanks in advance!


 - Hubert


[1] https://developer.apple.com/downloads/
The link is from pkgsrc/bootstrap/README.MacOSX

[2]
% xcode-select -v
xcode-select version 2339.

% uname -a
Darwin promise.local 14.5.0 Darwin Kernel Version 14.5.0: Thu Jun 16 19:58:21 PDT 2016; root:xnu-2782.50.4~1/RELEASE_X86_64 x86_64


% gcc -v
Configured with: --prefix=/Applications/Xcode.app/Contents/Developer/usr 
--with-gxx-include-dir=/usr/include/c++/4.2.1

Apple LLVM version 7.0.2 (clang-700.1.81)
Target: x86_64-apple-darwin14.5.0
Thread model: posix

% /Developer/usr/bin/gcc -v
couldn't understand kern.osversion `14.5.0'
Using built-in specs.
Target: i686-apple-darwin11
Configured with: 
/private/var/tmp/llvmgcc42/llvmgcc42-2336.1~22/src/configure 
--disable-checking --enable-werror --prefix=/Developer/usr/llvm-gcc-4.2 
--mandir=/share/man --enable-languages=c,objc,c++,obj-c++ 
--program-prefix=llvm- --program-transform-name=/^[cg][^.-]*$/s/$/-4.2/ 
--with-slibdir=/usr/lib --build=i686-apple-darwin11 
--enable-llvm=/private/var/tmp/llvmgcc42/llvmgcc42-2336.1~22/dst-llvmCore/Developer/usr/local 
--program-prefix=i686-apple-darwin11- --host=x86_64-apple-darwin11 
--target=i686-apple-darwin11 --with-gxx-include-dir=/usr/include/c++/4.2.1

Thread model: posix
gcc version 4.2.1 (Based on Apple Inc. build 5658) (LLVM build 2336.1.00)

[3] Neither the NetBSD guide nor src/BUILDING provides help on
cross-building from OS X in particular.


Re: multipath fibre channel

2016-08-10 Thread Stephen Borrill

On Wed, 10 Aug 2016, 6b...@6bone.informatik.uni-leipzig.de wrote:
> I want to configure multipath for a fibre channel storage. I need only the
> availability, not the performance. For NetBSD I have not found any
> documentation on this subject. Is multipath possible for FC storages?

There's no multipath support at all :-(

It could be added most easily to LVM.

> If not, is it possible / useful to use a software raid1 over both paths?

That would seem to be a bad idea as at the backend both paths point to the
same thing.


--
Stephen


Re: 800+ tests failing

2016-08-10 Thread Robert Elz
Date:        Wed, 10 Aug 2016 06:43:27 +0200
From:        Martin Husemann
Message-ID:  <20160810044327.ga8...@mail.duskware.de>

  | Here is a trace with symbols:

I think I may have an idea what happened - testing a possible fix now
(a real fix, I think, not just a gross hack...)

Your tracebacks helped - A LOT - thanks.

kre



Re: 800+ tests failing

2016-08-10 Thread Robert Elz
Date:        Wed, 10 Aug 2016 06:43:27 +0200
From:        Martin Husemann
Message-ID:  <20160810044327.ga8...@mail.duskware.de>

  | So: lo0ifp is NULL, since loopinit() has never been called (or if_loop.c
  | is not part of the rump libnetinet component?)

I wonder if this is perhaps related to the same issues that led to ...

  | Module Name:    src
  | Committed By:   christos
  | Date:           Sun Aug  7 11:33:38 UTC 2016
  | Log Message:
  | don't load loopback as a module as other parts of the code use it directly.

Maybe we can make some gross hack in rump for now to simulate the
same effect, and then it can get fixed correctly when someone who
knows how is available to do it...

kre