Re: Proposal: validate FFS root inode during the mount.

2019-11-20 Thread Warner Losh
Kirk has added a bunch of checksum and other integrity checks to FreeBSD.
If you are looking for extra sanity, that might be a good place to snag
code from. They would be more comprehensive than just checking the root
node...

Warner

On Wed, Nov 20, 2019, 8:39 AM Mouse wrote:

> >>> To make sure that corrupted mount won't cause harm to the user, I
> >>> want to add function to validate root inode on mount [...]
> >> Don't you have more or less the same issue with every other non-free
> >> inode in the filesystem?
> > I think the point is, when the root inode is corrupted, you can't
> > unmount the filesystem.
>
> If that were the problem, I'd expect the fix to be support for forcibly
> unmounting filesystems even when they're in bizarre states like that.
> Arguably that is something that should go in anyway.  I long ago added
> a flag to umount(8)
>
>  -R  Take the special | node argument as a path to be passed
>      directly to unmount(2), bypassing all attempts to be smart
>      about mechanically determining the correct path from the
>      argument.  This option is incompatible with any option that
>      potentially unmounts more than one filesystem, such as -a,
>      but it can be used with -f and/or -v.  This is the only way
>      to unmount something that does not appear as a directory
>      (such as a nullfs mount of a plain file); there are probably
>      other cases where it is necessary.
>
> Could that be suitable for dealing with the "can't unmount" aspect, or
> is there kernel work needed too?  The initial post indicates that there
> is crasher behaviour involved, though it's not clear to what extent
> it's directly related to the "can't unmount" syndrome - the post says
> it can't be unmounted, but blames umount, not unmount, so it's not
> clear to me whether that's userland's fault or not.
>
> /~\ The ASCII Mouse
> \ / Ribbon Campaign
>  X  Against HTML        mo...@rodents-montreal.org
> / \ Email!   7D C8 61 52 5D E7 2D 39  4E F1 31 3E E8 B3 27 4B
>


Re: Adding an ioctl to check for disklabel existence

2019-10-03 Thread Warner Losh
On Thu, Oct 3, 2019 at 9:19 AM Robert Elz wrote:

>  Now it makes no sense at all.
>

FreeBSD made the explicit decision when disks were sneaking up on 2TB to
move to GPT labels. Why invent a new scheme that interoperates poorly with
other things? GPT, for better or worse, won. disklabel64 would add no value
over GPT, require a lot of extra code and be an ongoing source of confusion
and difficulty for our users. This is why UFS2 didn't bring in a 64-bit
disklabel format...

NetBSD is, of course, free to do what it likes. My semi-outsider's view
suggests, though, that the FreeBSD experience is relevant and timely.

Warner


Re: re-enabling debugging of 32 bit processes with 64 bit debugger

2019-06-30 Thread Warner Losh
On Sat, Jun 29, 2019 at 2:04 PM Christos Zoulas wrote:

> In article  zccfacpsbmoz-4xguf54n...@mail.gmail.com>,
> Andrew Cagney wrote:
> >
> >Having 32-bit and 64-bit debuggers isn't sufficient.  Specifically, it
> >can't handle an exec() call where the new executable has a different
> >ISA; and this imnsho is a must have.
>
> It is really hard to make a 32 bit debugger work on a 64 bit system
> because of the tricks we play with the location of the shared
> libraries in rtld and the debugger needs to be aware of them.
> In retrospect it would have been simpler (and uglier) to have
> /32 and /64 in all the shared library paths so that they would
> not occupy the filesystem space, but even then things could break
> for raw dlopen() calls, or opening other data files that are not
> size neutral. HP/UX with Context Dependent Files and IBM/AIX with
> Hidden Directories were attempts at a solution, but they created
> a new dimension of problems.
>

I came to a similar conclusion when I hacked FreeBSD rtld to grok the
difference between hard and soft float on the same system at the same time.

Warner


Re: Removing PF

2019-04-01 Thread Warner Losh
On Sun, Mar 31, 2019, 8:13 AM Sevan Janiyan wrote:

>
>
> On 30/03/2019 23:27, Mouse wrote:
> > (Rule of thumb: anyone who calls something "secure" or
> > "insecure" without giving any indication of the threat model in
> > question either doesn't understand security or hopes you don't; neither
> > alternative is good.  It's not universally applicable - here, for
> > example, I suspect you were just being a bit over-brief - but it's been
> > remarkably useful to me.)
>
>
> Deeming it insecure on the basis of all the upstream bug fixes which
> haven't been merged into our tree since our last sync, including
> published patches from around this point onwards:
> https://www.openbsd.org/errata42.html both of which need to be evaluated
> to see if applicable.
>

Also on the basis of nobody doing this for years, I'd say this is prime
evidence for there being no effective maintainer for years.

Warner



Re: Removing PF

2019-03-30 Thread Warner Losh
On Sat, Mar 30, 2019, 2:29 PM Maxime Villard wrote:

> On 30/03/2019 at 20:26, Michael van Elst wrote:
> > On Sat, Mar 30, 2019 at 08:10:21PM +0100, Maxime Villard wrote:
> >
> >> ... sure, meanwhile you didn't really answer to the core of the issue,
> >> which I think was stated clearly by Sevan ...
> >
> > The issue is that we need to work on npf before we can drop other code.
>
> ... the questions raised were: why would someone use an insecure firewall?
> ... and isn't it irresponsible to provide an insecure firewall? ... you still
> fail to answer ... I see fewer and fewer reasons to keep talking to you,
> given your clear inability to answer in good faith ...
>

Also, this is a plan to deprecate, not remove from the tree tomorrow.
Declaring for all to see that it is a rotting, festering carcass is a
good thing. Maybe it will spur someone to fix that. Extremely unlikely,
but possible. It does let the users know with enough time to migrate and/or
enhance npf to meet their needs. It starts to break the log jam that has
led to three under-supported firewalls in the tree.

Warner



Re: Regarding the ULTRIX and OSF1 compats

2019-03-16 Thread Warner Losh
Picking a random message in this thread to respond to.

FreeBSD has struggled with deprecation as well (which is what this is).

I'm working on a doc to help there, but the basic criteria are:

1. What is the cost to keep it? Include the API change tax here.
2. What is the benefit the project gets from it? How many people use THING
and how much "good" do we get out of this?
3. Is the THING working for anything non-trivial?
4. Is there someone actively looking after THING?

It's basically nothing more than a cost-benefit analysis.

In the case of COMPAT_ULTRIX (which is not going away) you'd get:

1. Cost is low, though not zero. It's a thin veneer over stuff the system
would have anyway.
2. Some people are still running Ultrix binaries.
3. As far as has been reported, it's useful for non-trivial binaries.
4. Nobody is really looking after it, but there's enough use to generate
bug fixes.

So on the whole, there's some benefit at a modest cost to keeping a feature
that's basically working. Keep is a decent decision.

In the case of COMPAT_OSF (which some would like to be removed):

1. Cost is relatively high, as there are parts we'd not have in a normal
system (MACH features missing, must make API changes blind, no way to test).
2. Nobody has reported OSF binaries in recent memory, though some used it
years ago (it was quite important in the 90s for alpha bring up).
3. It's basically broken. Non-trivial binaries are impossible because of
the missing bits.
4. No one is looking after it.

Which is all negative: there's no benefit for something that's not known to
be working, and even if it was working it's incomplete for a user base of
zero with no maintainer. Add to that that since there's no good way to
test, the work to keep it compiling is make-work: it's a box to tick that
provides no benefit other than ticking the box.

Seems like a clear and compelling case to me, but my involvement with
NetBSD is too tangential for me to strongly advocate for that.

Anyway, if there's this much contention over a removal, I'd suggest
coming up with a set of reasonable criteria people can agree on that
help focus the discussion on cost / benefit rather than some of the more
esoteric philosophical arguments I've seen in the thread, which feel
good, but put a lot of work on others to generate that good feeling.

Warner


Re: nandemulator

2019-02-24 Thread Warner Losh
On Sun, Feb 24, 2019, 11:33 AM David Holland wrote:

> On Sat, Feb 23, 2019 at 02:05:39PM -0700, Warner Losh wrote:
>  > On Sat, Feb 23, 2019 at 12:40 PM David Holland
>  > <dholland-t...@netbsd.org> wrote:
>  >
>  > > Do we have docs for the object nandemulator is supposed to be
>  > > emulating? Some questions have arisen about how complete it is and
>  > > nobody I've talked to seems to really have answers.
>  >
>  > So looking at the code...
>  >[...]
>  > I know these aren't definitive answers as I didn't write the code and am
>  > basing this on briefly studying the code + the knowledge I picked up
>  > about NAND while working with planar SLC and MLC NAND in the 34nm to 19nm
>  > technology nodes for Intel, Micron and Toshiba. So in the absence of
>  > other answers, mine may be OK. However, I'd be happy to defer to someone
>  > who wrote the code and/or did a comparison of commands vs datasheets from
>  > that era.
>
> I think you underestimate how much the rest of us don't know :-)
>
> Many thanks -- that is definitely enough information to sort things
> out, and I'd had no idea even where to begin looking.
>

I'm happy to fill in more details. I worked at FusionIO for their third and
fourth generation of cards doing tweaks to thresholds to optimize read
performance and reliability... I forget what the baseline for most people
is :)

I had thought about saying "just a lot of old stuff from the early 2000s,"
but that seemed to be too vague.

But seriously. I'm happy to help in any way I can.

Warner



Re: nandemulator

2019-02-23 Thread Warner Losh
On Sat, Feb 23, 2019 at 12:40 PM David Holland wrote:

> Do we have docs for the object nandemulator is supposed to be
> emulating? Some questions have arisen about how complete it is and
> nobody I've talked to seems to really have answers.
>

So looking at the code... It's an ONFI emulator. That means Intel/Micron
parts (as opposed to the so-called 'Toggle' parts from Toshiba/Samsung).
It's from 2011, so it can't be emulating anything newer than 30ish nm
processes. The limited command set suggests that it's emulating just SLC
parts. It uses bogus manufacturer data, and a placeholder name
(NANDEMULATOR made by NetBSD), so I doubt there's a specific model used
here. It hard-codes a 32MB device with 2k pages, which suggests an even
older device (45nm SLC generation maybe). It emulates things at a much
lower level than FreeBSD's nandsim, it would appear, but I've not studied
either more than briefly for this email.
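The geometry arithmetic behind the "32MB device with 2k pages" observation can be sketched in a few lines. This is only an illustration: the struct and function names are made up, and the 64-pages-per-block figure used in the note below is an assumption (a common layout for 2k-page SLC parts), not a value read out of nandemulator.

```c
#include <stdint.h>

/* Hypothetical geometry description; field names are illustrative only. */
struct nand_geometry {
    uint64_t device_bytes;    /* total data capacity, excluding spare area */
    uint32_t page_bytes;      /* data bytes per page */
    uint32_t pages_per_block; /* ASSUMPTION: not taken from the emulator */
};

/* Number of pages on the device. */
uint64_t nand_page_count(const struct nand_geometry *g)
{
    return g->device_bytes / g->page_bytes;
}

/* Number of erase blocks on the device. */
uint64_t nand_block_count(const struct nand_geometry *g)
{
    return nand_page_count(g) / g->pages_per_block;
}
```

For a 32MB device with 2k pages this gives 16384 pages, and 256 erase blocks under the assumed 64 pages per block.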

This is in keeping with the other nand_*.c files in that directory. They
are for parts like the Micron MT29F2G08AAC and such. These date from 2005
to 2008 if I can believe the quick sample of data sheets that I found. The
list of supported commands is approximately that of the emulator for the
Micron part. No mention is made of MLC or TLC, which on older parts usually
indicates that they are SLC. MLC and TLC parts sometimes have additional
features / commands required (or sometimes just desired) for coping with
partial page programming, etc.

It doesn't look super complete to my eye. But it's been 6 years since I was
building NAND based PCIe storage devices for a living.

I know these aren't definitive answers as I didn't write the code and am
basing this on briefly studying the code + the knowledge I picked up about
NAND while working with planar SLC and MLC NAND in the 34nm to 19nm
technology nodes for Intel, Micron and Toshiba. So in the absence of other
answers, mine may be OK. However, I'd be happy to defer to someone who
wrote the code and/or did a comparison of commands vs datasheets from that
era.

Warner


Re: scsipi: physio split the request

2018-12-28 Thread Warner Losh
On Fri, Dec 28, 2018, 11:04 AM Warner Losh wrote:
>
> On Fri, Dec 28, 2018, 1:25 AM matthew green 
>> > Of course larger transfers would also mitigate the overhead for each I/O
>> > operation, but we already do several Gigabyte/s with 64k transfers and
>> > filesystem I/O tends to be even smaller.
>>
>> yes - the benefits will be in the 0-10% range for most things.  it
>> will help, but only a fairly small amount, most of us won't notice.
>>
>> i've seen peaks of 1.4GB/s with an nvme(4) device with ffs on top.
>>
>
>
> I've seen 3.3GB/s of 128k-512k transfers on FreeBSD off of nvme, but
> that's mostly video. It seems to be limited there not so much by transfer
> size, but by the ability to queue transactions. We see <1% gain from raising
> MAXPHYS to 1MB over the default 128k there.
>

Also, we are limited by what the device itself can do, which varies a lot by
drive: from a low of 1GB/s to a high of just under 3.4GB/s.

Warner



Re: scsipi: physio split the request

2018-12-28 Thread Warner Losh
On Fri, Dec 28, 2018, 1:25 AM matthew green wrote:

> > Of course larger transfers would also mitigate the overhead for each I/O
> > operation, but we already do several Gigabyte/s with 64k transfers and
> > filesystem I/O tends to be even smaller.
>
> yes - the benefits will be in the 0-10% range for most things.  it
> will help, but only a fairly small amount, most of us won't notice.
>
> i've seen peaks of 1.4GB/s with an nvme(4) device with ffs on top.
>


I've seen 3.3GB/s of 128k-512k transfers on FreeBSD off of nvme, but that's
mostly video. It seems to be limited there not so much by transfer size,
but by the ability to queue transactions. We see <1% gain from raising
MAXPHYS to 1MB over the default 128k there.
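To get a feel for why queueing can matter more than transfer size at these rates, it helps to work out how many commands per second a given throughput implies. A rough sketch, assuming decimal gigabytes and binary kilobytes (the thread does not say which); the helper names and the latency figure in the note below are hypothetical, not measurements:

```c
#include <stdint.h>

/* Commands per second needed to sustain bytes_per_sec with fixed-size
 * transfers.  Integer division, so the result is rounded down. */
uint64_t required_iops(uint64_t bytes_per_sec, uint64_t transfer_bytes)
{
    return bytes_per_sec / transfer_bytes;
}

/* Average number of commands that must be in flight to keep that rate,
 * given a per-command latency in microseconds
 * (Little's law: in-flight = rate * latency). */
uint64_t required_queue_depth(uint64_t iops, uint64_t latency_us)
{
    /* Round up so a fractional requirement still counts as a full slot. */
    return (iops * latency_us + 999999) / 1000000;
}
```

At 3.3GB/s, 128k transfers work out to roughly 25,000 commands per second. With a purely hypothetical 100 microsecond per-command latency, about three commands would have to be in flight at all times to sustain that.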

Warner



Re: svr4, again

2018-12-20 Thread Warner Losh
On Thu, Dec 20, 2018, 6:17 PM Maxime Villard wrote:

> On 20/12/2018 at 18:11, Kamil Rytarowski wrote:
> > https://github.com/krytarowski/franz-lisp-netbsd-0.9-i386
> >
> > On the other hand unless we need it for bootloaders, drivers or
> > something needed to run NetBSD, I'm for removal of srv3, sunos etc
> > compat.
>
> Yes.
>
> So, first things first, and to come back to my email about ibcs2: what are
> the reasons for keeping it? As I said previously, this is not for x86 but
> for Vax. As was also said, FreeBSD removed it just a few days ago.
>


It had been disconnected from the build for a while too...

Warner

> I'm bringing up compat_ibcs2 because I did start a thread on port-vax@ about
> it last year (as quoted earlier), and back then it seemed that no one knew
> what was the use case on Vax.
>


Re: svr4, again

2018-12-20 Thread Warner Losh
On Wed, Dec 19, 2018 at 4:38 PM  wrote:

> On Wed, Dec 19, 2018 at 11:01:27AM -0700, Warner Losh wrote:
> > FreeBSD ditched SYSV maybe 2 years ago, but
> > we still have IBCS in the tree because people are still using it (last we
> > checked) and bug fixes / reports are still trickling in...
> >
> > Which is a long way of saying 'be careful' :)
> >
> > Warner
>
> That statement lasted all of a few hours.
>
> https://v4.freshbsd.org/commit/freebsd/src/342242


I had no idea this was going to happen so quickly...

Warner


Re: Support for tv_sec=-1 (one second before the epoch) timestamps?

2018-12-15 Thread Warner Losh
On Sat, Dec 15, 2018, 1:17 PM Mouse wrote:

> > Might I suggest that the obvious solution to this, and probably a
> > host of other issues, is to make time_t an always negative number
> > (negint/neglong?) and redefine the epoch as 03:14:09 UTC on Tuesday,
> > 19 January 2038,
>
> While it's academic as far as this thread is concerned, you can get
> much the same effect by making time_t a (positive) unsigned value and
> redefining the epoch to be 1901-12-13 20:45:54 UTC.
>

No, it is not. That breaks the naive conversion between seconds since 1970
and broken-down time that is specified in the standard (which has a fatal
flaw of its own, since it assigns no unique value to leap seconds,
pretending that they don't exist).
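The conversion in question is the "Seconds Since the Epoch" expression from POSIX. A small sketch of it, using a stand-in struct rather than struct tm so that it is self-contained (the struct and function names are illustrative), shows why a leap second gets no value of its own:

```c
#include <stdint.h>

/* Stand-in for the struct tm fields the POSIX expression uses. */
struct bdtime {
    int sec, min, hour; /* tm_sec, tm_min, tm_hour */
    int yday;           /* tm_yday: days since January 1 */
    int year;           /* tm_year: years since 1900 */
};

/* POSIX "Seconds Since the Epoch" expression, for years >= 1970.
 * The divisions are integer divisions, as in the standard. */
int64_t posix_seconds(const struct bdtime *t)
{
    return t->sec + t->min * 60 + t->hour * 3600
        + (int64_t)t->yday * 86400
        + (int64_t)(t->year - 70) * 31536000
        + (int64_t)((t->year - 69) / 4) * 86400
        - (int64_t)((t->year - 1) / 100) * 86400
        + (int64_t)((t->year + 299) / 400) * 86400;
}
```

Feeding it 2016-12-31 23:59:60 UTC (an actual leap second) and 2017-01-01 00:00:00 UTC yields the same number, 1483228800, for both. Any redefinition of the epoch has to keep this expression working, and it inherits the same blind spot.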

Warner


> But, if you're going to redefine the epoch, there are a whole lot of
> options available.
>
> /~\ The ASCII Mouse
> \ / Ribbon Campaign
>  X  Against HTML        mo...@rodents-montreal.org
> / \ Email!   7D C8 61 52 5D E7 2D 39  4E F1 31 3E E8 B3 27 4B
>


Re: Support for tv_sec=-1 (one second before the epoch) timestamps?

2018-12-13 Thread Warner Losh
On Thu, Dec 13, 2018, 9:41 AM 
>
> > On Dec 13, 2018, at 6:06 AM, Martin Husemann wrote:
> >
> > On Thu, Dec 13, 2018 at 03:29:03AM +, David Holland wrote:
> >> On Wed, Dec 12, 2018 at 10:27:04PM +0100, Joerg Sonnenberger wrote:
> >>> On Wed, Dec 12, 2018 at 08:46:33PM +0100, Michał Górny wrote:
> >>>> While researching libc++ test failures, I've discovered that NetBSD
> >>>> suffers from the same issue as FreeBSD -- that is, both the userspace
> >>>> tooling and the kernel have problems with (time_t)-1 timestamp,
> >>>> i.e. one second before the epoch.
> >>>
> >>> I see no reason why that should be valid or more general, why any
> >>> negative value of time_t is required to be valid.
> >>
> >> Are you Dan Pop? :-)
> >
> > Not sure about that, but I agree that we should not extend the range of
> > time_t (aka "seconds since the epoch") to negative values. It is a
> > pandora box, keep it closed.
> >
> > Martin
>
> You could certainly make that restriction.  On the other hand, the TZ
> project maintains timezone offset rules for times long before the epoch.
> Those time stamps, and the rules for processing them, are well defined.  At
> least until you get far enough back that Gregorian vs. Julian calendar
> becomes a consideration.
>

Doesn't matter. The meaning of negative time_t values is implementation
defined. Some implementations have it well defined, but there are problems
due to things like the return value of time(). Is it convenient for values
before the epoch to be well defined? Sure. But it's not required by the
standards. I would posit that makes the specific test that kicked off this
thread invalid.

Warner



Re: Support for tv_sec=-1 (one second before the epoch) timestamps?

2018-12-12 Thread Warner Losh
On Wed, Dec 12, 2018 at 2:34 PM Joerg Sonnenberger wrote:

> On Wed, Dec 12, 2018 at 08:46:33PM +0100, Michał Górny wrote:
> > While researching libc++ test failures, I've discovered that NetBSD
> > suffers from the same issue as FreeBSD -- that is, both the userspace
> > tooling and the kernel have problems with (time_t)-1 timestamp,
> > i.e. one second before the epoch.
>
> I see no reason why that should be valid or more general, why any
> negative value of time_t is required to be valid.
>

Does (time_t)-1 mean a date in 1969 (int)? Or does it mean a date in 2106
(uint32_t)? Or is it a date in 584942419325 (uint64_t)? I don't think that
POSIX actually says anything about the right answer. All I could find about
what time_t is was the vague statement: "time_t Used for time in seconds." and
the following from the time() call: 'Upon successful completion, time()
shall return the value of time. Otherwise, (time_t)-1 shall be returned.'
This strongly implies, to my mind, that -1 is not a valid time_t.
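The three readings can be checked with a few lines of arithmetic. This sketch uses 365-day years, which appears to be how the figures above were derived; the function names are illustrative.

```c
#include <stdint.h>

#define SECS_PER_365_DAY_YEAR 31536000ULL

/* Approximate calendar year for an unsigned count of seconds since 1970,
 * ignoring leap days. */
uint64_t approx_year_unsigned(uint64_t secs)
{
    return 1970 + secs / SECS_PER_365_DAY_YEAR;
}

/* The value -1 takes when stored in an unsigned 32-bit time_t. */
uint64_t minus_one_as_u32(void)
{
    return (uint32_t)-1; /* 4294967295 */
}

/* The value -1 takes when stored in an unsigned 64-bit time_t. */
uint64_t minus_one_as_u64(void)
{
    return (uint64_t)-1; /* 18446744073709551615 */
}
```

With a signed time_t, -1 is simply one second before the epoch, 1969-12-31 23:59:59 UTC. The unsigned readings come out as the years 2106 and 584942419325 respectively, so the same bit pattern names three very different moments depending on a type that POSIX leaves open.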

Warner


Re: Missing compat_43 stuff for netbsd32?

2018-09-12 Thread Warner Losh
On Tue, Sep 11, 2018, 5:48 PM Brad Spencer wrote:

> Eduardo Horvath writes:
>
> > On Tue, 11 Sep 2018, Paul Goyette wrote:
> >
> >> While working on the compat code, I noticed that there are a few old
> >> syscalls which are defined in sys/compat/netbsd32/syscalls.master
> >> with a type of COMPAT_43, yet there does not exist any compat_netbsd32
> >> implementation as far as I can see...
> >>
> >>  #64   ogetpagesize
> >>  #84   owait
> >>  #89   ogetdtablesize
> >>  #108  osigvec
> >>  #142  ogethostid (interestingly, there _is_ an implementation
> >>        for osethostid!)
> >>  #149  oquota
> >>
> >> Does any of this really matter?  Should we attempt to implement them?
> >
> > I believe COMPAT_43 is not NetBSD 4.3 it's BSD 4.3.  Anybody have any
> > old BSD 4.3 80386 binaries they still run?  Did BSD 4.3 run on an
> > 80386?  Did the 80386 even exist when Berkeley published BSD 4.3?
> >
> > It's probably only useful for running ancient SunOS 4.x binaries, maybe
> > Ultrix, Irix or OSF-1 depending on how closely they followed BSD 4.3.
> >
> > Eduardo
>
>
> It has been a very long time since I did this, and I may not remember
> correctly, but I believe that COMPAT_43 is needed on NetBSD/i386 to run
> BSDI binaries.  I remember using the BSDI Netscape 3.x binary back in
> the day and I think it was required.
>

FreeBSD does too... net2 was closer to 4.3 system calls for many things
than 4.4.

Warner



Re: Missing compat_43 stuff for netbsd32?

2018-09-12 Thread Warner Losh
On Tue, Sep 11, 2018, 4:38 PM Thor Lancelot Simon wrote:

> There can be a lot of value to being able to run really old executables,
> but you need the right customer in the right state of utter desperation...
>

I'm writing a COMPAT_V7 right now to celebrate Unix 50 next year. To be
difficult, this is really COMPAT_VENIX for an old 8088 v7 port that I have
some history with. There is a wait call there mentioned elsewhere in the
thread. I want it mostly so I can run the compiler on something fast...

Maybe not desperation, but at least a little crazy .. it's not clear even a
kernel module is the right path since qemu userland might be easier...

Warner



Re: new errno ?

2018-07-08 Thread Warner Losh
On Sat, Jul 7, 2018, 11:43 AM Jason Thorpe wrote:

>
>
> On Jul 6, 2018, at 2:49 PM, Eitan Adler wrote:
>
> For those interested in some of the history:
> https://lists.freebsd.org/pipermail/freebsd-hackers/2003-May/000791.html
>
>
> ...and the subsequent thread went just as I expected it might.  Sigh.
>
> Anyway... in what situations is this absurd error code used in the 802.11
> code?
>

ENOTTY is best for how 802.11 uses it.

Warner


> EFAULT seems wrong because it means something very specific.  Actually,
> that brings me to a bigger point... rather than having a generic error code
> for "lulz I could have panic'd here, heh", why not simply return an error
> code appropriate for the situation that would have otherwise resulted in
> calling panic()?  There are many to choose from :-)
>
> -- thorpej
>
>


Re: new errno ?

2018-07-07 Thread Warner Losh
On Fri, Jul 6, 2018, 2:10 PM Greg Troxel wrote:

>
> Phil Nelson writes:
>
> > Hello,
> >
> > In working on the 802.11 refresh, I ran into a new errno code from
> > FreeBSD:
> >
> > #define EDOOFUS 88  /* Programming error */
> >
> > Shall we add this one?  (Most likely with a different number since
> > 88 is taken in the NetBSD errno.h.)
> >
> >I could use EPROTO instead, but 
>
> My immediate reaction is not to add it. It's pretty clearly not in
> posix, unlikely to be added, and sounds unprofessional.


Poul-Henning added it to differentiate between args that are potentially
valid but not in this combination (EINVAL or EFAULT) and args that are
clearly programming errors (EDOOFUS), in code that couldn't just panic.

> It seems like it would be used in cases where there is a KASSERT in the
> non-DIAGNOSTIC case.  I might just map it to EFAULT or EINVAL.
>

Not a terrible choice.
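As a sketch of the distinction being drawn (the function, its arguments, and the fallback mapping are all hypothetical, not code from either tree): a bad value from the caller gets EINVAL, while a state that can only arise from a bug gets EDOOFUS where it exists and is mapped to EINVAL elsewhere.

```c
#include <errno.h>
#include <stddef.h>

/* ASSUMPTION: on systems without FreeBSD's EDOOFUS, map it to EINVAL,
 * as suggested in the thread. */
#ifndef EDOOFUS
#define EDOOFUS EINVAL
#endif

/* Hypothetical channel-selection routine, for illustration only.
 * Returns 0 on success or an errno value. */
int select_channel(const int *table, size_t nentries, size_t index, int *out)
{
    /* A missing table or output pointer cannot come from user input;
     * it means the calling code is broken.  A kernel might KASSERT here. */
    if (table == NULL || out == NULL)
        return EDOOFUS;

    /* An out-of-range index is a plausible but invalid request. */
    if (index >= nentries)
        return EINVAL;

    *out = table[index];
    return 0;
}
```

With the fallback in place the caller cannot tell the two cases apart on a system lacking EDOOFUS, which is the price of not adding a new errno.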

Warner



Re: QEMU/NetBSD status wiki page

2018-05-28 Thread Warner Losh
On Sun, May 27, 2018 at 11:57 AM, Kamil Rytarowski <n...@gmx.com> wrote:

> On 27.05.2018 16:53, Warner Losh wrote:
> >
> >
> > On Sun, May 27, 2018 at 4:05 AM, Kamil Rytarowski <n...@gmx.com> wrote:
> >
> > As requested, I've prepared a QEMU/NetBSD status page:
> >
> > http://wiki.netbsd.org/users/kamil/qemu/
> >
> > I've attempted to be rather conservative with claims that something
> > works, without detailed verification.
> >
> >
> > FreeBSD has a complete QEMU user-mode implementation in a branch right
> > now. It's sufficiently advanced we build all our arm, arm64 and mips
> > packages using it. What's in upstream QEMU is totally, totally broken.
> > The work breaks things down so the common BSD could be shared. Starting
> > from that base would be a huge leg up to getting things working.
> >
>
> Thank you for the feedback.
>
> I would like to stress that in my point of view - whether bluetooth or
> vde is 100% functional - doesn't really matter in the context of:
>
>  - user mode emulation
>  - hardware assisted virtualization
>  - virtio
>  - vhost
>  - device passthrough
>
> Once that will work well, getting this or that library for compression
> of images of GUI is a matter of packaging in tools.
>
> We can consider whether to collect the native kernel implementation of
> nbd from Bitrig, as it was required for at least a single ARM evaluation
> board in a bootstrap/booting process.
>
>
> The HQEMU project can be very useful for releng, as we can boost
> emulation of e.g. ARM by a factor of 3-20x on an amd64 host (exact boost
> times depend on the type of executed code), run the tests more quickly
> and save precious time and CPU cycles.


Haven't investigated that.


>
> > I'm in the process of getting it upstream. FreeBSD's branch is a royal
> > mess that has all the usual problems with a git branch that has lots of
> > merges applied: it had become almost impossible to rebase. I've sorted
> > most of that out, and am now sorting out collapsing down all the bug
> > fixes and/or qemu API changes that happened over the years so each
> > change in my branch is buildable. That should land this summer, maybe in
> > time for 3.0, but maybe not.
> >
>
> How close is this code to linux-user? I think that maintaining a concept
> of bsd-user in 2018 is obsolete, new code in one BSD can be closer to
> Linux or Solaris than other BSDs.
>

I'm not sure I follow this logic at all. The BSDs share a base that's quite
similar, even if new bits aren't similar. Have you looked at the code I'm
upstreaming? See the bsd-user branch in
https://github.com/seanbruno/qemu-bsd-user for details. It actually works
today, so it's not obsolete. It might be better not shared, but since that
doesn't exist today, I can't judge those efforts.


> Ideally we should go for [unix-]user shared between Linux and BSDs, add
> OS specific differences in dedicated {linux,freebsd,netbsd}-user,
> splitting NetBSD and FreeBSD.
>

I used to think that but no longer. There's a lot of code to deal with
threading and vm differences that insinuates itself into a lot of code. I'm
not so sure that sharing between Linux and anything else is really all that
sane, though there's some commonality. Without substantial changes in
upstream behavior, it will also result in lots of breakage as the code
velocity there is fast and often times the changes made are no good for BSD.


> For now please ignore NetBSD code in this upstreaming process.
>

I'm upstreaming exactly what we have, which moves the current netbsd/openbsd
to their own subdirectory of bsd-user. The code in upstream is currently
totally broken, and this won't break it any more. My efforts are to push up
the code we have today that works really really well and nothing further.
Any cross-bsd or pan-unix efforts will post-date my upstreaming since those
do not exist today.

Warner


Re: QEMU/NetBSD status wiki page

2018-05-28 Thread Warner Losh
On Sun, May 27, 2018 at 4:05 AM, Kamil Rytarowski wrote:

> As requested, I've prepared a QEMU/NetBSD status page:
>
> http://wiki.netbsd.org/users/kamil/qemu/
>
> I've attempted to be rather conservative with claims that something
> works, without detailed verification.
>

FreeBSD has a complete QEMU user-mode implementation in a branch right now.
It's sufficiently advanced we build all our arm, arm64 and mips packages
using it. What's in upstream QEMU is totally, totally broken. The work
breaks things down so the common BSD could be shared. Starting from that
base would be a huge leg up to getting things working.

I'm in the process of getting it upstream. FreeBSD's branch is a royal mess
that has all the usual problems with a git branch that has lots of merges
applied: it had become almost impossible to rebase. I've sorted most of
that out, and am now sorting out collapsing down all the bug fixes and/or
qemu API changes that happened over the years so each change in my branch
is buildable. That should land this summer, maybe in time for 3.0, but
maybe not.

Warner


Re: Kernel module framework status?

2018-05-05 Thread Warner Losh
On Sat, May 5, 2018, 4:17 AM  wrote:

> If someone wants to do this route of metadata, please consider the
> addition of a metadata property "should this be auto loaded".
>
> Currently we have ad-hoc logic for some modules that might be auto
> loaded (compat_...) and it'd probably be cleaner to do this.
>

In FreeBSD, I generally converted the ad hoc logic to tables and made sure
the metadata mini language was expressive enough to cope.

Warner



Re: Kernel module framework status?

2018-05-05 Thread Warner Losh
On Fri, May 4, 2018 at 12:32 AM, John Nemeth wrote:

> On May 3, 10:54pm, Mouse wrote:
> }
> } >  There is also the idea of having a module specify the device(s)
> } > it handles by vendor:product
> }
> } Isn't that rather restrictive in what buses it permits supporting?
>
>  I suppose that other types of identifiers could be used.
>
> } Indeed, PCI (and close relatives, like PCIe) and USB are the only
> } things I can name offhand that even _have_ vendor:product.  (Of course,
> } I'm sure there are lots of buses out there I've never heard of, or
> } don't know enough about.)
>
>  Only buses where the devices are identified would work.  For
> buses like ISA where you have to probe the devices, it would not
> be workable.
>

Don't forget that ISA buses have ISAPNP as an option, so it's more of a
mixed bus. But yeah, this can only work on self-enumerating, self-identifying
buses.


> }-- End of excerpt from Mouse
>

Warner


Re: Kernel module framework status?

2018-05-04 Thread Warner Losh
On Thu, May 3, 2018 at 8:54 PM, Mouse wrote:

> >  There is also the idea of having a module specify the device(s)
> > it handles by vendor:product
>
> Isn't that rather restrictive in what buses it permits supporting?
>
> Indeed, PCI (and close relatives, like PCIe) and USB are the only
> things I can name offhand that even _have_ vendor:product.  (Of course,
> I'm sure there are lots of buses out there I've never heard of, or
> don't know enough about.)
>

FreeBSD's modules have metadata. Some of this metadata can describe "plug
and play" tables the drivers use to match devices. FreeBSD's newbus has a
method to get the textual representation of this "plug and play" data.
Combined, I wrote devmatch to sort through the unattached devices matching
their plug and play data to modules to get a list of modules to load. I'll
be presenting a talk on this at BSDcan next month...
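The matching step can be pictured as a table lookup. The sketch below is not devmatch's code or data format; the struct, the table contents, and the module names are made up for illustration, and only the idea (match an unattached device's IDs against per-module tables to find what to load) comes from the description above.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical plug-and-play table entry exported by a module. */
struct pnp_entry {
    uint16_t vendor;
    uint16_t product;
    const char *module; /* module to load on a match */
};

/* Made-up IDs and module names, for illustration only. */
static const struct pnp_entry pnp_table[] = {
    { 0x1234, 0x0001, "if_example" },
    { 0x1234, 0x0002, "if_example" },
    { 0xabcd, 0x1000, "example_hba" },
};

/* Return the module claiming vendor:product, or NULL if none does. */
const char *match_module(uint16_t vendor, uint16_t product)
{
    size_t n = sizeof(pnp_table) / sizeof(pnp_table[0]);

    for (size_t i = 0; i < n; i++) {
        if (pnp_table[i].vendor == vendor &&
            pnp_table[i].product == product)
            return pnp_table[i].module;
    }
    return NULL;
}
```

A vendor:product pair is only one possible key. Buses that identify devices differently would need their own entry layout, which is presumably why a textual representation of the plug and play data is useful as a common denominator.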

Warner


Re: Spectre

2018-01-18 Thread Warner Losh
On Thu, Jan 18, 2018 at 7:58 AM,  wrote:

>
>
> > On Jan 18, 2018, at 9:48 AM, Mouse wrote:
> >
> >> Since this involves a speculative load that is legal from the
> >> hardware definition point of view (the load is done by kernel code),
> >> this isn't a hardware bug the way Meltdown is.
> >
> > Well, I'd say it's the same fundamental hardware bug as meltdown, but
> > not compounded by an additional hardware property (which I'm not sure I
> > would call a bug) which is made much worse by the actual bug.
> >
> > To my mind, the bug here is that annulling spec ex doesn't annul _all_
> > its effects.  That, fundamentally, is what's behind both spectre and
> > meltdown.  In meltdown it's exacerbated by spec ex's failure to check
> > permissions fully - but if the side effects were annulled correctly,
> > even that failure wouldn't cause trouble.
>
> That's true.  But the problem is that cache fill is only the most
> obvious and easiest to exploit side channel.  There are others, such
> as timing due to execution units being busy, that are harder to exploit
> but also harder to cure.  It seems to me that blocking all observable
> side effects of speculative execution can probably only be done by
> disabling speculative execution outright.  That clearly isn't a good
> thing.  The Spectre fixes all amount to a speculative barrier, which
> will do the job just as well (though it requires code change).  The
> Meltdown fix is more obvious: don't omit mode dependent access checks
> before launching a speculative load, as most CPU designers already did.
>

One difficulty with caches: You'd have to re-cache what you eject,
otherwise there's an observable effect. That's the whole point of this
family of attacks: the micro architecture does something that you can
observe that you'd normally not be able to observe. It's really really hard
to not leak any side-channel data at all. Side channel has become the new
buffer overflow.

Warner


Re: virtual to physical memory address translation

2018-01-15 Thread Warner Losh
On Mon, Jan 15, 2018 at 8:09 AM, John Nemeth  wrote:

> On Jan 15,  2:04pm, Michael van Elst wrote:
> } m...@netbsd.org (Emmanuel Dreyfus) writes:
> }
> } >Sorry if that has been covered ad nauseam, but I cannot find relevant
> } >information about that: on NetBSD, how can I get the physical memory
> } >address given a virtual memory address? This is to port the Linux
> } >Meltdown PoC so that we have something to test our systems against.
> }
> } pmap_extract() returns the physical address of a virtual address.
> } pmap_kernel() gives you the kernel map.
>
>  I suspect that he wants to do this from userland.
>

You have to walk the page tables, or trick some driver into leaking this
information somehow. There's no standard interface to get it.

In FreeBSD there's no standard interface, but that hasn't stopped people
from getting Meltdown and Spectre working, though I don't think they have
shared that PoC code.

Warner


Re: Reading a DDS tape with 2M blocks

2018-01-09 Thread Warner Losh
On Jan 9, 2018 3:59 PM, "Greg Troxel"  wrote:


Edgar Fuß  writes:

> I have a DDS tape (written on an IRIX machine) with 2M blocks.
> Any way to read this on a NetBSD machine?
> My memories of SCSI ILI handling on DDS are fuzzy. I remember you can
> operate these tapes in fixed or variable block size mode, where some
> values in the CDB either mean blocks or bytes. I thought in variable
> mode, you could read block sizes other than the (virtual) physical
> block size of the tape.

Did you try

dd if=/dev/rsd0d of=FILE bs=2m

or similar?  I believe that dd does reads of the given bs and these
reads are passed to the tape device driver which then does reads of that
size from the hardware, and that this then works fine.


Might need MAXPHYS of 2m too...

Warner


Re: Proposal to obsolete SYS_pipe

2017-12-25 Thread Warner Losh
On Dec 24, 2017 11:10 PM, "Robert Elz"  wrote:

Date:Sun, 24 Dec 2017 18:42:19 -0800
From:John Nemeth 
Message-ID:  <201712250242.vbp2gjjm017...@server.cornerstoneservice.ca>

  | HISTORY
  |  A pipe() function call appeared in Version 6 AT&T UNIX.

That I think would be a man page bug - pipe() was certainly in 5th
edition, but that is as far back as I go, so I am not sure when it
did appear - the syscall number suggests it was not in the very early
versions though (not 1st or 2nd edition probably.)


It is in the 3rd edition man pages, but is documented with only one return
code. The 4th edition manual looks very similar, but does have both values
documented. The source is fragmentary so it's hard to track down. 2nd
edition has no manuals, but no pipe in libc.

I just went through FreeBSD's system call man pages and corrected a number
of details like this...

Warner


Re: ext2fs superblock updates

2017-11-16 Thread Warner Losh
On Thu, Nov 16, 2017 at 12:12 PM, Mouse  wrote:

> >> They are generated by _newfs_ and left untouched thereafter.
> > Interesting, thanks.  what's so useful about the superblock at newfs
> > time?
>
> It contains enough information for fsck to find other critical things
> (like cylinder groups and their inode tables).  If the primary
> superblock has been destroyed but the rest of the filesystem is intact,
> fsck is supposed to be able to put the filesystem back together with the
> help of a backup superblock.


Yes. For UFS filesystems on BSD labeled disks, there are additional hints to
fsck about the size of different parts of the filesystem that allow it to
guess fairly well at the location of these alternate super blocks.

Warner


Re: FUA and TCQ

2016-09-26 Thread Warner Losh
On Mon, Sep 26, 2016 at 8:27 AM, Michael van Elst <mlel...@serpens.de> wrote:
> i...@bsdimp.com (Warner Losh) writes:
>
>>NVMe is even worse. There's one drive that w/o queueing I can barely
>>get 1GB/s out of. With queueing and multiple requests I can get the
>>spec sheet rated 3.6GB/s. Here queueing is critical for Netflix to get to
>>90-93Gbps that our 100Gbps boxes can do (though it is but one of
>>many things).
>
> Luckily the Samsung 950pro isn't of that type. Can you tell what
> NVMe devices (in particular in M.2 form factor) have that problem?

I've not used any m.2 devices. These tests were raw dd's of 128k I/Os
with one thread of execution, so no effective queueing at all. As
queueing gets involved, the performance increases dramatically as the
drive idle time drops substantially. I'd imagine most drives are like
this for the workload I was testing since you had to make a full
round-trip from the kernel to userland after the completion to get the
next I/O rather than having it already in the hardware... Unless
NetBSD's context switching is substantially faster than FreeBSD's, I'd
expect to see similar results there as well. Some cards do a little
better, but not by much... All cards do significantly better when
multiple transactions are scheduled simultaneously.

Just ran a couple of tests and found dd of 4k blocks gave me 160MB/s,
128k blocks gave me 600MB/s, 1M blocks gave me 636MB/s. Random
read/write with 64 jobs and an I/O depth of 128 with 128k random reads
with fio gave me 3.5GB/s. This particular drive is rated at 3.6GB/s.
This is for a HGST Ultrastar SN100. All numbers from FreeBSD. In
production, for unencrypted traffic, we see a similar number to the
deep queue fio test. While I've not tried on NetBSD, I'd be surprised
if you got significantly more than these numbers due to the round trip
to user land vs having the next request being present in the drive...
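
As a rough worked example of what those single-threaded figures imply, the
time per I/O is just block size divided by throughput. This sketch assumes
"MB/s" means 10^6 bytes per second and "k" means 1024 bytes; neither is stated
in the message, and the helper name is made up for illustration.

```python
RATED_MB_S = 3600  # spec-sheet rating quoted above (3.6GB/s)


def per_io_us(block_bytes, mb_per_s):
    """Microseconds spent per I/O when only one request is ever outstanding."""
    # bytes / (mb_per_s * 10^6 bytes/s) gives seconds; multiplying by 10^6 to
    # get microseconds cancels the 10^6, leaving a plain division.
    return block_bytes / mb_per_s


# (block size in bytes, measured MB/s) taken from the dd runs above.
measurements = [(4 * 1024, 160), (128 * 1024, 600), (1024 * 1024, 636)]

for block, rate in measurements:
    print(f"{block // 1024:5d}k: {per_io_us(block, rate):8.1f} us per I/O, "
          f"{100 * rate / RATED_MB_S:4.1f}% of rated bandwidth")
```

Under those assumptions the 4k case works out to about 26 microseconds per
I/O, and even 1M blocks stay under a fifth of the rated bandwidth, which is
consistent with the drive sitting idle between requests when nothing is
queued.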

Warner


Re: Plan: journalling fixes for WAPBL

2016-09-24 Thread Warner Losh
On Sat, Sep 24, 2016 at 2:01 AM, David Holland  wrote:
> On Fri, Sep 23, 2016 at 07:51:32PM +0200, Manuel Bouyer wrote:
>  > > > *if you have the write cache disabled*
>  > >
>  > > *Running with the write cache enabled is a bad idea*
>  >
>  > On ATA devices, you can't permanently disable the write cache. You have
>  > to do it on every power cycle.
>
> There are also drives that ignore attempts to turn off write caching.

These drives lie to the host and say that caching is off, when it
really is still on, right?

Warner


Re: Plan: journalling fixes for WAPBL

2016-09-23 Thread Warner Losh
On Fri, Sep 23, 2016 at 11:54 AM, Warner Losh <i...@bsdimp.com> wrote:
> On Fri, Sep 23, 2016 at 11:20 AM, Thor Lancelot Simon <t...@panix.com> wrote:
>> On Fri, Sep 23, 2016 at 05:15:16PM +, Eric Haszlakiewicz wrote:
>>> On September 23, 2016 10:51:30 AM EDT, Warner Losh <i...@bsdimp.com> wrote:
>>> >All NCQ gives you is the ability to schedule multiple requests and
>>> >to get notification of their completion (perhaps out of order). There's
>>> >no coherency features at all in NCQ.
>>>
>>> This seems like the key thing needed to avoid FUA: to implement fsync() you 
>>> just wait for notifications of completion to be received, and once you have 
>>> those for all requests pending when fsync was called, or started as part of 
>>> the fsync, then you're done.
>>
>> The other key point is that -- unless SATA NCQ is radically different from
>> SCSI tagged queuing in a particularly stupid way -- the rules require all
>> "simple" tags to be completed before any "ordered" tag is completed.  That
>> is, ordered tags are barriers against all simple tags.
>
> SATA NCQ doesn't have ordered tags. There's just 32 slots to send
> requests into. Don't allow the word 'tag' to confuse you into thinking
> it is anything at all like SCSI tags. You get ordering by not
> scheduling anything until after the queue has drained when you send
> your "ordered" command. It is that stupid.

And it can be even worse, since if the 'ordered' item must complete
after all before it, you have to drain the queue before you can even
send it to the drive. Depends on what the ordering guarantees you want
are...

Warner


Re: Plan: journalling fixes for WAPBL

2016-09-23 Thread Warner Losh
On Fri, Sep 23, 2016 at 11:20 AM, Thor Lancelot Simon <t...@panix.com> wrote:
> On Fri, Sep 23, 2016 at 05:15:16PM +, Eric Haszlakiewicz wrote:
>> On September 23, 2016 10:51:30 AM EDT, Warner Losh <i...@bsdimp.com> wrote:
>> >All NCQ gives you is the ability to schedule multiple requests and
>> >to get notification of their completion (perhaps out of order). There's
>> >no coherency features at all in NCQ.
>>
>> This seems like the key thing needed to avoid FUA: to implement fsync() you 
>> just wait for notifications of completion to be received, and once you have 
>> those for all requests pending when fsync was called, or started as part of 
>> the fsync, then you're done.
>
> The other key point is that -- unless SATA NCQ is radically different from
> SCSI tagged queuing in a particularly stupid way -- the rules require all
> "simple" tags to be completed before any "ordered" tag is completed.  That is,
> ordered tags are barriers against all simple tags.

SATA NCQ doesn't have ordered tags. There's just 32 slots to send
requests into. Don't allow the word 'tag' to confuse you into thinking
it is anything at all like SCSI tags. You get ordering by not
scheduling anything until after the queue has drained when you send
your "ordered" command. It is that stupid.

Warner


Re: FUA and TCQ

2016-09-23 Thread Warner Losh
On Fri, Sep 23, 2016 at 8:05 AM, Thor Lancelot Simon  wrote:
> Our storage stack's inability to use tags with SATA targets is a huge
> gating factor for performance with real workloads (the residual use of
> the kernel lock at and below the bufq layer is another).

FreeBSD's storage stack does support NCQ. When that's artificially
turned off, performance drops on a certain brand of SSDs from about
500-550MB/s for large reads down to 200-300MB/s depending on
too many factors to go into here. It helps a lot for workloads and is
critical for Netflix to get 36-38Gbps rate from our 40Gbps systems.

> Starting de
> novo with NVMe, where it's perverse and structurally difficult to not
> support multiple commands in flight simultaneously, will help some, but
> SATA SSDs are going to be around for a long time still and it'd be
> great if this limitation went away.

NVMe is even worse. There's one drive that w/o queueing I can barely
get 1GB/s out of. With queueing and multiple requests I can get the
spec sheet rated 3.6GB/s. Here queueing is critical for Netflix to get to
90-93Gbps that our 100Gbps boxes can do (though it is but one of
many things).

> That said, I am not going to fix it myself so all I can do is sit here
> and pontificate -- which is worth about what you paid for it, and no
> more.

Yea, I'm just a FreeBSD guy lurking here.

Warner


Re: Plan: journalling fixes for WAPBL

2016-09-23 Thread Warner Losh
On Fri, Sep 23, 2016 at 7:38 AM, Thor Lancelot Simon  wrote:
> On Fri, Sep 23, 2016 at 11:47:24AM +0200, Manuel Bouyer wrote:
>> On Thu, Sep 22, 2016 at 09:33:18PM -0400, Thor Lancelot Simon wrote:
>> > > AFAIK ordered tags only guarantees that the write will happen in order,
>> > > but not that the writes are actually done to stable storage.
>> >
>> > The target's not allowed to report the command complete unless the data
>> > are on stable storage, except if you have write cache enable set in the
>> > relevant mode page.
>> >
>> > If you run SCSI drives like that, you're playing with fire.  Expect to get
>> > burned.  The whole point of tagged queueing is to let you *not* set that
>> > bit in the mode pages and still get good performance.
>>
>> Now I remember that I did indeed disable disk write cache when I had
>> scsi disks in production. It's been a while though.
>>
>> But anyway, from what I remember you still need the disk cache flush
>> operation for SATA, even with NCQ. It's not equivalent to the SCSI tags.

All NCQ gives you is the ability to schedule multiple requests and
to get notification of their completion (perhaps out of order). There's
no coherency features at all in NCQ.

> I think that's true only if you're running with write cache enabled; but
> the difference is that most ATA disks ship with it turned on by default.
>
> With an aggressive implementation of tag management on the host side,
> there should be no performance benefit from unconditionally enabling
> the write cache -- all the available cache should be used to stage
> writes for pending tags.  Sometimes it works.

You don't need to flush all the writes, but do need to take special care
if you need more coherent semantics, which often is a small minority
of the writes, so I would agree the effect can be mostly mitigated. Not
completely since any coherency point has to drain the queue completely.
The cache drain ops are non-NCQ, and to send non-NCQ requests
no NCQ requests can be pending. TRIM[*] commands are the same way.
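
The drain requirement can be pictured with a small model. This is a sketch
under simplifying assumptions, not any driver's real scheduler: commands are
split into phases, every command in a phase may be in flight together, and a
non-NCQ command such as a cache flush always gets a phase to itself. The
32-slot limit is ignored, and the command names are made up.

```python
def phases(commands):
    """Split commands into phases separated by queue drains.

    `commands` is a list of (name, is_non_ncq) pairs in submission order.
    A non-NCQ command (e.g. a cache flush) may only be sent once nothing is
    pending, and nothing else may be sent until it completes, so it always
    ends up alone in its own phase.
    """
    result, current = [], []
    for name, is_non_ncq in commands:
        if is_non_ncq:
            if current:
                result.append(current)  # wait for the queue to drain
                current = []
            result.append([name])       # the non-NCQ command runs alone
        else:
            current.append(name)        # NCQ commands can overlap freely
    if current:
        result.append(current)
    return result


workload = [
    ("write-1", False), ("write-2", False), ("write-3", False),
    ("flush", True),
    ("write-4", False), ("write-5", False),
]
print(phases(workload))
```

Each phase boundary is a point where the drive's queue is empty, which is
where the throughput cost comes from: the more often a coherency point is
needed, the more often the queue depth falls to zero.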

Warner

[*] There is an NCQ version of TRIM, but it requires the AUX register
to be sent and very few SATA host controllers support that (though
AHCI does, many of the LSI controllers don't in any performant way).


Re: Proposal for kernel clock changes

2014-04-02 Thread Warner Losh

On Apr 1, 2014, at 1:50 PM, David Laight da...@l8s.co.uk wrote:

 This may mean that you can (effectively) count the ticks on all your
 clocks since 'boot' and then scale the frequency of each to give the
 same 'time since boot' - even though that will slightly change the
 relationship between old timestamps taken on different clocks.
 Possibly you do need a small offset for each clock to avoid
 discrepancies in the 'current time' when you recalculate the clock's
 frequency.

If the underlying clock moves in frequency, you need to have both a
scale on the frequency, and a time to count adjustment as well. Otherwise
on long-running systems you accumulate a fair amount of error. It doesn’t
take much more than 1ppm of error to accumulate a second of error in 10
days if you don’t have ‘on time’ marks that integrate all of time up to that
point. Then the error in phase will be related to the time since last phase
sync, rather than since time of boot.
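
The arithmetic behind that figure is short enough to spell out. This is only
a worked example of a constant frequency error integrated over time; the
function names are illustrative.

```python
SECONDS_PER_DAY = 86_400


def accumulated_error_s(ppm, days):
    """Phase error in seconds from a constant frequency error of `ppm`."""
    return ppm * 1e-6 * days * SECONDS_PER_DAY


def ppm_for_error(error_s, days):
    """Frequency error in ppm that accumulates `error_s` seconds in `days`."""
    return error_s / (days * SECONDS_PER_DAY) * 1e6


# Counting from boot: 1 ppm left uncorrected for 10 days.
print(f"{accumulated_error_s(1, 10):.3f} s")
# Frequency error needed to reach a full second in the same 10 days.
print(f"{ppm_for_error(1, 10):.3f} ppm")
# With an 'on time' mark once a day, only one day of drift can accumulate.
print(f"{accumulated_error_s(1, 1):.4f} s")
```

A 1 ppm error over 10 days comes to 0.864 seconds, so a full second needs
about 1.16 ppm. With a phase mark once a day the same 1 ppm is bounded at
0.0864 seconds, which is the difference between error growing since boot and
error growing since the last phase sync.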

Warner



Re: asymmetric smp

2014-04-01 Thread Warner Losh

On Apr 1, 2014, at 5:49 AM, Johnny Billquist b...@softjar.se wrote:

 Good points.
 Is this the right time to ask why booting NetBSD on a VAX (a 3500) now takes 
 more than 15 minutes? What is the system doing all that time???

FreeBSD used to take forever to boot on certain low-end ARM CPUs with /etc/rc.d 
after it was imported from NetBSD. This was due to crappy root-device 
performance (100kB/s is enough for anybody, right?) and crappy, at the time, 
pmap code that caused excess page traffic in the /etc/rc.d environment. Perhaps 
those areas would be fruitful to profile? Also, there were some inefficiencies 
that were either the result of a botched port, or were basic to the system that 
got fixed. Between fixing all these things, the boot time went from 10 minutes 
down to ~20s.

Warner



Re: hf/sf [Was Re: CVS commit: pkgsrc/misc/raspberrypi-userland]

2013-11-11 Thread Warner Losh

On Nov 11, 2013, at 4:31 PM, Justin Cormack wrote:

 On Mon, Nov 11, 2013 at 10:56 PM, Michael van Elst mlel...@serpens.de wrote:
 m...@3am-software.com (Matt Thomas) writes:
 
 Exactly.  with hf, floating point values are passed in floating point
 registers.  That can not be hidden via a library (this works on x86
 since the stack has all the arguments).
 
 It could be hidden by emulating the floating point hardware.
 
 That's not sane. The slowdown would be enormous. You are emulating
 registers as well as operations.

Is there a complete write up of the conventions here?

Warner



Re: MACHINE_ARCH on NetBSD/evbearmv6hf-el current

2013-11-05 Thread Warner Losh

On Oct 26, 2013, at 12:24 PM, Alistair Crooks wrote:

 On Sat, Oct 26, 2013 at 11:10:52AM -0700, Matt Thomas wrote:
 
 On Oct 26, 2013, at 10:54 AM, Izumi Tsutsui tsut...@ceres.dti.ne.jp wrote:
 
 By static MACHINE_ARCH, or dynamic sysctl(3)?
 If dynamic sysctl(3) is preferred, which node?
 
 hw.machine_arch
 
 which has been defined for a long long time.
 
 Yes, defined before sf vs hf issue arose, and
 you have changed the definition (i.e. make it dynamic)
 without public discussion.  That's the problem.
 
 It was already dynamic (it changes for compat_netbsd32).
 
 Whether or when it's dynamic or not, it would be great if you could
 fix it so that binary packages can be used.
 
 And Tsutsui-san is right - public discussion needs to take place, and
 consumers made aware, before these kind of changes are made.

I don't see any further emails on this thread. Was there ever a resolution, or 
just crickets?

Warner



Re: pulse-per-second API status

2013-11-05 Thread Warner Losh

On Nov 1, 2013, at 12:19 PM, paul_kon...@dell.com wrote:

 
 On Nov 1, 2013, at 2:04 PM, Mouse mo...@rodents-montreal.org wrote:
 
 ...
 But it still may not work in the sense of living up to the expectations
 people have come to have for PPS on serial ports.
 
 My worry is not that it's not the best time available in some
 circumstances.  My worry is that putting it into the tree will lead to
 its getting used as if it were as good as PPS on anything else, leading
 both to timeservers that claim stratum 1 but give bad chime and to
 people blaming NetBSD for its crappy PPS support when the real problem
 is that they don't understand the USB issues and it _looks_ like any
 other PPS support until you test the resulting time carefully.
 
 Not just PPS on serial ports, but PPS on other hardware.
 
 I don't know this API.  But my first reaction when I saw the designation 
 PPS is to think of GPS timekeeping boxes and other precision frequency 
 sources that have a PPS output.  On those devices, the PPS output is divided 
 down from the main oscillator frequency, i.e., you can expect accuracies of 
 10^-9 for modest price crystal oscillators, 10^-10 to 10^-12 for higher end 
 stuff -- and jitter in the nanosecond range or better.
 
 It seems rather confusing to have another interface that goes by the same 
 name but has specs 6 or more orders of magnitude worse.  How about a 
 different name that avoids this confusion?

Just because the signal has an Allan variance of 10^-10 doesn't mean that
you'll be able to measure each pulse with that precision, or that the tau of 
that figure is 1s... Most common time counter hardware in SoCs and the like is 
good to anywhere from hundreds of microseconds to tens of nanoseconds.
Hundreds of microseconds isn't much worse than the millisecondish USB accuracy. 
The PPS API even allows for an estimate of the accuracy of the measurements, 
IIRC, but that may be a higher-level facility of NTP (it has been a few years 
since I've done this stuff professionally). I don't think there will be any 
confusion at all, especially if the measured accuracy and variance of this 
facility is documented.

1ms is quite accurate enough for NTP though. NTP has trouble on the network 
getting below 1ms of accuracy, especially when there are any hops at all in the 
topology. It won't be the best NTP server in the world, but it will be accurate 
enough for most things. If you need more accuracy, get better hardware.

To those saying 'fix NMEA mode to be better': You can't. The characters that 
spit this code out aren't guaranteed to be at top of second any more than 
approximately... The exact timing varies from receiver to receiver, and if USB
is involved, the same silly delays are present there too, only worse because 
the message spans USB packets (or likely would since it is just short of 100 
characters long IIRC)... And even if you get those issues out of the way, I 
also believe there's ambiguity in the NMEA standard between the 'on time' point 
for the NMEA messages. Is it the start of the message, the end? Is it the first
transition of the first bit of the message, or the end of the first character?  
Since it isn't considered a precision signal, nobody times it exactly (or 
didn't a few years ago). It is useful, at best, for knowing what time the 
external PPS is about to be or just was...

So adding support to ucom isn't a horrible idea, as long as expectations are 
managed...

Warner

Re: NetBSD port for AT91SAM9G20?

2012-09-05 Thread Warner Losh

On Sep 5, 2012, at 12:43 AM, Jukka Marin wrote:

 Hi,
 
 I have asked this before, but got no replies.  We are making AT91SAM9G20
 based hardware and I would love to run NetBSD on it.  However, I can't
 find the time to port NetBSD to this MCU and hardware.  Is there anyone
 with some spare time and interest in this kind of a project?  I could
 provide the hardware and documentation required.  I might even sponsor
 some $'s to the NetBSD project if I could run my favourite OS on our
 hardware.
 
 The main features of our current hardware are:
 - AT91SAM9G20 MCU (400 MHz)
 - 64 MB RAM
 - 128 MB NAND FLASH
 - 8 MB NOR FLASH
 - hardware watchdog
 - RTC with battery backup
 - 4 x 10/100 Mbps Ethernet (with a switch)
 - 3 x RS232
 - 2 x RS485
 - 3G GSM modem
 - digital inputs (opto isolated)
 - relay outputs
 - LEDs
 - USB host / device ports
 - expansion slot
 - power supply 9...30 VDC
 - 19" rack mount case (or a smaller metal case)

Apart from a few clocks, this should work with the AT91SAM9260 support that's 
in the tree.  The device tables/trees are the same, and the errata for the 
devices are quite similar.

Warner

Re: NetBSD port for AT91SAM9G20?

2012-09-05 Thread Warner Losh

On Sep 5, 2012, at 11:50 AM, vinc...@labri.fr wrote:

 Warner Losh i...@bsdimp.com writes:
 
 On Sep 5, 2012, at 12:43 AM, Jukka Marin wrote:
 
 The main features of our current hardware are:
 - AT91SAM9G20 MCU (400 MHz)
 [...]
 
 Apart from a few clocks, this should work with the AT91SAM9260 support
 that's in the tree.  The device tables/trees are the same, and the
 errata for the devices are quite similar.
 
 Yes, maybe you'll have to add the CPU id to at91/at91dbgu* for it to be
 recognized correctly but that should be it.

You might want to look at FreeBSD's cpu identification.  I think I have all the 
SAM9 CPUs plus the RM9200 accounted for, plus autoprobing for the dbgu unit, 
which differs from SoC to SoC.

 However, be warned that the ethernet MAC will not work because the code
 currently intree is for another type of AT91 processor which has
 sufficiently different characteristics wrt number of RX and TX buffers
 and maximum sizes of these, although it will probably attach correctly
 and might be able to send short packets by chance.

I'd forgotten that detail.  FreeBSD's driver copes with both (plus there's a 
driver that just talks to the new hardware).

 I started a rewrite of it for the at91sam9260 but ran out of time. If
 other developers want smaller hardware than what Jukka offers to
 experiment, Propox sells small cards with an option for a at91sam9g20
 CPU.
 
 http://www.propox.com/products/t_232.html
 http://www.propox.com/products/t_231.html

Couldn't figure out how to buy these in the US.  They are about US$110 if I read
things right.  I recently got a nice little board from Glomation that ships 
quickly and is cheap (starting at US$55 for the low end up to about $95 for the 
high end in Q1 lot sizes).

http://www.glomationinc.com/products.html

The GESBC-9G20u is the $55 one.  You have to write for a pricelist for the 
other boards, but they are very responsive to email.

Warner

Re: Path to kernel modules (second attempt)

2012-07-08 Thread Warner Losh

On Jul 8, 2012, at 10:20 AM, Matthew Mondor wrote:

 On Sun, 8 Jul 2012 17:57:00 +0200
 Edgar Fuß e...@math.uni-bonn.de wrote:
 
 Please not /kernel as it was already mentioned, it is too similar to
 /kern.
 What about /netbsd? E.g. /netbsd/6.0_BETA/{modules,kernel,firmware}.
 
 /netbsd/amd64/6.0/GENERIC/{modules,kernel,firmware} :) ?

One more note about FreeBSD's structure.  In addition to looking in 
/boot/$KERNNAME, it will also look in /boot/modules.  This is done so that you 
can have multiple different kernels of the same version, that might use 
different internal KBIs that 3rd party drivers don't use.  You can install your 
3rd party driver into /boot/modules and load it with successive kernels (we 
move /boot/kernel to /boot/kernel.old before recreating the /boot/kernel to 
install the new kernel and modules).  This works well for similar versions (eg 
9.0, 9.1), but works less well with 8.x-9.x.
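
A minimal sketch of the lookup described above, not FreeBSD's actual loader
code. The search order (the booted kernel's directory first, then
/boot/modules) and the `.ko` suffix handling are assumptions made for the
example, and the `exists` parameter is there only so the logic can be
exercised without touching a real /boot.

```python
import os
import posixpath


def module_search_path(kernname="kernel"):
    """Directories to search, in the order assumed for this sketch."""
    return [posixpath.join("/boot", kernname), "/boot/modules"]


def find_module(name, kernname="kernel", exists=os.path.exists):
    """Return the first path where module `name` is found, or None."""
    for directory in module_search_path(kernname):
        candidate = posixpath.join(directory, name + ".ko")
        if exists(candidate):
            return candidate
    return None


# Pretend filesystem: one module shipped with an old kernel, and one
# third-party module installed outside any kernel directory.
fake_files = {"/boot/kernel.old/foo.ko", "/boot/modules/thirdparty.ko"}

for kern in ("kernel", "kernel.old"):
    print(kern, find_module("thirdparty", kern, fake_files.__contains__))
```

Because /boot/modules sits outside every kernel directory, the third-party
module is found whichever kernel is booted, and it survives the move of
/boot/kernel to /boot/kernel.old.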

 But can the kernel easily detect that its image was booted in a
 particular directory, and use that as base directory to look for
 modules?  Also, how more complex would this be for the bootloader that
 also needs to preload a few modules to be able to boot?

FreeBSD's boot loader passes this in...

Warner



Re: Path to kernel modules (second attempt)

2012-07-07 Thread Warner Losh

On Jul 7, 2012, at 4:17 PM, Matthew Mondor wrote:

 On Sat, 07 Jul 2012 22:46:50 +0200
 Jean-Yves Migeon jeanyves.mig...@free.fr wrote:
 
 On 07.07.2012 21:57, Mindaugas Rasiukevicius wrote:
 Hello,
 
 Regarding the PR/38724, I propose to change the path to /kernel/.
 Can we reach some consensus quickly for netbsd-6?
 
 /kernel is way too close to /kern, and they serve different purposes.
 IMHO that will raise confusion.
 
 Perhaps /kmod, or /modules like dholland suggests?
 
 Technically modules are not libraries, but maybe /libdata/module is a
 good option? We already have firmwares in /libdata/firmware, and those
 get used by the kernel.
 
 That also makes sense

But it kinda fails with multiple kernels.  On FreeBSD, we went with 
/boot/$KERNNAME/kernel for the kernel, with all the modules associated with it 
in /boot/$KERNNAME. By default, we load /boot/kernel/kernel and the loader may 
also choose to load other things.  The reason we put it in /boot was because we 
have a secondary boot loader (/boot/loader) and on some platforms we were 
looking at you needed a separate boot partition to do things correctly.  This
layout allows for that as well as transparently supporting multiple kernels.  I 
know on one of my MIPS boards, I can read kernels or the boot loader off of FAT 
partitions, so my /boot there is a FAT file system, with the rest of the system 
in a UFS file system on separate partitions/slices on my CF.

Just something to think about before you go stuffing it into /libdata/module or
something...

Warner