Re: BIRD 1.x/2.x support at rpki-client

2020-03-02 Thread Theo de Raadt
More on this:

> However, I'm not sure if the current options -B, -c, -j and -o are that
> great. Maybe something like "-o " would be
> more powerful and more flexible?

btw, in almost all other commmands -o (along with -f) indicates an filename,
not a format.

So that isn't the letter you want.

> To cover this, job@ suggested to maybe generate bird1-ipv6, bird1-ipv4 and
> bird2 when using -B option. The option currently leads to "bird" file with
> BIRD 1.x support only.

As written today with mkstemp + unveil/pledge, each output format
function can only output one file.  Using one format name to create
multiple outputs is going to require some restructuring.



Re: BIRD 1.x/2.x support at rpki-client

2020-03-02 Thread Theo de Raadt
> job@ suggested to move this from GitHub to tech@ list (as upstream):
> 
> 1. Currently, BIRD 1.x support in rpki-client seems to be broken: As per
>BIRD upstream the "combined format" produced by rpki-client can't be
>used as-is with BIRD 1.x due to separated daemons (and configuration
>files) for IPv4 and IPv6.
> 2. Lack of BIRD 2.x support in rpki-client, which requires a different
>output/configuration format (semi-finished pull request at GitHub).
> 
> To cover this, job@ suggested to maybe generate bird1-ipv6, bird1-ipv4 and
> bird2 when using -B option. The option currently leads to "bird" file with
> BIRD 1.x support only.

Can't we assume bird1 will go away eventually?

Or why don't those people convert json to their required format -- make
it their problem?

> However, I'm not sure if the current options -B, -c, -j and -o are that
> great. Maybe something like "-o " would be
> more powerful and more flexible?

The idea is you can specify many outputs.  That will make the commandline
very long, especially for the way we run it in cron.





BIRD 1.x/2.x support at rpki-client

2020-03-02 Thread Robert Scheck
Hi,

job@ suggested to move this from GitHub to tech@ list (as upstream):

1. Currently, BIRD 1.x support in rpki-client seems to be broken: As per
   BIRD upstream the "combined format" produced by rpki-client can't be
   used as-is with BIRD 1.x due to separated daemons (and configuration
   files) for IPv4 and IPv6.
2. Lack of BIRD 2.x support in rpki-client, which requires a different
   output/configuration format (semi-finished pull request at GitHub).

To cover this, job@ suggested to maybe generate bird1-ipv6, bird1-ipv4 and
bird2 when using -B option. The option currently leads to "bird" file with
BIRD 1.x support only.

However, I'm not sure if the current options -B, -c, -j and -o are that
great. Maybe something like "-o " would be
more powerful and more flexible?

Opinions?


Regards,
  Robert



Re: Fix use of WITNESS_UNLOCK() in rw_exit_{read,write}()

2020-03-02 Thread Mateusz Guzik
Oops, sorry for the mixup below. I got the e-mail bounced from someone
and it used their 'From' instead of the original. Regardless,
technical contend stands. :)

On 3/2/20, Mateusz Guzik  wrote:
> On 2/29/20, Visa Hankala  wrote:
>> There is a bug in how rw_exit_read() and rw_exit_write() use
>> WITNESS_UNLOCK(). The fast paths call WITNESS_UNLOCK() after releasing
>> the actual lock. This leads to a use-after-free in the checker if the
>> lock is dynamically allocated and another thread happens to free it
>> too early.
>>
>> The following diff splits rw_exit() into two separate functions, one
>> which is the rw_exit(9) frontend, and the other, rw_do_exit(), which is
>> common between rw_exit(9), rw_exit_read(9) and rw_exit_write(9). This
>> makes it possible to call WITNESS_UNLOCK() before the actual lock
>> release without turning the code into an #ifdef maze.
>>
>> The diff additionally does the following tweaks:
>>
>> * Invoke membar_exit_before_atomic() only once even if the slow path is
>>   taken. The initial barrier is enough for the release semantics.
>>
>> * Read rwl_owner for rw_cas() as late as possible so that the CAS
>>   operation would be less likely to fail. Note that in rw_exit() the
>>   purpose of the read is different; it determines the type of the lock
>>   (read / write).
>>
>> * Put the rw_cas() in rw_exit() (now rw_do_exit()) inside
>>   __predict_false(). This compacts the resulting machine code a bit.
>>
>> The code could be simplified by removing the fast path from
>> rw_exit_read() and rw_exit_write(), and always calling rw_do_exit().
>> However, that would reduce performance a tiny amount. I saw about 0.5%
>> increase in kernel build time in a test setup. The patch below does
>> not seem to affect performance negatively.
>>
>> OK?
>
> If the slowdown is repeatable it most likely happens because of the
> immediate re-read in the loop in the slow path.
>
> Also note performance bugs in unlock:
> - *any* read unlock wakes up everyone, so if the lock has many blocked
> waiters they will all wake up only be likely to go back to sleep. On
> the other hand this may be an anomaly which sometimes improves
> performance because it in corner cases it can mitigate lack of
> adaptive spinning
> - the loop keeps re-reading the word instead of using the value
> returned by cmpxchg. so happens rw_cas macro explicitly throws it away
>
> Splitting unlock paths between reader and writer cases is unavoidable
> in the long term (to allow handling different wake up policies) and I
> think this is a great opportunity to do it all the way.
>
> That said, I propose you do roughly this:
>
> static void __always_inline
> rw_exit_read_impl(struct rwlock *rwl, unsigned long v)
> {
>
> rw_assert_rdlock(rwl);
> WITNESS_UNLOCK(>rwl_lock_obj, 0);
>
>  cas loop + wake up come here
> }
>
> void
> rw_exit_read(struct rwlock *rwl)
> {
>
> membar_exit_before_atomic();
> rw_exit_read_impl(rwl, rwl->rwl_owner);
> }
>
> void
> rw_exit(struct rwlock *rwl)
> {
> unsigned long v;
>
> membar_exit_before_atomic();
> v = rwl->rwl_owner;
> if (v & RWLOCK_WRLOCK)
> rw_exit_write_impl(rwl, v);
> else
> rw_exit_read_impl(rwl, v);
> }
>
> And similarly for write.
>
> That is, the bottom routines assume the fence was posted and they got
> a "fresh" value.
>
>>
>> Index: kern/kern_rwlock.c
>> ===
>> RCS file: src/sys/kern/kern_rwlock.c,v
>> retrieving revision 1.44
>> diff -u -p -r1.44 kern_rwlock.c
>> --- kern/kern_rwlock.c   30 Nov 2019 11:19:17 -  1.44
>> +++ kern/kern_rwlock.c   29 Feb 2020 16:24:59 -
>> @@ -25,6 +25,8 @@
>>  #include 
>>  #include 
>>
>> +voidrw_do_exit(struct rwlock *, unsigned long);
>> +
>>  /* XXX - temporary measure until proc0 is properly aligned */
>>  #define RW_PROC(p) (((long)p) & ~RWLOCK_MASK)
>>
>> @@ -129,31 +131,31 @@ rw_enter_write(struct rwlock *rwl)
>>  void
>>  rw_exit_read(struct rwlock *rwl)
>>  {
>> -unsigned long owner = rwl->rwl_owner;
>> +unsigned long owner;
>>
>>  rw_assert_rdlock(rwl);
>> +WITNESS_UNLOCK(>rwl_lock_obj, 0);
>>
>>  membar_exit_before_atomic();
>> +owner = rwl->rwl_owner;
>>  if (__predict_false((owner & RWLOCK_WAIT) ||
>>  rw_cas(>rwl_owner, owner, owner - RWLOCK_READ_INCR)))
>> -rw_exit(rwl);
>> -else
>> -WITNESS_UNLOCK(>rwl_lock_obj, 0);
>> +rw_do_exit(rwl, 0);
>>  }
>>
>>  void
>>  rw_exit_write(struct rwlock *rwl)
>>  {
>> -unsigned long owner = rwl->rwl_owner;
>> +unsigned long owner;
>>
>>  rw_assert_wrlock(rwl);
>> +WITNESS_UNLOCK(>rwl_lock_obj, LOP_EXCLUSIVE);
>>
>>  membar_exit_before_atomic();
>> +owner = rwl->rwl_owner;
>>  if (__predict_false((owner & RWLOCK_WAIT) ||
>>  rw_cas(>rwl_owner, owner, 0)))
>> -rw_exit(rwl);
>> -

Re: Fix use of WITNESS_UNLOCK() in rw_exit_{read,write}()

2020-03-02 Thread Mateusz Guzik
On 2/29/20, Visa Hankala  wrote:
> There is a bug in how rw_exit_read() and rw_exit_write() use
> WITNESS_UNLOCK(). The fast paths call WITNESS_UNLOCK() after releasing
> the actual lock. This leads to a use-after-free in the checker if the
> lock is dynamically allocated and another thread happens to free it
> too early.
>
> The following diff splits rw_exit() into two separate functions, one
> which is the rw_exit(9) frontend, and the other, rw_do_exit(), which is
> common between rw_exit(9), rw_exit_read(9) and rw_exit_write(9). This
> makes it possible to call WITNESS_UNLOCK() before the actual lock
> release without turning the code into an #ifdef maze.
>
> The diff additionally does the following tweaks:
>
> * Invoke membar_exit_before_atomic() only once even if the slow path is
>   taken. The initial barrier is enough for the release semantics.
>
> * Read rwl_owner for rw_cas() as late as possible so that the CAS
>   operation would be less likely to fail. Note that in rw_exit() the
>   purpose of the read is different; it determines the type of the lock
>   (read / write).
>
> * Put the rw_cas() in rw_exit() (now rw_do_exit()) inside
>   __predict_false(). This compacts the resulting machine code a bit.
>
> The code could be simplified by removing the fast path from
> rw_exit_read() and rw_exit_write(), and always calling rw_do_exit().
> However, that would reduce performance a tiny amount. I saw about 0.5%
> increase in kernel build time in a test setup. The patch below does
> not seem to affect performance negatively.
>
> OK?

If the slowdown is repeatable it most likely happens because of the
immediate re-read in the loop in the slow path.

Also note performance bugs in unlock:
- *any* read unlock wakes up everyone, so if the lock has many blocked
waiters they will all wake up only be likely to go back to sleep. On
the other hand this may be an anomaly which sometimes improves
performance because it in corner cases it can mitigate lack of
adaptive spinning
- the loop keeps re-reading the word instead of using the value
returned by cmpxchg. so happens rw_cas macro explicitly throws it away

Splitting unlock paths between reader and writer cases is unavoidable
in the long term (to allow handling different wake up policies) and I
think this is a great opportunity to do it all the way.

That said, I propose you do roughly this:

static void __always_inline
rw_exit_read_impl(struct rwlock *rwl, unsigned long v)
{

rw_assert_rdlock(rwl);
WITNESS_UNLOCK(>rwl_lock_obj, 0);

 cas loop + wake up come here
}

void
rw_exit_read(struct rwlock *rwl)
{

membar_exit_before_atomic();
rw_exit_read_impl(rwl, rwl->rwl_owner);
}

void
rw_exit(struct rwlock *rwl)
{
unsigned long v;

membar_exit_before_atomic();
v = rwl->rwl_owner;
if (v & RWLOCK_WRLOCK)
rw_exit_write_impl(rwl, v);
else
rw_exit_read_impl(rwl, v);
}

And similarly for write.

That is, the bottom routines assume the fence was posted and they got
a "fresh" value.

>
> Index: kern/kern_rwlock.c
> ===
> RCS file: src/sys/kern/kern_rwlock.c,v
> retrieving revision 1.44
> diff -u -p -r1.44 kern_rwlock.c
> --- kern/kern_rwlock.c30 Nov 2019 11:19:17 -  1.44
> +++ kern/kern_rwlock.c29 Feb 2020 16:24:59 -
> @@ -25,6 +25,8 @@
>  #include 
>  #include 
>
> +void rw_do_exit(struct rwlock *, unsigned long);
> +
>  /* XXX - temporary measure until proc0 is properly aligned */
>  #define RW_PROC(p) (((long)p) & ~RWLOCK_MASK)
>
> @@ -129,31 +131,31 @@ rw_enter_write(struct rwlock *rwl)
>  void
>  rw_exit_read(struct rwlock *rwl)
>  {
> - unsigned long owner = rwl->rwl_owner;
> + unsigned long owner;
>
>   rw_assert_rdlock(rwl);
> + WITNESS_UNLOCK(>rwl_lock_obj, 0);
>
>   membar_exit_before_atomic();
> + owner = rwl->rwl_owner;
>   if (__predict_false((owner & RWLOCK_WAIT) ||
>   rw_cas(>rwl_owner, owner, owner - RWLOCK_READ_INCR)))
> - rw_exit(rwl);
> - else
> - WITNESS_UNLOCK(>rwl_lock_obj, 0);
> + rw_do_exit(rwl, 0);
>  }
>
>  void
>  rw_exit_write(struct rwlock *rwl)
>  {
> - unsigned long owner = rwl->rwl_owner;
> + unsigned long owner;
>
>   rw_assert_wrlock(rwl);
> + WITNESS_UNLOCK(>rwl_lock_obj, LOP_EXCLUSIVE);
>
>   membar_exit_before_atomic();
> + owner = rwl->rwl_owner;
>   if (__predict_false((owner & RWLOCK_WAIT) ||
>   rw_cas(>rwl_owner, owner, 0)))
> - rw_exit(rwl);
> - else
> - WITNESS_UNLOCK(>rwl_lock_obj, LOP_EXCLUSIVE);
> + rw_do_exit(rwl, RWLOCK_WRLOCK);
>  }
>
>  #ifdef DIAGNOSTIC
> @@ -314,22 +316,29 @@ retry:
>  void
>  rw_exit(struct rwlock *rwl)
>  {
> - unsigned long owner = rwl->rwl_owner;
> - int wrlock = owner & RWLOCK_WRLOCK;
> - unsigned long set;
> + unsigned long 

fix ldapd/ldapctl data directory handling

2020-03-02 Thread Robert Klein
Hi,

in ldapd and ldapctl the "-r directory" command line argument does not
work:

ldapd fork/execs itself but the directory command line argument is not
given to the execvp call which then uses the default data directory.

ldapctl has a thinko/typo which causes the data_db to be always opened
in the default directory for indexing.

The patch below removes the command line argument and corresponding
global variable and instead uses a configuration file directive
"datadir".  Incidentally removes the global 'datadir' variable used
before.

Best regards
Robert


---
commit 6a71c90c19b5dc5850305c15696af3f14d26c168 (master)
from: Robert Klein 
date: Mon Mar  2 18:50:12 2020 UTC
 
 fix ldapd/ldapctl data directory handling
 
 -r directive didn't work as intended in both
 ldapd and ldapctl.
 
 this patch replaces command line argument '-r directory'
 with a configuration file directive 'datadir'.
 
diff f62b80b397c780e8273b5cc67d544b951c4c9d1f 
35bb29194b2067dab925ec4566dfb82f9bf34d65
blob - 48cff01b435bca0deebf28f55e6f71675c5752f2
blob + 231af2ddbbee2da446d2f03c9c48e679e216e993
--- usr.sbin/ldapctl/ldapctl.8
+++ usr.sbin/ldapctl/ldapctl.8
@@ -24,7 +24,6 @@
 .Nm ldapctl
 .Op Fl v
 .Op Fl f Ar file
-.Op Fl r Ar directory
 .Op Fl s Ar socket
 .Ar command
 .Op Ar argument ...
@@ -42,11 +41,6 @@ Use
 .Ar file
 as the configuration file, instead of the default
 .Pa /etc/ldapd.conf .
-.It Fl r Ar directory
-Store and read database files in
-.Ar directory ,
-instead of the default
-.Pa /var/db/ldap .
 .It Fl s Ar socket
 Use
 .Ar socket
blob - c8564c5543f518a720e049c559556f87edda6b8a
blob + 6830a603d153794390faafbc9aadc8ef0bd985c0
--- usr.sbin/ldapctl/ldapctl.c
+++ usr.sbin/ldapctl/ldapctl.c
@@ -58,10 +58,10 @@ void show_stats(struct imsg *imsg);
 voidshow_dbstats(const char *prefix, struct btree_stat *st);
 voidshow_nsstats(struct imsg *imsg);
 int compact_db(const char *path);
-int compact_namespace(struct namespace *ns, const char *datadir);
-int compact_namespaces(const char *datadir);
-int index_namespace(struct namespace *ns, const char *datadir);
-int index_namespaces(const char *datadir);
+int compact_namespace(struct namespace *ns);
+int compact_namespaces(void);
+int index_namespace(struct namespace *ns);
+int index_namespaces(void);
 int ssl_load_certfile(struct ldapd_config *, const char *, 
u_int8_t);
 
 __dead void
@@ -70,7 +70,7 @@ usage(void)
extern char *__progname;
 
fprintf(stderr,
-   "usage: %s [-v] [-f file] [-r directory] [-s socket] "
+   "usage: %s [-v] [-f file] [-s socket] "
"command [argument ...]\n",
__progname);
exit(1);
@@ -97,11 +97,11 @@ compact_db(const char *path)
 }
 
 int
-compact_namespace(struct namespace *ns, const char *datadir)
+compact_namespace(struct namespace *ns)
 {
char*path;
 
-   if (asprintf(, "%s/%s_data.db", datadir, ns->suffix) == -1)
+   if (asprintf(, "%s/%s_data.db", conf->datadir, ns->suffix) == -1)
return -1;
if (compact_db(path) != 0) {
log_warn("%s", path);
@@ -110,7 +110,7 @@ compact_namespace(struct namespace *ns, const char *da
}
free(path);
 
-   if (asprintf(, "%s/%s_indx.db", datadir, ns->suffix) == -1)
+   if (asprintf(, "%s/%s_indx.db", conf->datadir, ns->suffix) == -1)
return -1;
if (compact_db(path) != 0) {
log_warn("%s", path);
@@ -123,14 +123,14 @@ compact_namespace(struct namespace *ns, const char *da
 }
 
 int
-compact_namespaces(const char *datadir)
+compact_namespaces()
 {
struct namespace*ns;
 
TAILQ_FOREACH(ns, >namespaces, next) {
if (SLIST_EMPTY(>referrals))
continue;
-   if (compact_namespace(ns, datadir) != 0)
+   if (compact_namespace(ns) != 0)
return -1;
}
 
@@ -138,7 +138,7 @@ compact_namespaces(const char *datadir)
 }
 
 int
-index_namespace(struct namespace *ns, const char *datadir)
+index_namespace(struct namespace *ns)
 {
struct btval key, val;
struct btree*data_db, *indx_db;
@@ -150,14 +150,14 @@ index_namespace(struct namespace *ns, const char *data
 
log_info("indexing namespace %s", ns->suffix);
 
-   if (asprintf(, "%s/%s_data.db", DATADIR, ns->suffix) == -1)
+   if (asprintf(, "%s/%s_data.db", conf->datadir, ns->suffix) == -1)
return -1;
data_db = btree_open(path, BT_NOSYNC | BT_REVERSEKEY, 0644);
free(path);
if (data_db == NULL)
return -1;
 
-   if (asprintf(, "%s/%s_indx.db", datadir, ns->suffix) == -1)
+   if (asprintf(, "%s/%s_indx.db", conf->datadir, ns->suffix) == -1)
return -1;
 

Re: puc(4): 64-bit BARs

2020-03-02 Thread Theo de Raadt
That's quite a hack you wrote, but I think any nicer way of doing it is
going to require a big rewrite..

Patrick Wildt  wrote:

> as it turns out, puc(4) has trouble with 64-bit BARs.  The issue is that
> puc(4) tries to map every BAR before doing anything else.  While iter-
> ating over the BARs it assumes the next BAR is always 4 bytes after the
> other.  For 64-bit BARs that's wrong.  On a device with a 64-bit BAR, it
> will map the BAR correctly, and then using the 0x4 offset try to map an-
> other BAR, thus breaking the previously correctly mapped bar.
> 
> Now I'm not sure this is the best fix, but I think the easiest way is to
> just skip that particular entry.  Thus, with two 64-bit BARs, sc->sc_bar
> mappings[1] and [3] will be empty.  That's fine since the pucdata.c
> entries reference the absolute BAR address, like 0x10 and 0x18, and the
> mapping to the index of the array is done using that absolute value:
> 
> #define PUC_PORT_BAR_INDEX(bar) (((bar) - PCI_MAPREG_START) / 4)
> 
> bar = PUC_PORT_BAR_INDEX(desc->ports[i].bar);
> if (!sc->sc_bar_mappings[bar].mapped) {
> ...
> 
> Feedback?
> 
> Patrick
> 
> diff --git a/sys/dev/pci/puc.c b/sys/dev/pci/puc.c
> index 2c5f3065177..7ec25ecaad2 100644
> --- a/sys/dev/pci/puc.c
> +++ b/sys/dev/pci/puc.c
> @@ -163,8 +163,11 @@ puc_pci_attach(struct device *parent, struct device 
> *self, void *aux)
>   0, >sc_bar_mappings[i].t, >sc_bar_mappings[i].h,
>   >sc_bar_mappings[i].a, >sc_bar_mappings[i].s, 0)
> == 0);
> - if (sc->sc_bar_mappings[i].mapped)
> + if (sc->sc_bar_mappings[i].mapped) {
> + if (type == PCI_MAPREG_MEM_TYPE_64BIT)
> + i++;
>   continue;
> + }
>  
>  #if NCOM > 0
>   /*
> @@ -184,6 +187,8 @@ puc_pci_attach(struct device *parent, struct device 
> *self, void *aux)
>   sc->sc_bar_mappings[i].h = comconsioh;
>   sc->sc_bar_mappings[i].s = COM_NPORTS;
>   sc->sc_bar_mappings[i].mapped = 1;
> + if (type == PCI_MAPREG_MEM_TYPE_64BIT)
> + i++;
>   continue;
>   }
>  #endif
> 



Re: librthread sem_t opaqueness, storage & unnamed semaphore sharing

2020-03-02 Thread Ted Unangst
On 2020-03-02, Lauri Tirkkonen wrote:

> Thanks for the input, and ping - is there still something about this
> diff that I should fix?

I'm kinda busy, but I should be able to look into it eventually.



Re: puc(4): 64-bit BARs

2020-03-02 Thread Mark Kettenis
> Date: Mon, 2 Mar 2020 13:56:11 +0100
> From: Patrick Wildt 
> 
> Hi,
> 
> as it turns out, puc(4) has trouble with 64-bit BARs.  The issue is that
> puc(4) tries to map every BAR before doing anything else.  While iter-
> ating over the BARs it assumes the next BAR is always 4 bytes after the
> other.  For 64-bit BARs that's wrong.  On a device with a 64-bit BAR, it
> will map the BAR correctly, and then using the 0x4 offset try to map an-
> other BAR, thus breaking the previously correctly mapped bar.
> 
> Now I'm not sure this is the best fix, but I think the easiest way is to
> just skip that particular entry.  Thus, with two 64-bit BARs, sc->sc_bar
> mappings[1] and [3] will be empty.  That's fine since the pucdata.c
> entries reference the absolute BAR address, like 0x10 and 0x18, and the
> mapping to the index of the array is done using that absolute value:
> 
> #define PUC_PORT_BAR_INDEX(bar) (((bar) - PCI_MAPREG_START) / 4)
> 
> bar = PUC_PORT_BAR_INDEX(desc->ports[i].bar);
> if (!sc->sc_bar_mappings[bar].mapped) {
> ...
> 
> Feedback?

ok kettenis@

> Patrick
> 
> diff --git a/sys/dev/pci/puc.c b/sys/dev/pci/puc.c
> index 2c5f3065177..7ec25ecaad2 100644
> --- a/sys/dev/pci/puc.c
> +++ b/sys/dev/pci/puc.c
> @@ -163,8 +163,11 @@ puc_pci_attach(struct device *parent, struct device 
> *self, void *aux)
>   0, >sc_bar_mappings[i].t, >sc_bar_mappings[i].h,
>   >sc_bar_mappings[i].a, >sc_bar_mappings[i].s, 0)
> == 0);
> - if (sc->sc_bar_mappings[i].mapped)
> + if (sc->sc_bar_mappings[i].mapped) {
> + if (type == PCI_MAPREG_MEM_TYPE_64BIT)
> + i++;
>   continue;
> + }
>  
>  #if NCOM > 0
>   /*
> @@ -184,6 +187,8 @@ puc_pci_attach(struct device *parent, struct device 
> *self, void *aux)
>   sc->sc_bar_mappings[i].h = comconsioh;
>   sc->sc_bar_mappings[i].s = COM_NPORTS;
>   sc->sc_bar_mappings[i].mapped = 1;
> + if (type == PCI_MAPREG_MEM_TYPE_64BIT)
> + i++;
>   continue;
>   }
>  #endif
> 
> 



puc(4): 64-bit BARs

2020-03-02 Thread Patrick Wildt
Hi,

as it turns out, puc(4) has trouble with 64-bit BARs.  The issue is that
puc(4) tries to map every BAR before doing anything else.  While iter-
ating over the BARs it assumes the next BAR is always 4 bytes after the
other.  For 64-bit BARs that's wrong.  On a device with a 64-bit BAR, it
will map the BAR correctly, and then using the 0x4 offset try to map an-
other BAR, thus breaking the previously correctly mapped bar.

Now I'm not sure this is the best fix, but I think the easiest way is to
just skip that particular entry.  Thus, with two 64-bit BARs, sc->sc_bar
mappings[1] and [3] will be empty.  That's fine since the pucdata.c
entries reference the absolute BAR address, like 0x10 and 0x18, and the
mapping to the index of the array is done using that absolute value:

#define PUC_PORT_BAR_INDEX(bar) (((bar) - PCI_MAPREG_START) / 4)

bar = PUC_PORT_BAR_INDEX(desc->ports[i].bar);
if (!sc->sc_bar_mappings[bar].mapped) {
...

Feedback?

Patrick

diff --git a/sys/dev/pci/puc.c b/sys/dev/pci/puc.c
index 2c5f3065177..7ec25ecaad2 100644
--- a/sys/dev/pci/puc.c
+++ b/sys/dev/pci/puc.c
@@ -163,8 +163,11 @@ puc_pci_attach(struct device *parent, struct device *self, 
void *aux)
0, >sc_bar_mappings[i].t, >sc_bar_mappings[i].h,
>sc_bar_mappings[i].a, >sc_bar_mappings[i].s, 0)
  == 0);
-   if (sc->sc_bar_mappings[i].mapped)
+   if (sc->sc_bar_mappings[i].mapped) {
+   if (type == PCI_MAPREG_MEM_TYPE_64BIT)
+   i++;
continue;
+   }
 
 #if NCOM > 0
/*
@@ -184,6 +187,8 @@ puc_pci_attach(struct device *parent, struct device *self, 
void *aux)
sc->sc_bar_mappings[i].h = comconsioh;
sc->sc_bar_mappings[i].s = COM_NPORTS;
sc->sc_bar_mappings[i].mapped = 1;
+   if (type == PCI_MAPREG_MEM_TYPE_64BIT)
+   i++;
continue;
}
 #endif