Re: callout_drain either broken or man page needs updating

2016-07-14 Thread Matthew Macy



  On Thu, 14 Jul 2016 21:21:57 -0700 Hans Petter Selasky wrote:
 > On 07/15/16 05:45, Matthew Macy wrote: 
 > > glebius last commit needs some further re-work. 
 >  
 > Hi, 
 >  
 > Glebius's commit needs to be backed out, at least the API change that
 > changes the return value of callout_stop() when the callout is
 > scheduled and being serviced, simply because there is code out there,
 > as Matthew and others have discovered, that is "refcounting" on
 > callout_reset() and expecting that a subsequent callout_stop() will
 > return 1 to "unref".

Yes. This is the cause of the "refcnt 0 on LLE at boot..." regression.

-M



 >  
 > If you consider this impossible, maybe a fourth return value is needed
 > for CANCELLED and DRAINING.
 >
 > Further, getting the callouts straight in the TCP stack is a matter of
 > doing the locking correctly, which some have called "my magic bullet",
 > and not of the return values. I've proposed in the following revision
 > https://svnweb.freebsd.org/changeset/base/302768 to add a new callout
 > API that accepts a locking function so that the callout code can run its
 > cancelled checks at the right place for situations where more than one
 > lock is needed.
 >  
 > Consider this case: 
 >  
 > > void
 > > tcp_timer_2msl(void *xtp)
 > > {
 > > 	struct tcpcb *tp = xtp;
 > > 	struct inpcb *inp;
 > > 	CURVNET_SET(tp->t_vnet);
 > > #ifdef TCPDEBUG
 > > 	int ostate;
 > >
 > > 	ostate = tp->t_state;
 > > #endif
 > > 	INP_INFO_RLOCK(&V_tcbinfo);
 > > 	inp = tp->t_inpcb;
 > > 	KASSERT(inp != NULL, ("%s: tp %p tp->t_inpcb == NULL", __func__,
 > > 	    tp));
 > > 	INP_WLOCK(inp);
 > > 	tcp_free_sackholes(tp);
 > > 	if (callout_pending(&tp->t_timers->tt_2msl) ||
 > > 	    !callout_active(&tp->t_timers->tt_2msl)) {
 >
 > Here we have a custom in-house race check that doesn't affect the return
 > value of callout_reset() or callout_stop().
 >
 > > 		INP_WUNLOCK(tp->t_inpcb);
 > > 		INP_INFO_RUNLOCK(&V_tcbinfo);
 > > 		CURVNET_RESTORE();
 > > 		return;
 >  
 >  
 > I propose the following solution: 
 >  
 > >
 > > static void
 > > tcp_timer_2msl_lock(void *xtp, int do_lock)
 > > {
 > > 	struct tcpcb *tp = xtp;
 > > 	struct inpcb *inp;
 > >
 > > 	inp = tp->t_inpcb;
 > >
 > > 	if (do_lock) {
 > > 		CURVNET_SET(tp->t_vnet);
 > > 		INP_INFO_RLOCK(&V_tcbinfo);
 > > 		INP_WLOCK(inp);
 > > 	} else {
 > > 		INP_WUNLOCK(inp);
 > > 		INP_INFO_RUNLOCK(&V_tcbinfo);
 > > 		CURVNET_RESTORE();
 > > 	}
 > > }
 > >
 >
 > callout_init_lock_function(, &tcp_timer_2msl_lock,
 > CALLOUT_RETURNUNLOCKED);
 >  
 > Then in softclock_call_cc() it will look like this: 
 >  
 > >
 > > 	CC_UNLOCK(cc);
 > > 	if (c_lock != NULL) {
 > > 		if (have locking function)
 > > 			tcp_timer_2msl_lock(c_arg, 1);
 > > 		else
 > > 			class->lc_lock(c_lock, lock_status);
 > > 		/*
 > > 		 * The callout may have been cancelled
 > > 		 * while we switched locks.
 > > 		 */
 >
 > Actually "CC_LOCK(cc)" should be in front of cc_exec_cancel() to avoid
 > races when testing, setting and clearing this variable, as is done in
 > hps_head.
 >
 > > 		if (cc_exec_cancel(cc, direct)) {
 > > 			if (have locking function)
 > > 				tcp_timer_2msl_lock(c_arg, 0);
 > > 			else
 > > 				class->lc_unlock(c_lock);
 > > 			goto skip;
 > > 		}
 > > 		cc_exec_cancel(cc, direct) = true;
 > >
 > >
 > > skip:
 > > 	if ((c_iflags & CALLOUT_RETURNUNLOCKED) == 0) {
 > > 		if (have locking function)
 > > 			...
 > > 		else
 > > 			class->lc_unlock(c_lock);
 > > 	}
 >  
 > The whole point of this is to make the cancelled check atomic.
 >  
 > 1) Lock TCP 
 > 2) Lock CC_LOCK() 
 > 3) change callout state 
 >  
 > --HPS 
 > ___ 
 > freebsd-current@freebsd.org mailing list 
 > https://lists.freebsd.org/mailman/listinfo/freebsd-current 
 > To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org" 
 > 



Re: refcnt 0 on LLE at boot....

2016-07-14 Thread Matthew Macy



  On Thu, 07 Jul 2016 06:36:19 -0700 Larry Rosenman  wrote 
 
 > Thanks for that.  I've added myself to the cc list, and a comment about 
 > having 2 vmcore's.
 > 
This was introduced by r302350.  It broke the return value of
callout_{stop,drain}, which now return 1 even if the callout system did not
hold a reference. That in turn broke the following code in lltable_free():

	LIST_FOREACH_SAFE(lle, &dchain, lle_chain, next) {
		if (callout_stop(&lle->lle_timer) > 0)
			LLE_REMREF(lle);
		llentry_free(lle);
	}
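
To make the failure mode concrete, here is a minimal userland sketch of the idiom the loop above relies on: an armed timer holds one reference, and a positive return from the stop call is the caller's cue to drop it. All mock_* names are hypothetical and are not kernel APIs; the "loose" stop is only an assumption about how a stop that reports success without a pending timer would behave.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userland model; none of these names are kernel APIs. */
struct mock_entry {
	int  refcnt;       /* references currently held on the entry */
	bool timer_armed;  /* true while a pending timer owns one reference */
};

/* Arm the timer; a pending timer owns exactly one reference. */
static void
mock_timer_arm(struct mock_entry *e)
{
	assert(e->refcnt > 0);
	if (!e->timer_armed) {
		e->timer_armed = true;
		e->refcnt++;
	}
}

/* Strict stop: returns 1 only if a pending timer was really cancelled. */
static int
mock_stop_strict(struct mock_entry *e)
{
	if (!e->timer_armed)
		return (0);
	e->timer_armed = false;
	return (1);
}

/* Loose stop (assumed regression): returns 1 even if nothing was pending. */
static int
mock_stop_loose(struct mock_entry *e)
{
	e->timer_armed = false;
	return (1);
}

/*
 * Teardown in the style of the loop above: drop the timer's reference
 * when the stop call reports success.  Returns the count the final
 * free would see; 0 corresponds to the "bogus refcnt 0" case.
 */
static int
mock_teardown(struct mock_entry *e, int (*stop)(struct mock_entry *))
{
	if (stop(e) > 0)
		e->refcnt--;
	return (e->refcnt);
}
```

With an entry whose timer was never armed, the strict stop leaves the count at 1, while the loose stop drives it to 0 before the final free runs.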


 > 
 > On 2016-07-07 08:28, Edward Tomasz Napierała wrote:
 > > FWIW, I'm seeing this too.  I've filed a PR:
 > > 
 > > https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=210884
 > > 
 > > On 0707T0813, Larry Rosenman wrote:
 > >> and now it's been up for 13+ hours.  I do have both VMCORE's from the
 > >> 2 crashes.
 > >> 
 > >> 
 > >> 
 > >> On 2016-07-06 18:22, Larry Rosenman wrote:
 > >> > Got a similar crash a few minutes later.
 > >> >
 > >> >
 > >> > On 2016-07-06 18:17, Larry Rosenman wrote:
 > >> >> First boot, and I got the following panic.  2nd boot ran just fine.
 > >> >>
 > >> >>
 > >> >> borg.lerctr.org dumped core - see /var/crash/vmcore.0
 > >> >>
 > >> >> Wed Jul  6 18:13:34 CDT 2016
 > >> >>
 > >> >> FreeBSD borg.lerctr.org 11.0-ALPHA6 FreeBSD 11.0-ALPHA6 #5 r302379:
 > >> >> Wed Jul  6 16:59:11 CDT 2016
 > >> >> r...@borg.lerctr.org:/usr/obj/usr/src/sys/VT-LER  amd64
 > >> >>
 > >> >> panic: bogus refcnt 0 on lle 0xf800aa941200
 > >> >>
 > >> >> GNU gdb 6.1.1 [FreeBSD]
 > >> >> Copyright 2004 Free Software Foundation, Inc.
 > >> >> GDB is free software, covered by the GNU General Public License, and
 > >> >> you are
 > >> >> welcome to change it and/or distribute copies of it under certain
 > >> >> conditions.
 > >> >> Type "show copying" to see the conditions.
 > >> >> There is absolutely no warranty for GDB.  Type "show warranty" for
 > >> >> details.
 > >> >> This GDB was configured as "amd64-marcel-freebsd"...
 > >> >>
 > >> >> Unread portion of the kernel message buffer:
 > >> >> Copyright (c) 1992-2016 The FreeBSD Project.
 > >> >> Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993,
 > >> >> 1994
 > >> >> The Regents of the University of California. All rights reserved.
 > >> >> FreeBSD is a registered trademark of The FreeBSD Foundation.
 > >> >> FreeBSD 11.0-ALPHA6 #5 r302379: Wed Jul  6 16:59:11 CDT 2016
 > >> >> r...@borg.lerctr.org:/usr/obj/usr/src/sys/VT-LER amd64
 > >> >> FreeBSD clang version 3.8.0 (tags/RELEASE_380/final 262564) (based on
 > >> >> LLVM 3.8.0)
 > >> >> can't re-use a leaf (ixl_rx_miss_bufs)!
 > >> >> MEMGUARD DEBUGGING ALLOCATOR INITIALIZED:
 > >> >> MEMGUARD map base: 0xfe40
 > >> >> MEMGUARD map size: 128604256 KBytes
 > >> >> VT(vga): resolution 640x480
 > >> >> CPU: Intel(R) Xeon(R) CPU   E5410  @ 2.33GHz (2327.55-MHz
 > >> >> K8-class CPU)
 > >> >>   Origin="GenuineIntel"  Id=0x10676  Family=0x6  Model=0x17
 > >> >> Stepping=6
 > >> >>
 > >> >> Features=0xbfebfbff
 > >> >>
 > >> >> Features2=0xce3bd
 > >> >>   AMD Features=0x20100800
 > >> >>   AMD Features2=0x1
 > >> >>   VT-x: HLT,PAUSE
 > >> >>   TSC: P-state invariant, performance statistics
 > >> >> real memory  = 68719476736 (65536 MB)
 > >> >> avail memory = 65382842368 (62353 MB)
 > >> >> Event timer "LAPIC" quality 400
 > >> >> ACPI APIC Table: 
 > >> >> FreeBSD/SMP: Multiprocessor System Detected: 8 CPUs
 > >> >> FreeBSD/SMP: 2 package(s) x 4 core(s)
 > >> >> random: unblocking device.
 > >> >> ioapic0  irqs 0-23 on motherboard
 > >> >> ioapic1  irqs 24-47 on motherboard
 > >> >> random: entropy device external interface
 > >> >> netmap: loaded module
 > >> >> module_register_init: MOD_LOAD (vesa, 0x80f2cb40, 0) error 19
 > >> >> kbd1 at kbdmux0
 > >> >> vtvga0:  on motherboard
 > >> >> cryptosoft0:  on motherboard
 > >> >> acpi0:  on motherboard
 > >> >> acpi0: Power Button (fixed)
 > >> >> unknown: I/O range not supported
 > >> >> cpu0:  on acpi0
 > >> >> cpu1:  on acpi0
 > >> >> cpu2:  on acpi0
 > >> >> cpu3:  on acpi0
 > >> >> cpu4:  on acpi0
 > >> >> cpu5:  on acpi0
 > >> >> cpu6:  on acpi0
 > >> >> cpu7:  on acpi0
 > >> >> hpet0:  iomem 0xfed0-0xfed003ff irq
 > >> >> 0,8 on acpi0
 > >> >> Timecounter "HPET" frequency 14318180 Hz quality 950
 > >> >> Event timer "HPET" frequency 14318180 Hz quality 350
 > >> >> Event timer "HPET1" frequency 14318180 Hz quality 340
 > >> >> Event timer "HPET2" frequency 14318180 Hz quality 340
 > >> >> atrtc0:  port 0x70-0x71 on acpi0
 > >> >> Event timer "RTC" frequency 32768 Hz quality 0
 > >> >> attimer0:  port 0x40-0x43,0x50-0x53 on acpi0
 > >> >> Timecounter "i8254" frequency 1193182 Hz quality 0
 > 

Re: callout_drain either broken or man page needs updating

2016-07-14 Thread Hans Petter Selasky

On 07/15/16 05:45, Matthew Macy wrote:
> glebius last commit needs some further re-work.


Hi,

Glebius's commit needs to be backed out, at least the API change that
changes the return value of callout_stop() when the callout is
scheduled and being serviced, simply because there is code out there,
as Matthew and others have discovered, that is "refcounting" on
callout_reset() and expecting that a subsequent callout_stop() will
return 1 to "unref".


If you consider this impossible, maybe a fourth return value is needed
for CANCELLED and DRAINING.


Further, getting the callouts straight in the TCP stack is a matter of
doing the locking correctly, which some have called "my magic bullet",
and not of the return values. I've proposed in the following revision
https://svnweb.freebsd.org/changeset/base/302768 to add a new callout
API that accepts a locking function so that the callout code can run its
cancelled checks at the right place for situations where more than one
lock is needed.


Consider this case:


> void
> tcp_timer_2msl(void *xtp)
> {
> 	struct tcpcb *tp = xtp;
> 	struct inpcb *inp;
> 	CURVNET_SET(tp->t_vnet);
> #ifdef TCPDEBUG
> 	int ostate;
>
> 	ostate = tp->t_state;
> #endif
> 	INP_INFO_RLOCK(&V_tcbinfo);
> 	inp = tp->t_inpcb;
> 	KASSERT(inp != NULL, ("%s: tp %p tp->t_inpcb == NULL", __func__, tp));
> 	INP_WLOCK(inp);
> 	tcp_free_sackholes(tp);
> 	if (callout_pending(&tp->t_timers->tt_2msl) ||
> 	    !callout_active(&tp->t_timers->tt_2msl)) {

Here we have a custom in-house race check that doesn't affect the return
value of callout_reset() or callout_stop().

> 		INP_WUNLOCK(tp->t_inpcb);
> 		INP_INFO_RUNLOCK(&V_tcbinfo);
> 		CURVNET_RESTORE();
> 		return;



I propose the following solution:



> static void
> tcp_timer_2msl_lock(void *xtp, int do_lock)
> {
> 	struct tcpcb *tp = xtp;
> 	struct inpcb *inp;
>
> 	inp = tp->t_inpcb;
>
> 	if (do_lock) {
> 		CURVNET_SET(tp->t_vnet);
> 		INP_INFO_RLOCK(&V_tcbinfo);
> 		INP_WLOCK(inp);
> 	} else {
> 		INP_WUNLOCK(inp);
> 		INP_INFO_RUNLOCK(&V_tcbinfo);
> 		CURVNET_RESTORE();
> 	}
> }

callout_init_lock_function(, &tcp_timer_2msl_lock,
CALLOUT_RETURNUNLOCKED);


Then in softclock_call_cc() it will look like this:



> 	CC_UNLOCK(cc);
> 	if (c_lock != NULL) {
> 		if (have locking function)
> 			tcp_timer_2msl_lock(c_arg, 1);
> 		else
> 			class->lc_lock(c_lock, lock_status);
> 		/*
> 		 * The callout may have been cancelled
> 		 * while we switched locks.
> 		 */

Actually "CC_LOCK(cc)" should be in front of cc_exec_cancel() to avoid
races when testing, setting and clearing this variable, as is done in
hps_head.

> 		if (cc_exec_cancel(cc, direct)) {
> 			if (have locking function)
> 				tcp_timer_2msl_lock(c_arg, 0);
> 			else
> 				class->lc_unlock(c_lock);
> 			goto skip;
> 		}
> 		cc_exec_cancel(cc, direct) = true;
>
>
> skip:
> 	if ((c_iflags & CALLOUT_RETURNUNLOCKED) == 0) {
> 		if (have locking function)
> 			...
> 		else
> 			class->lc_unlock(c_lock);
> 	}


The whole point of this is to make the cancelled check atomic.

1) Lock TCP
2) Lock CC_LOCK()
3) change callout state
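
As an illustration only, the ordering above can be modelled in userland roughly as follows. This is a single-threaded sketch under stated assumptions: the mock_* names are hypothetical, the mock lock merely asserts correct acquire/release pairing, and nothing here is part of the callout API or of the proposed patch.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical single-threaded model; these are not kernel APIs. */
struct mock_lock {
	bool held;
};

static void
mock_lock_acquire(struct mock_lock *l)
{
	assert(!l->held);
	l->held = true;
}

static void
mock_lock_release(struct mock_lock *l)
{
	assert(l->held);
	l->held = false;
}

struct mock_callout {
	struct mock_lock client;  /* stands in for the TCP locks */
	struct mock_lock cc;      /* stands in for CC_LOCK() */
	bool cancelled;           /* callout state guarded by both locks */
	int  runs;                /* how many times the handler ran */
};

/* Canceller: takes the client lock first, then changes state under cc. */
static void
mock_callout_cancel(struct mock_callout *c)
{
	mock_lock_acquire(&c->client);
	mock_lock_acquire(&c->cc);
	c->cancelled = true;
	mock_lock_release(&c->cc);
	mock_lock_release(&c->client);
}

/*
 * Dispatcher: 1) client lock, 2) cc lock, 3) test the callout state.
 * Because the cancelled check happens with the client lock already
 * held, a canceller cannot slip in between the check and the handler.
 * Returns true if the handler ran.
 */
static bool
mock_callout_dispatch(struct mock_callout *c)
{
	bool run;

	mock_lock_acquire(&c->client);	/* 1) lock TCP */
	mock_lock_acquire(&c->cc);	/* 2) lock CC_LOCK() */
	run = !c->cancelled;		/* 3) check callout state */
	mock_lock_release(&c->cc);
	if (run)
		c->runs++;		/* handler body, client lock held */
	mock_lock_release(&c->client);
	return (run);
}
```

In this model a dispatch that follows a cancel is skipped, and both locks are released on either path.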

--HPS


callout_drain either broken or man page needs updating

2016-07-14 Thread Matthew Macy

Upon updating my drm-next branch to the latest -CURRENT, callout_drain returning
1 no longer means that the callout was in fact pending when it was called.


This little bit of code will panic because dwork->wq is NULL, because the
callout was _not_ in fact enqueued. So either it's no longer possible to
reliably query whether a callout was pending while clearing it, and we're ok
with that, or glebius's last commit needs some further re-work.



#define del_timer_sync(timer)   (callout_drain(&(timer)->timer_callout) == 1)

static inline bool
flush_delayed_work(struct delayed_work *dwork)
{

	if (del_timer_sync(&dwork->timer))
		linux_queue_work(dwork->cpu, dwork->wq, &dwork->work);
	return (flush_work(&dwork->work));
}



Re: 11.0-BETA2 may be delayed

2016-07-14 Thread Glen Barber
On Wed, Jul 13, 2016 at 11:10:17PM +0000, Glen Barber wrote:
> As I am sure you have already seen, there is an issue in 11.0-BETA1 that
> has caused some headaches for people.
> 
>  https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=210884
> 
> The issue is actively being investigated, and despite the KBI freeze,
> the fix may need to break KBI in stable/11, at least as far as I am
> aware at the moment.
> 
> That said, 11.0-BETA2 may be delayed, while this is properly resolved.
> If KBI is changed between 11.0-BETA1 and 11.0-BETA2, it will be noted in
> the announcement.
> 

The latest patch for this issue seems promising so far, but for the sake
of being cautious, 11.0-BETA2 is going to be delayed by a few days.  The
rationale is that we want to see the affected machines survive uptime of
at least 48 hours.

At this point, we are looking at the fix being committed in just over
one day, followed by the normal 3-day MFC timeframe.  If this changes,
an update will be emailed.

Apologies again for the delay with 11.0-BETA2, and thank you for your
patience.

Glen
On behalf of:   re@





Re: FreeBSD-11.0-BETA1-amd64-disc1.iso is too big for my 700MB CD-r

2016-07-14 Thread Glen Barber
On Thu, Jul 14, 2016 at 11:13:56PM +0000, Glen Barber wrote:
> Could people try this on various hardware, KVM setups, and so on?  I'm
> mainly interested if you get to the bsdinstall(8) screen, not issues not
> directly related to using GEOM_UZIP to compress the image further.
> (Meaning, I'm not asking for people to do installs from this image.)
> 

I forgot to mention, the UEFI bits are not in place in this image, so it
likely will not boot on UEFI-only systems.  This will be added after the
project branch is created and what I have now is added to the build
tools.

Glen





Re: FreeBSD-11.0-BETA1-amd64-disc1.iso is too big for my 700MB CD-r

2016-07-14 Thread Glen Barber
With additional tweaks, I was able to get the CD to boot both with
a real internal CD-ROM drive, as well as USB CD-ROM.

I have uploaded a disc1.iso image here:

 https://people.freebsd.org/~gjb/disc1_uzip.iso

Could people try this on various hardware, KVM setups, and so on?  I'm
mainly interested if you get to the bsdinstall(8) screen, not issues not
directly related to using GEOM_UZIP to compress the image further.
(Meaning, I'm not asking for people to do installs from this image.)

The hashes are:
SHA512 (disc1_uzip.iso) =
560033cbc65932abb77ae85475f3a222fbdd8a35f99ac220f85028ede60a47305d62c5e8eab508bfb3f02f0d074a1dc3200f2a1e409408e34fa9808e800ad6df
SHA256 (disc1_uzip.iso) =
65edbc4ddca29af5f9f03f8a3026e06462f05400d70b79d4a8d0adf2ea875e33

I'll create the project branch shortly, and add the relevant bits
afterward.

Thank you.

Glen

On Thu, Jul 14, 2016 at 08:50:24PM +0000, Glen Barber wrote:
> Thank you for the additional information.
> 
> I finally found my old laptop's internal CD-ROM drive, so I'll be able
> to at least check if the issue is USB-related.  I just need to open the
> laptop to install it.  After which I'll tinker with the cluster sizes
> and test further.
> 
> Glen
> 
> On Wed, Jul 13, 2016 at 10:30:33PM -0700, Maxim Sobolev wrote:
> > Hi Glen, nice update, glad to be of some help. The slowdown may be related
> > to the fact that geom_uzip reads the whole compressed cluster, which is
> > typically 20-30k, even if only a single block from that cluster is
> > requested. I imagine it might impact rc.d, which is essentially a bunch of
> > small(ish) shell scripts, and I would not be surprised if their blocks were
> > scattered all over the place. There is some very basic caching in the
> > geom_uzip module, but it is only one cluster deep. What might help, if you
> > still have some room on the CD, is to decrease the cluster size (-s
> > parameter of mkuzip) to something like 32k or even 16k. That would make
> > compression less effective, but would reduce the I/O bandwidth waste, which
> > could also be important for the KVM setups. I might also look into making a
> > bigger cache, as RAM is getting cheaper and more abundant every day. Another
> > approach would be to make several "partitions", segregating for example the
> > /etc stuff so it's all tightly packed together; you can also use a smaller
> > cluster size for /etc and a bigger one for the rest. In any case, keep me
> > posted on your findings.
> > 
> > -Max
> > 
> > On Wed, Jul 13, 2016 at 3:12 PM, Glen Barber  wrote:
> > 
> > > Just replying to the first email in the thread, since it's a general
> > > reply, and only related to the original topic at hand, and only for
> > > informative purposes at this point.
> > >
> > > On Mon, Jul 11, 2016 at 11:01:51PM +0200, Ronald Klop wrote:
> > > > Just downloaded the amd64 BETA1 ISO (873MB) and tried to burn a CD on
> > > > Windows 10. It complained that the ISO is too big for my 700 MB CD-r.
> > > >
> > >
> > > I have *something* semi-working, with a huge amount of help from Maxim
> > > in private email.  There is still a nit or two to fix, I'm running into
> > > them as I rebuild the ISO after fixing the prior issue.  But, right now,
> > > I can get the ISO to boot enough to get to a shell (the "init failed due
> > > to inability to mount '/'" shell, but it is still a shell).  :)
> > >
> > > Once I get what I have now into a state where it's somewhat committable,
> > > I'm going to create a project branch to sand off the edges, instead of
> > > doing it directly in head, since there might be some edge cases for
> > > non-x86 architectures.  (But some other architectures do not have the
> > > "too big" problem.)
> > >
> > > Once that is merged, I fully intend to merge this to stable/11, provided
> > > there is no major fallout.  With what I have now, disc1.iso is 630M, and
> > > the disc1.iso.xz is 554M.  I'll upload an image somewhere public for
> > > people to test 11.0-BETA1 on hardware, KVM, etc.  One thing to note,
> > > though, there appears to be a significantly non-zero speed decrease,
> > > though this may just be because my CD-ROM is USB-based.  When I have the
> > > ready-to-commit result, I'll test it on a machine with an internal CD
> > > drive.
> > >
> > > Glen
> > >
> > >


Re: ath (AR9460) no longer works after going to 11-STABLE r302483

2016-07-14 Thread Adrian Chadd
On 14 July 2016 at 14:37, Wolfgang Zenker  wrote:
> Hi,
>
> * Adrian Chadd  [160710 21:47]:
>> Since you've reverted the ath driver directories without success, I'm
>> mostly out of simple ideas. I think you need to bisect the whole
>> kernel version until you find the commit that broke things.
>
> done. The commit is 11-STABLE r302410. AFAICS the only change here
> is the removal of debugging options from the GENERIC kernel config:
>
> https://svnweb.freebsd.org/base/stable/11/sys/amd64/conf/GENERIC?r1=302408&r2=302410

... loool, okay. Let me see.

Try INVARIANTS and INVARIANT_SUPPORT. Maybe something in the ath
driver needs it.. oops!



-adrian

>
> I guess next would be to remove single debugging options one after the
> other?
>
> Wolfgang


Re: ath (AR9460) no longer works after going to 11-STABLE r302483

2016-07-14 Thread Wolfgang Zenker
Hi,

* Adrian Chadd  [160710 21:47]:
> Since you've reverted the ath driver directories without success, I'm
> mostly out of simple ideas. I think you need to bisect the whole
> kernel version until you find the commit that broke things.

done. The commit is 11-STABLE r302410. AFAICS the only change here
is the removal of debugging options from the GENERIC kernel config:

https://svnweb.freebsd.org/base/stable/11/sys/amd64/conf/GENERIC?r1=302408&r2=302410

I guess next would be to remove single debugging options one after the
other?

Wolfgang


Re: panic with tcp timers

2016-07-14 Thread Larry Rosenman

On 2016-07-14 12:01, Julien Charbon wrote:

Hi,

On 6/20/16 11:55 AM, Julien Charbon wrote:

On 6/20/16 9:39 AM, Gleb Smirnoff wrote:

On Fri, Jun 17, 2016 at 11:27:39AM +0200, Julien Charbon wrote:
J> > Comparing stable/10 and head, I see two changes that could
J> > affect that:
J> >
J> > - callout_async_drain
J> > - switch to READ lock for inp info in tcp timers
J> >
J> > That's why you are in To, Julien and Hans :)
J> >
J> > We continue investigating, and I will keep you updated.
J> > However, any help is welcome. I can share cores.

Now, spending some time with cores and adding a bunch of
extra CTRs, I have a sequence of events that lead to the
panic. In short, the bug is in the callout system. It seems
to be not relevant to the callout_async_drain, at least for
now. The transition to READ lock unmasked the problem, that's
why NetflixBSD 10 doesn't panic.

The panic requires heavy contention on the TCP info lock.

[CPU 1] the callout fires, tcp_timer_keep entered
[CPU 1] blocks on INP_INFO_RLOCK(&V_tcbinfo);
[CPU 2] schedules the callout
[CPU 2] tcp_discardcb called
[CPU 2] callout successfully canceled
[CPU 2] tcpcb freed
[CPU 1] unblocks... panic

When the lock was WLOCK, all contenders were resumed in a
sequence they came to the lock. Now, that they are readers,
once the lock is released, readers are resumed in a "random"
order, and this allows tcp_discardcb to go before the old
running callout, and this unmasks the panic.


 Highly interesting.  I should be able to reproduce that (will be useful
for testing the corresponding fix).


 Finally, I was able to reproduce it (without glebius fix).   The trick
was to really lower TCP keep timer expiration:

$ sysctl -a | grep tcp.keep
net.inet.tcp.keepidle: 720
net.inet.tcp.keepintvl: 75000
net.inet.tcp.keepinit: 75000
net.inet.tcp.keepcnt: 8
$ sudo bash -c "sysctl net.inet.tcp.keepidle=10 && sysctl
net.inet.tcp.keepintvl=50 && sysctl net.inet.tcp.keepinit=10"
Password:
net.inet.tcp.keepidle: 720 -> 10
net.inet.tcp.keepintvl: 75000 -> 50
net.inet.tcp.keepinit: 75000 -> 10

 Note: It will certainly close all your ssh connections to the tested
server.

 Now I will test in order:

#1. glebius fix
https://svnweb.freebsd.org/base?view=revision&revision=302350

#2. rrs extra fix
https://reviews.freebsd.org/D7135

#3. rrs TCP Timer cleanup
https://reviews.freebsd.org/D7136

 My panic for reference:

Fatal trap 9: general protection fault while in kernel mode
cpuid = 10; apic id = 28
[root@atlas-dl360-4 ~]# instruction pointer = 0x20:0x80c346f1

stack pointer   = 0x28:0xfe1f29b848b0
frame pointer   = 0x28:0xfe1f29b848e0
code segment= base 0x0, limit 0xf, type 0x1b
= DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags= interrupt enabled, resume, IOPL = 0
current process = 12 (swi4: clock (4))
trap number = 9
panic: general protection fault
cpuid = 10
KDB: stack backtrace:
db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame
0xfe1f29b844a0
vpanic() at vpanic+0x182/frame 0xfe1f29b84520
panic() at panic+0x43/frame 0xfe1f29b84580
trap_fatal() at trap_fatal+0x351/frame 0xfe1f29b845e0
trap() at trap+0x820/frame 0xfe1f29b847f0
calltrap() at calltrap+0x8/frame 0xfe1f29b847f0
--- trap 0x9, rip = 0x80c346f1, rsp = 0xfe1f29b848c0, rbp =
0xfe1f29b848e0 ---
tcp_timer_keep() at tcp_timer_keep+0x51/frame 0xfe1f29b848e0
softclock_call_cc() at softclock_call_cc+0x19c/frame 0xfe1f29b849c0
softclock() at softclock+0x47/frame 0xfe1f29b849e0
intr_event_execute_handlers() at intr_event_execute_handlers+0x96/frame
0xfe1f29b84a20
ithread_loop() at ithread_loop+0xa6/frame 0xfe1f29b84a70
fork_exit() at fork_exit+0x84/frame 0xfe1f29b84ab0
fork_trampoline() at fork_trampoline+0xe/frame 0xfe1f29b84ab0
--- trap 0, rip = 0, rsp = 0, rbp = 0 ---

--
Julien



please see also https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=210884
--
Larry Rosenman http://www.lerctr.org/~ler
Phone: +1 214-642-9640 E-Mail: l...@lerctr.org
US Mail: 17716 Limpia Crk, Round Rock, TX 78664-7281


Re: panic with tcp timers

2016-07-14 Thread Julien Charbon

 Hi,

On 6/20/16 11:55 AM, Julien Charbon wrote:
> On 6/20/16 9:39 AM, Gleb Smirnoff wrote:
>> On Fri, Jun 17, 2016 at 11:27:39AM +0200, Julien Charbon wrote:
>> J> > Comparing stable/10 and head, I see two changes that could
>> J> > affect that:
>> J> > 
>> J> > - callout_async_drain
>> J> > - switch to READ lock for inp info in tcp timers
>> J> > 
>> J> > That's why you are in To, Julien and Hans :)
>> J> > 
>> J> > We continue investigating, and I will keep you updated.
>> J> > However, any help is welcome. I can share cores.
>>
>> Now, spending some time with cores and adding a bunch of
>> extra CTRs, I have a sequence of events that lead to the
>> panic. In short, the bug is in the callout system. It seems
>> to be not relevant to the callout_async_drain, at least for
>> now. The transition to READ lock unmasked the problem, that's
>> why NetflixBSD 10 doesn't panic.
>>
>> The panic requires heavy contention on the TCP info lock.
>>
>> [CPU 1] the callout fires, tcp_timer_keep entered
>> [CPU 1] blocks on INP_INFO_RLOCK(&V_tcbinfo);
>> [CPU 2] schedules the callout
>> [CPU 2] tcp_discardcb called
>> [CPU 2] callout successfully canceled
>> [CPU 2] tcpcb freed
>> [CPU 1] unblocks... panic
>>
>> When the lock was WLOCK, all contenders were resumed in a
>> sequence they came to the lock. Now, that they are readers,
>> once the lock is released, readers are resumed in a "random"
>> order, and this allows tcp_discardcb to go before the old
>> running callout, and this unmasks the panic.
> 
>  Highly interesting.  I should be able to reproduce that (will be useful
> for testing the corresponding fix).

 Finally, I was able to reproduce it (without glebius fix).   The trick
was to really lower TCP keep timer expiration:

$ sysctl -a | grep tcp.keep
net.inet.tcp.keepidle: 720
net.inet.tcp.keepintvl: 75000
net.inet.tcp.keepinit: 75000
net.inet.tcp.keepcnt: 8
$ sudo bash -c "sysctl net.inet.tcp.keepidle=10 && sysctl
net.inet.tcp.keepintvl=50 && sysctl net.inet.tcp.keepinit=10"
Password:
net.inet.tcp.keepidle: 720 -> 10
net.inet.tcp.keepintvl: 75000 -> 50
net.inet.tcp.keepinit: 75000 -> 10

 Note: It will certainly close all your ssh connections to the tested
server.

 Now I will test in order:

#1. glebius fix
https://svnweb.freebsd.org/base?view=revision&revision=302350

#2. rrs extra fix
https://reviews.freebsd.org/D7135

#3. rrs TCP Timer cleanup
https://reviews.freebsd.org/D7136

 My panic for reference:

Fatal trap 9: general protection fault while in kernel mode
cpuid = 10; apic id = 28
[root@atlas-dl360-4 ~]# instruction pointer = 0x20:0x80c346f1
stack pointer   = 0x28:0xfe1f29b848b0
frame pointer   = 0x28:0xfe1f29b848e0
code segment= base 0x0, limit 0xf, type 0x1b
= DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags= interrupt enabled, resume, IOPL = 0
current process = 12 (swi4: clock (4))
trap number = 9
panic: general protection fault
cpuid = 10
KDB: stack backtrace:
db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame
0xfe1f29b844a0
vpanic() at vpanic+0x182/frame 0xfe1f29b84520
panic() at panic+0x43/frame 0xfe1f29b84580
trap_fatal() at trap_fatal+0x351/frame 0xfe1f29b845e0
trap() at trap+0x820/frame 0xfe1f29b847f0
calltrap() at calltrap+0x8/frame 0xfe1f29b847f0
--- trap 0x9, rip = 0x80c346f1, rsp = 0xfe1f29b848c0, rbp =
0xfe1f29b848e0 ---
tcp_timer_keep() at tcp_timer_keep+0x51/frame 0xfe1f29b848e0
softclock_call_cc() at softclock_call_cc+0x19c/frame 0xfe1f29b849c0
softclock() at softclock+0x47/frame 0xfe1f29b849e0
intr_event_execute_handlers() at intr_event_execute_handlers+0x96/frame
0xfe1f29b84a20
ithread_loop() at ithread_loop+0xa6/frame 0xfe1f29b84a70
fork_exit() at fork_exit+0x84/frame 0xfe1f29b84ab0
fork_trampoline() at fork_trampoline+0xe/frame 0xfe1f29b84ab0
--- trap 0, rip = 0, rsp = 0, rbp = 0 ---

--
Julien






Re: FreeBSD-11.0-BETA1-amd64-disc1.iso is too big for my 700MB CD-r

2016-07-14 Thread Glen Barber
Thank you for the additional information.

I finally found my old laptop's internal CD-ROM drive, so I'll be able
to at least check if the issue is USB-related.  I just need to open the
laptop to install it.  After which I'll tinker with the cluster sizes
and test further.

Glen

On Wed, Jul 13, 2016 at 10:30:33PM -0700, Maxim Sobolev wrote:
> Hi Glen, nice update, glad to be of some help. The slowdown may be related
> to the fact that geom_uzip reads the whole compressed cluster, which is
> typically 20-30k, even if only a single block from that cluster is
> requested. I imagine it might impact rc.d, which is essentially a bunch of
> small(ish) shell scripts, and I would not be surprised if their blocks were
> scattered all over the place. There is some very basic caching in the
> geom_uzip module, but it is only one cluster deep. What might help, if you
> still have some room on the CD, is to decrease the cluster size (-s
> parameter of mkuzip) to something like 32k or even 16k. That would make
> compression less effective, but would reduce the I/O bandwidth waste, which
> could also be important for the KVM setups. I might also look into making a
> bigger cache, as RAM is getting cheaper and more abundant every day. Another
> approach would be to make several "partitions", segregating for example the
> /etc stuff so it's all tightly packed together; you can also use a smaller
> cluster size for /etc and a bigger one for the rest. In any case, keep me
> posted on your findings.
> 
> -Max
> 
> On Wed, Jul 13, 2016 at 3:12 PM, Glen Barber  wrote:
> 
> > Just replying to the first email in the thread, since it's a general
> > reply, and only related to the original topic at hand, and only for
> > informative purposes at this point.
> >
> > On Mon, Jul 11, 2016 at 11:01:51PM +0200, Ronald Klop wrote:
> > > Just downloaded the amd64 BETA1 ISO (873MB) and tried to burn a CD on
> > > Windows 10. It complained that the ISO is too big for my 700 MB CD-r.
> > >
> >
> > I have *something* semi-working, with a huge amount of help from Maxim
> > in private email.  There is still a nit or two to fix, I'm running into
> > them as I rebuild the ISO after fixing the prior issue.  But, right now,
> > I can get the ISO to boot enough to get to a shell (the "init failed due
> > to inability to mount '/'" shell, but it is still a shell).  :)
> >
> > Once I get what I have now into a state where it's somewhat committable,
> > I'm going to create a project branch to sand off the edges, instead of
> > doing it directly in head, since there might be some edge cases for
> > non-x86 architectures.  (But some other architectures do not have the
> > "too big" problem.)
> >
> > Once that is merged, I fully intend to merge this to stable/11, provided
> > there is no major fallout.  With what I have now, disc1.iso is 630M, and
> > the disc1.iso.xz is 554M.  I'll upload an image somewhere public for
> > people to test 11.0-BETA1 on hardware, KVM, etc.  One thing to note,
> > though, there appears to be a significantly non-zero speed decrease,
> > though this may just be because my CD-ROM is USB-based.  When I have the
> > ready-to-commit result, I'll test it on a machine with an internal CD
> > drive.
> >
> > Glen
> >
> >
> ___
> freebsd-current@freebsd.org mailing list
> https://lists.freebsd.org/mailman/listinfo/freebsd-current
> To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"




Re: CURRENT: frequent crashes if mpd5 is running

2016-07-14 Thread Allan Jude
On 2016-07-14 13:13, Oleg V. Nauman wrote:
>  I'm experiencing frequent CURRENT ( 12.0-CURRENT r302535 amd64 ) crashes 
> triggered by mpd5:
> 
> Fatal trap 12: page fault while in kernel mode
> cpuid = 0; apic id = 00
> fault virtual address   = 0x10
> fault code              = supervisor read data, page not present
> instruction pointer     = 0x20:0x814f6162
> stack pointer           = 0x28:0xfe011b06d640
> frame pointer           = 0x28:0xfe011b06d670
> code segment            = base 0x0, limit 0xf, type 0x1b
>                         = DPL 0, pres 1, long 1, def32 0, gran 1
> processor eflags        = interrupt enabled, resume, IOPL = 0
> current process         = 901 (mpd5)
> trap number             = 12
> panic: page fault
> cpuid = 0
> 
> #0  doadump (textdump=) at pcpu.h:221
> 221 pcpu.h: No such file or directory.
> in pcpu.h
> (kgdb) #0  doadump (textdump=) at pcpu.h:221
> #1  0x80749169 in kern_reboot (howto=260)
> at ../../../kern/kern_shutdown.c:366
> #2  0x807496e1 in vpanic (fmt=,
> ap=) at ../../../kern/kern_shutdown.c:759
> #3  0x80749553 in panic (fmt=0x0) at ../../../kern/kern_shutdown.c:690
> #4  0x80a5aca1 in trap_fatal (frame=0xfe011b06d590, eva=16)
> at ../../../amd64/amd64/trap.c:841
> #5  0x80a5af51 in trap_pfault (frame=0x0, usermode=0)
> at ../../../amd64/amd64/trap.c:716
> #6  0x80a5a430 in trap (frame=0xfe011b06d590)
> at ../../../amd64/amd64/trap.c:442
> #7  0x80a3e161 in calltrap () at ../../../amd64/amd64/exception.S:236
> #8  0x814f6162 in ng_uncallout (c=0xf80004842460,
> node=0xf80004c79a00)
> at /usr/src/sys/modules/netgraph/netgraph/../../../netgraph/ng_base.c:3815
> #9  0x8151bbab in ng_pptpgre_disconnect (hook=)
> at 
> /usr/src/sys/modules/netgraph/pptpgre/../../../netgraph/ng_pptpgre.c:966
> #10 0x814f2928 in ng_destroy_hook (hook=0xf8000487ad80)
> at /usr/src/sys/modules/netgraph/netgraph/../../../netgraph/ng_base.c:1219
> #11 0x814f2635 in ng_rmnode (node=,
> dummy1=, dummy2=,
> dummy3=)
> at /usr/src/sys/modules/netgraph/netgraph/../../../netgraph/ng_base.c:744
> #12 0x814f4832 in ng_apply_item (node=0xf80004c79a00,
> item=0xf80004e72600, rw=1)
> at /usr/src/sys/modules/netgraph/netgraph/../../../netgraph/ng_base.c:2523
> #13 0x814f41a3 in ng_snd_item (item=,
> flags=)
> at /usr/src/sys/modules/netgraph/netgraph/../../../netgraph/ng_base.c:2320
> #14 0x814eec4e in ngc_send (so=,
> flags=, m=,
> addr=, control=,
> td=)
> at /usr/src/sys/modules/netgraph/socket/../../../netgraph/ng_socket.c:338
> #15 0x807dee17 in sosend_generic (so=,
> addr=, uio=,
> top=, control=,
> flags=, td=)
> at ../../../kern/uipc_socket.c:1359
> #16 0x807e66b8 in kern_sendit (td=,
> s=, mp=, flags=0, control=0x0,
> segflg=) at ../../../kern/uipc_syscalls.c:848
> #17 0x807e6abf in sendit (td=0xf800047caa00,
> s=, mp=0xfe011b06da60,
> flags=) at ../../../kern/uipc_syscalls.c:775
> #18 0x807e690d in sys_sendto (td=0x0, uap=)
> at ../../../kern/uipc_syscalls.c:899
> #19 0x80a5b618 in amd64_syscall (td=, traced=0)
> at subr_syscall.c:135
> #20 0x80a3e44b in Xfast_syscall ()
> at ../../../amd64/amd64/exception.S:396
> #21 0x0008025d284a in ?? ()
> Previous frame inner to this frame (corrupt stack?)
> Current language:  auto; currently minimal
> 
> 

There is a patch for this issue:

https://reviews.freebsd.org/D7209

You might try it to see whether it solves your problem, and report the result to
the author.

-- 
Allan Jude





Re: ptrace attach in multi-threaded processes

2016-07-14 Thread Mark Johnston
On Thu, Jul 14, 2016 at 08:25:37AM +0300, Konstantin Belousov wrote:
> On Wed, Jul 13, 2016 at 01:01:39PM -0700, Mark Johnston wrote:
> > On Wed, Jul 13, 2016 at 10:19:47PM +0300, Konstantin Belousov wrote:
> > > Hmm, I think no, we can not make such change. Issue is, debugger
> > > interface guarantees (at least for single-threaded programs it is
> > > done correctly) that SIGSTOP is noted. In my opinion, it would be the
> > > incompatible API change.
> > 
> > But this guarantee is not honoured in the single-threaded case where
> > PT_ATTACH sends SIGSTOP after another signal is already pending. This
> > other signal will stop the process in ptracestop(), so SIGSTOP will not
> > be reported until after a PT_CONTINUE or PT_DETACH, which seems to
> > violate the interface as you described it. Am I missing some reason
> > that this cannot occur? If not, I'll write a test case for the
> > single-threaded case first.
> 
> Please give me some initial test case, I am fine with single-threaded case.
> I do not think that the mt test would be much different ?

Please see the program here:
https://people.freebsd.org/~markj/ptrace_stop.c

It cheats a bit: it uses SIGSTOP to stop the child before sending a
SIGHUP to it. However, this is just for convenience; note that PT_ATTACH
will result in a call to thread_unsuspend() on the child, so PT_ATTACH's
SIGSTOP will be delivered to a running process. When ptrace attaches,
the child stops and WSTOPSIG(status) == SIGHUP. When ptrace detaches,
the child is left stopped.
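
For readers who do not want to fetch the program, the attach-and-observe step can be sketched roughly as below. This is not the linked ptrace_stop.c; the helper names stop_signal() and attach_and_wait() are made up for illustration, and the caller is assumed to have already stopped the child with SIGSTOP and sent it SIGHUP, as described above.

```c
#include <sys/types.h>
#include <sys/ptrace.h>
#include <sys/wait.h>

#include <assert.h>
#include <signal.h>
#include <stddef.h>
#include <unistd.h>

/* Signal that stopped the child per a waitpid() status, or -1 if it is not stopped. */
int
stop_signal(int status)
{
	return (WIFSTOPPED(status) ? WSTOPSIG(status) : -1);
}

/*
 * Hypothetical helper: attach to `child` and return the first stop
 * signal the tracer sees, or -1 on error.  The caller is assumed to
 * have stopped the child with SIGSTOP and then queued SIGHUP with
 * kill(2) before calling this.
 */
int
attach_and_wait(pid_t child)
{
	int status;

	assert(child > 0);
	if (ptrace(PT_ATTACH, child, NULL, 0) == -1)
		return (-1);
	if (waitpid(child, &status, 0) != child)
		return (-1);
	return (stop_signal(status));
}
```

In the scenario above, attach_and_wait() would be expected to return SIGHUP rather than SIGSTOP, which is the behaviour in question.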


RoCE v2 on FreeBSD

2016-07-14 Thread David Somayajulu
Hi All,
Does FreeBSD support RoCE v2 ?
Thanks
David S.


CURRENT: frequent crashes if mpd5 is running

2016-07-14 Thread Oleg V. Nauman
 I'm experiencing frequent CURRENT ( 12.0-CURRENT r302535 amd64 ) crashes 
triggered by mpd5:

Fatal trap 12: page fault while in kernel mode
cpuid = 0; apic id = 00
fault virtual address   = 0x10
fault code              = supervisor read data, page not present
instruction pointer     = 0x20:0x814f6162
stack pointer           = 0x28:0xfe011b06d640
frame pointer           = 0x28:0xfe011b06d670
code segment            = base 0x0, limit 0xf, type 0x1b
                        = DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags        = interrupt enabled, resume, IOPL = 0
current process         = 901 (mpd5)
trap number             = 12
panic: page fault
cpuid = 0

#0  doadump (textdump=) at pcpu.h:221
221 pcpu.h: No such file or directory.
in pcpu.h
(kgdb) #0  doadump (textdump=) at pcpu.h:221
#1  0x80749169 in kern_reboot (howto=260)
at ../../../kern/kern_shutdown.c:366
#2  0x807496e1 in vpanic (fmt=,
ap=) at ../../../kern/kern_shutdown.c:759
#3  0x80749553 in panic (fmt=0x0) at ../../../kern/kern_shutdown.c:690
#4  0x80a5aca1 in trap_fatal (frame=0xfe011b06d590, eva=16)
at ../../../amd64/amd64/trap.c:841
#5  0x80a5af51 in trap_pfault (frame=0x0, usermode=0)
at ../../../amd64/amd64/trap.c:716
#6  0x80a5a430 in trap (frame=0xfe011b06d590)
at ../../../amd64/amd64/trap.c:442
#7  0x80a3e161 in calltrap () at ../../../amd64/amd64/exception.S:236
#8  0x814f6162 in ng_uncallout (c=0xf80004842460,
node=0xf80004c79a00)
at /usr/src/sys/modules/netgraph/netgraph/../../../netgraph/ng_base.c:3815
#9  0x8151bbab in ng_pptpgre_disconnect (hook=)
at 
/usr/src/sys/modules/netgraph/pptpgre/../../../netgraph/ng_pptpgre.c:966
#10 0x814f2928 in ng_destroy_hook (hook=0xf8000487ad80)
at /usr/src/sys/modules/netgraph/netgraph/../../../netgraph/ng_base.c:1219
#11 0x814f2635 in ng_rmnode (node=,
dummy1=, dummy2=,
dummy3=)
at /usr/src/sys/modules/netgraph/netgraph/../../../netgraph/ng_base.c:744
#12 0x814f4832 in ng_apply_item (node=0xf80004c79a00,
item=0xf80004e72600, rw=1)
at /usr/src/sys/modules/netgraph/netgraph/../../../netgraph/ng_base.c:2523
#13 0x814f41a3 in ng_snd_item (item=,
flags=)
at /usr/src/sys/modules/netgraph/netgraph/../../../netgraph/ng_base.c:2320
#14 0x814eec4e in ngc_send (so=,
flags=, m=,
addr=, control=,
td=)
at /usr/src/sys/modules/netgraph/socket/../../../netgraph/ng_socket.c:338
#15 0x807dee17 in sosend_generic (so=,
addr=, uio=,
top=, control=,
flags=, td=)
at ../../../kern/uipc_socket.c:1359
#16 0x807e66b8 in kern_sendit (td=,
s=, mp=, flags=0, control=0x0,
segflg=) at ../../../kern/uipc_syscalls.c:848
#17 0x807e6abf in sendit (td=0xf800047caa00,
s=, mp=0xfe011b06da60,
flags=) at ../../../kern/uipc_syscalls.c:775
#18 0x807e690d in sys_sendto (td=0x0, uap=)
at ../../../kern/uipc_syscalls.c:899
#19 0x80a5b618 in amd64_syscall (td=, traced=0)
at subr_syscall.c:135
#20 0x80a3e44b in Xfast_syscall ()
at ../../../amd64/amd64/exception.S:396
#21 0x0008025d284a in ?? ()
Previous frame inner to this frame (corrupt stack?)
Current language:  auto; currently minimal



Re: r302773: non-removable files with "make delete-old"

2016-07-14 Thread Ngie Cooper (yaneurabeya)

> On Jul 13, 2016, at 11:20, O. Hartmann  wrote:
> 
> Am Wed, 13 Jul 2016 09:40:05 -0700
> David Wolfskill  schrieb:
> 
>> On Wed, Jul 13, 2016 at 06:35:10PM +0200, O. Hartmann wrote:
>>> make delete-old removes these files on CURRENT (FreeBSD 12.0-CURRENT #12 
>>> r302773: Wed
>>> Jul 13 18:10:55 CEST 2016), but they seem not to disappear. They are 
>>> present after an
>>> installation of world again and again:
>>> 
>>> [...]
>>> remove /usr/share/locale/kk_KZ.UTF-8/LC_COLLATE? y
>>> 
>> 
>> Please see https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=211046
>> 
>> (Above applies to both head & stable/11.)
>> 
>> Peace,
>> david
> 
> All right, I'm not the only one.

Fixed in ^/head@r302842; will be MFCed after a week to ^/stable/11 .
Thanks for the report!
-Ngie




Re: svn commit: r302601 - in head/sys: arm/include arm64/include [clang 3.8.0: powerpc int instead of 32-bit SYSVR4's long and 64-bit ELF V2 long]

2016-07-14 Thread Mark Millard
[Top post of a history note for powerpc and wchar_t's type in FreeBSD. The 
history is from looking around in svn.]

[The below is not a complaint or a request for a change. It just looks like int 
for wchar_t for powerpc was a choice made long ago for simpler code given 
FreeBSD's pre-existing structure.]

int being used for powerpc wchar_t on FreeBSD goes back to at least 2001-Jan-1. 
[FYI: "27 February, 2008: FreeBSD 7.0 is the first release to officially 
support the FreeBSD/ppc port". So long before official support.]

wchar_t's type is one place where FreeBSD chose to override the powerpc (and 
powerpc64) ABI standards (which indicate long, not int). I'm not sure whether 
this was done implicitly or with explicit awareness of the ABI mismatch. [The 
SYSVR4 32-bit powerpc ABI goes back to 1995.]

I first traced the history back to 2002-Aug-23: -r102315 of sys/sys/_types.h 
standardized FreeBSD on the following until the ARM change:

typedef int __ct_rune_t;
typedef __ct_rune_t __rune_t;
typedef __ct_rune_t __wchar_t;
typedef __ct_rune_t __wint_t;

Prior to this there was 2002-Aug-21's -r102227 sys/powerpc/include/_types.h 
that used __int32_t.

Prior to that there were ansi.h and types.h instead of _types.h, and ansi.h had:

#define _BSD_WCHAR_T_   _BSD_CT_RUNE_T_ /* wchar_t (see below) */
. . .
#define _BSD_CT_RUNE_T_ int /* arg type for ctype funcs */

Going back to sys/powerpc/include/ansi.h's -r70571 (2001-Jan-1 creation in svn):

#define _BSD_WCHAR_T_   int /* wchar_t */

And the comments back then say:

. . . It is not
 * unsigned so that EOF (-1) can be naturally assigned to it and used.
. . . The reason an int was
 * chosen over a long is that the is*() and to*() routines take ints (says
 * ANSI C), but they use __ct_rune_t instead of int.
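
A small illustration of the EOF point in that comment, as I read it: a signed int keeps EOF's value when it is later widened, while an unsigned type of the same width does not. The function names are invented for this note and long long is assumed to be wider than int.

```c
#include <assert.h>
#include <stdio.h>

/* Store EOF in a signed int (the type FreeBSD uses here), then widen it. */
long long
widen_signed(void)
{
	int wc = EOF;

	assert(EOF < 0);	/* the C standard requires EOF to be negative */
	return (wc);
}

/* Same with an unsigned type: the value wraps and no longer equals EOF. */
long long
widen_unsigned(void)
{
	unsigned int wc = (unsigned int)EOF;

	return (wc);
}
```

With the signed type the widened value still compares equal to EOF; with the unsigned type it becomes a large positive number instead.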

I've decided to not go any farther back in time (if there is prior history for 
wchar_t for powerpc).

Ignoring the temporary __int32_t use: FreeBSD has had its own powerpc wchar_t 
type (int) for at least the last 15 years, at least when viewed just relative 
to the powerpc ABI(s) FreeBSD is based on for powerpc.



Modern gcc versions even have the FreeBSD wchar_t type correct for powerpc 
variants in recent times: int. Previously some notation (L based notation) used 
the wrong type for one of the powerpc variants (32-bit vs. 64-bit), causing 
lots of false-positive compiler notices. gcc had followed the ABI involved 
(long int) until the correction.

===
Mark Millard
markmi at dsl-only.net

On 2016-Jul-13, at 11:46 PM, Mark Millard  wrote:

> On 2016-Jul-13, at 6:00 PM, Andrey Chernov  wrote:
> 
>> On 13.07.2016 11:53, Mark Millard wrote:
>>> [The below does note that TARGET=powerpc has a mix of signed wchar_t and 
>>> unsigned char types and most architectures have both being signed types.]
>> 
>> POSIX says nothing about wchar_t and char should be the same (un)signed.
>> It is arm ABI docs may say so only. They are different entities
>> differently encoded and cross assigning between wchar_t and char is not
>> recommended.
> 
> [My "odd" would better have been the longer phrase "unusual for FreeBSD" for 
> the signed type mismatch point.]
> 
> C11 (9899:2011[2012]) and C++11 (14882:2011(E)) agree with your POSIX note: 
> no constraint to have the same signed type status as char.
> 
> But when I then looked at the "System V Application Binary Interface PowerPC 
> Processor Supplement" (1995-Sept SunSoft document) that I believe FreeBSD 
> uses for powerpc (32-bit only: TARGET_ARCH=powerpc) it has:
> 
> typedef long wchar_t;
> 
> as part of: Figure 6-39  (page labeled 6-38).
> 
> While agreeing about the signed-type status for wchar_t this does not agree 
> with FreeBSD 11.0's use of int as the type:
> 
> sys/powerpc/include/_types.h:typedef int ___wchar_t;
> sys/powerpc/include/_types.h:#define __WCHAR_MIN __INT_MIN /* min value for a wchar_t */
> sys/powerpc/include/_types.h:#define __WCHAR_MAX __INT_MAX /* max value for a wchar_t */
> 
> # clang --target=powerpc-freebsd11 -std=c99 -E -dM  - < /dev/null | more
> . . .
> #define __WCHAR_MAX__ 2147483647
> #define __WCHAR_TYPE__ int
> #define __WCHAR_WIDTH__ 32
> . . .
> 
> I'm not as sure of which document is official for TARGET_ARCH=powerpc64 but 
> using "Power Architecture 64-bit ELF V2 ABI Specification" (Open POWER ABI 
> for Linux Supplement) as an example of what likely is common for that 
> context: 5.1.3 Types Defined in Standard header lists:
> 
> typedef long wchar_t;
> 
> which again does not agree with FreeBSD 11.0's use of int as the type:
> 
> # clang --target=powerpc64-freebsd11 -std=c99 -E -dM  - < /dev/null | more
> . . .
> #define __WCHAR_MAX__ 2147483647
> #define __WCHAR_TYPE__ int
> #define __WCHAR_WIDTH__ 32
> . . .
> 
> 
> ===
> Mark Millard
> markmi at dsl-only.net
> 
> 
>> 
>> On 2016-Jul-11, at 8:57 PM, Andrey Chernov  wrote:
>> 
>>> On 12.07.2016 5:44, Mark Millard wrote:
 My 

Jenkins build is back to normal : FreeBSD_HEAD_sparc64 #148

2016-07-14 Thread jenkins-admin
See 



Re: svn commit: r302601 - in head/sys: arm/include arm64/include [clang 3.8.0: powerpc int instead of 32-bit SYSVR4's long and 64-bit ELF V2 long]

2016-07-14 Thread Mark Millard
On 2016-Jul-13, at 6:00 PM, Andrey Chernov  wrote:

> On 13.07.2016 11:53, Mark Millard wrote:
>> [The below does note that TARGET=powerpc has a mix of signed wchar_t and 
>> unsigned char types and most architectures have both being signed types.]
> 
> POSIX says nothing about wchar_t and char should be the same (un)signed.
> It is arm ABI docs may say so only. They are different entities
> differently encoded and cross assigning between wchar_t and char is not
> recommended.

[My "odd" would better have been the longer phrase "unusual for FreeBSD" for 
the signed type mismatch point.]

C11 (9899:2011[2012]) and C++11 (14882:2011(E)) agree with your POSIX note: no 
constraint to have the same signed type status as char.

But when I then looked at the "System V Application Binary Interface PowerPC 
Processor Supplement" (1995-Sept SunSoft document) that I believe FreeBSD uses 
for powerpc (32-bit only: TARGET_ARCH=powerpc) it has:

typedef long wchar_t;

as part of: Figure 6-39  (page labeled 6-38).

While agreeing about the signed-type status for wchar_t this does not agree 
with FreeBSD 11.0's use of int as the type:

sys/powerpc/include/_types.h:typedef int ___wchar_t;
sys/powerpc/include/_types.h:#define __WCHAR_MIN __INT_MIN /* min value for a wchar_t */
sys/powerpc/include/_types.h:#define __WCHAR_MAX __INT_MAX /* max value for a wchar_t */

# clang --target=powerpc-freebsd11 -std=c99 -E -dM  - < /dev/null | more
. . .
#define __WCHAR_MAX__ 2147483647
#define __WCHAR_TYPE__ int
#define __WCHAR_WIDTH__ 32
. . .

I'm not as sure of which document is official for TARGET_ARCH=powerpc64 but 
using "Power Architecture 64-bit ELF V2 ABI Specification" (Open POWER ABI for 
Linux Supplement) as an example of what likely is common for that context: 
5.1.3 Types Defined in Standard header lists:

typedef long wchar_t;

which again does not agree with FreeBSD 11.0's use of int as the type:

# clang --target=powerpc64-freebsd11 -std=c99 -E -dM  - < /dev/null | more
. . .
#define __WCHAR_MAX__ 2147483647
#define __WCHAR_TYPE__ int
#define __WCHAR_WIDTH__ 32
. . .
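
The same properties that the -dM output reports can also be checked from C on whatever target is at hand. The following is only a sketch using standard headers; the function names are mine and are not taken from any FreeBSD source.

```c
#include <assert.h>
#include <limits.h>
#include <wchar.h>

/* Nonzero when wchar_t is a signed type on this target. */
int
wchar_is_signed(void)
{
	return ((wchar_t)-1 < 0);
}

/* Width of wchar_t in bits, comparable to __WCHAR_WIDTH__. */
int
wchar_width_bits(void)
{
	return ((int)(sizeof(wchar_t) * CHAR_BIT));
}

/* Check that the <wchar.h> limits agree with the type's signedness. */
void
check_wchar_limits(void)
{
	/* WCHAR_MIN is negative exactly when wchar_t is signed. */
	assert((WCHAR_MIN < 0) == wchar_is_signed());
	assert(WCHAR_MAX > 0);
}
```

This says nothing about whether the underlying type is int or long where the two have the same width (as on 32-bit powerpc); for that distinction the __WCHAR_TYPE__ macro shown above is the thing to look at.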


===
Mark Millard
markmi at dsl-only.net


> 
> On 2016-Jul-11, at 8:57 PM, Andrey Chernov  wrote:
> 
>> On 12.07.2016 5:44, Mark Millard wrote:
>>> My understanding of the criteria for __WCHAR_MIN and __WCHAR_MAX:
>>> 
>>> A) __WCHAR_MIN and __WCHAR_MAX: same type as the integer promotion of
>>> ___wchar_t (if that is distinct).
>>> B) __WCHAR_MIN is the low value for ___wchar_t as an integer type; not
>>> necessarily a valid char value
>>> C) __WCHAR_MAX is the high value for ___wchar_t as an integer type; not
>>> necessarily a valid char value
>> 
>> It seems you are right about "not a valid char value", I'll back this
>> change out.
>> 
>>> As far as I know arm FreeBSD uses unsigned character types (of whatever
>>> width).
>> 
>> Probably it should be unsigned for other architectures too, clang does
>> not generate negative values with L'' literals and locale use only
>> positive values too.
> 
> Looking around:
> 
> # grep -i wchar sys/*/include/_types.h
> sys/arm/include/_types.h:typedef unsigned int ___wchar_t;
> sys/arm/include/_types.h:#define __WCHAR_MIN 0 /* min value for a wchar_t */
> sys/arm/include/_types.h:#define __WCHAR_MAX __UINT_MAX /* max value for a wchar_t */
> sys/arm64/include/_types.h:typedef unsigned int ___wchar_t;
> sys/arm64/include/_types.h:#define __WCHAR_MIN 0 /* min value for a wchar_t */
> sys/arm64/include/_types.h:#define __WCHAR_MAX __UINT_MAX /* max value for a wchar_t */
> sys/mips/include/_types.h:typedef int ___wchar_t;
> sys/mips/include/_types.h:#define __WCHAR_MIN __INT_MIN /* min value for a wchar_t */
> sys/mips/include/_types.h:#define __WCHAR_MAX __INT_MAX /* max value for a wchar_t */
> sys/powerpc/include/_types.h:typedef int ___wchar_t;
> sys/powerpc/include/_types.h:#define __WCHAR_MIN __INT_MIN /* min value for a wchar_t */
> sys/powerpc/include/_types.h:#define __WCHAR_MAX __INT_MAX /* max value for a wchar_t */
> sys/riscv/include/_types.h:typedef int ___wchar_t;
> sys/riscv/include/_types.h:#define __WCHAR_MIN __INT_MIN /* min value for a wchar_t */
> sys/riscv/include/_types.h:#define __WCHAR_MAX __INT_MAX /* max value for a wchar_t */
> sys/sparc64/include/_types.h:typedef int ___wchar_t;
> sys/sparc64/include/_types.h:#define __WCHAR_MIN __INT_MIN /* min value for a wchar_t */
> sys/sparc64/include/_types.h:#define __WCHAR_MAX __INT_MAX /* max value for a wchar_t */
> sys/x86/include/_types.h:typedef int ___wchar_t;
> sys/x86/include/_types.h:#define __WCHAR_MIN __INT_MIN /* min value for a wchar_t */
> 

Re: FreeBSD-11.0-BETA1-amd64-disc1.iso is too big for my 700MB CD-r

2016-07-14 Thread Chris H
> On Wed, Jul 13, 2016 at 3:12 PM, Glen Barber  wrote:
> 
> > Just replying to the first email in the thread, since it's a general
> > reply, and only related to the original topic at hand, and only for
> > informative purposes at this point.
> >
> > On Mon, Jul 11, 2016 at 11:01:51PM +0200, Ronald Klop wrote:
> > > Just downloaded the amd64 BETA1 ISO (873MB) and tried to burn a CD on
> > > Windows 10. It complained that the ISO is too big for my 700 MB CD-r.
> > >
> >
> > I have *something* semi-working, with a huge amount of help from Maxim
> > in private email.  There is still a nit or two to fix, I'm running into
> > them as I rebuild the ISO after fixing the prior issue.  But, right now,
> > I can get the ISO to boot enough to get to a shell (the "init failed due
> > to inability to mount '/'" shell, but it is still a shell).  :)
> >
> > Once I get what I have now into a state where it's somewhat committable,
> > I'm going to create a project branch to sand off the edges, instead of
> > doing it directly in head, since there might be some edge cases for
> > non-x86 architectures.  (But some other architectures do not have the
> > "too big" problem.)
> >
> > Once that is merged, I fully intend to merge this to stable/11, provided
> > there is no major fallout.  With what I have now, disc1.iso is 630M, and
> > the disc1.iso.xz is 554M.  I'll upload an image somewhere public for
> > people to test 11.0-BETA1 on hardware, KVM, etc.  One thing to note,
> > though, there appears to be a significantly non-zero speed decrease,
> > though this may just be because my CD-ROM is USB-based.  When I have the
> > ready-to-commit result, I'll test it on a machine with an internal CD
> > drive.
> >
> > Glen
> >
> >
On Wed, 13 Jul 2016 22:30:33 -0700 Maxim Sobolev  wrote

> Hi Glen, nice update, glad being of some help. The slowdown may be related
> to the fact that geom_uzip reads whole compressed cluster, which is 20-30k
> typically, even if only single block from that cluster is requested. I
> imagine it might impact rc.d, which is essentially bunch of small(ish)
> shell scripts and I would not be surprised if their blocks would be
> scattered all over the place. There is some very basic caching in the
> geom_uzip module, but it is only one cluster deep. What might help if you
> still have some room on the CD is to decrease cluster size (-s parameter of
> mkuzip), to something like 32k or even 16k. That would make compression
> less effective, but would reduce the I/O bandwidth waste, which could also
> be important for the KVM setups. I might also look into making a bigger
> cache, as RAM is getting cheaper and more abundant every day. Another
> approach would be to make several "partitions", segregating for example
> /etc stuff so it's all tightly packed together and you can also use smaller
> cluster size for /etc and bigger for the rest. In any case, keep me posted
> with your findings.
> 
> -Max
> 
It's mostly CPU and I/O bound, and it's going to prove painful for some
with lesser-powered hardware. But better than the alternative. Right?

Hey, Glen. Just a nod, for taking the time to do this!

--Chris
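
To put rough numbers on Maxim's point about whole-cluster reads, here is a toy model. It is not geom_uzip code: it only assumes what the quoted mail states (a whole compressed cluster is read per request, and the cache is one cluster deep) plus a made-up constant compressed size per cluster. The function name is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Toy model of a one-cluster-deep cache.  For each requested byte
 * offset, a full compressed cluster is read unless the request falls
 * in the cluster that is already cached.  Returns total bytes read.
 */
uint64_t
uzip_model_bytes_read(const uint64_t *offsets, size_t n,
    uint64_t cluster_size, uint64_t compressed_cluster_size)
{
	uint64_t total = 0, cached = 0;
	int have_cached = 0;

	assert(cluster_size > 0);
	for (size_t i = 0; i < n; i++) {
		uint64_t cluster = offsets[i] / cluster_size;

		if (!have_cached || cluster != cached) {
			total += compressed_cluster_size;
			cached = cluster;
			have_cached = 1;
		}
	}
	return (total);
}
```

With an assumed 64k cluster that compresses to 24k, three single-block reads scattered across three clusters cost three full cluster reads in this model, while three reads inside one cluster cost only one. That is the effect a smaller mkuzip -s value or a tightly packed /etc would be trading against compression ratio.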

