Re: drm-current-kmod panics

2019-12-19 Thread Jeff Roberson

On Thu, 19 Dec 2019, Hans Petter Selasky wrote:


On 2019-12-19 19:40, Cy Schubert wrote:

In message , Hans Petter Selasky writes:

On 2019-12-19 17:50, Cy Schubert wrote:

Has anyone else had these since Dec 9?

<4>WARN_ON(!mutex_is_locked(>lock))WARN_ON(!mutex_is_locked(>
lock))
panic: page fault
cpuid = 1
time = 1576772837
KDB: stack backtrace:
db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame
0xfe007c98b930
vpanic() at vpanic+0x17e/frame 0xfe007c98b990
panic() at panic+0x43/frame 0xfe007c98b9f0
trap_fatal() at trap_fatal+0x386/frame 0xfe007c98ba50
trap_pfault() at trap_pfault+0x4f/frame 0xfe007c98bac0
trap() at trap+0x41b/frame 0xfe007c98bbf0
calltrap() at calltrap+0x8/frame 0xfe007c98bbf0
--- trap 0xc, rip = 0x242c52, rsp = 0x7fffbe70, rbp = 0x7fffbe90 ---

Uptime: 59m7s

It is triggered through random keystrokes or mouse movements.


Looks like a double fault.

Did you recompile drm-current-kmod with the latest kernel sources?


Yes.




Are you able to get a full backtrace?


Since my recent scheduler commits the following functions now return 
without the thread lock held:


sched_add()/sched_wakeup()/sched_switch()/mi_switch()/setrunnable()/sleepq_abort()

I audited drm and linuxkpi for use of these functions.  There was one in 
the linuxkpi sources that I corrected in the same commit as the change in 
API.  I don't see any users of these in drm-current-kmod.  It is possible 
that I have somehow missed one.  I did just commit a fix to cpuset that may 
be called indirectly somehow.  That fix is r355915.  The first commit of 
this series was r35579.


If this is at fault I may need some assistance in identifying the 
offending call.  If so, though, it should show up more quickly under 
INVARIANTS/WITNESS than as a page fault.
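The changed contract can be illustrated with a small userland sketch; threading.Lock stands in for the kernel's thread lock, and the function names are hypothetical, not the actual scheduler code:

```python
import threading

# Userland sketch of the contract change described above. threading.Lock
# stands in for the kernel's per-thread lock; names are hypothetical.
td_lock = threading.Lock()

def sched_add_old():
    """Old contract: caller still holds td_lock when this returns."""
    pass  # ... enqueue the thread, leave td_lock held ...

def sched_add_new():
    """New contract: td_lock is released before returning."""
    # ... enqueue the thread ...
    td_lock.release()

td_lock.acquire()
sched_add_new()
# A caller still written to the old contract would release() again here,
# a double unlock -- a RuntimeError in this model, and the kind of misuse
# a WITNESS/INVARIANTS kernel flags long before a random page fault.
assert not td_lock.locked()
```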


Thanks,
Jeff



--HPS
___
freebsd-current@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"




object locking patch series going into current later today

2019-10-14 Thread Jeff Roberson

Hello,

As part of the NUMA and VM concurrency work I have refactored the way page 
busy state works in order to improve vm object concurrency.  This is a 
six-patch series that has been tested more in totality than in parts.  I 
will be committing them back to back after a final universe build today.


https://reviews.freebsd.org/D21548
https://reviews.freebsd.org/D21549
https://reviews.freebsd.org/D21592
https://reviews.freebsd.org/D21594
https://reviews.freebsd.org/D21595
https://reviews.freebsd.org/D21596

This has been tested fairly extensively, so I don't expect a lot of 
fallout, but you may want to wait for all six before you update.


Thanks,
Jeff


Re: Strange ARC/Swap/CPU on yesterday's -CURRENT

2018-03-12 Thread Jeff Roberson

On Sun, 11 Mar 2018, Matthew D. Fuller wrote:


On Sun, Mar 11, 2018 at 10:43:58AM -1000 I heard the voice of
Jeff Roberson, and lo! it spake thus:


First, I would like to identify whether the wired memory is in the
buffer cache.  Can those of you that have a repro look at sysctl
vfs.bufspace and tell me if that accounts for the bulk of your wired
memory usage?  I'm wondering if a job ran that pulled in all of the
bufs from your root disk and filled up the buffer cache which
doesn't have a back-pressure mechanism.


If by "root disk", you mean the one that isn't ZFS, that wouldn't
touch anything here; apart from a md-backed UFS /tmp and some NFS
mounts, everything on my system is ZFS.

I believe vfs.bufspace is what shows up as "Buf" on top?  I don't
recall it looking particularly interesting when things were madly
swapping.  I'll uncork arc_max again for a bit and see if anything odd
shows up in it, but it's only a dozen megs or so now.


You are right.  I forgot that it was in top and didn't notice.

What I believe I need most is for someone to bisect a few revisions to let 
me know if it was one of my two major patches.


Thanks,
Jeff





--
Matthew Fuller (MF4839)   |  fulle...@over-yonder.net
Systems/Network Administrator |  http://www.over-yonder.net/~fullermd/
  On the Internet, nobody can hear you scream.




Re: Strange ARC/Swap/CPU on yesterday's -CURRENT

2018-03-11 Thread Jeff Roberson

On Sun, 11 Mar 2018, Mark Millard wrote:


As I understand, O. Hartmann's report ( ohartmann at walstatt.org ) in:

https://lists.freebsd.org/pipermail/freebsd-current/2018-March/068806.html

includes a system with a completely non-ZFS context: UFS only. Quoting that 
part:


This is from an APU, no ZFS, UFS on a small mSATA device; the APU (PC 
Engines) works as a firewall, router, PBX:

last pid:  9665;  load averages:  0.13,  0.13,  0.11   up 3+06:53:55  00:26:26
19 processes:  1 running, 18 sleeping
CPU:  0.3% user, 0.0% nice, 0.2% system, 0.0% interrupt, 99.5% idle
Mem: 27M Active, 6200K Inact, 83M Laundry, 185M Wired, 128K Buf, 675M Free
Swap: 7808M Total, 2856K Used, 7805M Free
[...]

The APU is running CURRENT (FreeBSD 12.0-CURRENT #42 r330608: Wed Mar  7 
16:55:59 CET 2018 amd64). Usually the APU never(!) uses swap; now it has 
been swapping like hell for a couple of days and I have to reboot it 
fairly often.


Unless this is unrelated, it would suggest that ZFS and its ARC need not
be involved.

Would what you are investigating relative to your "NUMA and concurrency
related work" fit with such a non-ZFS (no-ARC) context?


I think there are probably two different bugs.  I believe the PID 
controller has caused the laundry thread to become more aggressive, 
causing more pageouts, which would in turn increase swap consumption.


The back-pressure mechanisms in the ARC should've resolved the other 
reports.  It's possible that I broke those, although if the reports from 
11.x are to be believed, I don't know that it was me.  It is possible they 
have been broken at different times for different reasons.  So I will 
continue to look.


Thanks,
Jeff



===
Mark Millard
marklmi at yahoo.com
( dsl-only.net went
away in early 2018-Mar)




Re: Strange ARC/Swap/CPU on yesterday's -CURRENT

2018-03-11 Thread Jeff Roberson

On Sun, 11 Mar 2018, O. Hartmann wrote:


Am Wed, 7 Mar 2018 14:39:13 +0400
Roman Bogorodskiy  schrieb:


  Danilo G. Baio wrote:


On Tue, Mar 06, 2018 at 01:36:45PM -0600, Larry Rosenman wrote:

On Tue, Mar 06, 2018 at 10:16:36AM -0800, Rodney W. Grimes wrote:

On Tue, Mar 06, 2018 at 08:40:10AM -0800, Rodney W. Grimes wrote:

On Mon, 5 Mar 2018 14:39-0600, Larry Rosenman wrote:


Upgraded to:

FreeBSD borg.lerctr.org 12.0-CURRENT FreeBSD 12.0-CURRENT #11 r330385:
Sun Mar  4 12:48:52 CST 2018
r...@borg.lerctr.org:/usr/obj/usr/src/amd64.amd64/sys/VT-LER  amd64
+1200060 1200060

Yesterday, and I'm seeing really strange slowness, ARC use, and SWAP use
and swapping.

See http://www.lerctr.org/~ler/FreeBSD/Swapuse.png


I see these symptoms on stable/11. One of my servers has 32 GiB of
RAM. After a reboot all is well. ARC starts to fill up, and I still
have more than half of the memory available for user processes.

After running the periodic jobs at night, the amount of wired memory
goes sky high. /etc/periodic/weekly/310.locate is a particular nasty
one.


I would like to find out if this is the same person who reported this 
problem to me from another source, or if this is a confirmation of a 
bug I was helping someone else with.

Have you been in contact with Michael Dexter about this
issue, or any other forum/mailing list/etc?

Just IRC/Slack, with no response.


If not then we have at least 2 reports of this unbound
wired memory growth, if so hopefully someone here can
take you further in the debug than we have been able
to get.

What can I provide?  The system is still in this state as the full backup is
slow.


One place to look is to see if this is the recently fixed:
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=88
g_bio leak.

vmstat -z | egrep 'ITEM|g_bio|UMA'

would be a good first look


borg.lerctr.org /home/ler $ vmstat -z | egrep 'ITEM|g_bio|UMA'
ITEM          SIZE  LIMIT      USED     FREE        REQ  FAIL SLEEP
UMA Kegs:      280,     0,      346,       5,       560,    0,    0
UMA Zones:    1928,     0,      363,       1,       577,    0,    0
UMA Slabs:     112,     0, 25384098,  977762, 102033225,    0,    0
UMA Hash:      256,     0,       59,      16,       105,    0,    0
g_bio:         384,     0,       33,    1627, 542482056,    0,    0
borg.lerctr.org /home/ler $
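As a hedged aside (not part of the thread), output like the above is easy to screen programmatically; the sketch below parses `vmstat -z`-style rows and estimates the memory each zone holds, which is one way to spot a runaway zone like the g_bio leak referenced in the PR. Sample rows are taken from the mail:

```python
# Sketch: parse `vmstat -z`-style rows and estimate per-zone memory held.
# Sample rows are from the mail above; field layout is SIZE, LIMIT, USED,
# FREE, REQ, FAIL, SLEEP.
SAMPLE = """\
ITEM   SIZE  LIMIT USED FREE  REQ FAIL SLEEP
UMA Slabs:  112,  0, 25384098, 977762, 102033225, 0, 0
g_bio:  384,  0,  33, 1627, 542482056, 0, 0
"""

def parse_zones(text):
    zones = {}
    for line in text.splitlines()[1:]:            # skip the header row
        name, rest = line.split(":", 1)
        size, limit, used, free, req, fail, sleep = (
            int(f) for f in rest.replace(",", " ").split())
        zones[name.strip()] = dict(size=size, used=used, free=free, req=req)
    return zones

def footprint_bytes(z):
    """Approximate bytes held by a zone's used + cached free items."""
    return z["size"] * (z["used"] + z["free"])

zones = parse_zones(SAMPLE)
assert footprint_bytes(zones["g_bio"]) == 384 * (33 + 1627)
# In this capture the g_bio leak is already fixed (USED is back to 33);
# UMA Slabs dwarfs it by footprint.
assert footprint_bytes(zones["UMA Slabs"]) > footprint_bytes(zones["g_bio"])
```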

Limiting the ARC to, say, 16 GiB, has no effect on the high amount of
wired memory. After a few more days, the kernel consumes virtually all
memory, forcing processes in and out of the swap device.


Our experience as well.

...

Thanks,
Rod Grimes
rgri...@freebsd.org

Larry Rosenman http://www.lerctr.org/~ler


--
Rod Grimes rgri...@freebsd.org


--
Larry Rosenman http://www.lerctr.org/~ler
Phone: +1 214-642-9640 E-Mail: l...@lerctr.org
US Mail: 5708 Sabbia Drive, Round Rock, TX 78665-2106



Hi.

I noticed this behavior as well and changed vfs.zfs.arc_max to a smaller size.

For me it started when I upgraded to 1200058; on this box I'm only using
poudriere for build tests.


I've noticed that as well.

I have 16G of RAM and two disks; the first one is UFS with the system
installation, and the second one is ZFS, which I use to store media and
data files and for poudriere.

I don't recall the exact date, but it started fairly recently. The system
would swap like crazy to the point where I could not even ssh to it, and
could hardly log in through a tty: it might take 10-15 minutes to see a
command typed in the shell.

I've updated loader.conf to have the following:

vfs.zfs.arc_max="4G"
vfs.zfs.prefetch_disable="1"

It fixed the problem, but introduced a new one. When I'm building stuff
with poudriere with ccache enabled, it takes hours to build even small
projects like curl or gnutls.

For example, current build:

[10i386-default] [2018-03-07_07h44m45s] [parallel_build:] Queued: 3  Built: 1  Failed: 0  Skipped: 0  Ignored: 0  Tobuild: 2   Time: 06:48:35
[02]: security/gnutls | gnutls-3.5.18 build   (06:47:51)
Almost 7 hours already and still going!

gstat output looks like this:

dT: 1.002s  w: 1.000s
 L(q)  ops/s    r/s   kBps   ms/r    w/s   kBps   ms/w   %busy Name
    0      0      0      0    0.0      0      0    0.0     0.0  da0
    0      1      0      0    0.0      1    128    0.7     0.1  ada0
    1    106    106    439   64.6      0      0    0.0    98.8  ada1
    0      1      0      0    0.0      1    128    0.7     0.1  ada0s1
    0      0      0      0    0.0      0      0    0.0     0.0  ada0s1a
    0      0      0      0    0.0      0      0    0.0     0.0  ada0s1b
    0      1      0      0    0.0      1    128    0.7     0.1  ada0s1d

ada0 here is the UFS drive, and ada1 is ZFS.


Regards.
--
Danilo G. Baio (dbaio)




Roman Bogorodskiy




Re: FreeBSD has a politics problem

2018-03-04 Thread Jeff Roberson

Hi John,

First of all, this is really not an appropriate forum for this discussion. 
It is really unfortunate that these emails were leaked.  However, I think 
if you take a careful look at them, you will find a couple of examples of 
hostility, but the great preponderance of them are quite reasonable 
discussion.  People are sharing experiences, discussing how cultures other 
than western ones may be affected, what abuses have taken place, and what 
our aspirations for the project are.  This mostly looks like healthy 
debate to me, understanding that the subject matter may create strong 
feelings all around.


There is a lot of catastrophizing going on in the dialog, especially that 
on third party sites and among non-committers.  I believe in the strength 
of the project and its members to avoid these worst case scenarios.  I 
believe the vast majority of contributors are incredibly reasonable and 
desire a project where they can share their good work and be respected and 
respect others.  It is unfortunate that a few have left but that seems 
quite rash to me at this stage.


I would urge everyone to be calm and patient.  This is an important dialog 
and it's bound to be bumpy.  I also strongly urge people to refrain from 
discussing it further on technical lists, where it is counterproductive 
and unwelcome.


Regards,
Jeff


On Sat, 3 Mar 2018, John Darrah wrote:


FreeBSD recently introduced an updated Code of Conduct that developers and
members must adhere to. There has been much backlash online about it and
about introducing identity politics into a technical OS project in general.
The Code of Conduct was adopted from the "Geek Feminism" wiki's version,
which claims (among other things) that racism against whites doesn't exist,
sexism against men doesn't exist, and that certain protected classes of
people should not be criticised.

Emails of the internal discussion about this controversial Code of Conduct
have now been leaked publicly, painting a picture of the disagreement in
the FreeBSD project about how this was handled.

A number of developers, particularly benno@, phk@ and des@, have used racist
and sexist remarks against those criticising the far-reaching project policy
change, saying that the concerns about the policy essentially boil down to
"white male privilege" and being "on the wrong side of history".

Other developers expressed concern about the policy being thrown upon them
with no discussion or debate, as well as The FreeBSD Foundation's choice
to pay an outside person (with donations from the users) to work on the
Code of Conduct's enforcement. Said person identifies as a feminist.

Mods on BSD and FreeBSD-related subreddits are censoring posts, removing
threads, and banning users for posting the link. Colin Percival is among
the mods doing the removal. FreeBSD forum mods are also cracking down and
eliminating any discussion. Censorship is not the way to win culture wars.

This file is an email archive in MBOX format. You can open it with any
email client (including the mail or mutt commands) or view it as plaintext
with any text editor. It contains just over 200 emails.

View:
https://privatebin.net/?4c0fb59e63e8271e#irS3KFaEdtuFxsVM4xzQ4/llXLhSz0oZLV9WuOEUHBc=

Download:
https://mega.nz/#!xBpHBSAb!ENyoYPopqGVlx320X-a4ecpRjJBtPvd9jmRT9h57eao
https://my.mixtape.moe/nhybsi.mbox

I encourage you to read and form your own opinions, especially with regard
to how the project is handling donation money.




Re: New NUMA support coming to CURRENT

2018-01-13 Thread Jeff Roberson

Hello,

This work has been committed.  It is governed by a new 'NUMA' config 
option; 'DEVICE_NUMA' and 'VM_NUMA_ALLOC' have both been retired.  This 
option is fairly lightweight and I will likely enable it in GENERIC 
before the 12.0 release.


I have heard reports that switching from a default policy of first-touch 
to round-robin has caused some performance regression.  You can change the 
default policy at runtime by doing the following:


cpuset -s 1 -n first-touch:all

This is the default set that all others inherit from.  You can query the 
current default with:

cpuset -g -s 1

I will be investigating the regression and tweaking the default policy 
based on performance feedback from multiple workloads.  This may take some 
time.


numactl is still functional but deprecated.  Man pages will be updated 
soonish.


Thank you for your patience as I work on refining this somewhat involved 
feature.


Thanks,
Jeff

On Tue, 9 Jan 2018, Jeff Roberson wrote:


Hello folks,

I am working on merging improved NUMA support with policy implemented by 
cpuset(2) over the next week.  This work has been supported by Dell/EMC's 
Isilon product division and Netflix.  You can see some discussion of these 
changes here:


https://reviews.freebsd.org/D13403
https://reviews.freebsd.org/D13289
https://reviews.freebsd.org/D13545

The work has been done in user/jeff/numa if you want to look at svn history 
or experiment with the branch.  It has been tested by Peter Holm on i386 and 
amd64 and it has been verified to work on arm at various points.


We are working towards compatibility with libnuma and linux mbind.  These 
commits will bring in improved support for NUMA in the kernel.  There are new 
domain-specific allocation functions available to the kernel for UMA, malloc, 
kmem_, and vm_page*.  busdmamem consumers will automatically be placed in the 
correct domain, bringing automatic improvements to some device performance.


cpuset will be able to constrain processes, groups of processes, jails, etc. 
to subsets of the system memory domains, just as it can with sets of cpus. 
It can set default policy for any of the above.  Threads can use cpusets to 
set policy that specifies a subset of their visible domains.


Available policies are first-touch (local in linux terms), round-robin 
(similar to linux interleave), and preferred.  For now, the default is 
round-robin.  You can achieve a fixed domain policy by using round-robin with 
a bitmask of a single domain.  As the scheduler and VM become more 
sophisticated we may switch the default to first-touch as linux does.


Currently these features are enabled with VM_NUMA_ALLOC and MAXMEMDOM.  It 
will eventually be NUMA/MAXMEMDOM to match SMP/MAXCPU.  The current NUMA 
syscalls and VM_NUMA_ALLOC code was 'experimental' and will be deprecated. 
numactl will continue to be supported although cpuset should be preferred 
going forward as it supports the full feature set of the new API.


Thank you for your patience as I deal with the inevitable fallout of such 
sweeping changes.  If you do have bugs, please file them in bugzilla, or 
reach out to me directly.  I don't always have time to catch up on all of my 
mailing list mail and regretfully things slip through the cracks when they 
are not addressed directly to me.


Thanks,
Jeff




New NUMA support coming to CURRENT

2018-01-09 Thread Jeff Roberson

Hello folks,

I am working on merging improved NUMA support with policy implemented by 
cpuset(2) over the next week.  This work has been supported by Dell/EMC's 
Isilon product division and Netflix.  You can see some discussion of these 
changes here:


https://reviews.freebsd.org/D13403
https://reviews.freebsd.org/D13289
https://reviews.freebsd.org/D13545

The work has been done in user/jeff/numa if you want to look at svn 
history or experiment with the branch.  It has been tested by Peter Holm 
on i386 and amd64 and it has been verified to work on arm at various 
points.


We are working towards compatibility with libnuma and linux mbind.  These 
commits will bring in improved support for NUMA in the kernel.  There are 
new domain-specific allocation functions available to the kernel for UMA, 
malloc, kmem_, and vm_page*.  busdmamem consumers will automatically be 
placed in the correct domain, bringing automatic improvements to some 
device performance.


cpuset will be able to constrain processes, groups of processes, jails, 
etc. to subsets of the system memory domains, just as it can with sets of 
cpus.  It can set default policy for any of the above.  Threads can use 
cpusets to set policy that specifies a subset of their visible domains.


Available policies are first-touch (local in linux terms), round-robin 
(similar to linux interleave), and preferred.  For now, the default is 
round-robin.  You can achieve a fixed domain policy by using round-robin 
with a bitmask of a single domain.  As the scheduler and VM become more 
sophisticated we may switch the default to first-touch as linux does.
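The three policies can be modeled with a toy sketch; the domain numbering and function names below are illustrative, not the kernel API:

```python
from itertools import cycle

# Toy model of the three NUMA allocation policies described above.
DOMAINS = [0, 1]

def first_touch(local_domain, allowed=DOMAINS):
    """Allocate from the touching CPU's domain when the mask permits."""
    return local_domain if local_domain in allowed else allowed[0]

def make_round_robin(allowed=DOMAINS):
    """Interleave allocations across the allowed domains."""
    rr = cycle(allowed)
    return lambda local_domain: next(rr)

def preferred(pref, allowed=DOMAINS):
    """Favor one domain when permitted."""
    return lambda local_domain: pref if pref in allowed else allowed[0]

rr = make_round_robin()
assert [rr(0) for _ in range(4)] == [0, 1, 0, 1]    # interleave
assert first_touch(1) == 1                          # local allocation
# Round-robin over a single-domain mask behaves as a fixed-domain policy,
# as the text above notes.
fixed = make_round_robin(allowed=[1])
assert [fixed(0) for _ in range(3)] == [1, 1, 1]
```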


Currently these features are enabled with VM_NUMA_ALLOC and MAXMEMDOM.  It 
will eventually be NUMA/MAXMEMDOM to match SMP/MAXCPU.  The current NUMA 
syscalls and VM_NUMA_ALLOC code was 'experimental' and will be deprecated. 
numactl will continue to be supported although cpuset should be preferred 
going forward as it supports the full feature set of the new API.


Thank you for your patience as I deal with the inevitable fallout of such 
sweeping changes.  If you do have bugs, please file them in bugzilla, or 
reach out to me directly.  I don't always have time to catch up on all of 
my mailing list mail and regretfully things slip through the cracks when 
they are not addressed directly to me.


Thanks,
Jeff


Re: UMA cache back pressure

2013-11-18 Thread Jeff Roberson

On Mon, 18 Nov 2013, Alexander Motin wrote:


Hi.

I've created a patch, based on earlier work of avg@, to add back pressure 
to UMA allocation caches. The problem of physical memory or KVA exhaustion 
has existed there for many years, and it is quite critical now for improving 
system performance while keeping stability. Changes made to memory 
allocation in recent years improved the situation, but haven't fixed it 
completely. My patch attacks the remaining problems from two sides: a) 
reducing bucket sizes every time the system detects a low memory condition; 
and b) as a last-resort mechanism for very low memory conditions, cycling 
over all CPUs to purge their per-CPU UMA caches. The benefit of this 
approach is the absence of any additional hard-coded limits on cache sizes 
-- they are self-tuned, based on load and memory pressure.


With this change I believe it should be safe enough to enable UMA allocation 
caches in ZFS via the vfs.zfs.zio.use_uma tunable (at least for amd64). I ran 
many tests on a machine with 24 logical cores (and as a result strong 
allocation cache effects), and can say that with 40GB RAM, using the UMA 
caches allowed by this change roughly doubles the results of a SPEC NFS 
benchmark on a ZFS pool of several SSDs. To test system stability I ran the 
same test with physical memory limited to just 2GB; the system successfully 
survived that, and even showed results 1.5 times better than with just the 
last-resort measures of b). In both cases tools/umastat no longer shows 
unbound UMA cache growth, which makes me believe in the viability of this 
approach for longer runs.


I would like to hear some comments about that:
http://people.freebsd.org/~mav/uma_pressure.patch
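The two mechanisms the patch describes, shrinking bucket sizes under soft pressure and draining every CPU's cache as a last resort, can be sketched roughly as follows (hypothetical names, not the actual patch):

```python
# Toy model of the two back-pressure mechanisms described above:
# (a) halve a zone's bucket size on each low-memory event, and
# (b) as a last resort, purge every CPU's per-CPU cache.
MIN_BUCKET = 2

class Zone:
    def __init__(self, bucket_size=128, ncpus=4, cached_per_cpu=8):
        self.bucket_size = bucket_size
        self.percpu_cache = [[object()] * cached_per_cpu
                             for _ in range(ncpus)]

    def low_memory(self):
        """(a) soft pressure: self-tune the bucket size downward."""
        self.bucket_size = max(self.bucket_size // 2, MIN_BUCKET)

    def drain(self):
        """(b) hard pressure: cycle over all CPUs and purge their caches."""
        freed = sum(len(c) for c in self.percpu_cache)
        for c in self.percpu_cache:
            c.clear()
        return freed

z = Zone()
z.low_memory(); z.low_memory()
assert z.bucket_size == 32           # 128 -> 64 -> 32, no hard-coded cap
assert z.drain() == 32               # 4 CPUs x 8 cached items
assert all(len(c) == 0 for c in z.percpu_cache)
```

Note how the model has no fixed cache-size limit; the bucket size only shrinks in response to observed pressure, matching the "self-tuned" claim above.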


Hey Mav,

This is a great start and great results.  I think it could probably even 
go in as-is, but I have a few suggestions.


First, let's test this with something that is really super allocator heavy 
and doesn't benefit much from bucket sizing.  For example, a network 
forwarding test.  Or maybe you could get someone like Netflix that is 
using it to push a lot of bits with less filesystem cost than zfs and 
spec.


Second, the cpu binding is a very costly and very high-latency operation. 
It would make sense to do CPU_FOREACH and then ZONE_FOREACH.  You're also 
biasing the first zones in the list.  The low memory condition will more 
often clear after you check these first zones.  So you might just check it 
once and equally penalize all zones.  I'm concerned that doing CPU_FOREACH 
in every zone will slow the pagedaemon more.  We also have been working 
towards per-domain pagedaemons so perhaps we should have a uma-reclaim 
taskqueue that we wake up to do the work?


Third, using vm_page_count_min() will only trigger when the pageout daemon 
can't keep up with the free target.  Typically this should only happen 
with a lot of dirty mmap'd pages or incredibly high system load coupled 
with frequent allocations.  So there may be many cases where reclaiming 
the extra UMA memory is helpful but the pagedaemon can still keep up while 
pushing out file pages that we'd prefer to keep.


I think the perfect heuristic would have some idea of how likely the UMA 
pages are to be re-used immediately so we can more effectively tradeoff 
between file pages and kernel memory cache.  As it is now we limit the 
uma_reclaim() calls to every 10 seconds when there is memory pressure. 
Perhaps we could keep a timestamp for when the last slab was allocated to 
a zone and do the more expensive reclaim on zones who have timestamps that 
exceed some threshold?  Then have a lower threshold for reclaiming at all? 
Again, it doesn't need to be perfect, but I believe we can catch a wider 
set of cases by carefully scheduling this.
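The timestamp heuristic proposed above can be sketched as follows; the threshold values and names are illustrative, not a proposed implementation:

```python
import time

# Toy model of the scheduling heuristic proposed above: remember when a
# zone last allocated a slab, fully reclaim zones idle past a high
# threshold, and merely trim those idle past a lower one.
TRIM_AFTER = 10.0       # seconds; illustrative values only
RECLAIM_AFTER = 60.0

def classify(zones, now=None):
    """Map each zone name to the action the pagedaemon should take."""
    now = time.time() if now is None else now
    actions = {}
    for name, last_slab_alloc in zones.items():
        idle = now - last_slab_alloc
        if idle >= RECLAIM_AFTER:
            actions[name] = "reclaim"   # expensive full purge
        elif idle >= TRIM_AFTER:
            actions[name] = "trim"      # cheap bucket shrink
        else:
            actions[name] = "keep"      # likely to be reused soon
    return actions

zones = {"mbuf": 99.5, "g_bio": 70.0, "zio_cache": 0.0}
assert classify(zones, now=100.0) == {
    "mbuf": "keep", "g_bio": "trim", "zio_cache": "reclaim"}
```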


Thanks,
Jeff



Thank you.

--
Alexander Motin




Re: UMA cache back pressure

2013-11-18 Thread Jeff Roberson


On Mon, 18 Nov 2013, Alexander Motin wrote:


On 18.11.2013 21:11, Jeff Roberson wrote:

On Mon, 18 Nov 2013, Alexander Motin wrote:

I've created patch, based on earlier work of avg@, to add back
pressure to UMA allocation caches. The problem of physical memory or
KVA exhaustion existed there for many years and it is quite critical
now for improving systems performance while keeping stability. Changes
done in memory allocation last years improved situation. but haven't
fixed completely. My patch solves remaining problems from two sides:
a) reducing bucket sizes every time system detects low memory
condition; and b) as last-resort mechanism for very low memory
condition, it cycling over all CPUs to purge their per-CPU UMA caches.
Benefit of this approach is in absence of any additional hard-coded
limits on cache sizes -- they are self-tuned, based on load and memory
pressure.

With this change I believe it should be safe enough to enable UMA
allocation caches in ZFS via vfs.zfs.zio.use_uma tunable (at least for
amd64). I did many tests on machine with 24 logical cores (and as
result strong allocation cache effects), and can say that with 40GB
RAM using UMA caches, allowed by this change, by two times increases
results of SPEC NFS benchmark on ZFS pool of several SSDs. To test
system stability I've run the same test with physical memory limited
to just 2GB and system successfully survived that, and even showed
results 1.5 times better then with just last resort measures of b). In
both cases tools/umastat no longer shows unbound UMA cache growth,
that makes me believe in viability of this approach for longer runs.

I would like to hear some comments about that:
http://people.freebsd.org/~mav/uma_pressure.patch


Hey Mav,

This is a great start and great results.  I think it could probably even
go in as-is, but I have a few suggestions.


Hey! Thanks for your review. I appreciate.


And I appreciate more people being interested in working on the allocator.




First, let's test this with something that is really super allocator
heavy and doesn't benefit much from bucket sizing.  For example, a
network forwarding test.  Or maybe you could get someone like Netflix
that is using it to push a lot of bits with less filesystem cost than
zfs and spec.


I am not sure what simple forwarding may show in this case. Even on my 
workload, with ZFS creating strong memory pressure, I still have the mbuf* 
zones' buckets almost (some totally) maxed out. Without other major (or even 
any) pressure in the system they just can't grow bigger than the maximum. But 
if you can propose some interesting test case with pressure that I can 
reproduce -- I am all ears.


I think part of that is also because you're using min free pages right now 
as your threshold.  It should probably be triggering slightly more often.





Second, the cpu binding is a very costly and very high-latency
operation. It would make sense to do CPU_FOREACH and then ZONE_FOREACH.
You're also biasing the first zones in the list.  The low memory
condition will more often clear after you check these first zones.  So
you might just check it once and equally penalize all zones.  I'm
concerned that doing CPU_FOREACH in every zone will slow the pagedaemon
more.


I completely agree with all you said here. This part of the code I just took 
as-is from earlier work. It definitely can be improved; I'll take a look at 
that. But, as I mentioned in one of my earlier responses, that code is used in 
_very_ rare cases, unless the system is heavily overloaded on memory, like 
doing ZFS on a box with 24 cores and 2GB RAM. During reasonable operation it 
is enough to have the soft back pressure to keep the caches in shape and never 
call that.



We also have been working towards per-domain pagedaemons so
perhaps we should have a uma-reclaim taskqueue that we wake up to do the
work?


VM is not my area so far, so please propose the right way. I took on this 
task now only because I had to, due to the huge performance bottleneck this 
problem causes and the years it has remained unsolved.


Well it's probably fine to keep abusing the first domain's pageout daemon 
for now but we won't want to in the future, especially if we want to keep 
each domain's page daemon on the socket that it's managing.





Third, using vm_page_count_min() will only trigger when the pageout
daemon can't keep up with the free target.  Typically this should only
happen with a lot of dirty mmap'd pages or incredibly high system load
coupled with frequent allocations.  So there may be many cases where
reclaiming the extra UMA memory is helpful but the pagedaemon can still
keep up while pushing out file pages that we'd prefer to keep.


As I said, that is indeed a last resort. It does not need to be done often; 
per-CPU caches just should not grow without real need to the point where 
they have to be cleaned.


Let me explain it differently.  Right now you're handling cases of 
overloaded CPU, if we run this code under different conditions we

Re: UMA cache back pressure

2013-11-18 Thread Jeff Roberson

On Mon, 18 Nov 2013, Adrian Chadd wrote:


Remember that for Netflix, we have a mostly non-cacheable workload
(with some very specific exceptions!) and thus we churn through VM
pages at a prodigious rate: 20 Gbit/sec, or ~2.4 gigabytes a second,
or ~680,000 4-kilobyte pages a second. It's quite frightening
and it's only likely to increase.

There's a lot of pressure from all over the place so IIRC pools tend
to not stay very large for very long.


I think the combination of a lot of cache pressure, a lot of allocator 
use, and no ZFS makes you an interesting candidate.




That's why I'm interested in your specific situations. Doing an all
CPU TLB shootdown with 24 cores is costly. But after we killed some
incorrect KVA mapping flags for sendfile, we (netflix) totally stopped


Do you have any information on what this change was?


seeing the TLB shootdown and IPIs in any of the performance traces.
Now, doing 24 cores worth of ZFS when you let the pools grow to the
size you do is understandable, but I'd like to just make sure that you
aren't breaking performance for people doing different workloads on
less cores.


We also have opportunities now with vmem to cache KVA backed pages and 
release them together in bulk when necessary.  However, remember most UMA 
memory won't need an IPI since it comes from the direct map.   Only the 
few zones which use very large allocations will.


Jeff



I'm a bit busy at work with other things so I can't spin up your patch
on a cache for another week or two. But I'll certainly get around to
it as I'd like to see this stuff catch on.

What I _can_ do in a reasonably immediate timeframe is update
vm0.freebsd.org to the latest -HEAD and stress test your patch out.
I'm using vm0.freebsd.org to stress test -HEAD with ZFS doing
concurrent poudriere builds so it gets very crowded on that box. The
box currently survives a couple days before I hit some races to do
with vnode exhaustion and a lack of handling there, and ZFS deadlocks.
I'll just run this up to see if anything unexpected happens that
causes it to blow up in a different way.

Thanks,



-adrian


On 18 November 2013 11:55, Alexander Motin m...@freebsd.org wrote:

On 18.11.2013 21:11, Jeff Roberson wrote:


On Mon, 18 Nov 2013, Alexander Motin wrote:


I've created a patch, based on earlier work by avg@, to add back
pressure to UMA allocation caches. The problem of physical memory or
KVA exhaustion has existed there for many years and is quite critical
now for improving system performance while keeping stability. Changes
made to memory allocation in recent years improved the situation, but
haven't fixed it completely. My patch solves the remaining problems from
two sides: a) reducing bucket sizes every time the system detects a low
memory condition; and b) as a last-resort mechanism for very low memory
conditions, cycling over all CPUs to purge their per-CPU UMA caches.
The benefit of this approach is the absence of any additional hard-coded
limits on cache sizes -- they are self-tuned, based on load and memory
pressure.
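The two mechanisms can be sketched in miniature as follows. This is only an illustration of the approach, not the patch itself: the struct and function names (zone_bucket_pressure, zone_drain_cpu_caches, etc.) are invented here, and the real code operates on struct uma_zone and its per-CPU caches.

```c
#include <assert.h>

#define BUCKET_MIN 2
#define NCPU 4

struct zone {
	int bucket_size;        /* current per-CPU bucket size limit */
	int cached_items[NCPU]; /* items sitting in each CPU's cache */
};

/* (a) On a low-memory event, halve the bucket limit so caches shrink. */
static void
zone_bucket_pressure(struct zone *z)
{
	if (z->bucket_size > BUCKET_MIN)
		z->bucket_size /= 2;
}

/* (b) Last resort: cycle over every CPU and purge its cache entirely. */
static int
zone_drain_cpu_caches(struct zone *z)
{
	int cpu, freed = 0;

	for (cpu = 0; cpu < NCPU; cpu++) {
		freed += z->cached_items[cpu];
		z->cached_items[cpu] = 0;
	}
	return (freed);
}
```

Step (a) is cheap and self-tuning; step (b) is the expensive one, since in the kernel it requires binding to (or IPIing) each CPU in turn.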

With this change I believe it should be safe enough to enable UMA
allocation caches in ZFS via the vfs.zfs.zio.use_uma tunable (at least for
amd64). I did many tests on a machine with 24 logical cores (and as a
result strong allocation cache effects), and can say that with 40GB of
RAM, using the UMA caches allowed by this change doubles the results
of the SPEC NFS benchmark on a ZFS pool of several SSDs. To test
system stability I ran the same test with physical memory limited
to just 2GB; the system successfully survived that, and even showed
results 1.5 times better than with just the last-resort measures of b). In
both cases tools/umastat no longer shows unbound UMA cache growth,
which makes me believe in the viability of this approach for longer runs.

I would like to hear some comments about that:
http://people.freebsd.org/~mav/uma_pressure.patch



Hey Mav,

This is a great start and great results.  I think it could probably even
go in as-is, but I have a few suggestions.



Hey! Thanks for your review. I appreciate it.



First, let's test this with something that is really super allocator
heavy and doesn't benefit much from bucket sizing.  For example, a
network forwarding test.  Or maybe you could get someone like Netflix
that is using it to push a lot of bits with less filesystem cost than
zfs and spec.



I am not sure what simple forwarding may show in this case. Even on my
workload, with ZFS creating strong memory pressure, I still have mbuf* zone
buckets almost (some totally) maxed out. Without other major (or even any)
pressure in the system they just can't become bigger than the maximum. But if
you can propose some interesting test case with pressure that I can
reproduce -- I am all ears.



Second, the CPU binding is a very costly and very high-latency
operation.  It would make sense to do CPU_FOREACH and then ZONE_FOREACH.
You're also biasing toward the first zones in the list.  The low memory
condition will more often clear after you check

Re: Early drop to debugger with DEBUG_MEMGUARD

2013-08-13 Thread Jeff Roberson

On Mon, 12 Aug 2013, David Wolfskill wrote:


On Tue, Aug 13, 2013 at 08:29:44AM +0300, Konstantin Belousov wrote:

...
The r254025 indeed introduced the problem, and Davide pointed you to a
workaround for the assertion triggering.


Right; I tried one of those -- I hope I got it right...


Proper fix for the memguard requires a policy of M_NEXTFIT or like, to
avoid a reuse of the previous allocated range as long as possible.


That's why I passed a start address as a lower bound to vmem_xalloc.  I 
would like to eventually implement nextfit.
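The next-fit idea referred to here can be illustrated with a minimal sketch: keep a cursor that only moves forward through the span, so just-freed ranges are reused as late as possible. This is not the real vmem code; the span size, names, and the omission of a free list are all simplifications for illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <stddef.h>

#define SPANSZ 0x10000UL	/* illustrative flat address span */

static uintptr_t nf_cursor;	/* rotates forward through the span */

/* Hand out addresses from the cursor, wrapping only at the end of the
 * span, so a recently freed range is the last thing to be reused. */
static uintptr_t
nextfit_alloc(size_t size)
{
	uintptr_t addr;

	if (nf_cursor + size > SPANSZ)
		nf_cursor = 0;
	addr = nf_cursor;
	nf_cursor += size;
	return (addr);
}
```

For memguard this delayed reuse matters because a use-after-free is only caught if the stale range stays unmapped; immediate reuse (best-fit or first-fit) would hide the bug.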




Ah.


But, you have some further issue even after the assertion was silenced,
isn't it ?


I will fix this today and do some stress tests with memguard on.  Sorry 
for the difficulty.


Thanks,
Jeff



Yes; please see
http://docs.FreeBSD.org/cgi/mid.cgi?20130812160154.GF1570 for a copy
of the message that shows the resulting panic.  (Or see previous
messages in this thread, if that's easier.)  It looks (from my naive
perspective) as if mti_zone hadn't been initialized (properly?  at
all?).

In any case, I remain willing to test, subject to Internet connectivity
flakiness where I am now and other demands on my time.

Peace,
david
--
David H. Wolfskill  da...@catwhisker.org
Taliban: Evil men with guns afraid of truth from a 14-year old girl.

See http://www.catwhisker.org/~david/publickey.gpg for my public key.


___
freebsd-current@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to freebsd-current-unsubscr...@freebsd.org


Re: panic: UMA: Increase vm.boot_pages with 32 CPUs

2013-08-13 Thread Jeff Roberson

On Mon, 12 Aug 2013, Colin Percival wrote:


Hi all,

A HEAD@254238 kernel fails to boot in EC2 with

panic: UMA: Increase vm.boot_pages

on 32-CPU instances.  Instances with up to 16 CPUs boot fine.

I know there has been some mucking about with VM recently -- anyone want
to claim this, or should I start doing a binary search?


It's not any one commit really, just creeping demand for more pages before 
the VM can get started.  I would suggest making boot pages scale with 
MAXCPU.  Or just raising it as the panic suggests.  We could rewrite the 
way that the VM gets these early pages, but it's a lot of work and 
typically people just bump it and forget about it.


Thanks,
Jeff



--
Colin Percival
Security Officer Emeritus, FreeBSD | The power to serve
Founder, Tarsnap | www.tarsnap.com | Online backups for the truly paranoid




Re: Early drop to debugger with DEBUG_MEMGUARD

2013-08-13 Thread Jeff Roberson

On Tue, 13 Aug 2013, Jeff Roberson wrote:


On Mon, 12 Aug 2013, David Wolfskill wrote:


On Tue, Aug 13, 2013 at 08:29:44AM +0300, Konstantin Belousov wrote:

...
The r254025 indeed introduced the problem, and Davide pointed you to a
workaround for the assertion triggering.


Right; I tried one of those -- I hope I got it right...


Proper fix for the memguard requires a policy of M_NEXTFIT or like, to
avoid a reuse of the previous allocated range as long as possible.


That's why I passed a start address as a lower bound to vmem_xalloc.  I would 
like to eventually implement nextfit.




Ah.


But, you have some further issue even after the assertion was silenced,
isn't it ?


I will fix this today and do some stress tests with memguard on.  Sorry for 
the difficulty.


Please try 254308.  It is working for me.

Thanks,
Jeff



Thanks,
Jeff



Yes; please see
http://docs.FreeBSD.org/cgi/mid.cgi?20130812160154.GF1570 for a copy
of the message that shows the resulting panic.  (Or see previous
messages in this thread, if that's easier.)  It looks (from my naive
perspective) as if mti_zone hadn't been initialized (properly?  at
all?).

In any case, I remain willing to test, subject to Internet connectivity
flakiness where I am now and other demands on my time.

Peace,
david
--
David H. Wolfskill  da...@catwhisker.org
Taliban: Evil men with guns afraid of truth from a 14-year old girl.

See http://www.catwhisker.org/~david/publickey.gpg for my public key.






Re: panic: UMA: Increase vm.boot_pages with 32 CPUs

2013-08-13 Thread Jeff Roberson

On Tue, 13 Aug 2013, Jim Harris wrote:





On Tue, Aug 13, 2013 at 3:05 PM, Jeff Roberson jrober...@jroberson.net
wrote:
  On Mon, 12 Aug 2013, Colin Percival wrote:

Hi all,

A HEAD@254238 kernel fails to boot in EC2 with
  panic: UMA: Increase vm.boot_pages

on 32-CPU instances.  Instances with up to 16 CPUs
boot fine.

I know there has been some mucking about with VM
recently -- anyone want
to claim this, or should I start doing a binary
search?


It's not any one commit really, just creeping demand for more pages
before the VM can get started.  I would suggest making boot pages
scale with MAXCPU.  Or just raising it as the panic suggests.  We
could rewrite the way that the vm gets these early pages but it's a
lot of work and typically people just bump it and forget about it.


I ran into this problem today when enabling hyperthreading on my dual-socket
Xeon E5 system.

It looks like r254025 is actually the culprit.  Specifically, the new
mallocinit()/kmeminit() now invoke the new vmem_init() before
uma_startup2(), which allocates 16 zones out of the boot pages if I am
reading this correctly.  This is all done before uma_startup2() is called,
triggering the panic.



I just disabled the quantum caches in vmem, which allocate those 16 zones. 
This may alleviate the problem for now.


Thanks,
Jeff


Anything less than 28 CPUs, and the zone size (uma_zone + uma_cache *
(mp_maxid + 1)) is <= PAGE_SIZE and we can successfully boot.  So at 32
CPUs, we need two boot pages per zone, which consumes more than the default
64 boot pages.  The size of these structures does not appear to have
materially changed any time recently.

Scaling with MAXCPU seems to be an OK solution, but should it be based
directly on the size of (uma_zone + uma_cache * MAXCPU)?  I am not very
familiar with uma startup, but it seems like these zones are the primary
consumers of the boot pages, so the UMA_BOOT_PAGES default should be based
directly on that size.
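The sizing arithmetic above can be checked with a back-of-the-envelope calculation. The ZONE_SZ and CACHE_SZ constants below are assumed example values chosen so the arithmetic matches the observed 28/32-CPU boundary; they are not the real sizeof(struct uma_zone) / sizeof(struct uma_cache) for any particular build.

```c
#include <assert.h>

#define PAGE_SIZE 4096
#define ZONE_SZ   512	/* assumed example sizeof(struct uma_zone) */
#define CACHE_SZ  128	/* assumed example sizeof(struct uma_cache) */

/* Pages needed for one zone header plus one per-CPU cache each,
 * i.e. the (uma_zone + uma_cache * (mp_maxid + 1)) term above,
 * rounded up to whole pages. */
static int
zone_boot_pages(int ncpus)
{
	int bytes = ZONE_SZ + CACHE_SZ * ncpus;

	return ((bytes + PAGE_SIZE - 1) / PAGE_SIZE);
}
```

With these example sizes, 28 CPUs is exactly one page per zone and 32 CPUs spills into a second page, which is the shape of the failure described above: 16 zones at two pages each exceeds the default 64 boot pages.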

Regards,

-Jim



Re: Kernel build fails on ARM: Cannot fork: Cannot allocate memory

2013-06-24 Thread Jeff Roberson

On Sun, 23 Jun 2013, Ruslan Bukin wrote:


On Sun, Jun 23, 2013 at 07:50:40PM +0300, Konstantin Belousov wrote:

On Sun, Jun 23, 2013 at 08:44:25PM +0400, Ruslan Bukin wrote:

On Sun, Jun 23, 2013 at 07:16:17PM +0300, Konstantin Belousov wrote:

On Sun, Jun 23, 2013 at 06:43:46PM +0400, Ruslan Bukin wrote:


Trying to mount root from ufs:/dev/da0 []...
WARNING: / was not properly dismounted
warning: no time-of-day clock registered, system time will not be set accurately
panic: __rw_wlock_hard: recursing but non-recursive rw pmap pv global @ 
/usr/home/br/dev/head/sys/arm/arm/pmap-v6.c:1289

KDB: enter: panic
[ thread pid 1 tid 11 ]
Stopped at  kdb_enter+0x48: ldrb    r15, [r15, r15, ror r15]!
db> bt
Tracing pid 1 tid 11 td 0xc547f620
_end() at 0xde9d0530
scp=0xde9d0530 rlv=0xc1211458 (db_trace_thread+0x34)
rsp=0xde9d0514 rfp=0xc12d1b60
Bad frame pointer: 0xc12d1b60
db>

This is completely broken.  It seems that witness triggered the panic,
and ddb is unable to obtain a backtrace from the normal panic(9) call.

Show the output of the 'show alllocks'.


No such command

Do you have witness in the kernel config ? If not, add it to the config
and retry.


Trying to mount root from ufs:/dev/da0 []...
WARNING: / was not properly dismounted
warning: no time-of-day clock registered, system time will not be set accurately
panic: __rw_wlock_hard: recursing but non-recursive rw pmap pv global @ 
/usr/home/br/dev/head/sys/arm/arm/pmap-v6.c:1289

KDB: enter: panic
[ thread pid 1 tid 11 ]
Stopped at  kdb_enter+0x48: ldrb    r15, [r15, r15, ror r15]!
db> show alllocks
Process 1 (kernel) thread 0xc55fc620 (11)
exclusive sleep mutex pmap (pmap) r = 0 (0xc5600590) locked @ 
/usr/home/br/dev/head/sys/arm/arm/pmap-v6.c:729
exclusive rw pmap pv global (pmap pv global) r = 0 (0xc1479dd0) locked @ 
/usr/home/br/dev/head/sys/arm/arm/pmap-v6.c:728
shared rw vm object (vm object) r = 0 (0xc1551d4c) locked @ 
/usr/home/br/dev/head/sys/vm/vm_map.c:1809
exclusive sx vm map (user) (vm map (user)) r = 0 (0xc5600528) locked @ 
/usr/home/br/dev/head/sys/kern/imgact_elf.c:445
exclusive lockmgr ufs (ufs) r = 0 (0xc56f7914) locked @ 
/usr/home/br/dev/head/sys/kern/imgact_elf.c:821
exclusive sleep mutex Giant (Giant) r = 0 (0xc147c778) locked @ 
/usr/home/br/dev/head/sys/kern/vfs_mount.c:1093
db>



Would any of the arm users be interested in testing a larger patch that 
changes the way the kernel allocates KVA?  It also has some UMA code 
that lessens kernel memory utilization.


http://people.freebsd.org/~jeff/vmem.diff

Any reports would be helpful.  Is there any ETA on getting stack tracing 
fixed?  I suspect the pmap recursion encountered with Kostik's patch exists 
in the current kernel.  The other changes in this patch may fix that as 
well.


Thanks,
Jeff


Re: Kernel build fails on ARM: Cannot fork: Cannot allocate memory

2013-06-22 Thread Jeff Roberson

On Fri, 21 Jun 2013, Zbyszek Bodek wrote:


On 21.06.2013 01:56, Jeff Roberson wrote:

On Thu, 20 Jun 2013, Jeff Roberson wrote:


On Wed, 19 Jun 2013, Zbyszek Bodek wrote:


Hello,

I've been trying to compile the kernel on my ARMv7 platform using the
sources from the current FreeBSD HEAD.

make buildkernel . -j5

1 in 2 builds fails in the way described below:
--

ing-include-dirs -fdiagnostics-show-option   -nostdinc  -I.
-I/root/src/freebsd-arm-superpages/sys
-I/root/src/freebsd-arm-superpages/sys/contrib/altq
-I/root/src/freebsd-arm-superpages/sys/contrib/libfdt -D_KERNEL
-DHAVE_KERNEL_OPTION_HEADERS -include opt_global.h -fno-common
-finline-limit=8000 --param inline-unit-growth=100 --param
large-function-growth=1000  -mno-thumb-interwork -ffreestanding -Werror
/root/src/freebsd-arm-superpages/sys/ufs/ffs/ffs_snapshot.c
Cannot fork: Cannot allocate memory
*** [ffs_snapshot.o] Error code 2
1 error
*** [buildkernel] Error code 2
1 error
*** [buildkernel] Error code 2
1 error
5487.888u 481.569s 7:35.65 1310.0%  1443+167k 1741+5388io 221pf+0w
--


The warning from std err is:
--

vm_thread_new: kstack allocation failed
vm_thread_new: kstack allocation failed
--


I was trying to find out which commit is causing this (because I was
previously working on some older revision) and using bisect I got to:

--

Author: jeff j...@freebsd.org
Date:   Tue Jun 18 04:50:20 2013 +

   Refine UMA bucket allocation to reduce space consumption and improve
   performance.

- Always free to the alloc bucket if there is space.  This gives LIFO
  allocation order to improve hot-cache performance.  This also allows
  for zones with a single bucket per-cpu rather than a pair if the
  entire working set fits in one bucket.
- Enable per-cpu caches of buckets.  To prevent recursive bucket
  allocation one bucket zone still has per-cpu caches disabled.
- Pick the initial bucket size based on a table driven maximum size
  per-bucket rather than the number of items per-page.  This gives
  more sane initial sizes.
- Only grow the bucket size when we face contention on the zone lock,
  this causes bucket sizes to grow more slowly.
- Adjust the number of items per-bucket to account for the header
  space.  This packs the buckets more efficiently per-page while
  making them not quite powers of two.
- Eliminate the per-zone free bucket list.  Always return buckets back
  to the bucket zone.  This ensures that as zones grow into larger
  bucket sizes they eventually discard the smaller sizes.  It persists
  fewer buckets in the system.  The locking is slightly trickier.
- Only switch buckets in zalloc, not zfree, this eliminates
  pathological cases where we ping-pong between two buckets.
- Ensure that the thread that fills a new bucket gets to allocate from
  it to give a better upper bound on allocation time.

   Sponsored by: EMC / Isilon Storage Division
--


I checked this several times and this commit seems to be causing it.


Can you tell me how many cores and how much memory you have?  And
paste the output of vmstat -z when you see this error.

You can try changing bucket_select() at line 339 in uma_core.c to read:

static int
bucket_select(int size)
{
return (MAX(PAGE_SIZE / size, 1));
}

This will approximate the old bucket sizing behavior.


Just to add some more information;  On my machine with 16GB of ram the
handful of recent UMA commits save about 20MB of kmem on boot.  There
are 30% fewer buckets allocated.  And all of the malloc zones have
similar amounts of cached space.  Actually the page size malloc bucket
is taking up much less space.

I don't know if the problem is unique to arm but I have tested x86
limited to 512MB of ram without trouble.  I will need the stats I
mentioned before to understand what has happened.



Hello Jeff,

Thank you for your interest in my problem.

My system is a quad-core ARMv7 with 2048 MB of RAM on board.
Please see the attachment for the output from vmstat -z when the error occurs.

Changing bucket_select() to

static int
bucket_select(int size)
{
   return (MAX(PAGE_SIZE / size, 1));
}

as you suggested helps with the problem. I've performed numerous attempts
to build the kernel and none of them failed.



I don't really see a lot of wasted memory in the zones.  There is 
certainly some.  Can you give me the sysctl vm output from both a working 
and a non-working kernel after the build is done or fails?


Thanks,
Jeff



Best regards
Zbyszek Bodek

Re: Kernel build fails on ARM: Cannot fork: Cannot allocate memory

2013-06-20 Thread Jeff Roberson

On Wed, 19 Jun 2013, Zbyszek Bodek wrote:


Hello,

I've been trying to compile the kernel on my ARMv7 platform using the
sources from the current FreeBSD HEAD.

make buildkernel . -j5

1 in 2 builds fails in the way described below:
--
ing-include-dirs -fdiagnostics-show-option   -nostdinc  -I.
-I/root/src/freebsd-arm-superpages/sys
-I/root/src/freebsd-arm-superpages/sys/contrib/altq
-I/root/src/freebsd-arm-superpages/sys/contrib/libfdt -D_KERNEL
-DHAVE_KERNEL_OPTION_HEADERS -include opt_global.h -fno-common
-finline-limit=8000 --param inline-unit-growth=100 --param
large-function-growth=1000  -mno-thumb-interwork -ffreestanding -Werror
/root/src/freebsd-arm-superpages/sys/ufs/ffs/ffs_snapshot.c
Cannot fork: Cannot allocate memory
*** [ffs_snapshot.o] Error code 2
1 error
*** [buildkernel] Error code 2
1 error
*** [buildkernel] Error code 2
1 error
5487.888u 481.569s 7:35.65 1310.0%  1443+167k 1741+5388io 221pf+0w
--

The warning from std err is:
--
vm_thread_new: kstack allocation failed
vm_thread_new: kstack allocation failed
--

I was trying to find out which commit is causing this (because I was
previously working on some older revision) and using bisect I got to:

--
Author: jeff j...@freebsd.org
Date:   Tue Jun 18 04:50:20 2013 +

   Refine UMA bucket allocation to reduce space consumption and improve
   performance.

- Always free to the alloc bucket if there is space.  This gives LIFO
  allocation order to improve hot-cache performance.  This also allows
  for zones with a single bucket per-cpu rather than a pair if the
  entire working set fits in one bucket.
- Enable per-cpu caches of buckets.  To prevent recursive bucket
  allocation one bucket zone still has per-cpu caches disabled.
- Pick the initial bucket size based on a table driven maximum size
  per-bucket rather than the number of items per-page.  This gives
  more sane initial sizes.
- Only grow the bucket size when we face contention on the zone lock,
  this causes bucket sizes to grow more slowly.
- Adjust the number of items per-bucket to account for the header
  space.  This packs the buckets more efficiently per-page while
  making them not quite powers of two.
- Eliminate the per-zone free bucket list.  Always return buckets back
  to the bucket zone.  This ensures that as zones grow into larger
  bucket sizes they eventually discard the smaller sizes.  It persists
  fewer buckets in the system.  The locking is slightly trickier.
- Only switch buckets in zalloc, not zfree, this eliminates
  pathological cases where we ping-pong between two buckets.
- Ensure that the thread that fills a new bucket gets to allocate from
  it to give a better upper bound on allocation time.

   Sponsored by: EMC / Isilon Storage Division
--

I checked this several times and this commit seems to be causing it.


Can you tell me how many cores and how much memory you have?  And paste 
the output of vmstat -z when you see this error.


You can try changing bucket_select() at line 339 in uma_core.c to read:

static int
bucket_select(int size)
{
return (MAX(PAGE_SIZE / size, 1));
}

This will approximate the old bucket sizing behavior.
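As a sanity check, the suggested replacement compiles as-is once wrapped with the obvious definitions; the PAGE_SIZE value and the MAX macro below are assumptions for the sake of a self-contained example.

```c
#include <assert.h>

#define PAGE_SIZE 4096			/* assumed; arch-dependent */
#define MAX(a, b) ((a) > (b) ? (a) : (b))

/* The old-style sizing quoted above: as many items as fit in one
 * page, never fewer than one, with no contention-driven growth. */
static int
bucket_select(int size)
{
	return (MAX(PAGE_SIZE / size, 1));
}
```

For small items this yields large fixed buckets (64 items for a 64-byte item), and for items at or above a page it degenerates to single-item buckets, which is why it sidesteps the new table-driven growth path.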

Thanks,
Jeff



Does anyone observe similar behavior or have a solution?

Best regards
Zbyszek Bodek




Re: Kernel build fails on ARM: Cannot fork: Cannot allocate memory

2013-06-20 Thread Jeff Roberson

On Thu, 20 Jun 2013, Jeff Roberson wrote:


On Wed, 19 Jun 2013, Zbyszek Bodek wrote:


Hello,

I've been trying to compile the kernel on my ARMv7 platform using the
sources from the current FreeBSD HEAD.

make buildkernel . -j5

1 in 2 builds fails in the way described below:
--
ing-include-dirs -fdiagnostics-show-option   -nostdinc  -I.
-I/root/src/freebsd-arm-superpages/sys
-I/root/src/freebsd-arm-superpages/sys/contrib/altq
-I/root/src/freebsd-arm-superpages/sys/contrib/libfdt -D_KERNEL
-DHAVE_KERNEL_OPTION_HEADERS -include opt_global.h -fno-common
-finline-limit=8000 --param inline-unit-growth=100 --param
large-function-growth=1000  -mno-thumb-interwork -ffreestanding -Werror
/root/src/freebsd-arm-superpages/sys/ufs/ffs/ffs_snapshot.c
Cannot fork: Cannot allocate memory
*** [ffs_snapshot.o] Error code 2
1 error
*** [buildkernel] Error code 2
1 error
*** [buildkernel] Error code 2
1 error
5487.888u 481.569s 7:35.65 1310.0%  1443+167k 1741+5388io 221pf+0w
--

The warning from std err is:
--
vm_thread_new: kstack allocation failed
vm_thread_new: kstack allocation failed
--

I was trying to find out which commit is causing this (because I was
previously working on some older revision) and using bisect I got to:

--
Author: jeff j...@freebsd.org
Date:   Tue Jun 18 04:50:20 2013 +

   Refine UMA bucket allocation to reduce space consumption and improve
   performance.

- Always free to the alloc bucket if there is space.  This gives LIFO
  allocation order to improve hot-cache performance.  This also allows
  for zones with a single bucket per-cpu rather than a pair if the
  entire working set fits in one bucket.
- Enable per-cpu caches of buckets.  To prevent recursive bucket
  allocation one bucket zone still has per-cpu caches disabled.
- Pick the initial bucket size based on a table driven maximum size
  per-bucket rather than the number of items per-page.  This gives
  more sane initial sizes.
- Only grow the bucket size when we face contention on the zone lock,
  this causes bucket sizes to grow more slowly.
- Adjust the number of items per-bucket to account for the header
  space.  This packs the buckets more efficiently per-page while
  making them not quite powers of two.
- Eliminate the per-zone free bucket list.  Always return buckets back
  to the bucket zone.  This ensures that as zones grow into larger
  bucket sizes they eventually discard the smaller sizes.  It persists
  fewer buckets in the system.  The locking is slightly trickier.
- Only switch buckets in zalloc, not zfree, this eliminates
  pathological cases where we ping-pong between two buckets.
- Ensure that the thread that fills a new bucket gets to allocate from
  it to give a better upper bound on allocation time.

   Sponsored by: EMC / Isilon Storage Division
--

I checked this several times and this commit seems to be causing it.


Can you tell me how many cores and how much memory you have?  And paste the 
output of vmstat -z when you see this error.


You can try changing bucket_select() at line 339 in uma_core.c to read:

static int
bucket_select(int size)
{
return (MAX(PAGE_SIZE / size, 1));
}

This will approximate the old bucket sizing behavior.


Just to add some more information;  On my machine with 16GB of ram the 
handful of recent UMA commits save about 20MB of kmem on boot.  There are 
30% fewer buckets allocated.  And all of the malloc zones have similar 
amounts of cached space.  Actually the page size malloc bucket is taking 
up much less space.


I don't know if the problem is unique to arm but I have tested x86 limited 
to 512MB of ram without trouble.  I will need the stats I mentioned before 
to understand what has happened.


Jeff



Thanks,
Jeff



Does anyone observe similar behavior or have a solution?

Best regards
Zbyszek Bodek






Re: Panic @r251934: _mtx_lock_sleep: recursed on non-recursive mutex vm map (system) @ /usr/src/sys/vm/vm_kern.c:430

2013-06-18 Thread Jeff Roberson

On Tue, 18 Jun 2013, David Wolfskill wrote:


This is on my (i386) build machine; laptop (also i386) did not exhibit
the symptom.  I had hand-applied r251953 (to get through buildworld).

After installworld, mergemaster -i, and make delete-old, I rebooted, and
this is what I saw:


This is my fault.  I will have a fix in a day or so.

Thanks,
Jeff



...
Booting...
GDB: no debug ports present
KDB: debugger backends: ddb
KDB: current backend: ddb
Copyright (c) 1992-2013 The FreeBSD Project.
Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
   The Regents of the University of California. All rights reserved.
FreeBSD is a registered trademark of The FreeBSD Foundation.
FreeBSD 10.0-CURRENT #1197  r251934M/251934:135: Tue Jun 18 10:47:29 PDT 
2013
   r...@freebeast.catwhisker.org:/common/S4/obj/usr/src/sys/GENERIC i386
FreeBSD clang version 3.3 (tags/RELEASE_33/final 183502) 20130610
WARNING: WITNESS option enabled, expect reduced performance.
CPU: Intel(R) Xeon(TM) CPU 3.60GHz (3600.20-MHz 686-class CPU)
 Origin = GenuineIntel  Id = 0xf41  Family = 0xf  Model = 0x4  Stepping = 1
 
Features=0xbfebfbff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CLFLUSH,DTS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE>
 Features2=0x659d<SSE3,DTES64,MON,DS_CPL,EST,TM2,CNXT-ID,CX16,xTPR>
 AMD Features=0x2010<NX,LM>
 TSC: P-state invariant
real memory  = 2147483648 (2048 MB)
avail memory = 2064138240 (1968 MB)
Event timer LAPIC quality 400
ACPI APIC Table: <PTLTD  APIC  >
FreeBSD/SMP: Multiprocessor System Detected: 2 CPUs
FreeBSD/SMP: 2 package(s) x 1 core(s)
cpu0 (BSP): APIC ID:  0
cpu1 (AP): APIC ID:  6
...
pass3 at aacp0 bus 0 scbus0 target 3 lun 0
pass3: <SEAGATE ST373454LC 0005> Fixed Uninstalled SCSI-3 device
pass3: 3.300MB/s transfers
SMP: AP CPU #1 Launched!
panic: _mtx_lock_sleep: recursed on non-recursive mutex vm map (system) @ 
/usr/src/sys/vm/vm_kern.c:430

cpuid = 1
KDB: enter: panic
[ thread pid 13 tid 100010 ]
Stopped at  kdb_enter+0x3d: movl    $0,kdb_why
db> bt
Tracing pid 13 tid 100010 td 0xc62ef930
kdb_enter(c10ff600,c10ff600,c10fd42c,c5f766a4,c10fd42c,...) at 
kdb_enter+0x3d/frame 0xc5f76638
vpanic(c127a538,100,c10fd42c,c5f766a4,c5f766a4,...) at vpanic+0x143/frame 
0xc5f76674
kassert_panic(c10fd42c,c1107ba3,c113ab63,1ae,0,...) at kassert_panic+0xea/frame 
0xc5f76698
__mtx_lock_sleep(c1b540f8,c62ef930,c113ab63,c113ab63,1ae,...) at 
__mtx_lock_sleep+0x3c7/frame 0xc5f76704
__mtx_lock_flags(c1b540f8,0,c113ab63,1ae,c1b57200,...) at 
__mtx_lock_flags+0xfd/frame 0xc5f76738
_vm_map_lock(c1b5408c,c113ab63,1ae,c5f76798,c0a9f7b7,...) at 
_vm_map_lock+0x31/frame 0xc5f76754
kmem_malloc(c1b5408c,1000,101,3d1,c1b57200,...) at kmem_malloc+0x2a/frame 
0xc5f76798
startup_alloc(c1b56240,1000,c5f767eb,101,c1b57210,...) at 
startup_alloc+0xd6/frame 0xc5f767bc
keg_alloc_slab(1,4,c1139e3e,883,c62ef930,...) at keg_alloc_slab+0xc8/frame 
0xc5f767f8
keg_fetch_slab(1,0,0,1,c5f76890,...) at keg_fetch_slab+0x14f/frame 0xc5f76838
zone_fetch_slab(c1b56240,0,1,956,1,...) at zone_fetch_slab+0x2f/frame 0xc5f76850
zone_import(c1b56240,c5f768a8,1,1,0,...) at zone_import+0x67/frame 0xc5f76890
zone_alloc_item(1,18,c5f76900,c0a9f7b7,c1b57300,...) at 
zone_alloc_item+0x33/frame 0xc5f768c0
uma_zalloc_arg(c1b56240,0,1,857,c1b57500,...) at uma_zalloc_arg+0x60d/frame 
0xc5f76900
uma_zalloc_arg(c1b563c0,0,201,857,c144a0b0,...) at uma_zalloc_arg+0x319/frame 
0xc5f76940
uma_zalloc_arg(c1b566c0,0,1,47a,8,...) at uma_zalloc_arg+0x319/frame 0xc5f76980
vm_map_insert(c1b5408c,c144a0b0,6a0e000,0,c690d000,...) at 
vm_map_insert+0x47e/frame 0xc5f769dc
kmem_back(c1b5408c,c690d000,1000,101,c0a9f7b7,...) at kmem_back+0x79/frame 
0xc5f76a38
kmem_malloc(c1b5408c,1000,101,3d1,c1b57300,...) at kmem_malloc+0x2d5/frame 
0xc5f76a7c
startup_alloc(c1b563c0,1000,c5f76acf,101,c1b57310,...) at 
startup_alloc+0xd6/frame 0xc5f76aa0
keg_alloc_slab(1,4,c1139e3e,883,c62ef930,...) at keg_alloc_slab+0xc8/frame 
0xc5f76adc
keg_fetch_slab(1,16,c1b1ad10,0,c5f76b74,...) at keg_fetch_slab+0x14f/frame 
0xc5f76b1c
zone_fetch_slab(c1b563c0,c1b57300,1,93e,1,...) at zone_fetch_slab+0x2f/frame 
0xc5f76b34
zone_import(c1b563c0,c1b49f0c,1d,1,c1b49f0c,...) at zone_import+0x67/frame 
0xc5f76b74
uma_zalloc_arg(c1b563c0,0,1,857,c1b35580,...) at uma_zalloc_arg+0x374/frame 
0xc5f76bb4
uma_zalloc_arg(c1b56300,0,1,a5c,c1b31310,...) at uma_zalloc_arg+0x319/frame 
0xc5f76bf4
uma_zfree_arg(c1b31300,c68d9390,0) at uma_zfree_arg+0x277/frame 0xc5f76c2c
g_destroy_bio(c68d9390,0,c68d9390,c5f76c80,c0b412d2,...) at 
g_destroy_bio+0x22/frame 0xc5f76c40
g_std_done(c68d9390,0,c110d464,e07,0,...) at g_std_done+0x33/frame 0xc5f76c54
biodone(c68d9390,0,c10ece47,60,0,...) at biodone+0xb2/frame 0xc5f76c80
g_io_schedule_up(c62ef930,0,c10ed066,5f,c5f76cf4,...) at 
g_io_schedule_up+0x129/frame 0xc5f76cb4
g_up_procbody(0,c5f76d08,c10f90d8,3d7,0,...) at g_up_procbody+0x9d/frame 
0xc5f76ccc
fork_exit(c0a15550,0,c5f76d08) at fork_exit+0x7f/frame 0xc5f76cf4
fork_trampoline() 

Re: [head tinderbox] failure on i386/i386

2013-05-12 Thread Jeff Roberson

On Sun, 12 May 2013, David Wolfskill wrote:


It appears that the issue is i386- (or 32-bit-) specific.

On Sun, May 12, 2013 at 07:16:48AM -0700, David Wolfskill wrote:

On Sun, May 12, 2013 at 11:45:37AM +, FreeBSD Tinderbox wrote:

TB --- 2013-05-12 05:50:18 - tinderbox 2.10 running on freebsd-current.sentex.ca
TB --- 2013-05-12 05:50:18 - FreeBSD freebsd-current.sentex.ca 8.3-PRERELEASE 
FreeBSD 8.3-PRERELEASE #0: Mon Mar 26 13:54:12 EDT 2012 
d...@freebsd-current.sentex.ca:/usr/obj/usr/src/sys/GENERIC  amd64
TB --- 2013-05-12 05:50:18 - starting HEAD tinderbox run for i386/i386
TB --- 2013-05-12 05:50:18 - cleaning the object tree
TB --- 2013-05-12 05:50:18 - /usr/local/bin/svn stat /src
TB --- 2013-05-12 05:50:23 - At svn revision 250553
TB --- 2013-05-12 05:50:24 - building world

...
...
ctfconvert -L VERSION -g vfs_lookup.o
clang -c -O -pipe  -std=c99 -g -Wall -Wredundant-decls -Wnested-externs 
-Wstrict-prototypes  -Wmissing-prototypes -Wpointer-arith -Winline -Wcast-qual  
-Wundef -Wno-pointer-sign -fformat-extensions  -Wmissing-include-dirs 
-fdiagnostics-show-option  -Wno-error-tautological-compare 
-Wno-error-empty-body  -Wno-error-parentheses-equality -nostdinc  -I. 
-I/usr/src/sys -I/usr/src/sys/contrib/altq -D_KERNEL 
-DHAVE_KERNEL_OPTION_HEADERS -include opt_global.h  -mno-aes -mno-avx -mno-mmx 
-mno-sse -msoft-float -ffreestanding -fstack-protector -Werror  
/usr/src/sys/kern/vfs_mountroot.c
:> export_syms
awk -f /usr/src/sys/conf/kmod_syms.awk drm2.kld  export_syms | xargs -J% 
objcopy % drm2.kld
clang -c -O -pipe  -std=c99 -g -Wall -Wredundant-decls -Wnested-externs 
-Wstrict-prototypes  -Wmissing-prototypes -Wpointer-arith -Winline -Wcast-qual  
-Wundef -Wno-pointer-sign -fformat-extensions  -Wmissing-include-dirs 
-fdiagnostics-show-option  -Wno-error-tautological-compare 
-Wno-error-empty-body  -Wno-error-parentheses-equality -nostdinc  -I. 
-I/usr/src/sys -I/usr/src/sys/contrib/altq -D_KERNEL 
-DHAVE_KERNEL_OPTION_HEADERS -include opt_global.h  -mno-aes -mno-avx -mno-mmx 
-mno-sse -msoft-float -ffreestanding -fstack-protector -Werror  
/usr/src/sys/kern/vfs_subr.c
ctfconvert -L VERSION -g vfs_mount.o
/usr/src/sys/kern/vfs_subr.c:305:1: error: '__assert_4' declared as an array 
with a negative size
PCTRIE_DEFINE(BUF, buf, b_lblkno, buf_trie_alloc, buf_trie_free);
^~~~
/usr/src/sys/sys/pctrie.h:40:66: note: expanded from macro 'PCTRIE_DEFINE'
CTASSERT(sizeof(((struct type *)0)->field) == sizeof(uint64_t));\
^
/usr/src/sys/sys/systm.h:100:21: note: expanded from macro '\
CTASSERT'
#define CTASSERT(x) _Static_assert(x, "compile-time assertion failed")
^~
/usr/src/sys/sys/cdefs.h:251:30: note: expanded from macro '_Static_assert'
#define _Static_assert(x, y)    __Static_assert(x, __COUNTER__)
^~~
/usr/src/sys/sys/cdefs.h:252:31: note: expanded from macro '__Static_assert'
#define __Static_assert(x, y)   ___Static_assert(x, y)
^~
/usr/src/sys/sys/cdefs.h:253:60: note: expanded from macro '___Static_assert'
#define ___Static_assert(x, y)  typedef char __assert_ ## y[(x) ? 1 : -1]
^~~~
1 error generated.
*** [vfs_subr.o] Error code 1



Based on the above, I reverted r250551 and re-started the make
buildkernel -- which succeeded:

FreeBSD g1-227.catwhisker.org 10.0-CURRENT FreeBSD 10.0-CURRENT #897  
r250557M/250558:132: Sun May 12 06:44:01 PDT 2013 
r...@g1-227.catwhisker.org:/usr/obj/usr/src/sys/CANARY  i386



However, I did not need to revert r250551 to build successfully on amd64:

FreeBSD g1-227.catwhisker.org 10.0-CURRENT FreeBSD 10.0-CURRENT #2  
rM/:132: Sun May 12 10:09:59 PDT 2013 
r...@g1-227.catwhisker.org:/usr/obj/usr/src/sys/CANARY  amd64



Thanks.  It looks like it's actually an alignment problem and that compile 
error is erroneous.  I'm going to check in a fix by weakening the 
alignment requirement to 32 bits and then build locally, but I will probably 
be racing tinderbox to verify that it resolves the 32-bit build.


Jeff


Peace,
david
--
David H. Wolfskill  da...@catwhisker.org
Taliban: Evil men with guns afraid of truth from a 14-year old girl.

See http://www.catwhisker.org/~david/publickey.gpg for my public key.


___
freebsd-current@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to freebsd-current-unsubscr...@freebsd.org


Re: Call for testers, users with scsi cards

2012-12-05 Thread Jeff Roberson

On Wed, 5 Dec 2012, Jim Harris wrote:




On Tue, Dec 4, 2012 at 2:36 PM, Jeff Roberson jrober...@jroberson.net
wrote:
  http://people.freebsd.org/~jeff/loadccb.diff

  This patch consolidates all of the functions that map cam
  control blocks for DMA into one central function.  This change
  is a precursor to adding new features to the I/O stack.  It is
  mostly mechanical.  If you are running current on a raid or scsi
  card, especially if it is a lesser used one, I would really like
  you to apply this patch and report back any problems.  If it
  works you should notice nothing.  If it doesn't work you will
  probably panic immediately on I/O or otherwise no I/O will
  happen.


+int
+bus_dmamap_load_ccb(bus_dma_tag_t dmat, bus_dmamap_t map, union ccb *ccb,
+            bus_dmamap_callback_t *callback, void *callback_arg,
+            int flags)
+{
+    struct ccb_ataio *ataio;
+    struct ccb_scsiio *csio;
+    struct ccb_hdr *ccb_h;
+    void *data_ptr;
+    uint32_t dxfer_len;
+    uint16_t sglist_cnt;
+
+    ccb_h = &ccb->ccb_h;
+    if ((ccb_h->flags & CAM_DIR_MASK) == CAM_DIR_NONE) {
+        callback(callback_arg, NULL, 0, 0);
+    }
+

I think you need to return here after invoking the callback.  Otherwise you
drop through and then either invoke the callback again or call
bus_dmamap_load (which will in turn invoke the callback again).

This fix allows the ahci.c change to go back to:



Thanks Jim.  That was silly of me.  I have decided to move this work to a 
branch and keep expanding on it.  I'll solicit more testing once the 
branch is closer to the ultimate goal.


Thanks,
Jeff


Index: sys/dev/ahci/ahci.c
===
--- sys/dev/ahci/ahci.c (revision 243900)
+++ sys/dev/ahci/ahci.c (working copy)
@@ -1667,23 +1667,9 @@
    (ccb->ataio.cmd.flags & (CAM_ATAIO_CONTROL |
CAM_ATAIO_NEEDRESULT)))
    ch->aslots |= (1 << slot->slot);
    slot->dma.nsegs = 0;
-   /* If request moves data, setup and load SG list */
-   if ((ccb->ccb_h.flags & CAM_DIR_MASK) != CAM_DIR_NONE) {
-   void *buf;
-   bus_size_t size;
-
-   slot->state = AHCI_SLOT_LOADING;
-   if (ccb->ccb_h.func_code == XPT_ATA_IO) {
-   buf = ccb->ataio.data_ptr;
-   size = ccb->ataio.dxfer_len;
-   } else {
-   buf = ccb->csio.data_ptr;
-   size = ccb->csio.dxfer_len;
-   }
-   bus_dmamap_load(ch->dma.data_tag, slot->dma.data_map,
-   buf, size, ahci_dmasetprd, slot, 0);
-   } else
-   ahci_execute_transaction(slot);
+   slot->state = AHCI_SLOT_LOADING;
+   bus_dmamap_load_ccb(ch->dma.data_tag, slot->dma.data_map, ccb,
+   ahci_dmasetprd, slot, 0);
 }
 
 /* Locked by busdma engine. */

This is almost what you had earlier, but adding back the setting of the slot's
state to AHCI_SLOT_LOADING, to cover the case where the load is deferred. 
It seems OK to do this even in the case where no load is actually happening
(i.e. CAM_DIR_NONE).

-Jim





___
freebsd-current@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to freebsd-current-unsubscr...@freebsd.org

Call for testers, users with scsi cards

2012-12-04 Thread Jeff Roberson

http://people.freebsd.org/~jeff/loadccb.diff

This patch consolidates all of the functions that map cam control blocks 
for DMA into one central function.  This change is a precursor to adding 
new features to the I/O stack.  It is mostly mechanical.  If you are 
running current on a raid or scsi card, especially if it is a lesser used 
one, I would really like you to apply this patch and report back any 
problems.  If it works you should notice nothing.  If it doesn't work you 
will probably panic immediately on I/O or otherwise no I/O will happen.


Thanks,
Jeff
___
freebsd-current@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to freebsd-current-unsubscr...@freebsd.org


Re: Call for testers, users with scsi cards

2012-12-04 Thread Jeff Roberson

On Tue, 4 Dec 2012, Ian Lepore wrote:


On Tue, 2012-12-04 at 14:49 -0700, Warner Losh wrote:

On Dec 4, 2012, at 2:36 PM, Jeff Roberson wrote:


http://people.freebsd.org/~jeff/loadccb.diff

This patch consolidates all of the functions that map cam control blocks for 
DMA into one central function.  This change is a precursor to adding new 
features to the I/O stack.  It is mostly mechanical.  If you are running 
current on a raid or scsi card, especially if it is a lesser used one, I would 
really like you to apply this patch and report back any problems.  If it works 
you should notice nothing.  If it doesn't work you will probably panic 
immediately on I/O or otherwise no I/O will happen.


I haven't tested it yet.  My only comment from reading it though would be to 
make subr_busdma.c depend on cam, since it can only be used from cam.  We've 
grown sloppy about noting these dependencies in the tree...

Warner


Hmmm, if it's only used by cam, why isn't it in cam/ rather than kern/ ?


kib pointed out drivers that use ccbs but do not depend on cam.  I also 
intend to consolidate many of the busdma_load_* functions into this 
subr_busdma.c eventually.  I will add a load_bio, and then things like load_uio 
and load_mbuf won't need to be re-implemented for every machine.  I will 
define a MD function that allows you to add virtual or physical segments 
piecemeal (as they all currently have) so that function may be called for 
each member in the uio, mbuf, ccb, or bio.


Thanks,
Jeff



-- Ian


___
freebsd-current@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to freebsd-current-unsubscr...@freebsd.org


Experimental SUJ feature; cache synchronization

2012-11-08 Thread Jeff Roberson

Hello,

As of rev 242815 current has a feature to issue a synchronize cache 
command to the drive in between journal records and the metadata they 
modify.  This should make SUJ more safe in the face of power failure. 
This should be considered experimental at this phase.  If you wish to try 
it you may enable it at any time with:


sysctl debug.softdep.flushcache=1

Thanks,
Jeff
___
freebsd-current@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to freebsd-current-unsubscr...@freebsd.org


Re: DragonFly vs FreeBSD scheduler

2012-11-03 Thread Jeff Roberson

On Sat, 3 Nov 2012, O. Hartmann wrote:


Am 11/03/12 15:17, schrieb Mark Felder:

On Sat, 3 Nov 2012 21:18:55 +0800
Alie Tan a...@affle.com wrote:


Hi,

No offence, just curious about scheduler and its functionality.

What is the difference between these two that makes FreeBSD's performance fall far
behind DragonFly BSD? http://www.dragonflybsd.org/release32/



I don't have any details but I do know that Dragonfly has been putting a lot of 
work into their scheduler. Hopefully some of that will trickle back our way.



Obviously they made the right decisions, but a single benchmark with a
DB server like postgresql doesn't tell the whole story. Let's see what
Phoronix will come up with. I'd like to see some more benchmarks of
DragonFly 3.2.

I doubt that the DragonFly scheduler approaches will go/flow easily into
FreeBSD. But I'd like to see it, even dumping ULE for a better approach.


It's not the scheduler.  It's lock contention in the vm and buffer cache. 
The scheduler can only schedule what is runnable.  We are working to 
address this problem.


Thanks,
Jeff











___
freebsd-current@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to freebsd-current-unsubscr...@freebsd.org


ULE patch, call for testers

2012-11-02 Thread Jeff Roberson
I have a small patch to the ULE scheduler that makes a fairly large change 
to the way timeshare threads are handled.


http://people.freebsd.org/~jeff/schedslice.diff

Previously ULE used a fixed slice size for all timeshare threads.  Now it 
scales the slice size down based on load.  This should reduce latency for 
timeshare threads as load increases.  It is important to note that this 
does not impact interactive threads.  But when a thread transitions to 
interactive from timeshare it should see some improvement.  This happens 
when something like Xorg chews up a lot of CPU.


If anyone has perf tests they'd like to run please report back.  I have 
done a handful of validation.


Thanks,
Jeff
___
freebsd-current@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to freebsd-current-unsubscr...@freebsd.org


Re: ULE patch, call for testers

2012-11-02 Thread Jeff Roberson



On Fri, 2 Nov 2012, Eitan Adler wrote:


On 2 November 2012 14:26, Jeff Roberson jrober...@jroberson.net wrote:

I have a small patch to the ULE scheduler that makes a fairly large change
to the way timeshare threads are handled.

http://people.freebsd.org/~jeff/schedslice.diff

Previously ULE used a fixed slice size for all timeshare threads.  Now it
scales the slice size down based on load.  This should reduce latency for
timeshare threads as load increases.  It is important to note that this does
not impact interactive threads.  But when a thread transitions to
interactive from timeshare it should see some improvement.  This happens
when something like Xorg chews up a lot of CPU.

If anyone has perf tests they'd like to run please report back.  I have done
a handful of validation.


does it make sense to make these sysctls?

+#define	SCHED_SLICE_DEFAULT_DIVISOR	10	/* 100 ms. */
+#define	SCHED_SLICE_MIN_DIVISOR		4	/* DEFAULT/MIN = 25 ms. */



DEFAULT_DIVISOR is indirectly adjustable through the sysctls that modify the slice. 
The min divisor could be made a sysctl.  I will consider adding that.


Thanks,
Jeff



--
Eitan Adler


___
freebsd-current@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to freebsd-current-unsubscr...@freebsd.org


Re: Increase the degree of interactivity ULE scheduler

2011-11-03 Thread Jeff Roberson

On Sat, 22 Oct 2011, Ivan Klymenko wrote:


Hello people!

I have:
CPU: Intel(R) Core(TM)2 Duo CPU T7250  @ 2.00GHz (1994.48-MHz K8-class CPU)
FreeBSD 10.0-CURRENT r226607 amd64

For example during the building of the port lang/gcc46 in four streams (-j 4) 
with a heavy load on the processor - use the system was nearly impossible - 
responsiveness was terrible - the mouse cursor sometimes froze on the spot a 
few seconds...


Am I right in understanding that you have only two cores?  What else is 
running that achieves poor interactivity?  What is the cpu utilization of 
your x server at this time?




I managed to achieve a significant increase in the degree of interactivity ULE 
scheduler due to the following changes:


This patch probably breaks nice, adaptive idling, and slows the 
interactivity computation.  That being said I'm not sure why it helps you.


It seems that there are increasing reports of bad interactivity creeping 
in to ULE over the last year.  If people can help provide me with data I 
can look into this more.


Thanks for your report.

Jeff



##
--- sched_ule.c.orig2011-10-22 11:40:30.0 +0300
+++ sched_ule.c 2011-10-22 12:25:05.0 +0300
@@ -2119,6 +2119,14 @@

THREAD_LOCK_ASSERT(td, MA_OWNED);
tdq = TDQ_SELF();
+   if (td->td_pri_class & PRI_FIFO_BIT)
+   return;
+   ts = td->td_sched;
+   /*
+* We used up one time slice.
+*/
+   if (--ts->ts_slice > 0)
+   return;
#ifdef SMP
/*
 * We run the long term load balancer infrequently on the first cpu.
@@ -2144,9 +2152,6 @@
if (TAILQ_EMPTY(&tdq->tdq_timeshare.rq_queues[tdq->tdq_ridx]))
tdq->tdq_ridx = tdq->tdq_idx;
}
-   ts = td->td_sched;
-   if (td->td_pri_class & PRI_FIFO_BIT)
-   return;
if (PRI_BASE(td->td_pri_class) == PRI_TIMESHARE) {
/*
 * We used a tick; charge it to the thread so
@@ -2157,11 +2162,6 @@
sched_priority(td);
}
/*
-* We used up one time slice.
-*/
-   if (--ts->ts_slice > 0)
-   return;
-   /*
 * We're out of time, force a requeue at userret().
 */
ts->ts_slice = sched_slice;
##

What do you think about this?

Thanks!

___
freebsd-current@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to freebsd-current-unsubscr...@freebsd.org


Re: couple of sched_ule issues

2011-11-03 Thread Jeff Roberson

On Thu, 15 Sep 2011, Andriy Gapon wrote:



This is more of a just for the record email.
I think I've already stated the following observations, but I suspect that they
drowned in the noise of a thread in which I mentioned them.

1. Incorrect topology is built for single-package SMP systems.
That topology has two levels (shared nothing and shared package) with 
exactly
the same CPU sets.  That doesn't work well with the rebalancing algorithm which
assumes that each level is a proper/strict subset of its parent.

2. CPU load comparison algorithms are biased towards lower logical CPU IDs.
With all other things being equal the algorithms will always pick a CPU with a
lower ID.  This creates certain load asymmetry and predictable patterns in load
distribution.


If all other things truly are equal, why does selecting a lower cpu number 
matter?




Another observation.
It seems that ULE makes a decision about thread-to-CPU affinity at the time 
when a
thread gets switched out.  This looks logical from the implementation point of
view.  But it doesn't seem logical from a general point of view - by the time the
thread becomes runnable again its affinity profile may be completely
different.  I think that it would depend on how much time a thread actually spends
not running.


The decision is made at sched_add() time.  sched_pickcpu() does the work 
and selects the run-queue we will be added to.  We consider the CPU that 
the thread was last running on but the decision is made at the time that a 
run queue must be selected.


Jeff



--
Andriy Gapon
___
freebsd-current@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to freebsd-current-unsubscr...@freebsd.org




Re: Increase the degree of interactivity ULE scheduler

2011-11-03 Thread Jeff Roberson

On Thu, 3 Nov 2011, Ivan Klymenko wrote:


Thank you for taking the time to answer me.

On Thu, 3 Nov 2011 10:21:48 -1000 (HST),
Jeff Roberson jrober...@jroberson.net wrote:


On Sat, 22 Oct 2011, Ivan Klymenko wrote:


Hello people!

I have:
CPU: Intel(R) Core(TM)2 Duo CPU T7250  @ 2.00GHz (1994.48-MHz
K8-class CPU) FreeBSD 10.0-CURRENT r226607 amd64

For example during the building of the port lang/gcc46 in four
streams (-j 4) with a heavy load on the processor - use the system
was nearly impossible - responsiveness was terrible - the mouse
cursor sometimes froze on the spot a few seconds...


Am I right in understanding that you have only two cores?


Yes.


What else is running that achieves poor interactivity?


This is mainly a compilation with make option -j = ncpu*2
And as an example - launching a large number of programs
http://www.youtube.com/watch?v=1CLCp-dqWu0
This patch lets me get along with ULE nearly as well as with FBFS.
Without the patch, partway through the build it became
difficult to control the mouse cursor under ULE.


What is the cpu utilization of your x server at this time?


~2.00% - 20.00% WCPU time... But sometimes it spikes up to 79%...
Once the load drops, the CPU returns to normal...


When the x server is down at 20% is it laggy?  Can you tell me the 
priorities of the x server and the compile tasks?  You can use the 'pri' 
keyword with ps and write a short script to log all priorities once per 
second during your test.  That would be most helpful.  Let me know if you 
need assistance with that.


Jeff







I managed to achieve a significant increase in the degree of
interactivity ULE scheduler due to the following changes:


This patch probably breaks nice, adaptive idling, and slows the
interactivity computation.  That being said I'm not sure why it helps
you.

It seems that there are increasing reports of bad interactivity
creeping in to ULE over the last year.  If people can help provide me
with data I can look into this more.



I'll be glad to provide data


Thanks for your report.

Jeff


How to repeat your tests on my system?
http://jeffr-tech.livejournal.com/24280.html

Sorry for my english.

Thanks!


___
freebsd-current@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to freebsd-current-unsubscr...@freebsd.org


SUJ progress report

2011-06-19 Thread Jeff Roberson

Hi Folks,

Kirk, Peter and I have been working hard on SUJ.  We still have a few bugs 
related to snapshots but we fixed the couple of potential corruption 
problems that came up over the last year.  If you are not currently using 
SUJ I implore you to do so and report any problems you may find.  We need 
to get better coverage if we are going to enable it for 9.0 which I think 
everyone would like to see.


I will send another update when it is safe to use SUJ + snapshots.

Thanks,
Jeff
___
freebsd-current@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to freebsd-current-unsubscr...@freebsd.org


Re: [head tinderbox] failure on sparc64/sun4v

2011-03-23 Thread Jeff Roberson
I did not notice this was still failing.  I didn't realize that make 
universe would pull in my modules.  I will fix this now.


Thanks,
Jeff


On Wed, 23 Mar 2011, FreeBSD Tinderbox wrote:


TB --- 2011-03-23 05:47:53 - tinderbox 2.6 running on freebsd-current.sentex.ca
TB --- 2011-03-23 05:47:53 - starting HEAD tinderbox run for sparc64/sun4v
TB --- 2011-03-23 05:47:53 - cleaning the object tree
TB --- 2011-03-23 05:48:03 - cvsupping the source tree
TB --- 2011-03-23 05:48:03 - /usr/bin/csup -z -r 3 -g -L 1 -h cvsup.sentex.ca 
/tinderbox/HEAD/sparc64/sun4v/supfile
TB --- 2011-03-23 05:48:16 - building world
TB --- 2011-03-23 05:48:16 - MAKEOBJDIRPREFIX=/obj
TB --- 2011-03-23 05:48:16 - PATH=/usr/bin:/usr/sbin:/bin:/sbin
TB --- 2011-03-23 05:48:16 - TARGET=sun4v
TB --- 2011-03-23 05:48:16 - TARGET_ARCH=sparc64
TB --- 2011-03-23 05:48:16 - TZ=UTC
TB --- 2011-03-23 05:48:16 - __MAKE_CONF=/dev/null
TB --- 2011-03-23 05:48:16 - cd /src
TB --- 2011-03-23 05:48:16 - /usr/bin/make -B buildworld

World build started on Wed Mar 23 05:48:16 UTC 2011
Rebuilding the temporary build tree
stage 1.1: legacy release compatibility shims
stage 1.2: bootstrap tools
stage 2.1: cleaning up the object tree
stage 2.2: rebuilding the object tree
stage 2.3: build tools
stage 3: cross tools
stage 4.1: building includes
stage 4.2: building libraries
stage 4.3: make dependencies
stage 4.4: building everything
World build completed on Wed Mar 23 06:52:31 UTC 2011

TB --- 2011-03-23 06:52:31 - generating LINT kernel config
TB --- 2011-03-23 06:52:31 - cd /src/sys/sun4v/conf
TB --- 2011-03-23 06:52:31 - /usr/bin/make -B LINT
TB --- 2011-03-23 06:52:31 - building LINT kernel
TB --- 2011-03-23 06:52:31 - MAKEOBJDIRPREFIX=/obj
TB --- 2011-03-23 06:52:31 - PATH=/usr/bin:/usr/sbin:/bin:/sbin
TB --- 2011-03-23 06:52:31 - TARGET=sun4v
TB --- 2011-03-23 06:52:31 - TARGET_ARCH=sparc64
TB --- 2011-03-23 06:52:31 - TZ=UTC
TB --- 2011-03-23 06:52:31 - __MAKE_CONF=/dev/null
TB --- 2011-03-23 06:52:31 - cd /src
TB --- 2011-03-23 06:52:31 - /usr/bin/make -B buildkernel KERNCONF=LINT

Kernel build for LINT started on Wed Mar 23 06:52:31 UTC 2011
stage 1: configuring the kernel
stage 2.1: cleaning up the object tree
stage 2.2: rebuilding the object tree
stage 2.3: build tools
stage 3.1: making dependencies
stage 3.2: building everything

[...]
from 
/src/sys/modules/mlx4/../../ofed/include/linux/spinlock.h:37,
from /src/sys/modules/mlx4/../../ofed/include/linux/mm.h:31,
from 
/src/sys/modules/mlx4/../../ofed/drivers/net/mlx4/alloc.c:36:
./machine/cpu.h:71:1: error: "unlikely" redefined
In file included from /src/sys/modules/mlx4/../../ofed/include/linux/types.h:33,
from /src/sys/modules/mlx4/../../ofed/include/linux/slab.h:36,
from 
/src/sys/modules/mlx4/../../ofed/drivers/net/mlx4/alloc.c:35:
/src/sys/modules/mlx4/../../ofed/include/linux/compiler.h:60:1: error: this is 
the location of the previous definition
*** Error code 1

Stop in /src/sys/modules/mlx4.
*** Error code 1

Stop in /src/sys/modules.
*** Error code 1

Stop in /obj/sun4v.sparc64/src/sys/LINT.
*** Error code 1

Stop in /src.
*** Error code 1

Stop in /src.
TB --- 2011-03-23 07:08:58 - WARNING: /usr/bin/make returned exit code  1
TB --- 2011-03-23 07:08:58 - ERROR: failed to build lint kernel
TB --- 2011-03-23 07:08:58 - 3828.70 user 746.21 system 4864.72 real


http://tinderbox.freebsd.org/tinderbox-head-HEAD-sparc64-sun4v.full
___
freebsd-current@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to freebsd-current-unsubscr...@freebsd.org




Re: HEADS UP: OFED stack merge tonight.

2011-03-23 Thread Jeff Roberson

On Wed, 23 Mar 2011, Mattia Rossi wrote:


On 21/03/11 18:36, Jeff Roberson wrote:
[..]

I would not expect any instability in non ofed systems from this import.



Hi Jeff,

I've tried to compile netstat, and it bails out immediately if I try to make 
obj:


/usr/src/usr.bin/netstat/Makefile, line 21: Malformed conditional 
(${MK_OFED} != "no")

/usr/src/usr.bin/netstat/Makefile, line 23: if-less endif
make: fatal errors encountered -- cannot continue


It looks to me like you need to upgrade your /usr/share/mk files from a 
buildworld.  MK_OFED is only defined if bsd.own.mk is updated.


Thanks,
Jeff



Mat


___
freebsd-current@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to freebsd-current-unsubscr...@freebsd.org


HEADS UP: OFED stack merge tonight.

2011-03-21 Thread Jeff Roberson

Hi Folks,

Just a notice that the 1.5.3 OFED Infiniband stack will be committed as 
soon as a fresh buildworld/installworld completes on my test machine. 
This brings in new drivers and user code and a very small number of 
changes to system sources.  OFED will not be built by default and 
WITH_OFED must be defined in /etc/make.conf to enable it.  I would not 
expect any instability in non ofed systems from this import.


Thanks,
Jeff
___
freebsd-current@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to freebsd-current-unsubscr...@freebsd.org


SUJ Bugs

2011-03-20 Thread Jeff Roberson

Hello,

I have lost track of what bugs are presently outstanding in SUJ.  If you 
have experienced a problem that prevented you from using SUJ please try 
again and report your issue to me directly.  I don't always have the time 
to read current@ so mails that cc me directly will get more attention.


We had a solid bug busting push a few months ago.  I have more time now to 
look at things again and I intend to address the few performance issues 
we've turned up as well.


The goal is to make SUJ the default for 9.0 but we can only make this 
happen with your assistance.  My apologies if I missed emails with issues.


Thanks,
Jeff
___
freebsd-current@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to freebsd-current-unsubscr...@freebsd.org


Re: Panic on current when enabling SUJ

2010-06-03 Thread Jeff Roberson

On Thu, 3 Jun 2010, John Doe wrote:


Boot into single user-mode

# tunefs -j enable /
# tunefs -j enable /usr
# tunefs -j enable /tmp
# tunefs -j enable /var
# reboot

The machine then panics.

Looks like the machine is trying to write to a read-only filesystem.


Can you please give me information on the panic?  What was the state of 
the filesystems upon reboot?  Does dumpfs show suj enabled?


Thanks,
Jeff





___
freebsd-current@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to freebsd-current-unsubscr...@freebsd.org




Re: ffs_copyonwrite panics

2010-05-23 Thread Jeff Roberson

On Sun, 23 May 2010, Roman Bogorodskiy wrote:


 Jeff Roberson wrote:


On Tue, 18 May 2010, Roman Bogorodskiy wrote:


Hi,

I've been using -CURRENT last update in February for quite a long time
and few weeks ago decided to finally update it. The update was quite
unfortunate as system became very unstable: it just hangs few times a
day and panics sometimes.

Some things can be reproduced, some cannot. Reproducible ones:

1. background fsck always makes system hang
2. system crashes on operations with nullfs mounts (disabled that for
now)

The most annoying one is ffs_copyonwrite panic which I cannot reproduce.
The thing is that if I will run 'startx' on it with some X apps it will
panic just in few minutes. When I leave the box with nearly no stress
(just use it as internet gateway for my laptop) it behaves a little
better but will eventually crash in few hours anyway.


This may have been my fault.  Can you please update and let me know if it
is resolved?  There was both a deadlock and a copyonwrite panic as a
result of the softupdates journaling import.  I just fixed the deadlock
today.


Tried today's -CURRENT and unfortunately the behaviour is still the same.


Can you give me a full stack trace?  Do you have coredumps enabled?  I 
would like to have you look at a few things in a core or send it to me 
with your kernel.


Thanks,
Jeff



Roman Bogorodskiy


___
freebsd-current@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to freebsd-current-unsubscr...@freebsd.org


Re: LOR: ufs vs bufwait

2010-05-21 Thread Jeff Roberson

On Fri, 21 May 2010, Erik Cederstrand wrote:



Den 12/05/2010 kl. 22.44 skrev Jeff Roberson:


I think Peter Holm also saw this once while we were testing SUJ and reproduced 
~30 second hangs with stock sources.  At this point we need to brainstorm ideas 
for adding debugging instrumentation and come up with the quickest possible 
repro.


FWIW, I get this LOR on a ClangBSD virtual machine running the stress2 test 
suite.

I can reproduce the LOR reliably like this:

# cd stress2
#./run.sh lockf.cfg
- press ctrl-C
- another LOR is triggered by the ctrl-C (a dirhash/bufwait LOR described in 
kern/137852)
# ./run.sh mkdir.cfg
- LOR is triggered immediately

Erik


The LOR is actually safe.  I need to bless the acquisition.  We have 
always acquired the buffers in this order.


The deadlocks people were seeing were actually livelocks due to 
softdepflush looping indefinitely.  I have committed a fix for that.


Thanks,
Jeff
___
freebsd-current@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to freebsd-current-unsubscr...@freebsd.org


Re: ffs_copyonwrite panics

2010-05-19 Thread Jeff Roberson

On Tue, 18 May 2010, Roman Bogorodskiy wrote:


Hi,

I've been using -CURRENT last update in February for quite a long time
and few weeks ago decided to finally update it. The update was quite
unfortunate as system became very unstable: it just hangs few times a
day and panics sometimes.

Some things can be reproduced, some cannot. Reproducible ones:

1. background fsck always makes system hang
2. system crashes on operations with nullfs mounts (disabled that for
now)

The most annoying one is ffs_copyonwrite panic which I cannot reproduce.
The thing is that if I will run 'startx' on it with some X apps it will
panic just in few minutes. When I leave the box with nearly no stress
(just use it as internet gateway for my laptop) it behaves a little
better but will eventually crash in few hours anyway.


This may have been my fault.  Can you please update and let me know if it 
is resolved?  There was both a deadlock and a copyonwrite panic as a 
result of the softupdates journaling import.  I just fixed the deadlock 
today.


Thanks,
Jeff



The even more annoying thing is that I cannot save the dump,
because when the system boots and runs 'savecore' it leads to an
ffs_copyonwrite panic as well. The panic happens when it is about 90% complete
(as seen via ctrl-t).

Any ideas how to debug and get rid of this issue?

System arch is amd64. I don't know what other details could be useful.

Roman Bogorodskiy


___
freebsd-current@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to freebsd-current-unsubscr...@freebsd.org


SUJ Changes

2010-05-17 Thread Jeff Roberson
I fixed the sparse inode tunefs bug and changed the tunefs behavior based 
on discussions here on curr...@.  Hopefully this works for everyone.


I have one bad performance bug and one journal overflow bug left to resolve.
Please keep the reports coming, and thank you for your help.


Thanks,
Jeff

-- Forwarded message --
Date: Tue, 18 May 2010 01:45:28 + (UTC)
From: Jeff Roberson j...@freebsd.org
To: src-committ...@freebsd.org, svn-src-...@freebsd.org,
svn-src-h...@freebsd.org
Subject: svn commit: r208241 - head/sbin/tunefs

Author: jeff
Date: Tue May 18 01:45:28 2010
New Revision: 208241
URL: http://svn.freebsd.org/changeset/base/208241

Log:
   - Round up the journal size to the block size so we don't confuse fsck.

     Reported by:	Mikolaj Golub to.my.troc...@gmail.com

   - Only require 256k of blocks per-cg when trying to allocate contiguous
     journal blocks.  The storage may not actually be contiguous but is at
     least within one cg.
   - When disabling SUJ leave SU enabled and report this to the user.  It
     is expected that users will upgrade SU filesystems to SUJ and want
     a similar downgrade path.

Modified:
  head/sbin/tunefs/tunefs.c

Modified: head/sbin/tunefs/tunefs.c
===================================================================
--- head/sbin/tunefs/tunefs.c	Tue May 18 00:46:15 2010	(r208240)
+++ head/sbin/tunefs/tunefs.c	Tue May 18 01:45:28 2010	(r208241)
@@ -358,10 +358,12 @@ main(int argc, char *argv[])
 			warnx("%s remains unchanged as disabled", name);
 		} else {
 			journal_clear();
-			sblock.fs_flags &= ~(FS_DOSOFTDEP | FS_SUJ);
+			sblock.fs_flags &= ~FS_SUJ;
 			sblock.fs_sujfree = 0;
-			warnx("%s cleared, "
-			    "remove .sujournal to reclaim space", name);
+			warnx("%s cleared but soft updates still set.",
+			    name);
+
+			warnx("remove .sujournal to reclaim space");
 		}
 	}
 }
@@ -546,7 +548,7 @@ journal_balloc(void)
 		 * Try to minimize fragmentation by requiring a minimum
 		 * number of blocks present.
 		 */
-		if (cgp->cg_cs.cs_nbfree > 128 * 1024 * 1024)
+		if (cgp->cg_cs.cs_nbfree > 256 * 1024)
 			break;
 		if (contig == 0 && cgp->cg_cs.cs_nbfree)
 			break;
@@ -906,6 +908,8 @@ journal_alloc(int64_t size)
 		if (size / sblock.fs_fsize > sblock.fs_fpg)
 			size = sblock.fs_fpg * sblock.fs_fsize;
 		size = MAX(SUJ_MIN, size);
+		/* fsck does not support fragments in journal files. */
+		size = roundup(size, sblock.fs_bsize);
 	}
 	resid = blocks = size / sblock.fs_bsize;
 	if (sblock.fs_cstotal.cs_nbfree < blocks) {


Re: LOR: ufs vs bufwait

2010-05-12 Thread Jeff Roberson

On Wed, 12 May 2010, Ulrich Spörlein wrote:


On Mon, 10.05.2010 at 22:53:32 +0200, Attilio Rao wrote:

2010/5/10 Peter Jeremy peterjer...@acm.org:

On 2010-May-08 12:20:05 +0200, Ulrich Spörlein u...@spoerlein.net wrote:

This LOR also is not yet listed on the LOR page, so I guess it's rather
new. I do use SUJ.

lock order reversal:
1st 0xc48388d8 ufs (ufs) @ /usr/src/sys/kern/vfs_lookup.c:502
2nd 0xec0fe304 bufwait (bufwait) @ /usr/src/sys/ufs/ffs/ffs_softdep.c:11363
3rd 0xc49e56b8 ufs (ufs) @ /usr/src/sys/kern/vfs_subr.c:2091


I'm seeing exactly the same LOR (and subsequent deadlock) on a recent
-current without SUJ.


I think this LOR was reported since a long time.
The deadlock may be new and someway related to the vm_page_lock work
(if not SUJ).


I was not able to reproduce this with a kernel from before SUJ went in; a
kernel from just after SUJ went in shows this deadlock or infinite loop ...

Now it might be that the SUJ kernel only increases the pressure so that it
happens during a system's uptime. It does not seem directly related to
actually using SUJ on a volume, as I could reproduce it with SU only,
too.

I will try to get a hang not involving GELI and also re-do my tests when
the volumes have neither SUJ nor SU enabled, which led to 10-20s hangs
of the system IIRC. It seems SU/SUJ then only prolongs these hangs ad
infinitum.


I think Peter Holm also saw this once while we were testing SUJ and 
reproduced ~30 second hangs with stock sources.  At this point we need to 
brainstorm ideas for adding debugging instrumentation and come up with the 
quickest possible repro.


It would probably be good to add some KTR tracing and log that when it 
wedges.  The core I looked at was hung in bufwait.  Is there any CPU 
activity or I/O activity when things hang?  You'll probably have to keep 
iostat/vmstat in memory to find out, so they don't try to fault in pages 
once things are hung.


Thanks,
Jeff



I'll be back next week with new results here

Uli




Re: LOR: ufs vs bufwait

2010-05-08 Thread Jeff Roberson

On Sat, 8 May 2010, Ulrich Spörlein wrote:


On Sat, 08.05.2010 at 18:00:50 +0200, Attilio Rao wrote:

2010/5/8 Ulrich Spörlein u...@spoerlein.net:

On Sat, 08.05.2010 at 12:20:05 +0200, Ulrich Spörlein wrote:

This LOR also is not yet listed on the LOR page, so I guess it's rather
new. I do use SUJ.

lock order reversal:
 1st 0xc48388d8 ufs (ufs) @ /usr/src/sys/kern/vfs_lookup.c:502
 2nd 0xec0fe304 bufwait (bufwait) @ /usr/src/sys/ufs/ffs/ffs_softdep.c:11363
 3rd 0xc49e56b8 ufs (ufs) @ /usr/src/sys/kern/vfs_subr.c:2091
KDB: stack backtrace:
db_trace_self_wrapper(c09394fe,fb817308,c062e515,c061e8ab,c093c4f1,...) at 
db_trace_self_wrapper+0x26
kdb_backtrace(c061e8ab,c093c4f1,c418b168,c418ef28,fb817364,...) at 
kdb_backtrace+0x29
_witness_debugger(c093c4f1,c49e56b8,c092e785,c418ef28,c094369d,...) at 
_witness_debugger+0x25
witness_checkorder(c49e56b8,9,c094369d,82b,0,...) at witness_checkorder+0x839
__lockmgr_args(c49e56b8,80100,c49e56d8,0,0,...) at __lockmgr_args+0x7f9
ffs_lock(fb817488,c062e2bb,c0942b3f,80100,c49e5660,...) at ffs_lock+0x82
VOP_LOCK1_APV(c09bd600,fb817488,c4827cd4,c09d62a0,c49e5660,...) at 
VOP_LOCK1_APV+0xb5
_vn_lock(c49e5660,80100,c094369d,82b,4,...) at _vn_lock+0x5e
vget(c49e5660,80100,c4827c30,50,0,...) at vget+0xb9
vfs_hash_get(c47bea20,b803,8,c4827c30,fb8175d8,...) at vfs_hash_get+0xe6
ffs_vgetf(c47bea20,b803,8,fb8175d8,1,...) at ffs_vgetf+0x49
softdep_sync_metadata(c4838880,0,c0962957,144,0,...) at 
softdep_sync_metadata+0xc82
ffs_syncvnode(c4838880,1,c4827c30,fb817698,246,...) at ffs_syncvnode+0x3e2
ffs_truncate(c4838880,200,0,880,c41fb480,...) at ffs_truncate+0x862
ufs_direnter(c4838880,c49e5660,fb81794c,fb817bd4,0,...) at ufs_direnter+0x8d4
ufs_makeinode(fb817bd4,0,fb817b30,fb817a94,c08e4cf5,...) at ufs_makeinode+0x517
ufs_create(fb817b30,fb817b48,0,0,fb817ba8,...) at ufs_create+0x30
VOP_CREATE_APV(c09bd600,fb817b30,2,fb817ac0,0,...) at VOP_CREATE_APV+0xa5
vn_open_cred(fb817ba8,fb817c5c,1a4,0,c41fb480,...) at vn_open_cred+0x1de
vn_open(fb817ba8,fb817c5c,1a4,c47e2428,0,...) at vn_open+0x3b
kern_openat(c4827c30,ff9c,804c5e8,0,602,...) at kern_openat+0x125
kern_open(c4827c30,804c5e8,0,601,21b6,...) at kern_open+0x35
open(c4827c30,fb817cf8,c0972725,c091f062,c47ea2a8,...) at open+0x30
syscall(fb817d38) at syscall+0x220
Xint0x80_syscall() at Xint0x80_syscall+0x20
--- syscall (5, FreeBSD ELF32, open), eip = 0x2817bf33, esp = 0xbfbfec4c, ebp = 
0xbfbfecb8 ---


And now the system is hanging again. While I can still ping and receive
dmesg updates (eg. USB ports appearing), I/O is frozen solid. This is
during portupgrade, when the configure script runs and usually takes 1-2
minutes to provoke.

This part looks suspicious to me:

db show alllocks
Process 28014 (mkdir) thread 0xc691ac30 (100152)
exclusive lockmgr bufwait (bufwait) r = 0 (0xec2bdaf0) locked @ 
/usr/src/sys/ufs/ffs/ffs_softdep.c:10684
exclusive lockmgr ufs (ufs) r = 0 (0xc6bcd5a8) locked @ 
/usr/src/sys/kern/vfs_subr.c:2091
exclusive lockmgr bufwait (bufwait) r = 0 (0xec2983f4) locked @ 
/usr/src/sys/ufs/ffs/ffs_softdep.c:11363
exclusive lockmgr ufs (ufs) r = 0 (0xc6d976b8) locked @ 
/usr/src/sys/kern/vfs_lookup.c:502
Process 1990 (sshd) thread 0xc5462750 (100117)
exclusive sx so_rcv_sx (so_rcv_sx) r = 0 (0xc546e08c) locked @ 
/usr/src/sys/kern/uipc_sockbuf.c:148
Process 12 (intr) thread 0xc41f4750 (14)
exclusive sleep mutex ttymtx (ttymtx) r = 0 (0xc425ae04) locked @ 
/usr/src/sys/dev/dcons/dcons_os.c:232
db


Along with show alllocks may you also get the following from DDB:
ps, show pcpu, alltrace, lockedvnods.


1. a kernel before SUJ went in is running fine with SU only
2. the following is on a recent -CURRENT that has SUJ, *but* i've
disabled it, so it is running with soft-updates only (I hope)

I ran a portupgrade and the first configure script triggered the I/O
hang

db ps
 pid  ppid  pgrp   uid   state   wmesg wchancmd
13467 13444 12937 0  R+  mkdir
13444 13204 12937 0  S+  wait 0xc54352a8 sh
13204 13035 12937 0  S+  wait 0xc5436000 sh
13035 12937 12937 0  S+  wait 0xc4ffad48 sh
12937 12936 12937 0  Ss+ wait 0xc4ff9d48 make
12936  3722  3722 0  R+  script
3722  2021  3722 0  S+  (threaded)  ruby18
100132   S   wait 0xc4ffa7f8 ruby18
2404  2007  2404  1000  Ss+ ttyin0xc4d74870 zsh
2325  2015  2325  1000  R+  top
2021  2009  2021 0  S+  pause0xc4ff9058 csh
2015  2007  2015  1000  Ss+ pause0xc4ffa058 zsh
2009  2007  2009  1000  Ss+ pause0xc4d4e850 zsh
2007  2006  2007  1000  Rs  screen
2006  1991  2006  1000  R+  screen
2005  2001  2005 0  R+  systat
2001  1976  2001 0  S+  pause0xc3d52058 csh
2000 1  2000 0  Ss  select   0xc3d5b1a4 ssh-agent
1991  1990  1991  1000  Ss+ pause0xc3d52850 zsh

Re: SUJ update - new panic - ffs_copyonwrite: recursive call

2010-05-07 Thread Jeff Roberson

On Sun, 2 May 2010, Vladimir Grebenschikov wrote:


Hi

While 'make buildworld'

kgdb /boot/kernel/kernel /var/crash/vmcore.13
GNU gdb 6.1.1 [FreeBSD]


Hi Vladimir,

I checked in a fix for this at revision 207742.  If you can verify that it 
works for you it would be appreciated.


Thanks!
Jeff


...
#0  0xc056b93c in doadump ()
(kgdb) bt
#0  0xc056b93c in doadump ()
#1  0xc0489019 in db_fncall ()
#2  0xc0489411 in db_command ()
#3  0xc048956a in db_command_loop ()
#4  0xc048b3ed in db_trap ()
#5  0xc05985a4 in kdb_trap ()
#6  0xc06f8b5e in trap ()
#7  0xc06dd6eb in calltrap ()
#8  0xc059870a in kdb_enter ()
#9  0xc056c1d1 in panic ()
#10 0xc066d602 in ffs_copyonwrite ()
#11 0xc068742a in ffs_geom_strategy ()
#12 0xc05d8955 in bufwrite ()
#13 0xc0686e64 in ffs_bufwrite ()
#14 0xc067a8a2 in softdep_sync_metadata ()
#15 0xc068c568 in ffs_syncvnode ()
#16 0xc0681425 in softdep_prealloc ()
#17 0xc066592a in ffs_balloc_ufs2 ()
#18 0xc066a252 in ffs_snapblkfree ()
#19 0xc065eb9a in ffs_blkfree ()
#20 0xc0673de0 in freework_freeblock ()
#21 0xc06797c7 in handle_workitem_freeblocks ()
#22 0xc0679aaf in process_worklist_item ()
#23 0xc06821f4 in softdep_process_worklist ()
#24 0xc0682940 in softdep_flush ()
#25 0xc0542a00 in fork_exit ()
#26 0xc06dd760 in fork_trampoline ()
(kgdb) x/s panicstr
0xc07c2b80:  ffs_copyonwrite: recursive call
(kgdb)



--
Vladimir B. Grebenschikov
v...@fbsd.ru




Re: SUJ deadlock

2010-05-07 Thread Jeff Roberson

On Fri, 7 May 2010, Fabien Thomas wrote:


fixed/works a lot better for me.


Thanks Fabien,  I just committed this.

Thanks everyone for the assistance finding bugs so far.  Please let me 
know if you run into anything else.  For now I don't know of any other 
than some feature/change requests for tunefs.


Thanks,
Jeff




Applied and restarted portupgrade.
Will tell you tomorrow.

Fabien

On 6 May 2010, at 00:54, Jeff Roberson wrote:


On Mon, 3 May 2010, Fabien Thomas wrote:


Hi Jeff,

I'm with r207548 now and since some days i've system deadlock.
It seems related to SUJ with process waiting on suspfs or ppwait.


I've also seen it stalled in suspfs, but this information is way better
than what I was able to garner.   I was only able to tell via ctrl-t on
a stalled 'ls' process in a terminal before hard booting.

Right now it occurs every time I attempt to do the portmaster -a upgrade
of X/KDE on this system.


I've spotted this during multiple portupgrade -aR :)


Can anyone who has experienced this hang test this patch:

Thanks,
Jeff

Index: ffs_softdep.c
===================================================================
--- ffs_softdep.c	(revision 207480)
+++ ffs_softdep.c	(working copy)
@@ -9301,7 +9301,7 @@
 		hadchanges = 1;
 	}
 	/* Leave this inodeblock dirty until it's in the list. */
-	if ((inodedep->id_state & (UNLINKED | DEPCOMPLETE)) == UNLINKED)
+	if ((inodedep->id_state & (UNLINKED | UNLINKONLIST)) == UNLINKED)
 		hadchanges = 1;
 	/*
 	 * If we had to rollback the inode allocation because of




Fabien








Re: SUJ deadlock

2010-05-05 Thread Jeff Roberson

On Mon, 3 May 2010, Fabien Thomas wrote:


Hi Jeff,

I'm with r207548 now and since some days i've system deadlock.
It seems related to SUJ with process waiting on suspfs or ppwait.


I've also seen it stalled in suspfs, but this information is way better
than what I was able to garner.   I was only able to tell via ctrl-t on
a stalled 'ls' process in a terminal before hard booting.

Right now it occurs every time I attempt to do the portmaster -a upgrade
of X/KDE on this system.


I've spotted this during multiple portupgrade -aR :)


Can anyone who has experienced this hang test this patch:

Thanks,
Jeff

Index: ffs_softdep.c
===================================================================
--- ffs_softdep.c	(revision 207480)
+++ ffs_softdep.c	(working copy)
@@ -9301,7 +9301,7 @@
 		hadchanges = 1;
 	}
 	/* Leave this inodeblock dirty until it's in the list. */
-	if ((inodedep->id_state & (UNLINKED | DEPCOMPLETE)) == UNLINKED)
+	if ((inodedep->id_state & (UNLINKED | UNLINKONLIST)) == UNLINKED)
 		hadchanges = 1;
 	/*
 	 * If we had to rollback the inode allocation because of




Fabien




Re: SUJ update

2010-05-04 Thread Jeff Roberson

On Mon, 3 May 2010, Ed Maste wrote:


On Mon, May 03, 2010 at 04:32:37PM -0700, Doug Barton wrote:


I also don't want to bikeshed this to death. I imagine that once the
feature is stable that users will just twiddle it once and then leave it
alone, or it will be set at install time and then not twiddled at all. :)


Speaking of which, is there any reason for us not to support enabling SU+J
at newfs time?  (Other than just needing a clean way to share the code
between tunefs and newfs.)


The code is actually totally different between the two so it'll 
essentially have to be rewritten in newfs.  tunefs uses libufs and some of 
the code for manipulating directories that was added to tunefs needs to be 
moved back into libufs and made more general.  However, newfs doesn't use 
libufs anyway.  So it'd have to be converted or you'd just have to 
re-write journal creation.


For now, I think an extra step in the installer is probably easier.

Thanks,
Jeff



-Ed




Re: SUJ deadlock

2010-05-03 Thread Jeff Roberson

On Mon, 3 May 2010, Fabien Thomas wrote:


Hi Jeff,

I'm with r207548 now and since some days i've system deadlock.
It seems related to SUJ with process waiting on suspfs or ppwait.


I've also seen it stalled in suspfs, but this information is way better
than what I was able to garner.   I was only able to tell via ctrl-t on
a stalled 'ls' process in a terminal before hard booting.

Right now it occurs every time I attempt to do the portmaster -a upgrade
of X/KDE on this system.


I've spotted this during multiple portupgrade -aR :)


Hi folks,

I'm really not sure why I haven't been able to reproduce this.  I do have 
some debugging info reported by others.  Hopefully it will be sufficient. 
I will send another mail when I resolve the issue, and if I cannot I may 
ask for coredumps or other details.


Thanks,
Jeff



Fabien




Re: SUJ update

2010-05-02 Thread Jeff Roberson

On Sun, 2 May 2010, Fabien Thomas wrote:


Hi Jeff,

Before sending the 'bad' part I would like to say that it is very useful and 
has saved me a lot of time after a crash.

I updated the ports and there was no more space left on the FS.
It ended up with this backtrace (after one reboot the kernel crashed a second 
time with the same backtrace):


When did you update?  I fixed a bug that looked just like this a day or 
two ago.


Thanks,
Jeff



(kgdb) bt
#0  doadump () at /usr/home/fabient/fabient-sandbox/sys/kern/kern_shutdown.c:245
#1  0xc0a1a8fe in boot (howto=260) at 
/usr/home/fabient/fabient-sandbox/sys/kern/kern_shutdown.c:416
#2  0xc0a1ad4c in panic (fmt=Could not find the frame base for panic.
) at /usr/home/fabient/fabient-sandbox/sys/kern/kern_shutdown.c:590
#3  0xc0d058b3 in remove_from_journal (wk=0xc4b4aa80) at 
/usr/home/fabient/fabient-sandbox/sys/ufs/ffs/ffs_softdep.c:2204
#4  0xc0d07ebb in cancel_jaddref (jaddref=0xc4b4aa80, inodedep=0xc46bed00, 
wkhd=0xc46bed5c)
   at /usr/home/fabient/fabient-sandbox/sys/ufs/ffs/ffs_softdep.c:3336
#5  0xc0d09401 in softdep_revert_mkdir (dp=0xc46ba6cc, ip=0xc4bba244)
   at /usr/home/fabient/fabient-sandbox/sys/ufs/ffs/ffs_softdep.c:3898
#6  0xc0d37c49 in ufs_mkdir (ap=0xc8510b2c) at 
/usr/home/fabient/fabient-sandbox/sys/ufs/ufs/ufs_vnops.c:1973
#7  0xc0e7bc6e in VOP_MKDIR_APV (vop=0xc1085ea0, a=0xc8510b2c) at 
vnode_if.c:1534
#8  0xc0add64a in VOP_MKDIR (dvp=0xc485e990, vpp=0xc8510bec, cnp=0xc8510c00, 
vap=0xc8510b6c) at vnode_if.h:665
#9  0xc0add58f in kern_mkdirat (td=0xc4649720, fd=-100, path=0x804e9a0 Address 
0x804e9a0 out of bounds,
   segflg=UIO_USERSPACE, mode=448) at 
/usr/home/fabient/fabient-sandbox/sys/kern/vfs_syscalls.c:3783
#10 0xc0add2fe in kern_mkdir (td=0xc4649720, path=0x804e9a0 Address 0x804e9a0 out 
of bounds, segflg=UIO_USERSPACE, mode=448)
   at /usr/home/fabient/fabient-sandbox/sys/kern/vfs_syscalls.c:3727
#11 0xc0add289 in mkdir (td=0xc4649720, uap=0x0) at 
/usr/home/fabient/fabient-sandbox/sys/kern/vfs_syscalls.c:3706
#12 0xc0e5324b in syscall (frame=0xc8510d38) at 
/usr/home/fabient/fabient-sandbox/sys/i386/i386/trap.c:1116
#13 0xc0e2b3c0 in Xint0x80_syscall () at 
/usr/home/fabient/fabient-sandbox/sys/i386/i386/exception.s:261
#14 0x0033 in ?? ()
Previous frame inner to this frame (corrupt stack?)
(kgdb)

Regards,
Fabien



Hello,

I fixed a few SUJ bugs.  If those of you who reported one of the following bugs 
could re-test I would greatly appreciate it.

1)  panic on gnome start via softdep_cancel_link().
2)  Difficulty setting flags on /.  This can only be done from a direct boot 
into single user but there were problems with tunefs that could lead to the 
kernel and disk becoming out of sync with filesystem state.
3)  Kernel compiles without SOFTUPDATES defined in the config now work.

I have had some reports of a hang waiting on journal space with certain types 
of activity.  I have only had this reported twice and I am not able to 
reproduce no matter how much load I throw at the machine.  If you reproduce 
this please try to get a coredump or minidump.

Thanks,
Jeff





Re: SUJ update - new panic - ffs_copyonwrite: recursive call

2010-05-02 Thread Jeff Roberson

On Sun, 2 May 2010, Vladimir Grebenschikov wrote:


Hi

While 'make buildworld'


This is a problem with snapshots and the journal full condition.  I will 
address it shortly.


Thanks,
Jeff



kgdb /boot/kernel/kernel /var/crash/vmcore.13
GNU gdb 6.1.1 [FreeBSD]
...
#0  0xc056b93c in doadump ()
(kgdb) bt
#0  0xc056b93c in doadump ()
#1  0xc0489019 in db_fncall ()
#2  0xc0489411 in db_command ()
#3  0xc048956a in db_command_loop ()
#4  0xc048b3ed in db_trap ()
#5  0xc05985a4 in kdb_trap ()
#6  0xc06f8b5e in trap ()
#7  0xc06dd6eb in calltrap ()
#8  0xc059870a in kdb_enter ()
#9  0xc056c1d1 in panic ()
#10 0xc066d602 in ffs_copyonwrite ()
#11 0xc068742a in ffs_geom_strategy ()
#12 0xc05d8955 in bufwrite ()
#13 0xc0686e64 in ffs_bufwrite ()
#14 0xc067a8a2 in softdep_sync_metadata ()
#15 0xc068c568 in ffs_syncvnode ()
#16 0xc0681425 in softdep_prealloc ()
#17 0xc066592a in ffs_balloc_ufs2 ()
#18 0xc066a252 in ffs_snapblkfree ()
#19 0xc065eb9a in ffs_blkfree ()
#20 0xc0673de0 in freework_freeblock ()
#21 0xc06797c7 in handle_workitem_freeblocks ()
#22 0xc0679aaf in process_worklist_item ()
#23 0xc06821f4 in softdep_process_worklist ()
#24 0xc0682940 in softdep_flush ()
#25 0xc0542a00 in fork_exit ()
#26 0xc06dd760 in fork_trampoline ()
(kgdb) x/s panicstr
0xc07c2b80:  ffs_copyonwrite: recursive call
(kgdb)



--
Vladimir B. Grebenschikov
v...@fbsd.ru




Re: SUJ update

2010-05-01 Thread Jeff Roberson

On Sat, 1 May 2010, Bruce Cran wrote:


On Thu, Apr 29, 2010 at 06:37:00PM -1000, Jeff Roberson wrote:


I fixed a few SUJ bugs.  If those of you who reported one of the
following bugs could re-test I would greatly appreciate it.



I've started seeing a panic "Sleeping thread owns a non-sleepable lock",
though it seems to be occurring both with and without journaling. The
backtrace when journaling is disabled is:


Can you tell me what the lock is?  This may be related to recent vm work 
which went in at the same time.




sched_switch
mi_switch
sleepq_wait
_sleep
bwait
bufwait
bufwrite
ffs_balloc_ufs2
ffs_write
VOP_WRITE_APV
vnode_pager_generic_putpages
VOP_PUTPAGES
vnode_pager_putpages
vm_pageout_flush
vm_object_page_collect_flush
vm_object_page_clean
vfs_msync
sync_fsync
VOP_FSYNC_APV
sync_vnode
sched_sync
fork_exit
fork_trampoline

I've also noticed that since disabling journaling a full fsck seems to
be occurring on boot; background fsck seems to have been disabled.


When you disable journaling it also disables soft-updates.  You need to 
re-enable it.  I could decouple this.  It's hard to say which is the POLA.


Thanks,
Jeff



--
Bruce Cran




SUJ update

2010-04-29 Thread Jeff Roberson

Hello,

I fixed a few SUJ bugs.  If those of you who reported one of the following 
bugs could re-test I would greatly appreciate it.


1)  panic on gnome start via softdep_cancel_link().
2)  Difficulty setting flags on /.  This can only be done from a direct 
boot into single user but there were problems with tunefs that could lead 
to the kernel and disk becoming out of sync with filesystem state.

3)  Kernel compiles without SOFTUPDATES defined in the config now work.

I have had some reports of a hang waiting on journal space with certain 
types of activity.  I have only had this reported twice and I am not able 
to reproduce no matter how much load I throw at the machine.  If you 
reproduce this please try to get a coredump or minidump.


Thanks,
Jeff


Re: HEADS UP: SUJ Going in to head today

2010-04-27 Thread Jeff Roberson

On Sun, 25 Apr 2010, Scott Long wrote:


On Apr 24, 2010, at 8:57 PM, Jeff Roberson wrote:

On Sun, 25 Apr 2010, Alex Keda wrote:


try in single user mode:

tunefs -j enable /
tunefs: Insuffient free space for the journal
tunefs: soft updates journaling can not be enabled

tunefs -j enable /dev/ad0s2a
tunefs: Insuffient free space for the journal
tunefs: soft updates journaling can not be enabled
tunefs: /dev/ad0s2a: failed to write superblock


There is a bug that prevents enabling journaling on a mounted filesystem. So 
for now you can't enable it on /.  I see that you have a large / volume but in 
general I would also suggest people not enable suj on / anyway as it's 
typically not very large.  I only run it on my /usr and /home filesystems.

I will send a mail out when I figure out why tunefs can't enable suj on / while 
it is mounted read-only.



This would preclude enabling journaling on / on an existing system, but I would 
think that you could enable it on / on a system that is being installed, since 
(at least in theory) the target / filesystem won't be the actual root of the 
system, and therefore can be unmounted at will.


That's definitely true.  Some users have had mixed success enabling it on 
/.  It looks like it is a bug either in g_access or ffs's use of g_access 
which does not allow tunefs to write after a downgrade.  I'm not yet sure 
how this is presently working for the softdep flag itself, or if it 
actually is at all.


To clarify my earlier statements:  Journaling only makes sense when the 
fsck time is longer than a few tens of seconds.  So volumes less than a 
gig or two don't really need journaling.  It just costs extra writes and 
fsck time will likely be similar.  In some pathological cases it can even 
be faster to fsck a small volume than it is to run the journal recovery on 
it.


Thanks,
Jeff



Scott




Re: HEADS UP: SUJ Going in to head today

2010-04-27 Thread Jeff Roberson

On Sun, 25 Apr 2010, Bruce Cran wrote:


On Sunday 25 April 2010 19:47:00 Scott Long wrote:

On Apr 24, 2010, at 8:57 PM, Jeff Roberson wrote:

On Sun, 25 Apr 2010, Alex Keda wrote:

try in single user mode:

tunefs -j enable /
tunefs: Insuffient free space for the journal
tunefs: soft updates journaling can not be enabled

tunefs -j enable /dev/ad0s2a
tunefs: Insuffient free space for the journal
tunefs: soft updates journaling can not be enabled
tunefs: /dev/ad0s2a: failed to write superblock


There is a bug that prevents enabling journaling on a mounted filesystem.
So for now you can't enable it on /.  I see that you have a large /
volume but in general I would also suggest people not enable suj on /
anyway as it's typically not very large.  I only run it on my /usr and
/home filesystems.

I will send a mail out when I figure out why tunefs can't enable suj on /
while it is mounted read-only.


This would preclude enabling journaling on / on an existing system, but I
would think that you could enable it on / on a system that is being
installed, since (at least in theory) the target / filesystem won't be the
actual root of the system, and therefore can be unmounted at will.


It worked here - it's shown as enabled after I booted in single-user mode and
enabled it yesterday:


I think some people are enabling after returning to single user from a 
live system rather than booting into single user.  This is a different 
path in the filesystem as booting directly just mounts read-only while the 
other option updates a mount from read/write.  I believe this is the path 
that is broken.


Thanks,
Jeff



core# dumpfs / | grep -i journal
flags   soft-updates+journal

--
Bruce Cran




Re: HEADS UP: SUJ Going in to head today

2010-04-27 Thread Jeff Roberson

On Mon, 26 Apr 2010, pluknet wrote:


On 26 April 2010 17:42, dikshie diks...@gmail.com wrote:

Hi Jeff,
thanks for SUJ.
btw, why is there nan% utilization, and what does it mean?
--
** SU+J Recovering /dev/ad0s1g
** Reading 33554432 byte journal from inode 4.
** Building recovery table.
** Resolving unreferenced inode list.
** Processing journal entries.
** 0 journal records in 0 bytes for nan% utilization 
** Freed 0 inodes (0 dirs) 0 blocks, and 0 frags.
--



That may be due to an empty journal (the only plausible explanation I can
see), so jrecs and jblocks are not updated.


Yes, this is it exactly.  It's a simple bug, I will post a fix in the next 
few days.


Thanks,
Jeff



	/* Next ensure that segments are ordered properly. */
	seg = TAILQ_FIRST(&allsegs);
	if (seg == NULL) {
		if (debug)
			printf("Empty journal\n");
		return;
	}

--
wbr,
pluknet




Re: HEADS UP: SUJ Going in to head today - panic on rename()

2010-04-26 Thread Jeff Roberson

On Mon, 26 Apr 2010, Vladimir Grebenschikov wrote:


Hi

First, many thanks for this effort, it is really very appreciated,

Panic on Gnome starting:


Thank you for the report with stack.  That was very helpful.  I know how 
to fix this bug but it will take me a day or two as my primary test 
machine seems to have died.


For now you will have to tunefs -j disable on that volume.

Thanks,
Jeff



# kgdb -q /usr/obj/usr/src/sys/VBOOK/kernel.debug /var/crash/vmcore.12
...
#0  doadump () at pcpu.h:246
246 pcpu.h: No such file or directory.
in pcpu.h
(kgdb) x/s panicstr
0xc07c2160 buf.13793:remove_from_journal: 0xc581ec40 is not in journal
(kgdb) bt
#0  doadump () at pcpu.h:246
#1  0xc056b883 in boot (howto=260) at /usr/src/sys/kern/kern_shutdown.c:416
#2  0xc056babd in panic (fmt=Variable fmt is not available.
) at /usr/src/sys/kern/kern_shutdown.c:590
#3  0xc0488ba9 in db_fncall (dummy1=1, dummy2=0, dummy3=-1065321792, dummy4=0xd90d572c 
) at /usr/src/sys/ddb/db_command.c:548
#4  0xc0488fa1 in db_command (last_cmdp=0xc07abb1c, cmd_table=0x0, dopager=1) 
at /usr/src/sys/ddb/db_command.c:445
#5  0xc04890fa in db_command_loop () at /usr/src/sys/ddb/db_command.c:498
#6  0xc048af7d in db_trap (type=3, code=0) at /usr/src/sys/ddb/db_main.c:229
#7  0xc0597f54 in kdb_trap (type=3, code=0, tf=0xd90d58c4) at 
/usr/src/sys/kern/subr_kdb.c:535
#8  0xc06f842e in trap (frame=0xd90d58c4) at /usr/src/sys/i386/i386/trap.c:694
#9  0xc06dcf7b in calltrap () at /usr/src/sys/i386/i386/exception.s:165
#10 0xc05980ba in kdb_enter (why=0xc0747a43 panic, msg=0xc0747a43 panic) at 
cpufunc.h:71
#11 0xc056baa1 in panic (fmt=0xc0755fee remove_from_journal: %p is not in 
journal) at /usr/src/sys/kern/kern_shutdown.c:573
#12 0xc0672135 in remove_from_journal (wk=0xc0c3ec2f) at 
/usr/src/sys/ufs/ffs/ffs_softdep.c:2204
#13 0xc067e273 in cancel_jaddref (jaddref=0xc581ec40, inodedep=0xc5c58700, 
wkhd=0xc5c5875c) at /usr/src/sys/ufs/ffs/ffs_softdep.c:3336
#14 0xc067f163 in softdep_revert_link (dp=0xc681f9f8, ip=0xc681f910) at 
/usr/src/sys/ufs/ffs/ffs_softdep.c:3871
#15 0xc0697fd0 in ufs_rename (ap=0xd90d5c1c) at 
/usr/src/sys/ufs/ufs/ufs_vnops.c:1546
#16 0xc070ead6 in VOP_RENAME_APV (vop=0xc0796340, a=0xd90d5c1c) at 
vnode_if.c:1474
#17 0xc05f2902 in kern_renameat (td=0xc586e8c0, oldfd=-100, old=0x4856ca30 
<Address 0x4856ca30 out of bounds>, newfd=-100,
   new=0x4856ca90 <Address 0x4856ca90 out of bounds>, pathseg=UIO_USERSPACE) at 
vnode_if.h:636
#18 0xc05f29b6 in kern_rename (td=0xc586e8c0, from=0x4856ca30 <Address 0x4856ca30 out of 
bounds>, to=0x4856ca90 <Address 0x4856ca90 out of bounds>, pathseg=UIO_USERSPACE)
   at /usr/src/sys/kern/vfs_syscalls.c:3574
#19 0xc05f29e9 in rename (td=0xc586e8c0, uap=0xd90d5cf8) at 
/usr/src/sys/kern/vfs_syscalls.c:3551
#20 0xc06f7c49 in syscall (frame=0xd90d5d38) at 
/usr/src/sys/i386/i386/trap.c:1113
#21 0xc06dcfe0 in Xint0x80_syscall () at /usr/src/sys/i386/i386/exception.s:261
#22 0x0033 in ?? ()
Previous frame inner to this frame (corrupt stack?)
(kgdb)


Just after fsck -y  tunefs -j enable for both / and /usr in
single-user mode and then usual boot

panic is reproducible


--
Vladimir B. Grebenschikov
v...@fbsd.ru


___
freebsd-current@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to freebsd-current-unsubscr...@freebsd.org


Re: HEADS UP: SUJ Going in to head today

2010-04-26 Thread Jeff Roberson

On Sun, 25 Apr 2010, Lucius Windschuh wrote:


Hi Jeff,
thank you for your effort in implementing the soft update journaling.
I tried to test SUJ on a provider with 4 kB block size. My system runs
9-CURRENT r207195 (i386).
Unfortunately, tunefs is unable to cope with the device. It can easily
be reproduced with these steps:

# mdconfig -s 128M -S 4096
0
#  newfs -U /dev/md0


Thanks for the repro.  This is an interesting case.  I'll have to slightly 
rewrite the directory handling code in tunefs but it should not take long.


Thanks,
Jeff


/dev/md0: 128.0MB (262144 sectors) block size 16384, fragment size 4096
   using 4 cylinder groups of 32.02MB, 2049 blks, 2112 inodes.
   with soft updates
# tunefs -j enable /dev/md0
Using inode 4 in cg 0 for 4194304 byte journal
tunefs: Failed to read dir block: Invalid argument
tunefs: soft updates journaling can not be enabled

The bread() in tunefs.c:701 fails as the requested block size (512) is
smaller than the provider's block size (4096 bytes).

As a simple attempt to fix it, I changed tunefs.c:760 to if
(dir_extend(blk, nblk, size, ino) == -1), as I thought that this made
more sense. Then, tunefs succeeded, but mounting the file system
resulted in a panic:
panic: ufs_dirbad: /mnt/md-test: bad dir ino 2 at offset 512: mangled entry

db:0:kdb.enter.default  bt
Tracing pid 2714 tid 100262 td 0xc7ea6480
kdb_enter(c0a21226,c0a21226,c0a49886,eb1e6714,0,...) at kdb_enter+0x3a
panic(c0a49886,c688f468,2,200,c0a498df,...) at panic+0x136
ufs_dirbad(c81bb000,200,c0a498df,0,eb1e67b0,...) at ufs_dirbad+0x46
ufs_lookup_ino(c81d5990,0,eb1e67d8,eb1e6800,0,...) at ufs_lookup_ino+0x367
softdep_journal_lookup(c688f288,eb1e68c4,c0a45eca,750,eb1e6834,...) at
softdep_journal_lookup+0xb0
softdep_mount(c7e3fbb0,c688f288,c8165000,c7bdf900,c7bdf900,...) at
softdep_mount+0xdb
ffs_mount(c688f288,0,c0a2df89,3d6,0,...) at ffs_mount+0x23e1
vfs_donmount(c7ea6480,0,c7bc6100,c7bc6100,c8031000,...) at vfs_donmount+0x1000
nmount(c7ea6480,eb1e6cf8,c,c,207,...) at nmount+0x64
syscall(eb1e6d38) at syscall+0x1da
Xint0x80_syscall() at Xint0x80_syscall+0x20
--- syscall (378, FreeBSD ELF32, nmount), eip = 0x280f205b, esp =
0xbfbfdcec, ebp = 0xbfbfe248 ---

... so this attempt did not succeed, but was worth a try ;-)

But it would be nice to use SUJ even on such an unusual configuration.

Lucius


___
freebsd-current@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to freebsd-current-unsubscr...@freebsd.org


Re: HEADS UP: SUJ Going in to head today

2010-04-26 Thread Jeff Roberson

On Sun, 25 Apr 2010, Gary Jennejohn wrote:


On Sat, 24 Apr 2010 16:57:59 -1000 (HST)
Jeff Roberson jrober...@jroberson.net wrote:


On Sun, 25 Apr 2010, Alex Keda wrote:


try in single user mode:

tunefs -j enable /
tunefs: Insuffient free space for the journal
tunefs: soft updates journaling can not be enabled

tunefs -j enable /dev/ad0s2a
tunefs: Insuffient free space for the journal
tunefs: soft updates journaling can not be enabled
tunefs: /dev/ad0s2a: failed to write superblock


There is a bug that prevents enabling journaling on a mounted filesystem.
So for now you can't enable it on /.  I see that you have a large / volume
but in general I would also suggest people not enable suj on / anyway as
it's typically not very large.  I only run it on my /usr and /home
filesystems.

I will send a mail out when I figure out why tunefs can't enable suj on /
while it is mounted read-only.



Jeff -
One thing which surprised me was that I couldn't reuse the existing
.sujournal files on my disks.  I did notice that there are now more
flags set on them.  Was that the reason?  Or were you just being
careful?


There were a few iterations of the code to create and discover the actual 
journal inode.  I may have introduced an incompatibility when making fsck 
more careful about what it treats as a journal.  If it were to attempt to 
apply changes from a garbage file it could corrupt your filesystem.


Thanks,
Jeff



--
Gary Jennejohn


___
freebsd-current@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to freebsd-current-unsubscr...@freebsd.org


Re: HEADS UP: SUJ Going in to head today

2010-04-24 Thread Jeff Roberson

On Sun, 25 Apr 2010, Alex Keda wrote:


try in single user mode:

tunefs -j enable /
tunefs: Insuffient free space for the journal
tunefs: soft updates journaling can not be enabled

tunefs -j enable /dev/ad0s2a
tunefs: Insuffient free space for the journal
tunefs: soft updates journaling can not be enabled
tunefs: /dev/ad0s2a: failed to write superblock


There is a bug that prevents enabling journaling on a mounted filesystem. 
So for now you can't enable it on /.  I see that you have a large / volume 
but in general I would also suggest people not enable suj on / anyway as 
it's typically not very large.  I only run it on my /usr and /home 
filesystems.


I will send a mail out when I figure out why tunefs can't enable suj on / 
while it is mounted read-only.


Thanks,
Jeff



on / (/dev/ad0s2a) ~40Gb free.
dc7700p$ uname -a
FreeBSD dc7700p.lissyara.su 9.0-CURRENT FreeBSD 9.0-CURRENT #0 r207156: Sun 
Apr 25 00:04:24 MSD 2010 
lissy...@dc7700p.lissyara.su:/usr/obj/usr/src/sys/GENERIC  amd64

dc7700p$
___
freebsd-current@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to freebsd-current-unsubscr...@freebsd.org




Re: HEADS UP: SUJ Going in to head today

2010-04-23 Thread Jeff Roberson

On Wed, 21 Apr 2010, Garrett Cooper wrote:


On Wed, Apr 21, 2010 at 12:39 AM, Gary Jennejohn
gary.jennej...@freenet.de wrote:

On Tue, 20 Apr 2010 12:15:48 -1000 (HST)
Jeff Roberson jrober...@jroberson.net wrote:


Hi Folks,

You may have seen my other Soft-updates journaling (SUJ) announcements.
If not, it is a journaling system that works cooperatively with
soft-updates to eliminate the full background filesystem check after an
unclean shutdown.  SUJ may be enabled with tunefs -j enable and disabled
with tunefs -j disable on an unmounted filesystem.  It is backwards
compatible with soft-updates with no journal.

I'm going to do another round of tests and buildworld this afternoon to
verify the diff and then I'm committing to head.  This is a very large
feature and fundamentally changes softupdates.  Although it has been
extensively tested by many there may be unforeseen problems.  If you run
into an issue that you think may be suj please email me directly as well
as posting on current as I sometimes miss list email and this will ensure
the quickest response.



And the crowd goes wild.

SUJ is _great_ and I'm glad to see it finally making it into the tree.


   Indeed. I'm looking forward to testing the junk out of this --
this is definitely a good move forward with UFS2 :]...
Cheers,
-Garrett

PS How does this interact with geom with journaling BTW? Has this been
tested performance wise (I know it doesn't make logistical sense, but
it does kind of seem to null and void the importance of geom with
journaling, maybe...)?



A quick update: I found a bug with snapshots that held up the commit. 
Hopefully I will be done with it tonight.


About gjournal: there would be no reason to use the two together.  There 
may be cases where each is faster; in fact it is very likely.  pjd has 
said he thinks SUJ will simply replace gjournal.  GEOM itself is no less 
important with SUJ in place, as it of course fills many roles.


Performance testing has been done.  There is no regression in softdep 
performance with journaling disabled.  With journaling enabled there are 
some cases that are slightly slower.  It adds an extra ordered write so 
any time you modify the filesystem metadata and then require it to be 
synchronously written to disk you may wait for an extra transaction.


There are ways to further improve the performance.  In fact I did some 
experiments that showed dbench performance nearly identical to vanilla 
softdep if I can resolve one wait situation.  Although this is not trivial 
it is possible.  The CPU overhead ended up being surprisingly trivial in 
the cases I tested.  Really the extra overhead is only when doing sync 
writes that allocate new blocks.


I am eager to see wider coverage and hear feedback from more people.  I 
suspect for all desktop and nearly all server use it will simply be 
transparent.


Thanks,
Jeff
___
freebsd-current@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to freebsd-current-unsubscr...@freebsd.org

Re: HEADS UP: SUJ Going in to head today

2010-04-21 Thread Jeff Roberson

On Tue, 20 Apr 2010, Patrick Tracanelli wrote:


Jeff Roberson escreveu:

Hi Folks,

You may have seen my other Soft-updates journaling (SUJ) announcements.
If not, it is a journaling system that works cooperatively with
soft-updates to eliminate the full background filesystem check after an
unclean shutdown.  SUJ may be enabled with tunefs -j enable and disabled
with tunefs -j disable on an unmounted filesystem.  It is backwards
compatible with soft-updates with no journal.

I'm going to do another round of tests and buildworld this afternoon to
verify the diff and then I'm committing to head.  This is a very large
feature and fundamentally changes softupdates.  Although it has been
extensively tested by many there may be unforeseen problems.  If you run
into an issue that you think may be suj please email me directly as well
as posting on current as I sometimes miss list email and this will
ensure the quickest response.


Hello Jeff, McKusick, and others involved.

Is an MFC technically possible? If so, are there plans to do so?


I do have an 8 backport branch available, although it is a little stale.  I 
intend to keep it somewhat up to date.  I think it will take some time to 
gain sufficient experience with SUJ in head before we want to merge it 
back to 8.  It is quite a complex and disruptive feature.


Thanks,
Jeff



Thank you.

--
Patrick Tracanelli


___
freebsd-current@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to freebsd-current-unsubscr...@freebsd.org


HEADS UP: SUJ Going in to head today

2010-04-20 Thread Jeff Roberson

Hi Folks,

You may have seen my other Soft-updates journaling (SUJ) announcements. 
If not, it is a journaling system that works cooperatively with 
soft-updates to eliminate the full background filesystem check after an 
unclean shutdown.  SUJ may be enabled with tunefs -j enable and disabled 
with tunefs -j disable on an unmounted filesystem.  It is backwards 
compatible with soft-updates with no journal.


I'm going to do another round of tests and buildworld this afternoon to 
verify the diff and then I'm committing to head.  This is a very large 
feature and fundamentally changes softupdates.  Although it has been 
extensively tested by many there may be unforeseen problems.  If you run 
into an issue that you think may be suj please email me directly as well 
as posting on current as I sometimes miss list email and this will ensure 
the quickest response.


Thanks,
Jeff
___
freebsd-current@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to freebsd-current-unsubscr...@freebsd.org


Re: amd64/SMP(/ata-raid ?) not happy...

2003-11-30 Thread Jeff Roberson
On Sun, 30 Nov 2003, Poul-Henning Kamp wrote:


 Timecounters tick every 10.000 msec
 GEOM: create disk ad0 dp=0xff00eebfaca0
 ad0: 35772MB IBM-DPTA-353750 [72680/16/63] at ata0-master UDMA66
 GEOM: create disk ad4 dp=0xff00eebfa4a0
 ad4: 35304MB WDC WD360GD-00FNA0 [71730/16/63] at ata2-master UDMA133
 GEOM: create disk ad6 dp=0xff00eebfa0a0
 ad6: 35304MB WDC WD360GD-00FNA0 [71730/16/63] at ata3-master UDMA133
 GEOM: create disk ad8 dp=0xff00014c4ea0
 ad8: 35304MB WDC WD360GD-00FNA0 [71730/16/63] at ata4-master UDMA133
 GEOM: create disk ar0 dp=0xff00f04a3270
 ar0: 105913MB ATA RAID0 array [13502/255/63] status: READY subdisks:
  disk0 READY on ad4 at ata2-master
  disk1 READY on ad6 at ata3-master
  disk2 READY on ad8 at ata4-master
 SMP: AP CPU #1 Launched!
 panic: mtx_lock() of spin mutex (null) @ ../../../vm/uma_core.c:1716

I mailed re about this.  There has been some disagreement over how
mp_maxid is implemented on all architectures.  Until this gets resolved
and stamped as approved by re, please add mp_maxid++; at line 187 of
amd64/amd64/mp_machdep.c.

Thanks,
Jeff


 cpuid = 1;
 Stack backtrace:
 backtrace() at backtrace+0x17
 panic() at panic+0x1d2
 _mtx_lock_flags() at _mtx_lock_flags+0x4f
 uma_zfree_arg() at uma_zfree_arg+0x7e
 g_destroy_bio() at g_destroy_bio+0x1b
 g_disk_done() at g_disk_done+0x85
 biodone() at biodone+0x66
 ad_done() at ad_done+0x31
 ata_completed() at ata_completed+0x237
 taskqueue_run() at taskqueue_run+0x88
 taskqueue_swi_run() at taskqueue_swi_run+0x10
 ithread_loop() at ithread_loop+0x189
 fork_exit() at fork_exit+0xbd
 fork_trampoline() at fork_trampoline+0xe
 --- trap 0, rip = 0, rsp = 0xad5b0d30, rbp = 0 ---
 Debugger(panic)
 Stopped at  Debugger+0x4c:  xchgl   %ebx,0x2caefe
 db

 --
 Poul-Henning Kamp   | UNIX since Zilog Zeus 3.20
 [EMAIL PROTECTED] | TCP/IP since RFC 956
 FreeBSD committer   | BSD since 4.3-tahoe
 Never attribute to malice what can adequately be explained by incompetence.
 ___
 [EMAIL PROTECTED] mailing list
 http://lists.freebsd.org/mailman/listinfo/freebsd-current
 To unsubscribe, send any mail to [EMAIL PROTECTED]


___
[EMAIL PROTECTED] mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to [EMAIL PROTECTED]


Re: user:sys time ratio

2003-11-30 Thread Jeff Roberson
On Sun, 30 Nov 2003, Colin Percival wrote:

Robert Watson suggested that I compare performance from UP and SMP kernels:

 # /usr/bin/time -hl sh -c 'make -s buildworld 2>&1' > /dev/null
              Real       User       Sys
 UP kernel    38m33.29s  27m10.09s  10m59.15s
    (retest)  38m33.18s  27m04.40s  11m05.73s
 SMP w/o HTT  41m01.54s  27m10.27s  13m29.82s
    (retest)  39m47.50s  27m08.05s  12m12.20s
 SMP w/HTT    42m17.16s  28m12.82s  14m04.93s
    (retest)  44m09.61s  28m15.31s  15m44.86s

That enabling HTT degrades performance is not surprising, since I'm not
 passing the -j option to make; but a 5% performance delta between UP and
 SMP kernels is rather surprising (to me, at least), and the fact that the
 system time varies so much on the SMP kernel also seems peculiar.

So you have enabled SMP on a system with one physical core and two logical
cores?  Looks like almost a 20% slowdown in system time with the SMP
kernel.  It's too bad it's enabled by default now.  I suspect that some of
this is due to using the lock prefix on P4 cores.  It makes the cost of a
mutex over 300 cycles vs 50.  It might be interesting to do an experiment
without HTT, but with SMP enabled and the lock prefix commented out.

I have a set of changes for ULE that should fix some of the HTT slowdown,
although it is inevitable that there will always be some.  If you would
like to try the patch, it's available at:

http://www.chesapeake.net/~jroberson/ulehtt.diff

Cheers,
Jeff

Is this normal?

 Colin Percival

 ___
 [EMAIL PROTECTED] mailing list
 http://lists.freebsd.org/mailman/listinfo/freebsd-current
 To unsubscribe, send any mail to [EMAIL PROTECTED]


___
[EMAIL PROTECTED] mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to [EMAIL PROTECTED]


Re: kernel compile fails in uma_core.c

2003-11-30 Thread Jeff Roberson

On Sun, 30 Nov 2003, Paulius Bulotas wrote:

 Hello,

 when building kernel:
 ../../../vm/uma_core.c: In function `zone_timeout':
 ../../../vm/uma_core.c:345: error: `mp_maxid' undeclared (first use in
 this function)
 and so on.

 Anything I missed?

I just fixed this, sorry.


 Paulius
 ___
 [EMAIL PROTECTED] mailing list
 http://lists.freebsd.org/mailman/listinfo/freebsd-current
 To unsubscribe, send any mail to [EMAIL PROTECTED]


___
[EMAIL PROTECTED] mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to [EMAIL PROTECTED]


Re: Frequent lockups with 5.2-BETA

2003-11-27 Thread Jeff Roberson


On 27 Nov 2003, Christian Laursen wrote:

 Since upgrading from 5.1-RELEASE to 5.2-BETA, I've been
 experiencing hard lockups once or twice every day.

 I managed to get a trace by enabling the watchdog, which
 put me into the debugger. This is the trace:

 db trace
 Debugger(c06f85ec,1bc169d,0,c0754fc4,c0754bf8) at Debugger+0x54
 watchdog_fire(d73e2bcc,c067b433,c0c57100,0,d73e2bcc) at watchdog_fire+0xc1
 hardclock(d73e2bcc,0,0,d73e2b98,c43bd400) at hardclock+0x15a
 clkintr(d73e2bcc,d73e2b9c,c06c6bf6,0,c1925000) at clkintr+0xe9
 intr_execute_handlers(c0756be0,d73e2bcc,c0c570a0,1000,c0c61300) at intr_execute8
 atpic_handle_intr(0) at atpic_handle_intr+0xef
 Xatpic_intr0() at Xatpic_intr0+0x1e
 --- interrupt, eip = 0xc068db74, esp = 0xd73e2c10, ebp = 0xd73e2c14 ---
 uma_zone_slab(c0c61300,1,0,c068e516,c074edd8) at uma_zone_slab+0x4
 uma_zalloc_internal(c0c61300,0,1,0,d73e2c80) at uma_zalloc_internal+0x5c
 bucket_alloc(2f,1,0,0,0) at bucket_alloc+0x65
 uma_zfree_arg(c0c48240,c49121b8,0,c1922580,3600) at uma_zfree_arg+0x2c6
 tcp_hc_purge(0,c1922580,161e9,142b64fd,c05d7c70) at tcp_hc_purge+0x11f

Great debugging work.  I'm glad to see the software watchdog put to use.
This looks like a problem with the hostcache.  Perhaps andre can look at
it.

Thanks,
Jeff

 softclock(0,0,0,0,c192b54c) at softclock+0x25e
 ithread_loop(c1922580,d73e2d48,0,11,55ff44fd) at ithread_loop+0x1d8
 fork_exit(c0524ec0,c1922580,d73e2d48) at fork_exit+0x80
 fork_trampoline() at fork_trampoline+0x8
 --- trap 0x1, eip = 0, esp = 0xd73e2d7c, ebp = 0 ---

 This is the dmesg from the last boot:

 Copyright (c) 1992-2003 The FreeBSD Project.
 Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
   The Regents of the University of California. All rights reserved.
 FreeBSD 5.2-BETA #7: Wed Nov 26 17:24:32 CET 2003
 [EMAIL PROTECTED]:/usr/obj/usr/src/sys/BORG
 Preloaded elf kernel /boot/kernel/kernel at 0xc0823000.
 Timecounter i8254 frequency 1193182 Hz quality 0
 CPU: Intel(R) Celeron(R) CPU 1.70GHz (1716.04-MHz 686-class CPU)
  Origin = "GenuineIntel"  Id = 0xf13  Stepping = 3
  Features=0x3febfbff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CLFLUSH,DTS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM>
 real memory  = 536805376 (511 MB)
 avail memory = 511885312 (488 MB)
 Pentium Pro MTRR support enabled
 acpi0: AMIINT INTEL845 on motherboard
 pcibios: BIOS version 2.10
 Using $PIR table, 15 entries at 0xc00f7810
 acpi0: Power Button (fixed)
 Timecounter ACPI-fast frequency 3579545 Hz quality 1000
 acpi_timer0: 24-bit timer at 3.579545MHz port 0x808-0x80b on acpi0
 acpi_cpu0: CPU port 0x530-0x537 on acpi0
 acpi_cpu1: CPU port 0x530-0x537 on acpi0
 device_probe_and_attach: acpi_cpu1 attach returned 6
 acpi_button0: Power Button on acpi0
 pcib0: ACPI Host-PCI bridge port 0xcf8-0xcff on acpi0
 pci0: ACPI PCI bus on pcib0
 agp0: Intel Generic host to PCI bridge mem 0xe000-0xe3ff at device 0.0 on 
 pci0
 pcib1: PCI-PCI bridge at device 1.0 on pci0
 pci1: PCI bus on pcib1
 pcib0: slot 1 INTA is routed to irq 5
 pcib1: slot 0 INTA is routed to irq 5
 pci1: display, VGA at device 0.0 (no driver attached)
 pcib2: ACPI PCI-PCI bridge at device 30.0 on pci0
 pci2: ACPI PCI bus on pcib2
 pcib2: slot 10 INTA is routed to irq 10
 pcib2: slot 13 INTA is routed to irq 3
 pcm0: CMedia CMI8738 port 0xdc00-0xdcff irq 10 at device 10.0 on pci2
 em0: Intel(R) PRO/1000 Network Connection, Version - 1.7.19 port 0xd800-0xd83f mem 
 0xdfec-0xdfed,0xdfee-0xdfef irq 3 at device 13.0 on pci2
 em0:  Speed:N/A  Duplex:N/A
 isab0: PCI-ISA bridge at device 31.0 on pci0
 isa0: ISA bus on isab0
 atapci0: Intel ICH4 UDMA100 controller port 0xfc00-0xfc0f,0-0x3,0-0x7,0-0x3,0-0x7 
 at device 31.1 on pci0
 ata0: at 0x1f0 irq 14 on atapci0
 ata0: [MPSAFE]
 ata1: at 0x170 irq 15 on atapci0
 ata1: [MPSAFE]
 pci0: serial bus, SMBus at device 31.3 (no driver attached)
 atkbdc0: Keyboard controller (i8042) port 0x64,0x60 irq 1 on acpi0
 atkbd0: AT Keyboard flags 0x1 irq 1 on atkbdc0
 kbd0 at atkbd0
 psm0: PS/2 Mouse irq 12 on atkbdc0
 psm0: model MouseMan+, device ID 0
 fdc0: cmd 3 failed at out byte 1 of 3
 sio0 port 0x3f8-0x3ff irq 4 on acpi0
 sio0: type 16550A, console
 ppc0 port 0x778-0x77b,0x378-0x37f irq 7 drq 3 on acpi0
 ppc0: SMC-like chipset (ECP/EPP/PS2/NIBBLE) in COMPATIBLE mode
 ppc0: FIFO with 16/16/9 bytes threshold
 ppbus0: Parallel port bus on ppc0
 plip0: PLIP network interface on ppbus0
 lpt0: Printer on ppbus0
 lpt0: Interrupt-driven port
 ppi0: Parallel I/O on ppbus0
 acpi_cpu1: CPU port 0x530-0x537 on acpi0
 device_probe_and_attach: acpi_cpu1 attach returned 6
 fdc0: cmd 3 failed at out byte 1 of 3
 npx0: [FAST]
 npx0: math processor on motherboard
 npx0: INT 16 interface
 orm0: Option ROM at iomem 0xe-0xe0fff on isa0
 pmtimer0 on isa0
 fdc0: cannot reserve I/O port range (6 ports)
 sc0: System console at flags 0x100 on isa0
 sc0: VGA 16 virtual consoles, flags=0x100
 sio1: configured irq 3 not in 

Re: zone(9) is broken on SMP?!

2003-11-26 Thread Jeff Roberson
On Thu, 27 Nov 2003, Florian C. Smeets wrote:

 Max Laier wrote:
  If I build the attached kmod and kldload/kldunload it on a GENERIC kernel w/
  SMP & APIC it'll error out:
  Zone was not empty (xx items).  Lost X pages of memory.
 
  This is on a p4 HTT, but seems reproducible on proper SMP systems as
  well. UP systems don't show it however.
 
  Can somebody please try and report? Thanks!

 Yes this is reproducible on a real SMP system:

 bender kernel: Zone "UMA test zone" was not empty (65 items).  Lost 1
 pages of memory.

I'll look into this over the weekend thanks.

Cheers,
Jeff


 Regards,
 flo
 ___
 [EMAIL PROTECTED] mailing list
 http://lists.freebsd.org/mailman/listinfo/freebsd-current
 To unsubscribe, send any mail to [EMAIL PROTECTED]


___
[EMAIL PROTECTED] mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to [EMAIL PROTECTED]


Re: LOR (swap_pager.c:1323, swap_pager.c:1838, uma_core.c:876) (current:Nov17)

2003-11-18 Thread Jeff Roberson

On Tue, 18 Nov 2003, Cosmin Stroe wrote:

 Here is the stack backtrace:


Thanks, this is known and is actually safe.  We're pursuing ways to quiet
these warnings.


 lock order reversal
  1st 0xc1da318c vm object (vm object) @ /usr/src/sys/vm/swap_pager.c:1323
  2nd 0xc0724900 swap_pager swhash (swap_pager swhash) @ 
 /usr/src/sys/vm/swap_pager.c:1838
  3rd 0xc0c358c4 vm object (vm object) @ /usr/src/sys/vm/uma_core.c:876
 Stack backtrace:
 backtrace(c0692be9,c0c358c4,c06a376c,c06a376c,c06a464d) at backtrace+0x17
 witness_lock(c0c358c4,8,c06a464d,36c,1) at witness_lock+0x672
 _mtx_lock_flags(c0c358c4,0,c06a464d,36c,1) at _mtx_lock_flags+0xba
 obj_alloc(c0c22480,1000,c976f9db,101,c06f3f50) at obj_alloc+0x3f
 slab_zalloc(c0c22480,1,c06a464d,68c,c0c22494) at slab_zalloc+0xb3
 uma_zone_slab(c0c22480,1,c06a464d,68c,c0c22520) at uma_zone_slab+0xd6
 uma_zalloc_internal(c0c22480,0,1,5c1,72e,c06f55a8) at uma_zalloc_internal+0x3e
 uma_zalloc_arg(c0c22480,0,1,72e,2) at uma_zalloc_arg+0x3ab
 swp_pager_meta_build(c1da318c,7,0,2,0) at swp_pager_meta_build+0x174
 swap_pager_putpages(c1da318c,c976fbb8,8,0,c976fb20) at swap_pager_putpages+0x32d
 default_pager_putpages(c1da318c,c976fbb8,8,0,c976fb20) at default_pager_putpages+0x2e
 vm_pageout_flush(c976fbb8,8,0,0,c06f36a0) at vm_pageout_flush+0x17a
 vm_pageout_clean(c0dae2d8,0,c06a4468,32a,0) at vm_pageout_clean+0x305
 vm_pageout_scan(0,0,c06a4468,5a9,1f4) at vm_pageout_scan+0x65f
 vm_pageout(0,c976fd48,c068d4ed,311,0) at vm_pageout+0x31b
 fork_exit(c0625250,0,c976fd48) at fork_exit+0xb4
 fork_trampoline() at fork_trampoline+0x8
 --- trap 0x1, eip = 0, esp = 0xc976fd7c, ebp = 0 ---
 Debugger(witness_lock)
 Stopped at  Debugger+0x54:  xchgl   %ebx,in_Debugger.0
 db

 I'm running the sources from yesterday, nov 17:

 FreeBSD 5.1-CURRENT #0: Mon Nov 17 06:40:05 CST 2003 
 root@:/usr/obj/usr/src/sys/GALAXY

 ___
 [EMAIL PROTECTED] mailing list
 http://lists.freebsd.org/mailman/listinfo/freebsd-current
 To unsubscribe, send any mail to [EMAIL PROTECTED]


___
[EMAIL PROTECTED] mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to [EMAIL PROTECTED]


Re: ULE and very bad responsiveness

2003-11-14 Thread Jeff Roberson

On Fri, 14 Nov 2003, Jonathan Fosburgh wrote:

 -BEGIN PGP SIGNED MESSAGE-
 Hash: SHA1

 On Thursday 13 November 2003 06:01 pm, Harald Schmalzbauer wrote:

  I also could play quake(2) and have something compiling in the background
  but I see every new object file in the form of a picture freeze. Also every
  other disk access seems to block the whole machine for a moment.
  I'll try again if somebody has an idea what's wrong. Then I can try running
  seti with nice 20 but that's not really a solution. It's working perfectly
  with nice 15 and the old scheduler.
 

 I see something similar, as a file is generated during a compile a get a
 momentary hang in the mouse, but it is not every compile.  I think I see it
 mostly when running some invocation of make -j, but I've not been able to
 lock down a particular set of circumstances where I do see it.  My
 sched_ule.c is at 1.80.  I have a UP system.  This behaviour, intermittent
 though it is, persists across a normal UP kernel, and also one with SMP+APIC
 (I was *supposed* to have two CPUs, but that is another issue ...) enabled. I
 have a PS/2 mouse and use moused.  I'm running KDE3.1.4.

This does not happen with SCHED_4BSD?  How fast is your system?  Can you
give me an example including what applications you're running and what
you're compiling?

 - --
 Jonathan Fosburgh
 AIX and Storage Administrator
 UT MD Anderson Cancer Center
 Houston, TX
 -BEGIN PGP SIGNATURE-
 Version: GnuPG v1.2.3 (FreeBSD)

 iD8DBQE/tNYwqUvQmqp7omYRAnzjAKCx8by6w77iT5G+7NiBOC8lVkxJ3QCcDgWP
 J9I+Sgx4yuzqOOQ+Gu9Ge3s=
 =GEi2
 -END PGP SIGNATURE-

 ___
 [EMAIL PROTECTED] mailing list
 http://lists.freebsd.org/mailman/listinfo/freebsd-current
 To unsubscribe, send any mail to [EMAIL PROTECTED]


___
[EMAIL PROTECTED] mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to [EMAIL PROTECTED]


Re: ULE and very bad responsiveness

2003-11-14 Thread Jeff Roberson
On Fri, 14 Nov 2003, Jonathan Fosburgh wrote:

 -BEGIN PGP SIGNED MESSAGE-
 Hash: SHA1

 On Friday 14 November 2003 01:52 pm, Jeff Roberson wrote:

 
  This does not happen with SCHED_4BSD?  How fast is your system?  Can you
  give me an example including what applications you're running and what
  you're compiling?

 I haven't tried SCHED_4BSD lately.  It will probably be next week before I
 have a chance.  Basically this is while running things such as konqueror,
 kmail, konsole, sometimes Mozilla or Firebird, usually wine for Lotus Notes.
 I think I see it more often on building the world, and again mostly with -j,
 even set at 4 or 5.  This is a 600MHz machine with ~380MB RAM on an ATA drive at
 UDMA-66.

I suspect that you are experiencing some paging activity.  Does top show
that any of your swap is in use?  You probably don't have enough memory to
fit a parallelized buildworld, all the files that it touches, Mozilla
(60MB on my machine), X Windows (another 60MB on my machine), and your
window manager, which if you're using KDE, is probably at least another
60MB.


 - --
 Jonathan Fosburgh
 AIX and Storage Administrator
 UT MD Anderson Cancer Center
 Houston, TX
 -BEGIN PGP SIGNATURE-
 Version: GnuPG v1.2.3 (FreeBSD)

 iD8DBQE/tUF5qUvQmqp7omYRAsaSAJ0Y8fZBrNEQ8UcTtf1XfVUHnE3lPwCfcup4
 k4bw4D68b7Lrdf0ygWJ4zrE=
 =ZXZ4
 -END PGP SIGNATURE-

 ___
 [EMAIL PROTECTED] mailing list
 http://lists.freebsd.org/mailman/listinfo/freebsd-current
 To unsubscribe, send any mail to [EMAIL PROTECTED]


___
[EMAIL PROTECTED] mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to [EMAIL PROTECTED]


Re: ULE and very bad responsiveness

2003-11-13 Thread Jeff Roberson

On Thu, 13 Nov 2003, Harald Schmalzbauer wrote:

 On Thursday 13 November 2003 07:17, Harald Schmalzbauer wrote:
  Hi,
 
  from comp.unix.bsd.freebsd.misc:
 
  Kris Kennaway wrote:
   On 2003-11-13, Harald Schmalzbauer [EMAIL PROTECTED] wrote:
   Well, I don't have any measurements but in my case it's not necessary
   at all. I built a UP kernel with ULE like Kris advised me.
  
   Are you running an up-to-date 5.1-CURRENT?  ULE was broken with these
   characteristics until very recently.  If you're up-to-date and still
   see these problems, you need to post to the current mailing list.
  
   Kris
 
  Yes, I am running current as of 13. Nov.
 
  Find attached my first problem description.

 This time I also attached my dmesg and kernel conf

Try running seti with nice +20 rather than 15.  Do you experience bad
interactivity without seti running?

Thanks,
Jeff


 
  Thanks,
 
  -Harry


___
[EMAIL PROTECTED] mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to [EMAIL PROTECTED]


Re: SYSENTER in FreeBSD

2003-11-05 Thread Jeff Roberson
On Wed, 5 Nov 2003, David Xu wrote:

 Jun Su wrote:

 I noticed that Jeff Roberson implemented this already.  When will it be committed?
 http://kerneltrap.org/node/view/1531
 
 I googled this because I found this feature listed among the kernel
 improvements in Windows XP. :-)
 
 Thanks,
 Jun Su
 
 
 
 I nearly completed this experiment about 10 months ago.
 http://people.freebsd.org/~davidxu/fastsyscall/
 The patch is out of date and still not complete.
 It can give you some performance improvement, but I think too many
 things need to be changed, and it really makes the user return code
 very dirty.  Some syscalls, for example pipe(), can not use this fast
 syscall path, because pipe() uses two registers to return its file
 descriptors; the performance gain is immediately lost when the
 assembly code becomes more complex.  I don't think this hack is worth
 doing on IA32.  I heard AMD has a different way to support fast
 syscalls, which may already be in the FreeBSD AMD64 branch.

This works with every syscall.  I have a patch in perforce that doesn't
require any changes to userret().  The performance gain is not so
substantial for most things but I feel that it is worth it.  Mini is
probably going to finish this up over the next week or so.

Cheers,
Jeff



 David Xu






Re: More ULE bugs fixed.

2003-11-04 Thread Jeff Roberson
On Tue, 4 Nov 2003, Sheldon Hearn wrote:

 On (2003/11/04 09:29), Eirik Oeverby wrote:

  The problem is two parts: The mouse tends to 'lock up' for brief moments
  when the system is under load, in particular during heavy UI operations
  or when doing compile jobs and such.
  The second part of the problem is related, and is manifested by the
  mouse actually making movements I never asked it to make.

 Wow, I just assumed it was a local problem.  I'm also seeing unrequested
 mouse movement, as if the signals from movements are repeated or
 amplified.

 The thing is, I'm using 4BSD, not ULE, so I wouldn't trouble Jeff to
 look for a cause for that specific problem in ULE.

How long have you been seeing this?  Are you using a usb mouse?  Can you
try with PS/2 if you are?

Thanks,
Jeff


 Ciao,
 Sheldon.




Re: Was: More ULE bugs fixed. Is: Mouse problem?

2003-11-04 Thread Jeff Roberson
On Wed, 5 Nov 2003, Eirik Oeverby wrote:

 Alex Wilkinson wrote:
  On Wed, Nov 05, 2003 at 12:27:04AM +0100, Eirik Oeverby wrote:
 
  Just for those interested:
  I do *not* get any messages at all from the kernel (or elsewhere) when
  my mouse goes haywire. And it's an absolute truth (just tested back and
  forth 8 times) that it *only* happens with SCHED_ULE and *only* with old
  versions (~1.50) and the very latest ones (1.75 as I'm currently
  running). 1.69 for instance did *not* show any such problems.
 
  I will, however, update my kernel again now, to get the latest
  sched_ule.c (if any changes have been made since 1.75) and to test with
  the new interrupt handler. I have a suspicion it might be a combination
  of SCHED_ULE and some signal/message/interrupt handling causing messages
  to get lost along the way. Because that's exactly how it feels...
 
  Question: How can I find out what verion of SCHED_ULE I am running ?

 I asked the same recently, and here's what I know:
   - check /usr/src/sys/kern/sched_ule.c - a page or so down there's a
 line with the revision
   - ident /boot/kernel/kernel | grep sched_ule

Ident also works on source files.
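Both suggestions work because CVS expands the $FreeBSD$ keyword into a path-plus-revision string at checkout, and ident(1) (or a grep) fishes that string back out of any file, source or compiled kernel alike. A self-contained illustration on a stand-in file (the 1.75 revision string below is fabricated for the demo):

```shell
# Fake a checked-out source file carrying an expanded RCS keyword.
cat > /tmp/sched_demo.c <<'EOF'
/* $FreeBSD: src/sys/kern/sched_ule.c,v 1.75 2003/11/03 12:00:00 jeff Exp $ */
EOF

# ident(1) scans for "$Keyword: ... $" strings; where it isn't
# installed, a grep recovers the same revision information.
grep -o 'sched_ule\.c,v [0-9.]*' /tmp/sched_demo.c
# -> sched_ule.c,v 1.75
```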

Cheers,
Jeff


 /Eirik

 
   - aW






Re: Was: More ULE bugs fixed. Is: Mouse problem?

2003-11-04 Thread Jeff Roberson

On Wed, 5 Nov 2003, Eirik Oeverby wrote:

 Eirik Oeverby wrote:
  Just for those interested:
  I do *not* get any messages at all from the kernel (or elsewhere) when
  my mouse goes haywire. And it's an absolute truth (just tested back and
  forth 8 times) that it *only* happens with SCHED_ULE and *only* with old
  versions (~1.50) and the very latest ones (1.75 as I'm currently
  running). 1.69 for instance did *not* show any such problems.
 
  I will, however, update my kernel again now, to get the latest
  sched_ule.c (if any changes have been made since 1.75) and to test with
  the new interrupt handler. I have a suspicion it might be a combination
  of SCHED_ULE and some signal/message/interrupt handling causing messages
  to get lost along the way. Because that's exactly how it feels...

 Whee. Either the bump from sched_ule.c 1.75 to 1.77 changed something
 back to the old status, or the new interrupt handling has had some major
 influence.
 All I can say is - wow. My system is now more responsive than ever, I
 cannot (so far) reproduce any mouse jerkiness or bogus input or
 anything, and things seem smoother.

 As always I cannot guarantee that this report is not influenced by the
 placebo effect, but I do feel that it's a very real improvement. The
 fact that I can start VMWare, Firebird, Thunderbird, Gaim and gkrellm at
 the same time without having *one* mouse hiccup speaks for itself. I
 couldn't even do that with ULE.

 So Jeff or whoever did the interrupt stuff - what did you do?

This is wonderful news.  I fixed a few bugs over the last couple of days.
I'm not sure which one caused your problem.  I'm very pleased to hear your
report though.

Cheers,
Jeff


 /Eirik

 
  Greetings,
  /Eirik
 
  Morten Johansen wrote:
 
  On Tue, 4 Nov 2003, Sheldon Hearn wrote:
 
  On (2003/11/04 09:29), Eirik Oeverby wrote:
 
   The problem is two parts: The mouse tends to 'lock up' for brief
  moments
   when the system is under load, in particular during heavy UI
  operations
   or when doing compile jobs and such.
   The second part of the problem is related, and is manifested by the
   mouse actually making movements I never asked it to make.
 
  Wow, I just assumed it was a local problem.  I'm also seeing unrequested
  mouse movement, as if the signals from movements are repeated or
  amplified.
 
  The thing is, I'm using 4BSD, not ULE, so I wouldn't trouble Jeff to
  look for a cause for that specific problem in ULE.
 
 
 
 
  Me too. Have had this problem since I got an Intellimouse PS/2
  wheel-mouse. (It worked fine with previous mice (no wheel)).
  With any scheduler in 5-CURRENT and even more frequent in 4-STABLE,
  IIRC. Using moused or not doesn't make a difference.
  Get these messages on console: psmintr: out of sync, and the mouse
  freezes then goes wild for a few seconds.
  Can happen under load and sometimes when closing Mozilla (not often).
  It could be related to the psm-driver. Or maybe I have a bad mouse, I
  don't know.
  I will try another mouse, but it does work perfectly in Linux and
  Windogs...
 
  mj
 
 
 
 
 
 






How nice should behave (was Re: More ULE bugs fixed.)

2003-11-03 Thread Jeff Roberson

On Tue, 4 Nov 2003, Bruce Evans wrote:

 On Sun, 2 Nov 2003, Jeff Roberson wrote:

  You commented on the nice cutoff before.  What do you believe the correct
  behavior is?  In ULE I went to great lengths to be certain that I emulated
  the old behavior of denying nice +20 processes cpu time when anything nice
  0 or above was running.  As a result of that, nice -20 processes inhibit
  any processes with a nice below zero from receiving cpu time.  Prior to a
  commit earlier today, nice -20 would stop nice 0 processes that were
  non-interactive.  I've changed that though so nice 0 will always be able
  to run, just with a small slice.  Based on your earlier comments, you
  don't believe that this behavior is correct, why, and what would you like
  to see?

 Only RELENG_4 has that old behaviour.

 I think the existence of rtprio and a non-broken idprio makes infinite
 deprioritization using niceness unnecessary.  (idprio is still broken
 (not available to users) in -current, but it doesn't need to be if
 priority propagation is working as it should be.)  It's safer and fairer
 for all niced processes to not completely prevent each other being
 scheduled, and use the special scheduling classes for cases where this
 is not wanted.  I'd mainly like the slices for nice -20 vs nice --20
 processes to be very small and/or infrequent.

idprio should be able to function properly since we have priority
propagation and elevated priorities for m/tsleep.  I believe that many
people rely on the nice +20 behavior.  We could change this and make it a
matter of user education.

ULE's nice mechanism is very flexible in this regard.  I would only have
to change one define to force the slice assignment to scale across the
whole slice range.  Although, I only have 14 possible slice values to
hand out, so small differences would be meaningless.
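The mechanism described above can be sketched as follows, with invented constants (ULE's real defines live in sched_ule.c and differ): 14 discrete slice sizes scaled across a nice window anchored at the least-nice runnable thread. That anchoring is exactly why nice -20 threads push nice 0 out of the window, and why the recent commit has to guarantee nice 0 a minimal slice:

```c
#define SLICE_MIN   1   /* illustrative smallest slice, in ticks      */
#define SLICE_MAX  14   /* 14 distinct slice values, per the text     */
#define NICE_WINDOW 20  /* the single define: widening this to 40
                         * scales across the whole -20..+20 range     */

/* Slice for a thread at 'nice', given the lowest nice value among all
 * runnable threads.  Threads falling outside the window get no slice
 * at all -- except nice 0 and below, which always keep a minimal
 * slice so they can run. */
static int
sched_slice_sketch(int nice, int min_nice)
{
    int off = nice - min_nice;

    if (off >= NICE_WINDOW)
        return nice <= 0 ? SLICE_MIN : 0;
    return SLICE_MAX - off * (SLICE_MAX - SLICE_MIN) / NICE_WINDOW;
}
```

With only 14 values over a 20-step (or 40-step) window, adjacent nice levels necessarily share slice sizes, which is the "small differences would be meaningless" point above.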


 Bruce




Re: More ULE bugs fixed.

2003-11-03 Thread Jeff Roberson
On Mon, 3 Nov 2003, Eirik Oeverby wrote:

 Hi,

 Just recompiled yesterday, running sched_ule.c 1.75. It seems to have
 re-introduced the bogus mouse events I talked about earlier, after a
 period of having no problems with it. The change happened between 1.69
  and 1.75, and there's also the occasional glitch in keyboard input.

How unfortunate, it seems to have fixed other problems.  Can you describe
the mouse problem?  Is it jittery constantly or only under load?  Or are
you having other problems?  Have you tried reverting to SCHED_4BSD?  What
window manager do you run?

Thanks for the report.

Cheers,
Jeff


 If you need me to do anything to track this down, let me know. I am, and
 have always been, running with moused, on a uniprocessor box (ThinkPad
 T21 1ghz p3).

 Best regards,
 /Eirik

 Jeff Roberson wrote:
  On Fri, 31 Oct 2003, Bruno Van Den Bossche wrote:
 
 
 Jeff Roberson [EMAIL PROTECTED] wrote:
 
 
 On Wed, 29 Oct 2003, Jeff Roberson wrote:
 
 
 On Thu, 30 Oct 2003, Bruce Evans wrote:
 
 
 Test for scheduling buildworlds:
 
 cd /usr/src/usr.bin
 for i in obj depend all
 do
 MAKEOBJDIRPREFIX=/somewhere/obj time make -s -j16 $i
  done > /tmp/zqz 2>&1
 
 (Run this with an empty /somewhere/obj.  The all stage doesn't
 quite finish.)  On an ABIT BP6 system with a 400MHz and a 366MHz
 CPU, with/usr (including /usr/src) nfs-mounted (with 100 Mbps
 ethernet and a reasonably fast server) and /somewhere/obj
 ufs1-mounted (on a fairly slow disk; no soft-updates), this
 gives the following times:
 
 SCHED_ULE-yesterday, with not so careful setup:
     40.37 real     8.26 user     6.26 sys
    278.90 real    59.35 user    41.32 sys
    341.82 real   307.38 user    69.01 sys
  SCHED_ULE-today, run immediately after booting:
     41.51 real     7.97 user     6.42 sys
    306.64 real    59.66 user    40.68 sys
    346.48 real   305.54 user    69.97 sys
  SCHED_4BSD-yesterday, with not so careful setup:
    [same as today except the depend step was 10 seconds
    slower (real)]
  SCHED_4BSD-today, run immediately after booting:
     18.89 real     8.01 user     6.66 sys
    128.17 real    58.33 user    43.61 sys
    291.59 real   308.48 user    72.33 sys
  SCHED_4BSD-yesterday, with a UP kernel (running on the 366 MHz
  CPU) with many local changes and not so careful setup:
     17.39 real     8.28 user     5.49 sys
    130.51 real    60.97 user    34.63 sys
    390.68 real   310.78 user    60.55 sys
 
 Summary: SCHED_ULE was more than twice as slow as SCHED_4BSD for
 the obj and depend stages.  These stages have little
 parallelism.  SCHED_ULE was only 19% slower for the all stage.
 ...
 
 I reran this with -current (sched_ule.c 1.68, etc.).  Result: no
 significant change.  However, with a UP kernel there was no
 significant difference between the times for SCHED_ULE and
 SCHED_4BSD.
 
 There was a significant difference on UP until last week.  I'm
 working on SMP now.  I have some patches but they aren't quite ready
 yet.
 
 I have commited my SMP fixes.  I would appreciate it if you could post
 update results.  ULE now outperforms 4BSD in a single threaded kernel
 compile and performs almost identically in a 16 way make.  I still
 have a few more things that I can do to improve the situation.  I
 would expect ULE to pull further ahead in the months to come.
 
 I recently had to complete a little piece of software in a course on
 parallel computing.  I've put it online[1] (we only had to write the
 pract2.cpp file).  It calculates the inverse of a Vandermonde matrix and
 allows you to spawn multiple slave-processes who each perform a part of
 the work.  Everything happens in memory so
 I've used it lately to test the different changes you made to
 sched_ule.c and these last fixes do improve the performance on my dual
 p3 machine a lot.
 
 Here are the results of my (very limited tests) :
 
 sched4bsd
 ---
 dimension   slaves  time
  1000        1       90.925408
  1000        2       58.897038
 
 200 1   0.735962
 200 2   0.676660
 
 sched_ule 1.68
 ---
 dimension   slaves  time
  1000        1       90.951015
  1000        2       70.402845
 
 200 1   0.743551
 200 2   1.900455
 
 sched_ule 1.70
 ---
 dimension   slaves  time
  1000        1       90.782309
  1000        2       57.207351
 
 200 1   0.739998
 200 2   0.383545
 
 
 I'm not really sure if this is very relevant to you, but from the
  end-user point of view (me :-)) this does mean something.
 Thanks!
 
 
  I welcome the feedback, positive or negative, as it helps me improve
  things.  Thanks for the report!  Could you run

Re: More ULE bugs fixed.

2003-11-02 Thread Jeff Roberson
On Sat, 1 Nov 2003, Bruce Evans wrote:

 On Fri, 31 Oct 2003, Jeff Roberson wrote:

  I have commited my SMP fixes.  I would appreciate it if you could post
  update results.  ULE now outperforms 4BSD in a single threaded kernel
  compile and performs almost identically in a 16 way make.  I still have a
  few more things that I can do to improve the situation.  I would expect
  ULE to pull further ahead in the months to come.

 My simple make benchmark now takes infinitely longer with ULE under SMP,
 since make -j 16 with ULE under SMP now hangs nfs after about a minute.
 4BSD works better.  However, some networking bugs have developed in the
 last few days.  One of their manifestations is that SMP kernels always
 panic in sbdrop() on shutdown.

  The nice issue is still outstanding, as is the incorrect wcpu reporting.

 It may be related to nfs processes not getting any cycles even when there
 are no niced processes.


I've just run your script myself.  I was using sched_ule.c rev 1.75.  I
did not encounter any problem.  I also have not run it with 4BSD so I
don't have any performance comparisons.  Hopefully the next time you have
an opportunity to test things will go smoothly.  I fixed a bug in
sched_prio() that may have caused this behavior.

You commented on the nice cutoff before.  What do you believe the correct
behavior is?  In ULE I went to great lengths to be certain that I emulated
the old behavior of denying nice +20 processes cpu time when anything nice
0 or above was running.  As a result of that, nice -20 processes inhibit
any processes with a nice below zero from receiving cpu time.  Prior to a
commit earlier today, nice -20 would stop nice 0 processes that were
non-interactive.  I've changed that though so nice 0 will always be able
to run, just with a small slice.  Based on your earlier comments, you
don't believe that this behavior is correct, why, and what would you like
to see?

Thanks,
Jeff



 Bruce




Re: Sticky mouse with SCHED_ULE 10-30-03

2003-11-02 Thread Jeff Roberson
On Sun, 2 Nov 2003, Schnoopay wrote:

  Are you using moused?  Is this SMP or UP?  What CPUs are you using?
   
Thanks,
Jeff
 
  I am having similar problems after my last cvsup (10-31-03) also using a
  USB MS Intellimouse. Mouse is slow to respond under ULE but fine under
  4BSD. The mouse feels like it's being sampled at a slow rate.
 
  I am using moused, on a UP Athlon XP 1800+. I am running seti at home at
  nice 15, but killing the seti process made no notable difference. I failed
  to check objective performance as the interactive experience was truly
  difficult to work with and I just wanted to get my work done. =]
 
  -Schnoopay

 I just disabled moused and told X to read from /dev/ums0 and the mouse
 problems are gone. I haven't changed anything else from when the mouse
 was sticky so I guess not using moused is a good work around.


I'm not able to reproduce this at all.  Could any of you folks that are
experiencing this problem update to sched_ule.c rev 1.75 and tell me if it
persists?


 -Schnoopay





Re: More ULE bugs fixed.

2003-10-31 Thread Jeff Roberson
On Wed, 29 Oct 2003, Jeff Roberson wrote:

 On Thu, 30 Oct 2003, Bruce Evans wrote:

   Test for scheduling buildworlds:
  
 cd /usr/src/usr.bin
 for i in obj depend all
 do
 MAKEOBJDIRPREFIX=/somewhere/obj time make -s -j16 $i
  done > /tmp/zqz 2>&1
  
   (Run this with an empty /somewhere/obj.  The all stage doesn't quite
   finish.)  On an ABIT BP6 system with a 400MHz and a 366MHz CPU, with
   /usr (including /usr/src) nfs-mounted (with 100 Mbps ethernet and a
   reasonably fast server) and /somewhere/obj ufs1-mounted (on a fairly
   slow disk; no soft-updates), this gives the following times:
  
   SCHED_ULE-yesterday, with not so careful setup:
     40.37 real     8.26 user     6.26 sys
    278.90 real    59.35 user    41.32 sys
    341.82 real   307.38 user    69.01 sys
  SCHED_ULE-today, run immediately after booting:
     41.51 real     7.97 user     6.42 sys
    306.64 real    59.66 user    40.68 sys
    346.48 real   305.54 user    69.97 sys
  SCHED_4BSD-yesterday, with not so careful setup:
    [same as today except the depend step was 10 seconds slower (real)]
  SCHED_4BSD-today, run immediately after booting:
     18.89 real     8.01 user     6.66 sys
    128.17 real    58.33 user    43.61 sys
    291.59 real   308.48 user    72.33 sys
  SCHED_4BSD-yesterday, with a UP kernel (running on the 366 MHz CPU) with
  many local changes and not so careful setup:
     17.39 real     8.28 user     5.49 sys
    130.51 real    60.97 user    34.63 sys
    390.68 real   310.78 user    60.55 sys
  
   Summary: SCHED_ULE was more than twice as slow as SCHED_4BSD for the
   obj and depend stages.  These stages have little parallelism.  SCHED_ULE
   was only 19% slower for the all stage.  ...
 
  I reran this with -current (sched_ule.c 1.68, etc.).  Result: no
  significant change.  However, with a UP kernel there was no significant
  difference between the times for SCHED_ULE and SCHED_4BSD.

 There was a significant difference on UP until last week.  I'm working on
 SMP now.  I have some patches but they aren't quite ready yet.

I have commited my SMP fixes.  I would appreciate it if you could post
update results.  ULE now outperforms 4BSD in a single threaded kernel
compile and performs almost identically in a 16 way make.  I still have a
few more things that I can do to improve the situation.  I would expect
ULE to pull further ahead in the months to come.

The nice issue is still outstanding, as is the incorrect wcpu reporting.

Cheers,
Jeff


 
   Test 5 for fair scheduling related to niceness:
  
 for i in -20 -16 -12 -8 -4 0 4 8 12 16 20
 do
  nice -$i sh -c "while :; do echo -n;done" &
 done
 time top -o cpu
  
   With SCHED_ULE, this now hangs the system, but it worked yesterday.  Today
   it doesn't get as far as running top and it stops the nfs server responding.
   To unhang the system and see what the above does, run a shell at rtprio 0
   and start top before the above, and use top to kill processes (I normally
   use killall sh to kill all the shells generated by tests 1-5, but killall
   doesn't work if it is on nfs when the nfs server is not responding).
 
  This shows problems much more clearly with UP kernels.  It gives the
  nice -20 and -16 processes approx. 55% and 50% of the CPU, respectively
  (the total is significantly more than 100%), and it gives approx.  0%
  of the CPU to the other sh processes (perhaps exactly 0).  It also
  apparently gives 0% of the CPU to some important nfs process (I
  couldn't see exactly which) so the nfs server stops responding.
  SCHED_4BSD errs in the opposite direction by giving too many cycles to
  highly niced processes so it is naturally immune to this problem.  With
  SMP, SCHED_ULE lets many more processes run.

 I seem to have broken something related to nice.  I only tested
 interactivity and performance after my last round of changes.  I have a
 standard test that I do that is similar to the one that you have posted
 here.  I used it to gather results for my paper
 (http://www.chesapeake.net/~jroberson/ULE.pdf).  There you can see what
 the intended nice curve is like.  Oddly enough, I ran your test again on
 my laptop and I did not see 55% of the cpu going to nice -20.  It was
  spread proportionally from -20 to 0 with positive nice values not receiving
 cpu time, as intended.  It did not, however, let interactive processes
 proceed.  This is certainly a bug and it sounds like there may be others
 which lead to the problems that you're having.

 
  The nfs server also sometimes stops responding with only non-negatively
  niced processes (0 through 20 in the above), but it takes longer.
 
  The nfs server restarts if enough of the hog processes are killed.
  Apparently nfs has some critical process running at only user

Re: More ULE bugs fixed.

2003-10-31 Thread Jeff Roberson
On Fri, 31 Oct 2003, Bruno Van Den Bossche wrote:

 Jeff Roberson [EMAIL PROTECTED] wrote:

  On Wed, 29 Oct 2003, Jeff Roberson wrote:
 
   On Thu, 30 Oct 2003, Bruce Evans wrote:
  
 Test for scheduling buildworlds:

   cd /usr/src/usr.bin
   for i in obj depend all
   do
   MAKEOBJDIRPREFIX=/somewhere/obj time make -s -j16 $i
    done > /tmp/zqz 2>&1

 (Run this with an empty /somewhere/obj.  The all stage doesn't
 quite finish.)  On an ABIT BP6 system with a 400MHz and a 366MHz
 CPU, with/usr (including /usr/src) nfs-mounted (with 100 Mbps
 ethernet and a reasonably fast server) and /somewhere/obj
 ufs1-mounted (on a fairly slow disk; no soft-updates), this
 gives the following times:

 SCHED_ULE-yesterday, with not so careful setup:
     40.37 real     8.26 user     6.26 sys
    278.90 real    59.35 user    41.32 sys
    341.82 real   307.38 user    69.01 sys
  SCHED_ULE-today, run immediately after booting:
     41.51 real     7.97 user     6.42 sys
    306.64 real    59.66 user    40.68 sys
    346.48 real   305.54 user    69.97 sys
  SCHED_4BSD-yesterday, with not so careful setup:
    [same as today except the depend step was 10 seconds
    slower (real)]
  SCHED_4BSD-today, run immediately after booting:
     18.89 real     8.01 user     6.66 sys
    128.17 real    58.33 user    43.61 sys
    291.59 real   308.48 user    72.33 sys
  SCHED_4BSD-yesterday, with a UP kernel (running on the 366 MHz
  CPU) with many local changes and not so careful setup:
     17.39 real     8.28 user     5.49 sys
    130.51 real    60.97 user    34.63 sys
    390.68 real   310.78 user    60.55 sys

 Summary: SCHED_ULE was more than twice as slow as SCHED_4BSD for
 the obj and depend stages.  These stages have little
 parallelism.  SCHED_ULE was only 19% slower for the all stage.
 ...
   
I reran this with -current (sched_ule.c 1.68, etc.).  Result: no
significant change.  However, with a UP kernel there was no
significant difference between the times for SCHED_ULE and
SCHED_4BSD.
  
   There was a significant difference on UP until last week.  I'm
   working on SMP now.  I have some patches but they aren't quite ready
   yet.
 
  I have commited my SMP fixes.  I would appreciate it if you could post
  update results.  ULE now outperforms 4BSD in a single threaded kernel
  compile and performs almost identically in a 16 way make.  I still
  have a few more things that I can do to improve the situation.  I
  would expect ULE to pull further ahead in the months to come.

 I recently had to complete a little piece of software in a course on
 parallel computing.  I've put it online[1] (we only had to write the
 pract2.cpp file).  It calculates the inverse of a Vandermonde matrix and
 allows you to spawn multiple slave-processes who each perform a part of
 the work.  Everything happens in memory so
 I've used it lately to test the different changes you made to
 sched_ule.c and these last fixes do improve the performance on my dual
 p3 machine a lot.

 Here are the results of my (very limited tests) :

 sched4bsd
 ---
 dimension   slaves  time
  1000        1       90.925408
  1000        2       58.897038

 200 1   0.735962
 200 2   0.676660

 sched_ule 1.68
 ---
 dimension   slaves  time
  1000        1       90.951015
  1000        2       70.402845

 200 1   0.743551
 200 2   1.900455

 sched_ule 1.70
 ---
 dimension   slaves  time
  1000        1       90.782309
  1000        2       57.207351

 200 1   0.739998
 200 2   0.383545


 I'm not really sure if this is very relevant to you, but from the
  end-user point of view (me :-)) this does mean something.
 Thanks!

I welcome the feedback, positive or negative, as it helps me improve
things.  Thanks for the report!  Could you run this again under 4bsd and
ULE with the following in your .cshrc:

set time= ( 5 %Uu %Ss %E %P %X+%Dk %I+%Oio %Fpf+%Ww %cc/%ww )

And then time ./testpract 200 2, etc.  This will give me a few hints about
what's impacting your performance.
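For context, the %w and %c escapes at the end of that csh time format report voluntary and involuntary context switches, which is presumably what Jeff wants to compare between the schedulers. The same counters are readable programmatically through getrusage(2); a hedged, portable sketch (the function name is invented for illustration):

```c
#include <sys/resource.h>

/* Voluntary (%w) plus involuntary (%c) context switches charged to
 * the calling process so far, the raw numbers behind csh's time
 * format escapes. */
static long
ctx_switches_so_far(void)
{
    struct rusage ru;

    if (getrusage(RUSAGE_SELF, &ru) != 0)
        return -1;
    return ru.ru_nvcsw + ru.ru_nivcsw;
}
```

Sampling this before and after a benchmark run gives the per-run context-switch count without depending on the csh builtin.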

Thanks!
Jeff


 [1] http://users.pandora.be/bomberboy/mptest/final.tar.bz2
 It can be used by running testpract2 with two arguments, the dimension
 of the matrix and the number of slaves.  example './testpract2 200 2'
 will create a matrix with dimension 200 and 2 slaves.


 --
 Bruno

 ... And then there's the guy who bought 20,000 bras, cut them in half,
 and sold 40,000 yamalchas with chin straps

Re: Sticky mouse with SCHED_ULE 10-30-03

2003-10-31 Thread Jeff Roberson

On Fri, 31 Oct 2003, Michal wrote:

 FreeBSD 5.1-CURRENT #0: Thu Oct 30 17:49:13 EST 2003
 When the kernel is compiled with SCHED_ULE, the USB mouse (MS USB
 Intellimouse) is almost unusable.  Even when the CPU is idle, the mouse
 feels sticky.  When loading mozilla or compiling something, the mouse
 freezes for several seconds and is generally nonresponsive.  Switching
 back to SCHED_4BSD, the mouse is better than ever: no problems at all
 when loading programs or compiling.  My subjective feeling is that the
 mouse responds worse with SCHED_ULE than a month ago, and better with
 SCHED_4BSD than before.

Are you using moused?  Is this SMP or UP?  What CPUs are you using?

Thanks,
Jeff


 Michal





Re: More ULE bugs fixed.

2003-10-29 Thread Jeff Roberson
On Thu, 30 Oct 2003, Bruce Evans wrote:

  Test for scheduling buildworlds:
 
  cd /usr/src/usr.bin
  for i in obj depend all
  do
  MAKEOBJDIRPREFIX=/somewhere/obj time make -s -j16 $i
   done > /tmp/zqz 2>&1
 
  (Run this with an empty /somewhere/obj.  The all stage doesn't quite
  finish.)  On an ABIT BP6 system with a 400MHz and a 366MHz CPU, with
  /usr (including /usr/src) nfs-mounted (with 100 Mbps ethernet and a
  reasonably fast server) and /somewhere/obj ufs1-mounted (on a fairly
  slow disk; no soft-updates), this gives the following times:
 
  SCHED_ULE-yesterday, with not so careful setup:
     40.37 real     8.26 user     6.26 sys
    278.90 real    59.35 user    41.32 sys
    341.82 real   307.38 user    69.01 sys
  SCHED_ULE-today, run immediately after booting:
     41.51 real     7.97 user     6.42 sys
    306.64 real    59.66 user    40.68 sys
    346.48 real   305.54 user    69.97 sys
  SCHED_4BSD-yesterday, with not so careful setup:
    [same as today except the depend step was 10 seconds slower (real)]
  SCHED_4BSD-today, run immediately after booting:
     18.89 real     8.01 user     6.66 sys
    128.17 real    58.33 user    43.61 sys
    291.59 real   308.48 user    72.33 sys
  SCHED_4BSD-yesterday, with a UP kernel (running on the 366 MHz CPU) with
  many local changes and not so careful setup:
     17.39 real     8.28 user     5.49 sys
    130.51 real    60.97 user    34.63 sys
    390.68 real   310.78 user    60.55 sys
 
  Summary: SCHED_ULE was more than twice as slow as SCHED_4BSD for the
  obj and depend stages.  These stages have little parallelism.  SCHED_ULE
  was only 19% slower for the all stage.  ...

 I reran this with -current (sched_ule.c 1.68, etc.).  Result: no
 significant change.  However, with a UP kernel there was no significant
 difference between the times for SCHED_ULE and SCHED_4BSD.

There was a significant difference on UP until last week.  I'm working on
SMP now.  I have some patches but they aren't quite ready yet.


  Test 5 for fair scheduling related to niceness:
 
  for i in -20 -16 -12 -8 -4 0 4 8 12 16 20
  do
   nice -$i sh -c "while :; do echo -n;done" &
  done
  time top -o cpu
 
  With SCHED_ULE, this now hangs the system, but it worked yesterday.  Today
  it doesn't get as far as running top and it stops the nfs server responding.
  To unhang the system and see what the above does, run a shell at rtprio 0
  and start top before the above, and use top to kill processes (I normally
  use killall sh to kill all the shells generated by tests 1-5, but killall
  doesn't work if it is on nfs when the nfs server is not responding).

 This shows problems much more clearly with UP kernels.  It gives the
 nice -20 and -16 processes approx. 55% and 50% of the CPU, respectively
 (the total is significantly more than 100%), and it gives approx.  0%
 of the CPU to the other sh processes (perhaps exactly 0).  It also
 apparently gives 0% of the CPU to some important nfs process (I
 couldn't see exactly which) so the nfs server stops responding.
 SCHED_4BSD errs in the opposite direction by giving too many cycles to
 highly niced processes so it is naturally immune to this problem.  With
 SMP, SCHED_ULE lets many more processes run.

I seem to have broken something related to nice.  I only tested
interactivity and performance after my last round of changes.  I have a
standard test that I do that is similar to the one that you have posted
here.  I used it to gather results for my paper
(http://www.chesapeake.net/~jroberson/ULE.pdf).  There you can see what
the intended nice curve is like.  Oddly enough, I ran your test again on
my laptop and I did not see 55% of the cpu going to nice -20.  It was
 spread proportionally from -20 to 0 with positive nice values not receiving
cpu time, as intended.  It did not, however, let interactive processes
proceed.  This is certainly a bug and it sounds like there may be others
which lead to the problems that you're having.


 The nfs server also sometimes stops responding with only non-negatively
 niced processes (0 through 20 in the above), but it takes longer.

 The nfs server restarts if enough of the hog processes are killed.
 Apparently nfs has some critical process running at only user priority
 and nice 0 and even non-negatively niced processes are enough to prevent
 it running.

This shouldn't be the case, it sounds like my interactivity boost is
somewhat broken.


 Top output with loops like the above shows many anomalies in PRI, TIME,
 WCPU and CPU, but no worse than the ones with SCHED_4BSD.  PRI tends to
 stick at 139 (the max) with SCHED_ULE.  With SCHED_4BSD, this indicates
 that the scheduler has entered an unfair scheduling region.  I don't
 know how to interpret it for SCHED_ULE (at first I thought 139 

Re: More ULE bugs fixed.

2003-10-27 Thread Jeff Roberson
On Fri, 17 Oct 2003, Bruce Evans wrote:

 On Fri, 17 Oct 2003, Jeff Roberson wrote:

  On Fri, 17 Oct 2003, Bruce Evans wrote:
 
   How would one test if it was an improvement on the 4BSD scheduler?  It
   is not even competitive in my simple tests.
   ...
 
  At one point ULE was at least as fast as 4BSD and in most cases faster.
  This is a regression.  I'll sort it out soon.

 How much faster?


make kernel on UP seems to be within 1% of 4BSD now.  I actually had some
runs which showed lower system time.  I think I can still improve the
situation some.  Anyway, I found some bugs relating to idle prio tasks,
and also ULE had been doing almost twice as many context switches as 4BSD.
Now it's doing about 8% more.  I'm still tracking this down.

Anyhow, it should be much closer now.  I still have some plans for SMP
that should improve things quite a bit there but UP is looking good.

Cheers,
Jeff

___
[EMAIL PROTECTED] mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to [EMAIL PROTECTED]


Re: ULE page fault with sched_ule.c 1.67

2003-10-27 Thread Jeff Roberson
On Mon, 27 Oct 2003, Jonathan Fosburgh wrote:

 On Monday 27 October 2003 12:06 pm, Arjan van Leeuwen wrote:
  Hi,
 
  I just cvsupped and built a new kernel that includes sched_ule.c 1.67. I'm
  getting a page fault when working in Mozilla Firebird. It happens pretty
  soon, after opening one or two pages. The trace shows that it panics at
  sched_prio().
 
 I should have said, I am getting the same panic, same trace, but not using
 Mozilla.  I get it shortly after launching my KDE session, though I'm not
 sure where in my session the problem is being hit.

It's KSE.  You can disable it to work around temporarily.  I will fix it
tonight.


 --
 Jonathan Fosburgh
 AIX and Storage Administrator
 UT MD Anderson Cancer Center
 Houston, TX





Re: More ULE bugs fixed.

2003-10-17 Thread Jeff Roberson
On Fri, 17 Oct 2003, Bruce Evans wrote:

 How would one test if it was an improvement on the 4BSD scheduler?  It
 is not even competitive in my simple tests.

[scripts results deleted]


 Summary: SCHED_ULE was more than twice as slow as SCHED_4BSD for the
 obj and depend stages.  These stages have little parallelism.  SCHED_ULE
 was only 19% slower for the all stage.  It apparently misses many
 opportunities to actually run useful processes.  This may be related
 to /usr being nfs mounted.  There is lots of idling waiting for nfs
 even in the SCHED_4BSD case.  The system times are smaller for SCHED_ULE,
 but this might not be significant.  E.g., zeroing pages can account
 for several percent of the system time in buildworld, but on unbalanced
 systems that have too much idle time most page zero gets done in idle
 time and doesn't show up in the system time.

At one point ULE was at least as fast as 4BSD and in most cases faster.
This is a regression.  I'll sort it out soon.



 Test 1 for fair scheduling related to niceness:

   for i in 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20
   do
   nice -$i sh -c "while :; do echo -n; done" &
   done
   top -o time

 [Output deleted].  This shows only a vague correlation between niceness
 and runtime for SCHED_ULE.  However, top -o cpu shows a strong correlation
 between %CPU and niceness.  Apparently, %CPU is very inaccurate and/or
 not enough history is kept for long-term scheduling to be fair.

 Test 5 for fair scheduling related to niceness:

   for i in -20 -16 -12 -8 -4 0 4 8 12 16 20
   do
   nice -$i sh -c "while :; do echo -n; done" &
   done
   time top -o cpu

 With SCHED_ULE, this now hangs the system, but it worked yesterday.  Today
 it doesn't get as far as running top and it stops the nfs server responding.
 To unhang the system and see what the above does, run a shell at rtprio 0
 and start top before the above, and use top to kill processes (I normally
 use killall sh to kill all the shells generated by tests 1-5, but killall
 doesn't work if it is on nfs when the nfs server is not responding).

  661 root 112  -20   900K   608K RUN  0:24 27.80% 27.64% sh
  662 root 114  -16   900K   608K RUN  0:19 12.43% 12.35% sh
  663 root 114  -12   900K   608K RUN  0:15 10.66% 10.60% sh
  664 root 114   -8   900K   608K RUN  0:11  9.38%  9.33% sh
  665 root 115   -4   900K   608K RUN  0:10  7.91%  7.86% sh
  666 root 115    0   900K   608K RUN  0:07  6.83%  6.79% sh
  667 root 115    4   900K   608K RUN  0:06  5.01%  4.98% sh
  668 root 115    8   900K   608K RUN  0:04  3.83%  3.81% sh
  669 root 115   12   900K   608K RUN  0:02  2.21%  2.20% sh
  670 root 115   16   900K   608K RUN  0:01  0.93%  0.93% sh

I think you cvsup'd at a bad time.  I fixed a bug that would have caused
the system to lock up in this case late last night.  On my system it
freezes for a few seconds and then returns.  I can stop that by turning
down the interactivity threshold.

Thanks,
Jeff


 Bruce




Re: More ULE bugs fixed.

2003-10-17 Thread Jeff Roberson

On Fri, 17 Oct 2003, Bruce Evans wrote:

 On Fri, 17 Oct 2003, Jeff Roberson wrote:

  On Fri, 17 Oct 2003, Bruce Evans wrote:
 
   How would one test if it was an improvement on the 4BSD scheduler?  It
   is not even competitive in my simple tests.
   ...
 
  At one point ULE was at least as fast as 4BSD and in most cases faster.
  This is a regression.  I'll sort it out soon.

 How much faster?

Apache benchmarked at 30% greater throughput due to the cpu affinity some
time ago.  I haven't done more recent tests with apache.  buildworld is
the most degenerate case for per cpu run queues because cpu affinity
doesn't help much and load imbalances hurt a lot.  On my machine the
compiler hardly ever wants to run for more than a few slices before doing
a msleep() so it's not bouncing around between CPUs so much with 4BSD.



   Test 5 for fair scheduling related to niceness:
  
 for i in -20 -16 -12 -8 -4 0 4 8 12 16 20
 do
 nice -$i sh -c "while :; do echo -n; done" &
 done
 time top -o cpu
  
   With SCHED_ULE, this now hangs the system, but it worked yesterday.  Today
   it doesn't get as far as running top and it stops the nfs server responding.

661 root 112  -20   900K   608K RUN  0:24 27.80% 27.64% sh
662 root 114  -16   900K   608K RUN  0:19 12.43% 12.35% sh
663 root 114  -12   900K   608K RUN  0:15 10.66% 10.60% sh
664 root 114   -8   900K   608K RUN  0:11  9.38%  9.33% sh
665 root 115   -4   900K   608K RUN  0:10  7.91%  7.86% sh
666 root 115    0   900K   608K RUN  0:07  6.83%  6.79% sh
667 root 115    4   900K   608K RUN  0:06  5.01%  4.98% sh
668 root 115    8   900K   608K RUN  0:04  3.83%  3.81% sh
669 root 115   12   900K   608K RUN  0:02  2.21%  2.20% sh
670 root 115   16   900K   608K RUN  0:01  0.93%  0.93% sh

 Perhaps the bug only affects SMP.  The above is for UP (no CPU column).


That is likely; I don't use my SMP machine much anymore.  I should set up
some automated tests.

 I see a large difference from the above, at least under SMP: %CPU
 tapers off to 0 at nice 0.

 BTW, I just noticed that SCHED_4BSD never really worked for the SMP case.
 sched_clock() is called for each CPU, and for N CPU's this has the same
 effect as calling sched_clock() N times too often for 1 CPU.  Calling
 sched_clock() too often was fixed for the UP case in kern_synch.c 1.83
 by introducing a scale factor.  The scale factor is fixed so it doesn't
 help for SMP.

Wait.. why are we calling sched_clock() too frequently on UP?


  I think you cvsup'd at a bad time.  I fixed a bug that would have caused
  the system to lock up in this case late last night.  On my system it
  freezes for a few seconds and then returns.  I can stop that by turning
  down the interactivity threshold.

 No, I tested with an up to date kernel (sched_ule.c 1.65).

Curious.  ULE seems to have suffered from bitrot.  These things were all
tested and working when I did my paper for BSDCon.  I have largely
neglected FreeBSD since.  I can't fix it this weekend, but I'm sure I'll
sort it out next weekend.

Cheers,
Jeff


 Bruce




Re: More ULE bugs fixed.

2003-10-17 Thread Jeff Roberson

On Fri, 17 Oct 2003, Sean Chittenden wrote:

  I think you cvsup'd at a bad time.  I fixed a bug that would have
  caused the system to lock up in this case late last night.  On my
  system it freezes for a few seconds and then returns.  I can stop
  that by turning down the interactivity threshold.

 Hrm, I must concur that while ULE seems a tad snappier on the
 responsiveness end, it seems to be lacking in terms of real world
 performance compared to 4BSD.

Thanks for the stats.  Is this on SMP or UP?


 Fresh CVSup (~midnight 2003-10-17) and build with a benchmark from
 before and after.  I was benchmarking a chump calc program using
 bison vs. lemon earlier today under 4BSD
 (http://groups.yahoo.com/group/sqlite/message/5506) and figured I'd
 throw my hat in on the subject with some relative numbers.  System
 time is down for ULE, but user and real are up.


 Under ULE:

 Running a dry run with bison calc...done.
 Running 1st run with bison calc... 52.11 real 45.63 user 0.56 sys
 Running 2nd run with bison calc... 52.16 real 45.52 user 0.69 sys
 Running 3rd run with bison calc... 51.80 real 45.32 user 0.87 sys

 Running a dry run with lemon calc...done.
 Running 1st run with lemon calc... 129.69 real 117.91 user 1.10 sys
 Running 2nd run with lemon calc... 130.26 real 117.88 user 1.13 sys
 Running 3rd run with lemon calc... 130.76 real 117.90 user 1.10 sys

 Time spent in user mode   (CPU seconds) : 654.049s
 Time spent in kernel mode (CPU seconds) : 7.047s
 Total time  : 12:19.06s
 CPU utilization (percentage): 89.4%
 Times the process was swapped   : 0
 Times of major page faults  : 34
 Times of minor page faults  : 2361


 And under 4BSD:

  Running a dry run with bison calc...done.
  Running 1st run with bison calc... 44.22 real 37.94 user 0.85 sys
  Running 2nd run with bison calc... 46.21 real 37.98 user 0.85 sys
  Running 3rd run with bison calc... 45.32 real 38.13 user 0.67 sys

  Running a dry run with lemon calc...done.
  Running 1st run with lemon calc... 116.53 real 100.10 user 1.13 sys
  Running 2nd run with lemon calc... 112.61 real 100.35 user 0.86 sys
  Running 3rd run with lemon calc... 114.16 real 100.19 user 1.04 sys

  Time spent in user mode (CPU seconds) : 553.392s
  Time spent in kernel mode (CPU seconds) : 6.978s
  Total time : 10:40.80s
  CPU utilization (percentage) : 87.4%
  Times the process was swapped : 223
  Times of major page faults : 50
  Times of minor page faults : 2750


 Just a heads up, it does indeed look as though things have gone
 backwards in terms of performance.  -sc

 --
 Sean Chittenden




Re: More ULE bugs fixed.

2003-10-16 Thread Jeff Roberson
On Thu, 16 Oct 2003, Eirik Oeverby wrote:

 Jeff Roberson wrote:
  On Wed, 15 Oct 2003, Eirik Oeverby wrote:
 
 
 Eirik Oeverby wrote:
 
 Jeff Roberson wrote:
 
 
 I fixed two bugs that were exposed due to more of the kernel running
 outside of Giant.  ULE had some issues with priority propagation that
 stopped it from working very well.
 
 Things should be much improved.  Feedback, as always, is welcome.  I'd
 like to look into making this the default scheduler for 5.2 if things
 start looking up.  I hope that scares you all into using it more. :-)
 
 
 Hi..
 Just tested, so far it seems good. System CPU load is floored (near 0),
 system is very responsive, no mouse sluggishness or random
 mouse/keyboard input.
 Doing a make -j 20 buildworld now (on my 1ghz p3 thinkpad ;), and
 running some SQLServer stuff in VMWare. We'll see how it fares.
 
 Hi, just a followup message.
 I'm now running the buildworld mentioned above, and the system is pretty
 much unusable. It exhibits the same symptoms as I have mentioned before,
 mouse jumpiness, bogus mouse input (movement, clicks), and the system is
  generally very jerky and unresponsive. This is particularly evident
  when doing things like webpage loading/browsing/rendering, but it's
  noticeable all the time, no matter what I am doing. As an example, the
  last sentence I wrote without seeing a single character on screen before
  I was finished writing it, and it appeared with a lot more typos than I
  usually make ;)
 
 I'm running *without* invariants and witness right now, i.e. a kernel
 100% equal to the SCHED_4BSD kernel.
 
 
  Can you confirm the revision of your sys/kern/sched_ule.c file?  How does
  SCHED_4BSD respond in this same test?

 Yes I can. From file:
  __FBSDID("$FreeBSD: src/sys/kern/sched_ule.c,v 1.59 2003/10/15 07:47:06
  jeff Exp $");
 I am running SCHED_4BSD now, with a make -j 20 buildworld running, and I
 do not experience any of the problems. Keyboard and mouse input is
 smooth, and though apps run slightly slower due to the massive load on
 the system, there is none of the jerkiness I have seen before.

 Anything else I can do to help?

Yup, try again. :-)  I found another bug and tuned some parameters of the
scheduler.  The bug was introduced after I did my paper for BSDCon and so
I never ran into it when I was doing serious stress testing.

Hopefully this will be a huge improvement.  I did a make -j16 buildworld
and used mozilla while in kde2.  It was fine unless I tried to scroll
around rapidly in a page full of several megabyte images for many minutes.


 /Eirik

  Thanks,
  Jeff
 
 
 Best regards,
 /Eirik
 
 
 
 
 





Re: Page faults with today's current

2003-10-16 Thread Jeff Roberson

On Thu, 16 Oct 2003, Arjan van Leeuwen wrote:

 I just cvsupped and installed a new world and kernel (previous kernel was from
 October 13), and now my machine gets a page fault when I try to run any GTK2
 application (Firebird, Gnome 2). Are others seeing this as well?

 Arjan

If you're running ULE and KSE I just fixed a bug with that.  If not, please
provide a stack trace.  You can manually transcribe one by starting a gtk2
application from a console with your DISPLAY variable set appropriately.

Thanks,
Jeff






Re: sched_ule.c SMP error

2003-10-16 Thread Jeff Roberson

On Thu, 16 Oct 2003, Valentin Chopov wrote:

 I'm getting an error in the sched_ule.c

 It looks like sched_add is called with a struct kse arg instead of
 a struct thread.

Fixed, thanks.


 Thanks,

 Val


 cc -c -O -pipe -march=pentiumpro -Wall -Wredundant-decls -Wnested-externs
 -Wstrict-prototypes -Wmissing-prototypes -Wpointer-arith -Winline
 -Wcast-qual -fformat-extensions -std=c99 -nostdinc -I- -I. -I/usr/src/sys
 -I/usr/src/sys/contrib/dev/acpica -I/usr/src/sys/contrib/ipfilter
 -I/usr/src/sys/contrib/dev/ath -I/usr/src/sys/contrib/dev/ath/freebsd
 -D_KERNEL -include opt_global.h -fno-common -finline-limit=15000
 -fno-strict-aliasing -mno-align-long-strings -mpreferred-stack-boundary=2
 -ffreestanding -Werror /usr/src/sys/kern/sched_ule.c
 /usr/src/sys/kern/sched_ule.c: In function `kseq_move':
 /usr/src/sys/kern/sched_ule.c:465: warning: passing arg 1 of `sched_add'
 from incompatible pointer type
 *** Error code 1

 Stop in /usr/obj/usr/src/sys/MYKERNEL.
 *** Error code 1

 Stop in /usr/src.
 *** Error code 1

 Stop in /usr/src.


 ==
 Valentin S. Chopov, CC[ND]P
 Sys/Net Admin
 SEI Data Inc.
 E-Mail: [EMAIL PROTECTED]
 ==






More ULE bugs fixed.

2003-10-15 Thread Jeff Roberson
I fixed two bugs that were exposed due to more of the kernel running
outside of Giant.  ULE had some issues with priority propagation that
stopped it from working very well.

Things should be much improved.  Feedback, as always, is welcome.  I'd
like to look into making this the default scheduler for 5.2 if things
start looking up.  I hope that scares you all into using it more. :-)

Cheers,
Jeff



Re: More ULE bugs fixed.

2003-10-15 Thread Jeff Roberson
On Wed, 15 Oct 2003, Eirik Oeverby wrote:

 Eirik Oeverby wrote:
  Jeff Roberson wrote:
 
  I fixed two bugs that were exposed due to more of the kernel running
  outside of Giant.  ULE had some issues with priority propagation that
  stopped it from working very well.
 
  Things should be much improved.  Feedback, as always, is welcome.  I'd
  like to look into making this the default scheduler for 5.2 if things
  start looking up.  I hope that scares you all into using it more. :-)
 
 
  Hi..
  Just tested, so far it seems good. System CPU load is floored (near 0),
  system is very responsive, no mouse sluggishness or random
  mouse/keyboard input.
  Doing a make -j 20 buildworld now (on my 1ghz p3 thinkpad ;), and
  running some SQLServer stuff in VMWare. We'll see how it fares.

 Hi, just a followup message.
 I'm now running the buildworld mentioned above, and the system is pretty
 much unusable. It exhibits the same symptoms as I have mentioned before,
 mouse jumpiness, bogus mouse input (movement, clicks), and the system is
 generally very jerky and unresponsive. This is particularly evident
 when doing things like webpage loading/browsing/rendering, but it's
 noticeable all the time, no matter what I am doing. As an example, the
 last sentence I wrote without seeing a single character on screen before
 I was finished writing it, and it appeared with a lot more typos than I
 usually make ;)

 I'm running *without* invariants and witness right now, i.e. a kernel
 100% equal to the SCHED_4BSD kernel.

Can you confirm the revision of your sys/kern/sched_ule.c file?  How does
SCHED_4BSD respond in this same test?

Thanks,
Jeff


 Best regards,
 /Eirik






Re: More ULE bugs fixed.

2003-10-15 Thread Jeff Roberson
On Wed, 15 Oct 2003, Daniel Eischen wrote:

 On Wed, 15 Oct 2003, Jeff Roberson wrote:

  I fixed two bugs that were exposed due to more of the kernel running
  outside of Giant.  ULE had some issues with priority propagation that
  stopped it from working very well.
 
  Things should be much improved.  Feedback, as always, is welcome.  I'd
  like to look into making this the default scheduler for 5.2 if things
  start looking up.  I hope that scares you all into using it more. :-)

 Before you do that, can you look into changing the scheduler
 interfaces to address David Xu's concern with it being
 suboptimal for KSE processes?

Certainly, it may not happen if I can't find out what's making things so
jerky for gnome/kde users.  If it looks like it will, I'll investigate the
kse issues.


 --
 Dan Eischen





Re: ULE status; interactivity fixed? nice uninvestigated, HTT broken

2003-10-13 Thread Jeff Roberson
On Mon, 13 Oct 2003, Arjan van Leeuwen wrote:

 On Sunday 12 October 2003 23:21, Jeff Roberson wrote:
  I commited a fix that would have caused all of the jerky behaviors under
  some load.  I was not able to reproduce this problem with kde running
  afterwards.

 Thanks for the fix! However, the problem is still here for me (using rev.
 1.58). I just noticed it when compiling Mozilla. I can also still see it when
 logging out of GNOME.

Is it somewhat better?  I specifically fixed the problem for Giant but
other locks could have the same issues.  I suspect that they are far less
frequently held without Giant, but I could be wrong.


 Arjan




Re: ULE status; interactivity fixed? nice uninvestigated, HTT broken

2003-10-13 Thread Jeff Roberson
On Tue, 14 Oct 2003, Arjan van Leeuwen wrote:

 On Monday 13 October 2003 21:27, Jeff Roberson wrote:
  On Mon, 13 Oct 2003, Arjan van Leeuwen wrote:
   On Sunday 12 October 2003 23:21, Jeff Roberson wrote:
I commited a fix that would have caused all of the jerky behaviors
under some load.  I was not able to reproduce this problem with kde
running afterwards.
  
   Thanks for the fix! However, the problem is still here for me (using rev.
   1.58). I just noticed it when compiling Mozilla. I can also still see it
   when logging out of GNOME.
 
  Is it somewhat better?  I specifically fixed the problem for Giant but
  other locks could have the same issues.  I suspect that they are far less
  frequently held without Giant, but I could be wrong.

 Now that I looked at it better, yes, it does indeed seem better :). It still
 seems to happen at the same places, but the jerkiness is less... jerky.  The
 position of the mouse pointer is updated more often than used to be the case.

Thanks.  This feedback is very important for me to resolve these issues.  I
think I know how to solve the issue now, and I'm going to make the required
changes soon.  I'll send a status update again when I do.

Cheers,
Jeff


 Arjan




ULE status; interactivity fixed? nice uninvestigated, HTT broken

2003-10-12 Thread Jeff Roberson
I commited a fix that would have caused all of the jerky behaviors under
some load.  I was not able to reproduce this problem with kde running
afterwards.

I'm going to look into the reports of some problems with nice, although I
suspect that they could have been caused by the same issues.

HTT is awaiting some jhb fixes which are awaiting some UMA fixes.  I'll
give an update on that later.

Cheers,
Jeff



Re: panic: softdep_deallocate_dependencies: dangling deps

2003-10-12 Thread Jeff Roberson
On Mon, 13 Oct 2003, Oliver Fischer wrote:

 My notebook panicked during the night.  After rebooting I found
 this message in my system log:

   panic: softdep_deallocate_dependencies: dangling deps

 ?

When are your sources from?


 Regards,

 Oliver Fischer





Re: Interesting...sched_ule discussion

2003-10-11 Thread Jeff Roberson
On Sat, 11 Oct 2003, Brendon and Wendy wrote:

 Hi,

 Just saw the talk about sched_ule, nvidia driver, moused and pauses...

 I was running -current up until about a month ago, using the nvidia
 driver, sched_bsd on a dual ht xeon, with htt disabled. Mouse
 interactivity with moused was terrible - I actually thought the mouse
 was faulty. Getting rid of moused and using psm0 was better, but not
 hugely so.

 I found that under kde, things were ok but under nautilus the system
 was bounding on unusable.

What kind of hardware do you have?  Were you running with WITNESS and
INVARIANTS?


 By contrast, under linux things are just fine.

What version of the linux kernel did you switch to?


 Maybe this will turn out to be useful data for you...

Could be, thanks.


 Thanks,
 Brendon






ULE Update

2003-10-10 Thread Jeff Roberson
I have reproduced the lagging mouse issue on my laptop.  I tried moused to
no effect.  Eventually, I grudgingly installed kde and immediately started
encountering problems with mouse lag.  It would seem that twm was not
stressing my machine in the same ways that kde is. ;-)

I suspect a problem with IPC.  I will know more soon.

There have also been a few reports of problems related to nice.  I was
able to reproduce some awkward behavior but I have nothing conclusive yet.

There is still a known issue with hyperthreading.  I'm waiting on some of
john baldwin's work to fix this.  If you halt logical cpus your machine
will hang.

Expect some resolution on the ULE problems within a week or so.  Thanks
for the detailed bug reports everyone.

Cheers,
Jeff


