from:"Scott Long"

Re: IBM blade server abysmal disk write performances

2013-01-19 Thread Scott Long


On Jan 19, 2013, at 4:33 PM, Wojciech Puchar woj...@wojtek.tensor.gdynia.pl 
wrote:

 to be enabled to get any speed-up from tagged commands. This was no
 risk with SCSI drives, since the cache did not make the drives lie
 
 i see no correlation between interface type and possibility of lying about 
 command completion.
 

Any interface that enables write cache will lie about write completions.  This
is true for SAS, SATA, SCSI, and PATA (and probably FC and iSCSI).  That's
the whole point of the write cache =-)

Where things got interesting was in the days of SCSI vs PATA.  There was
no tagged queuing for PATA, except for a hack that allowed CDROMs to
disconnect from the shared bus.  So you only got 1 command at a time, and
you paid a serialized latency penalty.  The only way to get reasonable
write performance on PATA was to enable the write cache.  Meanwhile,
SCSI had TCQ and could amortize the latency penalty to the point where
performance with TCQ and no WC was almost as good as with WC.  This
made SCSI the clear choice for performance + data safety.
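
A rough way to see the amortization argument is the toy model below. It is an idealized sketch, not a measurement: the 8 ms latency, the queue depth of 32, and both function names are assumptions made up for illustration, and real drives will not overlap commands this cleanly.

```python
import math

def serialized_seconds(n_cmds, latency_s):
    """One command outstanding at a time: every command pays the full latency."""
    return n_cmds * latency_s

def queued_seconds(n_cmds, latency_s, depth):
    """Idealized tagged queue: up to `depth` commands share one latency period."""
    batches = math.ceil(n_cmds / depth)
    return batches * latency_s

if __name__ == "__main__":
    latency = 0.008  # assumed per-write latency with the write cache off
    print("serialized:", serialized_seconds(64, latency))
    print("queued    :", queued_seconds(64, latency, depth=32))
```

With a depth of 1 the two functions agree, which is the PATA situation described above: no queuing, so only the write cache can hide the latency.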

With SATA vs SAS, the gap is much narrower.  The TCQ command set
(still used by SAS) is still better than the NCQ command set, but the
differences are minor enough that it doesn't matter for most applications.

Scott

___
freebsd-hackers@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-hackers
To unsubscribe, send any mail to freebsd-hackers-unsubscr...@freebsd.org

Re: IBM blade server abysmal disk write performances

2013-01-18 Thread Scott Long

Try adding the following to /boot/loader.conf and reboot:

hw.mpt.enable_sata_wc=1

The default value, -1, instructs the driver to leave the SATA drives at their 
configuration default.  Oftentimes this means that the MPT BIOS will turn off 
the write cache on every system boot sequence.  IT DOES THIS FOR A GOOD REASON! 
 An enabled write cache is counter to data reliability.  Yes, it helps make 
benchmarks look really good, and it's acceptable if your data can be safely 
thrown away (for example, you're just caching from a slower source, and the 
cache can be rebuilt if it gets corrupted).  And yes, Linux has many tricks to 
make this benchmark look really good.  The tricks range from buffering the raw 
device to having 'dd' recognize the requested task and short-circuit the 
process of going to /dev/null or pulling from /dev/zero.  I can't tell you how 
bogus these tests are and how completely irrelevant they are in predicting 
actual workload performance.  But, I'm not going to stop anyone from trying, so 
give the above tunable a try and let me know how it works.

Btw, I'm not subscribed to the hackers mailing list, so please redistribute 
this email as needed.

Scott





 From: Dieter BSD dieter...@gmail.com
To: freebsd-hackers@freebsd.org 
Cc: mja...@freebsd.org; gi...@freebsd.org; sco...@freebsd.org 
Sent: Thursday, January 17, 2013 9:03 PM
Subject: Re: IBM blade server abysmal disk write performances
 
 I am thinking that something fancy in that SAS drive is
 not being handled correctly by the FreeBSD driver.

I think so too, and I think the something fancy is tagged command queuing.
The driver prints "da0: Command Queueing enabled" and yet your SAS drive
is only getting 1 write per rev, and queuing should get you more than that.
Your SATA drive is getting the expected performance, which means that NCQ
must be working.

 Please let me know if there is anything you would like me to run on the
 BSD 9.1 system to help diagnose this issue?

Looking at the mpt driver, a verbose boot may give more info.
Looks like you can set a debug device hint, but I don't
see any documentation on what to set it to.

I think it is time to ask the driver wizards why TCQ isn't working,
so I'm cc-ing the authors listed on the mpt man page.




Re: IBM blade server abysmal disk write performances

2013-01-18 Thread Scott Long

- Original Message -

 From: Wojciech Puchar woj...@wojtek.tensor.gdynia.pl
 To: Scott Long scott4l...@yahoo.com
 Cc: Dieter BSD dieter...@gmail.com; freebsd-hackers@freebsd.org 
 freebsd-hackers@freebsd.org; gi...@freebsd.org gi...@freebsd.org; 
 sco...@freebsd.org sco...@freebsd.org; mja...@freebsd.org 
 mja...@freebsd.org
 Sent: Friday, January 18, 2013 11:10 AM
 Subject: Re: IBM blade server abysmal disk write performances

  The default value, -1, instructs the driver to leave the SATA drives at
 their configuration default.  Oftentimes this means that the MPT BIOS will
 turn off the write cache on every system boot sequence.  IT DOES THIS FOR A
 GOOD REASON!  An enabled write cache is counter to data reliability.  Yes,
 it helps make benchmarks look really good, and it's acceptable if your data
 can be safely thrown away (for example, you're just caching from a slower
 source, and the cache can be rebuilt if it gets corrupted).  And yes, Linux
 has many tricks to make this benchmark look really good.  The tricks range
 from buffering the raw device to having 'dd' recognize the requested task
 and short-circuit the process of going to /dev/null or pulling from
 /dev/zero.  I can't tell you how bogus these tests are and how completely
 irrelevant they are in predicting actual workload performance.  But, I'm not
 going to stop anyone from trying, so give the above tunable a try and let me
 know how it works.

 If the computer has a UPS then write caching is fine. Even if FreeBSD
 crashes, the disk would still write the data.

I suspect that I'm encountering situations right now at Netflix where this 
advice is not true.  I have drives that are seeing intermittent errors, then 
being forced into reset after a timeout, and then coming back up with 
filesystem problems.  It's only a suspicion at this point, not a confirmed case.

Scott


Re: IBM blade server abysmal disk write performances

2013-01-18 Thread Scott Long


On Jan 18, 2013, at 1:12 PM, Dieter BSD dieter...@gmail.com wrote:
 It is inexcusable that FreeBSD defaults to leaving the write cache on
 for SATA and PATA drives.

This was completely driven by the need to satisfy idiotic benchmarkers,
tech writers, and system administrators.  It was a huge deal for FreeBSD
4.4, IIRC.  It had been silently enabled, we turned it off, released 4.4,
and then got murdered in the press for being slow.

If I had my way, the WC would be off, everyone would be using SAS,
and anyone who enabled SATA WC or complained about I/O slowness
would be forced into Siberian salt mines for the remainder of their lives.


  At least the admin can easily fix this by
 adding hw.ata.wc=0 to /boot/loader.conf.  The bigger problem is that
 FreeBSD does not support queuing on all controllers that support it.
 Not something that admins can fix, and inexcusable for an OS that
 claims to care about performance.

You keep saying this, but I'm unclear on what you mean.  Can you
explain?

Scott


Re: On cooperative work [Was: Re: newbus' ivar's limitation..]

2012-08-02 Thread Scott Long


On Aug 2, 2012, at 12:23 AM, Kevin Oberman kob6...@gmail.com wrote:

 Doug makes some good points.

No, he doesn't.  He and Arnould are being argumentative and accusatory where 
none of that is warranted.

I used to run the devsummits, and we did tele-conference lines for remote 
people to participate.  After I stepped down, others took it up and did the 
same thing.  Usually, the lines were unused.  I suspect that organizers simply 
stopped thinking about them after a while because of poor interest.  There is 
no conspiracy of exclusion here, just simple human apathy.

The invite system for the devsummit was, and still is, purely about providing 
some order to the process.  It ensures that people attending are willing to 
demonstrate a minimum amount of interest, more than just wandering by a room 
one day and dropping in for free food and wifi.  If anyone feels that they are 
being excluded, it's because they are too lazy to go beyond being argumentative 
on a mailing list.

Scott


Re: geom - cam disk

2012-07-25 Thread Scott Long

Once the bio is put into the bioq from da_strategy, the CAM scheduler is 
called.  It may or may not wind up calling dastart right away; if the simq or 
devq is frozen, or if the devq has been exhausted, then the io will be deferred 
until later and the call stack will unwind back into g_down.  The bioq can 
therefore accumulate many bio's before being drained.  Draining will usually 
happen from the camisr, at which point you can potentially have i/o being 
initiated from both the camisr and the g_down threads in parallel.  The 
monolithic locking in CAM right now prevents this from actually happening, 
though that's a topic that needs to be revisited.
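
The accumulation described above can be illustrated with a small simulation. This is a toy model under stated assumptions, not the CAM code: the class, its `openings` counter, and the method names are hypothetical stand-ins for the strategy routine, the scheduler, and the completion path, and the empty/non-empty counters simply mirror what the DTrace script quoted below measures.

```python
from collections import deque

class ToyDisk:
    """Toy model of the strategy -> schedule -> start handoff (not real CAM)."""

    def __init__(self, openings):
        self.bioq = deque()        # bios waiting to be started
        self.openings = openings   # free device-queue slots (hypothetical name)
        self.frozen = False        # stands in for a frozen simq/devq
        self.in_flight = []        # bios handed to the "hardware"
        self.empty_at_enqueue = 0
        self.nonempty_at_enqueue = 0

    def strategy(self, bio):
        # Record the queue state on entry, as the DTrace probes do.
        if self.bioq:
            self.nonempty_at_enqueue += 1
        else:
            self.empty_at_enqueue += 1
        self.bioq.append(bio)
        self.schedule()

    def schedule(self):
        # Start bios only while the queue is not frozen and slots remain.
        while self.bioq and not self.frozen and self.openings > 0:
            self.openings -= 1
            self.in_flight.append(self.bioq.popleft())

    def complete(self):
        # Completion path: free a slot, then drain more of the bioq.
        bio = self.in_flight.pop(0)
        self.openings += 1
        self.schedule()
        return bio

if __name__ == "__main__":
    disk = ToyDisk(openings=2)
    for i in range(5):
        disk.strategy(i)
    print(list(disk.bioq), disk.in_flight)
```

In this model the queue only ever looks empty on entry while free slots remain; once the slots are used up, later bios pile up until a completion drains them.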

Scott

On Jul 25, 2012, at 1:27 PM, Andriy Gapon wrote:

 
 
 Preamble.  I am trying to understand in detail how things work at GEOM - 
 CAM
 disk boundary.  I am looking at scsi_da and ata_da which seem to be twins in
 this respect.
 
 I got an impression that the bioq_disksort calls in the strategy methods and 
 the
 related queues are completely useless in the GEOM single-threaded world.
 There is only one thread, g_down, that can call a strategy method, the method
 enqueues a bio, then calls a schedule function and through xpt_schedule the 
 call
 flow continues to a start method which dequeues the bio and off it goes.
 I currently cannot see how a bio queue can accumulate more than one bio.
 
 What am I missing? :-)
 I will be very glad to learn more about this layer if anyone is willing to
 educate me.
 Thank you in advance.
 
 P.S. I wrote a very simple DTrace script to test my theory experimentally,
 and my testing with various workloads didn't disprove the theory so far
 (which doesn't mean that it is correct, of course).
 
 The script:
 fbt::bioq_disksort:entry
 /args[0]->queue.tqh_first == 0/
 {
     @["empty"] = count();
 }
 
 fbt::bioq_disksort:entry
 /args[0]->queue.tqh_first != 0/
 {
     @["non-empty"] = count();
 }
 
 It works on all bioq_disksort calls, but I am stressing only ada disks at the 
 moment.
 -- 
 Andriy Gapon
 


Re: Improving the kernel/i386 timecounter performance (GSoC proposal)

2009-03-30 Thread Scott Long


David Xu wrote:

Julian Elischer wrote:

David Xu wrote:

David Xu wrote:

Julian Elischer wrote:


depends on the hardware.
anyhow I was only saying it was possible, not necessarily
good or even useful.




I did some work on a thread-private page shared by the kernel
and userland when I was working on userland spinlocks: if userland asks
for a page, the kernel allocates it and the scheduler etc. put some
interesting things in it. This code may be useful.


FYI:
http://people.freebsd.org/~davidxu/schedctl/


reading this quickly, you allocate a separately addressed page for
each thread, but,  how do you use it?



I store the address in the userland TLS area, then get it when I want to
check some scheduling information.



Interesting, I was wondering earlier today if pointing to the per-thread
syspage from the TLS area would save the TLB invalidate that you were
concerned about.

Scott


Re: amr driver broken since March 12

2009-03-29 Thread Scott Long


Danny Braniss wrote:

Danny Braniss wrote:

it seems March 12 was a bit off :-)
it took some time, but I managed to close the gap:
189100  ok
189150  fails
I will continue tomorrow, but this should be helpful.



189150 is in the middle of a big string of related commits.  Try
updating to the following change numbers and retesting:

189088
189107
189161

If the last one does not work, try editing /sys/dev/amr/amr.c to change

#define AMR_ENABLE_CAM 1

to

#define AMR_ENABLE_CAM 0

Scott


189161 works, also for the iir
now what?



Next set to try:

189219
189229
189253
189402
189531
189569
189591

Scott

Re: amr driver broken since March 12

2009-03-29 Thread Scott Long


Danny Braniss wrote:

Danny Braniss wrote:

Danny Braniss wrote:

it seems March 12 was a bit off :-)
it took some time, but I managed to close the gap:
189100  ok
189150  fails
I will continue tomorrow, but this should be helpful.



189150 is in the middle of a big string of related commits.  Try
updating to the following change numbers and retesting:

189088
189107
189161

If the last one does not work, try editing /sys/dev/amr/amr.c to change

#define AMR_ENABLE_CAM 1

to

#define AMR_ENABLE_CAM 0

Scott

189161 works, also for the iir
now what?


Next set to try:

189219

broken

189229

broken


Ok, so 189161 works, 189219 doesn't, correct?  If so, did you also make 
the change to amr.c yet?


Scott
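
The narrowing-down done by hand in this thread amounts to a binary search over revision numbers. The sketch below is only an illustration: `first_bad` and the `is_good` callback are hypothetical names, and in practice "testing" a revision means updating to it, rebuilding the kernel, and booting it. It also assumes a single good-to-bad transition, which does not hold for a revision that lands in the middle of a string of related commits.

```python
def first_bad(revisions, is_good):
    """Return the earliest revision for which is_good() is False.

    Assumes `revisions` is sorted ascending, revisions[0] is known good,
    revisions[-1] is known bad, and there is exactly one transition.
    """
    lo, hi = 0, len(revisions) - 1
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if is_good(revisions[mid]):
            lo = mid   # breakage is after mid
        else:
            hi = mid   # breakage is at or before mid
    return revisions[hi]

if __name__ == "__main__":
    # Candidate revisions taken from the lists in this thread.
    candidates = [189161, 189219, 189229, 189253,
                  189402, 189531, 189569, 189591]
    # Stand-in for a real build-and-boot test, encoding the reports so far.
    print(first_bad(candidates, lambda rev: rev <= 189161))
```

Each real test costs a kernel build and a reboot, so halving the candidate range per step matters more here than it would for a quick automated test.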

Re: amr driver broken since March 12

2009-03-28 Thread Scott Long


Danny Braniss wrote:

it seems March 12 was a bit off :-)
it took some time, but I managed to close the gap:
189100  ok
189150  fails
I will continue tomorrow, but this should be helpful.




189150 is in the middle of a big string of related commits.  Try
updating to the following change numbers and retesting:

189088
189107
189161

If the last one does not work, try editing /sys/dev/amr/amr.c to change

#define AMR_ENABLE_CAM 1

to

#define AMR_ENABLE_CAM 0

Scott

Re: amr driver broken since March 12

2009-03-27 Thread Scott Long


Danny Braniss wrote:

Danny Braniss wrote:

at least for me :-)
[and sorry for the cross posting]

old (March 12 , i know need the svn rev number but...)
None of the commit activity on March 12 is jumping out at me as being 
suspicious.  However, you are now the second person who has told me 
about AMR problems in 7.1 recently.  If you have a precise svn change
number, it would help greatly.

Scott

my bad. the last working amr/iir is from March 12.
I first detected the problem sometime later, but not later than March 23.
So it has to be changes in that time frame.

both drivers are showing similar symptoms:
waiting for not busy
the iir goes on for ever, and it's the cam that eventually panics,
 run_interrupt_driven_hooks: still waiting after 300 seconds for xpt_config
(actually not 100% true, depending if WITNESS is on or off, it sometimes
just hangs).
the amr seems to time out:
amr0: adapter is busy

thanks for looking into the problem,

danny




Ok, here are a series of revisions to step through, in forward order.
Make sure that you are starting with at least revision 189568.  Then,
update to exactly the revision numbers below, recompile the kernel, and
test:

190087
190091


Re: Improving the kernel/i386 timecounter performance (GSoC proposal)

2009-03-27 Thread Scott Long

I've been talking about this for years.  All I need is help with the VM 
magic to create the page on fork.  I also want two pages, one global
for gettimeofday (and any other global data we can think of) and one
per-process for static data like getpid/getgid.

Scott


Sergey Babkin wrote:

   (Sorry for the top quoting). Probably the best implementation of
   gettimeofday() is to have a page in the kernel mapped read-only to
   all the user processes. Put the kernel's idea of time into this page.
   Then getting the time becomes a simple read (OK, two reads, to make
   sure that no update has happened in between).

   The TSC can then be used to add the precision between the ticks of
   the kernel timer: i.e. remember the value of TSC when the last tick
   happened, and the highest rate at which TSC may be ticking at this
   CPU, and export it in the same page. This would guarantee that the
   time is not moving back.

   However there are more issues with TSC. TSC is guaranteed to have
   the same value on all the processors that share the same system bus.
   But if the machine is built of multiple buses with bridges between
   them, all bets are off. Each bus may be stopped, restarted and
   clocked separately. There is no way to tell on which CPU the process
   is currently running, and it may be rescheduled to a different CPU
   right before or after the RDTSC instruction.
   -SB
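
One common way to realize the "two reads" check on a shared read-only page is a generation counter. The sketch below is an assumption about how such a scheme could look, not code from this thread: the class, field, and function names are hypothetical, it runs single-threaded purely to show the protocol, and a real implementation would need memory barriers and atomic accesses.

```python
class SharedTimePage:
    """Stand-in for a kernel page mapped read-only into user processes."""
    def __init__(self):
        self.gen = 0    # even: contents stable, odd: update in progress
        self.sec = 0
        self.nsec = 0

def kernel_update(page, sec, nsec):
    """Writer side: bump the generation before and after changing the time."""
    page.gen += 1       # now odd, readers must retry
    page.sec = sec
    page.nsec = nsec
    page.gen += 1       # even again, contents are consistent

def user_read(page, max_tries=100):
    """Reader side: accept the data only if the generation did not move."""
    for _ in range(max_tries):
        before = page.gen
        sec, nsec = page.sec, page.nsec
        after = page.gen
        if before == after and before % 2 == 0:
            return sec, nsec
    raise RuntimeError("page kept changing, giving up")

if __name__ == "__main__":
    page = SharedTimePage()
    kernel_update(page, 1238100000, 500)
    print(user_read(page))
```

The reader never takes a lock or enters the kernel; it only pays for a retry when an update happens to overlap its read.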
   Mar 26, 2009 06:55:04 PM, ...@phk.freebsd.dk wrote:

 In message 17560ccf0903260551v1f5cba9eu8 7727c0bae7b...@mail.gmail.com,
 Prashant Vaibhav writes:

 The gettimeofday() function's implementation will then be
 changed to read the timestamp counter (TSC) from the processor,
 and use the reading in conjunction with the timing info exported
 by the kernel to calculate and return the time info in proper format.

 I take it as read, that you know that there are other relevant
 functions than gettimeofday() and that these must provide a
 monotonic timescale when queried interleaved ?

 Be aware that the TSC may not be, and may not stay synchronized
 across multiple cores.

 Further more, the TSC is not constant frequency and in particular
 not known frequency at all times.

 There are a lot of nasty cases to check, and a nasty interpolation
 required, which, in my tests some years back, totally negated any
 speedup from using the TSC in the first place.

 At the very minimum, you will have to add a quirk table where
 known good {CPU+MOBO+BIOS} combinations can be entered, as we
 find them.

 This will also pave way for optionally making the
 FreeBSD kernel tickless,

 Rubbish. Timecounters are not even closely associated with the
 tick or ticklessness of the kernel. [1]

  - The TSC frequency might change on certain processors with
  non-constant TSC rate (because of SpeedStep, dynamic freq scaling
  etc.). The only way to combat this is that the kernel be notified
  every time the processor frequency changes. Every cpu frequency
  driver will need to be updated to notify the kernel before and
  after a cpu freq change.

 That is not good enough, the bios may autonomously change the cpu
 speed and the skew from not knowing exactly _when_ and _how_ the cpu
 clock changed, is a significant number of microseconds, plenty of time
 to make strange things happen.

 You will want to study carefully Dave Mills work to tame the alpha
 chips wandering SAW clocks.

 Poul-Henning

 [1] In my mind, reworking the callout system in the kernel would
 be a much better more needed and much more worthwhile project.

 --
 Poul-Henning Kamp | UNIX since Zilog Zeus 3.20
 ...@freebsd.org | TCP/IP since RFC 956
 FreeBSD committer | BSD since 4.3-tahoe
 Never attribute to malice what can adequately be explained by
 incompetence.

Re: Improving the kernel/i386 timecounter performance (GSoC proposal)

2009-03-27 Thread Scott Long


Robert Watson wrote:


On Fri, 27 Mar 2009, Scott Long wrote:

I've been talking about this for years.  All I need is help with the 
VM magic to create the page on fork.  I also want two pages, one 
global for gettimeofday (and any other global data we can think of) 
and one per-process for static data like getpid/getgid.


FWIW, there are some variations in schemes across OS's -- one extreme is 
the Linux approach, which actually exports a mini shared library in ELF 
format on the shared page, providing implementations of various services 
(such as entering system calls), time stuff, etc.  Less extreme are the 
shared pages offered on Mac OS X, etc.




Yes, but I'd like to start somewhere, and considering that it's been
impossible in _5_ years to get the 30 minutes of Peter or JeffR or JHB
time to get the basic VM magic done, I'm keeping my expectations as
modest as possible.

Scott

Re: CFT: Graphics support for /boot/loader

2009-02-05 Thread Scott Long


Julian Elischer wrote:

Max Laier wrote:

On Thursday 05 February 2009 23:18:36 Oliver Fromme wrote:

I have posted detailed instructions on the FreeBSD wiki:

http://wiki.freebsd.org/OliverFromme/BootLoaderTest

Any kind of feedback is welcome.


quick test in qemu - works well.  Very cool!



can you send a screenshot for those of us who can't test it now?


http://wiki.freebsd.org/OliverFromme/BootLoader

Re: CFT: Graphics support for /boot/loader

2009-02-05 Thread Scott Long


Oliver Fromme wrote:

Hello fellow hackers,

Some of you might remember that I'm working on graphics
support for our /boot/loader.  Unfortunately, progress has
been rather slow because of non-FreeBSD-related activity.

Anyway, I have now prepared a tarball containing a loader
binary for public testing.  If you are eager to give it a
try, please feel free to do so.  It should work with any
FreeBSD version on i386 and amd64 platforms.

I have posted detailed instructions on the FreeBSD wiki:

http://wiki.freebsd.org/OliverFromme/BootLoaderTest

Any kind of feedback is welcome.



I think that this is really neat; you've done an impressive job
with it.  However, I do take issue with your criticism
of the ASCII logo; I actually spent a decent amount of time
designing the block text logo =-)  I wish that there hadn't been
moronic politics over the beastie logo, as that does look a lot
better, even if it is text.  And text is still required for
serial consoles.

Scott

Re: strange behaviour with /sbin/init and serial console

2008-10-31 Thread Scott Long


Ed Schouten wrote:

Hello Thierry,

* Thierry Herbelot [EMAIL PROTECTED] wrote:
with the following patch on /sbin/init, I have two different behaviours 
depending on the console type (on an i386/32 PC) :

- on a video console, I see the expected two messages,
- on a serial console, the messages are not displayed (init silently finishes 
its job and gets to start /etc/rc and everything)


I assume that the writev system call is implemented in 
src/sys/kern/tty_cons.c::cnwrite(), but I could not parse the code to find an 
explanation.


any taker ?

TfH

PS : this is initially for a RELENG_6 machine, but the code is quite similar 
under RELENG_7 or Current


Data written to /dev/console is not multiplexed to all console
devices; it goes only to the first active device in the list. The reason
is that multiplexing adds a lot of complexity to the console code,
especially related to polling and reading on /dev/console.

This weekend I'm going to commit a replacement implementation of
/dev/console, which also has this restriction.



The multiplexed console feature is one thing that Linux got right.  In a
corporate setting, you really need both a serial console and a video
console in order to effectively manage the machines, as you want to be
able to access them both remotely and locally.  While it might be hard
to build multiplexing into the console driver, do you think it would be
possible to layer a multiplexer on top of it, similar to how the kbdmux
driver works?

Scott

Re: fs/udf: vm pages overlap while reading large dir [patch]

2008-02-28 Thread Scott Long


Pav Lucistnik wrote:

Andriy Gapon píše v čt 28. 02. 2008 v 10:33 +0200:

And while I have your attention, I have a related question.

I have produced a bunch of ISO9660 Level 3 / UDF hybrid media with
mkisofs, and when I mount the UDF part of them, the mount point (root
directory of media) has 0x000 permissions. Yes that's right, d---------
in ls -l. That makes the whole volume inaccessible for everyone except
root.

Is this something you can mend in our UDF driver, or should I go dig
inside mkisofs guts? Windows handles these media without any visible
problems.



I wonder if Windows even observes the permissions bits.  You'd have to
special-case the UDF driver code in FreeBSD, which is certainly possible
but not terribly attractive.  I'd be interested to see what exactly
mkisofs is doing.  Maybe it's putting permissions into extended attributes
and assuming the filesystem driver will read those instead.

Scott

Re: fs/udf: vm pages overlap while reading large dir [patch]

2008-02-28 Thread Scott Long


Andriy Gapon wrote:

on 26/02/2008 21:23 Pav Lucistnik said the following:

Pav Lucistnik píše v út 05. 02. 2008 v 19:16 +0100:

Andriy Gapon píše v út 05. 02. 2008 v 16:40 +0200:


Yay, and can you fix the sequential read performance while you're at it?
Kthx!

this was almost trivial :-)
See the attached patch, first hunk is just for consistency.
The code was borrowed from cd9660, only field/variable names are adjusted.

Just tested it with my shiny new Bluray drive, and it works wonders.
Finally seamless playback of media files off UDF carrying media.

So, how does it look WRT committing it?



Pav,

thank you for the feedback/reminder.
In my personal opinion the latest patch posted and described at the
following link is a very good candidate for commit:
http://docs.FreeBSD.org/cgi/mid.cgi?47AA43B9.1040608
It might have a couple of style related issues, but it should improve
things a bit even if some larger VM/VFS/GEOM issues remain.

And by the way, a patch from the following PR would be a good
side-dish for the above patch:
http://www.freebsd.org/cgi/query-pr.cgi?pr=kern/78987
I think it is simple and obvious enough.



I will commit both of these to CVS today.  Thanks again.

Scott


Re: fs/udf: vm pages overlap while reading large dir [patch]

2008-02-05 Thread Scott Long


Andriy Gapon wrote:

on 04/02/2008 22:07 Pav Lucistnik said the following:

Julian Elischer píše v po 04. 02. 2008 v 10:36 -0800:

Andriy Gapon wrote:

More on the problem with reading big directories on UDF.

You do realise that you have now made yourself the official
maintainer of the UDF file system by submitting a competent
and insightful analysis of the problem?

Yay, and can you fix the sequential read performance while you're at it?
Kthx!



Pav,

this was almost trivial :-)
See the attached patch, first hunk is just for consistency.
The code was borrowed from cd9660, only field/variable names are adjusted.



Your patch looks reasonable.  Btw, for the same reason that read-ahead
makes file reading much faster, I would not change directory reading to
be 1 sector at a time (unless you also do read-ahead for it).


But there is another issue that I also mentioned in the email about
directory reading. It is UDF_INVALID_BMAP case of udf_bmap_internal,
i.e. the case when file data is embedded into a file entry.
This is a special case that needs to be handled differently.
udf_readatoffset() handles it, but the latest udf_read code doesn't.
I have a real UDF filesystem where this type of allocation is used for
small files and those files can not be read.


Oh, so directory data can also follow this convention?  Blah.  Feel free
to fix that too if you want =-)

Scott


Re: fs/udf: vm pages overlap while reading large dir

2008-02-04 Thread Scott Long


Andriy Gapon wrote:

More on the problem with reading big directories on UDF.

First, some sleuthing. I came to believe that the problem is caused by
some larger change in vfs/vm/buf area. It seems that now VMIO is applied
to more vnode types than before. In particular it seems that now vnodes
for devices have non-NULL v_object (or maybe this is about directory
vnodes, I am not sure).

Also it seems that the problem should happen for any directory with size
larger than four 2048-bytes sectors (I think that any directory with 
300 files would definitely qualify).

After some code reading and run-time debugging, here are some facts
about udf directory reading:
1. bread-ing is done via device vnode (as opposed to directory vnodes),
as a consequence udf_strategy is not involved in directory reading
2. the device vnode has a non-NULL v_object, this means that VMIO is used
3. it seems that the code assumed that non-VM buffers are used
4. bread is done on as much data as possible, up to MAXBSIZE [and even
slightly more in some cases] (i.e. code determines directory data size
and attempts to read in everything in one go)
5. physical sector number adjusted to DEV_BSIZE (512 bytes) sectors is
passed to bread - this is because of #1 above and peculiarities of buf
system
6. directory reading and file reading are quite different, because the
latter does reading via file vnode

Let's consider a simple case of directory reading 3 * PAGE_SIZE (4K)
bytes starting from a physical sector N with N%2 = 0 from media with
sector size of 2K (most common CD/DVD case):
1. we want to read 12 KB = 3 pages = 6 sectors starting from sector N,
N%2 = 0
2. 4*N is passed as a sector number to bread
3. bo_bsize of the corresponding bufobj is a media sector size, i.e. 2K
4. actual read in bread will happen from b_iooffset of 4*N*DEV_BSIZE =
N*2K, i.e. correct offset within the device
5. for a fresh read getblk will be called
6. getblk will set b_offset to blkno*bo_bsize=4*N*2K, i.e. 4 times the
correct byte offset within the device
7. allocbuf will allocate pages using B_VMIO case
8. allocbuf calculates base page index as follows: pi = b_offset/PAGE_SIZE
this means the following:
sectors N and N+1 will be bound to a page with index 4*N*2K/4K = 2*N
sectors N+2 and N+3 will be bound to the next page 2*N +1
sectors N+4 and N+5 will be bound to the next page 2*N + 2

Now let's consider a directory read of 2 sectors (1 page) starting at
physical sector N+1.
Using the same calculations as above, we see that the sector will be
tied to a page 2*(N+1) = 2*N + 2.
And this page already contains valid cached data from the previous read,
*but* it contains data from sectors N+4 and N+5 instead of N+1 and N+2.

So, we see, that because of b_offset being 4 times the actual byte
offset, we get incorrect page indexing. Namely, sector X gets associated
with different pages depending on what sector is used as a starting
sector for bread. If bread starts at sector N, then data of sector N+1
is associated with (second half of) page with index 2*N; but if bread
starts at sector N+1 then it gets associated with (the first half of)
page with index 2*(N+1).
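
The indexing mismatch described above can be reproduced with a few lines of
arithmetic.  The following is a minimal userland sketch in C, not kernel
code: the helper names (bread_blkno, buf_offset, page_index) and the
PAGE_SIZE_4K constant are made up for illustration, and the only inputs are
the values stated in the text (512-byte DEV_BSIZE, 2K sectors, 4K pages).

```c
#include <assert.h>
#include <stdint.h>

#define DEV_BSIZE    512u   /* unit of the block number passed to bread */
#define SECTOR_SIZE  2048u  /* media sector size, also bo_bsize here */
#define PAGE_SIZE_4K 4096u  /* page size assumed in the discussion */

/* Block number handed to bread for physical sector `sector` (4 * N). */
static uint64_t
bread_blkno(uint64_t sector)
{
	return sector * (SECTOR_SIZE / DEV_BSIZE);
}

/* b_offset as getblk computes it: blkno * bo_bsize (4x too large). */
static uint64_t
buf_offset(uint64_t start_sector)
{
	return bread_blkno(start_sector) * SECTOR_SIZE;
}

/*
 * Index of the page that `sector` ends up in when it is read as part
 * of a bread that starts at `start_sector`.
 */
static uint64_t
page_index(uint64_t start_sector, uint64_t sector)
{
	uint64_t byte_in_buf;

	assert(sector >= start_sector);
	byte_in_buf = (sector - start_sector) * SECTOR_SIZE;
	return (buf_offset(start_sector) + byte_in_buf) / PAGE_SIZE_4K;
}
```

With N = 10, page_index(10, 14) and page_index(11, 11) both evaluate to 22:
a read starting at sector N puts sector N+4 into the same page that a read
starting at sector N+1 expects to hold sector N+1.  Likewise sector 11 maps
to page 20 in one read and page 22 in the other.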

Previously, before VMIO change, data for all reads was put into separate
buffers that did not share anything between them. So the problem was
limited only to wasting memory with duplicate data, but no actual
over-runs did happen. Now we have the over-runs because the VM pages are
shared between the buffers of the same vnode.

One obvious solution is to limit bread size to 2*PAGE_SIZE = 4 *
sector_size. In this case, as before, we would waste some memory on
duplicate data but we would avoid page overruns. If we limit bread size
even more, to 1 sector, then we would not have any duplicate data at
all. But there would still be some resource waste - each page would
correspond to one sector, so 4K page would have only 2K of valid data
and the other half in each page is unused.

Another solution, which to me seems to be better, is to do usual
reading - pass a directory vnode and 2048-byte sector offset to bread.
In this case udf_strategy will get called for blkno translation, all
offsets and indexes will be correct and everything will work perfectly
as it is the case for all other filesystems.
The only overhead in this case comes from the need to handle
UDF_INVALID_BMAP case (where data is embedded into file entry). So it
means that we have to call bmap_internal to see if we have that special
case and handle it, and if the case is not special we would call bread
on a directory vnode which means that bmap_internal would be called
again in udf_strategy.
One optimization that can be done in this case is to create a lighter
version of bmap_internal that would merely check for the special case
and wouldn't do anything else.

I am attaching a patch based on the ideas above. It fixes the problem
for me and doesn't seem to create any new ones :-)
About the patch:
hunk #1 fixes a copy/paste
hunk #2 should fix strategy to not set

Re: amrd disk performance drop after running under high load

2007-10-18 Thread Scott Long


Boris Samorodov wrote:

Hi!

Since nobody answered so far, here is my two cents. I'm not an expert
here so it's only my imho.

On Wed, 17 Oct 2007 22:52:49 +0400 Alexey Popov wrote:


interrupt                          total       rate
irq6: fdc0                             8          0
irq14: ata0                           47          0
irq16: uhci0                  1428187319       1851   <-- [1]
irq18: uhci2                    12374352         16
irq23: ehci0                           3          0
irq46: amr0                     11983237         15
irq64: em0                    1427141755       1850   <-- [2]
cpu0: timer                   1540896452       1997
cpu1: timer                   1542377798       1999
Total                         5962960971       7730


[1] and [2] look suspicious to me (the totals and rates are too close to
each other and, by the way, to the timers). Leave the timers alone for now.
Do you use any USB devices? Can you try another network card? That
behaviour looks like an interrupt storm and/or an IRQ collision.




It's neither.  It's a side effect of a feature that FreeBSD abuses for
handling interrupts.  Note that amr0 and uhci2 are behaving similarly.  It's
mostly harmless, but it does waste CPU cycles.  I wouldn't expect this
on a recent version of FreeBSD, though, at least not from the e1000
driver.

Scott

Re: iSCSI disconnects dilema

2007-01-14 Thread Scott Long

Wilko Bulte wrote:
 On Fri, Jan 12, 2007 at 09:31:04PM +0200, Danny Braniss wrote..

 On Tue, Jan 09, 2007 at 09:06:46AM +0200, Danny Braniss wrote:
 Hi,
 While I think I have almost solved the problem of network disconnects,
 it dawned on me that there is a major problem:
 When a 'local' disk crashes, the kernel will probably hang/panic/crash.
 If I don't try to recover, then there is no change in the above scenario.
 If I try to recover, then the client does not know that it should
 umount/fsck/mount.
 While all this seems familiar (removing a floppy/disk-on-key while it's
 mounted, where we could always say "you shouldn't have done that!"), with
 a network connection it can happen very often: rebooting the target, a
 network hiccup, etc.
 
 So, any ideas?
 In my opinion it should be done this way:

 You have a queue of I/O requests. You send them to the other end and wait
 for confirmation. Until confirmation is received, you keep the requests
 queued. If the other end dies, you try to reconnect (until some timeout
 expires, the processes which send those requests will just wait), if you
 reconnect successfully, you resend not-confirmed requests, if you won't
 be able to reconnect, you just pass the errors up.

 This is what I did in ggate and it seems to work.
 That is basically what I'm doing: unacked requests get requeued.
 The problem I fear (and maybe I'm paranoid :-):
 
 Paranoia is a Good Thing(TM) in data storage land :-)
 
 assume the following scenario, the client(initiator) sends a write command,
 the target acks it, then it crashes, if the write was never completed,
 the initiator goes on as nothing ever happened. 
 
 Yes, but what can the initiator do about that?  I mean, it does not have any
 visibility of what the target has (or has not) done with the data.
 
 This is roughly the same as a RAID box accepting a write into a writeback
 cache and ACK-ing to the host.  You can only assume that the RAID box's cache
 will get flushed to the spindles properly.  All the usual horror scenarios
 with a broken battery backup of the cache and a power failure etc. apply here.
 
 Wilko
 

I forget, does iSCSI have a concept of a flush_cache command, or the
equivalent of what parallel SCSI does with ordered tags?  If so, then
that's how your app or OS knows that the transaction got committed to
stable storage.  It's been long assumed in the external storage world
that you are at the mercy of the external storage cache, so the problem
that Danny is referring to is nothing new.  The real question is how
to implement the equivalent mechanism that iSCSI provides in a way that
the OS/app can make use of it.  For example, CAM issues an ordered tag
periodically to flush the disk cache to stable storage.  Most storage
drivers, including CAM, will issue some sort of a flush_cache command to
the controller and media during system shutdown.
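
The bookkeeping implied here can be sketched in a few lines of C.  This is
a hedged userland model, not CAM or iSCSI code: the wb_tracker structure and
its functions are hypothetical names, and the model simply assumes that some
flush-style command exists whose completion covers every write acknowledged
before the flush was issued.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical tracker: which acknowledged writes are known stable? */
struct wb_tracker {
	uint64_t acked;    /* sequence number of the last acknowledged write */
	uint64_t durable;  /* last sequence number covered by a finished flush */
};

static void
wb_init(struct wb_tracker *t)
{
	t->acked = 0;
	t->durable = 0;
}

/* The target acknowledged one more write; returns its sequence number. */
static uint64_t
wb_write_acked(struct wb_tracker *t)
{
	return ++t->acked;
}

/* Snapshot taken at the moment a flush is issued. */
static uint64_t
wb_flush_begin(const struct wb_tracker *t)
{
	return t->acked;
}

/* The flush completed: writes acked before it was issued are stable. */
static void
wb_flush_done(struct wb_tracker *t, uint64_t snapshot)
{
	assert(snapshot <= t->acked);
	if (snapshot > t->durable)
		t->durable = snapshot;
}

/* An ack alone proves nothing; only a completed flush does. */
static bool
wb_is_durable(const struct wb_tracker *t, uint64_t seq)
{
	return seq != 0 && seq <= t->durable;
}
```

A write acknowledged after wb_flush_begin() but before wb_flush_done() stays
"not durable" until the next flush, which is exactly the window in which a
target crash loses data the initiator believes it has written.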

Scott


Re: iSCSI disconnects dilema

2007-01-14 Thread Scott Long

Danny Braniss wrote:
 I forget, does iSCSI have a concept of a flush_cache command, or the
 equivalent of what parallel SCSI does with ordered tags?
 
 Not really, or I can't find it. iSCSI is mainly an envelope for
 SCSI commands, so whatever CAM does, it will pass it on.
 There are some management commands, so the target can tell the initiator
 that it's going down, for example (and what should the driver
 do in such a case in FreeBSD?)
 

If the periph is open (i.e. mounted), I'd just ignore this and have the
stack go through a normal retry timeout cycle to see if the device comes
back.  If it's closed, then I'd remove the periph.  Knowing if it's
opened or closed is likely hard to do from the iSCSI driver, which is
one reason why iSCSI knowledge needs to eventually be moved upwards in
CAM.

   If so, then
 that's how your app or OS knows that the transaction got committed to
 stable storage.  It's been long assumed in the external storage world
 that you are at the mercy of the external storage cache, so the problem
 that Danny is referring to is nothing new.  The real question is how
 to implement the equivalent mechanism that iSCSI provides in a way that
 the OS/app can make use of it.  For example, CAM issues an ordered tag
 periodically to flush the disk cache to stable storage. 
 nice, (or wishful thinking :-), the scsi part of iSCSI is/can be 
 software/virtual.
 

If the target device returns a successful completion from a command, the
initiator must assume that it's not lying.  You could do a flush/sync
cache command after every I/O, but then you'd have a completely
unacceptable level of performance.  But again, this is not a new problem
specific to iSCSI.  It's long been a design consideration of external
storage, and is why external storage 1) carries a high price tag to
accompany good engineering and testing, and 2) comes with some form of
battery backup, to prevent data loss in case of power loss.

 Most storage
 drivers, including CAM, will issue some sort of a flush_cache command to
 the controller and media during system shutdown.
 
 this took me a long time to fix! the userland program got killed at shutdown,
 the link was lost, and so there was no way to flush buffers, fixed by calling
 fget(...) too.
 
 I guess I can summarize: (and use the 3 monkey law :-)
   1- assume the target is 'well behaved' and will flush cache.
   2- there is - currently - no way to tell the OS that not all
  seems to be as expected.
   3- keep quiet and hope for the best.
 danny
 
 

So you had a scenario where a program was doing I/O right up to system
(initiator) shutdown, and some of those I/O's got lost in the process?
I guess I don't understand why the OS didn't flush all outstanding I/O
buffers after terminating the program and before finishing the shutdown.
Maybe you are doing something illegal in your driver, or maybe you need
to implement a kernel shutdown hook that will allow you to block the
shutdown until everything is flushed.

Scott

Re: iSCSI/shutdown advice needed

2006-11-26 Thread Scott Long


Danny Braniss wrote:

hi,
I'm trying to finish up the iSCSI initiator,
and need some advice. To shutdown the initiator, I
need to:
1- close down the CAM-peripherals, (ie da)
2- empty up all pending iSCSI transactions
3- close the tcp connection
2 & 3 I can handle; it's 1 where I'm stuck.
Q: how can I call the peripheral close function, or is there
   some CAM command?
   I tried
	xpt_async(AC_LOST_DEVICE, isp->cam_path, NULL);

   but this is far too drastic, and will actually cause a panic
   if the device is still mounted.

danny



UFS doesn't handle devices going away unexpectedly.  Fixing it involves
massive changes to both UFS and the VM system.  However, sending the
AC_LOST_DEVICE is probably the right thing to do.

Scott

Re: SATA300 Controllers

2006-07-06 Thread Scott Long


Wilko Bulte wrote:

On Wed, Jul 05, 2006 at 08:02:55PM -0500, Derrick T. Woolworth wrote..


Hello all,

Sorry for cross-posting, but these issues seem relevant for lists...

Has anyone had success with SATA300 controllers with FreeBSD 6.1?  I've been
trying Promise and nVidia nForce4 and I'm not having any luck.  Using a MSI
K8NGM2-L motherboard and others, but 6.1's installation hangs as soon as it
sees ad4.  I've also tried using an Adaptec 1210SA controller and had zero



Well, just as a datapoint this works fine for me:

[EMAIL PROTECTED] ~: dmesg|grep -i Prom
atapci0: Promise PDC20771 SATA300 controller port
0xd480-0xd4ff,0xd000-0xd0ff mem 0xf7ff6000-0xf7ff6fff,0xf7fa-0xf7fb
irq 21 at device 13.0 on pci2
ar0: 238475MB Promise Fasttrak RAID1 status: READY
[EMAIL PROTECTED] ~: uname -a
FreeBSD freebie.xs4all.nl 6.1-STABLE FreeBSD 6.1-STABLE #2: Wed Jun 14
22:01:33 CEST 2006 [EMAIL PROTECTED]:/usr/obj/usr/src/sys/FREEBIE
i386



Promise has a good relationship with FreeBSD, I would expect their 
controllers to work pretty well.


Scott


Re: iSCSI/sendto(...)

2006-06-01 Thread Scott Long


Danny Braniss wrote:


Hi,
on a fairly new 6.1-stable, and probably before, once in a
blue moon, sendto return error 64 (EHOSTDOWN?). but the packet seems to have
been received by the target, since i get a response, and further more,
everything keeps on working.

what is error 64?

danny




EHOSTDOWN comes from the ARP layer of the IP stack, and would be
consistent with the host either getting no arp response or rejected
responses from the target.  It would be useful to run tcpdump+ethereal
on your connection to see what is really going on.
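
For the "what is error 64?" part of the question, the number can be turned
into a name and message from userland.  A minimal sketch in C follows; the
helper names is_host_down and describe_errno are made up for this example.
Note that errno numbering is platform specific: EHOSTDOWN is 64 in FreeBSD's
errno.h, but other systems assign it a different value, so comparing against
the symbolic constant is safer than comparing against 64.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>

/* Returns 1 if `err` is this platform's EHOSTDOWN, 0 otherwise. */
static int
is_host_down(int err)
{
	return err == EHOSTDOWN;
}

/* Formats "errno <n>: <message>" into buf and returns buf. */
static const char *
describe_errno(int err, char *buf, size_t len)
{
	snprintf(buf, len, "errno %d: %s", err, strerror(err));
	return buf;
}
```

In the sendto() caller this would be used right after the failing call, for
example by passing the saved errno to describe_errno() and logging the
result together with a timestamp, so the event can later be matched against
a packet capture.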

Scott


Re: iSCSI/sendto(...)

2006-06-01 Thread Scott Long


Danny Braniss wrote:

Danny Braniss wrote:



Hi,
on a fairly new 6.1-stable, and probably before, once in a
blue moon, sendto return error 64 (EHOSTDOWN?). but the packet seems to have
been received by the target, since i get a response, and further more,
everything keeps on working.

what is error 64?

danny




EHOSTDOWN comes from the ARP layer of the IP stack, and would be
consistent with the host either getting no arp response or rejected
responses from the target.  It would be useful to run tcpdump+ethereal
on your connection to see what is really going on.



too much traffic, and would be like looking for a needle in a haystack.
(i can't reproduce this at will)
the question is, if it was an error, how come the packet did go out.
need more proof for the above statement - working on it.

danny






I find that ethereal does a great job of associating packets and making
it easy to sort through mountains of data.  It's not so good at actually
collecting the packets, so I run tcpdump in raw collection mode and then
feed the output to ethereal for analysis.  Having tcpdump generate a
circular ring of files that are at most 20MB works best.

Scott


Re: misc questions about the devicedriver arch

2006-05-30 Thread Scott Long

M. Warner Losh wrote:
 : THIRD
 : Because the PCIE configure space is 4k long ,shall we change the
 : #define PCI_REGMAX  255
 : to facilitate the PCI express config R/W?
 
 Maybe.  Lemme investigate because PCIe changes this from a well known
 constant for all pci busses, to a variable one...
 
 Warner

When I added PCIe extended config support, I never took into
consideration the userland access point of view.  Changing this
definition to 4096 might Just Work, and it might Not Work.  Dunno.
In the 18 months since I implemented it, no other person has asked
about userland access.  Other than the silly case of people trying
to write device drivers in PERL, I'm not sure how much value it
gives compared to the stability and security risk it imposes.

Scott


Re: misc questions about the devicedriver arch

2006-05-30 Thread Scott Long


william wallace wrote:


On 5/30/06, Scott Long [EMAIL PROTECTED] wrote:


M. Warner Losh wrote:
 : THIRD
 : Because the PCIE configure space is 4k long ,shall we change the
 : #define PCI_REGMAX  255
 : to facilitate the PCI express config R/W?

 Maybe.  Lemme investigate because PCIe changes this from a well known
 constant for all pci busses, to a variable one...

 Warner

When I added PCIe extended config support, I never took into
consideration the userland access point of view.  Changing this
definition to 4096 might Just Work, and it might Not Work.  Dunno.
In the 18 months since I implemented it, no other person has asked
about userland access.  Other than the silly case of people trying
to write device drivers in PERL, I'm not sure how much value it
gives compared to the stability and security risk it imposes.

Scott



I have to clarify my intentions: I am not trying to write a userland
PCI Express driver.  I just want to make an interesting branch which
can do native PCI Express hot plug and hot removal.  With Mr. Losh's and
other gentlemen's help I am making progress, and a loadable
module is now nearly finished.
I have borrowed many ideas from Linux, but several serious difficulties
have stalled me, PCI_REGMAX among them.
I hope to hear from you :) Thank you!



The PCI_REGMAX definition is not used by the extended configuration
space code.  However, this code only exists on i386 right now.  I
haven't gotten around to implementing it on amd64 yet.  Implementing
it there is not just a trivial change of the definition.  Some platform
specific memory map tricks need to be done.  It would be possible to
port the i386 code wholesale, but that code is not terribly efficient
on the amd64 platform.
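
To make the size difference concrete, here is a small sketch in C of how a
register in the 4K extended configuration space is addressed when the
standard memory-mapped PCIe layout is assumed (each function gets its own
4K window).  This is illustrative arithmetic only, not the i386 code
mentioned above; pcie_cfg_offset and the two size constants are names
invented for this example, and the base address of the window, which comes
from the platform, is deliberately left out.

```c
#include <assert.h>
#include <stdint.h>

#define PCI_LEGACY_CFG_SIZE 256u   /* registers 0..255, i.e. PCI_REGMAX 255 */
#define PCIE_EXT_CFG_SIZE   4096u  /* PCIe extended configuration space */

/*
 * Byte offset of register `reg` of bus/slot/func inside the memory-mapped
 * configuration window: 4K per function, 8 functions per slot, 32 slots
 * per bus.
 */
static uint64_t
pcie_cfg_offset(unsigned bus, unsigned slot, unsigned func, unsigned reg)
{
	assert(bus < 256 && slot < 32 && func < 8 && reg < PCIE_EXT_CFG_SIZE);
	return ((uint64_t)bus << 20) | ((uint64_t)slot << 15) |
	    ((uint64_t)func << 12) | reg;
}
```

Registers at or above PCI_LEGACY_CFG_SIZE (for example offset 0x100, where
extended capabilities start) are the ones that cannot be expressed with a
register limit of 255, and mapping the window that holds them is the
platform-specific part.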

So, what problem are you running into?

Scott


Re: misc questions about the devicedriver arch

2006-05-22 Thread Scott Long


william wallace wrote:

[...]


MSI:
I've bantered around different suggestions for an API that will support
this.  The basic thing that a driver needs from this is to know
exactly how many message interrupt vectors are available to it.  It
can't just register vectors and handlers blindly since the purpose of
MSI is to assign special meanings to each vector and allow the driver to
handle each one specifically.


[...]

I just wanted to briefly say that an MSI implementation has been done
recently, and that it should start getting wider circulation and review
soon.  That's not to say that more work and design can't be done in this
area, but we should probably wait a bit and see what has been done
already.

Scott


Re: Re: Re: help: How to map a physical address into a kernel address?

2006-05-17 Thread Scott Long

[EMAIL PROTECTED] wrote:

 Hi guys:
 
 The attached file is the sample codes of my HBA driver. I make notes on the
 place where the address transfer is needed. Please make comments if
 possible. 
 
 Thanks a lot!
 
 Hong 
 

It looks like the primary question that you are asking in the code is this:

How to get the kernel virtual address of csio->data_ptr?

Correct?  The answer is that csio->data_ptr is a kernel virtual address
already if the CAM_DATA_PHYS flag is not set.  For prepare_sg_table,
you can just ignore the case where it isn't a virtual address unless you
expect to also write software that will use the flag (CAM was originally
written for an application that did use this flag, but its use is no longer
common).  As for ft_map_sg, the only way that you can be in there is if
CAM_DATA_PHYS was not set, so it's safe to say that csio->data_ptr is a
kernel virtual address.

One thing to note about your code is that Local_StartIO should be called
from within ft_map_sg instead of ft_cam_action.  That way the EINPROGRESS
status of bus_dmamap_load will be handled correctly.  I can describe this
in more detail if you have questions.
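
The reason for starting the I/O from the callback can be shown with a small
userland model in C.  Everything here is a stand-in: fake_dmamap_load,
fake_dmamap_resources_freed, map_sg_cb and submit are hypothetical names,
and the callback signature is simplified compared to the real bus_dma
callback (which also receives the segment list).  The only behaviour taken
from the text is that the load may return EINPROGRESS and run the callback
later.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

typedef void (*load_cb_t)(void *arg, int error);

/* Hypothetical stand-in for a DMA map. */
struct fake_dmamap {
	int       resources_ready; /* 0 simulates a mapping shortage */
	load_cb_t pending_cb;      /* callback deferred by the load */
	void     *pending_arg;
};

/* Runs cb immediately, or defers it and returns EINPROGRESS. */
static int
fake_dmamap_load(struct fake_dmamap *m, load_cb_t cb, void *arg)
{
	if (!m->resources_ready) {
		m->pending_cb = cb;
		m->pending_arg = arg;
		return EINPROGRESS;
	}
	cb(arg, 0);
	return 0;
}

/* Resources became available: run the deferred callback, if any. */
static void
fake_dmamap_resources_freed(struct fake_dmamap *m)
{
	m->resources_ready = 1;
	if (m->pending_cb != NULL) {
		load_cb_t cb = m->pending_cb;

		m->pending_cb = NULL;
		cb(m->pending_arg, 0);
	}
}

struct request {
	int started;               /* how many times the I/O was started */
};

/* Plays the role of ft_map_sg: the only place the I/O is started. */
static void
map_sg_cb(void *arg, int error)
{
	struct request *req = arg;

	if (error == 0)
		req->started++;    /* Local_StartIO would go here */
}

/* Plays the role of ft_cam_action: it never starts the I/O itself. */
static int
submit(struct fake_dmamap *m, struct request *req)
{
	int error = fake_dmamap_load(m, map_sg_cb, req);

	return (error == EINPROGRESS) ? 0 : error;
}
```

If submit() started the I/O itself after the load returned, a deferred load
would start it before any mapping existed; with the start inside the
callback, the request is started exactly once in both the immediate and the
deferred case.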

Scott


Re: help:How to map a physical address into a kernel address?

2006-05-16 Thread Scott Long


[EMAIL PROTECTED] wrote:


Hi guys:

 


To access the sg_table through a kernel address, I need to map the starting
physical address of a segment into a kernel address. As far as I know, in
Linux we can use phys_to_virt()/bus_to_virt(), or kmap()/kmap_atomic(), to
map a bus/physical address or a physical page into a kernel address, but I
did not find such a function in FreeBSD. Please help me with this, it is
very urgent!

 


Thanks a lot!



What kind of memory are you trying to access?  Are you trying to access
memory on the card that is pointed to by PCI base address registers?  If
so then you need to use the bus_space API.  Or are you trying to 
allocate memory in the kernel and then give the physical address of that
memory to your card?  If so then you need to use bus_dma.  Both Warner 
and I are happy to help guide you with these APIs.


Scott


Re: RFC: Optionally verbose SYSINIT

2006-05-11 Thread Scott Long


This would be awesome, please do it.

Scott


Benno Rice wrote:

One of the things that I found useful both in starting the PowerPC port 
and in doing the XScale stuff I'm working on is making the SYSINIT stuff 
done by mi_startup() verbose.  This generally requires hacking your own 
code into mi_startup() to print out which SYSINIT you're up to and the 
like.  jhb recently pointed me at this version he wrote which uses DDB 
to look up the symbol corresponding to the SYSINIT in question which 
makes it even more useful.


I would like to commit this version, which I've made optional based on a 
VERBOSE_SYSINIT option, so as to make it available to anyone else 
further down the line who's porting to a new architecture.


Comments?  Questions?




Index: conf/options
===================================================================
RCS file: /home/ncvs/src/sys/conf/options,v
retrieving revision 1.540
diff -u -r1.540 options
--- conf/options        7 May 2006 18:12:17 -0000       1.540
+++ conf/options        11 May 2006 05:34:26 -0000
@@ -158,6 +158,7 @@
 TURNSTILE_PROFILING
 TTYHOG          opt_tty.h
 VFS_AIO
+VERBOSE_SYSINIT opt_global.h
 WLCACHE         opt_wavelan.h
 WLDEBUG         opt_wavelan.h
 
Index: kern/init_main.c
===================================================================
RCS file: /home/ncvs/src/sys/kern/init_main.c,v
retrieving revision 1.262
diff -u -r1.262 init_main.c
--- kern/init_main.c    7 Feb 2006 21:22:01 -0000       1.262
+++ kern/init_main.c    11 May 2006 05:35:21 -0000
@@ -84,6 +84,9 @@
 #include <vm/vm_map.h>
 #include <sys/copyright.h>
 
+#include <ddb/ddb.h>
+#include <ddb/db_sym.h>
+
 void mi_startup(void);                         /* Should be elsewhere */
 
 /* Components of the first process -- never freed. */
@@ -169,6 +172,11 @@
        register struct sysinit **xipp;         /* interior loop of sort*/
        register struct sysinit *save;          /* bubble*/
 
+#if defined(VERBOSE_SYSINIT)
+       int last;
+       int verbose;
+#endif
+
        if (sysinit == NULL) {
                sysinit = SET_BEGIN(sysinit_set);
                sysinit_end = SET_LIMIT(sysinit_set);
@@ -191,6 +199,14 @@
                }
        }
 
+#if defined(VERBOSE_SYSINIT)
+       last = SI_SUB_COPYRIGHT;
+       verbose = 0;
+#if !defined(DDB)
+       printf("VERBOSE_SYSINIT: DDB not enabled, symbol lookups disabled.\n");
+#endif
+#endif
+
        /*
         * Traverse the (now) ordered list of system initialization tasks.
         * Perform each task, and continue on to the next task.
@@ -206,9 +222,38 @@
                if ((*sipp)->subsystem == SI_SUB_DONE)
                        continue;
 
+#if defined(VERBOSE_SYSINIT)
+               if ((*sipp)->subsystem > last) {
+                       verbose = 1;
+                       last = (*sipp)->subsystem;
+                       printf("subsystem %x\n", last);
+               }
+               if (verbose) {
+#if defined(DDB)
+                       const char *name;
+                       c_db_sym_t sym;
+                       db_expr_t  offset;
+
+                       sym = db_search_symbol((vm_offset_t)(*sipp)->func,
+                           DB_STGY_PROC, &offset);
+                       db_symbol_values(sym, &name, NULL);
+                       if (name != NULL)
+                               printf("   %s(%p)... ", name, (*sipp)->udata);
+                       else
+#endif
+                               printf("   %p(%p)... ", (*sipp)->func,
+                                   (*sipp)->udata);
+               }
+#endif
+
                /* Call function */
                (*((*sipp)->func))((*sipp)->udata);
 
+#if defined(VERBOSE_SYSINIT)
+               if (verbose)
+                       printf("done.\n");
+#endif
+
                /* Check off the one we're just done */
                (*sipp)->subsystem = SI_SUB_DONE;
 






Re: FreeBSD 6.1 Released

2006-05-11 Thread Scott Long


Mike Jakubik wrote:


Jonathan Noack wrote:


The *entire* errata page was from 6.0; it was a mistake.  This wasn't
some "put on the rose-colored glasses and gloss over major issues"
thing.  It was a long release cycle and something was forgotten.  C'est
la vie.  It's always a good idea to check the most up-to-date version of
the errata page on the web anyway, so it's *not* too late to update it.

  



How convenient. These problems needed to be addressed in the release
notes, not some online version.



So, you're still waiting for Scott to personally fix the problems and he
couldn't deliver?  Huh?  I quote you
(http://lists.freebsd.org/pipermail/freebsd-stable/2006-May/025209.html):
Scott, thanks for the very generous gesture, but i cant ask you
something like this.
  



He emailed me personally, i accepted his help offer, never heard from 
him since.




Sorry, things got lost in the shuffle to get this released.  For your
specific snapshot deadlocks, please test the changes that have gone into
7-CURRENT and report back if they fix your problem.  Only with active 
testing will we know if they are good to be backported in time for 6.2.


Scott


Re: Atomic updates of NFS export lists

2006-05-10 Thread Scott Long


Andrey Simonenko wrote:

Greetings,

In my environment non-atomic updates of NFS export lists are not
acceptable.  So, I decided to correct this problem.  As the result
mountd, kern/vfs_export.c were completely rewritten, mount.h,
vfs_mount.c and nfs_srvsubs.c also got changes.

For details see kern/9619.



I've been looking at this since my company is also running into these
problems.  I've integrated your patchset into my tree, and I'll let you
know how it works after a few days of testing.  One thing to note is
that you've significantly re-written much of mountd, as well as changed
the API/ABI a bit and removed some command line switches.  That makes it
less attractive for inclusion in RELENG_6, but is fine for 7-CURRENT.
With that in mind, you should switch over to using nmount() instead of
mount(), that way you can completely remove the per-filesystem handling
code that you added.

If there is any way that you can trim the changes to just implement the
new export primitives and leave out the libsock stuff, it would be much
easier to justify getting into RELENG_6.  I don't have an opinion on the
libsock design, but you should talk to people like Robert Watson about
that before this goes into 7-CURRENT.

But thank you very much for this.  It was a pleasant surprise to see
this after I had been talking to others about exactly these problems for
a few weeks.  Hopefully we can get this integrated into FreeBSD soon.

Scott

FreeBSD 6.1 Released

2006-05-08 Thread Scott Long

-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1

It is my great pleasure and privilege to announce the availability of
FreeBSD 6.1-RELEASE.  This release is the next step in the development
of the 6.X branch, delivering several performance improvements, many
bugfixes, and a few new features.  These include:

~ Addition of a keyboard multiplexer.  This allows USB and PS/2 keyboards
  to coexist without any special options at boot.
~ Many fixes for filesystem stability.  High load stress tests are now run
  successfully on a regular basis as part of the normal FreeBSD QA process.
~ Automatic configuration for many Bluetooth devices, as well as automatic
  support for running WiFi access points.
~ Addition of drivers for new ethernet and SAS and SATA RAID controllers.
~ BIND updated to 9.3.2
~ sendmail updated to 8.13.6

NOTE: It was discovered at the last minute that the errata notes that were
packaged with the release are out of date.  For a complete list of known
problems, please see the online errata list, available at:

http://www.FreeBSD.org/releases/6.1R/errata.html

For more information about FreeBSD release engineering activities,
please see:

http://www.FreeBSD.org/releng

 Availability
 ------------

FreeBSD 6.1-RELEASE supports the i386, pc98, alpha, sparc64, amd64,
powerpc, and ia64 architectures and can be installed directly over the
net using bootable media or copied to a local NFS/FTP server.
Distributions for all architectures are available now.

Please continue to support the FreeBSD Project by purchasing media
from one of our supporting vendors.  The following companies will be
offering FreeBSD 6.1 based products:

~   FreeBSD Mall, Inc.    http://www.freebsdmall.com/
~   Daemonnews, Inc.      http://www.bsdmall.com/freebsd1.html

If you can't afford FreeBSD on media, are impatient, or just want to
use it for evangelism purposes, then by all means download the ISO
images.  We can't promise that all the mirror sites will carry the
larger ISO images, but they will at least be available from the
following sites.  MD5 and SHA256 checksums for the release images are
included at the bottom of this message.

 Bittorrent
 ----------

The FreeBSD project encourages the use of BitTorrent for distributing
the release ISO images.  A collection of torrent files to download the
images is available at

http://torrents.freebsd.org:8080/

 FTP
 ---

At the time of this announcement the following FTP sites have FreeBSD
6.1-RELEASE available.

ftp://ftp.FreeBSD.org/pub/FreeBSD/
ftp://ftp2.FreeBSD.org/pub/FreeBSD/
ftp://ftp3.FreeBSD.org/pub/FreeBSD/
ftp://ftp5.FreeBSD.org/pub/FreeBSD/
ftp://ftp.at.FreeBSD.org/pub/FreeBSD/
ftp://ftp2.ch.FreeBSD.org/pub/FreeBSD/
ftp://ftp.cz.FreeBSD.org/pub/FreeBSD/
ftp://ftp.ee.FreeBSD.org/pub/FreeBSD/
ftp://ftp.fi.FreeBSD.org/pub/FreeBSD/
ftp://ftp.fr.FreeBSD.org/pub/FreeBSD/
ftp://ftp2.ie.FreeBSD.org/pub/FreeBSD/
ftp://ftp.is.FreeBSD.org/pub/FreeBSD/
ftp://ftp1.ru.FreeBSD.org/pub/FreeBSD/
ftp://ftp.se.FreeBSD.org/pub/FreeBSD/
ftp://ftp.si.FreeBSD.org/pub/FreeBSD/
ftp://ftp2.tw.FreeBSD.org/pub/FreeBSD/
ftp://ftp2.uk.FreeBSD.org/pub/FreeBSD/
ftp://ftp2.us.FreeBSD.org/pub/FreeBSD/
ftp://ftp5.us.FreeBSD.org/pub/FreeBSD/

FreeBSD is also available via anonymous FTP from mirror sites in the
following countries: Argentina, Australia, Brazil, Bulgaria, Canada,
China, Czech Republic, Denmark, Estonia, Finland, France, Germany,
Hong Kong, Hungary, Iceland, Ireland, Israel, Japan, Korea, Lithuania,
Amylonia, the Netherlands, New Zealand, Poland, Portugal, Romania,
Russia, Saudi Arabia, South Africa, Slovak Republic, Slovenia, Spain,
Sweden, Taiwan, Thailand, Ukraine, and the United Kingdom.

Before trying the central FTP site, please check your regional
mirror(s) first by going to:

ftp://ftp.yourdomain.FreeBSD.org/pub/FreeBSD

Any additional mirror sites will be labeled ftp2, ftp3 and so on.

More information about FreeBSD mirror sites can be found at:

http://www.FreeBSD.org/doc/en_US.ISO8859-1/books/handbook/mirrors-ftp.html

For instructions on installing FreeBSD, please see Chapter 2 of The
FreeBSD Handbook.  It provides a complete installation walk-through
for users new to FreeBSD, and can be found online at:

http://www.FreeBSD.org/doc/en_US.ISO8859-1/books/handbook/install.html

 Acknowledgments
 

Many companies donated equipment, network access, or man-hours to
finance the release engineering activities for FreeBSD 6.1 including
The FreeBSD Foundation, FreeBSD Systems, Hewlett-Packard, Yahoo!,
Sentex Communications, and Copan Systems.

The release engineering team for 6.1-RELEASE includes:

Scott Long [EMAIL PROTECTED] Release Engineering,
Ken Smith [EMAIL PROTECTED] I386, AMD64, Sparc64 Release Building,
Mirror Site Coordination
Robert Watson [EMAIL PROTECTED] Release Engineering, Security
Doug White [EMAIL

Re: Core Duo - only one cpu being used

2006-05-04 Thread Scott Long


Erich Dollansky wrote:

Hi,

Eric Anderson wrote:


Erich Dollansky wrote:


Hi,

Eric Anderson wrote:



  PID USERNAME  THR PRI NICE   SIZE    RES STATE  C   TIME   WCPU COMMAND
   11 root        1 171   52     0K     8K CPU1   0   0:00 99.02% idle: cpu1
 2653 root        1 128    0 18564K 17560K RUN    0   0:01 34.00% cc1plus



could it be that it is just a problem with top itself?

It cannot be that CPU1 uses 99% for the idle process and 34% for the 
compiler.


Play with the other sort options. You might find the idle process
for CPU0.



Is this what you want:

$ ps -auxw | grep idle
root   11 99.0  0.0     0     8  ??  RL    7:45PM   0:00.00 [idle: cpu1]
root   12  0.0  0.0     0     8  ??  RL    7:45PM  51:04.57 [idle: cpu0]


something is really wrong here. CPU1 gets 99% of the time but uses then 
only 0 seconds while CPU0 gets 0% of the time but uses 51 hours?


CPU1 is being treated as a hyperthreading core instead of a real core,
and is being disabled per our policy on Intel hyperthreading.  By
'disabled' I mean that it is started, but it is being excluded from
scheduling decisions, and thus is only running its idle proc.  It's
also handling any interrupts that come to it, such as timer and IPI
interrupts, so it's at 99% instead of 100% for the idle proc.  There
is nothing broken about the numbers you are seeing; your system is
just running under a scheduling policy that it should not be.

This should have been fixed a week or so ago by a commit to HEAD,
RELENG_6, and RELENG_6_1 by Colin Percival.  How old is your kernel?

Scott

FreeBSD 6.1-RC2 available

2006-05-02 Thread Scott Long


All,

I'm forgoing the formal pretty announcement for 6.1-RC2 because the
message needs to get out and I don't have an hour to spend on making it
look nice.

FreeBSD 6.1-RC2 is available for download.  This is the last RC before
the release.  Please test it to make sure that there have been no
regressions since the last RC, and to make sure that there are no
new problems with installation.  Other than a few cosmetic tweaks,
there will be no more changes before 6.1.

The list of known issues:

- Using UFS snapshots and quotas at the same time can cause system
lockups.  There is no work-around available at this time, so please
avoid this configuration.  This will be fixed in a future release.

- Under rare and heavily loaded circumstances, it is possible to leak
ptys.  This can result in not being able to log into the
system.  The cause of this is not well understood, and it appears to
be very difficult to trigger.

- DEVFS is known to have several problems with multiple processes
doing directory listings at the same time, as well as with unmounting
DEVFS directories at the same time.  There is no known work-around
for this at this time.  This will be fixed in a future release.

- A number of improvements and fixes for various drivers have come
in at the last minute that still require much more testing and
validation.  This includes the 'if_nve' and 'if_bge' drivers in
particular.  These updates will be included in future releases.

MD5 (6.1-RC1-amd64-bootonly.iso) = 93abe294e7678e00b7391f47a01074fe
MD5 (6.1-RC1-amd64-disc1.iso) = c1b718b6752f0e48edb8b822ee9b0dc8
MD5 (6.1-RC1-amd64-disc2.iso) = 4a67ae8ed7a7852e08442205d6a5cd7c
MD5 (6.1-RC1-i386-bootonly.iso) = b56aac9ca1a868daaf5673cd21bf78f5
MD5 (6.1-RC1-i386-disc1.iso) = 12521c3f9d40f637e4cdb40ea398d072
MD5 (6.1-RC1-i386-disc2.iso) = 53615f19889fe85c41e2bcea0b2be525
MD5 (6.1-RC2-ia64-bootonly.iso) = 481e6f1899c0ba632272e7853b8ef59e
MD5 (6.1-RC2-ia64-disc1.iso) = f4601bb9089af1bcde5b751f5762f35a
MD5 (6.1-RC2-ia64-disc2.iso) = b44d5a0538b784cbb5de0a8ec23e4256
MD5 (6.1-RC2-ia64-livefs.iso) = 0fe8b66a80edaa50ac353d5471930035
MD5 (6.1-RC2-pc98-disc1.iso) = 773a64a475596d586d0a1573d88310cc

SHA256 (6.1-RC1-amd64-bootonly.iso) = 
88e072b4898692813517aa254a33f1e7469de0e590c36bfb3e92cb120ac0ad16
SHA256 (6.1-RC1-amd64-disc1.iso) = 
017e69c5461fe2c865a395830dde88c8a55e7ec83d9a195b3b619346b44f9cc6
SHA256 (6.1-RC1-amd64-disc2.iso) = 
81624f3b8dfa67ceab1dc6ec0a94c4485ad85955321c39d13c9ab4a678f776ef
SHA256 (6.1-RC1-i386-bootonly.iso) = 
ec1a3fbf53186b5bc44dbfcdc77872c847f3c55532bb62f2afb4133328e7994f
SHA256 (6.1-RC1-i386-disc1.iso) = 
e0b83f2cbd27db20f330036d0a25b8366b9e45df4b9c09354f76e584a9eb3b83
SHA256 (6.1-RC1-i386-disc2.iso) = 
de1fe5009229efd44b25bb18c4e68b03027259171cd9e017fe5bffadaa3402bb
SHA256 (6.1-RC2-ia64-bootonly.iso) = 
c044989257754fa17daa352f76c3e011dfc04b3b242c2153c7a1ec47a773d4d1
SHA256 (6.1-RC2-ia64-disc1.iso) = 
60bec7c25b8f645a9d20d3240397c7a92f42d24ff5d01b4604ece5f9ee499ccc
SHA256 (6.1-RC2-ia64-disc2.iso) = 
854048d4ba4dcf00657501d36a5fb15a94ed4c20e646031960ebc3315c3a513e
SHA256 (6.1-RC2-ia64-livefs.iso) = 
fb3fadb00c9ddb6233172a34a7d47ab80171b54410835954c20f50849359ee73
SHA256 (6.1-RC2-pc98-disc1.iso) = 
f2b5f17a3355465727613e33807964a0cf92d9c02868cd2c25440995b2c6ebfd
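
For anyone checking a download against the lists above, here is a
minimal sketch of one way to do it in Python.  It is not part of the
release tooling; the names parse_checksums and verify are made up for
illustration, and it assumes the checksum text has been saved from this
message, where mail wrapping may have pushed a digest onto the line
after the "=".

```python
import hashlib
import re

# Matches BSD-style digest lines such as "SHA256 (name.iso) = <hex>".
# The \s* after "=" also spans a newline, so digests that mail wrapping
# pushed onto the following line are still picked up.  Only full-length
# digests (64 or 32 hex characters) are accepted, so a truncated entry
# is skipped rather than half-parsed.
_ENTRY = re.compile(
    r"(MD5|SHA256) \(([^)]+)\)\s*=\s*([0-9a-fA-F]{64}|[0-9a-fA-F]{32})\b"
)


def parse_checksums(text):
    """Return {(algorithm, filename): hex digest} for every entry in text."""
    return {(algo, name): digest.lower()
            for algo, name, digest in _ENTRY.findall(text)}


def verify(path, algo, expected, chunk_size=1 << 20):
    """Hash the file at path in chunks and compare with the expected digest."""
    h = hashlib.new(algo.lower())
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest() == expected.lower()
```

Reading the file in chunks keeps memory use flat for ISO-sized images.
A caller would look up the entry for the image it downloaded, for
example the key ("SHA256", "6.1-RC2-ia64-disc1.iso"), and pass that
digest to verify() together with the local path.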



Re: Zero Copy, FreeBSD and Linus Torvalds opinion

2006-04-30 Thread Scott Long


Iantcho Vassilev wrote:

Hello guys,


On bsdnews.com I found this link, http://kerneltrap.org/node/6506, and
particularly this:

I claim that Mach people (and apparently FreeBSD) are incompetent idiots.
Playing games with VM is bad. memory copies are _also_ bad, but quite
frankly, memory copies often have _less_ downside than VM games, and bigger
caches will only continue to drive that point home.




What do you think about it?


I claim that Linus is an attention whore.  How about that?

Scott

FreeBSD 6.1-RC1 Available

2006-04-13 Thread Scott Long


Announcement


The FreeBSD Release Engineering Team is pleased to announce the
availability of FreeBSD 6.1-RC1.  It is meant to be a refinement of the
6-STABLE branch, with few dramatic changes.  A lot of bugfixes have been
made, some drivers have been updated, and some areas have been tweaked
for better performance, etc. but no large changes have been made to the
basic architecture.  This RC is late in coming due to many more bugs
being fixed, as well as a keyboard multiplexer being added.  This is
enabled by default via the 'kbdmux' driver and allows multiple keyboards
of any type to be plugged in and work at once.  In turn, the boot menu
option to handle USB keyboards specially has been removed as it is no
longer needed.  This feature has been tested for several months, but
more testing is always needed.

We encourage people to help with testing so any final bugs can be
identified and worked out.  Availability of ISO images is given below.
If you have an older system you want to update using the normal
CVS/cvsup source-based upgrade, the branch tag to use is RELENG_6_1.
Problem reports can be submitted using the send-pr(1) command.


The FreeBSD 5.5 release process is on hold while we put the final
touches on 6.1.  It will resume within 1-2 weeks with a 5.5-RC1
release.

The list of open issues and things still being worked on is on the
todo list:

http://www.freebsd.org/releases/6.1R/todo.html

Known Issues


The NDIS driver is known to not work correctly with the wpa_supplicant
package.  This will be fixed for the release.

A string termination problem can cause geom(8) commands to abort
randomly.  This will be fixed for the release.

Availability


The RC1 ISOs and FTP support are available on most of the FreeBSD
Mirror sites.  A list of the mirror sites is available here:


http://www.freebsd.org/doc/en_US.ISO8859-1/books/handbook/mirrors-ftp.html

The MD5s are:

MD5 (6.1-RC1-alpha-disc1.iso) = c9ce5255facfc44a30f34f55dd6ee2d6
MD5 (6.1-RC1-alpha-bootonly.iso) = 9277c6cd9c5dd17a9aeaa9394fe1d2f8

MD5 (6.1-RC1-amd64-bootonly.iso) = 93abe294e7678e00b7391f47a01074fe
MD5 (6.1-RC1-amd64-disc1.iso) = c1b718b6752f0e48edb8b822ee9b0dc8
MD5 (6.1-RC1-amd64-disc2.iso) = 4a67ae8ed7a7852e08442205d6a5cd7c

MD5 (6.1-RC1-i386-bootonly.iso) = b56aac9ca1a868daaf5673cd21bf78f5
MD5 (6.1-RC1-i386-disc1.iso) = 12521c3f9d40f637e4cdb40ea398d072
MD5 (6.1-RC1-i386-disc2.iso) = 53615f19889fe85c41e2bcea0b2be525

MD5 (6.1-RC1-ia64-bootonly.iso) = b2f284c8f6c28455ac59cb37ff2f6658
MD5 (6.1-RC1-ia64-disc1.iso) = e1385927a3272674512f8205aef0addc
MD5 (6.1-RC1-ia64-disc2.iso) = 8b212b2f0914f13621996a8ad8397c71
MD5 (6.1-RC1-ia64-livefs.iso) = 75fdb240f273e7f9b2cdefcab43144c2

MD5 (6.1-RC1-pc98-disc1.iso) = 65465e3298efc5122607ea3d0b3e7136

SHA256 (6.1-RC1-alpha-disc1.iso) = 
d40b8e3e1944f28c5ba1b1f55eb7b5cc22472177116b98f85f2c5bb0ffb59a5f
SHA256 (6.1-RC1-alpha-bootonly.iso) = 
43ceaf712475d00b7287d09753635383cb284ad8fc63b98608e55ab458aed157


SHA256 (6.1-RC1-amd64-bootonly.iso) = 
88e072b4898692813517aa254a33f1e7469de0e590c36bfb3e92cb120ac0ad16
SHA256 (6.1-RC1-amd64-disc1.iso) = 
017e69c5461fe2c865a395830dde88c8a55e7ec83d9a195b3b619346b44f9cc6
SHA256 (6.1-RC1-amd64-disc2.iso) = 
81624f3b8dfa67ceab1dc6ec0a94c4485ad85955321c39d13c9ab4a678f776ef


SHA256 (6.1-RC1-i386-bootonly.iso) = 
ec1a3fbf53186b5bc44dbfcdc77872c847f3c55532bb62f2afb4133328e7994f
SHA256 (6.1-RC1-i386-disc1.iso) = 
e0b83f2cbd27db20f330036d0a25b8366b9e45df4b9c09354f76e584a9eb3b83
SHA256 (6.1-RC1-i386-disc2.iso) = 
de1fe5009229efd44b25bb18c4e68b03027259171cd9e017fe5bffadaa3402bb


SHA256 (6.1-RC1-ia64-bootonly.iso) = 
2b7290e4babcb647ec8b2a499fc5e0fc6918ac20ee0432e9ce2d216c84a540fa
SHA256 (6.1-RC1-ia64-disc1.iso) = 
64c035cf52544e2088720d6e6e602f95b53f6d7b342431a1aa2f4b4c32438847
SHA256 (6.1-RC1-ia64-disc2.iso) = 
e4f2f0599e9e80edec10f30de565e34e16bd8716da6673834036ce4c66d32dc0
SHA256 (6.1-RC1-ia64-livefs.iso) = 
cac9b95ba01a69d78af0034d96868a7b93e20840f3ec0eacb7a30c7e7dd0b39a


SHA256 (6.1-RC1-pc98-disc1.iso) = 
b797113a34628130f759dc9756b741230dcd064f4519f1a477b3491a02f346ca



Re: Context switching

2006-04-10 Thread Scott Long


Nickolas wrote:

Hello All!

  I'm porting a PCI card driver from Linux to FreeBSD.
  Some initialization routines require much time (~1-2 seconds).
  Initialization of hardware should be done when the device
  special file is opened. So, I need to switch thread context.

  I'm doing it this way:

  mi_switch(SW_VOL, choosethread());

  Main trouble: system panic after program exit.

  dmesg output:
--
Fatal trap 12: page fault while in user mode
fault virtual address   = 0xbfbfe5bc
fault code              = user write, protection violation
instruction pointer     = 0x1f:0x8074604
stack pointer           = 0x2f:0xbfbfe5c0
frame pointer           = 0x2f:0xbfbfe5f8
code segment            = base 0xc090f8c0, limit 0x0, type 0x13
                        = DPL 0, pres 1, def32 1, gran 1
processor eflags        = interrupt enabled, resume, IOPL = 0
current process         = 472 (bash)
trap number             = 12
panic: page fault
--

  Please tell me how context switching should be implemented correctly.

  OS version: FreeBSD 5.4



tsleep and msleep are the appropriate ways to context switch.  mi_switch
is an implementation detail of the scheduler.

Scott


Re: Using any network interface whatsoever

2006-04-08 Thread Scott Long


Ceri Davies wrote:

On Fri, Apr 07, 2006 at 03:57:42PM -0700, Brooks Davis wrote:


On Fri, Apr 07, 2006 at 11:53:42PM +0100, Ceri Davies wrote:


I'm trying to configure a bootable image to be used in various situations
and on various (mostly unknown) hardware.

For the filesystem I can use geom_label and /dev/ufs/UnlikelyString, but I'd
also like to have it try to configure whatever interfaces the machine
happens to have via DHCP.

Other than specifying ifconfig_if0=DHCP once for every possible value of
if, is there a mechanism to do this already?


ifconfig_DEFAULT



Superb, thank you!



If you have non-Ethernet-like interfaces compiled in, you will probably
want to create empty ifconfig_if variables for them since DHCP won't
work very well there. :)



Good point, thanks again :)

Ceri


Well, the real question is why we force the details of driver names onto
users.  Network and storage drivers are especially guilty of this, but
tty devices also are annoying.

Scott

Re: Using any network interface whatsoever

2006-04-08 Thread Scott Long


Mike Meyer wrote:

In [EMAIL PROTECTED], Scott Long [EMAIL PROTECTED] typed:

Well, the real question is why we force the details of driver names onto
users.  Network and storage drivers are especially guilty of this, but
tty devices also are annoying.



Because Unix has always made the hardware details available to
administrators. Times have changed so that users now need to do things
that used to be restricted to administrators.

This historical behavior is a *good* thing. If all devices of type
foo are just named foo and assigned numbers by the system, then I
have no control over the names. If I don't care which is which, this
isn't a problem. If I do care - for instance, I want to distinguish
between the ethernet interface that's on the internet and the one
that's on my LAN, or I want root to be on the disk with the root file
system on it - then this is a PITA, because every time I add hardware
to the system, or re-arrange the cards in the cage, or similar things,
I risk breaking the system configuration. If the device names are
completely determined by the hardware settings, then this doesn't
happen.

Real world examples of this type of breakage include a FreeBSD 4.x
system with SCSI disks that failed to boot when a USB mass storage
device was plugged into it, and a Solaris system that started swapping
on its Ingres raw database partition after a disk was added.

If a system is meant for desktop use where you typically have at most
one of anything, then hiding the names from the users is a good
thing. In a server environment, where you may have multiple instances
of several different device types, then being able to easily tell
which is which is a good thing.

mike


Your argument here doesn't really make sense.  You're saying that
instead of /dev/da0, we should have
/dev/HITACHI-HUS103073FL3800-SA19-B0T1L0, and instead of em0, it should
be em0-192.168.254.199-24-192.168.254.1-192.168.254.255, right?  That
way all the information is present and there is no chance of mixing up
devices.

I'm not saying that we should get rid of the device information.  I'm
fully happy making it available to top layer applications.
Administrators definitely need the information to make good decisions.
But the information isn't always needed, and it does make simple
management tasks harder.  It also adds complexity that can lead to
problems.  Why when I add a RAID driver do I also need to hack up
sysinstall so that it'll recognise the RAID devices?  This is 2006, not
1976!  The computer should be helping us in administration tasks, not
hiding behind inconsistent and obscure names.

Now, for your specific case of SCSI, it is possible to wire down device
assignments by the administrator.  It's been documented how to do this
in man pages and kernel config files, most recently by me personally,
for years.  The flaw is that it still requires specific operator
intervention to make work.  That's where things like volume labels come
in.  Does a sysadmin care about the low-level device name for a drive on
a Windows or Mac system?  Does he even know without taking a deep look
inside the system?  Does not knowing it make it any less possible to
easily and reliably manage and control the hardware?  It's all done
through human-readable labels that are easy to work with.  The low level
information is still available when needed, but it's not the primary
means of control.  I think that's fine; it strikes the balance between
control and ease of use that I'm looking for.

Scott

Re: Using any network interface whatsoever

2006-04-08 Thread Scott Long


Mike Meyer wrote:

In [EMAIL PROTECTED], Scott Long [EMAIL PROTECTED] typed:

Please trim the text you are replying to.



Please, I'm tired of arbitrary email etiquette.


But where do you put the label on an ethernet interface?

mike


It sounds like your message is, don't be like Linux.  Fine, what do
you want instead?  How does having 2 em devices in my system, named em0
and em1, tell me by name which one is connected to which LAN?

Scott

Re: Using any network interface whatsoever

2006-04-08 Thread Scott Long


Ceri Davies wrote:


On Sat, Apr 08, 2006 at 08:34:30AM -0600, Scott Long wrote:


On Fri, Apr 07, 2006 at 11:53:42PM +0100, Ceri Davies wrote:



For the filesystem I can use geom_label and /dev/ufs/UnlikelyString, but I'd
also like to have it try to configure whatever interfaces the machine
happens to have via DHCP.

Other than specifying ifconfig_if0=DHCP once for every possible value of
if, is there a mechanism to do this already?


ifconfig_DEFAULT


Well, the real question is why we force the details of driver names onto
users.  Network and storage drivers are especially guilty of this, but
tty devices also are annoying.



The current situation on BSD, where I can identify which interface is
meant by its type, is definitely preferable to the Linux situation where
eth0 may mean something different tomorrow depending on what is plugged
in.

Since we can rename devices arbitrarily, I don't really see a problem
with respect to anything else.

Ceri


I'll say again, how does having em0, em1, em2, and em3 help me know what
is going on with each of those interfaces?

Scott

Re: FreeBSD Kernel Quality?

2006-04-08 Thread Scott Long


Robert Huff wrote:


Sam Leffler writes:



OTOH we've done nothing with user application code and based on
the work I've seen done by netbsd there's plenty of stuff to be
fixed there.



When you say user application code, is this an alias for
ports or do you mean non-ported applications?


Robert Huff


user application code == code not in src/sys/...  That means
src/lib, src/bin, src/sbin, src/usr.bin, etc.


Scott


Re: patchset-9 release (Re: [unionfs][patch] improvements of the unionfs - Problem Report, kern/91010)

2006-03-17 Thread Scott Long


Jacques Marneweck wrote:

Danny Braniss wrote:


Daichi GOTO wrote:
   


Everyone interested in the improved unionfs should keep paying attention
and ask "how about a merge?" at every turn :)
 


OK.  How about a merge?

I'd really like to see this in 6-STABLE.

Regards,

Jan Mikkelsen.
   


Just a 'me too'. I've been running with the patch (under 6.1) and it's
definitely better than the panics with the unpatched version. In other
words, IMHO, it does not break anything, and it actually fixes some things.

danny
 


Any ETA on when we can see this merged into 6.1 and 5.5?

Regards
--jm



Since it's not in HEAD yet, it's pretty improbable that it'll get into
5.5 and 6.1.  It would be nice to get it in for 6.2 though.

Scott


Re: patchset-9 release (Re: [unionfs][patch] improvements of the unionfs - Problem Report, kern/91010)

2006-03-16 Thread Scott Long


Daichi GOTO wrote:

Jan Mikkelsen wrote:


Daichi GOTO wrote:


Everyone interested in the improved unionfs should keep paying attention
and ask "how about a merge?" at every turn :)



OK.  How about a merge?

I'd really like to see this in 6-STABLE.



Me too, but unfortunately it is difficult for several reasons
(details at http://people.freebsd.org/~daichi/unionfs/).
Our patch meets the conditions for integration into -current,
but not for -stable.

Integration into -stable requires keeping command and library
API compatibility.  For technical reasons, our improved
unionfs introduces some minor incompatibilities.

In the end, integration into -stable will be left
to the judgment of the src committers and others.


Regards,

Jan Mikkelsen.





Right now, unionfs is somewhat usable for read-only purposes.  As
long as your work doesn't alter or break the behaviour of read-only
mounts, I think it's very much ready to go into CVS.  From there it
can get wider testing and review and be considered for 6-stable.
Since read-write support in the existing code is pretty much worthless,
I don't think that there will be a problem justifying the operational
changes that you describe in your documentation.

Scott


Re: 6.1-PRE boot locks up, using USB keyboard

2006-03-15 Thread Scott Long


John Baldwin wrote:

On Wednesday 15 March 2006 12:11, Rick C. Petty wrote:


On Wed, Mar 15, 2006 at 10:46:01AM -0500, John Baldwin wrote:



I'm using a USB keyboard, no PS/2.  I've tried the hint to disable kbdmux,
I've tried with and without selecting the Boot w/ USB keyboard and the
machine locks up in the same spot no matter what I try.  The same hardware
boots just fine with 6.0-RELEASE (although I need to choose the USB
keyboard option if I plan on typing).  Any suggestions?


What if you turn off USB keyboard support in your BIOS?


My BIOS (Asus A8N-E rev 1010) has no option for disabling USB keyboard
support, but I can either disable the USB controller or disable the USB
legacy support.  I doubt either of these is desirable.  Fortunately, I
discovered the problem..



The legacy support option is the one that makes a USB keyboard look like
a PS/2 keyboard.



The ukbd device is compiled into GENERIC.  I also had ukbd_load=YES in my
loader.conf so it would be compatible with a custom kernel.  When GENERIC
boots, I get the message that ukbd is already loaded (file exists).  I
would expect that the kernel just ignores the attempt, but apparently there
is an adverse effect.  Whenever ukbd is loaded by /boot/loader and that
device already exists in the kernel, the boot locks up after:

atkbdc0: Keyboard controller (i8042) at port 0x60,0x64 on isa0

when using a USB keyboard.  I would think this is a bug.  It is 100%
repeatable for me.  If I comment out the line in /boot/loader.conf, the
system boots nicely.  Perhaps this is related to kbdmux(4), but I'm not
sure.  I've also noticed related problems when trying to load umass and ums
through the boot loader and manually (I will try to reproduce these).
Maybe the problem is in the USB layer??

FYI, I tried this on 6.1-BETA4, fresh from the ISOs.



Ok.  There are several edge cases that can blow up if you kldload a module
or load a module from the loader that is already present in the kernel.



Alternately, I've heard from some people with a similar problem that 
turning off USB2 but leaving plain USB on avoids the problem.  I'm not 
exactly sure how or why this is, but it's worth a try I guess.


Scott

FreeBSD 6.1-BETA2/FreeBSD 5.5-BETA2 Available

2006-03-14 Thread Scott Long


Announcement


The FreeBSD Release Engineering Team is pleased to announce the
availability of FreeBSD 6.1-BETA4 and FreeBSD 5.5-BETA4.  Both FreeBSD
6.1 and FreeBSD 5.5 are meant to be a refinement of their respective
branches with few dramatic changes.  A lot of bugfixes have been made,
some drivers have been updated, and some areas have been tweaked for
better performance, etc. but no large changes have been made to the
basic architecture.  The FreeBSD 5.5 Release is being done for people
who are unable to make the jump to FreeBSD 6.X at this time.  We do
encourage people to make that transition as soon as possible, though.
There have been some updates made between FreeBSD 5.4 and FreeBSD 5.5
but not all of the bugfixes done to RELENG_6 have been backported to
RELENG_5.  This will almost certainly be the last 5.X release.

We encourage people to help with testing so any final bugs can be
identified and worked out.  Availability of ISO images is given below.
If you have an older system you want to update using the normal
CVS/cvsup source-based upgrade, the branch tag to use is RELENG_6 for 6.1
and RELENG_5 for 5.5, though that will change later in the release cycle
when we start doing the Release Candidates.  Problem reports can be
submitted using the send-pr(1) command.

The list of open issues and things still being worked on is on the
todo list:

http://www.freebsd.org/releases/6.1R/todo.html
http://www.freebsd.org/releases/5.5R/todo.html

Known Issues


A couple of significant changes were made to 6.1-BETA4.  First is a
large set of fixes to the VFS layer and various filesystems that
should significantly help performance under heavy load and also fix
problems with forcefully unmounting these filesystems.  While these
changes have received considerable developer testing, users are
requested to test filesystem stability as much as possible to ensure
that there are no regressions.

The second large change is that sysinstall will now install both the
GENERIC and SMP kernels and automatically select the appropriate one
based on whether it detects one CPU in the system or multiple CPUs.
However, single CPU systems with hyperthreading will still be treated
as uni-processor by sysinstall.  The automatic selection can be
overridden within sysinstall.  Testing of this is requested to help
identify systems that are not detected correctly.
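
The selection rule described above can be sketched in a few lines of
Python.  This is only an illustration of the stated behaviour, not
sysinstall's actual code; the function name select_kernel and its
parameters are hypothetical.

```python
def select_kernel(physical_cpus, logical_cpus=None, override=None):
    """Return "SMP" or "GENERIC" following the rule described above.

    physical_cpus -- number of physical processors detected
    logical_cpus  -- accepted but deliberately ignored: hyperthreaded
                     siblings do not make a system count as multiprocessor
    override      -- explicit user choice, which always wins
    """
    if override is not None:
        if override not in ("GENERIC", "SMP"):
            raise ValueError("unknown kernel: %r" % (override,))
        return override
    # One physical CPU, with or without hyperthreading, gets GENERIC.
    return "SMP" if physical_cpus > 1 else "GENERIC"
```

The interesting failures are in the detection step that feeds
physical_cpus, which is why reports from real hardware are useful.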

Availability


The BETA4 ISOs and FTP support are available on most of the FreeBSD
Mirror sites.  A list of the mirror sites is available here:


http://www.freebsd.org/doc/en_US.ISO8859-1/books/handbook/mirrors-ftp.html

The MD5s are:

MD5 (5.5-BETA4-pc98-disc1.iso) = bf6cf1238c000a01fe8c34ed4554e66e
MD5 (5.5-BETA4-alpha-bootonly.iso) = 84e55974d8854692a85d43558e10c658
MD5 (5.5-BETA4-alpha-disc1.iso) = b5fc0a01dc6cb96924c7cd9c18af6dd9
MD5 (5.5-BETA4-i386-bootonly.iso) = e54261162e775b692138597ae2512bf6
MD5 (5.5-BETA4-i386-disc1.iso) = 15f3161d4c5f996bbbc9b28682198dde
MD5 (5.5-BETA4-i386-disc2.iso) = 10c4c7985eea736862480b0212e60155
MD5 (5.5-BETA4-amd64-bootonly.iso) = 371522c9ab80e7d5c0fadd71d670543e
MD5 (5.5-BETA4-amd64-disc1.iso) = 4811510b11620f8706f51e4e0154b8b1
MD5 (5.5-BETA4-amd64-disc2.iso) = 54ee8ec3240de9d84576f1f0cbf2048a

MD5 (6.1-BETA4-pc98-disc1.iso) = ee891f12ddb7b62b2c1d3672555ceb3b
MD5 (6.1-BETA4-ia64-bootonly.iso) = b05da331e737c6c614acbb924584199e
MD5 (6.1-BETA4-ia64-disc1.iso) = d0b09231c1d55308fb8fb45eec845284
MD5 (6.1-BETA4-ia64-livefs.iso) = 9db26824bb09ee98e33eceadb643
MD5 (6.1-BETA4-alpha-bootonly.iso) = c77a7d80803efeba3b8140072ccd4969
MD5 (6.1-BETA4-alpha-disc1.iso) = b2701eb2931dd815b3b595f9183ae5a4
MD5 (6.1-BETA4-i386-bootonly.iso) = 113f1b990d298aa8b7f81d93a3636dc3
MD5 (6.1-BETA4-i386-disc1.iso) = aee3a4416eec24b1795346efeb624416
MD5 (6.1-BETA4-i386-disc2.iso) = 01b01719f7a06d2613a3e9fe15417b3f
MD5 (6.1-BETA4-amd64-bootonly.iso) = c52a2081931d89cbbebf50f198e8b169
MD5 (6.1-BETA4-amd64-disc1.iso) = 5624a6ba41abdc60802d21be9ab4cf6e
MD5 (6.1-BETA4-amd64-disc2.iso) = db427ec7ab4af75a8224a890c476846d

The SHA256s are:

SHA256 (5.5-BETA4-pc98-disc1.iso) = 
0574c7db49a81c77d1d9cded1add28451026fee5ca52605be03ab6660f9b5ab5
SHA256 (5.5-BETA4-i386-bootonly.iso) = 
077a4b6561311af08d9f760d734fc822d2589554f4a25ac413cfeb275a59361c
SHA256 (5.5-BETA4-i386-disc1.iso) = 
3367499f48d7fdc526a1f447f8e83ee4eef7a76e74784eb7471124499440e05d
SHA256 (5.5-BETA4-i386-disc2.iso) = 
40751884348826807f6c24ec78d424568be21c3e64fcb002f7cfd2bf9ec3bfa7
SHA256 (5.5-BETA4-amd64-bootonly.iso) = 
ca7390623cfa64589a4f19c80fa557e8a6085c54335811010c6b609a4202fd20
SHA256 (5.5-BETA4-amd64-disc1.iso) = 
a9cba0901cf6747193e173eb1c1d1b843ddaa59d12f2e8e84d0d634d2aba5bb0
SHA256 (5.5-BETA4-amd64-disc2.iso) = 
763d5dfe8d7bed4dadc8850e04e2ad04b8c3f4ae6e283df6cfdd25b9475a80d0


SHA256 (6.1-BETA4-pc98-disc1.iso) = 
3de12c8ec0d65a651bcb200049c5fa7b4a9228ebea6f64dea15b35ab07d19178
SHA256 (6.1-BETA4-ia64-bootonly.iso) =

BETA4! [Re: FreeBSD 6.1-BETA2/FreeBSD 5.5-BETA2 Available]

2006-03-14 Thread Scott Long


Sorry, I accidentally sent out an incomplete draft.  This announcement
is for BETA4, of course.  Also, the note about VFS changes below should
stress that the changes were made for stability, not performance.  Sorry
for the confusion.

Scott Long wrote:


Announcement


The FreeBSD Release Engineering Team is pleased to announce the
availability of FreeBSD 6.1-BETA4 and FreeBSD 5.5-BETA4.  Both FreeBSD
6.1 and FreeBSD 5.5 are meant to be a refinement of their respective
branches with few dramatic changes.  A lot of bugfixes have been made,
some drivers have been updated, and some areas have been tweaked for
better performance, etc. but no large changes have been made to the
basic architecture.  The FreeBSD 5.5 Release is being done for people
who are unable to make the jump to FreeBSD 6.X at this time.  We do
encourage people to make that transition as soon as possible, though.
There have been some updates made between FreeBSD 5.4 and FreeBSD 5.5
but not all of the bugfixes done to RELENG_6 have been backported to
RELENG_5.  This will almost certainly be the last 5.X release.

We encourage people to help with testing so any final bugs can be
identified and worked out.  Availability of ISO images is given below.
If you have an older system you want to update using the normal
CVS/cvsup source-based upgrade, the branch tag to use is RELENG_6 for 6.1
and RELENG_5 for 5.5, though that will change later in the release cycle
when we start doing the Release Candidates.  Problem reports can be
submitted using the send-pr(1) command.

The list of open issues and things still being worked on is on the
todo list:

http://www.freebsd.org/releases/6.1R/todo.html
http://www.freebsd.org/releases/5.5R/todo.html

Known Issues


A couple of significant changes were made to 6.1-BETA4.  First is a
large set of fixes to the VFS layer and various filesystems that
should significantly help performance under heavy load and also fix
problems with forcefully unmounting these filesystems.  While these
changes have received considerable developer testing, users are
requested to test filesystem stability as much as possible to ensure
that there are no regressions.

The second large change is that sysinstall will now install both the
GENERIC and SMP kernels and automatically select the appropriate one
based on whether it detects one CPU in the system or multiple CPUs.
However, single CPU systems with hyperthreading will still be treated
as uni-processor by sysinstall.  The automatic selection can be
overridden within sysinstall.  Testing of this is requested to help
identify systems that are not detected correctly.

Availability


The BETA4 ISOs and FTP support are available on most of the FreeBSD
Mirror sites.  A list of the mirror sites is available here:


http://www.freebsd.org/doc/en_US.ISO8859-1/books/handbook/mirrors-ftp.html

The MD5s are:

MD5 (5.5-BETA4-pc98-disc1.iso) = bf6cf1238c000a01fe8c34ed4554e66e
MD5 (5.5-BETA4-alpha-bootonly.iso) = 84e55974d8854692a85d43558e10c658
MD5 (5.5-BETA4-alpha-disc1.iso) = b5fc0a01dc6cb96924c7cd9c18af6dd9
MD5 (5.5-BETA4-i386-bootonly.iso) = e54261162e775b692138597ae2512bf6
MD5 (5.5-BETA4-i386-disc1.iso) = 15f3161d4c5f996bbbc9b28682198dde
MD5 (5.5-BETA4-i386-disc2.iso) = 10c4c7985eea736862480b0212e60155
MD5 (5.5-BETA4-amd64-bootonly.iso) = 371522c9ab80e7d5c0fadd71d670543e
MD5 (5.5-BETA4-amd64-disc1.iso) = 4811510b11620f8706f51e4e0154b8b1
MD5 (5.5-BETA4-amd64-disc2.iso) = 54ee8ec3240de9d84576f1f0cbf2048a

MD5 (6.1-BETA4-pc98-disc1.iso) = ee891f12ddb7b62b2c1d3672555ceb3b
MD5 (6.1-BETA4-ia64-bootonly.iso) = b05da331e737c6c614acbb924584199e
MD5 (6.1-BETA4-ia64-disc1.iso) = d0b09231c1d55308fb8fb45eec845284
MD5 (6.1-BETA4-ia64-livefs.iso) = 9db26824bb09ee98e33eceadb643
MD5 (6.1-BETA4-alpha-bootonly.iso) = c77a7d80803efeba3b8140072ccd4969
MD5 (6.1-BETA4-alpha-disc1.iso) = b2701eb2931dd815b3b595f9183ae5a4
MD5 (6.1-BETA4-i386-bootonly.iso) = 113f1b990d298aa8b7f81d93a3636dc3
MD5 (6.1-BETA4-i386-disc1.iso) = aee3a4416eec24b1795346efeb624416
MD5 (6.1-BETA4-i386-disc2.iso) = 01b01719f7a06d2613a3e9fe15417b3f
MD5 (6.1-BETA4-amd64-bootonly.iso) = c52a2081931d89cbbebf50f198e8b169
MD5 (6.1-BETA4-amd64-disc1.iso) = 5624a6ba41abdc60802d21be9ab4cf6e
MD5 (6.1-BETA4-amd64-disc2.iso) = db427ec7ab4af75a8224a890c476846d
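
When checking a downloaded image against the list above, a small parser for the "ALGO (file) = digest" line format can catch malformed entries before any comparison is made.  The following is a minimal sketch only; the struct and function names are made up for illustration and are not part of any FreeBSD tool.

```c
#include <assert.h>
#include <ctype.h>
#include <stdio.h>
#include <string.h>

/* Parsed form of one checksum line, e.g. "MD5 (name.iso) = <hex digest>". */
struct checksum_line {
    char algo[16];
    char file[256];
    char digest[129];
};

/* Expected digest length in hex characters, or 0 for an unknown algorithm. */
static size_t expected_hex_len(const char *algo)
{
    if (strcmp(algo, "MD5") == 0)
        return 32;
    if (strcmp(algo, "SHA256") == 0)
        return 64;
    return 0;
}

/* Returns 1 if the line parses and the digest is well formed, else 0. */
static int parse_checksum_line(const char *line, struct checksum_line *out)
{
    size_t want, i;

    assert(line != NULL && out != NULL);
    if (sscanf(line, "%15s (%255[^)]) = %128s",
        out->algo, out->file, out->digest) != 3)
        return 0;
    want = expected_hex_len(out->algo);
    if (want == 0 || strlen(out->digest) != want)
        return 0;
    for (i = 0; i < want; i++)
        if (!isxdigit((unsigned char)out->digest[i]))
            return 0;
    return 1;
}

/* Returns 1 if a locally computed digest string matches the announced one. */
static int digest_matches(const struct checksum_line *cl, const char *computed)
{
    return strcmp(cl->digest, computed) == 0;
}
```

The digest you compare against would come from running md5(1) or sha256(1) on the downloaded file.  A line whose digest is too short for its algorithm is rejected rather than compared.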

The SHA256s are:

SHA256 (5.5-BETA4-pc98-disc1.iso) = 
0574c7db49a81c77d1d9cded1add28451026fee5ca52605be03ab6660f9b5ab5
SHA256 (5.5-BETA4-i386-bootonly.iso) = 
077a4b6561311af08d9f760d734fc822d2589554f4a25ac413cfeb275a59361c
SHA256 (5.5-BETA4-i386-disc1.iso) = 
3367499f48d7fdc526a1f447f8e83ee4eef7a76e74784eb7471124499440e05d
SHA256 (5.5-BETA4-i386-disc2.iso) = 
40751884348826807f6c24ec78d424568be21c3e64fcb002f7cfd2bf9ec3bfa7
SHA256 (5.5-BETA4-amd64-bootonly.iso) = 
ca7390623cfa64589a4f19c80fa557e8a6085c54335811010c6b609a4202fd20
SHA256 (5.5-BETA4-amd64-disc1.iso) = 
a9cba0901cf6747193e173eb1c1d1b843ddaa59d12f2e8e84d0d634d2aba5bb0
SHA256 (5.5-BETA4-amd64

Re: VMWARE GSX Port?

2006-03-03 Thread Scott Long


Aniruddha Bohra wrote:

On Thu, 2006-03-02 at 13:28 -0800, Kip Macy wrote:


-CURRENT runs on 3.0 as a domU. There is partial dom0 support. The
changes have not gone back into the mainline because xenbus is
extremely difficult to integrate cleanly. You can check on the state
of the xen3 branch in perforce.



At several places the difficulties are mentioned. Is there some place
that these are listed or discussed in more detail?

Thanks
Aniruddha
  



The difficulty is that Xen uses a Mach-like message passing system to
communicate information like DomU configuration.  This requires kernel
threads to be operational early on in the startup of the DomU kernel,
much earlier than what FreeBSD allows.  My attempts so far to allow
xenbus to synchronously retrieve its device configuration instead of
relying on asynchronous messages have been unsuccessful.

Scott


Re: VMWARE GSX Port?

2006-02-25 Thread Scott Long


Ashok Shrestha wrote:

VMWARE GSX was released recently for free.
[http://www.vmware.com/news/releases/server_beta.html]

Is anyone working on a port for this?




I've started on it, but I haven't made much progress yet.

Scott

Re: urgent, need to recover superblock!

2006-02-22 Thread Scott Long


Dave wrote:

Hello,
   Some urgency on this issue! I've got a 10 GB IDE drive that has
critical data on one of its partitions, /dev/ad1e. This drive was
originally gmirrored in another box, where it worked fine; it was the
master drive. Now I've installed this drive as a slave in another 6.0
box, and it shows up as ad1 with the partition I want being ad1e. I did
a mount and it worked fine, so I knew the drive was working. I then
unmounted the partition and tried to dump it to another drive. This
didn't work; dump got an error about an incorrect superblock. I then did
a mount -o ro /dev/ad1e /mnt and I'm getting an "Incorrect superblock"
error from mount. I then tried fsck /dev/ad1e and got the same error
message. These partitions were formatted with UFS2 as their filesystem.
I then ran bsdlabel ad1 and got a printout of my label, which gives me
hope that this data can be retrieved. An error I'm getting from bsdlabel
says that the c: partition does not cover the entire disk and that this
may result in utilities not working correctly. Any help appreciated.
   Some urgency!
Dave.


Sounds like you need to install ports/sysutils/ffsrecov and spend some
quality time with it tonight.


Scott


FreeBSD 6.1-BETA2/FreeBSD 5.5-BETA2 Available

2006-02-20 Thread Scott Long


Announcement


The FreeBSD Release Engineering Team is pleased to announce the
availability of FreeBSD 6.1-BETA2 and FreeBSD 5.5-BETA2.  Both FreeBSD
6.1 and FreeBSD 5.5 are meant to be a refinement of their  respective
branches with few dramatic changes.  A lot of bugfixes have been made,
some drivers have been updated, and some areas have been tweaked for
better performance, etc. but no large changes have been made to the
basic architecture.  The FreeBSD 5.5 Release is being done for people
who are unable to make the jump to FreeBSD 6.X at this time.  We do
encourage people to make that transition as soon as possible, though.
There have been some updates made between FreeBSD 5.4 and FreeBSD 5.5
but not all of the bugfixes done to RELENG_6 have been backported to
RELENG_5.  This will almost certainly be the last 5.X release.

We encourage people to help with testing so any final bugs can be
identified and worked out.  Availability of ISO images is given below.
If you have an older system you want to update using the normal
CVS/cvsup source based upgrade the branch tag to use is RELENG_6 for 6.1
and RELENG_5 for 5.5, though that will change later in the release cycle
when we start doing the Release Candidates.  Problem reports can be
submitted using the send-pr(1) command.

The list of open issues and things still being worked on are on the
todo list:

http://www.freebsd.org/releases/6.1R/todo.html
http://www.freebsd.org/releases/5.5R/todo.html

Known Issues


The DHCP problem that affected the 6.1-BETA1 installer has been fixed.
Several critical fixes were made to the ATA subsystem after the BETA2
builds started, so please check for newer updates before reporting
problems.  The Intel 2200/2915 Wireless is known to have a number of
stability problems that are being worked on right now.  Work is also in
progress to make hot-plug of USB memory devices more reliable.  Updated
packages for all architectures might not be available yet.

Availability


The BETA2 ISOs and FTP support are available on most of the FreeBSD
Mirror sites.  A list of the mirror sites is available here:


http://www.freebsd.org/doc/en_US.ISO8859-1/books/handbook/mirrors-ftp.html

The MD5s are:
MD5 (5.5-BETA2-amd64-bootonly.iso) = a2333f7f2184e2899c7317202843fab6
MD5 (5.5-BETA2-amd64-disc1.iso) = 1c5efbe2276c94890a6e8efb152f55d5
MD5 (5.5-BETA2-amd64-disc2.iso) = bd1e55d73401d6160d2bc1eaef90348c

MD5 (5.5-BETA2-i386-bootonly.iso) = 26fba5024002189a3d0bc8be58a62bc8
MD5 (5.5-BETA2-i386-disc1.iso) = 5fc65d29cd7139dadd20e207b16c81f1
MD5 (5.5-BETA2-i386-disc2.iso) = 24ca73eb276887fa5c90fa670e3b5e64

MD5 (5.5-BETA2-sparc64-bootonly.iso) = 8835b89782db79fe042b3d365c9e63ea
MD5 (5.5-BETA2-sparc64-disc1.iso) = 58712a0f7e30e19bfdd52ca5961050db
MD5 (5.5-BETA2-sparc64-disc2.iso) = 1e350504487f0f51fbe590eb830ad3f4

MD5 (5.5-BETA2-alpha-bootonly.iso) = 8ec56c534a22ee5a57af88b01fe3bf76
MD5 (5.5-BETA2-alpha-disc1.iso) = 9abd0f971d2928c95dfff8e0808bbc2c

MD5 (5.5-BETA2-pc98-disc1.iso) = 92d7862ed1b77989fe1545346b9c2cef

MD5 (6.1-BETA2-amd64-bootonly.iso) = c9d59cadf063e3a67626c1e5477b27bb
MD5 (6.1-BETA2-amd64-disc1.iso) = e471a79dbcfbea829c8ab6caa9683e61
MD5 (6.1-BETA2-amd64-disc2.iso) = 29367aa8f31a4af3e49d110574f25319

MD5 (6.1-BETA2-i386-bootonly.iso) = cd77279cdf230e4b37c2d579d2a01cc1
MD5 (6.1-BETA2-i386-disc1.iso) = 3267d7794079d3b803f5f5f004cf04f1
MD5 (6.1-BETA2-i386-disc2.iso) = f47ad00c0240320661a78320befdeaa1

MD5 (6.1-BETA2-sparc64-bootonly.iso) = fc070774c0516391f7176518a0b0fabc
MD5 (6.1-BETA2-sparc64-disc1.iso) = 540aa7abd4f51ea93138086f886e11de
MD5 (6.1-BETA2-sparc64-disc2.iso) = 0d29f1554eb53a8748b9ec280868c460

MD5 (6.1-BETA2-alpha-bootonly.iso) = 923056c4b9c249f5393530ca62618aa6
MD5 (6.1-BETA2-alpha-disc1.iso) = fab96d1ee15340e5e8c639066f313751

MD5 (6.1-BETA2-pc98-disc1.iso) = 95be33fe72af53e9d6140b85385dc8c8

MD5 (6.1-BETA2-ia64-bootonly.iso) = 8eceddde4826e30480933cd05d306084
MD5 (6.1-BETA2-ia64-disc1.iso) = b034fce6098c6c203db379511174c729
MD5 (6.1-BETA2-ia64-livefs.iso) = 2592fa45458555862063b49e2946c16d

The SHA256s are:

SHA256 (5.5-BETA2-amd64-bootonly.iso) = 
2616b50051eb8213877d4ba506b6f72f218870dffd40d1000b6d022d398d7f09
SHA256 (5.5-BETA2-amd64-disc1.iso) = 
603a590632679cf3151ab183da4b53dde241e4990b335f396902cc5c5c5ff531
SHA256 (5.5-BETA2-amd64-disc2.iso) = 
0834e6d5e024db45793ff92b4e436d5c466213bd05b6c1091380d58822c5fb0b


SHA256 (5.5-BETA2-i386-bootonly.iso) = 
b5709ee350faf8010db5eac0db0639945045212bc3e3ac96ed42b3433d698b32
SHA256 (5.5-BETA2-i386-disc1.iso) = 
012356396548d81840dbe004c5b125b1d5725e239728c920e41b786a539c67b9
SHA256 (5.5-BETA2-i386-disc2.iso) = 
567d8321f609f99da8864a48b862a647ebc65ab9bb28b8b3f79a8d86cf108d9d


SHA256 (5.5-BETA2-sparc64-bootonly.iso) = 
f713b1ba0eff3596b57ee22d629e4940e2a61eddf5bb207aa8597158be269851
SHA256 (5.5-BETA2-sparc64-disc1.iso) = 
50375e00fb40d3e3d56dd6b2f3321916a5324f349ddb1c8e4ec904fd13eaa4e8
SHA256 (5.5-BETA2-sparc64-disc2.iso) =

Re: Panic Kernel Dump to umass device?

2006-02-10 Thread Scott Long


Nate Nielsen wrote:

I'm developing for small embedded systems, and I'm looking into the
possibility of dumping a kernel core dump to a USB memory stick (umass
driver). It currently doesn't work (see below), but I'm interested in
fixing it.

Yes, I know it'll be slow. It's probably also a non-tested (and
non-reliable) code path for a kernel dump. But leaving those issues aside...

First I wanted to ask if anyone else has tried this. Is it an insane
idea, impossible? I'm not very familiar with the CAM/SCSI/USB
sub-systems so perhaps someone more knowledgeable than I can set me
straight.

Currently when doing a dump to a USB device, I get the following. This
with 6.0-RELEASE. Dump device is /dev/da0s1.




Fatal trap 12: page fault while in kernel mode
fault virtual address   = 0x0
fault code  = supervisor write, page not present
instruction pointer = 0x20:0xc0cea412
stack pointer   = 0x28:0xc6cf5c1c
frame pointer   = 0x28:0xc6cf5c24
code segment= base 0x0, limit 0xf, type 0x1b
   = DPL 0, pres 1, def32 1, gran 1
processor eflags= interrupt enabled, resume, IOPL = 0
current process = 473 (kldload)
trap number = 12
panic: page fault
Uptime: 3m48s
Dumping 95 MB (2 chunks)
Aborting dump due to I/O error.
status == 0xb, scsi status == 0x0

** DUMP FAILED (ERROR 5) **
Automatic reboot in 5 seconds - press a key on the console to abort




It waits for about a minute after 'Dumping 95 MB (2 chunks)'. The light
on the USB stick goes and remains stuck in the on state. The status: 0xb
seems to be CAM_CMD_TIMEOUT. ERROR 5 is EIO.

As far as I know, kernel dumps are always done without interrupts and
the driver runs with polling. It's likely that the umass driver and/or
USB subsystem doesn't like this.


Cheers,
Nate



You're correct that dumping is meant to be done with interrupts and task
switching disabled.  The first thing that the umass driver is missing is
a working CAM poll handler.  Without this, there is no way for command
completions to be seen when interrupts are disabled.  Beyond that, I
somewhat suspect that the USB stack expects to be able to push command
completion work off to worker threads, at least for some situations, and
that also will not work in the kernel dump environment.  So, there is a
lot of work needed to make this happen.

Scott
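
To illustrate what a poll handler buys you in the dump path, here is a minimal sketch of waiting for a command completion by polling instead of by interrupt.  The device structure and function names are hypothetical stand-ins, not the real umass or CAM interfaces; the simulated device simply completes after a fixed number of polls.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a controller; not the real umass or CAM types. */
struct fake_dev {
    int ticks_until_done;   /* simulated hardware latency, counted in polls */
    bool done;              /* set once the command has completed */
};

/*
 * Hypothetical poll handler: does the completion work an interrupt
 * handler would normally do, but is called directly by the waiter.
 */
static void fake_dev_poll(struct fake_dev *dev)
{
    if (dev->ticks_until_done > 0)
        dev->ticks_until_done--;
    if (dev->ticks_until_done == 0)
        dev->done = true;
}

/*
 * Wait for completion without interrupts or sleeping by calling the
 * poll handler in a bounded loop.  Returns 0 on completion, -1 on timeout.
 */
static int wait_polled(struct fake_dev *dev, int max_polls)
{
    assert(dev != NULL);
    for (int i = 0; i < max_polls; i++) {
        fake_dev_poll(dev);
        if (dev->done)
            return 0;
    }
    return -1;
}
```

Without something playing the role of fake_dev_poll(), the loop has no way to observe the completion, and the wait can only end in a timeout, which is consistent with the CAM_CMD_TIMEOUT status in the report above.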

BETA1 announcement

2006-02-09 Thread Scott Long


Announcement


The FreeBSD Release Engineering Team is pleased to announce the 
beginning of both the FreeBSD 6.1 and FreeBSD 5.5 release cycles with 
the availability of FreeBSD 6.1-BETA1 and FreeBSD 5.5-BETA1.


Both FreeBSD 6.1 and FreeBSD 5.5 are meant to be a refinement of their
respective branches with few dramatic changes.  A lot of bugfixes have
been made, some drivers have been updated, and some areas have been
tweaked for better performance, etc. but no large changes have been made
to the basic architecture.  The FreeBSD 5.5 Release is being done for
people who are unable to make the jump to FreeBSD 6.X at this time.  We
do encourage people to make that transition as soon as possible, though.
There have been some updates made between FreeBSD 5.4 and FreeBSD 5.5
but not all of the bugfixes done to RELENG_6 have been backported to
RELENG_5.  This will almost certainly be the last 5.X release.

We encourage people to help with testing so any final bugs can be
identified and worked out.  Availability of ISO images is given below.
If you have an older system you want to update using the normal
CVS/cvsup source based upgrade the branch tag to use is RELENG_6 for 6.1
and RELENG_5 for 5.5, though that will change later in the release cycle
when we start doing the Release Candidates.  Problem reports can be
submitted using the send-pr(1) command.

The list of open issues and things still being worked on are on the
todo list:

http://www.freebsd.org/releases/6.1R/todo.html
http://www.freebsd.org/releases/5.5R/todo.html

Known Issues


Other than the list of open issues in the todo lists BETA1 has a few
other known issues.  There is a problem with using DHCP during system
installation.  A fix for this is already being worked on.  And as usual
at this stage of a release the availability of pre-built packages on
the ISO images varies widely from architecture to architecture.  The
list of packages that will be available as part of the release itself
will certainly be different.

Availability


The BETA1 ISOs and FTP support are available on most of the FreeBSD
Mirror sites.  A list of the mirror sites is available here:


http://www.freebsd.org/doc/en_US.ISO8859-1/books/handbook/mirrors-ftp.html

The MD5s are:

MD5 (5.5-BETA1-alpha-bootonly.iso) = af05452acda5868b5515bc868038ffba
MD5 (5.5-BETA1-alpha-disc1.iso) = 4660ef47a3d1ffc49611484c7fab4cd0
MD5 (5.5-BETA1-amd64-bootonly.iso) = c0f161a4711ca422832907692e47f54c
MD5 (5.5-BETA1-amd64-disc1.iso) = 4e64fe4c4cd0dec41ee234f84f8c4946
MD5 (5.5-BETA1-amd64-disc2.iso) = ba9898176a7afbfc2d0162e38ec8d205
MD5 (5.5-BETA1-i386-bootonly.iso) = 5a5214b758db033529897884350b8a19
MD5 (5.5-BETA1-i386-disc1.iso) = 10c489414716782d9d8ce942dd4f7de8
MD5 (5.5-BETA1-i386-disc2.iso) = f7bcae220c1cfc8cff67e1aaa5a3bb69
MD5 (5.5-BETA1-pc98-disc1.iso) = d581fa4725e9b2daf939f45619b63c93
MD5 (5.5-BETA1-sparc64-bootonly.iso) = 34a90df46d5b1a6c7fe2cd263138f1ec
MD5 (5.5-BETA1-sparc64-disc1.iso) = 3261ec3570b7b1a1f8089577564f5693
MD5 (5.5-BETA1-sparc64-disc2.iso) = 614541ce58508efe94b3ca2f9fc6159c

MD5 (6.1-BETA1-alpha-bootonly.iso) = e63dc0fcbc4222e82c3bb7e040f791cb
MD5 (6.1-BETA1-alpha-disc1.iso) = c536ed08fedbd45141f48919803042dc
MD5 (6.1-BETA1-amd64-bootonly.iso) = 84a72b2a6d86fd29f7e35da9092300ec
MD5 (6.1-BETA1-amd64-disc1.iso) = 20f36ee823ed4313f89289d5357fc365
MD5 (6.1-BETA1-amd64-disc2.iso) = 2bbf97c74d7df701037634a7b0c971cb
MD5 (6.1-BETA1-i386-bootonly.iso) = 037b99d2dddb93f75f3a3445908103a4
MD5 (6.1-BETA1-i386-disc1.iso) = 3cc6e3e66abce6420c316e04631ddb19
MD5 (6.1-BETA1-i386-disc2.iso) = c822d0a62f3e402f21088ca8abefce3e
MD5 (6.1-BETA1-ia64-bootonly.iso) = 1fb12b97f70980ab12c2d31526c128ec
MD5 (6.1-BETA1-ia64-disc1.iso) = d5e34526c056caf543d300412cf07648
MD5 (6.1-BETA1-ia64-livefs.iso) = c29cb7b7b0724d70b6e07e724bf44b62
MD5 (6.1-BETA1-pc98-disc1.iso) = 6894340e1b7dac32de263974a62e9beb
MD5 (6.1-BETA1-sparc64-bootonly.iso) = af91246b0b42bf16e65d38a5a68b6726
MD5 (6.1-BETA1-sparc64-disc1.iso) = 668c98638c2b830ca3276cdfe18815a5
MD5 (6.1-BETA1-sparc64-disc2.iso) = f01049d7a1011db04101b97c954da363

The SHA256s are:

SHA256 (5.5-BETA1-amd64-bootonly.iso) = 
2553bf02cf3acf38f5ca6ccb2884a1d83b1046e039686fbfca8d8b799173fb7c
SHA256 (5.5-BETA1-amd64-disc1.iso) = 
ee9ee3c11651d6c1991a168e7c6731d129209c072bf93bd6e453af1da255e1fd
SHA256 (5.5-BETA1-amd64-disc2.iso) = 
9e871ea6dd7ee3e4ef4bdcb7f3eb7efcda7a1f96b8397f313010c03d781b4c93
SHA256 (5.5-BETA1-i386-bootonly.iso) = 
335428bcc6e391578c354a042ab125866968ebe3c6031011fe0062558d56328c
SHA256 (5.5-BETA1-i386-disc1.iso) = 
a33138b4bf7b224e24c20d6578954897d8095ce316cb72cd117a2c449c26a55b
SHA256 (5.5-BETA1-i386-disc2.iso) = 
ec68782fa3e82c74582de00b62e14f395e3a45ec0c10e95fb5673a8354e3f4f5
SHA256 (5.5-BETA1-pc98-disc1.iso) = 
a884fd69a0e49752c0a963c8b0d2cd84c5ffa8d2b0cae57e18fa59136722214c
SHA256 (5.5-BETA1-sparc64-bootonly.iso) = 
8a073b3704e038f2217b559e2e6f68ed33a5f0f45a4546ccdccb4e0b71f1b79f
SHA256

Re: Weird PCI interrupt delivery problem (resolution, sort of)

2006-01-25 Thread Scott Long


John Baldwin wrote:

On Tuesday 24 January 2006 19:34, Craig Boston wrote:


On Tue, Jan 24, 2006 at 10:43:49AM -0500, John Baldwin wrote:


What if you do a read of the lapic before the write?  Maybe doing 'x =
lapic->eoi;  lapic->eoi = 0;'?


Reading the lapic before the write has no effect.

Reading the lapic after the write makes it work.



Hmm, perhaps the read forces the write to post?  Scott?



Either that, or the read imposes enough delay to let whatever was
happening during the DELAY call work.   I find it hard to believe that
uncached writes would get delayed like this.  I've lost the original
posting on this, could you provide the dmesg and computer make/model
again?

Scott
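
For reference, the read-after-write variant that Craig reports working can be sketched as below.  This is an illustration only: the two-register structure is a made-up stand-in for the real local APIC register block, and which register gets read back is an assumption, since the thread does not say.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical two-register layout; the real local APIC has many more. */
struct fake_lapic {
    volatile uint32_t id;
    volatile uint32_t eoi;
};

/*
 * Signal end-of-interrupt, then read another register back.  The theory
 * under discussion is that on real hardware the read either forces the
 * earlier write to post or simply adds enough delay.
 */
static uint32_t lapic_eoi_readback(struct fake_lapic *lapic)
{
    assert(lapic != NULL);
    lapic->eoi = 0;
    return lapic->id;
}
```

The volatile qualifiers keep the compiler from dropping or reordering the two accesses; they say nothing about what the bus or chipset does, which is the open question here.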

Re: Difference between a kthread and an ordinary process.

2006-01-24 Thread Scott Long


Pranav Peshwe wrote:

Hello,
 When a kthread is created using the kthread_create(9)
function, I found out that a new instance of struct proc is created
and allocated for the thread, just as in the case of the creation of a
new process.  Also, the thread is assigned a pid, as in the case of a
process.
  What is the difference between a kernel thread and a normal process
created using fork, except for the address space sharing with swapper
and kernel-mode execution of the kthread?  Is a kthread effectively
just a process always running in kernel mode?



That is exactly what a kthread is.  There is some work in progress to
make them true threads within one or more processes.


Scott

Re: Weird PCI interrupt delivery problem (resolution, sort of)

2006-01-19 Thread Scott Long


Craig Boston wrote:


After trying everything I could think of to do to the I/O APIC code and
coming up empty, tonight I went back to the local APIC.  I had
previously ruled it out since the lapic timer interrupt continued to
work fine even when the others stopped.  However, adding some DELAY(1)
calls at key points caused it to work, much like adding WITNESS does.
I managed to get it down to a single change that makes APIC mode work on
this laptop:

--- local_apic.c.orig	Thu Jan 19 18:32:37 2006
+++ local_apic.c	Thu Jan 19 18:32:28 2006
@@ -599,4 +599,5 @@
 lapic_eoi(void)
 {
 	lapic->eoi = 0;
+	lapic->eoi = 0;
 }

...and welcome to bizarro world.  There's absolutely no reason I can
think of why that would change anything, other than buggy hardware.

I looked at what Linux was doing, and they're also using a single write
to EOI interrupts, so long as the X86_GOOD_APIC config option is enabled
(and it is for P5/MMX or newer).  Otherwise it does an extra read before
writing to any APIC register.  I don't know if linux works on this
hardware or not -- the live CD I tried wasn't compiled for APIC support.

At this point, since AFAIK nobody else has reported the same problem,
I'm content with a local workaround.  It's just... weird.

Craig


This points to a bus coherency problem.  I wonder if your BIOS is
incorrectly setting the memory region of the apics as cachable.  You'll
want to bug Baldwin about this.

Scott


Re: An idea of remove MUTEX_WAKE_ALL

2006-01-03 Thread Scott Long


Daniel Eischen wrote:


On Tue, 3 Jan 2006, John Baldwin wrote:



On Sunday 01 January 2006 02:21 am, prime wrote:


Hi hackers,
  I have an idea about removing the kernel option MUTEX_WAKE_ALL.
  When we unlock the mutex (in _mtx_unlock_sleep), we can directly
give the lock to the first thread waiting on the turnstile.  A
thread then owns the mutex once it returns from turnstile_wait, so it
can simply jump out of the _obtain_lock loop in _mtx_lock_sleep.
This makes a mutex always be owned by a thread when there are threads
waiting on the turnstile, so priority inheritance can work.
  This idea needs only a few changes in kern/kern_mutex.c.  But when
NO_ADAPTIVE_MUTEXES is not set, it makes threads that are spinning on
other CPUs to get the mutex spin for a long time, and this makes
short-term mutexes more expensive (maybe they should use spin mutexes
instead).

What do you think about the idea? Thanks.


Sun actually found that the performance was better when you did MUTEX_WAKE_ALL
because once you woke up N threads, if they don't all resume at once then
they will acquire the lock in sequence and the lock acquires and releases will
all be simple ones rather than all being the complicated contested case.
There are more details in _Solaris Internals_.



Yes, but doesn't this partly rely on having the threads spin(*)
for a bit if the current lock owner is running on another CPU?
Do we currently do that?

(*) No, I am not referring to spin mutexes.



Adaptive mutexes are enabled by default and have been for at least a
year.

Scott

Re: An idea of remove MUTEX_WAKE_ALL

2006-01-03 Thread Scott Long


Daniel Eischen wrote:


On Tue, 3 Jan 2006, Scott Long wrote:



for a bit if the current lock owner is running on another CPU?
Do we currently do that?

(*) No, I am not referring to spin mutexes.



Adaptive mutexes are enabled by default and have been for at least a
year.



Ahh, then that's what they (Adaptive) do.



Well, it's a bit different from Solaris, I believe.  They do not sleep
after a certain number of contested spins, and instead just continue to
spin.  As we reduce the coverage of large contested locks (like Giant)
this becomes much less of a performance problem, though.

Scott
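
The adaptive behaviour discussed in this thread can be sketched as a single decision step.  This is a simplified illustration using made-up types, not the code in kern/kern_mutex.c: a contending thread takes a free lock, spins while the owner is running on a CPU, and blocks otherwise.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical thread and mutex records, not the kernel's structures. */
struct fake_thread {
    bool running;               /* currently executing on some CPU */
};

struct fake_mutex {
    struct fake_thread *owner;  /* NULL when the mutex is free */
};

enum lock_action { LOCK_ACQUIRE, LOCK_SPIN, LOCK_BLOCK };

/*
 * One step of an adaptive acquire: take a free lock, spin while the
 * owner is on a CPU (it may release soon), otherwise block.
 */
static enum lock_action adaptive_step(const struct fake_mutex *m)
{
    assert(m != NULL);
    if (m->owner == NULL)
        return LOCK_ACQUIRE;
    if (m->owner->running)
        return LOCK_SPIN;
    return LOCK_BLOCK;
}
```

Note that the sketch keeps no spin counter: the choice depends only on whether the owner is running, which is one reading of the remark above about not giving up after a fixed number of contested spins.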

Re: My wish list for 6.1

2005-12-16 Thread Scott Long


Xin LI wrote:


Hi, Scott,

On 12/16/05, Scott Long [EMAIL PROTECTED] wrote:


Guys,

With code freeze for 6.1 about 6 weeks away, I'd like to put out my
'wish list' for it:



More-or-less OT question: Shall we switch ULE as the default scheduler
on -HEAD to encourage more testing against it?

Cheers,


Only if there is someone committed to tracking and fixing bugs.  Last
time we tried this, we wasted a lot of time and energy.

Scott

My wish list for 6.1

2005-12-15 Thread Scott Long


Guys,

With code freeze for 6.1 about 6 weeks away, I'd like to put out my 
'wish list' for it:


1.  working kbdmux.  We need this for the growing number of systems that
assume that USB is the primary keyboard.  Current status appears to be
that the kbdmux driver breaks very easily.  We need this working well
enough where it can be enabled by default, and all attached keyboards
Just Work.

2.  SMP kernels for install.  Right now we only install a UP kernel, for
performance reasons.  We should be able to package both a UP and SMP
kernel into the release bits, and have sysinstall install both.  It
should also select the correct one for the target system and make that
the default on boot.  The easiest way to do this would be to have
sysinstall boot an SMP kernel and then look at the hw.ncpu sysctl.  The
only problem is being able to have sysinstall fall back to booting a UP
kernel for itself if the SMP one fails.  This can probably be 'faked' by
setting one of the SMP-disabling variables in the loader.  But in any
case, the point is to make the process Just Work for the user, without
the user needing to know arcane loader/sysctl knobs.  SMP laptops are
right around the corner, and we should be ready to support SMP
out-of-the-box.

3.  Full review and update of the install docs, handbook, FAQ, etc.
There are sections that are embarrassingly out of date (one section of
the handbook apparently states that we only support a single brand of
wifi cards).  A co-worker of mine tried to install 6.0 using just the
handbook install guide, and discovered that it really doesn't match
reality anymore, in both big and small ways.  Contact me directly if
you would like his list of comments.
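
The selection logic in item 2 is small enough to sketch.  This is only an illustration of the idea, with a made-up function name; it is not sysinstall code.  The CPU count is passed in as a parameter so the decision can be shown on its own.

```c
#include <assert.h>

/*
 * Pick the default kernel from a CPU count, as item 2 proposes.  On a
 * live system the count would come from the hw.ncpu sysctl; here it is
 * an argument so the logic stands alone.
 */
static const char *select_default_kernel(int ncpu)
{
    assert(ncpu >= 1);
    return (ncpu > 1) ? "SMP" : "GENERIC";
}
```

The harder part described above, falling back to a UP kernel when the SMP one fails to boot, is not covered by this sketch.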

Thanks!

Scott

Re: scsi-target and the buffer cache

2005-12-07 Thread Scott Long


Eric Anderson wrote:

Nate Lawson wrote:


Eric Anderson wrote:

I'm curious about whether a target mode device would use the buffer 
cache or not.  Here's a scenario:


Host A: has fibre channel host adapter, in target mode, large memory 
pool, and another fiber channel host adapter connecting to fibre 
channel block device.
Host B: Fibre channel host adapter, connecting to Host A.  'sees' the 
target mode block device created by Host A.


Will Host A use the buffer cache to cache blocks between the real 
block device, and the shared target mode device?
What about if Host A put a filesystem on the block device, created a 
single file the size of the filesystem, and shared that filesystem 
via a target mode device to Host B?
What I'm wanting is a box (FreeBSD?) that can be placed between a 
fibre channel block device (like a RAID array), and a fibre channel 
host using that block device, and act as a block cache for that 
device, using the FreeBSD's memory.  If it had a significant amount 
of memory, this could be very useful.




If you use the example scsi_target usermode 
(/usr/share/examples/scsi_target), then the buffer cache will be used
since its reads/writes are from usermode like normal.  If you don't 
want that behavior, you can set O_DIRECT in the open() call of the 
backing store file.


If you chose to modify the kernel side, you'd have to make sure your 
accesses were through the VOP layer and then it would be cached.


You should check to be sure the target mode performance meets your 
expectations also.




I guess I would be using the user mode tool, unless there's another 
way?  Your comment on performance also makes me a little worried about 
that now - do you think I would see a large performance hit?

Thanks!
Eric




The way the target mode stack works in FreeBSD is that the kernel
provides some of the basic services, but the actual target emulator
is meant to live in userland.  The userland program responds to
events from the kernel via the select interface.  This generally
works pretty well.  However, it does mean that control has to
cross the kernel-userland boundary at least once for every event.

What I'd suggest doing is prototyping your target emulator in userland
and evaluating the performance there, and then moving it to the kernel
if you _really_ need more performance.

Scott
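
As a rough picture of the userland side, the event wait described above looks something like the following.  This is a minimal sketch with made-up function names; the descriptor stands in for whatever the emulator has open, and reading one byte stands in for handling one event.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/select.h>
#include <sys/time.h>
#include <unistd.h>

/*
 * Block until fd is readable or the timeout expires.
 * Returns 1 if readable, 0 on timeout, -1 on error.
 */
static int wait_for_event(int fd, int timeout_ms)
{
    fd_set rfds;
    struct timeval tv;
    int n;

    assert(fd >= 0 && fd < FD_SETSIZE);
    FD_ZERO(&rfds);
    FD_SET(fd, &rfds);
    tv.tv_sec = timeout_ms / 1000;
    tv.tv_usec = (timeout_ms % 1000) * 1000;
    n = select(fd + 1, &rfds, NULL, NULL, &tv);
    if (n < 0)
        return -1;
    return (n > 0 && FD_ISSET(fd, &rfds)) ? 1 : 0;
}

/* Consume one pending byte, standing in for handling one event. */
static int handle_event(int fd)
{
    char c;

    return (read(fd, &c, 1) == 1) ? 0 : -1;
}
```

Each trip through wait_for_event() and handle_event() is at least one crossing of the kernel-userland boundary, which is the per-event cost mentioned above.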

Re: scsi-target and the buffer cache

2005-12-07 Thread Scott Long


Eric Anderson wrote:

Nate Lawson wrote:


Scott Long wrote:


Eric Anderson wrote:


Nate Lawson wrote:


Eric Anderson wrote:

I'm curious about whether a target mode device would use the 
buffer cache or not.  Here's a scenario:


Host A: has fibre channel host adapter, in target mode, large 
memory pool, and another fiber channel host adapter connecting to 
fibre channel block device.
Host B: Fibre channel host adapter, connecting to Host A.  'sees' 
the target mode block device created by Host A.


Will Host A use the buffer cache to cache blocks between the real 
block device, and the shared target mode device?
What about if Host A put a filesystem on the block device, created 
a single file the size of the filesystem, and shared that 
filesystem via a target mode device to Host B?
What I'm wanting is a box (FreeBSD?) that can be placed between a 
fibre channel block device (like a RAID array), and a fibre 
channel host using that block device, and act as a block cache for 
that device, using the FreeBSD's memory.  If it had a significant 
amount of memory, this could be very useful.







If you use the example scsi_target usermode 
(/usr/share/examples/scsi_target), then the buffer cache will be
used since its reads/writes are from usermode like normal.  If you 
don't want that behavior, you can set O_DIRECT in the open() call 
of the backing store file.


If you chose to modify the kernel side, you'd have to make sure 
your accesses were through the VOP layer and then it would be cached.


You should check to be sure the target mode performance meets your 
expectations also.




I guess I would be using the user mode tool, unless there's another 
way?  Your comment on performance also makes me a little worried 
about that now - do you think I would see a large performance hit?

Thanks!
Eric




The way the target mode stack works in FreeBSD is that the kernel
provides some of the basic services, but the actual target emulator
is meant to live in userland.  The userland program responds to
events from the kernel via the select interface.  This generally
works pretty well.  However, it does mean that control has to
cross the kernel-userland boundary at least once for every event.

What I'd suggest doing is prototyping your target emulator in userland
and evaluating the performance there, and then moving it to the kernel
if you _really_ need more performance.




Agree 100%.  While having it in usermode means there are boundary 
crossings that increase per-transaction latency, the actual bulk data 
transfer is via zero-copy IO and you should be able to exceed the data 
transfer rates of several 10K RPM drives on decent hardware.





Ok, great.. Now, will scsi_target work ok with raw devices, or only 
files?  (although I'm not sure there's all that much difference really).


Thanks!!
Eric




You can write your userland code to use whatever files or devices you
want.  Are you talking about the scsi_target.c code in
/usr/share/examples?  That's just a skeletal example that you can use
as a starting point for your own work.

Scott
___
freebsd-hackers@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-hackers
To unsubscribe, send any mail to [EMAIL PROTECTED]

Re: sym(4) broken on amd64 (Time to port new driver?)

2005-11-23 Thread Scott Long


Sergey N. Voronkov wrote:

Looks like it has been broken for a while - _sym_calloc2: failed to allocate HCB
is always there... 


And... it looks like Gerard Roudier has no more interest in maintaining this
driver - there has been a second generation of the original driver in the Linux
source tree since 2001, which was never ported to FreeBSD by the author. And
the FreeBSD hooks were removed from the driver code...

Maybe it is time to port siop(4) from NetBSD?



No.  I'm working on fixing this right now.

Scott

Re: Sharing the same VM address space between Kernel and UserSpace

2005-11-14 Thread Scott Long


John Giacomoni wrote:


I am in need of a way to share memory between kernel space and possibly
multiple different user-space processes for an extended period of time.
This memory would need to be a single unpageable region.

I am using the vm routines as cribbed from mmap, however I'd like the
address spaces to be viewed as the same regardless of which process I'm
in to avoid swizzling pointers as I'm storing data structures in the
shared memory region.

I imagine I'd need to find a way to expose part of the kernel address
space to user space to accomplish this.

Is there a way to do this?

thanks

John G



If you get this working then it'll be very useful for the syspage 
support that was talked about recently.


The kernel can access addresses in the user space so long as they
are wired and won't cause a fault.  Thus I imagine that you
only need to allocate the memory, wire it, mark it with the appropriate
page permissions, and reserve a user address range for it in the
process map.  I'd look at the process exec path in the kernel for
places to hook in.  The only other trick then is how to let the user
process know the address for this magic region.  An easy way would be
to store it in a sysctl that can be read at runtime.  A harder way would
be to have the kernel dummy up an elf segment in the image activator
code that the dynamic linker could read and put into a global variable
for the program to access.

Scott


Re: twe and giant

2005-11-05 Thread Scott Long


Charles Sprickman wrote:

Hello all,

I was just wondering about this...  I recently bumped a soon-to-be 
production box to 6.0 as it seems like upgrading now is easier than 
doing it the week after the box goes into production (it's amazing how 
the release engineering team knows to schedule this way...:).


One thing I noticed in reviewing the boot messages is that twe is still 
under the giant lock:


twe0: 3ware Storage Controller. Driver version 1.50.01.002 port 
0x2860-0x286f mem 0xf400-0xf47f irq 17 at device 13.0 on pci0

twe0: [GIANT-LOCKED]
twe0: 4 ports, Firmware FE8S 1.05.00.068, BIOS BE7X 1.08.00.048

I bring this up because I could have sworn that I read here or elsewhere 
that this driver was revamped.  I also could have sworn that at some 
point in 5.x it was not under giant.  Maybe I'm imagining things...


Anyhow, does anyone know the status of this, and also is there a central 
repository that tracks changes like this that I can watch?


Thanks,

Charles


I have some old patches that lock twe.  They aren't quite complete or 
right due to an edge case with DMA handling.  I'll probably dust them 
off and finish them soon.


Scott

FreeBSD 6.0 Released

2005-11-04 Thread Scott Long



-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1


It is my great pleasure and privilege to announce the availability of
FreeBSD 6.0-RELEASE.  This release is the next step in delivering the
high performance and enterprise features that have been under
development in the FreeBSD 5.x series for the last several years.
Some of the many changes since 5.4 include:

~ Significant performance improvements to the filesystem and direct disk
  access layers of the OS.  The filesystem is now multithreaded and can
  take full advantage of multiple CPU systems.
~ Expanded support for wireless networking adapters and new support for
  the WPA wireless security protocol.
~ Experimental support for the PowerPC platform.

For a complete list of new features and known problems, please see the
release notes and errata list, available at:

http://www.FreeBSD.org/releases/6.0R/relnotes.html
http://www.FreeBSD.org/releases/6.0R/errata.html

For more information about FreeBSD release engineering activities,
please see:

http://www.FreeBSD.org/releng

Availability
------------

FreeBSD 6.0-RELEASE supports the i386, pc98, alpha, sparc64, amd64,
powerpc, and ia64 architectures and can be installed directly over the
net using bootable media or copied to a local NFS/FTP server.
Distributions for all architectures are available now.

Please continue to support the FreeBSD Project by purchasing media
from one of our supporting vendors.  The following companies will be
offering FreeBSD 6.0 based products:

~   FreeBSD Mall, Inc.   http://www.freebsdmall.com/
~   Daemonnews, Inc.     http://www.bsdmall.com/freebsd1.html

If you can't afford FreeBSD on media, are impatient, or just want to
use it for evangelism purposes, then by all means download the ISO
images.  We can't promise that all the mirror sites will carry the
larger ISO images, but they will at least be available from the
following sites.  MD5 and SHA256 checksums for the release images are
included at the bottom of this message.

BitTorrent
----------

The FreeBSD project encourages the use of BitTorrent for distributing
the release ISO images.  A collection of torrent files to download the
images is available at

ftp://ftp.freebsd.org/pub/FreeBSD/torrents/6.0-RELEASE

FTP
---

At the time of this announcement the following FTP sites have FreeBSD
6.0-RELEASE available.

ftp://ftp.FreeBSD.org/pub/FreeBSD/
ftp://ftp2.FreeBSD.org/pub/FreeBSD/
ftp://ftp3.FreeBSD.org/pub/FreeBSD/
ftp://ftp5.FreeBSD.org/pub/FreeBSD/
ftp://ftp.at.FreeBSD.org/pub/FreeBSD/
ftp://ftp2.ch.FreeBSD.org/pub/FreeBSD/
ftp://ftp.cz.FreeBSD.org/pub/FreeBSD/
ftp://ftp.ee.FreeBSD.org/pub/FreeBSD/
ftp://ftp.es.FreeBSD.org/pub/FreeBSD/
ftp://ftp.fi.FreeBSD.org/pub/FreeBSD/
ftp://ftp.fr.FreeBSD.org/pub/FreeBSD/
ftp://ftp2.ie.FreeBSD.org/pub/FreeBSD/
ftp://ftp.is.FreeBSD.org/pub/FreeBSD/
ftp://ftp5.pl.FreeBSD.org/pub/FreeBSD/
ftp://ftp3.ru.FreeBSD.org/pub/FreeBSD/
ftp://ftp.se.FreeBSD.org/pub/FreeBSD/
ftp://ftp.si.FreeBSD.org/pub/FreeBSD/
ftp://ftp2.tw.FreeBSD.org/pub/FreeBSD/
ftp://ftp.uk.FreeBSD.org/pub/FreeBSD/
ftp://ftp2.us.FreeBSD.org/pub/FreeBSD/
ftp://ftp5.us.FreeBSD.org/pub/FreeBSD/

FreeBSD is also available via anonymous FTP from mirror sites in the
following countries: Argentina, Australia, Brazil, Bulgaria, Canada,
China, Czech Republic, Denmark, Estonia, Finland, France, Germany,
Hong Kong, Hungary, Iceland, Ireland, Japan, Korea, Lithuania,
Amylonia, the Netherlands, New Zealand, Poland, Portugal, Romania,
Russia, Saudi Arabia, South Africa, Slovak Republic, Slovenia, Spain,
Sweden, Taiwan, Thailand, Ukraine, and the United Kingdom.

Before trying the central FTP site, please check your regional
mirror(s) first by going to:

ftp://ftp.yourdomain.FreeBSD.org/pub/FreeBSD

Any additional mirror sites will be labeled ftp2, ftp3 and so on.

More information about FreeBSD mirror sites can be found at:

http://www.FreeBSD.org/doc/en_US.ISO8859-1/books/handbook/mirrors-ftp.html

For instructions on installing FreeBSD, please see Chapter 2 of The
FreeBSD Handbook.  It provides a complete installation walk-through
for users new to FreeBSD, and can be found online at:

http://www.FreeBSD.org/doc/en_US.ISO8859-1/books/handbook/install.html

Acknowledgments
---------------

Many companies donated equipment, network access, or man-hours to
finance the release engineering activities for FreeBSD 6.0 including
The FreeBSD Foundation, FreeBSD Systems, Hewlett-Packard, Yahoo!,
Sentex Communications, and SPARTA.

The release engineering team for 6.0-RELEASE includes:

Scott Long [EMAIL PROTECTED]      Release Engineering,
                                  I386 and AMD64 Release Building
Ken Smith [EMAIL PROTECTED]       Sparc64 Release Building, Mirror
                                  Site Coordination
Robert Watson [EMAIL PROTECTED]   Release Engineering, Security
Doug White [EMAIL

Re: locking in a device driver

2005-11-02 Thread Scott Long


Dinesh Nair wrote:



On 11/03/05 03:12 Warner Losh said the following:


Yes.  if you tsleep with signals enabled, the periodic timer will go
off, and you'll return early.  This typically isn't what you want
either.



looks like i've got a lot of work to do, poring thru all the ioctls for 
the device and trying to use another method to wait instead of tsleep().


Note that a thread can block on select/poll in 4.x and still allow other
threads to run.  I used this to solve a very similar problem to yours in
a 4.x app of mine.  I have the app thread wait on select() on the device
node for the driver.  When the driver gets to a state when an ioctl
won't block (like data being available to read), then it does the
appropriate magic in its d_poll method.  select in userland sees this,
allows the thread to resume running, and the thread then calls ioctl.
Of course you have to be careful that you don't have multiple threads
competing for the same data or that the data won't somehow disappear
before the ioctl call runs.  But it does work.  Look at the aac(4)
driver for my example of this.
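
The userland half of that pattern could look roughly like the sketch below.  The function name wait_until_ready() is invented for illustration, and any descriptor can stand in for the device node; the driver's d_poll side is not shown.  Once the call returns 1, the thread would go on to issue the ioctl.

```c
#include <stddef.h>
#include <sys/select.h>
#include <sys/time.h>
#include <unistd.h>

/*
 * Block in select() until fd is readable or timeout_ms expires.
 * Returns 1 if readable, 0 on timeout, -1 on error.
 * Hypothetical helper; the real fd would be the driver's device node.
 */
int
wait_until_ready(int fd, long timeout_ms)
{
    fd_set rfds;
    struct timeval tv;
    int n;

    FD_ZERO(&rfds);
    FD_SET(fd, &rfds);
    tv.tv_sec = timeout_ms / 1000;
    tv.tv_usec = (timeout_ms % 1000) * 1000;

    /* Only this thread waits here; the driver reports readiness via d_poll. */
    n = select(fd + 1, &rfds, NULL, NULL, &tv);
    if (n < 0)
        return (-1);
    return (n > 0 && FD_ISSET(fd, &rfds));
}
```

The caveat from the text still applies: if several threads wait on the same descriptor, the data may be gone by the time the ioctl runs.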

The other option is to use rfork, aka 'linuxthreads' to simulate threads
via linked processes that share their address space.  Each 'thread' is
actually a process, and if one 'thread' blocks the rest are still 
allowed to run.  It's more heavy-weight than real threads, but it does
also work.




works.  If you use libc_r on 5, you'll see exactly this behavior.  If
you use libpthread or libthr, you won't.



i use gcc -pthread, so it's libc_r on 4.x. what does 'gcc -pthread' link 
to on 5.x ?




libpthread, I believe.

Scott

Re: locking in a device driver

2005-11-01 Thread Scott Long


Dinesh Nair wrote:



On 10/28/05 16:40 Dinesh Nair said the following:




On 10/28/05 10:52 M. Warner Losh said the following:


libc_r will block all other threads in the application while an ioctl
executes.  libpthread and libthr won't.  I've had several bugs at work




which is a Good Thing(tm) indeed for me on 4.x.



which may not be a Good Thing(tm) after all. this could be causing the 
problem i'm seeing with the driver on 4.x. any methods to get around 
this, short of not using threads ?




I think this thread has gone too far into hyperbole and conjecture. 
What is your code trying to do, and what problems are you seeing?


Scott

Re: locking in a device driver

2005-11-01 Thread Scott Long


Dinesh Nair wrote:



On 11/02/05 03:02 Julian Elischer said the following:


drops to splzero or similar,..
woken process called,
starts manipulating another buffer
collides with next interrupt.



that makes a lot of sense, i'll try with using splxxx() in the pseudo 
driver, to block out the real driver. it's currently splhigh() due to 
INTR_TYPE_MISC being used, but i guess i could change this to 
INTR_TYPE_NET or INTR_TYPE_TTY. what would be good for a 
telecommunications line card which is time sensitive and interrupts at a 
constant 1000Hz ?


INTR_TYPE_TTY and spltty




it needs to call splxxx() while it is doing it..
I would suggest having two buffers and swapping them under splxxx() so 
that the one that the driver is accessing is not the one you are draining.
That way the splxxx() level only needs to be held for the small time 
you are doing the swap.




the first buffer is actually the buffer into which DMA reads/writes are 
done. what i referred to as another buffer is in fact a ring of 
buffers. the real driver writes into the top of the ring, and increments 
the top ring pointer. the pseudo driver reads from the bottom of the 
ring and increments the bottom ring pointer.


buf1 buf2 buf3 buf4 buf5 buf6 buf7 buf8
  ^ ^
  | |
  | +-- top ring pointer, incremented as real driver reads
  | from device
  +-- bottom ring pointer, incremented as userland reads from pseudo



You'll also want to use an spl in the top half of the pseudo driver to
cover where the pointers are read and changed.
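
To make the ring described above concrete, here is a small userland model.  The names (struct ring, ring_put, ring_get) and the slot size are assumptions for illustration and do not come from the driver under discussion.  In a 4.x kernel the pointer reads and updates in the top half would sit between an spl raise (for example s = spltty()) and splx(s); those calls appear only as comments here because the sketch runs in userland.

```c
#include <stddef.h>
#include <string.h>

#define RING_SLOTS 8    /* buf1..buf8 in the diagram */
#define SLOT_BYTES 64   /* hypothetical slot size */

struct ring {
    unsigned char slot[RING_SLOTS][SLOT_BYTES];
    unsigned int top;     /* next slot the real driver fills */
    unsigned int bottom;  /* next slot the pseudo driver drains */
};

/* Producer side (real driver).  Returns 0, or -1 if the ring is full. */
int
ring_put(struct ring *r, const void *data, size_t len)
{
    unsigned int next = (r->top + 1) % RING_SLOTS;

    if (len > SLOT_BYTES || next == r->bottom)
        return (-1);    /* oversized, or no free slot */
    memcpy(r->slot[r->top], data, len);
    r->top = next;
    return (0);
}

/* Consumer side (pseudo driver top half).  Returns 0, or -1 if empty. */
int
ring_get(struct ring *r, void *out, size_t len)
{
    /* kernel: s = spltty(); before touching the pointers */
    if (len > SLOT_BYTES || r->bottom == r->top)
        return (-1);    /* oversized, or nothing to drain */
    memcpy(out, r->slot[r->bottom], len);
    r->bottom = (r->bottom + 1) % RING_SLOTS;
    /* kernel: splx(s); once the pointers are consistent again */
    return (0);
}
```

One slot is always left unused so that top == bottom unambiguously means "empty"; that is a design choice of this sketch, not something stated in the thread.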




not locks, but spl,
and only step 8 needs to be changed because all the rest are already 
done at high spl.



wouldnt a lockmgr() around the access to these ring buffers help since 
we're locking access to data and not necessarily execution ?




lockmgr is far too heavy-weight and complex for this.

Scott

Re: Display files currently in the buffer cache

2005-10-31 Thread Scott Long


Eric Anderson wrote:

Mark Kirkwood wrote:


Dear hackers,

I'm interested in being able to display some data about the contents 
of the buffer cache , say file name and page offset (something like 
IRIX's 'bufview').


Are there any utilities that do this currently? (searched around but 
didn't see anything in ports).


Assuming not, is it feasible to write one to do this? (if so, any 
pointers appreciated - massive FreeBSD internals newbie here).



This would be a cool tool!  I've been thinking of that too, and also 
would like to have a lkdump tool - which dumps information about 
currently locked files.


Eric





Does the FreeBSD VM really have a concept of filenames at all?  I
thought that all it understood was buffer objects and vnodes.  And
since there isn't a strong correlation between vnodes and the filesystem
namespace, it would be hard to provide such information.

Scott

Re: Very slow writing to SATA disk

2005-10-28 Thread Scott Long


Søren Schmidt wrote:



On 28/10/2005, at 23:45, Mikhail Teterin wrote:


Indeed, 55C is way too high for 24/7 usage, and it might be that the
drive is choking on it and barely is able to compensate.



The reads are pretty quick... I'd like to be able to spin it down, but
ataidle is broken :-(



Ask the maintainer to get it fixed, but be warned experience says it  
might hose your data...



What does SMART say? Anything unusual, like high correction rates?





(SMART data deleted)

Well except the excessive temperature nothing out of the ordinary...

Now, you say read speed is OK, but write speed isn't; is that on the raw 
disk device or through the filesystem?


Søren Schmidt
[EMAIL PROTECTED]



For what it's worth, I'm seeing slow write speeds on some tests with
other (non-ata) controllers.  Haven't had time to isolate it just yet.

Scott

Re: correct use of bus_dmamap_sync

2005-10-27 Thread Scott Long


Dinesh Nair wrote:


On 10/27/05 04:16 Scott Long said the following:

an example would be using (BUS_DMASYNC_POSTREAD|BUS_DMASYNC_PREWRITE) which
would be 0x03 in freebsd 4.x and 0x06 in freebsd 5.x. the gotcha is that
0x03 in freebsd 4.x is BUS_DMASYNC_POSTWRITE. so therefore,
BUS_DMASYNC_POSTREAD|BUS_DMASYNC_PREWRITE will be BUS_DMASYNC_POSTWRITE in
4.x which in the syscall is actually a no op.



Yes, that is fugly.  Just don't use the | versions for now I would 
guess.



Trying to maintain source compatibility between 4.x and 5.x/6.x will 
make you encounter a whole lot more problems than just this.



could you elaborate on what busdma related problems there'd be, between 
4.x and 5.x/6.x ? do, for example, the inner workings of the bus_dma* 
syscalls work the same on both ?




I was speaking about driver code in general.  For busdma specifically,
the only difference is the extra arguments to bus_dma_tag_create().

Scott

Re: locking in a device driver

2005-10-27 Thread Scott Long


Dinesh Nair wrote:


carrying on this discussion, what would be a good locking mechanism to 
use to protect tsleep() and other sensitive areas in a driver in freebsd 
4.x ?


the current code for the driver in 5.x uses mtx_lock and mtx_unlock with 
some parts even being protected by mtx_lock(Giant).


would the use of simple_lock() or s_lock() do, given that 
SIMPLELOCK_DEBUG was defined in the 4.x kernel ?


the mechanism is actually a pseudo device driver which communicates with 
the real device driver. the pseudo device driver creates a bunch of 
/dev/ devices which the userland reads/writes to, and the pseudo device 
driver then places data in a few buffers.


the real device driver then reads these buffers and uses busdma to send 
the data to the device. reading is done by using busdma to read from the 
device and then placing the data in these buffers for the pseudo device 
to return to the userland process.


locking in the real device driver uses splhigh/splx, but what locking 
should be used in the pseudo device driver ?




If you need to protect your pseudodriver from being interrupted by the
real driver then you'll need to use the same spl() as the driver.  Note
that you shouldn't be using splhigh() unless you really know what you
are doing.  Other than that, there likely isn't anything that you need
to do for 'locking' in 4.x.  The kernel is non-reentrant there, so you
don't need to worry about synchronizing multiple threads.

Scott

Re: locking in a device driver

2005-10-27 Thread Scott Long


M. Warner Losh wrote:

In message: [EMAIL PROTECTED]
Scott Long [EMAIL PROTECTED] writes:
: Dinesh Nair wrote:
:  
:  carrying on this discussion, what would be a good locking mechanism to 
:  use to protect tsleep() and other sensitive areas in a driver in freebsd 
:  4.x ?
:  
:  the current code for the driver in 5.x uses mtx_lock and mtx_unlock with 
:  some parts even being protected by mtx_lock(Giant).
:  
:  would the use of simple_lock() or s_lock() do, given that 
:  SIMPLELOCK_DEBUG was defined in the 4.x kernel ?
: 
:  the mechanism is actually a pseudo device driver which communicates with 
:  the real device driver. the pseudo device driver creates a bunch of 
:  /dev/ devices which the userland reads/writes to, and the pseudo device 
:  driver then places data in a few buffers.
:  
:  the real device driver then reads these buffers and uses busdma to send 
:  the data to the device. reading is done by using busdma to read from the 
:  device and then placing the data in these buffers for the pseudo device 
:  to return to the userland process.
:  
:  locking in the real device driver uses splhigh/splx, but what locking 
:  should be used in the pseudo device driver ?
:  
: 
: If you need to protect your pseudodriver from being interrupted by the
: real driver then you'll need to use the same spl() as the driver.  Note
: that you shouldn't be using splhigh() unless you really know what you
: are doing.  Other than that, there likely isn't anything that you need
: to do for 'locking' in 4.x.  The kernel is non-reentrant there, so you
: don't need to worry about synchronizing multiple threads.

One thing to also bear in mind is that in 4.x spl locking is a code
lock.  It keeps multiple 'threads' of execution from entering a block
of code.  mutexes in -current are data locks.  While usually one can
think of the two the same, it can trip the unwary up.

Locking in 4.x is indeed much simpler.

Warner


I wouldn't characterize spls that way.  An spl keeps top-half code from 
being preempted by an interrupt that would cause bottom half code to 
run.  It's more of a special critical section than a code serializer.

Its big advantage is that it doesn't mask out all interrupts, just
the ones that you want, and it's much more light weight on x86 than
doing explicit cli/sti instructions.  It's the BGL spinlock that keeps
multiple processes from executing the top half at the same time, and
there is no control over that; it's just 'there'.  The synchronization
guarantees that you have in the 4.x kernel are:

1.  Only one process will be executing in the kernel at a time. 
Multiple processes might be blocked at the same time, but only one
will be executing, regardless of the number of CPUs.

2.  Only one interrupt handler will execute at a time, and while it is
executing there will not be any top half code executing on any other CPUs.

3.  Interrupt handlers can preempt a process executing in the kernel 
unless the appropriate spl mask/level is set.


Scott

Re: [Fwd: Re: use of bus_dmamap_sync]

2005-10-26 Thread Scott Long


John Baldwin wrote:


On Wednesday 26 October 2005 04:47 am, Dinesh Nair wrote:


On 10/26/05 10:39 Scott Long said the following:


Apparently the original poster sent his question to me in private, then
sent it again to the mailing list right as I was responding in private.


apologies on that, scott. an initial search only turned up your message in
the archives, but spreading it wider (not confining the google to
lists.freebsd.org) brought up more hits, and that made me post it into
-hackers.

do bear with me as i try to understand this.



Below is my response.  Note that I edited it slightly to fix an error
that I found

 bus_dmamap_sync(tag, map, BUS_DMASYNC_PREREAD);
 Ask hardware for data
 bus_dmamap_sync(tag, map, BUS_DMASYNC_POSTREAD);



   read from readbuf (i'm assuming that device has put data in
  readbuf)
   POSITION B
}


in other words, the PREREAD/POSTREAD wrap around the device's access to
memory, and not the CPU's ?



Yes, scott's notes are more correct than mine here.



 bus_dmamap_sync(tag, map, BUS_DMASYNC_PREWRITE);
 notify hardware of the write
 bus_dmamap_sync(tag, map, BUS_DMASYNC_POSTWRITE);

The point of the syncs is to do the proper memory barrier and cache
coherency magic between the CPU and the bus as well as do the memory
copies for bounce buffers.  If you are dealing with statically mapped
buffers, i.e. for an rx/tx descriptor ring, then you'll want code


however, reading thru the syscall code, bus_dmamem_alloc() sets the dmamap
to NULL, and if it's null, bus_dmamap_sync() is not called at all. would
this mean that if memory is allocated by bus_dmamem_alloc(), it does not
need to be synced with bus_dmamap_sync() ?




The value of the map is an implementation detail, which is why it's an
opaque typedef.  Portable code should always assume that the map has
valid data.  Now, specifically for i386, if you have a device with a
4GB address limit, and it has no data alignment constraints (unlike 
twe), and you are not using PAE, then yes the map will be NULL and the 
syncs will do nothing.  Assuming that all three of these cases are false
is not good, though.



Perhaps on i386.  Each arch implements sync().  Argh, it does look like the 
memory barriers needed on e.g., Alpha aren't used with static buffers because 
of the map != NULL check in sys/busdma.h.  *sigh*  I guess archs that need 
membars even without bounce buffers need to always allocate and setup a 
bus_dmamap.  None of that matters for i386 though.




Feel free to fix alpha.  Again, long ago, I thought that alpha pretended
to be coherent in the 2GB DMA window that we use so that it could be
more like i386.  If that's not true then that's fine.  If you need to
make structural changes to the MI code in order to fix alpha, please let
me know.

Scott


Re: correct use of bus_dmamap_sync

2005-10-26 Thread Scott Long


John Baldwin wrote:


On Wednesday 26 October 2005 02:13 am, Dinesh Nair wrote:


On 10/26/05 04:10 John Baldwin said the following:


Yes, and on some archs the sync() operations do have memory barriers in
place, but there isn't any bounce buffering with bus_dmamem_alloc()
memory.


and in _bus_dmamap_load() in /usr/src/sys/i386/i386/busdma_machdep.c,
apparently if the second argument to bus_dmamap_load (the pointer to
bus_dmamap_t) is NULL, the syscall code sets it to nobounce_dmamap, a
static struct which doesn't seem to be used/allocated, except within the
syscall.

what would the implications of using NULL for the dmamap address be ?



Well, you need it to get the physical address to pass to your device for
it to do DMA against.


on freebsd 4.x, vtophys(buffer) returns the same value as this address
(i.e., when the callback function from bus_dmamap_load() is called, the
address of the segment returned is the same as vtophys(buffer)). this is
the current observed behaviour on 4.x.



On i386, yes.  It won't on sparc64 when using an IOMMU for example.  The whole 
point of using bus_dma is to not use vtophys() since by doing that you are 
assuming that the PA's used by the CPU map 1:1 to the addresses used by your 
device to do DMA, and on architectures with an IOMMU such as sparc64, G5 ppc 
boxes, and probably amd64 boxes in the future, that is not a valid assumption 
at all.




Well, the point of busdma is to make the DMA mechanics transparent to 
the driver.  It's not just about IOMMUs, it's also about handling 
alignment constraints and address boundaries and exclusion areas.  It's
a set-it-and-forget-it deal.  Set the requirements and constraints in 
the tag, follow the API, and the details Just Work without having to
worry about them.




have things changed between freebsd 4.x (which i'm using) and freebsd 5.x
?


I don't think so as far as the interface.


the values of the BUS_DMASYNC_ constants have changed though. they're
an enum with values 0-3 in 4.x but in 5.x they're defined as 0x01, 0x02,
0x04 and 0x08. due to this, combining BUS_DMASYNC_XXX thru an OR could
possibly give different behaviour on 4.x and 5.x.

an example would be using (BUS_DMASYNC_POSTREAD|BUS_DMASYNC_PREWRITE) which
would be 0x03 in freebsd 4.x and 0x06 in freebsd 5.x. the gotcha is that
0x03 in freebsd 4.x is BUS_DMASYNC_POSTWRITE. so therefore,
BUS_DMASYNC_POSTREAD|BUS_DMASYNC_PREWRITE will be BUS_DMASYNC_POSTWRITE in
4.x which in the syscall is actually a no op.
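
The arithmetic behind this gotcha can be checked with a few lines of C.  The V4_/V5_ names below are stand-ins so the sketch does not clash with the real headers, and the ordering of the 4.x enum (PREREAD=0, POSTREAD=1, PREWRITE=2, POSTWRITE=3) is inferred from the values quoted above rather than copied from the source tree.

```c
#include <string.h>

/* 4.x style: sequential enum values, as inferred from the message above. */
enum { V4_PREREAD = 0, V4_POSTREAD = 1, V4_PREWRITE = 2, V4_POSTWRITE = 3 };

/* 5.x style: one bit per operation, so values can be OR-ed together. */
#define V5_PREREAD   0x01
#define V5_POSTREAD  0x02
#define V5_PREWRITE  0x04
#define V5_POSTWRITE 0x08

/* How a 4.x style implementation would interpret an op: by equality. */
const char *
v4_name(int op)
{
    switch (op) {
    case V4_PREREAD:   return ("PREREAD");
    case V4_POSTREAD:  return ("POSTREAD");
    case V4_PREWRITE:  return ("PREWRITE");
    case V4_POSTWRITE: return ("POSTWRITE");
    default:           return ("invalid");
    }
}

/* How a 5.x style implementation can test for one op inside a mask. */
int
v5_has(int ops, int flag)
{
    return ((ops & flag) != 0);
}
```

With these definitions, v4_name(V4_POSTREAD | V4_PREWRITE) yields "POSTWRITE", while the 5.x style mask 0x06 still reports both POSTREAD and PREWRITE through v5_has().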



Yes, that is fugly.  Just don't use the | versions for now I would guess.



Trying to maintain source compatibility between 4.x and 5.x/6.x will 
make you encounter a whole lot more problems than just this.





also, in both 4.x and 5.x, only POSTREAD and PREWRITE have any real
meaning, as PREREAD and POSTWRITE are no ops.



On i386, yes.  Eventually those operations might be used to manipulate IOMMU 
mappings for example.




I honestly don't ever expect to see IOMMU code for i386.  The IOMMU that 
is provided by the AGP bus is fairly limited in what it can do, and 
trying to coordinate its use with X would be simply a nightmare.  I'm 
less clear on the IOMMU that exists for amd64 and whether it's a true 
IOMMU or just an aliasing of the AGP IOMMU.


Scott


[Fwd: Re: use of bus_dmamap_sync]

2005-10-25 Thread Scott Long

Apparently the original poster sent his question to me in private, then 
sent it again to the mailing list right as I was responding in private.

Anyways, no need to continue to guess; if anyone has any questions, feel
free to ask.

Below is my response.  Note that I edited it slightly to fix an error 
that I found


Scott

 Original Message 
Subject: Re: use of bus_dmamap_sync
Date: Tue, 25 Oct 2005 07:59:03 -0600
From: Scott Long [EMAIL PROTECTED]
To: Dinesh Nair [EMAIL PROTECTED]
References: [EMAIL PROTECTED]

Dinesh Nair wrote:


hi scott,

i came across this message of yours,
http://lists.freebsd.org/pipermail/freebsd-current/2004-December/044395.html 



and you seem like the perfect person to assist me in something. i've been
trying to figure out the best places to use bus_dmamap_sync when
reading/writing to a dma mapped address space. however, i cant seem to get
the gist of this, either from the mailing list discussions or the man page.
could you assist me ?

i'm on FreeBSD 4.11 right now, and i notice the definitions of 
BUS_DMASYNC_* has changed from an enum (0-3) in 4.x to a typedef in 5.x.


this is what i have done. i have used two buffers to handle reads from the
device and writes to the device. the pseudocode is as follows

rx_func()
{
    POSITION A

    bus_dmamap_sync(tag, map, BUS_DMASYNC_PREREAD);
    Ask hardware for data
    bus_dmamap_sync(tag, map, BUS_DMASYNC_POSTREAD);

    read from readbuf (i'm assuming that device has put data in
    readbuf)
    POSITION B
}

tx_func()
{
    POSITION C

    write to txbuf (here's where we write to txbuf)

    bus_dmamap_sync(tag, map, BUS_DMASYNC_PREWRITE);
    notify hardware of the write

    POSITION D

    bus_dmamap_sync(tag, map, BUS_DMASYNC_POSTWRITE);
}

what BUS_DMASYNC_{PRE,POST}{READ,WRITE} option should i use  for 
bus_dmamap_sync in position A, B, C and D ?


any assistance would be gladly appreciated, as i'm seeing some really weird
symptoms on this device, where data written out is being immediately read
in. i'm guessing this has to do with my wrong usage of bus_dmamap_sync().



The point of the syncs is to do the proper memory barrier and cache
coherency magic between the CPU and the bus as well as do the memory
copies for bounce buffers.  If you are dealing with statically mapped
buffers, i.e. for an rx/tx descriptor ring, then you'll want code
exactly like described above.  In reality, most platforms only do stuff
for the POSTREAD and PREWRITE cases, but for the sake of completeness
the others are documented and usually used in drivers.  NetBSD might
have platforms that require operations for PREREAD and POSTWRITE, but
I've never looked that closely.
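
To see why only POSTREAD and PREWRITE tend to do real work, it can help to model a bounce buffer in plain C.  This is a toy model under stated assumptions, not the FreeBSD implementation: struct toy_map, toy_sync() and the TOY_ constants are invented for the example, and cache flushing and memory barriers are left out entirely.

```c
#include <stddef.h>
#include <string.h>

/* Toy stand-ins for the four sync operations. */
enum toy_op { TOY_PREREAD, TOY_POSTREAD, TOY_PREWRITE, TOY_POSTWRITE };

struct toy_map {
    unsigned char *driver_buf;  /* memory the driver reads and writes */
    unsigned char *bounce_buf;  /* memory the device can actually reach */
    size_t len;
};

/* Model of a sync call when the transfer goes through a bounce buffer. */
void
toy_sync(struct toy_map *m, enum toy_op op)
{
    switch (op) {
    case TOY_PREWRITE:   /* CPU has filled the buffer: copy toward the device */
        memcpy(m->bounce_buf, m->driver_buf, m->len);
        break;
    case TOY_POSTREAD:   /* device has delivered data: copy toward the CPU */
        memcpy(m->driver_buf, m->bounce_buf, m->len);
        break;
    case TOY_PREREAD:
    case TOY_POSTWRITE:
        break;           /* nothing needs copying in this model */
    }
}
```

In this model, skipping PREWRITE means the device sees stale data, and skipping POSTREAD means the driver does; the other two calls are no-ops here, which matches the observation above about most platforms.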

If you are dealing with dynamic buffers,
i.e. for mbuf data, then you'll want the PREREAD and PREWRITE ops to
happen in the callback function for bus_dmamap_load() and the POSTREAD
and POSTWRITE ops to happen right before calling bus_dmamap_unload().  So
in this case it would be:

rx_buf()
{
        allocate buffer
        allocate map
        bus_dmamap_load(tag, map, buffer, size, rx_callback, arg, flags)
}

rx_callback(arg, segs, nsegs, error)
{
        convert segs to hardware format
        bus_dmamap_sync(tag, map, BUS_DMASYNC_PREREAD)
        notify hardware about buffer
}

rx_complete()
{
        bus_dmamap_sync(tag, map, BUS_DMASYNC_POSTREAD)
        bus_dmamap_unload(tag, map)
        deallocate map
        process buffer
}

tx_buf()
{
        fill buffer
        allocate map
        bus_dmamap_load(tag, map, buffer, size, tx_callback, arg, flags)
}

tx_callback(arg, segs, nsegs, error)
{
        convert segs to hardware format
        bus_dmamap_sync(tag, map, BUS_DMASYNC_PREWRITE)
        notify hardware about buffer
}

tx_complete()
{
        bus_dmamap_sync(tag, map, BUS_DMASYNC_POSTWRITE)
        bus_dmamap_unload(tag, map)
        deallocate map
        free buffer
}

This is the design that busdma was originally modelled on.  It works
well for storage devices where the load operation must succeed.  It
doesn't work as well for network devices where the latency of the
indirect calls is measurable.  So for that, I added
bus_dmamap_load_mbuf_sg().  It eliminates the callback function and
returns the scatter gather list directly.  So, the above example would
be:

tx_buf()
{
        bus_dma_segment_t segs[maxsegs];
        int nsegs;

        fill buffer (an mbuf chain)
        allocate map
        bus_dmamap_load_mbuf_sg(tag, map, mbuf, segs, &nsegs, flags)
        convert segs to hardware format
        bus_dmamap_sync(tag, map, BUS_DMASYNC_PREWRITE)
        notify hardware about buffer
}

Also, the 'allocate map' part should be done carefully.  Most network
drivers are lazy and call bus_dmamap_create() and bus_dmamap_destroy()
for each buffer.  It's often better to pre-allocate the maps at init
time, put them on a list, and then just push and pop them off the list
at runtime.  This is usually faster than calling the busdma
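
The pre-allocation idea can be sketched in plain userland C along these lines. This is only a model: pooled_map stands in for a bus_dmamap_t, all of the names are invented for the example, and a real driver would also need to hold its lock around the push and pop.

```c
#include <assert.h>
#include <stddef.h>

#define POOL_NMAPS 4

/* Hypothetical stand-in for a DMA map plus free-list linkage. */
struct pooled_map {
    int id;
    struct pooled_map *next;
};

struct map_pool {
    struct pooled_map maps[POOL_NMAPS];
    struct pooled_map *free_head;
};

/* Init time: "create" every map once and chain them onto the free list. */
static void
pool_init(struct map_pool *p)
{
    p->free_head = NULL;
    for (int i = 0; i < POOL_NMAPS; i++) {
        p->maps[i].id = i;
        p->maps[i].next = p->free_head;
        p->free_head = &p->maps[i];
    }
}

/* Runtime: pop a map; NULL means every map is currently in use. */
static struct pooled_map *
pool_get(struct map_pool *p)
{
    struct pooled_map *m = p->free_head;

    if (m != NULL) {
        p->free_head = m->next;
        m->next = NULL;
    }
    return m;
}

/* Runtime: push a map back once its buffer has been unloaded. */
static void
pool_put(struct map_pool *p, struct pooled_map *m)
{
    assert(m != NULL);
    m->next = p->free_head;
    p->free_head = m;
}
```

Both runtime operations are a couple of pointer assignments, so the per-packet cost no longer depends on how expensive map creation and destruction are on a given platform.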

Re: Driver Development Books?

2005-10-12 Thread Scott Long


Pete wrote:

Hello,
   I have what may seem to be a silly question, but I cannot find any
other decent resources on the web. The problem that I am having right
now is that I have a fairly nice graphics card which, for the moment, is
only supported on Windows operating systems and old 2.4 Linux kernels.
So far there has not been much positive outlook on porting the drivers
to *BSD or any of the 2.6 kernels that I know of, let alone 64-bit
drivers for non-Windows OSes.

So I guess that makes my question fairly simple then: I know that driver
code is written in C (which I am currently learning), but that's about
all I know. I'm probably not far off when I say that I need more to go
on. Yet, from looking at Amazon.com, I have not been able to find any
books on writing driver code, which is really frustrating.

One of my security-related books, Rootkits, tells me how to write
drivers for a completely different reason, so I know a bit more about
how they work, but again the code involved does not interface hardware
to the OS; it just injects a custom application. The other tool that I
will probably use is Jungo, which is a nice-looking application that
generates a skeletal version of the driver you need, but again, I would
not know how to fill it out.


Any help is appreciated.

-Pete



There are indeed no books that I know of on the subject of writing
drivers for any *BSD, let alone FreeBSD.  For the last year I've wanted
to sit down and write such a book, but the amount of time needed to do
this is daunting.   Anyways, there were a couple of articles published
back around 2000 on DaemonNews that covered some basic information on
writing kernel modules, and they are likely still available via the
various web search engines.  For more detailed information, you'll need
to dig into the kernel source code, look for appropriate manual pages,
and ask questions.  There are a number of really good people on this
list that try to answer most questions like this, so don't be afraid to
ask.

Scott
___
freebsd-hackers@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-hackers
To unsubscribe, send any mail to [EMAIL PROTECTED]

Re: Driver Development Books?

2005-10-12 Thread Scott Long


Sangwoo Shim wrote:


2005/10/12, Scott Long [EMAIL PROTECTED]:




There are indeed no books that I know of on the subject of writing
drivers for any *BSD, let alone FreeBSD.


[snip]


For me, the following book was quite helpful:
Embedded FreeBSD Cookbook, by Paul Cevoli
ISBN: 1589950046

It covers the basic kernel data structures needed for driver writing.
One of the best aspects of this book is that it shows real code for a
real device (a simple PCI device). Moreover, it was quite easy to read,
although it focuses on FreeBSD 4.x. For those who want an
_introduction_ to FreeBSD driver writing, I would recommend it.

Regard,
Sangwoo Shim


Ah, didn't know about that book.  Yes, that sounds like a good
foundation, though some aspects of drivers in 5.x and beyond are
vastly different than in 4.x and prior, particularly concerning
synchronization and interrupt behaviour.  The next step is to talk about
the different driver APIs and infrastructure, as well as debugging
guides.

Scott

Re: Fwd: Re: Linksys WRT54G with freebsd

2005-09-23 Thread Scott Long


Bruno Ducrot wrote:

On Fri, Sep 23, 2005 at 01:50:45PM +0200, Florent Thoumie wrote:


On Friday 23 September 2005 at 12:16 +0200, Bachilo Dmitry wrote:

Forwarding to FreeBSD hackers. (Because i am hacking WRT right now and only 
Linux flashes work)

--  Forwarded message  --

Subject: Re: Linksys WRT54G with freebsd
Date: 23 September 2005 17:06
From: Thierry Herbelot [EMAIL PROTECTED]
To: freebsd-current@freebsd.org
Cc: Marcos Biscaysaqu - ThePacific.net [EMAIL PROTECTED]

On Friday 23 September 2005 11:08, you wrote:


On the other hand, it's the wireless thing.  If not needed, this
should  be fun to do a port, somehow, even though it's a wireless
router.


The cool factor of porting FreeBSD to the WRT54G cannot be overestimated,
but the Linux ports were enormously helped by the opening of the sources of
the Linksys Linux port (which is absent for FreeBSD) and the large number of
willing developers (just have a look at the *number* of different Linux
ports to the WRT).

The latest 6.0 release would be an excellent target, with its brand-new
support for WPA and virtual APs ... who volunteers ?


	The Linksys WRT54g wireless router is based on a Broadcom CPU 
	(derived from MIPS) and FreeBSD/mips seems to be a dead 
	project :-(



Indeed.  It's targeted to SGI platforms anyway.  Maybe there is a need to
start a new port if there are enough people interested?



There has been talk of doing this in the past year from some people, but
I don't know if it got very far.  If you're inspired, go for it!  There
are plenty of docs on the web about how to attach a serial port header
and bootstrap it.  And, don't underestimate the mips32 work that is
already in the tree; it's likely a good starting point.

And, it's more than just a 'coolness' factor.  I'd really like to have
pf running on mine, that way I could get rid of the clunky machine doing
static NAT + firewall on my DSL line.  The Linux firewall capabilities
are soo last century =-)

Scott

Re: KINDLY HELP : error while kldloading a pci,character driver

2005-09-21 Thread Scott Long


rashmi ns wrote:

Hi,
 Amazing, Thanks a lot it really works . 
 Now i have to read what D_VERSION does :-)

Thanks ,
Rashmi.N.S


You also need to remove .d_maj.  /dev entries are created dynamically
now, and your application should have no knowledge of the major and
minor number internals.

Scott

Re: PCI_MULTI FUNCTION DEVICE DRIVERS

2005-09-13 Thread Scott Long


rashmi ns wrote:

Hello All,
I am writing a PCI driver for an HDLC controller which has two functions:
1. Bridge
2. Network
Do we need to write two separate drivers, one for each class code, or can a
single driver manage the two different functionalities? Are there any examples
of PCI multi-function drivers? I read in the documents that we need to use the
mbuf structure for multi-function devices; are there any drivers which do
this?

Thanks in advance ,
Rashmi.N.S


In the FreeBSD driver model, the driver 'probe' method will get called
for each PCI function in the system.  The driver can either bid to
claim it, or reject it.  It's up to the driver author as to what
criteria is used to bid/accept vs reject a function.  Almost all drivers
look at the PCI device ID set.  Looking at merely the PCI class code is
not recommended since it is far too ambiguous.  Also, unlike Linux, the
driver does not have easy access to the PCI enumeration internals of the
OS.  There are also no guarantees as to what order the bus will be
probed or what order functions will be enumerated in.
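
To make the bid/reject decision concrete, here is a userland sketch of the table lookup that a probe method typically performs. The vendor and device IDs, the model_* names and the return values are all invented for the example; a real probe would read the IDs with pci_get_vendor() and pci_get_device() and return a BUS_PROBE_* value or ENXIO.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical IDs for illustration only; not a real device. */
#define MODEL_VENDOR_ID   0x1234
#define MODEL_DEVICE_NET  0x0001

/* Simplified results: the real probe return values differ. */
#define MODEL_PROBE_CLAIM   0
#define MODEL_PROBE_REJECT  (-1)

struct model_pci_id {
    uint16_t vendor;
    uint16_t device;
    const char *desc;
};

/* Every function this driver is willing to claim. */
static const struct model_pci_id model_ids[] = {
    { MODEL_VENDOR_ID, MODEL_DEVICE_NET, "Example HDLC network function" },
};

/* Called once per PCI function; claims only exact vendor/device matches. */
static int
model_probe(uint16_t vendor, uint16_t device, const char **desc)
{
    for (size_t i = 0; i < sizeof(model_ids) / sizeof(model_ids[0]); i++) {
        if (model_ids[i].vendor == vendor &&
            model_ids[i].device == device) {
            if (desc != NULL)
                *desc = model_ids[i].desc;
            return MODEL_PROBE_CLAIM;
        }
    }
    return MODEL_PROBE_REJECT;
}
```

Because the match is on the exact ID pair, the bridge function of the same card is rejected by this driver unless its ID is added to the table, regardless of the order in which the functions are enumerated.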

Do you actually need to program both functions of the hardware?  Usually
a bridge device tends to be passive from a driver standpoint.  Is there
something special that your bridge does?  If so then you'll need to
write two sets of drivers, each with a unique probe, attach, and detach
method.  Making the separate driver instances work together will be a
bit tricky.  The easy but really messy approach is to create some global
variables and methods and have the drivers cheat.  I'd avoid this if at
all possible.  Probably the more correct approach is to have each
function walk the device tree and look for its sibling, then communicate
via custom DEVMETHOD's.  This does have performance implications though,
so a combination of walking the tree then calling via direct dispatch
is probably the best approach if performance is a factor.

If you could provide more information about what your device is and what
each function does, I can probably give better answers.

Scott

Re: Adding new option to ktrace

2005-09-06 Thread Scott Long


Nikhil Dharashivkar wrote:

Hi Scott and Rajesh,
 Thanks for replying. Basically, what happened is that while testing a
SCSI driver on FreeBSD, at some point it crashes, so there is no way
to know how much I/O was performed. To learn the I/O state just before
the driver fails, I chose ktrace to print whatever I/O information I can
get from the dastrategy routine.


You have reason to believe that certain I/O patterns cause the crash?
What driver is being used?  What is the crash?

Scott



Re: Adding new option to ktrace

2005-09-05 Thread Scott Long


Nikhil Dharashivkar wrote:

Hi,
   I want to hack the ktrace system call. Basically, I want to monitor
SCSI disk I/O through the dastrategy() routine.
It seems that kern_ktrace.c implements different functions for
ktrace options like -tc / -ti ... etc. (see the man page). So, is it
possible to add a new option for disk I/O, with a new structure
containing disk I/O information which will be passed to
ktr_submitrequest() through the ktr_request structure?
 Will the data be written correctly to ktrace.out, and will
kdump be able to analyze it?





What are you trying to monitor?  Would the existing devstat interface
work?

Scott

Re: Adding new option to ktrace

2005-09-05 Thread Scott Long


Rajesh S. Ghanekar wrote:

Scott Long wrote:



What are you trying to monitor?  Would the existing devstat interface
work?



Maybe he needs to know how many bytes are transferred (read/written) while a
process is executing.
I guess devstat doesn't do this per process; it gives the total I/O
reads/writes for a device,
if registered via devstat. Please correct me if I am wrong.


- Rajesh



There isn't a 1:1 correlation between the bytes that the userland
program writes, and the bytes that actually get written to disk.
Filesystem metadata writes will happen if the file needs to be
extended, not to mention the access time being updated.  Some writes
won't even originate from a userland program, like swap writes.
GEOM also decouples the I/O path, so it's not the user process that
will actually do the write, it's the g_down kthread.  I would think
that this would make tracking I/O via ktrace very hard.

Scott

Re: Adding new option to ktrace

2005-09-05 Thread Scott Long


Nikhil Dharashivkar wrote:

Yes, what Rajesh is saying is right; I want to print I/O bytes.


You want to capture writes coming from userland, or you want to
capture all low-level disk writes?  Are you trying to correlate
these writes with a particular user process?  Consider an mmapped
file.  A userland program will modify the memory fronting the file,
and at some point the pagedaemon kthread will come in and flush those
dirty pages, independent of the user process.  Also, like I said,
device strategy routines are decoupled from the syscall callers by
the g_down kthread.  Trying to figure out the userland thread from
dastrategy that is responsible for the I/O is going to be tricky,
if even possible at all.

Scott




Re: Low umass performance with USB 2.0 ports

2005-08-31 Thread Scott Long


Ian Dowse wrote:

In message [EMAIL PROTECTED], Eygene A. Ryabinkin writes:


What filesystem does your USB drive have?


The one I was extensively testing has FAT, but I've checked the UFS2 --
just a bit better -- 1.8 Mb/second. But you're right -- no wdrains at all.


FreeBSD 4.x had very low performance with FAT filesystem,
writing process spent lots of time in the wdrain state too.


Yes, it has. But here the same flash drive gives different results for
ehci and uhci devices, and the total speed of ehci is lower due to wdrains:
300 Kb/sec versus 500 Kb/sec. And I sometimes write my data to the Windows
partition with FAT to my home HDD -- it has no wdrains. At least, I've not
noticed them. For flash I can.



The patch from the email below may help with the wdrain state -
can you see if it makes any difference?


Is the problem that the interrupt gets fired but not all of the status
information has made its way back to host memory when the driver gets
there?  Would it make a difference to instead read back the EHCI_USBSTS
register after writing to it in ehci_intr1?  That way all transactions
down to the controller would be guaranteed to be flushed before you
continue on.  I wonder if this is a remnant of the famous problems with
VIA chipsets doing bad things under medium-to-high PCI contention.  I
don't see any obvious workarounds for this in the Linux EHCI code, so
I wonder if it's a case of them not encountering it, or doing something
different that avoids the problem.

Scott

Re: Low umass performance with USB 2.0 ports

2005-08-31 Thread Scott Long



Actually, I just peeked inside the Linux EHCI code and it does a dummy
read immediately after writing to the status register:

	/* clear (just) interrupts */
	writel (status, &ehci->regs->status);
	readl (&ehci->regs->command);	/* unblock posted write */

I wonder if that's the whole trick here.  Would someone be willing to
try the attached patch instead of the one that Ian posted?

Scott
Index: ehci.c
===================================================================
RCS file: /usr/ncvs/src/sys/dev/usb/ehci.c,v
retrieving revision 1.36
diff -u -r1.36 ehci.c
--- ehci.c	29 May 2005 04:42:27 -0000	1.36
+++ ehci.c	31 Aug 2005 19:44:14 -0000
@@ -578,6 +578,7 @@
 		return (0);
 
 	EOWRITE4(sc, EHCI_USBSTS, intrs); /* Acknowledge */
+	EOREAD4(sc, EHCI_USBCMD); /* Flush posted writes on PCI */
 	sc->sc_bus.intr_context++;
 	sc->sc_bus.no_intrs++;
 	if (eintrs & EHCI_STS_IAA) {

Re: Low umass performance with USB 2.0 ports

2005-08-31 Thread Scott Long


Hans Petter Selasky wrote:


On Wednesday 31 August 2005 21:47, Scott Long wrote:



Actually, I just peeked inside the Linux EHCI code and it does a dummy
read immediately after writing to the status register:

	/* clear (just) interrupts */
	writel (status, &ehci->regs->status);
	readl (&ehci->regs->command);	/* unblock posted write */

I wonder if that's the whole trick here.  Would someone be willing to
try the attached patch instead of the one that Ian posted?

Scott



This is not documented in the EHCI chip specification.


Flushing posted writes is something that all programmers of PCI devices
should understand, so it usually isn't documented in device manuals.
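
For anyone who has not run into posted writes before, a toy userland model may help. It assumes a bridge that queues register writes and only delivers them when a read comes through; real hardware eventually drains the queue on its own, so this exaggerates the effect. All of the names are invented for the sketch.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define MODEL_NREGS   4
#define MODEL_QDEPTH  8

/* A toy bridge: writes are queued ("posted"); reads drain the queue first. */
struct model_bridge {
    uint32_t regs[MODEL_NREGS];     /* what the device really holds */
    struct {
        int reg;
        uint32_t val;
    } q[MODEL_QDEPTH];              /* writes still sitting in the bridge */
    int qlen;
};

static void
model_bridge_init(struct model_bridge *b)
{
    memset(b, 0, sizeof(*b));
}

/* The write "completes" from the CPU's view before the device sees it. */
static void
model_write4(struct model_bridge *b, int reg, uint32_t val)
{
    assert(reg >= 0 && reg < MODEL_NREGS);
    assert(b->qlen < MODEL_QDEPTH);
    b->q[b->qlen].reg = reg;
    b->q[b->qlen].val = val;
    b->qlen++;
}

/* A read cannot pass earlier writes, so it forces them out in order. */
static uint32_t
model_read4(struct model_bridge *b, int reg)
{
    assert(reg >= 0 && reg < MODEL_NREGS);
    for (int i = 0; i < b->qlen; i++)
        b->regs[b->q[i].reg] = b->q[i].val;
    b->qlen = 0;
    return b->regs[reg];
}
```

In this model the dummy read plays the same role as the read of the command register after the status write: its return value is thrown away, and only the flushing side effect matters.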

There exists the doorbell to ensure that the EHCI controller is finished
with data structures. Also I have noticed that the existing EHCI driver does
not always dequeue structures from the controller before accessing them.




Can you point to an example here?

If Scott's patch doesn't work, could you have tried to install the following 
(compiles on FreeBSD 5/6/7):




Yeah, looks like my guess was wrong.

Scott


Re: Syscall/Sysret state on i386 arch

2005-08-29 Thread Scott Long


John Baldwin wrote:

On Sunday 28 August 2005 10:32 am, alexander wrote:


The AMD64 arch is using the syscall/sysret opcodes instead of int80h to
perform a syscall (/usr/src/lib/libc/amd64/SYS.h). I just checked the
output of my dmesg and it says:

CPU: AMD Duron(tm) Processor (1311.69-MHz 686-class CPU)
 Origin = AuthenticAMD  Id = 0x671  Stepping = 1
 Features=0x383f9ff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,MMX,FXSR,SSE>
 AMD Features=0xc0400800<SYSCALL,MMX+,3DNow+,3DNow>

I got a hold of the AMD document number 21086.pdf. It describes both
opcodes pretty well, but doesn't tell which CPUs support the new opcodes.
But since the first revision of that document is dated Sept 1997, quite a
lot of i386 CPUs should support the opcodes. The NASM manual only states
[P6,AMD] as the required CPU to perform those opcodes.

I found some patches for Linux that replace the int80h syscall calling
convention with syscall/sysret on i386, and the results look pretty
convincing:

(INT $0x80 based getpid(), got pid 497) latency:282 cycles
(SYSENTER based getpid(), got pid 497) latency:138 cycles

on a 266 MHz PII this is 0.51 usecs for a getpid(). (was 1.06 usecs)


Quoted from: http://www.ussg.iu.edu/hypermail/linux/kernel/9806.1/0878.html

Does anybody know more about this? Is it even possible to replace the
current syscall implementation that easily or would that require elaborate
changes to all the syscalls (libc), etc. And which CPUs support these new
opcodes? Does anybody know if the Linux patches actually got committed to
the official kernel?



Support for syscall/sysret is determined by a cpuid flag.  I do believe 
someone has worked on either syscall/sysret or sysenter/sysexit support in a 
p4 branch.  You can try asking jeff@ about it.  I think it was 
sysenter/sysexit and it didn't really improve things much.




Actually, the results were fairly inconclusive because it was also 
somewhat unstable under real loads.


The work is in Perforce under

 //depot/user/jeff/sysenter/...

I've worked on this branch also, but not in a few months.  I can
make patches if anyone is interested.

Scott

Re: Checking sysctl values from within the kernel.

2005-08-05 Thread Scott Long


Dan Nelson wrote:


In the last episode (Aug 05), Thordur I. Bjornsson said:


If I want to check a sysctl value from within the kernel (e.g. a
KLD), should I use the system calls described in sysctl(3)?

If not, what is the proper way to do so?



Since most sysctls are direct mappings onto integer variables in the
kernel, just check the variable directly.



Most of those integer values are also declared static, so they won't
be visible to external code, especially not kld's.

There is no easy way to do this.  I'm sure that you could hack up some
code to simulate a sysctl syscall from within the kernel, but that would
be really really gross, evil, and wrong.  What values are you trying to
get at?  Would it make more sense to export them via real accessor
functions?
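
By "real accessor functions" I mean something along the lines of the sketch below. It is plain C with invented names (model_max_retries and friends) and an arbitrary range check, just to show the shape: the variable stays static and other code goes through the functions.

```c
#include <assert.h>

/* File-scope tunable; 'static' hides it from other modules, as noted above. */
static int model_max_retries = 3;

/* Exported in place of the variable itself (hypothetical names). */
int
model_get_max_retries(void)
{
    /* The setter below never stores a negative value. */
    assert(model_max_retries >= 0);
    return model_max_retries;
}

/* Returns 0 on success, -1 if the value is outside the accepted range. */
int
model_set_max_retries(int v)
{
    if (v < 0 || v > 100)
        return -1;          /* reject out-of-range values */
    model_max_retries = v;
    return 0;
}
```

The owner of the variable keeps control over validation and locking this way, which is harder to arrange when other code reaches in through the sysctl tree.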

Scott

Re: Checking sysctl values from within the kernel.

2005-08-05 Thread Scott Long


John Baldwin wrote:



There's also a kernel_sysctl() function available in the kernel for in-kernel
access to sysctls.  You might have to look up the OID for a given name
yourself though.  Actually, there's a kernel_sysctlbyname() as well.




Shoot, forgot about that function.  However, exporting data throughout
the kernel via the sysctl interface sounds like poor design.

Scott

Re: UFS endian-ness

2005-07-30 Thread Scott Long


M. Warner Losh wrote:

In message: [EMAIL PROTECTED]
Jeremy Baggs [EMAIL PROTECTED] writes:
:   I was wondering if anyone has done any recent work with, or knows how
: (non-)trivial it would be to add support for mounting big-endian UFS
: filesystems, such as the one in use on OS X.


It is trivial.  NetBSD just does the swapping on input or output and
the diffs to do it were small.

Warner


Do their patches include UFS2 and EA support?

Scott

Re: await asleep

2005-07-27 Thread Scott Long


Daniel Eischen wrote:

On Wed, 27 Jul 2005, Norbert Koch wrote:



The functions await() and asleep() in kern_synch.c
are marked as EXPERIMENTAL/UNTESTED.
Is this comment still valid? Has anyone used
those functions successfully? Should I avoid
using them in my device driver code for RELENG_4?
How do I correctly cancel a request (as I should do
according to the man page): asleep (NULL, 0, NULL, 0)?


The await family was removed in 5.x and beyond, so trying to
use them in 4.x will make your driver very unportable.  There
are better ways than await to handle delayed events.



Well, there's tsleep() and wakeup() for FreeBSD < 5.0.  Other
than that, what else can you do?  These functions are deprecated
in 5.x and 6.x in favor of condvar(9) and mutex(9), so you should
really use those instead of tsleep() and wakeup().

It seems the kernel in -current is still using tsleep() and
wakeup() in some places.  I thought we got rid of all these...



  Can you explain why tsleep and wakeup should no longer be
used?  I wasn't aware that they were formally deprecated.

Scott

Re: await asleep

2005-07-27 Thread Scott Long


Daniel Eischen wrote:




My mistake then.  I thought they were deprecated when mutex and
CVs were introduced.  There is no need for them except for compatibility,


Incorrect.  A mutex is not a replacement for sleep.  CV's and semaphores
implement some of what tsleep does, but tsleep is absolutely appropriate
when you want to sleep for an event (like disk i/o completing) and don't
need to worry about mutexes.  Not every inch of the kernel needs to be
covered by mutexes, Giant or otherwise.


and the priority argument of tsleep() doesn't have any meaning
any longer, right?



I thought it did, but John can give the definitive answer.

Scott

Re: how to use the function copyout()

2005-07-26 Thread Scott Long


Felix-KM wrote:

I think that could work (only an idea, not tested):


struct Region
{
 void * p;
 size_t s;
};


#define IOBIG _IOWR ('b', 123, struct Region)


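userland:

 char data[1000];
 struct Region r;

 r.p = data;
 r.s = sizeof data;
 int error = ioctl (fd, IOBIG, &r);   /* pass the struct by address */


kernel:
 int my_ioctl(..., caddr_t data, ...)
 {
   ...
   struct Region *r = (struct Region *) data;
   char buf[1000];   /* renamed so it does not shadow the 'data' argument */
   ...
   if (r->s > sizeof buf)
     return EINVAL;  /* do not copy past the end of buf */
   return copyout(buf, r->p, r->s);
 }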


Have a try and tell us if it works.


Norbert




Yes! Now the program works!
I have changed the code in this way:

struct Region
{
  void * p;
  size_t s;
};

#define IOBIG _IOWR ('b', 123, struct Region)



Unless your ioctl handler is going to modify values in the Region struct
and pass them back out to userland, you should just use _IOW instead of
_IOWR.
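
Separately from the direction bits, the handler should not trust the size that userland passes in. Below is a userland sketch of the check, with memcpy() standing in for copyout(9); model_copy_region() and its arguments are invented for the example and are not part of any kernel API.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

struct Region {
    void  *p;   /* destination buffer supplied by the caller */
    size_t s;   /* size of that buffer in bytes */
};

/*
 * Userland stand-in for the handler's copy step. In a real handler the
 * memcpy() would be copyout(9); everything else here is illustrative.
 */
static int
model_copy_region(const struct Region *r, const void *src, size_t srclen)
{
    if (r == NULL || r->p == NULL || src == NULL)
        return EINVAL;
    if (r->s > srclen)
        return EINVAL;      /* never copy more than the driver holds */
    memcpy(r->p, src, r->s);
    return 0;
}
```

Without the size check, a caller that sets s larger than the driver's buffer would get whatever happens to sit after that buffer copied out to it.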


Scott