Re: IBM blade server abysmal disk write performances
On Jan 19, 2013, at 4:33 PM, Wojciech Puchar woj...@wojtek.tensor.gdynia.pl wrote:

>> to be enabled to get any speed-up from tagged commands. This was no risk with SCSI drives, since the cache did not make the drives lie
>
> i see no correlation between interface type and possibility of lying about command completion.

Any interface that enables write cache will lie about write completions. This is true for SAS, SATA, SCSI, and PATA (and probably FC and iSCSI). That's the whole point of the write cache =-)

Where things got interesting was in the days of SCSI vs PATA. There was no tagged queuing for PATA, except for a hack that allowed CDROMs to disconnect from the shared bus. So you only got 1 command at a time, and you paid a serialized latency penalty. The only way to get reasonable write performance on PATA was to enable the write cache. Meanwhile, SCSI had TCQ and could amortize the latency penalty to the point where performance with TCQ and no WC was almost as good as with WC. This made SCSI the clear choice for performance + data safety.

With SATA vs SAS, the gap is much narrower. The TCQ command set (still used by SAS) is still better than the NCQ command set, but the differences are minor enough that it doesn't matter for most applications.

Scott

___
freebsd-hackers@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-hackers
To unsubscribe, send any mail to freebsd-hackers-unsubscr...@freebsd.org
Re: IBM blade server abysmal disk write performances
Try adding the following to /boot/loader.conf and reboot:

    hw.mpt.enable_sata_wc=1

The default value, -1, instructs the driver to leave the SATA drives at their configuration default. Often this means that the MPT BIOS will turn off the write cache on every system boot sequence. IT DOES THIS FOR A GOOD REASON! An enabled write cache is counter to data reliability. Yes, it helps make benchmarks look really good, and it's acceptable if your data can be safely thrown away (for example, you're just caching from a slower source, and the cache can be rebuilt if it gets corrupted). And yes, Linux has many tricks to make this benchmark look really good. The tricks range from buffering the raw device to having 'dd' recognize the requested task and short-circuit the process of going to /dev/null or pulling from /dev/zero. I can't tell you how bogus these tests are and how completely irrelevant they are in predicting actual workload performance. But, I'm not going to stop anyone from trying, so give the above tunable a try and let me know how it works.

Btw, I'm not subscribed to the hackers mailing list, so please redistribute this email as needed.

Scott

From: Dieter BSD dieter...@gmail.com
To: freebsd-hackers@freebsd.org
Cc: mja...@freebsd.org; gi...@freebsd.org; sco...@freebsd.org
Sent: Thursday, January 17, 2013 9:03 PM
Subject: Re: IBM blade server abysmal disk write performances

>> I am thinking that something fancy in that SAS drive is not being handled correctly by the FreeBSD driver.
>
> I think so too, and I think the something fancy is tagged command queuing. The driver prints "da0: Command Queueing enabled" and yet your SAS drive is only getting 1 write per rev, and queuing should get you more than that. Your SATA drive is getting the expected performance, which means that NCQ must be working.
>
>> Please let me know if there is anything you would like me to run on the BSD 9.1 system to help diagnose this issue?
>
> Looking at the mpt driver, a verbose boot may give more info. Looks like you can set a debug device hint, but I don't see any documentation on what to set it to. I think it is time to ask the driver wizards why TCQ isn't working, so I'm cc-ing the authors listed on the mpt man page.
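For a sense of scale, "1 write per rev" can be turned into numbers. The sketch below is only back-of-envelope arithmetic: the 7200 RPM spindle speed and the 4096-byte block size are assumptions picked for illustration, not figures reported in this thread, and the function names are made up for the example.

```python
def writes_per_second(rpm, writes_per_rev=1.0):
    """Completed writes per second when the drive manages a fixed
    number of writes per platter revolution."""
    return rpm / 60.0 * writes_per_rev


def throughput_bytes(rpm, block_size, writes_per_rev=1.0):
    """Sustained write throughput in bytes per second for the given
    block size (both arguments are assumed values, see above)."""
    return writes_per_second(rpm, writes_per_rev) * block_size


if __name__ == "__main__":
    # Hypothetical drive: 7200 RPM, one 4096-byte write per revolution.
    print(writes_per_second(7200))
    print(throughput_bytes(7200, 4096))
```

Under those assumed inputs a drive that completes one write per revolution tops out at rpm/60 writes per second regardless of interface speed, which is why working queuing (several writes per revolution) matters so much for small synchronous writes.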
Re: IBM blade server abysmal disk write performances
----- Original Message -----
From: Wojciech Puchar woj...@wojtek.tensor.gdynia.pl
To: Scott Long scott4l...@yahoo.com
Cc: Dieter BSD dieter...@gmail.com; freebsd-hackers@freebsd.org; gi...@freebsd.org; sco...@freebsd.org; mja...@freebsd.org
Sent: Friday, January 18, 2013 11:10 AM
Subject: Re: IBM blade server abysmal disk write performances

>> The default value, -1, instructs the driver to leave the SATA drives at their configuration default. Often this means that the MPT BIOS will turn off the write cache on every system boot sequence. IT DOES THIS FOR A GOOD REASON! An enabled write cache is counter to data reliability. Yes, it helps make benchmarks look really good, and it's acceptable if your data can be safely thrown away (for example, you're just caching from a slower source, and the cache can be rebuilt if it gets corrupted). And yes, Linux has many tricks to make this benchmark look really good. The tricks range from buffering the raw device to having 'dd' recognize the requested task and short-circuit the process of going to /dev/null or pulling from /dev/zero. I can't tell you how bogus these tests are and how completely irrelevant they are in predicting actual workload performance. But, I'm not going to stop anyone from trying, so give the above tunable a try and let me know how it works.
>
> If the computer has a UPS then write caching is fine. Even if FreeBSD crashes, the disk would still write the data.

I suspect that I'm encountering situations right now at Netflix where this advice is not true. I have drives that are seeing intermittent errors, then being forced into reset after a timeout, and then coming back up with filesystem problems. It's only a suspicion at this point, not a confirmed case.

Scott
Re: IBM blade server abysmal disk write performances
On Jan 18, 2013, at 1:12 PM, Dieter BSD dieter...@gmail.com wrote:

> It is inexcusable that FreeBSD defaults to leaving the write cache on for SATA and PATA drives.

This was completely driven by the need to satisfy idiotic benchmarkers, tech writers, and system administrators. It was a huge deal for FreeBSD 4.4, IIRC. It had been silently enabled; we turned it off, released 4.4, and then got murdered in the press for being slow. If I had my way, the WC would be off, everyone would be using SAS, and anyone who enabled SATA WC or complained about I/O slowness would be forced into Siberian salt mines for the remainder of their lives.

> At least the admin can easily fix this by adding hw.ata.wc=0 to /boot/loader.conf. The bigger problem is that FreeBSD does not support queuing on all controllers that support it. Not something that admins can fix, and inexcusable for an OS that claims to care about performance.

You keep saying this, but I'm unclear on what you mean. Can you explain?

Scott
Re: On cooperative work [Was: Re: newbus' ivar's limitation..]
On Aug 2, 2012, at 12:23 AM, Kevin Oberman kob6...@gmail.com wrote:

> Doug makes some good points.

No, he doesn't. He and Arnould are being argumentative and accusatory where none of that is warranted. I used to run the devsummits, and we did tele-conference lines for remote people to participate. After I stepped down, others took it up and did the same thing. Usually, the lines were unused. I suspect that organizers simply stopped thinking about them after a while because of poor interest. There is no conspiracy of exclusion here, just simple human apathy.

The invite system for the devsummit was, and still is, purely about providing some order to the process. It ensures that people attending are willing to demonstrate a minimum amount of interest, more than just wandering by a room one day and dropping in for free food and wifi. If anyone feels that they are being excluded, it's because they are too lazy to go beyond being argumentative on a mailing list.

Scott
Re: geom - cam disk
Once the bio is put into the bioq from da_strategy, the CAM scheduler is called. It may or may not wind up calling dastart right away; if the simq or devq is frozen, or if the devq has been exhausted, then the I/O will be deferred until later and the call stack will unwind back into g_down. The bioq can therefore accumulate many bios before being drained. Draining will usually happen from the camisr, at which point you can potentially have I/O being initiated from both the camisr and the g_down threads in parallel. The monolithic locking in CAM right now prevents this from actually happening, though that's a topic that needs to be revisited.

Scott

On Jul 25, 2012, at 1:27 PM, Andriy Gapon wrote:

> Preamble. I am trying to understand in detail how things work at the GEOM - CAM disk boundary. I am looking at scsi_da and ata_da which seem to be twins in this respect.
>
> I got an impression that the bioq_disksort calls in the strategy methods and the related queues are completely useless in the GEOM single-threaded world. There is only one thread, g_down, that can call a strategy method, the method enqueues a bio, then calls a schedule function and through xpt_schedule the call flow continues to a start method which dequeues the bio and off it goes. I currently can't see how a bio queue can accumulate more than one bio.
>
> What am I missing? :-) I will be very glad to learn more about this layer if anyone is willing to educate me. Thank you in advance.
>
> P.S. I wrote a very simple DTrace script to test my theory experimentally, and my testing with various workloads didn't disprove the theory so far (which doesn't mean that it is correct, of course). The script:
>
>     fbt::bioq_disksort:entry
>     /args[0]->queue.tqh_first == 0/
>     {
>         @["empty"] = count();
>     }
>
>     fbt::bioq_disksort:entry
>     /args[0]->queue.tqh_first != 0/
>     {
>         @["non-empty"] = count();
>     }
>
> It works on all bioq_disksort calls, but I am stressing only ada disks at the moment.
> --
> Andriy Gapon
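The accumulation described in the reply can be illustrated with a toy model. This is a minimal sketch, not CAM's code: the class and method names (ToyDisk, strategy, schedule, complete) are hypothetical, `openings` is an assumed stand-in for free device queue slots, and bioq_disksort ordering is not modelled at all.

```python
from collections import deque


class ToyDisk:
    """Toy model of a strategy/start pair; all names are hypothetical."""

    def __init__(self, openings):
        self.bioq = deque()        # bios waiting to be started
        self.openings = openings   # assumed stand-in for free devq slots
        self.frozen = False        # assumed stand-in for a frozen simq/devq
        self.in_flight = []        # bios handed to the "hardware"

    def strategy(self, bio):
        # Called from the single "g_down" context: enqueue, then schedule.
        self.bioq.append(bio)
        self.schedule()

    def schedule(self):
        # Start as many bios as the device will currently accept.
        while self.bioq and self.openings > 0 and not self.frozen:
            self.openings -= 1
            self.in_flight.append(self.bioq.popleft())

    def complete(self):
        # Called from the "interrupt" context: free a slot and drain the queue.
        self.in_flight.pop(0)
        self.openings += 1
        self.schedule()


if __name__ == "__main__":
    disk = ToyDisk(openings=1)
    for bio in range(4):
        disk.strategy(bio)
    print(len(disk.bioq))   # bios left waiting because no slot was free
    disk.complete()
    print(len(disk.bioq))
```

Even with a single caller of `strategy`, the queue grows whenever the device has no free slot or is frozen, and it is drained from the completion path rather than from the caller.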
Re: Improving the kernel/i386 timecounter performance (GSoC proposal)
David Xu wrote:
> Julian Elischer wrote:
>> David Xu wrote:
>>> David Xu wrote:
>>>> Julian Elischer wrote:
>>>>> depends on the hardware. anyhow I was only saying it was possible, not necessarily good or even useful.
>>>>
>>>> I did some work on a thread-private page shared by kernel and userland when I was doing userland spinlocks: if userland asks for a page, the kernel will allocate it and put some interesting things in it from the scheduler etc. This code may be useful.
>>>
>>> FYI: http://people.freebsd.org/~davidxu/schedctl/
>>
>> reading this quickly, you allocate a separately addressed page for each thread, but, how do you use it?
>
> I store the address in the userland TLS area, then get it when I want to check some scheduling information.

Interesting, I was wondering earlier today if pointing to the per-thread syspage from the TLS area would save the TLB invalidate that you were concerned about.

Scott
Re: amr driver broken since March 12
Danny Braniss wrote:
>> Danny Braniss wrote:
>>> it seems March 12 was a bit off :-)
>>> it took some time, but I managed to close the gap:
>>>     189100 ok
>>>     189150 fails
>>> I will continue tomorrow, but this should be helpful.
>>
>> 189150 is in the middle of a big string of related commits. Try updating to the following change numbers and retesting:
>>     189088
>>     189107
>>     189161
>> If the last one does not work, try editing /sys/dev/amr/amr.c to change
>>     #define AMR_ENABLE_CAM 1
>> to
>>     #define AMR_ENABLE_CAM 0
>>
>> Scott
>
> 189161 works, also for the iir
> now what?

Next set to try:
    189219
    189229
    189253
    189402
    189531
    189569
    189591

Scott
Re: amr driver broken since March 12
Danny Braniss wrote:
>>>> Danny Braniss wrote:
>>>>> Danny Braniss wrote:
>>>>> it seems March 12 was a bit off :-)
>>>>> it took some time, but I managed to close the gap:
>>>>>     189100 ok
>>>>>     189150 fails
>>>>> I will continue tomorrow, but this should be helpful.
>>>>
>>>> 189150 is in the middle of a big string of related commits. Try updating to the following change numbers and retesting:
>>>>     189088
>>>>     189107
>>>>     189161
>>>> If the last one does not work, try editing /sys/dev/amr/amr.c to change
>>>>     #define AMR_ENABLE_CAM 1
>>>> to
>>>>     #define AMR_ENABLE_CAM 0
>>>>
>>>> Scott
>>>
>>> 189161 works, also for the iir
>>> now what?
>>
>> Next set to try:
>
> 189219 broken
> 189229 broken

Ok, so 189161 works, 189219 doesn't, correct? If so, did you also make the change to amr.c yet?

Scott
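The narrowing procedure used in this thread is essentially a bisection over revision numbers. Below is a minimal sketch of that idea, assuming the breakage is monotonic (once a revision is bad, every later one is bad too). The function name `first_bad` and the `is_good` callback are hypothetical; in practice `is_good` would mean "update to that revision, rebuild the kernel, boot and test", which is stubbed out here.

```python
def first_bad(revisions, is_good):
    """Return the earliest revision for which is_good() is False.

    Assumes `revisions` is sorted ascending, the first entry is known
    good, the last entry is known bad, and badness is monotonic.
    """
    lo, hi = 0, len(revisions) - 1   # lo: known good, hi: known bad
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if is_good(revisions[mid]):
            lo = mid
        else:
            hi = mid
    return revisions[hi]


if __name__ == "__main__":
    # Candidate revisions named in the thread; the lambda is a stand-in
    # for a real build-and-boot test and simply encodes the reported
    # results (189161 works, 189219 and later are broken).
    candidates = [189161, 189219, 189229, 189253,
                  189402, 189531, 189569, 189591]
    print(first_bad(candidates, lambda rev: rev <= 189161))
```

The thread's own results (189150 fails, yet the later 189161 works) show that the monotonic assumption can break inside a run of related commits, so hand-picked revisions at commit boundaries may be a better choice than blind midpoints in such a range.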
Re: amr driver broken since March 12
Danny Braniss wrote:
> it seems March 12 was a bit off :-)
> it took some time, but I managed to close the gap:
>     189100 ok
>     189150 fails
> I will continue tomorrow, but this should be helpful.

189150 is in the middle of a big string of related commits. Try updating to the following change numbers and retesting:
    189088
    189107
    189161
If the last one does not work, try editing /sys/dev/amr/amr.c to change
    #define AMR_ENABLE_CAM 1
to
    #define AMR_ENABLE_CAM 0

Scott
Re: amr driver broken since March 12
Danny Braniss wrote:
>> Danny Braniss wrote:
>>> at least for me :-) [and sorry for the cross posting]
>>> old (March 12 , i know need the svn rev number but...)
>>
>> None of the commit activity on March 12 is jumping out at me as being suspicious. However, you are now the second person who has told me about AMR problems in 7.1 recently. If you have a precise svn change number, it would help greatly.
>>
>> Scott
>
> my bad. the last working amr/iir is from March 12. I first detected the problem sometime later, but not later than March 23. So it has to be changes in that time frame.
> both drivers are showing similar symptoms: waiting for not busy
> the iir goes on for ever, and it's the cam that eventually panics,
>     run_interrupt_driven_hooks: still waiting after 300 seconds for xpt_config
> (actually not 100% true, depending if WITNESS is on or off, it sometimes just hangs).
> the amr seems to time out:
>     amr0: adapter is busy
> thanks for looking into the problem,
> danny

Ok, here are a series of revisions to step through, in forward order. Make sure that you are starting with at least revision 189568. Then, update to exactly the revision numbers below, recompile the kernel, and test:
    190087
    190091
Re: Improving the kernel/i386 timecounter performance (GSoC proposal)
I've been talking about this for years. All I need is help with the VM magic to create the page on fork. I also want two pages, one global for gettimeofday (and any other global data we can think of) and one per-process for static data like getpid/getgid.

Scott

Sergey Babkin wrote:
> (Sorry for the top quoting).
>
> Probably the best implementation of gettimeofday() is to have a page in the kernel mapped read-only to all the user processes. Put the kernel's idea of time into this page. Then getting the time becomes a simple read (OK, two reads, to make sure that no update has happened in between).
>
> The TSC can then be used to add the precision between the ticks of the kernel timer: i.e. remember the value of TSC when the last tick happened, and the highest rate at which TSC may be ticking at this CPU, and export it in the same page. This would guarantee that the time is not moving back.
>
> However there are more issues with TSC. TSC is guaranteed to have the same value on all the processors that share the same system bus. But if the machine is built of multiple buses with bridges between them, all bets are off. Each bus may be stopped, restarted and clocked separately. There is no way to tell on which CPU the process is currently running, and it may be rescheduled to a different CPU right before or after the RDTSC instruction.
>
> -SB
>
> Mar 26, 2009 06:55:04 PM, ...@phk.freebsd.dk wrote:
>> In message 17560ccf0903260551v1f5cba9eu8 7727c0bae7b...@mail.gmail.com, Prashant Vaibhav writes:
>>
>>> The gettimeofday() function's implementation will then be changed to read the timestamp counter (TSC) from the processor, and use the reading in conjunction with the timing info exported by the kernel to calculate and return the time info in proper format.
>>
>> I take it as read that you know that there are other relevant functions than gettimeofday() and that these must provide a monotonic timescale when queried interleaved?
>>
>> Be aware that the TSC may not be, and may not stay, synchronized across multiple cores. Furthermore, the TSC is not constant frequency and in particular not known frequency at all times.
>>
>> There are a lot of nasty cases to check, and a nasty interpolation required, which, in my tests some years back, totally negated any speedup from using the TSC in the first place.
>>
>> At the very minimum, you will have to add a quirk table where known good {CPU+MOBO+BIOS} combinations can be entered, as we find them.
>>
>>> This will also pave way for optionally making the FreeBSD kernel tickless,
>>
>> Rubbish. Timecounters are not even closely associated with the tick or ticklessness of the kernel. [1]
>>
>>> - The TSC frequency might change on certain processors with non-constant TSC rate (because of SpeedStep, dynamic freq scaling etc.). The only way to combat this is that the kernel be notified every time the processor frequency changes. Every cpu frequency driver will need to be updated to notify the kernel before and after a cpu freq change.
>>
>> That is not good enough, the bios may autonomously change the cpu speed and the skew from not knowing exactly _when_ and _how_ the cpu clock changed is a significant number of microseconds, plenty of time to make strange things happen.
>>
>> You will want to study carefully Dave Mills' work to tame the Alpha chips' wandering SAW clocks.
>>
>> Poul-Henning
>>
>> [1] In my mind, reworking the callout system in the kernel would be a much better, more needed and much more worthwhile project.
>>
>> --
>> Poul-Henning Kamp       | UNIX since Zilog Zeus 3.20
>> ...@freebsd.org         | TCP/IP since RFC 956
>> FreeBSD committer       | BSD since 4.3-tahoe
>> Never attribute to malice what can adequately be explained by incompetence.
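The "two reads" idea in the quoted message can be sketched as follows. This is a minimal illustration only, written in Python rather than as kernel C: the names TimePage, kernel_update and read_time are hypothetical, and the generation counter is an assumption added here (a seqlock-style variant); the message itself only says that two reads are needed to detect an update in between. TSC interpolation is deliberately left out.

```python
class TimePage:
    """Stand-in for a page the kernel maps read-only into every process.
    All field and method names are hypothetical."""

    def __init__(self):
        self.gen = 0    # assumed generation counter: odd while updating
        self.sec = 0
        self.nsec = 0

    def kernel_update(self, sec, nsec):
        # "Kernel side": bump the generation before and after the update,
        # so a reader can tell if it overlapped with a write.
        self.gen += 1
        self.sec = sec
        self.nsec = nsec
        self.gen += 1


def read_time(page):
    """'User side': read the generation, the payload, then the generation
    again; retry if an update was in progress or happened in between."""
    while True:
        before = page.gen
        sec, nsec = page.sec, page.nsec
        after = page.gen
        if before == after and before % 2 == 0:
            return sec, nsec


if __name__ == "__main__":
    page = TimePage()
    page.kernel_update(1238090104, 500)
    print(read_time(page))
```

In a real implementation the reads would also need memory barriers and the page would be written only by the kernel; neither is modelled in this single-threaded sketch.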
Re: Improving the kernel/i386 timecounter performance (GSoC proposal)
Robert Watson wrote:
> On Fri, 27 Mar 2009, Scott Long wrote:
>
>> I've been talking about this for years. All I need is help with the VM magic to create the page on fork. I also want two pages, one global for gettimeofday (and any other global data we can think of) and one per-process for static data like getpid/getgid.
>
> FWIW, there are some variations in schemes across OS's -- one extreme is the Linux approach, which actually exports a mini shared library in ELF format on the shared page, providing implementations of various services (such as entering system calls), time stuff, etc. Less extreme are the shared pages offered on Mac OS X, etc.

Yes, but I'd like to start somewhere, and considering that it's been impossible in _5_ years to get the 30 minutes of Peter or JeffR or JHB time to get the basic VM magic done, I'm keeping my expectations as modest as possible.

Scott
Re: CFT: Graphics support for /boot/loader
Julian Elischer wrote:
> Max Laier wrote:
>> On Thursday 05 February 2009 23:18:36 Oliver Fromme wrote:
>>> I have posted detailed instructions on the FreeBSD wiki:
>>> http://wiki.freebsd.org/OliverFromme/BootLoaderTest
>>> Any kind of feedback is welcome.
>>
>> quick test in qemu - works well. Very cool!
>
> can you send a screenshot for those of us who can't test it now?

http://wiki.freebsd.org/OliverFromme/BootLoader
Re: CFT: Graphics support for /boot/loader
Oliver Fromme wrote:
> Hello fellow hackers,
>
> Some of you might remember that I'm working on graphics support for our /boot/loader. Unfortunately, progress has been rather slow because of non-FreeBSD-related activity.
>
> Anyway, I have now prepared a tarball containing a loader binary for public testing. If you are eager to give it a try, please feel free to do so. It should work with any FreeBSD version on i386 and amd64 platforms.
>
> I have posted detailed instructions on the FreeBSD wiki:
> http://wiki.freebsd.org/OliverFromme/BootLoaderTest
>
> Any kind of feedback is welcome.

I think that this is really neat, and you've done an impressive job with it. However, I do take issue with your criticism of the ASCII logo; I actually spent a decent amount of time designing the block text logo =-) I wish that there hadn't been moronic politics over the beastie logo, as that does look a lot better, even if it is text. And text is still required for serial consoles.

Scott
Re: strange behaviour with /sbin/init and serial console
Ed Schouten wrote:
> Hello Thierry,
>
> * Thierry Herbelot [EMAIL PROTECTED] wrote:
>> with the following patch on /sbin/init, I have two different behaviours depending on the console type (on an i386/32 PC):
>> - on a video console, I see the expected two messages,
>> - on a serial console, the messages are not displayed (init silently finishes its job and gets to start /etc/rc and everything)
>>
>> I assume that the writev system call is implemented in src/sys/kern/tty_cons.c::cnwrite(), but I could not parse the code to find an explanation.
>>
>> any taker ?
>>
>> TfH
>>
>> PS: this is initially for a RELENG_6 machine, but the code is quite similar under RELENG_7 or Current
>
> Any data written to /dev/console is not multiplexed to all console devices, but only sent to the first active device in the list. The reason behind this is that multiplexing adds a lot of complexity to the console code, especially related to polling and reading on /dev/console.
>
> This weekend I'm going to commit a replacement implementation of /dev/console, which also has this restriction.

The multiplexed console feature is one thing that Linux got right. In a corporate setting, you really need both a serial console and a video console in order to effectively manage the machines, as you want to be able to access them both remotely and locally. While it might be hard to build multiplexing into the console driver, do you think it would be possible to layer a multiplexer on top of it, similar to how the kbdmux driver works?

Scott
Re: fs/udf: vm pages overlap while reading large dir [patch]
Pav Lucistnik wrote:
> Andriy Gapon wrote on Thu 28. 02. 2008 at 10:33 +0200:
>
> And while I have your attention, I have a related question.
>
> I have produced a bunch of ISO9660 Level 3 / UDF hybrid media with mkisofs, and when I mount the UDF part of them, the mount point (root directory of media) has 0x000 permissions. Yes that's right, d--------- in ls -l. That makes the whole volume inaccessible for everyone except root.
>
> Is this something you can mend in our UDF driver, or should I go dig inside mkisofs guts? Windows handles these media without any visible problems.

I wonder if Windows even observes the permission bits. You'd have to special-case the UDF driver code in FreeBSD, which is certainly possible but not terribly attractive. I'd be interested to see what exactly mkisofs is doing. Maybe it's putting permissions into extended attributes and assuming the filesystem driver will read those instead.

Scott
Re: fs/udf: vm pages overlap while reading large dir [patch]
Andriy Gapon wrote:
> on 26/02/2008 21:23 Pav Lucistnik said the following:
>> Pav Lucistnik wrote on Tue 05. 02. 2008 at 19:16 +0100:
>>> Andriy Gapon wrote on Tue 05. 02. 2008 at 16:40 +0200:
>>>>> Yay, and can you fix the sequential read performance while you're at it? Kthx!
>>>>
>>>> this was almost trivial :-) See the attached patch, first hunk is just for consistency. The code was borrowed from cd9660, only field/variable names are adjusted.
>>>
>>> Just tested it with my shiny new Bluray drive, and it works wonders. Finally seamless playback of media files off UDF carrying media.
>>
>> So, how does it look WRT committing it?
>
> Pav, thank you for the feedback/reminder. In my personal opinion the latest patch posted and described at the following link is a very good candidate for commit:
> http://docs.FreeBSD.org/cgi/mid.cgi?47AA43B9.1040608
> It might have a couple of style related issues, but it should improve things a bit even if some larger VM/VFS/GEOM issues remain.
>
> And by the way, a patch from the following PR would be a good side-dish for the above patch:
> http://www.freebsd.org/cgi/query-pr.cgi?pr=kern/78987
> I think it is simple and obvious enough.

I will commit both of these to CVS today. Thanks again.

Scott
Re: fs/udf: vm pages overlap while reading large dir [patch]
Andriy Gapon wrote:
> on 04/02/2008 22:07 Pav Lucistnik said the following:
>> Julian Elischer wrote on Mon 04. 02. 2008 at 10:36 -0800:
>>> Andriy Gapon wrote:
>>>> More on the problem with reading big directories on UDF.
>>>
>>> You do realise that you have now made yourself the official maintainer of the UDF file system by submitting a competent and insightful analysis of the problem?
>>
>> Yay, and can you fix the sequential read performance while you're at it? Kthx!
>
> Pav, this was almost trivial :-) See the attached patch, first hunk is just for consistency. The code was borrowed from cd9660, only field/variable names are adjusted.

Your patch looks reasonable. Btw, for the same reason that read-ahead makes file reading much faster, I would not change directory reading to be 1 sector at a time (unless you also do read-ahead for it).

> But there is another issue that I also mentioned in the email about directory reading. It is the UDF_INVALID_BMAP case of udf_bmap_internal, i.e. the case when file data is embedded into a file entry. This is a special case that needs to be handled differently. udf_readatoffset() handles it, but the latest udf_read code doesn't. I have a real UDF filesystem where this type of allocation is used for small files and those files can not be read.

Oh, so directory data can also follow this convention? Blah. Feel free to fix that too if you want =-)

Scott
Re: fs/udf: vm pages overlap while reading large dir
Andriy Gapon wrote:

More on the problem with reading big directories on UDF. First, some sleuthing. I came to believe that the problem is caused by some larger change in the vfs/vm/buf area. It seems that now VMIO is applied to more vnode types than before. In particular it seems that now vnodes for devices have non-NULL v_object (or maybe this is about directory vnodes, I am not sure). Also it seems that the problem should happen for any directory with size larger than four 2048-byte sectors (I think that any directory with 300 files would definitely qualify).

After some code reading and run-time debugging, here are some facts about udf directory reading:
1. bread-ing is done via the device vnode (as opposed to directory vnodes); as a consequence udf_strategy is not involved in directory reading
2. the device vnode has a non-NULL v_object, this means that VMIO is used
3. it seems that the code assumed that non-VM buffers are used
4. bread is done on as much data as possible, up to MAXBSIZE [and even slightly more in some cases] (i.e. the code determines the directory data size and attempts to read in everything in one go)
5. the physical sector number adjusted to DEV_BSIZE (512 bytes) sectors is passed to bread - this is because of #1 above and peculiarities of the buf system
6. directory reading and file reading are quite different, because the latter does reading via the file vnode

Let's consider a simple case of a directory read of 3 * PAGE_SIZE (4K) bytes starting from a physical sector N with N%2 = 0 from media with a sector size of 2K (the most common CD/DVD case):
1. we want to read 12 KB = 3 pages = 6 sectors starting from sector N, N%2 = 0
2. 4*N is passed as a sector number to bread
3. bo_bsize of the corresponding bufobj is the media sector size, i.e. 2K
4. the actual read in bread will happen from b_iooffset of 4*N*DEV_BSIZE = N*2K, i.e. the correct offset within the device
5. for a fresh read getblk will be called
6. getblk will set b_offset to blkno*bo_bsize = 4*N*2K, i.e. 4 times the correct byte offset within the device
7. allocbuf will allocate pages using the B_VMIO case
8. allocbuf calculates the base page index as follows: pi = b_offset/PAGE_SIZE. This means the following:
   - sectors N and N+1 will be bound to a page with index 4*N*2K/4K = 2*N
   - sectors N+2 and N+3 will be bound to the next page, 2*N + 1
   - sectors N+4 and N+5 will be bound to the next page, 2*N + 2

Now let's consider a directory read of 2 sectors (1 page) starting at physical sector N+1. Using the same calculations as above, we see that the sectors will be tied to page 2*(N+1) = 2*N + 2. And this page already contains valid cached data from the previous read, *but* it contains data from sectors N+4 and N+5 instead of N+1 and N+2.

So we see that, because of b_offset being 4 times the actual byte offset, we get incorrect page indexing. Namely, sector X gets associated with different pages depending on what sector is used as a starting sector for bread. If bread starts at sector N, then data of sector N+1 is associated with (the second half of) the page with index 2*N; but if bread starts at sector N+1 then it gets associated with (the first half of) the page with index 2*(N+1).

Previously, before the VMIO change, data for all reads was put into separate buffers that did not share anything between them. So the problem was limited only to wasting memory with duplicate data, but no actual over-runs did happen. Now we have the over-runs because the VM pages are shared between the buffers of the same vnode.

One obvious solution is to limit bread size to 2*PAGE_SIZE = 4 * sector_size. In this case, as before, we would waste some memory on duplicate data but we would avoid page overruns. If we limit bread size even more, to 1 sector, then we would not have any duplicate data at all. But there would still be some resource waste - each page would correspond to one sector, so a 4K page would have only 2K of valid data and the other half in each page is unused.
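The index arithmetic above can be checked with a few lines of Python. This is a sketch of the calculation as described in the message, not of the kernel code itself: the function names are hypothetical, the constants are the ones stated in the text (512-byte DEV_BSIZE, 2K sectors, 4K pages), and N = 10 is an arbitrary even sector number chosen for the example.

```python
DEV_BSIZE = 512     # buf-layer block unit
SECTOR = 2048       # media sector size (bo_bsize of the device bufobj)
PAGE_SIZE = 4096


def buggy_pages(start_sector, nsectors):
    """Map each media sector of one bread() call to the VM page index
    it ends up bound to, following steps 2, 6 and 8 of the message."""
    blkno = start_sector * (SECTOR // DEV_BSIZE)   # 4*N, passed to bread
    b_offset = blkno * SECTOR                      # getblk: blkno * bo_bsize
    base = b_offset // PAGE_SIZE                   # allocbuf: base page index
    per_page = PAGE_SIZE // SECTOR                 # 2 sectors per page
    return {start_sector + i: base + i // per_page for i in range(nsectors)}


def correct_pages(start_sector, nsectors):
    """Page index derived from the true byte offset of each sector; it
    depends only on the sector, not on where the read started."""
    return {s: (s * SECTOR) // PAGE_SIZE
            for s in range(start_sector, start_sector + nsectors)}


if __name__ == "__main__":
    n = 10                                # arbitrary even sector number
    first = buggy_pages(n, 6)             # 3-page read starting at N
    second = buggy_pages(n + 1, 2)        # 1-page read starting at N+1
    print(first)
    print(second)
    print(correct_pages(n, 6))
```

With N = 10 the first read binds sectors 14 and 15 to page 22, and the second read binds sectors 11 and 12 to that same page 22, which is the overlap described in the message; with the true byte offset, each sector maps to one fixed page no matter where the read starts.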
Another solution, which to me seems to be better, is to do the usual reading - pass a directory vnode and a 2048-byte sector offset to bread. In this case udf_strategy will get called for blkno translation, all offsets and indexes will be correct, and everything will work perfectly, as is the case for all other filesystems. The only overhead in this case comes from the need to handle the UDF_INVALID_BMAP case (where data is embedded into the file entry). So it means that we have to call bmap_internal to see if we have that special case and handle it, and if the case is not special we would call bread on a directory vnode, which means that bmap_internal would be called again in udf_strategy. One optimization that can be done in this case is to create a lighter version of bmap_internal that would merely check for the special case and wouldn't do anything else.

I am attaching a patch based on the ideas above. It fixes the problem for me and doesn't seem to create any new ones :-)

About the patch:
hunk #1 fixes a copy/paste
hunk #2 should fix strategy to not set
Re: amrd disk performance drop after running under high load
Boris Samorodov wrote:

Hi! Since nobody answered so far, here is my two cents. I'm not an expert here so it's only my imho.

On Wed, 17 Oct 2007 22:52:49 +0400 Alexey Popov wrote:

interrupt                   total       rate
irq6: fdc0                      8          0
irq14: ata0                    47          0
irq16: uhci0           1428187319       1851   ^^ [1]
irq18: uhci2             12374352         16
irq23: ehci0                    3          0
irq46: amr0              11983237         15
irq64: em0             1427141755       1850   ^^ [2]
cpu0: timer            1540896452       1997
cpu1: timer            1542377798       1999
Total                  5962960971       7730

[1] and [2] look suspicious to me (totals and rates are too close to each other and, btw, to the timers). Let the latter (timers) alone. Do you use any USB device? Can you try to use another network card? That behaviour seems to be an interrupt storm and/or irq collision.

It's neither. It's a side effect of a feature that FreeBSD abuses for handling interrupts. Note that amr0 and uhci2 are acting similarly. It's mostly harmless, but it does waste CPU cycles. I wouldn't expect this on a recent version of FreeBSD, though, at least not from the e1000 driver.

Scott
Re: iSCSI disconnects dilema
Wilko Bulte wrote:

On Fri, Jan 12, 2007 at 09:31:04PM +0200, Danny Braniss wrote..

On Tue, Jan 09, 2007 at 09:06:46AM +0200, Danny Braniss wrote:

Hi, While I think I have almost solved the problem of network disconnects, it dawned on me that there is a major problem: When a 'local' disk crashes, the kernel will probably hang/panic/crash. if i don't try to recover, then there is no change in the above scenario. if i try to recover, then the client does not know that it should umount/fsck/mount. While all this seems familiar (removing a floppy/disk-on-key while it's mounted, where we could always say "you shouldn't have done that!"), with a network connection it can happen very often - rebooting the target, a network hiccup, etc. So, any ideas?

In my opinion it should be done this way: You have a queue of I/O requests. You send them to the other end and wait for confirmation. Until confirmation is received, you keep the requests queued. If the other end dies, you try to reconnect (until some timeout expires, the processes which sent those requests will just wait). If you reconnect successfully, you resend the not-confirmed requests; if you aren't able to reconnect, you just pass the errors up. This is what I did in ggate and it seems to work.

That is basically what i'm doing - unacked requests get requeued. the problem I fear (and maybe I'm paranoid :-):

Paranoia is a Good Thing(TM) in data storage land :-)

assume the following scenario: the client (initiator) sends a write command, the target acks it, then it crashes. if the write was never completed, the initiator goes on as if nothing ever happened.

Yes, but what can the initiator do about that? I mean, it does not have any visibility of what the target has (or has not) done with the data. This is roughly the same as a RAID box accepting a write into a writeback cache and ACK-ing to the host. You can only assume that the RAID box' cache will get flushed to the spindles properly. All the usual horror scenarios with a broken battery backup of the cache and a power failure etc apply here.

Wilko

I forget, does iSCSI have a concept of a flush_cache command, or the equivalent of what parallel SCSI does with ordered tags? If so, then that's how your app or OS knows that the transaction got committed to stable storage. It's been long assumed in the external storage world that you are at the mercy of the external storage cache, so the problem that Danny is referring to is nothing new. The real question is how to implement the equivalent mechanism that iSCSI provides in a way that the OS/app can make use of it. For example, CAM issues an ordered tag periodically to flush the disk cache to stable storage. Most storage drivers, including CAM, will issue some sort of a flush_cache command to the controller and media during system shutdown.

Scott
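The keep-until-confirmed scheme quoted above (queue, resend after reconnect, pass errors up on timeout) can be sketched as a small user-space model. This is not the ggate or iSCSI initiator code; the class and method names are invented for illustration, and the transport is reduced to a callable that may raise ConnectionError.

```python
from collections import OrderedDict


class PendingQueue:
    """Hold each request until the peer confirms it (hypothetical model)."""

    def __init__(self, send):
        self._send = send               # callable(tag, payload); may raise ConnectionError
        self._pending = OrderedDict()   # tag -> payload, in submission order
        self._next_tag = 0

    def submit(self, payload):
        tag = self._next_tag
        self._next_tag += 1
        self._pending[tag] = payload    # queue first, so a failed send is not lost
        try:
            self._send(tag, payload)
        except ConnectionError:
            pass                        # stays queued until reconnected() is called
        return tag

    def ack(self, tag):
        """The peer confirmed this request; it can be forgotten."""
        self._pending.pop(tag, None)

    def reconnected(self):
        """Resend everything that was never confirmed, oldest first."""
        for tag, payload in list(self._pending.items()):
            self._send(tag, payload)

    def give_up(self):
        """Reconnect timed out: return the unconfirmed requests as failures."""
        failed = list(self._pending.items())
        self._pending.clear()
        return failed


if __name__ == "__main__":
    log = []
    q = PendingQueue(lambda tag, payload: log.append((tag, payload)))
    first = q.submit(b"write-0")
    q.submit(b"write-1")
    q.ack(first)          # only the first write was confirmed
    q.reconnected()       # the link came back: write-1 is sent again
    print(log)
```

Note what the model cannot cover, which is the point made in the thread: once a request has been acknowledged it leaves the queue, so a target that acks a write and then loses it is invisible to the initiator.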
Re: iSCSI disconnects dilema
Danny Braniss wrote:

I forget, does iSCSI have a concept of a flush_cache command, or the equivalent of what parallel SCSI does with ordered tags?

not really - or I can't find it. iSCSI is mainly an envelope for scsi commands, so whatever the CAM does, it will pass it on. There are some management commands, so the target can tell the initiator that it's going down for example (and what should the driver do in such a case in freebsd?)

If the periph is open (i.e. mounted), I'd just ignore this and have the stack go through a normal retry timeout cycle to see if the device comes back. If it's closed, then I'd remove the periph. Knowing if it's opened or closed is likely hard to do from the iSCSI driver, which is one reason why iSCSI knowledge needs to eventually be moved upwards in CAM.

If so, then that's how your app or OS knows that the transaction got committed to stable storage. It's been long assumed in the external storage world that you are at the mercy of the external storage cache, so the problem that Danny is referring to is nothing new. The real question is how to implement the equivalent mechanism that iSCSI provides in a way that the OS/app can make use of it. For example, CAM issues an ordered tag periodically to flush the disk cache to stable storage.

nice, (or wishful thinking :-), the scsi part of iSCSI is/can be software/virtual.

If the target device returns a successful completion from a command, the initiator must assume that it's not lying. You could do a flush/sync cache command after every I/O, but then you'd have a completely unacceptable level of performance. But again, this is not a new problem specific to iSCSI. It's long been a design consideration of external storage, and is why external storage 1) carries a high price tag to accompany good engineering and testing, and 2) comes with some form of battery backup, to prevent data loss in case of power loss.
Most storage drivers, including CAM, will issue some sort of a flush_cache command to the controller and media during system shutdown. this took me a long time to fix! the userland program got killed at shutdown, the link was lost, and so there was no way to flush buffers, fixed by calling fget(...) too. I guess I can summarize: (and use the 3 monkey law :-) 1- assume the target is 'well behaved' and will flush cache. 2- there is - currently - no way to tell the OS that not all seems to be as expected. 3- keep quiet and hope for the best. danny So you had a scenario where a program was doing I/O right up to system (initiator) shutdown, and some of those I/O's got lost in the process? I guess I don't understand why the OS didn't flush all outstanding I/O buffers after terminating the program and before finishing the shutdown. Maybe you are doing something illegal in your driver, or maybe you need to implement a kernel shutdown hook that will allow you to block the shutdown until everything is flushed. Scott ___ freebsd-hackers@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-hackers To unsubscribe, send any mail to [EMAIL PROTECTED]
Re: iSCSI/shutdown advice needed
Danny Braniss wrote:

hi, I'm trying to finish up the iSCSI initiator, and need some advice. To shut down the initiator, I need to:
1- close down the CAM peripherals (i.e. da)
2- empty all pending iSCSI transactions
3- close the tcp connection

2 & 3 I can handle, it's 1 where I'm stuck.

Q: how can I call the peripheral close function, or is there some CAM command? I tried xpt_async(AC_LOST_DEVICE, isp->cam_path, NULL); but this is far too drastic, and will actually cause a panic if the device is still mounted.

danny

UFS doesn't handle devices going away unexpectedly. Fixing it involves massive changes to both UFS and the VM system. However, sending the AC_LOST_DEVICE is probably the right thing to do.

Scott
Re: SATA300 Controllers
Wilko Bulte wrote: On Wed, Jul 05, 2006 at 08:02:55PM -0500, Derrick T. Woolworth wrote.. Hello all, Sorry for cross-posting, but these issues seem relevant for lists... Has anyone had success with SATA300 controllers with FreeBSD 6.1? I've been trying Promise and nVidia nForce4 and I'm not having any luck. Using a MSI K8NGM2-L motherboard and others, but 6.1's installation hangs as soon as it sees ad4. I've also tried using an Adaptec 1210SA controller and had zero Well, just as a datapoint this works fine for me: [EMAIL PROTECTED] ~: dmesg|grep -i Prom atapci0: Promise PDC20771 SATA300 controller port 0xd480-0xd4ff,0xd000-0xd0ff mem 0xf7ff6000-0xf7ff6fff,0xf7fa-0xf7fb irq 21 at device 13.0 on pci2 ar0: 238475MB Promise Fasttrak RAID1 status: READY [EMAIL PROTECTED] ~: uname -a FreeBSD freebie.xs4all.nl 6.1-STABLE FreeBSD 6.1-STABLE #2: Wed Jun 14 22:01:33 CEST 2006 [EMAIL PROTECTED]:/usr/obj/usr/src/sys/FREEBIE i386 Promise has a good relationship with FreeBSD, I would expect their controllers to work pretty well. Scott ___ freebsd-hackers@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-hackers To unsubscribe, send any mail to [EMAIL PROTECTED]
Re: iSCSI/sendto(...)
Danny Braniss wrote: Hi, on a fairly new 6.1-stable, and probably before, once in a blue moon, sendto return error 64 (EHOSTDOWN?). but the packet seems to have been received by the target, since i get a response, and further more, everything keeps on working. what is error 64? danny EHOSTDOWN comes from the ARP layer of the IP stack, and would be consistent with the host either getting no arp response or rejected responses from the target. It would be useful to run tcpdump+ethereal on your connection to see what is really going on. Scott ___ freebsd-hackers@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-hackers To unsubscribe, send any mail to [EMAIL PROTECTED]
Re: iSCSI/sendto(...)
Danny Braniss wrote: Danny Braniss wrote: Hi, on a fairly new 6.1-stable, and probably before, once in a blue moon, sendto return error 64 (EHOSTDOWN?). but the packet seems to have been received by the target, since i get a response, and further more, everything keeps on working. what is error 64? danny EHOSTDOWN comes from the ARP layer of the IP stack, and would be consistent with the host either getting no arp response or rejected responses from the target. It would be useful to run tcpdump+ethereal on your connection to see what is really going on. too much traffic, and would be like looking for a needle in a haystack. (i can't reproduce this at will) the question is, if it was an error, how come the packet did go out. need more proof for the above statement - working on it. danny I find that ethereal does a great job of associating packets and making it easy to sort through mountains of data. It's not so good at actually collecting the packets, so I run tcpdump in raw collection mode and then feed the output to ethereal for analysis. Having tcpdump generate a circular ring of files that are at most 20MB works best. Scott ___ freebsd-hackers@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-hackers To unsubscribe, send any mail to [EMAIL PROTECTED]
Re: misc questions about the devicedriver arch
M. Warner Losh wrote:

: THIRD
: Because the PCIE configure space is 4k long, shall we change the
: #define PCI_REGMAX 255
: to facilitate the PCI express config R/W?

Maybe. Lemme investigate because PCIe changes this from a well known constant for all pci busses, to a variable one...

Warner

When I added PCIe extended config support, I never took into consideration the userland access point of view. Changing this definition to 4096 might Just Work, and it might Not Work. Dunno. In the 18 months since I implemented it, no other person has asked about userland access. Other than the silly case of people trying to write device drivers in Perl, I'm not sure how much value it gives compared to the stability and security risk it imposes.

Scott
Re: misc questions about the devicedriver arch
william wallace wrote:

On 5/30/06, Scott Long [EMAIL PROTECTED] wrote:

M. Warner Losh wrote:

: THIRD
: Because the PCIE configure space is 4k long, shall we change the
: #define PCI_REGMAX 255
: to facilitate the PCI express config R/W?

Maybe. Lemme investigate because PCIe changes this from a well known constant for all pci busses, to a variable one...

Warner

When I added PCIe extended config support, I never took into consideration the userland access point of view. Changing this definition to 4096 might Just Work, and it might Not Work. Dunno. In the 18 months since I implemented it, no other person has asked about userland access. Other than the silly case of people trying to write device drivers in Perl, I'm not sure how much value it gives compared to the stability and security risk it imposes.

Scott

I have to clarify my intentions: i am not TRYing to do a userland PCI express driver. I just want to make an interesting branch which can do pci express native hot plug and hot remove. With Mr Losh and other gentlemen's help, i am making progress, and now a loadable module is nearly finished. I have borrowed many ideas from Linux, but several fatal difficulties have stalled me, PCI_REGMAX among them. wish to hear from u :) thank u!

The PCI_REGMAX definition is not used by the extended configuration space code. However, this code only exists on i386 right now. I haven't gotten around to implementing it on amd64 yet. Implementing it there is not just a trivial change of the definition. Some platform specific memory map tricks need to be done. It would be possible to port the i386 code wholesale, but that code is not terribly efficient on the amd64 platform. So, what problem are you running into?

Scott
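To make the 256-byte versus 4K distinction concrete, here is a rough sketch of how a register address is formed in the memory-mapped configuration scheme defined by the PCI Express specification, where each function gets its own 4 KB window. It is not the FreeBSD i386 code mentioned above; the function name and the PCIE_REGMAX constant are chosen here for illustration, and the mapped base address is left out.

```python
PCI_REGMAX = 255      # conventional config space: register offsets 0..255
PCIE_REGMAX = 4095    # extended config space: register offsets 0..4095 (name assumed)


def ecam_offset(bus, slot, func, reg):
    """Byte offset of a config register inside the memory-mapped window.

    Layout: bus in bits 27..20, slot (device) in bits 19..15, function in
    bits 14..12, register offset in bits 11..0.
    """
    if not (0 <= bus < 256 and 0 <= slot < 32 and 0 <= func < 8):
        raise ValueError("bus/slot/function out of range")
    if not 0 <= reg <= PCIE_REGMAX:
        raise ValueError("register outside the 4 KB config space")
    return (bus << 20) | (slot << 15) | (func << 12) | reg


if __name__ == "__main__":
    # 0x100 is the first offset that a 255-limited interface cannot reach.
    print(hex(ecam_offset(1, 2, 3, 0x100)))
```

Any interface that validates offsets against a maximum of 255 rejects everything from 0x100 upward, which is where the extended capabilities start. Widening that check is only part of the job, since the window itself still has to be mapped per platform.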
Re: misc questions about the devicedriver arch
william wallace wrote: [...] MSI: I've bantered around different suggestions for an API that will support this. The basic thing that a driver needs from this is to know exactly how many message interrupt vectors are available to it. It can't just register vectors and handlers blindly since the purpose of MSI is to assign special meanings to each vector and allow the driver to handle each one in specifically. [...] I just wanted to briefly say that an MSI implementation has been done recently, and that it should start getting wider circulation and review soon. That's not to say that more work and design can't be done in this area, but we should probably wait a bit and see what has been done already. Scott ___ freebsd-hackers@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-hackers To unsubscribe, send any mail to [EMAIL PROTECTED]
Re: 答复: 答复: help:How to map a physic al address into a kernel address?
[EMAIL PROTECTED] wrote:

Hi guys: The attached file is the sample code of my HBA driver. I made notes at the places where the address translation is needed. Please make comments if possible. Thanks a lot!

Hong

It looks like the primary question that you are asking in the code is this: How to get the kernel virtual address of csio->data_ptr? Correct? The answer is that csio->data_ptr is a kernel virtual address already if the CAM_DATA_PHYS flag is not set. For prepare_sg_table, you can just ignore the case where it isn't a virtual address unless you expect to also write software that will use the flag (CAM was originally written for an application that did use this flag, but its use is no longer common). As for ft_map_sg, the only way that you can be in there is if CAM_DATA_PHYS was not set, so it's safe to say that csio->data_ptr is a kernel virtual address.

One thing to note about your code is that Local_StartIO should be called from within ft_map_sg instead of ft_cam_action. That way the EINPROGRESS status of bus_dmamap_load will be handled correctly. I can describe this more if you have questions.

Scott
Re: help:How to map a physical address into a kernel address?
[EMAIL PROTECTED] wrote: Hi guys: To access sg_table in kernel address, I need to map the starting physical address of a segment into a kernel address. As I know that, we can use phystovirt()/bustovirt(), or kmap()/kmap_atomic() to map a bus/physical address or a physical page into a kernel address in Linux, but I did not find such a function in FreeBSD. Please help me on this, it is very urgent! Thanks a lot! What kind of memory are you trying to access? Are you trying to access memory on the card that is pointed to by PCI base address registers? If so then you need to use the bus_space API. Or are you trying to allocate memory in the kernel and then give the physical address of that memory to your card? If so then you need to use bus_dma. Both Warner and I are happy to help guide you with these APIs. Scott ___ freebsd-hackers@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-hackers To unsubscribe, send any mail to [EMAIL PROTECTED]
Re: RFC: Optionally verbose SYSINIT
This would be awesome, please do it.

Scott

Benno Rice wrote:

One of the things that I found useful both in starting the PowerPC port and in doing the XScale stuff I'm working on is making the SYSINIT stuff done by mi_startup() verbose. This generally requires hacking your own code into mi_startup() to print out which SYSINIT you're up to and the like. jhb recently pointed me at this version he wrote which uses DDB to look up the symbol corresponding to the SYSINIT in question, which makes it even more useful. I would like to commit this version, which I've made optional based on a VERBOSE_SYSINIT option, so as to make it available to anyone else further down the line who's porting to a new architecture.

Comments? Questions?

Index: conf/options
===================================================================
RCS file: /home/ncvs/src/sys/conf/options,v
retrieving revision 1.540
diff -u -r1.540 options
--- conf/options	7 May 2006 18:12:17 -0000	1.540
+++ conf/options	11 May 2006 05:34:26 -0000
@@ -158,6 +158,7 @@
 TURNSTILE_PROFILING
 TTYHOG		opt_tty.h
 VFS_AIO
+VERBOSE_SYSINIT	opt_global.h
 WLCACHE		opt_wavelan.h
 WLDEBUG		opt_wavelan.h
Index: kern/init_main.c
===================================================================
RCS file: /home/ncvs/src/sys/kern/init_main.c,v
retrieving revision 1.262
diff -u -r1.262 init_main.c
--- kern/init_main.c	7 Feb 2006 21:22:01 -0000	1.262
+++ kern/init_main.c	11 May 2006 05:35:21 -0000
@@ -84,6 +84,9 @@
 #include <vm/vm_map.h>
 #include <sys/copyright.h>
 
+#include <ddb/ddb.h>
+#include <ddb/db_sym.h>
+
 void mi_startup(void);				/* Should be elsewhere */
 
 /* Components of the first process -- never freed. */
@@ -169,6 +172,11 @@
 	register struct sysinit **xipp;		/* interior loop of sort*/
 	register struct sysinit *save;		/* bubble*/
 
+#if defined(VERBOSE_SYSINIT)
+	int last;
+	int verbose;
+#endif
+
 	if (sysinit == NULL) {
 		sysinit = SET_BEGIN(sysinit_set);
 		sysinit_end = SET_LIMIT(sysinit_set);
@@ -191,6 +199,14 @@
 		}
 	}
 
+#if defined(VERBOSE_SYSINIT)
+	last = SI_SUB_COPYRIGHT;
+	verbose = 0;
+#if !defined(DDB)
+	printf("VERBOSE_SYSINIT: DDB not enabled, symbol lookups disabled.\n");
+#endif
+#endif
+
 	/*
 	 * Traverse the (now) ordered list of system initialization tasks.
 	 * Perform each task, and continue on to the next task.
@@ -206,9 +222,38 @@
 		if ((*sipp)->subsystem == SI_SUB_DONE)
 			continue;
 
+#if defined(VERBOSE_SYSINIT)
+		if ((*sipp)->subsystem > last) {
+			verbose = 1;
+			last = (*sipp)->subsystem;
+			printf("subsystem %x\n", last);
+		}
+		if (verbose) {
+#if defined(DDB)
+			const char *name;
+			c_db_sym_t sym;
+			db_expr_t offset;
+
+			sym = db_search_symbol((vm_offset_t)(*sipp)->func,
+			    DB_STGY_PROC, &offset);
+			db_symbol_values(sym, &name, NULL);
+			if (name != NULL)
+				printf("   %s(%p)... ", name, (*sipp)->udata);
+			else
+#endif
+				printf("   %p(%p)... ", (*sipp)->func,
+				    (*sipp)->udata);
+		}
+#endif
+
 		/* Call function */
 		(*((*sipp)->func))((*sipp)->udata);
 
+#if defined(VERBOSE_SYSINIT)
+		if (verbose)
+			printf("done.\n");
+#endif
+
 		/* Check off the one we're just done */
 		(*sipp)->subsystem = SI_SUB_DONE;
Re: FreeBSD 6.1 Released
Mike Jakubik wrote: Jonathan Noack wrote: The *entire* errata page was from 6.0; it was a mistake. This wasn't some put on the rose-colored classes and gloss over major issues thing. It was a long release cycle and something was forgotten. C'est la vie. It's always a good idea to check the most up-to-date version of the errata page on the web anyway, so it's *not* too late to update it. How convenient. These problems needed to be addressed in the release notes, not some on line version. So, you're still waiting for Scott to personally fix the problems and he couldn't deliver? Huh? I quote you (http://lists.freebsd.org/pipermail/freebsd-stable/2006-May/025209.html): Scott, thanks for the very generous gesture, but i cant ask you something like this. He emailed me personally, i accepted his help offer, never heard from him since. Sorry, things got lost in the shuffle to get this released. For your specific snapshot deadlocks, please test the changes that have gone into 7-CURRENT and report back if they fix your problem. Only with active testing will we know if they are good to be backported in time for 6.2. Scott ___ freebsd-hackers@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-hackers To unsubscribe, send any mail to [EMAIL PROTECTED]
Re: Atomic updates of NFS export lists
Andrey Simonenko wrote: Greetings, In my environment non-atomic updates of NFS export lists are not acceptable. So, I decided to correct this problem. As the result mountd, kern/vfs_export.c were completely rewritten, mount.h, vfs_mount.c and nfs_srvsubs.c also got changes. For details see kern/9619. I've been looking at this since my company is also running into these problems. I've integrated your patchset into my tree, and I'll let you know how it works after a few days of testing. One thing to note is that you've significantly re-written much of mountd, as well as changed the API/ABI a bit and removed some command line switches. That makes it less attractive for inclusion in RELENG_6, but is fine for 7-CURRENT. With that in mind, you should switch over to using nmount() instead of mount(), that way you can completely remove the per-filesystem handling code that you added. If there is any way that you can trim the changes to just implement the new export primitives and leave out the libsock stuff, it would be much easier to justify getting into RELENG_6. I don't have an opinion on the libsock design, but you should talk to people like Robert Watson about that before this goes into 7-CURRENT. But thank you very much for this. It was a pleasant surprise to see this after I had been talking to others about exactly these problems for a few weeks. Hopefully we can get this integrated into FreeBSD soon. Scott ___ freebsd-hackers@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-hackers To unsubscribe, send any mail to [EMAIL PROTECTED]
FreeBSD 6.1 Released
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1

It is my great pleasure and privilege to announce the availability of FreeBSD 6.1-RELEASE. This release is the next step in the development of the 6.X branch, delivering several performance improvements, many bugfixes, and a few new features. These include:

~ Addition of a keyboard multiplexer. This allows USB and PS/2 keyboards to coexist without any special options at boot.
~ Many fixes for filesystem stability. High load stress tests are now run successfully on a regular basis as part of the normal FreeBSD QA process.
~ Automatic configuration for many Bluetooth devices, as well as automatic support for running WiFi access points.
~ Addition of drivers for new ethernet and SAS and SATA RAID controllers.
~ BIND updated to 9.3.2
~ sendmail updated to 8.13.6

NOTE: It was discovered at the last minute that the errata notes that were packaged with the release are out of date. For a complete list of known problems, please see the online errata list, available at:

http://www.FreeBSD.org/releases/6.1R/errata.html

For more information about FreeBSD release engineering activities, please see:

http://www.FreeBSD.org/releng

Availability
------------

FreeBSD 6.1-RELEASE supports the i386, pc98, alpha, sparc64, amd64, powerpc, and ia64 architectures and can be installed directly over the net using bootable media or copied to a local NFS/FTP server. Distributions for all architectures are available now.

Please continue to support the FreeBSD Project by purchasing media from one of our supporting vendors. The following companies will be offering FreeBSD 6.1 based products:

~ FreeBSD Mall, Inc. http://www.freebsdmall.com/
~ Daemonnews, Inc. http://www.bsdmall.com/freebsd1.html

If you can't afford FreeBSD on media, are impatient, or just want to use it for evangelism purposes, then by all means download the ISO images. We can't promise that all the mirror sites will carry the larger ISO images, but they will at least be available from the following sites.
MD5 and SHA256 checksums for the release images are included at the bottom of this message. Bittorrent -- The FreeBSD project encourages the use of BitTorrent for distributing the release ISO images. A collection of torrent files to download the images is available at http://torrents.freebsd.org:8080/ FTP --- At the time of this announcement the following FTP sites have FreeBSD 6.1-RELEASE available. ftp://ftp.FreeBSD.org/pub/FreeBSD/ ftp://ftp2.FreeBSD.org/pub/FreeBSD/ ftp://ftp3.FreeBSD.org/pub/FreeBSD/ ftp://ftp5.FreeBSD.org/pub/FreeBSD/ ftp://ftp.at.FreeBSD.org/pub/FreeBSD/ ftp://ftp2.ch.FreeBSD.org/pub/FreeBSD/ ftp://ftp.cz.FreeBSD.org/pub/FreeBSD/ ftp://ftp.ee.FreeBSD.org/pub/FreeBSD/ ftp://ftp.fi.FreeBSD.org/pub/FreeBSD/ ftp://ftp.fr.FreeBSD.org/pub/FreeBSD/ ftp://ftp2.ie.FreeBSD.org/pub/FreeBSD/ ftp://ftp.is.FreeBSD.org/pub/FreeBSD/ ftp://ftp1.ru.FreeBSD.org/pub/FreeBSD/ ftp://ftp.se.FreeBSD.org/pub/FreeBSD/ ftp://ftp.si.FreeBSD.org/pub/FreeBSD/ ftp://ftp2.tw.FreeBSD.org/pub/FreeBSD/ ftp://ftp2.uk.FreeBSD.org/pub/FreeBSD/ ftp://ftp2.us.FreeBSD.org/pub/FreeBSD/ ftp://ftp5.us.FreeBSD.org/pub/FreeBSD/ FreeBSD is also available via anonymous FTP from mirror sites in the following countries: Argentina, Australia, Brazil, Bulgaria, Canada, China, Czech Republic, Denmark, Estonia, Finland, France, Germany, Hong Kong, Hungary, Iceland, Ireland, Israel, Japan, Korea, Lithuania, Amylonia, the Netherlands, New Zealand, Poland, Portugal, Romania, Russia, Saudi Arabia, South Africa, Slovak Republic, Slovenia, Spain, Sweden, Taiwan, Thailand, Ukraine, and the United Kingdom. Before trying the central FTP site, please check your regional mirror(s) first by going to: ftp://ftp.yourdomain.FreeBSD.org/pub/FreeBSD Any additional mirror sites will be labeled ftp2, ftp3 and so on. 
More information about FreeBSD mirror sites can be found at: http://www.FreeBSD.org/doc/en_US.ISO8859-1/books/handbook/mirrors-ftp.html For instructions on installing FreeBSD, please see Chapter 2 of The FreeBSD Handbook. It provides a complete installation walk-through for users new to FreeBSD, and can be found online at: http://www.FreeBSD.org/doc/en_US.ISO8859-1/books/handbook/install.html Acknowledgments Many companies donated equipment, network access, or man-hours to finance the release engineering activities for FreeBSD 6.1 including The FreeBSD Foundation, FreeBSD Systems, Hewlett-Packard, Yahoo!, Sentex Communications, and Copan Systems. The release engineering team for 6.1-RELEASE includes: Scott Long [EMAIL PROTECTED] Release Engineering, Ken Smith [EMAIL PROTECTED]I386, AMD64, Sparc64 Release Building, Mirror Site Coordination Robert Watson [EMAIL PROTECTED] Release Engineering, Security Doug White [EMAIL
Re: Core Duo - only one cpu being used
Erich Dollansky wrote:

Hi,

Eric Anderson wrote:

Erich Dollansky wrote:

Hi,

Eric Anderson wrote:

  PID USERNAME  THR PRI NICE   SIZE    RES STATE  C   TIME   WCPU COMMAND
   11 root        1 171   52     0K     8K CPU1   0   0:00 99.02% idle: cpu1
 2653 root        1 128    0 18564K 17560K RUN    0   0:01 34.00% cc1plus

could it be that it is just a problem with top itself? It cannot be that CPU1 uses 99% for the idle process and 34% for the compiler. Play with the other sort options. You might find the idle process for CPU0.

Is this what you want:

$ ps -auxw | grep idle
root    11 99.0  0.0     0     8  ??  RL    7:45PM   0:00.00 [idle: cpu1]
root    12  0.0  0.0     0     8  ??  RL    7:45PM  51:04.57 [idle: cpu0]

something is really wrong here. CPU1 gets 99% of the time but then uses only 0 seconds, while CPU0 gets 0% of the time but uses 51 minutes?

CPU1 is being treated as a hyperthreading core instead of a real core, and is being disabled per our policy on Intel hyperthreading. By 'disabled' I mean that it is started, but it is being excluded from scheduling decisions, and thus is only running its idle proc. It's also handling any interrupts that come to it, such as timer and IPI interrupts, so it's at 99% instead of 100% for the idle proc. There is nothing broken about the numbers you are seeing; your system is just running under a scheduling policy that it should not be. This should have been fixed a week or so ago by a commit to HEAD, RELENG_6, and RELENG_6_1 by Colin Percival. How old is your kernel?

Scott
FreeBSD 6.1-RC2 available
All,

I'm foregoing the formal pretty announcement for 6.1-RC2 because the message needs to get out and I don't have an hour to spend on making it look nice. FreeBSD 6.1-RC2 is available for download. This is the last RC before the release. Please test it to make sure that there have been no regressions since the last RC, and to make sure that there are no new problems with installation. Other than a few cosmetic tweaks, there will be no more changes before 6.1.

The list of known issues:

- Using UFS snapshots and quotas at the same time can cause system lockups. There is no work-around available at this time, so please avoid this configuration. This will be fixed in a future release.
- Under rare and heavily loaded circumstances, it is possible to leak pty's. This can result in not being able to log into the system. The cause of this is not well understood, and it appears to be very difficult to trigger.
- DEVFS is known to have several problems with multiple processes doing directory listings at the same time, as well as with unmounting DEVFS directories at the same time. There is no known work-around for this at this time. This will be fixed in a future release.
- A number of improvements and fixes for various drivers have come in at the last minute that still require much more testing and validation. This includes the 'if_nve' and 'if_bge' drivers in particular. These updates will be included in future releases.
MD5 (6.1-RC1-amd64-bootonly.iso) = 93abe294e7678e00b7391f47a01074fe
MD5 (6.1-RC1-amd64-disc1.iso) = c1b718b6752f0e48edb8b822ee9b0dc8
MD5 (6.1-RC1-amd64-disc2.iso) = 4a67ae8ed7a7852e08442205d6a5cd7c
MD5 (6.1-RC1-i386-bootonly.iso) = b56aac9ca1a868daaf5673cd21bf78f5
MD5 (6.1-RC1-i386-disc1.iso) = 12521c3f9d40f637e4cdb40ea398d072
MD5 (6.1-RC1-i386-disc2.iso) = 53615f19889fe85c41e2bcea0b2be525
MD5 (6.1-RC2-ia64-bootonly.iso) = 481e6f1899c0ba632272e7853b8ef59e
MD5 (6.1-RC2-ia64-disc1.iso) = f4601bb9089af1bcde5b751f5762f35a
MD5 (6.1-RC2-ia64-disc2.iso) = b44d5a0538b784cbb5de0a8ec23e4256
MD5 (6.1-RC2-ia64-livefs.iso) = 0fe8b66a80edaa50ac353d5471930035
MD5 (6.1-RC2-pc98-disc1.iso) = 773a64a475596d586d0a1573d88310cc

SHA256 (6.1-RC1-amd64-bootonly.iso) = 88e072b4898692813517aa254a33f1e7469de0e590c36bfb3e92cb120ac0ad16
SHA256 (6.1-RC1-amd64-disc1.iso) = 017e69c5461fe2c865a395830dde88c8a55e7ec83d9a195b3b619346b44f9cc6
SHA256 (6.1-RC1-amd64-disc2.iso) = 81624f3b8dfa67ceab1dc6ec0a94c4485ad85955321c39d13c9ab4a678f776ef
SHA256 (6.1-RC1-i386-bootonly.iso) = ec1a3fbf53186b5bc44dbfcdc77872c847f3c55532bb62f2afb4133328e7994f
SHA256 (6.1-RC1-i386-disc1.iso) = e0b83f2cbd27db20f330036d0a25b8366b9e45df4b9c09354f76e584a9eb3b83
SHA256 (6.1-RC1-i386-disc2.iso) = de1fe5009229efd44b25bb18c4e68b03027259171cd9e017fe5bffadaa3402bb
SHA256 (6.1-RC2-ia64-bootonly.iso) = c044989257754fa17daa352f76c3e011dfc04b3b242c2153c7a1ec47a773d4d1
SHA256 (6.1-RC2-ia64-disc1.iso) = 60bec7c25b8f645a9d20d3240397c7a92f42d24ff5d01b4604ece5f9ee499ccc
SHA256 (6.1-RC2-ia64-disc2.iso) = 854048d4ba4dcf00657501d36a5fb15a94ed4c20e646031960ebc3315c3a513e
SHA256 (6.1-RC2-ia64-livefs.iso) = fb3fadb00c9ddb6233172a34a7d47ab80171b54410835954c20f50849359ee73
SHA256 (6.1-RC2-pc98-disc1.iso) = f2b5f17a3355465727613e33807964a0cf92d9c02868cd2c25440995b2c6ebfd
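For anyone testing the images, the checksums above can be compared against a downloaded file with a few lines of Python. This is only a minimal sketch and not part of the release tooling: the function names `file_digest` and `matches_announcement` are made up for illustration, and it assumes the ISO has already been downloaded to a local path.

```python
import hashlib


def file_digest(path, algorithm="sha256", chunk_size=1 << 20):
    """Return the hex digest of the file at `path`, reading it in chunks
    so that a full ISO image never has to fit in memory."""
    h = hashlib.new(algorithm)
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def matches_announcement(path, expected_hex, algorithm="sha256"):
    """Compare a file's digest with the value copied from the announcement.
    Surrounding whitespace and letter case in `expected_hex` are ignored."""
    return file_digest(path, algorithm) == expected_hex.strip().lower()
```

A call such as `matches_announcement("6.1-RC2-pc98-disc1.iso", "f2b5f17a...", "sha256")` (with the full digest pasted in, and the file name being whatever the image was saved as locally) returns True only when the download matches the announced value. Checking SHA256 in addition to MD5 is the more conservative choice.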
Re: Zero Copy, FreeBSD and Linus Torvalds opinion
Iantcho Vassilev wrote:

Hello guys,
on bsdnews.com I found this link http://kerneltrap.org/node/6506 and particularly this:

"I claim that Mach people (and apparently FreeBSD) are incompetent idiots. Playing games with VM is bad. memory copies are _also_ bad, but quite frankly, memory copies often have _less_ downside than VM games, and bigger caches will only continue to drive that point home."

What do you think about it?

I claim that Linus is an attention whore. How about that?

Scott
FreeBSD 6.1-RC1 Available
Announcement

The FreeBSD Release Engineering Team is pleased to announce the availability of FreeBSD 6.1-RC1. It is meant to be a refinement of the 6-STABLE branch, with few dramatic changes. A lot of bugfixes have been made, some drivers have been updated, and some areas have been tweaked for better performance, etc., but no large changes have been made to the basic architecture.

This RC is late in coming due to many more bugs being fixed, as well as a keyboard multiplexer being added. This is enabled by default via the 'kbdmux' driver and allows multiple keyboards of any type to be plugged in and work at once. In turn, the boot menu option to handle USB keyboards specially has been removed as it is no longer needed. This feature has been tested for several months, but more testing is always needed.

We encourage people to help with testing so any final bugs can be identified and worked out. Availability of ISO images is given below. If you have an older system you want to update using the normal CVS/cvsup source-based upgrade, the branch tag to use is RELENG_6_1. Problem reports can be submitted using the send-pr(1) command.

The FreeBSD 5.5 release process is on hold while we put the final touches on 6.1. It will resume within 1-2 weeks with a 5.5-RC1 release.

The list of open issues and things still being worked on is on the todo list:

http://www.freebsd.org/releases/6.1R/todo.html

Known Issues

The NDIS driver is known to not work correctly with the wpa_supplicant package. This will be fixed for the release.

A string termination problem can cause geom(8) commands to abort randomly. This will be fixed for the release.

Availability

The RC1 ISOs and FTP support are available on most of the FreeBSD Mirror sites.
A list of the mirror sites is available here:

http://www.freebsd.org/doc/en_US.ISO8859-1/books/handbook/mirrors-ftp.html

The MD5s are:

MD5 (6.1-RC1-alpha-disc1.iso) = c9ce5255facfc44a30f34f55dd6ee2d6
MD5 (6.1-RC1-alpha-bootonly.iso) = 9277c6cd9c5dd17a9aeaa9394fe1d2f8
MD5 (6.1-RC1-amd64-bootonly.iso) = 93abe294e7678e00b7391f47a01074fe
MD5 (6.1-RC1-amd64-disc1.iso) = c1b718b6752f0e48edb8b822ee9b0dc8
MD5 (6.1-RC1-amd64-disc2.iso) = 4a67ae8ed7a7852e08442205d6a5cd7c
MD5 (6.1-RC1-i386-bootonly.iso) = b56aac9ca1a868daaf5673cd21bf78f5
MD5 (6.1-RC1-i386-disc1.iso) = 12521c3f9d40f637e4cdb40ea398d072
MD5 (6.1-RC1-i386-disc2.iso) = 53615f19889fe85c41e2bcea0b2be525
MD5 (6.1-RC1-ia64-bootonly.iso) = b2f284c8f6c28455ac59cb37ff2f6658
MD5 (6.1-RC1-ia64-disc1.iso) = e1385927a3272674512f8205aef0addc
MD5 (6.1-RC1-ia64-disc2.iso) = 8b212b2f0914f13621996a8ad8397c71
MD5 (6.1-RC1-ia64-livefs.iso) = 75fdb240f273e7f9b2cdefcab43144c2
MD5 (6.1-RC1-pc98-disc1.iso) = 65465e3298efc5122607ea3d0b3e7136

SHA256 (6.1-RC1-alpha-disc1.iso) = d40b8e3e1944f28c5ba1b1f55eb7b5cc22472177116b98f85f2c5bb0ffb59a5f
SHA256 (6.1-RC1-alpha-bootonly.iso) = 43ceaf712475d00b7287d09753635383cb284ad8fc63b98608e55ab458aed157
SHA256 (6.1-RC1-amd64-bootonly.iso) = 88e072b4898692813517aa254a33f1e7469de0e590c36bfb3e92cb120ac0ad16
SHA256 (6.1-RC1-amd64-disc1.iso) = 017e69c5461fe2c865a395830dde88c8a55e7ec83d9a195b3b619346b44f9cc6
SHA256 (6.1-RC1-amd64-disc2.iso) = 81624f3b8dfa67ceab1dc6ec0a94c4485ad85955321c39d13c9ab4a678f776ef
SHA256 (6.1-RC1-i386-bootonly.iso) = ec1a3fbf53186b5bc44dbfcdc77872c847f3c55532bb62f2afb4133328e7994f
SHA256 (6.1-RC1-i386-disc1.iso) = e0b83f2cbd27db20f330036d0a25b8366b9e45df4b9c09354f76e584a9eb3b83
SHA256 (6.1-RC1-i386-disc2.iso) = de1fe5009229efd44b25bb18c4e68b03027259171cd9e017fe5bffadaa3402bb
SHA256 (6.1-RC1-ia64-bootonly.iso) = 2b7290e4babcb647ec8b2a499fc5e0fc6918ac20ee0432e9ce2d216c84a540fa
SHA256 (6.1-RC1-ia64-disc1.iso) = 64c035cf52544e2088720d6e6e602f95b53f6d7b342431a1aa2f4b4c32438847
SHA256 (6.1-RC1-ia64-disc2.iso) = e4f2f0599e9e80edec10f30de565e34e16bd8716da6673834036ce4c66d32dc0
SHA256 (6.1-RC1-ia64-livefs.iso) = cac9b95ba01a69d78af0034d96868a7b93e20840f3ec0eacb7a30c7e7dd0b39a
SHA256 (6.1-RC1-pc98-disc1.iso) = b797113a34628130f759dc9756b741230dcd064f4519f1a477b3491a02f346ca
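Checksum lists like this one are easily damaged when a message is quoted, re-wrapped, or archived, so it can be worth sanity-checking the list itself before trusting an entry. The sketch below is an illustration only: it assumes one `ALGO (filename) = hexdigest` entry per line, and the helper name `parse_checksum_lines` is invented here. It relies on the fact that an MD5 digest is 32 hex characters and a SHA256 digest is 64.

```python
import re

# Expected hex-digest length for each algorithm used in the announcements.
DIGEST_HEX_LENGTH = {"MD5": 32, "SHA256": 64}

# Matches lines such as "MD5 (6.1-RC1-pc98-disc1.iso) = 6546...7136".
LINE_RE = re.compile(r"^(MD5|SHA256) \((.+)\) = ([0-9a-fA-F]*)$")


def parse_checksum_lines(text):
    """Split announcement text into well-formed and malformed entries.

    Returns (entries, malformed), where `entries` maps
    (algorithm, filename) to a lowercase hex digest and `malformed`
    lists the non-empty lines that did not parse or whose digest has
    the wrong length (for example, a truncated value).
    """
    entries = {}
    malformed = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        m = LINE_RE.match(line)
        if m and len(m.group(3)) == DIGEST_HEX_LENGTH[m.group(1)]:
            entries[(m.group(1), m.group(2))] = m.group(3).lower()
        else:
            malformed.append(line)
    return entries, malformed
```

Any line that ends up in `malformed` should be re-fetched from the original announcement or a mirror rather than repaired by hand.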
Re: Context switching
Nickolas wrote:

Hello All!

I'm porting a CPI card driver from Linux to FreeBSD. Some initialization routines require a lot of time (~1-2 seconds). Initialization of the hardware should be done when the device special file is opened, so I need to switch thread context. I'm doing it this way:

mi_switch(SW_VOL, choosethread());

Main trouble: system panic after program exit. dmesg output:

--
Fatal trap 12: page fault while in user mode
fault virtual address   = 0xbfbfe5bc
fault code              = user write, protection violation
instruction pointer     = 0x1f:0x8074604
stack pointer           = 0x2f:0xbfbfe5c0
frame pointer           = 0x2f:0xbfbfe5f8
code segment            = base 0xc090f8c0, limit 0x0, type 0x13
                        = DPL 0, pres 1, def32 1, gran 1
processor eflags        = interrupt enabled, resume, IOPL = 0
current process         = 472 (bash)
trap number             = 12
panic: page fault
--

Please tell me how correct context switching should be implemented.

OS version: FreeBSD 5.4

tsleep and msleep are the appropriate ways to context switch. mi_switch is an implementation detail of the scheduler.

Scott
Re: Using any network interface whatsoever
Ceri Davies wrote:
On Fri, Apr 07, 2006 at 03:57:42PM -0700, Brooks Davis wrote:
On Fri, Apr 07, 2006 at 11:53:42PM +0100, Ceri Davies wrote:

I'm trying to configure a bootable image to be used in various situations and on various (mostly unknown) hardware. For the filesystem I can use geom_label and /dev/ufs/UnlikelyString, but I'd also like to have it try to configure whatever interfaces the machine happens to have via DHCP. Other than specifying ifconfig_if0=DHCP once for every possible value of if, is there a mechanism to do this already?

ifconfig_DEFAULT

Superb, thank you!

If you have non-Ethernet-like interfaces compiled in, you will probably want to create empty ifconfig_if variables for them, since DHCP won't work very well there. :)

Good point, thanks again :)

Ceri

Well, the real question is why we force the details of driver names onto users. Network and storage drivers are especially guilty of this, but tty devices also are annoying.

Scott
Re: Using any network interface whatsoever
Mike Meyer wrote:
Scott Long typed:

Well, the real question is why we force the details of driver names onto users. Network and storage drivers are especially guilty of this, but tty devices also are annoying.

Because Unix has always made the hardware details available to administrators. Times have changed so that users now need to do things that used to be restricted to administrators.

This historical behavior is a *good* thing. If all devices of type foo are just named foo and assigned numbers by the system, then I have no control over the names. If I don't care which is which, this isn't a problem. If I do care - for instance, I want to distinguish between the ethernet interface that's on the internet and the one that's on my LAN, or I want root to be on the disk with the root file system on it - then this is a PITA, because every time I add hardware to the system, or re-arrange the cards in the cage, or similar things, I risk breaking the system configuration. If the device names are completely determined by the hardware settings, then this doesn't happen.

Real world examples of this type of breakage include a FreeBSD 4.x system with SCSI disks that failed to boot when a USB mass storage device was plugged into it, and a Solaris system that started swapping on its Ingres raw database partition after a disk was added.

If a system is meant for desktop use where you typically have at most one of anything, then hiding the names from the users is a good thing. In a server environment, where you may have multiple instances of several different device types, then being able to easily tell which is which is a good thing.

mike

Your argument here doesn't really make sense. You're saying that instead of /dev/da0, we should have /dev/HITACHI-HUS103073FL3800-SA19-B0T1L0, and instead of em0, it should be em0-192.168.254.199-24-192.168.254.1-192.168.254.255, right? That way all the information is present and there is no chance of mixing up devices.

I'm not saying that we should get rid of the device information. I'm fully happy making it available to top layer applications. Administrators definitely need the information to make good decisions. But the information isn't always needed, and it does make simple management tasks harder. It also adds complexity that can lead to problems. Why, when I add a RAID driver, do I also need to hack up sysinstall so that it'll recognise the RAID devices? This is 2006, not 1976! The computer should be helping us in administration tasks, not hiding behind inconsistent and obscure names.

Now, for your specific case of SCSI, it is possible for the administrator to wire down device assignments. How to do this has been documented in man pages and kernel config files for years, most recently by me personally. The flaw is that it still requires specific operator intervention to make it work. That's where things like volume labels come in.

Does a sysadmin care about the low-level device name for a drive on a Windows or Mac system? Does he even know without taking a deep look inside the system? Does not knowing it make it any less possible to easily and reliably manage and control the hardware? It's all done through human-readable labels that are easy to work with. The low level information is still available when needed, but it's not the primary means of control. I think that's fine; it strikes the balance between control and ease of use that I'm looking for.

Scott
Re: Using any network interface whatsoever
Mike Meyer wrote:
Scott Long typed:

Please trim the text you are replying to.

Please, I'm tired of arbitrary email etiquette.

But where do you put the label on an ethernet interface?

mike

It sounds like your message is, don't be like Linux. Fine, what do you want instead? How does having 2 em devices in my system, named em0 and em1, tell me by name which one is connected to which LAN?

Scott
Re: Using any network interface whatsoever
Ceri Davies wrote:
On Sat, Apr 08, 2006 at 08:34:30AM -0600, Scott Long wrote:
On Fri, Apr 07, 2006 at 11:53:42PM +0100, Ceri Davies wrote:

For the filesystem I can use geom_label and /dev/ufs/UnlikelyString, but I'd also like to have it try to configure whatever interfaces the machine happens to have via DHCP. Other than specifying ifconfig_if0=DHCP once for every possible value of if, is there a mechanism to do this already?

ifconfig_DEFAULT

Well, the real question is why we force the details of driver names onto users. Network and storage drivers are especially guilty of this, but tty devices also are annoying.

The current situation on BSD, where I can identify which interface is meant by its type, is definitely preferable to the Linux situation where eth0 may mean something different tomorrow depending on what is plugged in. Since we can rename devices arbitrarily, I don't really see a problem with respect to anything else.

Ceri

I'll say again, how does having em0, em1, em2, and em3 help me know what is going on with each of those interfaces?

Scott
Re: FreeBSD Kernel Quality?
Robert Huff wrote:
Sam Leffler writes:

OTOH we've done nothing with user application code and based on the work I've seen done by netbsd there's plenty of stuff to be fixed there.

When you say user application code, is this an alias for ports or do you mean non-ported applications?

Robert Huff

user application code == code not in src/sys/... That means src/lib, src/bin, src/sbin, src/usr.bin, etc.

Scott
Re: patchset-9 release (Re: [unionfs][patch] improvements of the unionfs - Problem Report, kern/91010)
Jacques Marneweck wrote:
Danny Braniss wrote:
Daichi GOTO wrote:

Everyone interested in the improved unionfs should keep paying attention and ask "how about a merge?" at every turn :)

OK. How about a merge? I'd really like to see this in 6-STABLE.

Regards, Jan Mikkelsen.

just a 'me too'. I've been running with the patch (under 6.1) and it's definitely better than the panics with the unpatched version. in other words, IMHO, it does not break anything, and it actually fixes some things.

danny

Any ETA to when we can see this merged into 6.1 and 5.5?

Regards --jm

Since it's not in HEAD yet, it's pretty improbable that it'll get into 5.5 and 6.1. It would be nice to get it in for 6.2 though.

Scott
Re: patchset-9 release (Re: [unionfs][patch] improvements of the unionfs - Problem Report, kern/91010)
Daichi GOTO wrote:
Jan Mikkelsen wrote:
Daichi GOTO wrote:

Everyone interested in the improved unionfs should keep paying attention and ask "how about a merge?" at every turn :)

OK. How about a merge? I'd really like to see this in 6-STABLE.

Me too, but unfortunately it is difficult for several reasons (details at http://people.freebsd.org/~daichi/unionfs/). Our patch meets the conditions for integration into -current. For -stable it does not: integration into -stable requires keeping API compatibility for commands and libraries, and for specific technical reasons our improved unionfs is slightly incompatible. In the end, integration into -stable will be left to the judgment of the src committers and others.

Regards, Jan Mikkelsen.

Right now, unionfs is somewhat usable for read-only purposes. As long as your work doesn't alter or break the behaviour of read-only mounts, I think it's very much ready to go into CVS. From there it can get wider testing and review and be considered for 6-stable. Since read-write support in the existing code is pretty much worthless, I don't think that there will be a problem justifying the operational changes that you describe in your documentation.

Scott
Re: 6.1-PRE boot locks up, using USB keyboard
John Baldwin wrote: On Wednesday 15 March 2006 12:11, Rick C. Petty wrote: On Wed, Mar 15, 2006 at 10:46:01AM -0500, John Baldwin wrote: I'm using a USB keyboard, no PS/2. I've tried the hint to disable kbdmux, I've tried with and without selecting the Boot w/ USB keyboard and the machine locks up in the same spot no matter what I try. The same hardware boots just fine with 6.0-RELEASE (although I need to choose the USB keyboard option if I plan on typing). Any suggestions? What if you turn off USB keyboard support in your BIOS? My BIOS (Asus A8N-E rev 1010) has no option for disabling USB keyboard support, but I can either disable the USB controller or disable the USB legacy support. I doubt either of these is desirable. Fortunately, I discovered the problem.. The legacy support option is the one that makes a USB keyboard look like a PS/2 keyboard. The ukbd device is compiled into GENERIC. I also had ukbd_load=YES in my loader.conf so it would be compatible with a custom kernel. When GENERIC boots, I get the message that ukbd is already loaded (file exists). I would expect that the kernel just ignores the attempt, but apparently there is an adverse effect. Whenever ukbd is loaded by /boot/loader and that device already exists in the kernel, the boot locks up after: atkbdc0: Keyboard controller (i8042) at port 0x60,0x64 on isa0 when using a USB keyboard. I would think this is a bug. It is 100% repeatable for me. If I comment out the line in /boot/loader.conf, the system boots nicely. Perhaps this is related to kbdmux(4), but I'm not sure. I've also noticed related problems when trying to load umass and ums through the boot loader and manually (I will try to reproduce these). Maybe the problem is in the USB layer?? FYI, I tried this on 6.1-BETA4, fresh from the ISOs. Ok. There are several edge cases that can blow up if you kldload a module or load a module from the loader that is already present in the kernel. 
Alternately, I've heard from some people with a similar problem that turning off USB2 but leaving plain USB on avoids the problem. I'm not exactly sure how or why this is, but it's worth a try I guess.

Scott
FreeBSD 6.1-BETA2/FreeBSD 5.5-BETA2 Available
Announcement

The FreeBSD Release Engineering Team is pleased to announce the availability of FreeBSD 6.1-BETA4 and FreeBSD 5.5-BETA4. Both FreeBSD 6.1 and FreeBSD 5.5 are meant to be a refinement of their respective branches with few dramatic changes. A lot of bugfixes have been made, some drivers have been updated, and some areas have been tweaked for better performance, etc., but no large changes have been made to the basic architecture.

The FreeBSD 5.5 Release is being done for people who are unable to make the jump to FreeBSD 6.X at this time. We do encourage people to make that transition as soon as possible, though. There have been some updates made between FreeBSD 5.4 and FreeBSD 5.5, but not all of the bugfixes done to RELENG_6 have been backported to RELENG_5. This will almost certainly be the last 5.X release.

We encourage people to help with testing so any final bugs can be identified and worked out. Availability of ISO images is given below. If you have an older system you want to update using the normal CVS/cvsup source-based upgrade, the branch tag to use is RELENG_6 for 6.1 and RELENG_5 for 5.5, though that will change later in the release cycle when we start doing the Release Candidates. Problem reports can be submitted using the send-pr(1) command.

The lists of open issues and things still being worked on are on the todo lists:

http://www.freebsd.org/releases/6.1R/todo.html
http://www.freebsd.org/releases/5.5R/todo.html

Known Issues

A couple of significant changes were made to 6.1-BETA4. First is a large set of fixes to the VFS layer and various filesystems that should significantly help performance under heavy load and also fix problems with forcefully unmounting these filesystems. While these changes have received considerable developer testing, users are requested to test filesystem stability as much as possible to ensure that there are no regressions.
The second large change is that sysinstall will now install both the GENERIC and SMP kernels and automatically select the appropriate one based on whether it detects one CPU in the system or multiple CPUs. However, single CPU systems with hyperthreading will still be treated as uni-processor by sysinstall. The automatic selection can be overridden within sysinstall. Testing of this is requested to help identify systems that are not detected correctly.

Availability

The BETA4 ISOs and FTP support are available on most of the FreeBSD Mirror sites. A list of the mirror sites is available here:

http://www.freebsd.org/doc/en_US.ISO8859-1/books/handbook/mirrors-ftp.html

The MD5s are:

MD5 (5.5-BETA4-pc98-disc1.iso) = bf6cf1238c000a01fe8c34ed4554e66e
MD5 (5.5-BETA4-alpha-bootonly.iso) = 84e55974d8854692a85d43558e10c658
MD5 (5.5-BETA4-alpha-disc1.iso) = b5fc0a01dc6cb96924c7cd9c18af6dd9
MD5 (5.5-BETA4-i386-bootonly.iso) = e54261162e775b692138597ae2512bf6
MD5 (5.5-BETA4-i386-disc1.iso) = 15f3161d4c5f996bbbc9b28682198dde
MD5 (5.5-BETA4-i386-disc2.iso) = 10c4c7985eea736862480b0212e60155
MD5 (5.5-BETA4-amd64-bootonly.iso) = 371522c9ab80e7d5c0fadd71d670543e
MD5 (5.5-BETA4-amd64-disc1.iso) = 4811510b11620f8706f51e4e0154b8b1
MD5 (5.5-BETA4-amd64-disc2.iso) = 54ee8ec3240de9d84576f1f0cbf2048a
MD5 (6.1-BETA4-pc98-disc1.iso) = ee891f12ddb7b62b2c1d3672555ceb3b
MD5 (6.1-BETA4-ia64-bootonly.iso) = b05da331e737c6c614acbb924584199e
MD5 (6.1-BETA4-ia64-disc1.iso) = d0b09231c1d55308fb8fb45eec845284
MD5 (6.1-BETA4-ia64-livefs.iso) = 9db26824bb09ee98e33eceadb643
MD5 (6.1-BETA4-alpha-bootonly.iso) = c77a7d80803efeba3b8140072ccd4969
MD5 (6.1-BETA4-alpha-disc1.iso) = b2701eb2931dd815b3b595f9183ae5a4
MD5 (6.1-BETA4-i386-bootonly.iso) = 113f1b990d298aa8b7f81d93a3636dc3
MD5 (6.1-BETA4-i386-disc1.iso) = aee3a4416eec24b1795346efeb624416
MD5 (6.1-BETA4-i386-disc2.iso) = 01b01719f7a06d2613a3e9fe15417b3f
MD5 (6.1-BETA4-amd64-bootonly.iso) = c52a2081931d89cbbebf50f198e8b169
MD5 (6.1-BETA4-amd64-disc1.iso) = 5624a6ba41abdc60802d21be9ab4cf6e
MD5 (6.1-BETA4-amd64-disc2.iso) = db427ec7ab4af75a8224a890c476846d

The SHA256s are:

SHA256 (5.5-BETA4-pc98-disc1.iso) = 0574c7db49a81c77d1d9cded1add28451026fee5ca52605be03ab6660f9b5ab5
SHA256 (5.5-BETA4-i386-bootonly.iso) = 077a4b6561311af08d9f760d734fc822d2589554f4a25ac413cfeb275a59361c
SHA256 (5.5-BETA4-i386-disc1.iso) = 3367499f48d7fdc526a1f447f8e83ee4eef7a76e74784eb7471124499440e05d
SHA256 (5.5-BETA4-i386-disc2.iso) = 40751884348826807f6c24ec78d424568be21c3e64fcb002f7cfd2bf9ec3bfa7
SHA256 (5.5-BETA4-amd64-bootonly.iso) = ca7390623cfa64589a4f19c80fa557e8a6085c54335811010c6b609a4202fd20
SHA256 (5.5-BETA4-amd64-disc1.iso) = a9cba0901cf6747193e173eb1c1d1b843ddaa59d12f2e8e84d0d634d2aba5bb0
SHA256 (5.5-BETA4-amd64-disc2.iso) = 763d5dfe8d7bed4dadc8850e04e2ad04b8c3f4ae6e283df6cfdd25b9475a80d0
SHA256 (6.1-BETA4-pc98-disc1.iso) = 3de12c8ec0d65a651bcb200049c5fa7b4a9228ebea6f64dea15b35ab07d19178
SHA256 (6.1-BETA4-ia64-bootonly.iso) =
BETA4! [Re: FreeBSD 6.1-BETA2/FreeBSD 5.5-BETA2 Available]
Sorry, I accidentally sent out an incomplete draft. This announcement is for BETA4, of course. Also, the note about VFS changes in the announcement should stress that the changes were made for stability, not performance. Sorry for the confusion.
Re: VMWARE GSX Port?
Aniruddha Bohra wrote:
On Thu, 2006-03-02 at 13:28 -0800, Kip Macy wrote:

-CURRENT runs on 3.0 as a domU. There is partial dom0 support. The changes have not gone back into the mainline because xenbus is extremely difficult to integrate cleanly. You can check on the state of the xen3 branch in perforce.

At several places the difficulties are mentioned. Is there some place that these are listed or discussed in more detail?

Thanks
Aniruddha

The difficulty is that Xen uses a Mach-like message passing system to communicate information like DomU configuration. This requires kernel threads to be operational early on in the startup of the DomU kernel, much earlier than what FreeBSD allows. My attempts so far to allow xenbus to synchronously retrieve its device configuration instead of relying on asynchronous messages have been unsuccessful.

Scott
Re: VMWARE GSX Port?
Ashok Shrestha wrote:

VMWARE GSX was released recently for free. [http://www.vmware.com/news/releases/server_beta.html] Is anyone working on a port for this?

I've started on it, but I haven't made much progress yet.

Scott
Re: urgent, need to recover superblock!
Dave wrote:

Hello,
Some urgency on this issue! I've got a 10 GB IDE drive that has critical data on one of its partitions, /dev/ad1e. This drive was originally gmirrored in another box, where it worked fine; it was the master drive. Now I've installed this drive as a slave in another 6.0 box, and it shows up as ad1 with the partition I want being ad1e. I did a mount and it worked fine, so I knew the drive was working. I then unmounted the partition and tried to dump it to another drive. This didn't work; dump got an error about an incorrect superblock. I then did a mount -o ro /dev/ad1e /mnt and I'm getting an "Incorrect superblock" error from mount. I then tried fsck /dev/ad1e and got the same error message. These partitions were formatted with UFS2 as their filesystem. I then ran bsdlabel ad1 and got a printout of my label, which gives me hope that this data can be retrieved. An error I'm getting from bsdlabel says that the c: partition does not cover the entire disk and that this may result in utilities not working correctly. Any help appreciated. Some urgency!

Dave.

Sounds like you need to install ports/sysutils/ffsrecov and spend some quality time with it tonight.

Scott
FreeBSD 6.1-BETA2/FreeBSD 5.5-BETA2 Available
Announcement The FreeBSD Release Engineering Team is pleased to announce the availability of FreeBSD 6.1-BETA2 and FreeBSD 5.5-BETA2. Both FreeBSD 6.1 and FreeBSD 5.5 are meant to be a refinement of their respective branches with few dramatic changes. A lot of bugfixes have been made, some drivers have been updated, and some areas have been tweaked for better performance, etc. but no large changes have been made to the basic architecture. The FreeBSD 5.5 Release is being done for people who are unable to make the jump to FreeBSD 6.X at this time. We do encourage people to make that transition as soon as possible, though. There have been some updates made between FreeBSD 5.4 and FreeBSD 5.5 but not all of the bugfixes done to RELENG_6 have been backported to RELENG_5. This will almost certainly be the last 5.X release. We encourage people to help with testing so any final bugs can be identified and worked out. Availability of ISO images is given below. If you have an older system you want to update using the normal CVS/cvsup source based upgrade the branch tag to use is RELENG_6 for 6.1 and RELENG_5 for 5.5, though that will change later in the release cycle when we start doing the Release Candidates. Problem reports can be submitted using the send-pr(1) command. The list of open issues and things still being worked on are on the todo list: http://www.freebsd.org/releases/6.1R/todo.html http://www.freebsd.org/releases/5.5R/todo.html Known Issues The DHCP problem that affected the 6.1-BETA1 installer has been fixed. Several critical fixes were made to the ATA subsystem after the BETA2 builds started, so please check for newer updates before reporting problems. The Intel 2200/2915 Wireless is known to have a number of stability problems that are being worked on right now. Work is also in progress to make hot-plug of USB memory devices more reliable. Updated packages for all architectures might not be available yet. 
Availability The BETA2 ISOs and FTP support are available on most of the FreeBSD Mirror sites. A list of the mirror sites is available here: http://www.freebsd.org/doc/en_US.ISO8859-1/books/handbook/mirrors-ftp.html The MD5s are: MD5 (5.5-BETA2-amd64-bootonly.iso) = a2333f7f2184e2899c7317202843fab6 MD5 (5.5-BETA2-amd64-disc1.iso) = 1c5efbe2276c94890a6e8efb152f55d5 MD5 (5.5-BETA2-amd64-disc2.iso) = bd1e55d73401d6160d2bc1eaef90348c MD5 (5.5-BETA2-i386-bootonly.iso) = 26fba5024002189a3d0bc8be58a62bc8 MD5 (5.5-BETA2-i386-disc1.iso) = 5fc65d29cd7139dadd20e207b16c81f1 MD5 (5.5-BETA2-i386-disc2.iso) = 24ca73eb276887fa5c90fa670e3b5e64 MD5 (5.5-BETA2-sparc64-bootonly.iso) = 8835b89782db79fe042b3d365c9e63ea MD5 (5.5-BETA2-sparc64-disc1.iso) = 58712a0f7e30e19bfdd52ca5961050db MD5 (5.5-BETA2-sparc64-disc2.iso) = 1e350504487f0f51fbe590eb830ad3f4 MD5 (5.5-BETA2-alpha-bootonly.iso) = 8ec56c534a22ee5a57af88b01fe3bf76 MD5 (5.5-BETA2-alpha-disc1.iso) = 9abd0f971d2928c95dfff8e0808bbc2c MD5 (5.5-BETA2-pc98-disc1.iso) = 92d7862ed1b77989fe1545346b9c2cef MD5 (6.1-BETA2-amd64-bootonly.iso) = c9d59cadf063e3a67626c1e5477b27bb MD5 (6.1-BETA2-amd64-disc1.iso) = e471a79dbcfbea829c8ab6caa9683e61 MD5 (6.1-BETA2-amd64-disc2.iso) = 29367aa8f31a4af3e49d110574f25319 MD5 (6.1-BETA2-i386-bootonly.iso) = cd77279cdf230e4b37c2d579d2a01cc1 MD5 (6.1-BETA2-i386-disc1.iso) = 3267d7794079d3b803f5f5f004cf04f1 MD5 (6.1-BETA2-i386-disc2.iso) = f47ad00c0240320661a78320befdeaa1 MD5 (6.1-BETA2-sparc64-bootonly.iso) = fc070774c0516391f7176518a0b0fabc MD5 (6.1-BETA2-sparc64-disc1.iso) = 540aa7abd4f51ea93138086f886e11de MD5 (6.1-BETA2-sparc64-disc2.iso) = 0d29f1554eb53a8748b9ec280868c460 MD5 (6.1-BETA2-alpha-bootonly.iso) = 923056c4b9c249f5393530ca62618aa6 MD5 (6.1-BETA2-alpha-disc1.iso) = fab96d1ee15340e5e8c639066f313751 MD5 (6.1-BETA2-pc98-disc1.iso) = 95be33fe72af53e9d6140b85385dc8c8 MD5 (6.1-BETA2-ia64-bootonly.iso) = 8eceddde4826e30480933cd05d306084 MD5 (6.1-BETA2-ia64-disc1.iso) = 
b034fce6098c6c203db379511174c729 MD5 (6.1-BETA2-ia64-livefs.iso) = 2592fa45458555862063b49e2946c16d The SHA256s are: SHA256 (5.5-BETA2-amd64-bootonly.iso) = 2616b50051eb8213877d4ba506b6f72f218870dffd40d1000b6d022d398d7f09 SHA256 (5.5-BETA2-amd64-disc1.iso) = 603a590632679cf3151ab183da4b53dde241e4990b335f396902cc5c5c5ff531 SHA256 (5.5-BETA2-amd64-disc2.iso) = 0834e6d5e024db45793ff92b4e436d5c466213bd05b6c1091380d58822c5fb0b SHA256 (5.5-BETA2-i386-bootonly.iso) = b5709ee350faf8010db5eac0db0639945045212bc3e3ac96ed42b3433d698b32 SHA256 (5.5-BETA2-i386-disc1.iso) = 012356396548d81840dbe004c5b125b1d5725e239728c920e41b786a539c67b9 SHA256 (5.5-BETA2-i386-disc2.iso) = 567d8321f609f99da8864a48b862a647ebc65ab9bb28b8b3f79a8d86cf108d9d SHA256 (5.5-BETA2-sparc64-bootonly.iso) = f713b1ba0eff3596b57ee22d629e4940e2a61eddf5bb207aa8597158be269851 SHA256 (5.5-BETA2-sparc64-disc1.iso) = 50375e00fb40d3e3d56dd6b2f3321916a5324f349ddb1c8e4ec904fd13eaa4e8 SHA256 (5.5-BETA2-sparc64-disc2.iso) =
Re: Panic Kernel Dump to umass device?
Nate Nielsen wrote: I'm developing for small embedded systems, and I'm looking into the possibility of dumping a kernel core dump to a USB memory stick (umass driver). It currently doesn't work (see below), but I'm interested in fixing it. Yes, I know it'll be slow. It's probably also a non-tested (and non-reliable) code path for a kernel dump. But leaving those issues aside... First I wanted to ask if anyone else has tried this. Is it an insane idea, impossible? I'm not very familiar with the CAM/SCSI/USB sub-systems so perhaps someone more knowledgeable than I can set me straight. Currently when doing a dump to a USB device, I get the following. This with 6.0-RELEASE. Dump device is /dev/da0s1.

Fatal trap 12: page fault while in kernel mode
fault virtual address = 0x0
fault code = supervisor write, page not present
instruction pointer = 0x20:0xc0cea412
stack pointer = 0x28:0xc6cf5c1c
frame pointer = 0x28:0xc6cf5c24
code segment = base 0x0, limit 0xf, type 0x1b
= DPL 0, pres 1, def32 1, gran 1
processor eflags = interrupt enabled, resume, IOPL = 0
current process = 473 (kldload)
trap number = 12
panic: page fault
Uptime: 3m48s
Dumping 95 MB (2 chunks)
Aborting dump due to I/O error. status == 0xb, scsi status == 0x0
** DUMP FAILED (ERROR 5) **
Automatic reboot in 5 seconds - press a key on the console to abort

It waits for about a minute after 'Dumping 95 MB (2 chunks)'. The light on the USB stick goes on and remains stuck in the on state. The status 0xb seems to be CAM_CMD_TIMEOUT. ERROR 5 is EIO. As far as I know, kernel dumps are always done without interrupts and the driver runs with polling. It's likely that the umass driver and/or USB subsystem doesn't like this. Cheers, Nate

You're correct that dumping is meant to be done with interrupts and task switching disabled. The first thing that the umass driver is missing is a working CAM poll handler. Without this, there is no way for command completions to be seen when interrupts are disabled. Beyond that, I somewhat suspect that the USB stack expects to be able to push command completion work off to worker threads, at least for some situations, and that also will not work in the kernel dump environment. So, there is a lot of work needed to make this happen. Scott
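To make the polling requirement concrete, here is a toy user-space model in C. It is not the CAM API and not umass code: `toy_dev`, `toy_hw_step`, `toy_complete`, and `toy_poll_for_completion` are hypothetical names invented for this sketch. It only shows the shape of the problem: with interrupts off, nothing calls the completion handler for you, so the dump path has to call a poll routine in a bounded loop and give up with a timeout if the hardware never finishes.

```c
#include <stdbool.h>

/* Hypothetical toy device: the "hardware" finishes a command after some ticks. */
struct toy_dev {
	int  ticks_until_done;	/* simulated hardware latency */
	bool hw_done;		/* status bit the hardware sets */
	bool cmd_completed;	/* set only by the completion handler */
};

/* What an interrupt handler would normally do once the status bit is set. */
static void
toy_complete(struct toy_dev *d)
{
	if (d->hw_done)
		d->cmd_completed = true;
}

/* Simulated passage of time on the device. */
static void
toy_hw_step(struct toy_dev *d)
{
	if (d->ticks_until_done > 0 && --d->ticks_until_done == 0)
		d->hw_done = true;
}

/*
 * Poll entry point: with interrupts disabled, the caller must invoke the
 * completion handler itself. Returns 0 on completion, -1 on timeout.
 */
static int
toy_poll_for_completion(struct toy_dev *d, int max_polls)
{
	for (int i = 0; i < max_polls; i++) {
		toy_hw_step(d);
		toy_complete(d);
		if (d->cmd_completed)
			return (0);
	}
	return (-1);
}
```

The timeout branch corresponds loosely to what Nate observed: if nothing ever runs the completion path, the command is eventually reported as timed out.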
BETA1 announcement
Announcement The FreeBSD Release Engineering Team is pleased to announce the beginning of both the FreeBSD 6.1 and FreeBSD 5.5 release cycles with the availability of FreeBSD 6.1-BETA1 and FreeBSD 5.5-BETA1 Both FreeBSD 6.1 and FreeBSD 5.5 are meant to be a refinement of their respective branches with few dramatic changes. A lot of bugfixes have been made, some drivers have been updated, and some areas have been tweaked for better performance, etc. but no large changes have been made to the basic architecture. The FreeBSD 5.5 Release is being done for people who are unable to make the jump to FreeBSD 6.X at this time. We do encourage people to make that transition as soon as possible, though. There have been some updates made between FreeBSD 5.4 and FreeBSD 5.5 but not all of the bugfixes done to RELENG_6 have been backported to RELENG_5. This will almost certainly be the last 5.X release. We encourage people to help with testing so any final bugs can be identified and worked out. Availability of ISO images is given below. If you have an older system you want to update using the normal CVS/cvsup source based upgrade the branch tag to use is RELENG_6 for 6.1 and RELENG_5 for 5.5, though that will change later in the release cycle when we start doing the Release Candidates. Problem reports can be submitted using the send-pr(1) command. The list of open issues and things still being worked on are on the todo list: http://www.freebsd.org/releases/6.1R/todo.html http://www.freebsd.org/releases/5.5R/todo.html Known Issues Other than the list of open issues in the todo lists BETA1 has a few other known issues. There is a problem with using DHCP during system installation. A fix for this is already being worked on. And as usual at this stage of a release the availability of pre-built packages on the ISO images varies widely from architecture to architecture. The list of packages that will be available as part of the release itself will certainly be different. 
Availability The BETA1 ISOs and FTP support are available on most of the FreeBSD Mirror sites. A list of the mirror sites is available here: http://www.freebsd.org/doc/en_US.ISO8859-1/books/handbook/mirrors-ftp.html The MD5s are: MD5 (5.5-BETA1-alpha-bootonly.iso) = af05452acda5868b5515bc868038ffba MD5 (5.5-BETA1-alpha-disc1.iso) = 4660ef47a3d1ffc49611484c7fab4cd0 MD5 (5.5-BETA1-amd64-bootonly.iso) = c0f161a4711ca422832907692e47f54c MD5 (5.5-BETA1-amd64-disc1.iso) = 4e64fe4c4cd0dec41ee234f84f8c4946 MD5 (5.5-BETA1-amd64-disc2.iso) = ba9898176a7afbfc2d0162e38ec8d205 MD5 (5.5-BETA1-i386-bootonly.iso) = 5a5214b758db033529897884350b8a19 MD5 (5.5-BETA1-i386-disc1.iso) = 10c489414716782d9d8ce942dd4f7de8 MD5 (5.5-BETA1-i386-disc2.iso) = f7bcae220c1cfc8cff67e1aaa5a3bb69 MD5 (5.5-BETA1-pc98-disc1.iso) = d581fa4725e9b2daf939f45619b63c93 MD5 (5.5-BETA1-sparc64-bootonly.iso) = 34a90df46d5b1a6c7fe2cd263138f1ec MD5 (5.5-BETA1-sparc64-disc1.iso) = 3261ec3570b7b1a1f8089577564f5693 MD5 (5.5-BETA1-sparc64-disc2.iso) = 614541ce58508efe94b3ca2f9fc6159c MD5 (6.1-BETA1-alpha-bootonly.iso) = e63dc0fcbc4222e82c3bb7e040f791cb MD5 (6.1-BETA1-alpha-disc1.iso) = c536ed08fedbd45141f48919803042dc MD5 (6.1-BETA1-amd64-bootonly.iso) = 84a72b2a6d86fd29f7e35da9092300ec MD5 (6.1-BETA1-amd64-disc1.iso) = 20f36ee823ed4313f89289d5357fc365 MD5 (6.1-BETA1-amd64-disc2.iso) = 2bbf97c74d7df701037634a7b0c971cb MD5 (6.1-BETA1-i386-bootonly.iso) = 037b99d2dddb93f75f3a3445908103a4 MD5 (6.1-BETA1-i386-disc1.iso) = 3cc6e3e66abce6420c316e04631ddb19 MD5 (6.1-BETA1-i386-disc2.iso) = c822d0a62f3e402f21088ca8abefce3e MD5 (6.1-BETA1-ia64-bootonly.iso) = 1fb12b97f70980ab12c2d31526c128ec MD5 (6.1-BETA1-ia64-disc1.iso) = d5e34526c056caf543d300412cf07648 MD5 (6.1-BETA1-ia64-livefs.iso) = c29cb7b7b0724d70b6e07e724bf44b62 MD5 (6.1-BETA1-pc98-disc1.iso) = 6894340e1b7dac32de263974a62e9beb MD5 (6.1-BETA1-sparc64-bootonly.iso) = af91246b0b42bf16e65d38a5a68b6726 MD5 (6.1-BETA1-sparc64-disc1.iso) = 668c98638c2b830ca3276cdfe18815a5 
MD5 (6.1-BETA1-sparc64-disc2.iso) = f01049d7a1011db04101b97c954da363 The SHA256s are: SHA256 (5.5-BETA1-amd64-bootonly.iso) = 2553bf02cf3acf38f5ca6ccb2884a1d83b1046e039686fbfca8d8b799173fb7c SHA256 (5.5-BETA1-amd64-disc1.iso) = ee9ee3c11651d6c1991a168e7c6731d129209c072bf93bd6e453af1da255e1fd SHA256 (5.5-BETA1-amd64-disc2.iso) = 9e871ea6dd7ee3e4ef4bdcb7f3eb7efcda7a1f96b8397f313010c03d781b4c93 SHA256 (5.5-BETA1-i386-bootonly.iso) = 335428bcc6e391578c354a042ab125866968ebe3c6031011fe0062558d56328c SHA256 (5.5-BETA1-i386-disc1.iso) = a33138b4bf7b224e24c20d6578954897d8095ce316cb72cd117a2c449c26a55b SHA256 (5.5-BETA1-i386-disc2.iso) = ec68782fa3e82c74582de00b62e14f395e3a45ec0c10e95fb5673a8354e3f4f5 SHA256 (5.5-BETA1-pc98-disc1.iso) = a884fd69a0e49752c0a963c8b0d2cd84c5ffa8d2b0cae57e18fa59136722214c SHA256 (5.5-BETA1-sparc64-bootonly.iso) = 8a073b3704e038f2217b559e2e6f68ed33a5f0f45a4546ccdccb4e0b71f1b79f SHA256
Re: Weird PCI interrupt delivery problem (resolution, sort of)
John Baldwin wrote: On Tuesday 24 January 2006 19:34, Craig Boston wrote: On Tue, Jan 24, 2006 at 10:43:49AM -0500, John Baldwin wrote: What if you do a read of the lapic before the write? Maybe doing 'x = lapic->eoi; lapic->eoi = 0;'? Reading the lapic before the write has no effect. Reading the lapic after the write makes it work. Hmm, perhaps the read forces the write to post? Scott? Either that, or the read imposes enough delay to let whatever was happening during the DELAY call work. I find it hard to believe that uncached writes would get delayed like this. I've lost the original posting on this, could you provide the dmesg and computer make/model again? Scott
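The "read forces the write to post" theory can be written down as a small C sketch. This is a user-space illustration only: `struct fake_lapic` and `eoi_with_readback` are made-up names, the structure merely stands in for the memory-mapped local APIC registers, and reading back the same field is a simplification (real code might read a different register). Nothing here shows that a read-back is the right fix for this hardware.

```c
#include <stdint.h>

/* Stand-in for the memory-mapped local APIC register block (hypothetical). */
struct fake_lapic {
	volatile uint32_t eoi;
};

/*
 * Write the EOI register, then read from the device again. On buses with
 * posted writes, a following read is the usual way to force earlier writes
 * out; whether that applies to the local APIC is the open question in the
 * thread. The volatile qualifier keeps the compiler from dropping either
 * access.
 */
static inline uint32_t
eoi_with_readback(struct fake_lapic *lapic)
{
	lapic->eoi = 0;
	return (lapic->eoi);
}
```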
Re: Difference between a kthread and an ordinary process.
Pranav Peshwe wrote: Hello, When a kthread is created using the kthread_create(9) function, I found out that a new instance of struct proc is created and allocated for the thread just as in the case of creating a new process. Also, the thread is assigned a pid as in the case of a process. What is the difference between a kernel thread and a normal process created using fork, except for the address space sharing with swapper and the kernel mode execution of the kthread? Is a kthread effectively just a process always running in kernel mode? That is exactly what a kthread is. There is some work in progress to make them true threads within one or more processes. Scott
Re: Weird PCI interrupt delivery problem (resolution, sort of)
Craig Boston wrote: After trying everything I could think of to do to the I/O APIC code and coming up empty, tonight I went back to the local APIC. I had previously ruled it out since the lapic timer interrupt continued to work fine even when the others stopped. However, adding some DELAY(1) calls at key points caused it to work, much like adding WITNESS does. I managed to get it down to a single change that makes APIC mode work on this laptop:

--- local_apic.c.orig	Thu Jan 19 18:32:37 2006
+++ local_apic.c	Thu Jan 19 18:32:28 2006
@@ -599,4 +599,5 @@
 lapic_eoi(void)
 {
 	lapic->eoi = 0;
+	lapic->eoi = 0;
 }

...and welcome to bizarro world. There's absolutely no reason I can think of why that would change anything, other than buggy hardware. I looked at what Linux was doing, and they're also using a single write to EOI interrupts, so long as the X86_GOOD_APIC config option is enabled (and it is for P5/MMX or newer). Otherwise it does an extra read before writing to any APIC register. I don't know if Linux works on this hardware or not -- the live CD I tried wasn't compiled for APIC support. At this point, since AFAIK nobody else has reported the same problem, I'm content with a local workaround. It's just... weird. Craig

This points to a bus coherency problem. I wonder if your BIOS is incorrectly setting the memory region of the APICs as cacheable. You'll want to bug Baldwin about this. Scott
Re: An idea of remove MUTEX_WAKE_ALL
Daniel Eischen wrote: On Tue, 3 Jan 2006, John Baldwin wrote: On Sunday 01 January 2006 02:21 am, prime wrote: Hi hackers, I have an idea about removing the kernel option MUTEX_WAKE_ALL. When we unlock the mutex (in _mtx_unlock_sleep), we can directly give the lock to the first thread waiting on the turnstile. A thread then gets the mutex after it returns from turnstile_wait, so it can simply jump out of the _obtain_lock loop in _mtx_lock_sleep. This makes a mutex always owned by a thread when there are threads waiting on the turnstile, so priority inheritance can work. This idea needs only a few changes in kern/kern_mutex.c. But when NO_ADAPTIVE_MUTEXES is not set, it makes threads that are spinning on other CPUs to get the mutex spin for a long time, and this makes short-term mutexes more expensive (maybe they should use spin mutexes instead). What do you think about the idea? Thanks. Sun actually found that the performance was better when you did MUTEX_WAKE_ALL because once you woke up N threads, if they don't all resume at once then they will acquire the lock in sequence and the lock acquires and releases will all be simple ones rather than all being the complicated contested case. There are more details in _Solaris Internals_. Yes, but doesn't this partly rely on having the threads spin(*) for a bit if the current lock owner is running on another CPU? Do we currently do that? (*) No, I am not referring to spin mutexes. Adaptive mutexes are enabled by default and have been for at least a year. Scott
Re: An idea of remove MUTEX_WAKE_ALL
Daniel Eischen wrote: On Tue, 3 Jan 2006, Scott Long wrote: for a bit if the current lock owner is running on another CPU? Do we currently do that? (*) No, I am not referring to spin mutexes. Adaptive mutexes are enabled by default and have been for at least a year. Ahh, then that's what they (Adaptive) do. Well, it's a bit different from Solaris, I believe. They do not sleep after a certain number of contested spins, and instead just continue to spin. As we reduce the coverage of large contested locks (like Giant) this becomes much less of a performance problem, though. Scott
Re: My wish list for 6.1
Xin LI wrote: Hi, Scott, On 12/16/05, Scott Long [EMAIL PROTECTED] wrote: Guys, With code freeze for 6.1 about 6 weeks away, I'd like to put out my 'wish list' for it: More-or-less OT question: Shall we switch ULE as the default scheduler on -HEAD to encourage more testing against it? Cheers, Only if there is someone committed to tracking and fixing bugs. Last time we tried this, we wasted a lot of time and energy. Scott
My wish list for 6.1
Guys, With code freeze for 6.1 about 6 weeks away, I'd like to put out my 'wish list' for it:

1. Working kbdmux. We need this for the growing number of systems that assume that USB is the primary keyboard. Current status appears to be that the kbdmux driver breaks very easily. We need this working well enough where it can be enabled by default, and all attached keyboards Just Work.

2. SMP kernels for install. Right now we only install a UP kernel, for performance reasons. We should be able to package both a UP and SMP kernel into the release bits, and have sysinstall install both. It should also select the correct one for the target system and make that the default on boot. The easiest way to do this would be to have sysinstall boot an SMP kernel and then look at the hw.ncpu sysctl. The only problem is being able to have sysinstall fall back to booting a UP kernel for itself if the SMP one fails. This can probably be 'faked' by setting one of the SMP-disabling variables in the loader. But in any case, the point is to make the process Just Work for the user, without the user needing to know arcane loader/sysctl knobs. SMP laptops are right around the corner, and we should be ready to support SMP out-of-the-box.

3. Full review and update of the install docs, handbook, FAQ, etc. There are sections that are embarrassingly out of date (one section of the handbook apparently states that we only support a single brand of wifi cards). A co-worker of mine tried to install 6.0 using just the handbook install guide, and discovered that it really doesn't match reality anymore, in both big and small ways. Contact me directly if you would like his list of comments.

Thanks! Scott
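The selection logic in item 2 could look roughly like the following C sketch. `hw.ncpu` and sysctlbyname(3) are real on FreeBSD; the kernel names "SMP" and "GENERIC" returned by the hypothetical `pick_default_kernel` are placeholders, not what sysinstall uses, and the sysconf() branch exists only so the sketch also builds on non-FreeBSD systems.

```c
#include <stddef.h>
#include <unistd.h>
#ifdef __FreeBSD__
#include <sys/types.h>
#include <sys/sysctl.h>
#endif

/* Return the number of CPUs the running kernel sees, or 1 if unknown. */
static int
detected_ncpu(void)
{
#ifdef __FreeBSD__
	int ncpu = 0;
	size_t len = sizeof(ncpu);

	if (sysctlbyname("hw.ncpu", &ncpu, &len, NULL, 0) == 0 && ncpu > 0)
		return (ncpu);
	return (1);
#else
	long n = sysconf(_SC_NPROCESSORS_ONLN);	/* non-FreeBSD stand-in */

	return (n > 0 ? (int)n : 1);
#endif
}

/* Pick which installed kernel should be the boot default (names assumed). */
static const char *
pick_default_kernel(int ncpu)
{
	return (ncpu > 1 ? "SMP" : "GENERIC");
}
```

An installer would call `pick_default_kernel(detected_ncpu())` after booting the SMP kernel; the fallback to a UP kernel when the SMP one fails to boot is a loader-level problem that this sketch does not address.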
Re: scsi-target and the buffer cache
Eric Anderson wrote: Nate Lawson wrote: Eric Anderson wrote: I'm curious about whether a target mode device would use the buffer cache or not. Here's a scenario: Host A: has fibre channel host adapter, in target mode, large memory pool, and another fiber channel host adapter connecting to fibre channel block device. Host B: Fibre channel host adapter, connecting to Host A. 'sees' the target mode block device created by Host A. Will Host A use the buffer cache to cache blocks between the real block device, and the shared target mode device? What about if Host A put a filesystem on the block device, created a single file the size of the filesystem, and shared that filesystem via a target mode device to Host B? What I'm wanting is a box (FreeBSD?) that can be placed between a fibre channel block device (like a RAID array), and a fibre channel host using that block device, and act as a block cache for that device, using the FreeBSD's memory. If it had a significant amount of memory, this could be very useful. If you use the example scsi_target usermode (usr/share/examples/scsi_target), then the buffer cache will be used since its reads/writes are from usermode like normal. If you don't want that behavior, you can set O_DIRECT in the open() call of the backing store file. If you chose to modify the kernel side, you'd have to make sure your accesses were through the VOP layer and then it would be cached. You should check to be sure the target mode performance meets your expectations also. I guess I would be using the user mode tool, unless there's another way? Your comment on performance also makes me a little worried about that now - do you think I would see a large performance hit? Thanks! Eric The way the target mode stack works in FreeBSD is that the kernel provides some of the basic services, but the actual target emulator is meant to live in userland. The userland program responds to events from the kernel via the select interface. This generally works pretty well. 
However, it does mean that control has to cross the kernel-userland boundary at least once for every event. What I'd suggest doing is prototyping your target emulator in userland and evaluating the performance there, and then moving it to the kernel if you _really_ need more performance. Scott
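Nate's O_DIRECT suggestion comes down to one flag in the emulator's open(2) call. A minimal C sketch, assuming hypothetical helpers `backing_store_flags` and `open_backing_store` (this is not code from the scsi_target example); where the platform headers do not expose O_DIRECT, the sketch quietly falls back to ordinary cached I/O.

```c
#include <fcntl.h>

#ifndef O_DIRECT
#define O_DIRECT 0	/* not exposed here: fall back to cached I/O */
#endif

/* Compute open(2) flags for the backing store. */
static int
backing_store_flags(int bypass_cache)
{
	int flags = O_RDWR;

	if (bypass_cache)
		flags |= O_DIRECT;	/* skip the buffer cache where supported */
	return (flags);
}

/* Open the backing file or device; returns an fd, or -1 with errno set. */
static int
open_backing_store(const char *path, int bypass_cache)
{
	return (open(path, backing_store_flags(bypass_cache)));
}
```

For the block-cache box Eric describes, the cached variant (`bypass_cache` set to 0) is presumably the one he wants, since the point is to let host memory absorb reads.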
Re: scsi-target and the buffer cache
Eric Anderson wrote: Nate Lawson wrote: Scott Long wrote: Eric Anderson wrote: Nate Lawson wrote: Eric Anderson wrote: I'm curious about whether a target mode device would use the buffer cache or not. Here's a scenario: Host A: has fibre channel host adapter, in target mode, large memory pool, and another fiber channel host adapter connecting to fibre channel block device. Host B: Fibre channel host adapter, connecting to Host A. 'sees' the target mode block device created by Host A. Will Host A use the buffer cache to cache blocks between the real block device, and the shared target mode device? What about if Host A put a filesystem on the block device, created a single file the size of the filesystem, and shared that filesystem via a target mode device to Host B? What I'm wanting is a box (FreeBSD?) that can be placed between a fibre channel block device (like a RAID array), and a fibre channel host using that block device, and act as a block cache for that device, using the FreeBSD's memory. If it had a significant amount of memory, this could be very useful. If you use the example scsi_target usermode (usr/share/examples/scsi_target), then the buffer cache will be used since its reads/writes are from usermode like normal. If you don't want that behavior, you can set O_DIRECT in the open() call of the backing store file. If you chose to modify the kernel side, you'd have to make sure your accesses were through the VOP layer and then it would be cached. You should check to be sure the target mode performance meets your expectations also. I guess I would be using the user mode tool, unless there's another way? Your comment on performance also makes me a little worried about that now - do you think I would see a large performance hit? Thanks! Eric The way the target mode stack works in FreeBSD is that the kernel provides some of the basic services, but the actual target emulator is meant to live in userland. 
The userland program responds to events from the kernel via the select interface. This generally works pretty well. However, it does mean that control has to cross the kernel-userland boundary at least once for every event. What I'd suggest doing is prototyping your target emulator in userland and evaluating the performance there, and then moving it to the kernel if you _really_ need more performance. Agree 100%. While having it in usermode means there are boundary crossings that increase per-transaction latency, the actual bulk data transfer is via zero-copy IO and you should be able to exceed the data transfer rates of several 10K RPM drives on decent hardware. Ok, great.. Now, will scsi_target work ok with raw devices, or only files? (although I'm not sure there's all that much difference really). Thanks!! Eric You can write your userland code to use whatever files or devices you want. Are you talking about the scsi_target.c code in /usr/share/examples? That's just a skeletal example that you can use as a starting point for your own work. Scott
Re: sym(4) broken on amd64 (Time to port new driver?)
Sergey N. Voronkov wrote: Looks like it has been broken for a while - _sym_calloc2: failed to allocate HCB is always there... And... It looks like Gerard Roudier hasn't any more interest in maintaining this driver - there has been a second generation of the original driver in the Linux source tree since 2001, which was never ported to FreeBSD by the author. And the FreeBSD hooks were removed from the driver code... Maybe it is time to port siop(4) from NetBSD? No. I'm working on fixing this right now. Scott
Re: Sharing the same VM address space between Kernel and UserSpace
John Giacomoni wrote: I am in need of a way to share memory between kernel space and possibly multiple different user-space processes for an extended period of time. This memory would need to be a single unpageable region. I am using the vm routines as cribbed from mmap, however I'd like the address spaces to be viewed as the same regardless of which process I'm in to avoid swizzling pointers as I'm storing data structures in the shared memory region. I imagine I'd need to find a way to expose part of the kernel address space to user space to accomplish this. Is there a way to do this? thanks John G If you get this working then it'll be very useful for the syspage support that was talked about recently. The kernel can access addresses in the user space so long as they are wired and won't cause a fault. Thus I imagine that you only need to allocate the memory, wire it, mark it with the appropriate page permissions, and reserve a user address range for it in the process map. I'd look at the process exec path in the kernel for places to hook in. The only other trick then is how to let the user process know the address for this magic region. An easy way would be to store it in a sysctl that can be read at runtime. A harder way would be to have the kernel dummy up an ELF segment in the image activator code that the dynamic linker could read and put into a global variable for the program to access. Scott
Re: twe and giant
Charles Sprickman wrote: Hello all, I was just wondering about this... I recently bumped a soon-to-be production box to 6.0 as it seems like upgrading now is easier than doing it the week after the box goes into production (it's amazing how the release engineering team knows to schedule this way...:). One thing I noticed in reviewing the boot messages is that twe is still under the giant lock:

twe0: 3ware Storage Controller. Driver version 1.50.01.002 port 0x2860-0x286f mem 0xf400-0xf47f irq 17 at device 13.0 on pci0
twe0: [GIANT-LOCKED]
twe0: 4 ports, Firmware FE8S 1.05.00.068, BIOS BE7X 1.08.00.048

I bring this up because I could have sworn that I read here or elsewhere that this driver was revamped. I also could have sworn that at some point in 5.x it was not under giant. Maybe I'm imagining things... Anyhow, does anyone know the status of this, and also is there a central repository that tracks changes like this that I can watch? Thanks, Charles

I have some old patches that lock twe. They aren't quite complete or right due to an edge case with DMA handling. I'll probably dust them off and finish them soon. Scott
FreeBSD 6.0 Released
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

It is my great pleasure and privilege to announce the availability of FreeBSD 6.0-RELEASE. This release is the next step in delivering the high performance and enterprise features that have been under development in the FreeBSD 5.x series for the last several years. Some of the many changes since 5.4 include:

- Significant performance improvements to the filesystem and direct disk access layers of the OS. The filesystem is now multithreaded and can take full advantage of multiple CPU systems.
- Expanded support for wireless networking adapters and new support for the WPA wireless security protocol.
- Experimental support for the PowerPC platform.

For a complete list of new features and known problems, please see the release notes and errata list, available at:
http://www.FreeBSD.org/releases/6.0R/relnotes.html
http://www.FreeBSD.org/releases/6.0R/errata.html

For more information about FreeBSD release engineering activities, please see: http://www.FreeBSD.org/releng

Availability

FreeBSD 6.0-RELEASE supports the i386, pc98, alpha, sparc64, amd64, powerpc, and ia64 architectures and can be installed directly over the net using bootable media or copied to a local NFS/FTP server. Distributions for all architectures are available now. Please continue to support the FreeBSD Project by purchasing media from one of our supporting vendors. The following companies will be offering FreeBSD 6.0 based products:

- FreeBSD Mall, Inc. http://www.freebsdmall.com/
- Daemonnews, Inc. http://www.bsdmall.com/freebsd1.html

If you can't afford FreeBSD on media, are impatient, or just want to use it for evangelism purposes, then by all means download the ISO images. We can't promise that all the mirror sites will carry the larger ISO images, but they will at least be available from the following sites. MD5 and SHA256 checksums for the release images are included at the bottom of this message.
Bittorrent -- The FreeBSD project encourages the use of BitTorrent for distributing the release ISO images. A collection of torrent files to download the images is available at ftp://ftp.freebsd.org/pub/FreeBSD/torrents/6.0-RELEASE FTP --- At the time of this announcement the following FTP sites have FreeBSD 6.0-RELEASE available. ftp://ftp.FreeBSD.org/pub/FreeBSD/ ftp://ftp2.FreeBSD.org/pub/FreeBSD/ ftp://ftp3.FreeBSD.org/pub/FreeBSD/ ftp://ftp5.FreeBSD.org/pub/FreeBSD/ ftp://ftp.at.FreeBSD.org/pub/FreeBSD/ ftp://ftp2.ch.FreeBSD.org/pub/FreeBSD/ ftp://ftp.cz.FreeBSD.org/pub/FreeBSD/ ftp://ftp.ee.FreeBSD.org/pub/FreeBSD/ ftp://ftp.es.FreeBSD.org/pub/FreeBSD/ ftp://ftp.fi.FreeBSD.org/pub/FreeBSD/ ftp://ftp.fr.FreeBSD.org/pub/FreeBSD/ ftp://ftp2.ie.FreeBSD.org/pub/FreeBSD/ ftp://ftp.is.FreeBSD.org/pub/FreeBSD/ ftp://ftp5.pl.FreeBSD.org/pub/FreeBSD/ ftp://ftp3.ru.FreeBSD.org/pub/FreeBSD/ ftp://ftp.se.FreeBSD.org/pub/FreeBSD/ ftp://ftp.si.FreeBSD.org/pub/FreeBSD/ ftp://ftp2.tw.FreeBSD.org/pub/FreeBSD/ ftp://ftp.uk.FreeBSD.org/pub/FreeBSD/ ftp://ftp2.us.FreeBSD.org/pub/FreeBSD/ ftp://ftp5.us.FreeBSD.org/pub/FreeBSD/ FreeBSD is also available via anonymous FTP from mirror sites in the following countries: Argentina, Australia, Brazil, Bulgaria, Canada, China, Czech Republic, Denmark, Estonia, Finland, France, Germany, Hong Kong, Hungary, Iceland, Ireland, Japan, Korea, Lithuania, Amylonia, the Netherlands, New Zealand, Poland, Portugal, Romania, Russia, Saudi Arabia, South Africa, Slovak Republic, Slovenia, Spain, Sweden, Taiwan, Thailand, Ukraine, and the United Kingdom. Before trying the central FTP site, please check your regional mirror(s) first by going to: ftp://ftp.yourdomain.FreeBSD.org/pub/FreeBSD Any additional mirror sites will be labeled ftp2, ftp3 and so on. 
More information about FreeBSD mirror sites can be found at: http://www.FreeBSD.org/doc/en_US.ISO8859-1/books/handbook/mirrors-ftp.html For instructions on installing FreeBSD, please see Chapter 2 of The FreeBSD Handbook. It provides a complete installation walk-through for users new to FreeBSD, and can be found online at: http://www.FreeBSD.org/doc/en_US.ISO8859-1/books/handbook/install.html Acknowledgments Many companies donated equipment, network access, or man-hours to finance the release engineering activities for FreeBSD 6.0 including The FreeBSD Foundation, FreeBSD Systems, Hewlett-Packard, Yahoo!, Sentex Communications, and SPARTA. The release engineering team for 6.0-RELEASE includes: Scott Long [EMAIL PROTECTED] Release Engineering, I386 and AMD64 Release Building Ken Smith [EMAIL PROTECTED]Sparc64 Release Building, Mirror Site Coordination Robert Watson [EMAIL PROTECTED] Release Engineering, Security Doug White [EMAIL
Re: locking in a device driver
Dinesh Nair wrote: On 11/03/05 03:12 Warner Losh said the following: Yes. if you tsleep with signals enabled, the periodic timer will go off, and you'll return early. This typically isn't what you want either. looks like i've got a lot of work to do, poring thru all the ioctls for the device and trying to use another method to wait instead of tsleep(). Note that a thread can block on select/poll in 4.x and still allow other threads to run. I used this to solve a very similar problem to yours in a 4.x app of mine. I have the app thread wait on select() on the device node for the driver. When the driver gets to a state where an ioctl won't block (like data being available to read), then it does the appropriate magic in its d_poll method. select in userland sees this, allows the thread to resume running, and the thread then calls ioctl. Of course you have to be careful that you don't have multiple threads competing for the same data or that the data won't somehow disappear before the ioctl call runs. But it does work. Look at the aac(4) driver for my example of this. The other option is to use rfork, aka 'linuxthreads', to simulate threads via linked processes that share their address space. Each 'thread' is actually a process, and if one 'thread' blocks the rest are still allowed to run. It's more heavy-weight than real threads, but it does also work. works. If you use libc_r on 5, you'll see exactly this behavior. If you use libpthread or libthr, you won't. i use gcc -pthread, so it's libc_r on 4.x. what does 'gcc -pthread' link to on 5.x ? lpthread, I believe. Scott ___ freebsd-hackers@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-hackers To unsubscribe, send any mail to [EMAIL PROTECTED]
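[Editor's sketch] The select()-before-ioctl pattern described in the message above might look roughly like this in userland. This is only a minimal sketch, assuming a descriptor whose driver reports readiness through its poll method; wait_readable() is a hypothetical helper name and is not taken from aac(4) or any other driver.

```c
#include <sys/select.h>
#include <sys/time.h>
#include <unistd.h>

/*
 * Hypothetical helper: block in select() until fd is readable or the
 * timeout expires. Returns 1 if readable, 0 on timeout, -1 on error.
 * The caller would issue the ioctl()/read() only after this returns 1,
 * so the thread sleeps in select() and not inside the ioctl itself.
 */
int wait_readable(int fd, long timeout_ms)
{
    fd_set rfds;
    struct timeval tv;
    int n;

    FD_ZERO(&rfds);
    FD_SET(fd, &rfds);
    tv.tv_sec = timeout_ms / 1000;
    tv.tv_usec = (timeout_ms % 1000) * 1000;

    n = select(fd + 1, &rfds, NULL, NULL, &tv);
    if (n < 0)
        return -1;
    return (n > 0 && FD_ISSET(fd, &rfds)) ? 1 : 0;
}
```

As the message warns, this only helps if no other thread can consume the data between select() returning and the ioctl being issued.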
Re: locking in a device driver
Dinesh Nair wrote: On 10/28/05 16:40 Dinesh Nair said the following: On 10/28/05 10:52 M. Warner Losh said the following: libc_r will block all other threads in the application while an ioctl executes. libpthread and libthr won't. I've had several bugs at work which is a Good Thing(tm) indeed for me on 4.x. which may not be a Good Thing(tm) after all. this could be causing the problem i'm seeing with the driver on 4.x. any methods to get around this, short of not using threads ? I think this thread has gone too far into hyperbole and conjecture. What is your code trying to do, and what problems are you seeing? Scott
Re: locking in a device driver
Dinesh Nair wrote: On 11/02/05 03:02 Julian Elischer said the following: drops to splzero or similar,.. woken process called, starts manipulating another buffer collides with next interrupt. that makes a lot of sense, i'll try with using splxxx() in the pseudo driver, to block out the real driver. it's currently splhigh() due to INTR_TYPE_MISC being used, but i guess i could change this to INTR_TYPE_NET or INTR_TYPE_TTY. what would be good for a telecommunications line card which is time sensitive and interrupts at a constant 1000Hz ? INTR_TYPE_TTY and spltty it needs to call splxxx() while it is doing it.. I would suggest having two buffers and swapping them under splxxx() so that the one that the driver is accessing is not the one you are draining. that way the splxxx() level needs to only be held for the small time you are doing the swap. the first buffer is actually the buffer into which DMA reads/writes are done. what i referred to as another buffer is in fact a ring of buffers. the real driver writes into the top of the ring, and increments the top ring pointer. the pseudo driver reads from the bottom of the ring and increments the bottom ring pointer. buf1 buf2 buf3 buf4 buf5 buf6 buf7 buf8 ^ ^ | | | +-- top ring pointer, incremented as real driver reads | from device +-- bottom ring pointer, incremented as userland reads from pseudo You'll also want to use an spl in the top half of the pseudo driver to cover where the pointers are read and changed. not locks, but spl, and only step 8 needs to be changed because all the rest are already done at high spl. wouldnt a lockmgr() around the access to these ring buffers help since we're locking access to data and not necessarily execution ? lockmgr is far too heavy-weight and complex for this. Scott
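[Editor's sketch] The ring with separate top and bottom pointers discussed above can be modelled in plain userland C. This is a sketch under stated assumptions, not the poster's driver: struct ring, ring_put() and ring_get() are hypothetical names, and the ints stand in for the DMA buffers. The comments mark where a 4.x pseudo driver would raise and restore the spl.

```c
#include <stddef.h>

#define RING_SLOTS 8

struct ring {
    int    slot[RING_SLOTS]; /* stand-in for buf1..buf8 */
    size_t top;              /* next slot the real driver fills */
    size_t bottom;           /* next slot the pseudo driver drains */
};

/* Producer side (interrupt handler in the real driver).
 * Returns 0 on success, -1 if the ring is full. */
int ring_put(struct ring *r, int v)
{
    size_t next = (r->top + 1) % RING_SLOTS;

    if (next == r->bottom)
        return -1;           /* full: one slot is always left empty */
    r->slot[r->top] = v;
    r->top = next;
    return 0;
}

/* Consumer side (top half of the pseudo driver).
 * In a 4.x kernel the pointer read and update below would sit between
 * "s = spltty();" and "splx(s);" so the interrupt cannot run in between.
 * Returns 0 on success, -1 if the ring is empty. */
int ring_get(struct ring *r, int *out)
{
    if (r->bottom == r->top)
        return -1;
    *out = r->slot[r->bottom];
    r->bottom = (r->bottom + 1) % RING_SLOTS;
    return 0;
}
```

Keeping one slot empty lets "full" and "empty" be told apart from the two pointers alone, so the protected region stays as short as the advice above asks for.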
Re: Display files currently in the buffer cache
Eric Anderson wrote: Mark Kirkwood wrote: Dear hackers, I'm interested in being able to display some data about the contents of the buffer cache, say file name and page offset (something like IRIX's 'bufview'). Are there any utilities that do this currently? (searched around but didn't see anything in ports). Assuming not, is it feasible to write one to do this? (if so, any pointers appreciated - massive FreeBSD internals newbie here). This would be a cool tool! I've been thinking of that too, and also would like to have a lkdump tool - which dumps information about currently locked files. Eric Does the FreeBSD VM really have a concept of filenames at all? I thought that all it understood was buffer objects and vnodes. And since there isn't a strong correlation between vnodes and the filesystem namespace, it would be hard to provide such information. Scott
Re: Very slow writing to SATA disk
Søren Schmidt wrote: On 28/10/2005, at 23:45, Mikhail Teterin wrote: Indeed, 55C is way too high for 24/7 usage, and it might be that the drive is choking on it and barely is able to compensate. The reads are pretty quick... I'd like to be able to spin it down, but ataidle is broken :-( Ask the maintainer to get it fixed, but be warned experience says it might hose your data... What does SMART say ? anything unusual like high correction rates or anything ? (SMART data deleted) Well except the excessive temperature nothing out of the ordinary... Now, you say read speed is OK, but write speed isn't, is that on the raw disk device or through the filesystem ? Søren Schmidt [EMAIL PROTECTED] For what it's worth, I'm seeing slow write speeds on some tests with other (non-ata) controllers. Haven't had time to isolate it just yet. Scott
Re: correct use of bus_dmamap_sync
Dinesh Nair wrote: On 10/27/05 04:16 Scott Long said the following: an example would be using (BUS_DMASYNC_POSTREAD|BUS_DMASYNC_PREWRITE) which would be 0x03 in freebsd 4.x and 0x06 in freebsd 5.x. the gotcha is that 0x03 in freebsd 4.x is BUS_DMASYNC_POSTWRITE. so therefore, BUS_DMASYNC_POSTREAD|BUS_DMASYNC_PREWRITE will be BUS_DMASYNC_POSTWRITE in 4.x which in the syscall is actually a no op. Yes, that is fugly. Just don't use the | versions for now I would guess. Trying to maintain source compatibility between 4.x and 5.x/6.x will make you encounter a whole lot more problems than just this. could you elaborate on what busdma related problems there'd be, between 4.x and 5.x/6.x ? do, for example, the inner workings of the bus_dma* syscalls work the same on both ? I was speaking about driver code in general. For busdma specifically, the only difference is the extra arguments to bus_dma_tag_create(). Scott
Re: locking in a device driver
Dinesh Nair wrote: carrying on this discussion, what would be a good locking mechanism to use to protect tsleep() and other sensitive areas in a driver in freebsd 4.x ? the current code for the driver in 5.x uses mtx_lock and mtx_unlock with some parts even being protected by mtx_lock(Giant). would the use of simple_lock() or s_lock() do, given that SIMPLELOCK_DEBUG was defined in the 4.x kernel ? the mechanism is actually a pseudo device driver which communicates with the real device driver. the pseudo device driver creates a bunch of /dev/ devices which the userland reads/writes to, and the pseudo device driver then places data in a few buffers. the real device driver then reads these buffers and uses busdma to send the data to the device. reading is done by using busdma to read from the device and then placing the data in these buffers for the pseudo device to return to the userland process. locking in the real device driver uses splhigh/splx, but what locking should be used in the pseudo device driver ? If you need to protect your pseudodriver from being interrupted by the real driver then you'll need to use the same spl() as the driver. Note that you shouldn't be using splhigh() unless you really know what you are doing. Other than that, there likely isn't anything that you need to do for 'locking' in 4.x. The kernel is non-reentrant there, so you don't need to worry about synchronizing multiple threads. Scott ___ freebsd-hackers@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-hackers To unsubscribe, send any mail to [EMAIL PROTECTED]
Re: locking in a device driver
M. Warner Losh wrote: In message: [EMAIL PROTECTED] Scott Long [EMAIL PROTECTED] writes: : Dinesh Nair wrote: : : carrying on this discussion, what would be a good locking mechanism to : use to protect tsleep() and other sensitive areas in a driver in freebsd : 4.x ? : : the current code for the driver in 5.x uses mtx_lock and mtx_unlock with : some parts even being protected by mtx_lock(Giant). : : would the use of simple_lock() or s_lock() do, given that : SIMPLELOCK_DEBUG was defined in the 4.x kernel ? : : the mechanism is actually a pseudo device driver which communicates with : the real device driver. the pseudo device driver creates a bunch of : /dev/ devices which the userland reads/writes to, and the pseudo device : driver then places data in a few buffers. : : the real device driver then reads these buffers and uses busdma to send : the data to the device. reading is done by using busdma to read from the : device and then placing the data in these buffers for the pseudo device : to return to the userland process. : : locking in the real device driver uses splhigh/splx, but what locking : should be used in the pseudo device driver ? : : : If you need to protect your pseudodriver from being interrupted by the : real driver then you'll need to use the same spl() as the driver. Note : that you shouldn't be using splhigh() unless you really know what you : are doing. Other than that, there likely isn't anything that you need : to do for 'locking' in 4.x. The kernel is non-reentrant there, so you : don't need to worry about synchronizing multiple threads. One thing to also bear in mind is that in 4.x spl locking is a code lock. It keeps multiple 'threads' of execution from entering a block of code. mutexes in -current are data locks. While usually one can think of the two the same, it can trip the unweary up. Locking in 4.x is indeed much simpler. Warner I wouldn't characterize spls that way. 
An spl keeps top-half code from being preempted by an interrupt that would cause bottom-half code to run. It's more of a special critical section than a code serializer. Its big advantage is that it doesn't mask out all interrupts, just the ones that you want, and it's much more lightweight on x86 than doing explicit cli/sti instructions. It's the BGL spinlock that keeps multiple processes from executing the top half at the same time, and there is no control over that; it's just 'there'. The synchronization guarantees that you have in the 4.x kernel are: 1. Only one process will be executing in the kernel at a time. Multiple processes might be blocked at the same time, but only one will be executing, regardless of the number of CPUs. 2. Only one interrupt handler will execute at a time, and while it is executing there will not be any top half code executing on any other CPUs. 3. Interrupt handlers can preempt a process executing in the kernel unless the appropriate spl mask/level is set. Scott
Re: [Fwd: Re: use of bus_dmamap_sync]
John Baldwin wrote: On Wednesday 26 October 2005 04:47 am, Dinesh Nair wrote: On 10/26/05 10:39 Scott Long said the following: Apparently the original poster sent his question to me in private, then sent it again to the mailing list right as I was responding in private. apologies on that, scott. an initial search only turned up your message in the archives, but spreading it wider (not confining the google to lists.freebsd.org) brought up more hits, and that made me post it into -hackers. do bear with me as i try to understand this. Below is my response. Note that I edited it slightly to fix an error that I found bus_dmamap_sync(tag, map, BUS_DMASYNC_PREREAD); Ask hardware for data bus_dmamap_sync(tag, map, BUS_DMASYNC_POSTREAD); read from readbuf (i'm assuming that device has put data in readbuf) POSITION B } in other words, the PREREAD/POSTREAD wrap around the device's access to memory, and not the CPU's ? Yes, scott's notes are more correct than mine here. bus_dmamap_sync(tag, map, BUS_DMASYNC_PREWRITE); notify hardware of the write bus_dmamap_sync(tag, map, BUS_DMASYNC_POSTWRITE); The point of the syncs is to do the proper memory barrier and cache coherency magic between the CPU and the bus as well as do the memory copies for bounce buffers. If you are dealing with statically mapped buffers, i.e. for an rx/tx descriptor ring, then you'll want code however, reading thru the syscall code, bus_dmamem_alloc() sets the dmamap to NULL, and if it's null, bus_dmamap_sync() is not called at all. would this mean that if memory is allocated by bus_dmamem_alloc(), it does not need to be synced with bus_dmamap_sync() ? The value of the map is an implementation detail, which is why it's an opaque typedef. Portable code should always assume that the map has valid data. 
Now, specifically for i386, if you have a device with a 4GB address limit, and it has no data alignment constraints (unlike twe), and you are not using PAE, then yes the map will be NULL and the syncs will do nothing. Assuming that all three of these cases are false is not good, though. Perhaps on i386. Each arch implements sync(). Argh, it does look like the memory barriers needed on e.g., Alpha aren't used with static buffers because of the map != NULL check in sys/busdma.h. *sigh* I guess archs that need membars even without bounce buffers need to always allocate and setup a bus_dmamap. None of that matters for i386 though. Feel free to fix alpha. Again, long ago, I thought that alpha pretended to be coherent in the 2GB DMA window that we use so that it could be more like i386. If that's not true then that's fine. If you need to make structural changes to the MI code in order to fix alpha, please let me know. Scott
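[Editor's sketch] The bounce-buffer copies that this message attributes to the sync calls can be modelled in userland. This is a toy model under stated assumptions, not the busdma implementation: struct bounce_map and model_sync() are hypothetical names, and the op values simply reuse the 0x01/0x02/0x04/0x08 flags quoted elsewhere in this thread for 5.x.

```c
#include <string.h>

/* Flag values as quoted in the thread for 5.x. */
enum sync_op { PREREAD = 0x01, POSTREAD = 0x02, PREWRITE = 0x04, POSTWRITE = 0x08 };

#define BUF_LEN 16

struct bounce_map {
    unsigned char *client;          /* driver's buffer, assumed unreachable by the device */
    unsigned char  bounce[BUF_LEN]; /* memory the device can actually DMA to and from */
};

/*
 * Model of what a sync does when a bounce buffer is in play:
 *  - PREWRITE copies driver data into the bounce area before the device reads it;
 *  - POSTREAD copies device data out of the bounce area after the device wrote it;
 *  - PREREAD and POSTWRITE do nothing in this model.
 */
void model_sync(struct bounce_map *m, int op)
{
    if (op & PREWRITE)
        memcpy(m->bounce, m->client, BUF_LEN);
    if (op & POSTREAD)
        memcpy(m->client, m->bounce, BUF_LEN);
}
```

In this model, skipping POSTREAD means the driver keeps looking at stale data, which is why portable code should not rely on the map being NULL.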
Re: correct use of bus_dmamap_sync
John Baldwin wrote: On Wednesday 26 October 2005 02:13 am, Dinesh Nair wrote: On 10/26/05 04:10 John Baldwin said the following: Yes, and on some archs the sync() operations do have memory barriers in place, but there isn't any bounce buffering with bus_dmamem_alloc() memory. and in _bus_dmamap_load() in /usr/src/sys/i386/i386/busdma_machdep.c, apparently if the second argument to bus_dmamap_load (the pointer to bus_dmamap_t)) is NULL, the syscall code sets it to nobounce_dmamap, a static struct which doesnt seem to be used/allocated, except within the syscall. what would the implications of using NULL for the dmamap address be ? Well, you need it to get the physical address to pass to your device for it to do DMA against. on freebsd 4.x, vtophys(buffer) returns the same value as the this address. (i.e, when the callback function from bus_dmamap_load() is called, the address of the segment returned is the same as vtophys(buffer)). this is the current observed behaviour on 4.x. On i386, yes. It won't on sparc64 when using an IOMMU for example. The whole point of using bus_dma is to not use vtophys() since by doing that you are assuming that the PA's used by the CPU map 1:1 to the addresses used by your device to do DMA, and on architectures with an IOMMU such as sparc64, G5 ppc boxes, and probably amd64 boxes in the future, that is not a valid assumption at all. Well, the point of busdma is to make the DMA mechanics transparent to the driver. It's not just about IOMMUs, it's also about handling alignment constraints and address boundaries and exclusion areas. It's a set-it-and-forget-it deal. Set the requirements and constraints in the tag, follow the API, and the details Just Work without having to worry about them. have things changed between freebsd 4.x (which i'm using) and freebsd 5.x ? I don't think so as far as the interface. the values of the BUS_DMASYNC_ constants have changed though. 
they're an enum with values 0-3 in 4.x but in 5.x they're defined as 0x01, 0x02, 0x04 and 0x08. due to this, combining BUS_DMASYNC_XXX thru an OR could possibly give different behaviour on 4.x and 5.x. an example would be using (BUS_DMASYNC_POSTREAD|BUS_DMASYNC_PREWRITE) which would be 0x03 in freebsd 4.x and 0x06 in freebsd 5.x. the gotcha is that 0x03 in freebsd 4.x is BUS_DMASYNC_POSTWRITE. so therefore, BUS_DMASYNC_POSTREAD|BUS_DMASYNC_PREWRITE will be BUS_DMASYNC_POSTWRITE in 4.x which in the syscall is actually a no op. Yes, that is fugly. Just don't use the | versions for now I would guess. Trying to maintain source compatibility between 4.x and 5.x/6.x will make you encounter a whole lot more problems than just this. also, in both 4.x and 5.x, only POSTREAD and PREWRITE have any real meaning, as PREREAD and POSTWRITE are no ops. On i386, yes. Eventually those operations might be used to manipulate IOMMU mappings for example. I honestly don't ever expect to see IOMMU code for i386. The IOMMU that is provided by the AGP bus is fairly limited in what it can do, and trying to coordinate its use with X would be simply a nightmare. I'm less clear on the IOMMU that exists for amd64 and whether it's a true IOMMU or just an aliasing of the AGP IOMMU. Scott ___ freebsd-hackers@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-hackers To unsubscribe, send any mail to [EMAIL PROTECTED]
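[Editor's sketch] The OR gotcha described above is easy to reproduce with ordinary integers. The V4_ and V5_ names below are hypothetical stand-ins, not the kernel's identifiers; the values follow the thread (an enum 0-3 in 4.x, bit flags in 5.x), and the exact 4.x ordering of POSTREAD and PREWRITE is an assumption consistent with the 0x03 result quoted there.

```c
/* 4.x style: plain enumeration, values 0..3 (ordering assumed). */
enum { V4_PREREAD = 0, V4_POSTREAD = 1, V4_PREWRITE = 2, V4_POSTWRITE = 3 };

/* 5.x style: one bit per operation, so values can be OR'ed safely. */
enum { V5_PREREAD = 0x01, V5_POSTREAD = 0x02, V5_PREWRITE = 0x04, V5_POSTWRITE = 0x08 };

/* OR of two 4.x enumerators: collides with a third enumerator. */
int v4_combined(void)
{
    return V4_POSTREAD | V4_PREWRITE;
}

/* OR of two 5.x flags: a distinct value with both bits recoverable. */
int v5_combined(void)
{
    return V5_POSTREAD | V5_PREWRITE;
}

/* Test whether a 5.x style op mask contains a given flag. */
int v5_has(int ops, int flag)
{
    return (ops & flag) != 0;
}
```

With the 4.x values the combined request is indistinguishable from POSTWRITE alone, which matches the advice above to avoid the OR'ed forms in code shared between 4.x and 5.x.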
[Fwd: Re: use of bus_dmamap_sync]
Apparently the original poster sent his question to me in private, then sent it again to the mailing list right as I was responding in private. Anyways, no need to continue to guess; if anyone has any questions, feel free to ask. Below is my response. Note that I edited it slightly to fix an error that I found Scott Original Message Subject: Re: use of bus_dmamap_sync Date: Tue, 25 Oct 2005 07:59:03 -0600 From: Scott Long [EMAIL PROTECTED] To: Dinesh Nair [EMAIL PROTECTED] References: [EMAIL PROTECTED] Dinesh Nair wrote: hi scott, i came across this message of yours, http://lists.freebsd.org/pipermail/freebsd-current/2004-December/044395.html and you seem like the perfect person to assist me in something. i've been trying to figure out the best places to use bus_dmamap_sync when reading/writing to a dma mapped address space. however, i cant seem to get the gist of this, either from the mailing list discussions or the man page. could you assist me ? i'm on FreeBSD 4.11 right now, and i notice the definitions of BUS_DMASYNC_* has changed from an enum (0-3) in 4.x to a typedef in 5.x. this is what i have done. i have used two buffers to handle reads from the device and writes to the device. the pseudocode is as follows rx_func() { POSITION A bus_dmamap_sync(tag, map, BUS_DMASYNC_PREREAD); Ask hardware for data bus_dmamap_sync(tag, map, BUS_DMASYNC_POSTREAD); read from readbuf (i'm assuming that device has put data in readbuf) POSITION B } tx_func() { POSITION C write to txbuf (here's where we write to txbuf) bus_dmamap_sync(tag, map, BUS_DMASYNC_PREWRITE); notify hardware of the write POSITION D bus_dmamap_sync(tag, map, BUS_DMASYNC_POSTWRITE); } what BUS_DMASYNC_{PRE,POST}{READ,WRITE} option should i use for bus_dmamap_sync in position A, B, C and D ? any assistance would be gladly appreciated, as i'm seeing some really weird symptoms on this device, where data written out is being immediately read in. 
i'm guessing this has to do with my wrong usage of bus_dmamap_sync(). The point of the syncs is to do the proper memory barrier and cache coherency magic between the CPU and the bus as well as do the memory copies for bounce buffers. If you are dealing with statically mapped buffers, i.e. for an rx/tx descriptor ring, then you'll want code exactly like described above. In reality, most platforms only do stuff for the POSTREAD and PREWRITE cases, but for the sake of completeness the others are documented and usually used in drivers. NetBSD might have platforms that require operations for PREREAD and POSTWRITE, but I've never looked that closely. If you are dealing with dynamic buffers, i.e. for mbuf data, then you'll want the PREREAD and PREWRITE ops to happen in the callback function for bus_dmamap_load() and the POSTREAD and POSTWRITE ops to happen right before calling bus_dmamap_unload. So in this case it would be:

rx_buf() {
    allocate buffer
    allocate map
    bus_dmamap_load(tag, map, buffer, size, rx_callback, arg, flags)
}
rx_callback(arg, segs, nsegs, errno) {
    convert segs to hardware format
    bus_dmamap_sync(tag, map, BUS_DMASYNC_PREREAD)
    notify hardware about buffer
}
rx_complete() {
    bus_dmamap_sync(tag, map, BUS_DMASYNC_POSTREAD)
    bus_dmamap_unload(tag, map, buffer)
    deallocate map
    process buffer
}
tx_buf() {
    fill buffer
    allocate map
    bus_dmamap_load(tag, map, buffer, size, tx_callback, arg, flags)
}
tx_callback(arg, segs, nsegs, errno) {
    convert segs to hardware format
    bus_dmamap_sync(tag, map, BUS_DMASYNC_PREWRITE)
    notify hardware about buffer
}
tx_complete() {
    bus_dmamap_sync(tag, map, BUS_DMASYNC_POSTWRITE)
    bus_dmamap_unload(tag, map, buffer)
    deallocate map
    free buffer
}

This is the design that busdma was originally modelled on. It works well for storage devices where the load operation must succeed. It doesn't work as well for network devices where the latency of the indirect calls is measurable. So for that, I added bus_dmamap_load_mbuf_sg().
It eliminates the callback function and returns the scatter gather list directly. So, the above example would be: tx_buf() { bus_dma_segment_t segs[maxsegs]; int nsegs; fill buffer allocate map bus_dmamap_load_mbuf_sg(tag, map, buffer, size, segs, nsegs) convert segs to hardware format bus_dmamap_sync(tag, map, BUS_DMASYNC_PREWRITE) notify hardware about buffer } Also, the 'allocate map' part should be done carefully. Most network drivers are lazy and call bus_dmamap_create() and bus_dmamap_destroy() for each buffer. It's often better to pre-allocate the maps at init time, put them on a list, and then just push and pop them off the list at runtime. This is usually faster than calling the busdma
Re: Driver Development Books?
Pete wrote: Hello, I have what may seem to be a silly question, but I cannot find any other decent resources on the web. . The problem that I am having right now is that I have a fairly nice graphics card which, for the moment is only supported on Windows Operating systems, and old 2.4 Linux kernels. So far there has not been much positive outlook in porting the drivers to *BSD or any of the 2.6 kernels that I know of, let alone 64-bit drivers for non-Win OSes. So I guess that makes my question fairly simple then; I know that driver code is written in C (which I am learning currently) but thats about all I know. I'm probably not far off when I say that I need more to go on. Yet, from looking at Amazon.com I have not been able to find any books on writing driver code, which is really frustrating. One of my security related books, Rootkits, tells me about how to write drivers for a completely different reason so I know a bit more about how they work but again the code involved does not interface hardware to the OS, just injects a custom application. The other tool that I will probably use is Jungo, which is a nice-looking application which automates a skeletal version of the driver you need, but again, I would not know how to fill it out. Any help is appreciated. -Pete There are indeed no books that I know of on the subject of writing drivers for any *BSD, let alone FreeBSD. For the last year I've wanted to sit down and write such a book, but the amount of time needed to do this is daunting. Anyways, there were a couple of articles published back around 2000 on DeamonNews that covered some basic information on writing kernel modules, and they are likely still available via the various web search engines. For more detailed information, you'll need to dig into the kernel source code, look for appropriate manual pages, and ask questions. There are a number of really good people on this list that try to answer most questions like this, so don't be afraid to ask. 
Scott
Re: Driver Development Books?
Sangwoo Shim wrote: 2005/10/12, Scott Long [EMAIL PROTECTED]: Pete wrote: Hello, I have what may seem to be a silly question, but I cannot find any other decent resources on the web. . The problem that I am having right now is that I have a fairly nice graphics card which, for the moment is only supported on Windows Operating systems, and old 2.4 Linux kernels. So far there has not been much positive outlook in porting the drivers to *BSD or any of the 2.6 kernels that I know of, let alone 64-bit drivers for non-Win OSes. So I guess that makes my question fairly simple then; I know that driver code is written in C (which I am learning currently) but thats about all I know. I'm probably not far off when I say that I need more to go on. Yet, from looking at Amazon.com I have not been able to find any books on writing driver code, which is really frustrating. One of my security related books, Rootkits, tells me about how to write drivers for a completely different reason so I know a bit more about how they work but again the code involved does not interface hardware to the OS, just injects a custom application. The other tool that I will probably use is Jungo, which is a nice-looking application which automates a skeletal version of the driver you need, but again, I would not know how to fill it out. Any help is appreciated. -Pete There are indeed no books that I know of on the subject of writing drivers for any *BSD, let alone FreeBSD. [snip] For me, following book was quite helpful: Embedded FreeBSD cookbook, by Paul Cevoli ISBN: 1589950046 It tells about basic kernel data structure for driver writing. One of the best aspect of this book is that it shows you real code for real device (a simple PCI device). Moreover, it was quite easy to read. Although it focuses on FreeBSD 4.X. For those who want some _introduction_ for the FreeBSD driver writing, I would like to recommend this. Regard, Sangwoo Shim Ah, didn't know about that book. 
Yes, that sounds like a good foundation, though some aspects of drivers in 5.x and beyond are vastly different than in 4.x and prior, particularly concerning synchronization and interrupt behaviour. The next step is to talk about the different driver APIs and infrastructure, as well as debugging guides. Scott
Re: Fwd: Re: Linksys WRT54G with freebsd
Bruno Ducrot wrote: On Fri, Sep 23, 2005 at 01:50:45PM +0200, Florent Thoumie wrote: On Friday 23 September 2005 at 12:16 +0200, Bachilo Dmitry wrote: Forwarding to FreeBSD hackers. (Because i am hacking WRT right now and only Linux flashes work) -- Forwarded message -- Subject: Re: Linksys WRT54G with freebsd Date: 23 September 2005 17:06 From: Thierry Herbelot [EMAIL PROTECTED] To: freebsd-current@freebsd.org Cc: Marcos Biscaysaqu - ThePacific.net [EMAIL PROTECTED] On Friday 23 September 2005 11:08, you wrote: On the other hand, it's the wireless thing. If not needed, this should be fun to do a port, somehow, even though it's a wireless router. The cool factor of porting FreeBSD to the WRT54G cannot be underestimated, but Linux ports were enormously helped by the opening of the sources of the Linksys Linux port (which is absent for FreeBSD) and the big number of willing developers (just have a look at the *number* of different Linux ports to the WRT). The latest 6.0 release would be an excellent target, with its brand-new support for WPA and virtual APs ... who volunteers ? The Linksys WRT54g wireless router is based on a Broadcom CPU (derived from MIPS) and FreeBSD/mips seems to be a dead project :-( Indeed. It's targeted to SGI platforms anyway. Maybe there is a need to start a new port if there are enough people interested? There has been talk of doing this in the past year from some people, but I don't know if it got very far. If you're inspired, go for it! There are plenty of docs on the web about how to attach a serial port header and bootstrap it. And, don't underestimate the mips32 work that is already in the tree; it's likely a good starting point. And, it's more than just a 'coolness' factor. I'd really like to have pf running on mine, that way I could get rid of the clunky machine doing static NAT + firewall on my DSL line.
The Linux firewall capabilities are soo last century =-) Scott
Re: KINDLY HELP : error while kldloading a pci,character driver
rashmi ns wrote: Hi, Amazing, Thanks a lot it really works . Now i have to read what D_VERSION does :-) Thanks , Rashmi.N.S You also need to remove .d_maj. /dev entries are created dynamically now, and your application should have no knowledge of the major and minor number internals of it. Scott
Re: PCI_MULTI FUNCTION DEVICE DRIVERS
rashmi ns wrote: Hello All, While writing a pci-driver for hdlc controller which has two functions 1.BRIDGE 2.Network Do we need to write two separate drivers for each class-code or how can a single driver manage two different functionalities .Are there any examples on pci-multifunction drivers .I read in the documents that we need to use mbuf structure for multi-function devices are there any drivers which uses the same. Thanks in advance , Rashmi.N.S In the FreeBSD driver model, the driver 'probe' method will get called for each PCI function in the system. The driver can either bid to claim it, or reject it. It's up to the driver author as to what criteria are used to bid/accept vs reject a function. Almost all drivers look at the PCI device ID set. Looking at merely the PCI class code is not recommended since it is far too ambiguous. Also, unlike Linux, the driver does not have easy access to the PCI enumeration internals of the OS. There are also no guarantees as to what order the bus will be probed or what order functions will be enumerated in. Do you actually need to program both functions of the hardware? Usually a bridge device tends to be passive from a driver standpoint. Is there something special that your bridge does? If so then you'll need to write two sets of drivers, each with a unique probe, attach, and detach method. Making the separate driver instances work together will be a bit tricky. The easy but really messy approach is to create some global variables and methods and have the drivers cheat. I'd avoid this if at all possible. Probably the more correct approach is to have each function walk the device tree and look for its sibling, then communicate via custom DEVMETHODs. This does have performance implications though, so a combination of walking the tree then calling via direct dispatch is probably the best approach if performance is a factor.
If you could provide more information about what your device is and what each function does, I can probably give better answers. Scott
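The ID-based bidding described above can be modeled in plain userland C. This is only a sketch: the vendor/device IDs, the `sketch_` names, and the return constants are made up for illustration. In a real driver the IDs would come from pci_get_vendor(dev) and pci_get_device(dev), and probe would return a bid such as BUS_PROBE_DEFAULT, or ENXIO to reject.

```c
#include <stddef.h>
#include <stdint.h>

/* Illustrative stand-ins for the kernel's probe return values. */
#define SKETCH_PROBE_CLAIM   0   /* "this driver will handle the function" */
#define SKETCH_PROBE_REJECT  6   /* plays the role of ENXIO: "not mine" */

/* Hypothetical ID table; a real driver takes these from the datasheet. */
struct sketch_pci_id {
    uint16_t vendor;
    uint16_t device;
    const char *desc;
};

static const struct sketch_pci_id sketch_ids[] = {
    { 0x1234, 0x0001, "Example HDLC controller, network function" },
};

/* Return the table entry matching both IDs, or NULL if none matches. */
static const struct sketch_pci_id *
sketch_lookup(uint16_t vendor, uint16_t device)
{
    size_t i;

    for (i = 0; i < sizeof(sketch_ids) / sizeof(sketch_ids[0]); i++) {
        if (sketch_ids[i].vendor == vendor &&
            sketch_ids[i].device == device)
            return &sketch_ids[i];
    }
    return NULL;
}

/* Model of a probe method: claim only functions whose IDs are listed. */
int
sketch_probe(uint16_t vendor, uint16_t device)
{
    return sketch_lookup(vendor, device) != NULL ?
        SKETCH_PROBE_CLAIM : SKETCH_PROBE_REJECT;
}
```

Because the bridge function would carry a different device ID, it simply falls through to the reject path unless a second driver lists it in its own table.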
Re: Adding new option to ktrace
Nikhil Dharashivkar wrote: Hi Scott and Rajesh, Thanks for replying. Basically, what happened is that while testing a SCSI driver on FreeBSD, at some point it crashes. So there is no way to know how much I/O was performed. To know the I/O state just before the driver fails, I selected ktrace to print whatever I/O information I can get from the dastrategy routine. Do you have reason to believe that certain I/O patterns cause the crash? What driver is being used? What is the crash? Scott
Re: Adding new option to ktrace
Nikhil Dharashivkar wrote: Hi, I want to hack the ktrace system call. Basically, I want to monitor SCSI disk I/O through the dastrategy() routine. It seems that kern_ktrace.c implements different functions for ktrace options like -tc / -ti ... etc. (see man page). So, is it possible to add a new option for disk I/O, with a new structure object containing disk I/O information which will be passed to ktr_submitrequest through the ktr_request structure? Will the data be written correctly to ktrace.out, and will kdump analyze it? What are you trying to monitor? Would the existing devstat interface work? Scott
Re: Adding new option to ktrace
Rajesh S. Ghanekar wrote: Scott Long wrote: Nikhil Dharashivkar wrote: Hi, I want to hack the ktrace system call. Basically, I want to monitor SCSI disk I/O through the dastrategy() routine. It seems that kern_ktrace.c implements different functions for ktrace options like -tc / -ti ... etc. (see man page). So, is it possible to add a new option for disk I/O, with a new structure object containing disk I/O information which will be passed to ktr_submitrequest through the ktr_request structure? Will the data be written correctly to ktrace.out, and will kdump analyze it? What are you trying to monitor? Would the existing devstat interface work? Maybe he needs to know how many bytes were transferred (read/write) while a process is executing. I guess devstat doesn't do it from process context; it gives total I/O reads/writes for a device, if registered via devstat. Please correct me if I am wrong. - Rajesh There isn't a 1:1 correlation between the bytes that the userland program writes and the bytes that actually get written to disk. Filesystem metadata writes will happen if the file needs to be extended, not to mention the access time being updated. Some writes won't even originate from a userland program, like swap writes. GEOM also decouples the I/O path, so it's not the user process that will actually do the write, it's the g_down kthread. I would think that this would make tracking I/O via ktrace very hard. Scott
Re: Adding new option to ktrace
Nikhil Dharashivkar wrote: Yes, what Rajesh is saying is right, I want to print I/O bytes. You want to capture writes coming from userland, or you want to capture all low-level disk writes? Are you trying to correlate these writes with a particular user process? Consider an mmapped file. A userland program will modify the memory fronting the file, and at some point the pagedaemon kthread will come in and flush those dirty pages, independent of the user process. Also, like I said, device strategy routines are decoupled from the syscall callers by the g_down kthread. Trying to figure out from dastrategy which userland thread is responsible for the I/O is going to be tricky, if even possible at all. Scott
Re: Low umass performance with USB 2.0 ports
Ian Dowse wrote: In message [EMAIL PROTECTED], Eygene A. Ryabinkin writes: What filesystem does your USB drive have? The one I was extensively testing has FAT, but I've checked the UFS2 -- just a bit better -- 1.8 Mb/second. But you're right -- no wdrains at all. FreeBSD 4.x had very low performance with the FAT filesystem; the writing process spent lots of time in the wdrain state too. Yes, it has. But here the same flash drive gives different results for ehci and uhci devices, and the total speed of ehci is lower due to wdrains: 300 Kb/sec versus 500 Kb/sec. And I sometimes write my data to the Windows partition with FAT on my home HDD -- it has no wdrains. At least, I've not noticed them. For flash I can. The patch from the email below may help with the wdrain state - can you see if it makes any difference? Is the problem that the interrupt gets fired but not all of the status information has made its way back to host memory when the driver gets there? Would it make a difference to instead read back the EHCI_USBSTS register after writing to it in ehci_intr1? That way all transactions down to the controller would be guaranteed to be flushed before you continue on. I wonder if this is a remnant of the famous problems with VIA chipsets doing bad things under medium-to-high PCI contention. I don't see any obvious workarounds for this in the Linux EHCI code, so I wonder if it's a case of them not encountering it, or doing something different that avoids the problem. Scott
Re: Low umass performance with USB 2.0 ports
Scott Long wrote: Ian Dowse wrote: In message [EMAIL PROTECTED], Eygene A. Ryabinkin writes: What filesystem does your USB drive have? The one I was extensively testing has FAT, but I've checked the UFS2 -- just a bit better -- 1.8 Mb/second. But you're right -- no wdrains at all. FreeBSD 4.x had very low performance with the FAT filesystem; the writing process spent lots of time in the wdrain state too. Yes, it has. But here the same flash drive gives different results for ehci and uhci devices, and the total speed of ehci is lower due to wdrains: 300 Kb/sec versus 500 Kb/sec. And I sometimes write my data to the Windows partition with FAT on my home HDD -- it has no wdrains. At least, I've not noticed them. For flash I can. The patch from the email below may help with the wdrain state - can you see if it makes any difference? Is the problem that the interrupt gets fired but not all of the status information has made its way back to host memory when the driver gets there? Would it make a difference to instead read back the EHCI_USBSTS register after writing to it in ehci_intr1? That way all transactions down to the controller would be guaranteed to be flushed before you continue on. I wonder if this is a remnant of the famous problems with VIA chipsets doing bad things under medium-to-high PCI contention. I don't see any obvious workarounds for this in the Linux EHCI code, so I wonder if it's a case of them not encountering it, or doing something different that avoids the problem. Scott Actually, I just peeked inside the Linux EHCI code and it does a dummy read immediately after writing to the status register: /* clear (just) interrupts */ writel (status, &ehci->regs->status); readl (&ehci->regs->command); /* unblock posted write */ I wonder if that's the whole trick here. Would someone be willing to try the attached patch instead of the one that Ian posted? 
Scott

Index: ehci.c
===================================================================
RCS file: /usr/ncvs/src/sys/dev/usb/ehci.c,v
retrieving revision 1.36
diff -u -r1.36 ehci.c
--- ehci.c      29 May 2005 04:42:27 -0000      1.36
+++ ehci.c      31 Aug 2005 19:44:14 -0000
@@ -578,6 +578,7 @@
                return (0);
 
        EOWRITE4(sc, EHCI_USBSTS, intrs); /* Acknowledge */
+       EOREAD4(sc, EHCI_USBCMD); /* Flush posted writes on PCI */
        sc->sc_bus.intr_context++;
        sc->sc_bus.no_intrs++;
        if (eintrs & EHCI_STS_IAA) {
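The write-then-read idiom in the patch can be shown in isolation. The following is a userland sketch with a made-up register block (`struct sketch_regs` is not the real EHCI layout); in ordinary memory it is just a store followed by a load, but it has the same shape as the driver code. On real PCI hardware the read of any register on the device cannot complete until earlier posted writes to that device have been delivered.

```c
#include <stdint.h>

/* Hypothetical register block; field names are illustrative only. */
struct sketch_regs {
    volatile uint32_t command;
    volatile uint32_t status;
};

/*
 * Acknowledge interrupts, then read another register on the same device.
 * The dummy read is what flushes the posted write on a real bus; its
 * value is returned only so the caller can inspect it.
 */
uint32_t
sketch_ack_and_flush(struct sketch_regs *regs, uint32_t intrs)
{
    regs->status = intrs;   /* posted write: may sit in a bridge buffer */
    return regs->command;   /* read forces earlier writes out first */
}
```

The `volatile` qualifiers keep the compiler from dropping or reordering the two accesses; they do nothing about bus-level buffering, which is what the read-back addresses.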
Re: Low umass performance with USB 2.0 ports
Hans Petter Selasky wrote: On Wednesday 31 August 2005 21:47, Scott Long wrote: Actually, I just peeked inside the Linux EHCI code and it does a dummy read immediately after writing to the status register: /* clear (just) interrupts */ writel (status, &ehci->regs->status); readl (&ehci->regs->command); /* unblock posted write */ I wonder if that's the whole trick here. Would someone be willing to try the attached patch instead of the one that Ian posted? 
Scott This is not documented in the EHCI chip specification. Flushing posted writes is something that all programmers of PCI devices should understand, so it usually isn't documented in device manuals. There exists the doorbell to ensure that the EHCI controller is finished with data structures. Also I have noticed that the existing EHCI driver does not always dequeue structures from the controller before accessing them. Can you point to an example here? If Scott's patch doesn't work, could you try installing the following (compiles on FreeBSD 5/6/7): Yeah, looks like my guess was wrong. Scott
Re: Syscall/Sysret state on i386 arch
John Baldwin wrote: On Sunday 28 August 2005 10:32 am, alexander wrote: The AMD64 arch is using the syscall/sysret opcodes instead of int80h to perform a syscall (/usr/src/lib/libc/amd64/SYS.h). I just checked the output of my dmesg and it says: CPU: AMD Duron(tm) Processor (1311.69-MHz 686-class CPU) Origin = AuthenticAMD Id = 0x671 Stepping = 1 Features=0x383f9ff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,MMX,FXSR,SSE> AMD Features=0xc0400800<SYSCALL,MMX+,3DNow+,3DNow> I got hold of the AMD document number 21086.pdf. It describes both opcodes pretty well, but doesn't tell which CPUs support the new opcodes. But since the first revision of that document is dated Sept 1997, quite a lot of i386 CPUs should support the opcodes. The NASM manual only states [P6,AMD] as the required CPU to perform those opcodes. I found some patches for Linux that replace the int80h syscall calling convention with syscall/sysret on i386 and the results look pretty convincing: (INT $0x80 based getpid(), got pid 497) latency:282 cycles (SYSENTER based getpid(), got pid 497) latency:138 cycles on a 266 MHz PII this is 0.51 usecs for a getpid(). (was 1.06 usecs) Quoted from: http://www.ussg.iu.edu/hypermail/linux/kernel/9806.1/0878.html Does anybody know more about this? Is it even possible to replace the current syscall implementation that easily, or would that require elaborate changes to all the syscalls (libc), etc.? And which CPUs support these new opcodes? Does anybody know if the Linux patches actually got committed to the official kernel? Support for syscall/sysret is determined by a cpuid flag. I do believe someone has worked on either syscall/sysret or sysenter/sysexit support in a p4 branch. You can try asking jeff@ about it. I think it was sysenter/sysexit and it didn't really improve things much. Actually, the results were fairly inconclusive because it was also somewhat unstable under real loads. 
The work is in Perforce under //depot/user/jeff/sysenter/... I've worked on this branch also, but not in a few months. I can make patches if anyone is interested. Scott
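As a small worked example of the "cpuid flag" check, the two feature words from the dmesg output quoted in this thread can be decoded by hand. The sketch below assumes the standard CPUID bit assignments (bit 11 of EDX in leaf 1 for SEP, i.e. sysenter/sysexit, and bit 11 of EDX in leaf 0x80000001 for SYSCALL); the `sketch_` function names are made up, and the code only tests bits in values passed to it rather than executing CPUID itself.

```c
#include <stdbool.h>
#include <stdint.h>

/* CPUID leaf 1, EDX bit 11: SEP (SYSENTER/SYSEXIT). */
#define SKETCH_CPUID_SEP      (UINT32_C(1) << 11)
/* CPUID leaf 0x80000001, EDX bit 11: SYSCALL/SYSRET. */
#define SKETCH_CPUID_SYSCALL  (UINT32_C(1) << 11)

/* 'features' is the word dmesg prints as "Features=0x...". */
bool
sketch_has_sysenter(uint32_t features)
{
    return (features & SKETCH_CPUID_SEP) != 0;
}

/* 'amd_features' is the word dmesg prints as "AMD Features=0x...". */
bool
sketch_has_syscall(uint32_t amd_features)
{
    return (amd_features & SKETCH_CPUID_SYSCALL) != 0;
}
```

For the Duron above, both `sketch_has_sysenter(0x383f9ff)` and `sketch_has_syscall(0xc0400800)` are true, which agrees with the SEP and SYSCALL names in the decoded lists.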
Re: Checking sysctl values from within the kernel.
Dan Nelson wrote: In the last episode (Aug 05), Thordur I. Bjornsson said: If I want to check a sysctl value from within the kernel (e.g. a KLD), should I use the system calls described in sysctl(3)? If not, what is the proper way to do so? Since most sysctls are direct mappings onto integer variables in the kernel, just check the variable directly. Most of those integer values are also declared static, so they won't be visible to external code, especially not KLDs. There is no easy way to do this. I'm sure that you could hack up some code to simulate a sysctl syscall from within the kernel, but that would be really really gross, evil, and wrong. What values are you trying to get at? Would it make more sense to export them via real accessor functions? Scott
Re: Checking sysctl values from within the kernel.
John Baldwin wrote: On Friday 05 August 2005 10:50 am, Dan Nelson wrote: In the last episode (Aug 05), Thordur I. Bjornsson said: If I want to check a sysctl value from within the kernel (e.g. a KLD), should I use the system calls described in sysctl(3)? If not, what is the proper way to do so? Since most sysctls are direct mappings onto integer variables in the kernel, just check the variable directly. There's also a kernel_sysctl() function available in the kernel for in-kernel access to sysctls. You might have to look up the OID for a given name yourself though. Actually, there's a kernel_sysctlbyname() as well. Shoot, forgot about that function. However, exporting data throughout the kernel via the sysctl interface sounds like poor design. Scott
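The accessor-function alternative raised earlier in this thread might look roughly like the sketch below. Everything here is hypothetical (the variable, its default, and the `sketch_` names), locking is left out, and it is written as plain C so it can be compiled outside the kernel. The point is only that the variable stays static while other code goes through a small, explicit interface.

```c
/* Hypothetical tunable; static, so other modules cannot reference it. */
static int sketch_max_sessions = 16;

/* Read access for other kernel code (or other KLDs). */
int
sketch_get_max_sessions(void)
{
    return sketch_max_sessions;
}

/* Write access with validation; returns 0 on success, -1 on bad input. */
int
sketch_set_max_sessions(int value)
{
    if (value < 1)
        return -1;
    sketch_max_sessions = value;
    return 0;
}
```

The same variable could still be exposed to userland through a sysctl node; the accessors just keep in-kernel consumers from depending on the sysctl namespace.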
Re: UFS endian-ness
M. Warner Losh wrote: In message: [EMAIL PROTECTED] Jeremy Baggs [EMAIL PROTECTED] writes: : I was wondering if anyone has done any recent work with, or knows how : (non-)trivial it would be to add support for mounting big-endian UFS : filesystems, such as the one in use on OS X. It is trivial. NetBSD just does the swapping on input or output and the diffs to do it were small. Warner Do their patches include UFS2 and EA support? Scott
Re: await asleep
Daniel Eischen wrote: On Wed, 27 Jul 2005, Norbert Koch wrote: The functions await() and asleep() in kern_synch.c are marked as EXPERIMENTAL/UNTESTED. Is this comment still valid? Has anyone used those functions successfully? Should I avoid using them in my device driver code for RELENG_4? How do I correctly cancel a request (as I should do according to the man page): asleep (NULL, 0, NULL, 0)? The await family was removed in 5.x and beyond, so trying to use them in 4.x will make your driver very unportable. There are better ways than await to handle delayed events. Well, there's tsleep() and wakeup() for FreeBSD 5.0. Other than that, what else can you do? These functions are deprecated in 5.x and 6.x in favor of condvar(9) and mutex(9), so you should really use those instead of tsleep() and wakeup(). It seems the kernel in -current is still using tsleep() and wakeup() in some places. I thought we got rid of all these... Can you explain why tsleep and wakeup should no longer be used? I wasn't aware that they were formally deprecated. Scott
Re: await asleep
Daniel Eischen wrote: On Wed, 27 Jul 2005, Scott Long wrote: Daniel Eischen wrote: On Wed, 27 Jul 2005, Norbert Koch wrote: The functions await() and asleep() in kern_synch.c are marked as EXPERIMENTAL/UNTESTED. Is this comment still valid? Has anyone used those functions successfully? Should I avoid using them in my device driver code for RELENG_4? How do I correctly cancel a request (as I should do according to the man page): asleep (NULL, 0, NULL, 0)? The await family was removed in 5.x and beyond, so trying to use them in 4.x will make your driver very unportable. There are better ways than await to handle delayed events. Well, there's tsleep() and wakeup() for FreeBSD 5.0. Other than that, what else can you do? These functions are deprecated in 5.x and 6.x in favor of condvar(9) and mutex(9), so you should really use those instead of tsleep() and wakeup(). It seems the kernel in -current is still using tsleep() and wakeup() in some places. I thought we got rid of all these... Can you explain why tsleep and wakeup should no longer be used? I wasn't aware that they were formally deprecated. My mistake then. I thought they were deprecated when mutex and CVs were introduced. There is no need for them except for compatibility, Incorrect. A mutex is not a replacement for sleep. CVs and semaphores implement some of what tsleep does, but tsleep is absolutely appropriate when you want to sleep for an event (like disk I/O completing) and don't need to worry about mutexes. Not every inch of the kernel needs to be covered by mutexes, Giant or otherwise. and the priority argument of tsleep() doesn't have any meaning any longer, right? I thought it did, but John can give the definitive answer. Scott
Re: how to use the function copyout()
Felix-KM wrote:

I think that could work (only an idea, not tested):

    struct Region { void *p; size_t s; };
    #define IOBIG _IOWR('b', 123, struct Region)

userland:

    char data[1000];
    struct Region r;
    r.p = data;
    r.s = sizeof data;
    int error = ioctl(fd, IOBIG, &r);

kernel:

    int my_ioctl(..., caddr_t data, ...)
    {
        ...
        char buf[1000];
        ...
        return copyout(buf, ((struct Region *)data)->p,
            ((struct Region *)data)->s);
    }

Have a try and tell us if it works. Norbert

Yes! Now the program works! I have changed the code in this way:

    struct Region { void *p; size_t s; };
    #define IOBIG _IOWR('b', 123, struct Region)

Unless your ioctl handler is going to modify values in the Region struct and pass them back out to userland, you should just use _IOW instead of _IOWR, since the struct then only needs to be copied into the kernel.

Scott
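One thing the handler in this thread does not do is check the caller's length against the size of the kernel buffer. Below is a userland model of a bounded version, offered as a sketch only: `sketch_copyout` is a memcpy-based stand-in for copyout(9), and `sketch_ioctl_handler` is a made-up name. It clamps the requested length to the buffer size before copying and writes the clamped length back into the struct.

```c
#include <stddef.h>
#include <string.h>

struct Region {
    void *p;    /* destination buffer supplied by the caller */
    size_t s;   /* in: capacity of p; out: bytes actually written */
};

/* Stand-in for copyout(9); here both sides share one address space. */
static int
sketch_copyout(const void *kaddr, void *uaddr, size_t len)
{
    memcpy(uaddr, kaddr, len);
    return 0;
}

/* Model handler: never copies more than the local buffer holds. */
int
sketch_ioctl_handler(struct Region *r)
{
    char buf[1000];
    size_t len = r->s;

    memset(buf, 'x', sizeof(buf));  /* placeholder payload */
    if (len > sizeof(buf))
        len = sizeof(buf);          /* clamp to what we really have */
    r->s = len;                     /* report the length back */
    return sketch_copyout(buf, r->p, len);
}
```

Because this variant modifies the struct and hands it back, it is a case where keeping _IOWR would be appropriate.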