device major number
Stupid question, but I couldn't (yet) find an answer: I'm writing a driver, so I need a major device number (for -stable). Is there a list of assigned numbers, and if so, where? What's the procedure to get one assigned? BTW, the driver is for a video grabber, zrn36067 based.

Thanks, danny

To Unsubscribe: send mail to [EMAIL PROTECTED] with unsubscribe freebsd-hackers in the body of the message
RE: (no subject)
-Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]] Sent: Thursday, November 29, 2001 4:07 PM To: [EMAIL PROTECTED] Cc: [EMAIL PROTECTED]; [EMAIL PROTECTED]; [EMAIL PROTECTED] Subject: Re: (no subject)

> In a message dated 11/29/2001 7:16:17 AM Eastern Standard Time, [EMAIL PROTECTED] writes:

> > Well, let me give you something else to put in your pipe and smoke. :-) I've spent about $800 on a few WANic 4xx cards (used, I'll grant) precisely because source for the driver is available. I happen to not use them with Frame circuits so I used the HDLC in the driver. I have spent $0.00 on ET cards precisely because the driver code is unavailable. Now, as I've never used ET cards, I'll take your statement at face value that their drivers are superior to the WANic one. But, I'm not going to pick a superior binary-only driver over an inferior source-freely-available driver, if I have a choice. You may think this is screwy but it's how I feel.

> You are entitled to your opinion, but you (and others) should explain that when you are making recommendations, because I'm sure there are those that actually think that you are recommending the best solution, which clearly isn't the case. Most people prefer a boom-box to a crystal set, and those reading your opinions don't understand that context.

But I do, as a matter of fact. The whole thread on the WANic that got fired up a month ago or so, when they made their announcement that they were dumping the WANic 405, was centered around the fact that this was just one less synchronous serial interface card that had an open source driver available. I was not arguing that we should dump a lot of effort into a binary-only driver for the successor cards, the WANic 5xx series; I was arguing for disclosure of the registers for it. It was rapidly made clear by Imagestream that they were totally uninterested in going back to SDL, Rockwell and one other company (I forget which) and arguing for such disclosure.
After that I suggested to some other people that got interested in it that a binary driver was possible under NDA, but I certainly wasn't advocating it by that statement. I happen to know that there's a Nokia developer somewhere working on a binary driver for the 5xx series and FreeBSD 2.2.8. But I cannot recommend WANic cards anymore because there's no guarantee that this driver will ever leave Nokia, or even be completed. And as far as other sync cards for FreeBSD, I have no experience with them and they are much more recent additions. As of now the ET cards have a compelling advantage over the rest of the sync cards for FreeBSD because they have more history of use under it.

> I seem to remember reading in some book that the main advantage FreeBSD has over linux is its corporate-friendly license (who wrote that thing anyway?)...yet you bash the concept of using the license. It seems a bit hypocritical to me.

As a matter of fact, in that book you're talking about, on page 193 it specifically lists the Evergreen Technologies synchronous serial card along with the WANic as T1 interface cards into FreeBSD. And it also states on that same page that using a Cisco as a T1 interface router is safer but more expensive, followed by a list of paragraphs that explain why the extra money is worth the hassle. Nowhere in there is any discussion or statement that the card and driver set that has the open source driver (i.e. the WANic) is better than the card and driver that has a binary-only driver (i.e. the ET card).

Of course, all this was written before the bottom dropped out of the used router market; today you can get a 1601 and DSU for $400 and it's extremely difficult to justify use of a PC as a leaf-node router because it simply isn't as resistant to physical environmental stresses as a total hardware router with no moving parts.
I still argue that anyone running BGP and multihoming T1's (which probably describes 3/4 of the smaller ISP's in the world) can get better performance at a lower price from a FreeBSD router with sync cards in it than the Cisco recommended solution (3660 or 7200 series or greater) But even the prices on used high-end Cisco gear are falling and it could be argued that this may not be true anymore. As far as the debate between an open source driver vs a closed source driver, I'll say this much about this issue in regards to T1 cards. Simply put the T1 interface hasn't changed in 20 years (probably a lot longer) and given the glacial pace of change of the US phone system, I expect that there will be T1's still being provisioned when I'm an old Grandpa with my beard down to my knees. I routinely purchase DSU's today on the seconds market for use with brand new T1 installs that have manufacture dates of the late 80's. I have a concern about the ET cards because if I bought an ET card today for use in a router I would expect to be able to use that card for another 20 years, or at least until the PCI slot is no
RE: Netgraph
-Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]]On Behalf Of [EMAIL PROTECTED] Sent: Thursday, November 29, 2001 4:44 PM To: [EMAIL PROTECTED] Cc: [EMAIL PROTECTED] Subject: RE: Netgraph

> Lego is a good analogy. The usefulness is not the point. It's great for hackers, and terrible for the general technical population. It depends on your goal, whether it's to build an OS for hackers, or to gain widespread acceptance for FreeBSD from the general technical public. Complicated, unintuitive interfaces with a long learning curve are not generally accepted.
> DB

This is a myth; you're greatly underestimating the general technical public. The general technical public has displayed a willingness to read instructions and follow directions (much different from the general computing public, which is a different animal).

If there is anything wrong with netgraph, it is that there's a lack of examples of setting up common configurations in the handbook, man pages, and other documents.

Also, speaking as a writer, section 4 of the manual page on netgraph is extremely hard to digest; within the first paragraph alone they redefine the meaning of the words graph, node, hook, and edge. I understand it's because of the modularity of the software, but this is a man page that needs to be a lot less abbreviated.

But none of this matters to the general technical public, because what most of those people do is find a FAQ that contains a recipe for what they want to be doing and follow that.

Ted Mittelstaedt [EMAIL PROTECTED]
Author of: The FreeBSD Corporate Networker's Guide
Book website: http://www.freebsd-corp-net-guide.com
RE: Netgraph
On Fri, 30 Nov 2001, Ted Mittelstaedt wrote:

> If there is anything wrong with netgraph is that there's a lack of examples of setting up common configurations in the handbook, man pages, and other documents.

/usr/share/examples/netgraph gives examples of some common configurations.

> Also, speaking as a writer, section 4 of the manual page on netgraph is extremely hard to digest, within the first paragraph alone they redefine the meaning of the words graph, node, hook, and edge. I understand it's because of the modularness of the software but this is a man page that needs to be a lot less abbreviated.

Suggestions welcome.. :-)

> But none of this matters to the general technical public because what most of those people do is find a FAQ that contains a recipe for what they want to be doing and follow that.
>
> Ted Mittelstaedt [EMAIL PROTECTED]
> Author of: The FreeBSD Corporate Networker's Guide
> Book website: http://www.freebsd-corp-net-guide.com
RE: Netgraph
-Original Message- From: Julian Elischer [mailto:[EMAIL PROTECTED]] Sent: Friday, November 30, 2001 1:01 AM To: Ted Mittelstaedt Cc: [EMAIL PROTECTED]; [EMAIL PROTECTED]; [EMAIL PROTECTED] Subject: RE: Netgraph

> On Fri, 30 Nov 2001, Ted Mittelstaedt wrote:
> > If there is anything wrong with netgraph is that there's a lack of examples of setting up common configurations in the handbook, man pages, and other documents.
> /usr/share/examples/netgraph gives examples of some common configurations.

Oh dang, I should have checked there. But really, this info needs to be in the ngctl man page.

> > Also, speaking as a writer, section 4 of the manual page on netgraph is extremely hard to digest, within the first paragraph alone they redefine the meaning of the words graph, node, hook, and edge. I understand it's because of the modularness of the software but this is a man page that needs to be a lot less abbreviated.
> Suggestions welcome.. :-)

The biggest problem with those man pages is that they tell you exactly what the stuff does but not exactly why you would want to do it.

I actually sent in a proposal to BSDcon to give a presentation on building network routers with FreeBSD, with a big DP show of different systems tied together. Discussion of Netgraph would have been part of this, of course, and while I was writing that part I could have modified the man pages. But it wasn't picked up and I set it aside. Maybe sometime in the future I'll pick it up again.

Ted Mittelstaedt [EMAIL PROTECTED]
Author of: The FreeBSD Corporate Networker's Guide
Book website: http://www.freebsd-corp-net-guide.com
Re: FreeBSD performing worse than Linux?
On Thu, Nov 29, 2001 at 10:28:09PM -0500, Leo Bicknell wrote:

> > I can't reproduce this result, 16K fills a T1 for 11 ms, which is 22000 km (at 2/3 of light speed), enough to get halfway round the
> Your math is a little funny.

Right, I knew there was something wrong somewhere :-)

> 4000 km one way == 8000 km two way, 8000 / 168300 = 47ms in my book, theoretical optimum. With an RTT of 47ms, you can move 16k per RTT, or about 340k/sec.

This is where I don't quite agree: for a bulk transfer, there is no RTT to account for; you only need to take into account the one-way delay. TCP does the rest for you, assuming the window is large enough.

Pierre
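The arithmetic in this exchange can be checked with a short sketch. It uses only the figures quoted in the thread (4000 km one way, 168300 km/s propagation speed, a 16k window); the function names are made up for illustration, and the model ignores serialization time, queuing and packet loss, so it gives an upper bound rather than a prediction.

```python
# Window-limited TCP throughput: at most one window of data per round trip.
# Distance, propagation speed and window size are the thread's figures;
# the function names are hypothetical.

def round_trip_time(one_way_km, speed_km_per_s=168300.0):
    """Propagation-only RTT in seconds for a path of the given one-way length."""
    return 2.0 * one_way_km / speed_km_per_s

def window_limited_rate(window_bytes, rtt_s):
    """Upper bound on throughput in bytes/s when one window is sent per RTT."""
    return window_bytes / rtt_s

rtt = round_trip_time(4000)
rate = window_limited_rate(16 * 1024, rtt)
print(f"RTT {rtt * 1000:.1f} ms, at most {rate / 1000:.0f} kB/s")
```

Under these assumptions the result lands close to the "47ms" and "about 340k/sec" quoted above.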
Re: FreeBSD performing worse than Linux?
:I managed to track the problem down to the duplex settings on both the
:Ethernet cards (AT-2500 TX, Realtek 8139 based, AFAIK) and the 10/100
:Switch. Forcing both the cards and the switch to particular settings
:cured the problem, and led to a massive performance increase.
:
:FTP seems to be particularly badly affected by the constant collisions
:(causing backoff). The problem can be tricky to find as the switch
:wasn't perceptibly showing collisions on the collision LED, but viewing
:the switch stats showed a different story!

It probably wasn't collision backoff. It was probably packet loss due to the switch and the host getting confused over the duplex setting, i.e. where the switch thought it was one thing and the host thought it was another.

The usual solution to this sort of switch/host confusion is to turn off autonegotiation on both sides and hardwire the duplex to full. On both sides. I've occasionally had similar problems (not in the last year or two, though... but definitely with older cards and switches).

-Matt

:I've noticed similar problems with Linux and certain cards (it was a
:while ago).
:
:John Vinters
:[EMAIL PROTECTED]
Re: FreeBSD performing worse than Linux?
: FWIW, I'm seeing this as well. However, this appears to be a new
: occurrence, as we were using a FreeBSD 3.X system for our reference test
: platform.
:
:Someone recently submitted a PR about TCP based NFS being significantly
:slower under 4.X. I wonder if it could be related?
:
: http://www.freebsd.org/cgi/query-pr.cgi?pr=misc/32141
:
:There is quite a lot of detail in the PR and the submitter has no
:trouble reproducing the problem.
:
: David.

Hmm. I'll play with it a bit tomorrow. Er, later today. One thing I noticed recently with NFS/TCP is that I have to run 'nfsiod -n 4' on the client to get reasonable TCP performance. I don't think I had to do that before. It sure sounds similar... like a delayed-ack problem or improper transmit side backoff.

It would be nice if someone able to reproduce the problem could test the TCP connection with newreno turned off (net.inet.tcp.newreno) and delayed acks turned off (net.inet.tcp.delayed_ack). If that fixes the problem, it narrows down our search considerably.

-Matt
Matthew Dillon [EMAIL PROTECTED]
Re: kern/31575: wrong src ip address for some ICMPs
[Redirected to -net] [Category changed to kern] On Fri, Nov 30, 2001 at 11:01:56AM +0700, Igor M Podlesny wrote: [...] [router] | X|backbone|-- | | Yip1|the same media|--[some another ip-network] |ip2|the same media|--|some box| Here is router with FreeBSD (OpenBSD, and, probably *BSD) and Some box doing traceroute to (for e.g.) a host which is _reachable_ _via_ _backbone_. X, Y -- NICs. Y has several IPs, making several ip-networks on the same media. The problem: traceroute being run on somebox will hear respond from router coming from Y.ip1 address which isn't on its (somebox) IP-network. (well, I deem icmp.echoreply isn't alone in this.) And this happens because wrong IP-addr is passed to ifaof_ifpforaddr(). My patch fixes namely this problem -- I have worked out and applied it and I believe I know what I'm talking about. Look at it, and you'll realize what I mean... You may ask me for details, but, please, don't make different situations asking me how does it correlate with -- damn lack of time... Yeah, now I see what's screwed up. I even thought about this myself this morning (well, you know the saying we use for that :-), before even reading your mail. But your fix is not quite correct, as we may have an individual routing table entry on router pointing back to somebox with a specific interface address (IFA) given, as reported by the route -vn get -host somebox command, and we should actually use that as the source. The correct fix is a bit more complicated, and fortunately I have one. Cheers, -- Ruslan Ermilov Oracle Developer/DBA, [EMAIL PROTECTED] Sunbay Software AG, [EMAIL PROTECTED] FreeBSD committer, +380.652.512.251Simferopol, Ukraine http://www.FreeBSD.org The Power To Serve http://www.oracle.com Enabling The Information Age To Unsubscribe: send mail to [EMAIL PROTECTED] with unsubscribe freebsd-hackers in the body of the message
Re: Intel gigabit driver
John Polstra wrote:

> In article [EMAIL PROTECTED], Andre Oppermann [EMAIL PROTECTED] wrote:
> > What happened at Intel? Their driver is even released under the BSD license! (and the Linux one under the GPL)
> That last bit is incorrect. The Intel driver for Linux is released under a 3-clause BSD license.

It doesn't look like a clean BSD license, though... But it's also not under the GPL as such... Anyway, after the rants here on this list from time to time about Intel's strict NDA and Open Source driver problems, I was surprised to see such a move from them.

Part of the Intel Linux GiGE driver License:

This license shall include changes to the Software that are error corrections or other minor changes to the Software that do not add functionality or features when the Software is incorporated in any version of a operating system that has been distributed under the GNU General Public License 2.0 or later. This patent license shall apply to the combination of the Software and any operating system licensed under the GNU Public License version 2.0 or later if, at the time Intel provides the Software to Recipient, such addition of the Software to the then publicly available versions of such operating system available under the GNU Public License version 2.0 or later (whether in gold, beta or alpha form) causes such combination to be covered by the Licensed Patents.

-- Andre
Re: FreeBSD performing worse than Linux?
On Fri, Nov 30, 2001 at 11:11:56AM +0100, Pierre Beyssac wrote:

> On Thu, Nov 29, 2001 at 10:28:09PM -0500, Leo Bicknell wrote:
> > 4000 km one way == 8000 km two way, 8000 / 168300 = 47ms in my book, theoretical optimum. With an RTT of 47ms, you can move 16k per RTT, or about 340k/sec.
> It's where I don't quite agree: for a bulk transfer, there is no RTT to account for, you only need to take into account the one-way delay, TCP does the rest for you assuming the window is large enough.

Assume you have a 10 ms one-way delay and an RTT of 20 ms. Let's assume your window size fits exactly the one-way delay. You start sending data until the send window is exhausted. You have been sending for 10 ms from the beginning, and at that time the first packet of your stream reaches the receiver. Now you have to stop sending data for 10 ms, because you have to wait for the first acknowledgement to arrive and free some space in the window. If the receiver delays the ACK, you have to wait longer. You can easily see that you need to take RTT + delayed-ack time into account.

-- B.Walter COSMO-Project http://www.cosmo-project.de [EMAIL PROTECTED] Usergroup [EMAIL PROTECTED]
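The stall described above can be put into a small model. This is a simplified sketch, not a TCP simulation: it assumes the sender transmits a full window at line rate, then idles until the first ACK returns one RTT (plus any ACK delay) after the first packet left. The function name and the 100 ms ACK delay are illustrative assumptions, not values from the thread.

```python
# Fraction of time a window-limited sender can actually transmit.
# window_time_s is how long it takes to send one full window at line rate.

def sender_utilization(window_time_s, rtt_s, ack_delay_s=0.0):
    """Return the share of each send/wait cycle spent transmitting (0..1)."""
    # The first ACK arrives rtt + ack_delay after the first packet was sent.
    # If the window lasts at least that long, the sender never stalls.
    cycle_s = max(window_time_s, rtt_s + ack_delay_s)
    return window_time_s / cycle_s

# The case from the message: window worth 10 ms of data, RTT of 20 ms.
print(sender_utilization(0.010, 0.020))
# Same path with a hypothetical 100 ms delayed ACK on the receiver.
print(sender_utilization(0.010, 0.020, ack_delay_s=0.100))
# A window covering more than the RTT keeps the link busy all the time.
print(sender_utilization(0.030, 0.020))
```

In this model the first case sends only half the time, which is the point of the argument: the window has to cover the full round trip, not just the one-way delay.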
Re: FreeBSD performing worse than Linux?
Quoting Bruce A. Mah ([EMAIL PROTECTED]):

> How early in November? I'm staring at this commit message and wondering if it has any relevance to your situation:
> -
> revision 1.107.2.18
> date: 2001/11/12 22:11:24; author: nate; state: Exp; lines: +3 -1
> MFH: V1.139
> when newreno is turned on, if dupacks = 1 or dupacks = 2 and new data is acknowledged, reset the dupacks to 0. The problem was spotted when a connection had its send buffer full because the congestion window was only 1 MSS and was not being incremented because dupacks was not reset to 0.
> Reviewed by: jlemon
> -

Kernel was built on November 7, probably from sources a day or two earlier. The source tree has been updated since the build. Nate's commit above is not in my kernel.

For those interested, the server and client dumps are here:

http://www.irbs.net/server-dump.html
http://www.irbs.net/client-dump.html

The server clock looks like it's about 900 ms ahead of the client.

John Capo
Re: Intel gigabit driver
In article [EMAIL PROTECTED], Andre Oppermann [EMAIL PROTECTED] wrote: John Polstra wrote: That last bit is incorrect. The Intel driver for Linux is released under a 3-clause BSD license. I doesn't look like a clean BSD license thought... But it's also not under the GPL as such... Anyway, after the rants here on this list from time to time about Intel's strict NDA and Open Source driver problems I was surprised to see such a move from them. Part of the Intel Linux GiGE driver License: This license shall include changes to the Software that are error corrections or other minor changes to the Software that do not add functionality or features when the Software is incorporated in any version of a operating system that has been distributed under the GNU General Public License 2.0 or later. This patent license shall [...] Maybe you have an old version of the driver. I have e1000-3.1.23.tar.gz, which I grabbed from developer.intel.com a few weeks ago. I grepped all of the files in it, and the word GNU doesn't appear anywhere. There is a file named LICENSE which is just a standard BSD license. I'll append it below. John Copyright (c) 1999 - 2001, Intel Corporation All rights reserved. Redistribution and use in source and binary forms, with or without modification, are permitted provided that the following conditions are met: 1. Redistributions of source code must retain the above copyright notice, this list of conditions and the following disclaimer. 2. Redistributions in binary form must reproduce the above copyright notice, this list of conditions and the following disclaimer in the documentation and/or other materials provided with the distribution. 3. Neither the name of Intel Corporation nor the names of its contributors may be used to endorse or promote products derived from this software without specific prior written permission. 
THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS ``AS IS'' AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. [end] -- John Polstra John D. Polstra Co., Inc.Seattle, Washington USA Disappointment is a good sign of basic intelligence. -- Chögyam Trungpa To Unsubscribe: send mail to [EMAIL PROTECTED] with unsubscribe freebsd-hackers in the body of the message
Re: Intel gigabit driver
John Polstra wrote: In article [EMAIL PROTECTED], Andre Oppermann [EMAIL PROTECTED] wrote: John Polstra wrote: That last bit is incorrect. The Intel driver for Linux is released under a 3-clause BSD license. I doesn't look like a clean BSD license thought... But it's also not under the GPL as such... Anyway, after the rants here on this list from time to time about Intel's strict NDA and Open Source driver problems I was surprised to see such a move from them. Part of the Intel Linux GiGE driver License: This license shall include changes to the Software that are error corrections or other minor changes to the Software that do not add functionality or features when the Software is incorporated in any version of a operating system that has been distributed under the GNU General Public License 2.0 or later. This patent license shall [...] Maybe you have an old version of the driver. I have e1000-3.1.23.tar.gz, which I grabbed from developer.intel.com a few weeks ago. I grepped all of the files in it, and the word GNU doesn't appear anywhere. There is a file named LICENSE which is just a standard BSD license. I'll append it below. You've got an old one. The newest Linux driver on intel.com is e1000-3.5.19.tar.gz. And it talks about the GPL. -- Andre To Unsubscribe: send mail to [EMAIL PROTECTED] with unsubscribe freebsd-hackers in the body of the message
Re: FreeBSD performing worse than Linux?
(snip...a large number of postings regarding slow performance by 4.x kernels with TCP/IP)

A friend who works for a local university and I tried moving large files using various OSes and hardware. These are FTP transfers with file sizes from 100 to 300 megabytes. The conclusion we arrived at was that the TCP performance of FreeBSD 4.x and Linux is approximately the same and that processor speed makes the most difference. In one case, a fast laptop with a 16-bit PCMCIA NIC did poorly.

Moving large files on 100 Mb/s Ethernet backbones gave the following results:

* Dual 800 MHz PIII processors with Linux 6.1: 10 MB/s
* Sunblade 100's: 10 MB/s
* Single 1.4 GHz processors (noname box) with 3C905 NICs, FreeBSD-stable (June 2001): 9.5 MB/s

In the cases where we had only one machine of a type, we used the dual 800 MHz machines as a sink, with the following results (this is probably questionable):

* Dual 333, Linux 5.1: 5 MB/s
* Pentium 350 III with 3C905 NIC, Linux 5.1: 2 MB/s
* K6-2 400 with SMC NIC, Linux 5.1: 2.8 MB/s
* Dell 500 MHz PowerEdge with 3C905 NIC to HP Netserver PII 266, both running 4.3-RELEASE: 3.0 MB/s
* Dell 500 MHz PowerEdge with 4.3 to Dell 850 MHz laptop running 4.4 with D-Link PCMCIA Ethernet card: 1.0 MB/s (caused by PCMCIA NIC?)
* PIIMMX 200 MHz box running 4.4-RELEASE with 3C905 to same Dell laptop: 500 kB/s

Unfortunately, we didn't have any 7.x Linux or 3.X FreeBSD available.

FWIW...
Jim Durham
Re: Intel gigabit driver
In article [EMAIL PROTECTED], Andre Oppermann [EMAIL PROTECTED] wrote: John Polstra wrote: Maybe you have an old version of the driver. I have e1000-3.1.23.tar.gz, which I grabbed from developer.intel.com a few weeks ago. I grepped all of the files in it, and the word GNU doesn't appear anywhere. There is a file named LICENSE which is just a standard BSD license. I'll append it below. You've got an old one. The newest Linux driver on intel.com is e1000-3.5.19.tar.gz. And it talks about the GPL. Whoops! :-} Thanks for straightening me out! John -- John Polstra John D. Polstra Co., Inc.Seattle, Washington USA Disappointment is a good sign of basic intelligence. -- Chögyam Trungpa To Unsubscribe: send mail to [EMAIL PROTECTED] with unsubscribe freebsd-hackers in the body of the message
Re: FreeBSD performing worse than Linux?
On Thu, Nov 29, 2001 at 10:31:56PM -0500, Jonathan M. Slivko wrote:

> If you give me your IP address, I can ping *from* Columbia.edu to your machine and see what I get, that should pretty much solve any issues that may arise.

pun.isi.edu 128.9.160.150

Thanks.
Re: device major number
On Fri, Nov 30, 2001 at 10:02:56AM +0200, Danny Braniss wrote:

> stupid question, but couldn't (yet) find an answer, I'm writing a driver, so I need a major device number (for -stable), is there a list of assigned numbers, and if so where? what's the procedure to 'assign' one?

The list is in src/sys/conf/majors. I believe the normal advice is to use one of the internal ones (200-252) while developing and switch to a new one when the driver is done.

-- Brooks

-- Any statement of the form X is the one, true Y is FALSE. PGP fingerprint 655D 519C 26A7 82E7 2529 9BF0 5D8E 8BE9 F238 1AD4
Re: FreeBSD performing worse than Linux?
> : FWIW, I'm seeing this as well. However, this appears to be a new
> : occurrence, as we were using a FreeBSD 3.X system for our reference test
> : platform.
> :
> :Someone recently submitted a PR about TCP based NFS being significantly
> :slower under 4.X. I wonder if it could be related?
> :
> : http://www.freebsd.org/cgi/query-pr.cgi?pr=misc/32141
> :
> :There is quite a lot of detail in the PR and the submitter has no
> :trouble reproducing the problem.
> :
> : David.
>
> Hmm. I'll play with it a bit tomorrow. Er, later today. One thing I noticed recently with NFS/TCP is that I have to run 'nfsiod -n 4' on the client to get reasonable TCP performance. I don't think I had to do that before. It sure sounds similar... like a delayed-ack problem or improper transmit side backoff.
>
> It would be nice if someone able to reproduce the problem can test the TCP connection with newreno turned off (net.inet.tcp.newreno) and delayed acks turned off (net.inet.tcp.delayed_ack). If that fixes the problem it narrows down our search considerably.

John Capo replied that turning off both did not help his setup any.

I was supposed to be testing things yesterday, but the guys got pulled away on another project. Perhaps today I'll get a chance to get some tcpdumps and some more test data.

Nate
Re: FreeBSD performing worse than Linux?
Looking at the complete dump on the server more closely, I see what's happening. The server didn't jump ahead in the stream. The client side of these tests is on a fractional T1. In about 60 ms the server pushed a window's worth of data, about 200 packets since the payload was small, 48 bytes. (48 + IP + TCP) * 200 is around 17 KB in 60 ms, which overflowed the frame switch queue.

The other part of the dump, where the server is acked for a segment just sent but does not send the next segment until a duplicate ack is received better than a second later, is still suspect to me.

John Capo

Quoting Sergey Babkin ([EMAIL PROTECTED]):

> And here a _very_ pathological thing has happened: the server just forgot to send the data between sequence numbers 12937 and 28049. Since the dump was done on the server side, this suggests that something very bad has happened with the TCP state on the server side. Possibly the value of the current sequence number in the protocol control block got overwritten by something.
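The "(48 + IP + TCP) * 200" estimate can be worked through as follows. This is a rough sketch that assumes 20-byte IPv4 and 20-byte TCP headers with no options (the dump may well show options such as timestamps, which would make the burst larger) and compares the burst against a full T1 line rate, since the actual fraction of the T1 is not stated. The constant and function names are illustrative.

```python
# Size and instantaneous rate of the burst described above.
IP_HEADER_BYTES = 20        # assumption: IPv4 header without options
TCP_HEADER_BYTES = 20       # assumption: TCP header without options
T1_BITS_PER_S = 1_544_000   # full T1 line rate; a fractional T1 is slower

def burst_bytes(payload_bytes, packets):
    """Total bytes on the wire (above the link layer) for a burst of packets."""
    return (payload_bytes + IP_HEADER_BYTES + TCP_HEADER_BYTES) * packets

def burst_rate_bits_per_s(total_bytes, seconds):
    """Average bit rate needed to carry the burst in the given time."""
    return total_bytes * 8 / seconds

total = burst_bytes(48, 200)
rate = burst_rate_bits_per_s(total, 0.060)
print(f"{total} bytes, {rate / 1e6:.2f} Mbit/s, "
      f"{rate / T1_BITS_PER_S:.2f}x a full T1")
```

Even against a full T1 the burst arrives faster than the line can drain it, so a queue somewhere in the path has to absorb the difference, which is consistent with the overflow described in the message.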
TCP Performance Graphs
Since the topic has come up again, I'll provide some graphs, and go back to my suggestion to see if it gets some traction this time around.

http://www.ufp.org/~bicknell/fbsdtcp.png

This graph shows the theoretical maximum performance of FreeBSD's TCP stack (assuming a network with ample free bandwidth, no router buffering, no dropped packets, etc). The red curve is with the existing (16k) window. I've used a scale of 0 to 100 ms RTT, as I think that's the range you should find in the continental US in the real world. Obviously higher values would be needed to make transoceanic hops, satellite hops, or other cases work.

As you can see, we should be able to fill a T1 up to about 83 ms RTT, Ethernet up to about 16 ms RTT, and DS-3 up to about 3 ms. My 'rough estimate' on the real world is you can get about 75% of those figures across what we know and love as the Internet, so you could fill a T1 over a connection with an RTT of about 62 ms.

The question that immediately comes to mind is, why not simply use as big a value as possible? The problem comes down to buffering the data, and busy servers may have to buffer a lot of data. Having a 1 meg window size may have you buffer 1 meg per connection. Note that FreeBSD's current buffer management is particularly stupid in that it will _always_ buffer 1 Meg, need it or not. Until we fix this we need an interim solution.

Most of the commercial Unix vendors as well as Linux have moved to a 32k default. This is the green line marked 'proposal' in the graph. This will, on average, double the network memory used. If you want to see the impact of this on your own systems, run 'netstat -m', and consider the worst case of doubled usage. I suspect virtually all server admins won't care about the additional memory if it means additional performance. 32k windows, as the graph shows, let you saturate a T1 with a nice buffer.
With T1, DSL, and cable modems being common now, I feel very strongly that out-of-the-box ability to saturate these links is essential to make people believe FreeBSD is a good performer. It also provides a nice boost (double throughput, imagine that) to users off Ethernet hubs behind higher speed connections. They can now get full Ethernet speed up to about 32 ms, which opens up a significant number of network sites.

I've also included 64k, the largest value that can be used without window scaling support. For now, I would consider 64k to be the largest default we should even think about, and it may not be a good idea due to the larger memory footprint. That said, DRAM prices are at an all time low, so now may be the time to poke people to buy more if they want real performance.

I don't know who can move this forward, but I'd really like to see 32k windows be the default in the next release. I think 32k could go into current immediately, and stable nearly immediately, to start to get some feedback and ensure there are no major issues.

Finally, many people keep replying that applications can set larger window sizes on their own, so this is unnecessary. While true, this is completely impractical for a number of reasons:

* End users won't. They expect it to work out of the box. Tweaking a setting is unacceptable.
* Every bulk transfer application would have to be modified. Take a look in ports, see if you think that's a good idea.
* Non-bulk transfer applications can become bulk transfer applications. For instance, is an ssh session an interactive session, or really an scp of a large file?
* Hard coding these values into thousands of programs will make future upgrades (when network speed and memory allow) infinitely harder.

If Linux and most of the commercial vendors have found 32k to be an acceptable value I think it's time FreeBSD join them. We should be leading, not last to adopt.
(Note, for those curious about another view, try http://www.ufp.org/~bicknell/fbsdtcp3d.png)

-- 
Leo Bicknell - [EMAIL PROTECTED] - CCIE 3440
PGP keys at http://www.ufp.org/~bicknell/
Read TMBG List - [EMAIL PROTECTED], www.tmbg.org
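The curves described in this post follow from the usual rule of thumb that a single TCP connection can deliver at most one window of data per round trip. The C sketch below works through that arithmetic for the three window sizes discussed. The function names and the nominal line rates (1.544 Mbit/s for T1, 10 Mbit/s for Ethernet, 44.736 Mbit/s for DS-3) are assumptions made for illustration and are not taken from the post, so the results land close to the figures read off the graph without matching them exactly.

```c
#include <stdio.h>

/* Longest round-trip time (ms) at which one connection with a fixed window
 * can still keep a link of the given rate full. */
double max_rtt_ms(double window_bytes, double link_bits_per_sec)
{
    return window_bytes * 8.0 / link_bits_per_sec * 1000.0;
}

/* Upper bound on one connection's throughput (bytes/sec): at most one
 * full window can be delivered per round trip. */
double max_throughput(double window_bytes, double rtt_ms)
{
    return window_bytes / (rtt_ms / 1000.0);
}

/* Print the break-even RTT for each window size on each link. */
void print_break_even_table(void)
{
    /* Nominal line rates in bits/sec; assumed values, not taken from the post. */
    const char *names[] = { "T1", "Ethernet", "DS-3" };
    const double rates[] = { 1544000.0, 10000000.0, 44736000.0 };
    const double windows[] = { 16384.0, 32768.0, 65535.0 };
    int i, j;

    for (i = 0; i < 3; i++)
        for (j = 0; j < 3; j++)
            printf("%-8s window %6.0f bytes: full up to %6.1f ms RTT\n",
                   names[i], windows[j], max_rtt_ms(windows[j], rates[i]));
}
```

Calling print_break_even_table() from a small main() prints one line per link and window size. Doubling the window doubles the RTT at which a link can still be filled, which is the "double throughput" effect mentioned above.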
Re: FreeBSD performing worse than Linux?
In article local.mail.freebsd-hackers/[EMAIL PROTECTED] you write:
> Quoting Sergey Babkin ([EMAIL PROTECTED]):
> John Capo wrote:
> 21:41:49.001039 client.4427 > server.22: P 144:192(48) ack 12937 win 17376 <nop,nop,timestamp 53827954 105528895> (DF) [tos 0x10]
> 21:41:49.001073 server.22 > client.4427: . 28049:29497(1448) ack 192 win 17328 <nop,nop,timestamp 105529049 53827954> (DF) [tos 0x10]
> 21:41:49.001085 server.22 > client.4427: P 29497:30313(816) ack 192 win 17328 <nop,nop,timestamp 105529049 53827954> (DF) [tos 0x10]
> 21:41:49.109131 client.4427 > server.22: . ack 12937 win 17376 <nop,nop,timestamp 53827967 105528895> (DF) [tos 0x10]
>
> And here a _very_ pathological thing has happened: the server just forgot to send the data between sequence numbers 12937 and 28049. Since the dump was done on the server side, this suggests that something very bad has happened with the TCP state on the server side. Possibly the value of the current sequence number in the protocol control block got overwritten by something.

I don't believe this is happening. It looks like the server blasts everything over to the client, and the client drops a whole bunch of segments. When the server gets the dupack, it correctly performs a fast retransmit and continues transmitting where it left off.

server side:

    21:41:46.396051 client.4427 > server.22: . ack 11489 win 17376
    21:41:46.418208 client.4427 > server.22: . ack 11489 win 17376
    21:41:47.460903 server.22 > client.4427: . 11489:12937(1448) ack 144 win 17376

client side:

    21:41:46.712307 server.22 > client.4427: P 11441:11489(48) ack 144 win 17376
    21:41:46.763034 server.22 > client.4427: . 25937:27385(1448) ack 144 win 17376
    21:41:46.763106 client.4427 > server.22: . ack 11489 win 17376
    21:41:46.785324 server.22 > client.4427: P 27385:28049(664) ack 144 win 17376
    21:41:46.785370 client.4427 > server.22: . ack 11489 win 17376
    21:41:47.936278 server.22 > client.4427: . 11489:12937(1448) ack 144 win 17376

However, at this point, the client has no further packets, so the server really needs to enter slow start and retransmit everything starting at 12937. Instead, it seems that the server remains in congestion avoidance, and keeps sending at the leading edge of the window, performing fast retransmits.

John, please try tweaking this sysctl:

    sysctl -w net.inet.tcp.local_slowstart_flightsize=1

which should force the server to start out doing slow start. This isn't exactly a fix for the above problem, but may help avoid getting into the situation in the first place.

-- 
Jonathan
Re: Timedout SCB already complete
I have been able to fix this bug in my KLD. I forgot to add a splbio() protection in a function. On Thu, 29 Nov 2001, Zhihui Zhang wrote: While running my KLD that does a lot of I/O, I see the following message: ahc0: Timedout SCB already complete. interrupts may not be functioning. This happens after my KLD runs a while. What could be the problem? Where could the bugs likely exist? Thanks for any clue. -Zhihui To Unsubscribe: send mail to [EMAIL PROTECTED] with unsubscribe freebsd-hackers in the body of the message
Re: TCP Performance Graphs
The default window size (controlled by the socket buffer size) can be globally modified using sysctl variables: net.inet.tcp.sendspace: 16384 net.inet.tcp.recvspace: 16384 As you mention, changing this (and other things such as the amount of mbufs/clusters, etc.etc.) must be done considering the hw configuration and other issues. It is not a big deal to move the default to 32 or 64k, and I'd vote for that, but if a sysadmin is unable to have a look at this, then the problem is in the sysadmin, not in FreeBSD! cheers luigi --+- Luigi RIZZO, [EMAIL PROTECTED] . ACIRI/ICSI (on leave from Univ. di Pisa) http://www.iet.unipi.it/~luigi/ . 1947 Center St, Berkeley CA 94704 Phone: (510) 666 2927 --+- To Unsubscribe: send mail to [EMAIL PROTECTED] with unsubscribe freebsd-hackers in the body of the message
Re: TCP Performance Graphs
Leo Bicknell writes:
> The question that immediately comes to mind is, why not simply use as big a value as possible? The problem comes down to buffering the data, and busy servers may have to buffer a lot of data. Having a 1 meg window size may have you buffer 1 meg per connection. Note that FreeBSD's current buffer management is particularly stupid in that it will _always_ buffer 1 Meg, need it or not. Until we fix this we need an interim solution.

I thought that I heard a few months ago that Matt Dillon was looking at ways to dynamically size tcp windows from within the kernel. Maybe I'm on crack.

Maybe we should look at the Dynamic Right-Sizing work being done at LANL. See "Dynamic Adjustment of TCP Window Sizes" and "Dynamic Right-Sizing: A Simulation Study" at http://public.lanl.gov/radiant/publications.html

Cheers,

Drew
Re: TCP Performance Graphs
On Fri, Nov 30, 2001 at 01:47:41PM -0500, Andrew Gallatin wrote: I thought that I heard a few months ago that Matt Dillon was looking at ways to dynamically size tcp windows from within the kernel. Maybe I'm on crack. He is. It is very good work that I wish I could spend more time helping with, as it is clearly the long term solution to this problem. That said, it's far enough out I think we need a temporary fix in increasing the defaults. -- Leo Bicknell - [EMAIL PROTECTED] - CCIE 3440 PGP keys at http://www.ufp.org/~bicknell/ Read TMBG List - [EMAIL PROTECTED], www.tmbg.org To Unsubscribe: send mail to [EMAIL PROTECTED] with unsubscribe freebsd-hackers in the body of the message
Re: TCP Performance Graphs
On Fri, Nov 30, 2001 at 10:29:28AM -0800, Luigi Rizzo wrote: It is not a big deal to move the default to 32 or 64k, and I'd vote for that, but if a sysadmin is unable to have a look at this, then the problem is in the sysadmin, not in FreeBSD! I disagree, on two points: * Many people use FreeBSD as a desktop OS. Think the same people who use Win98, but only slightly smarter. These people are 'sysadmins' only in the sense that they have a root password. When FreeBSD can't fill their DSL line and Linux can, they will switch to Linux never knowing what the real problem was. * Most sysadmins shouldn't be bothered with this. People running news or IRC servers, or huge (100+ box) web farms might know these tricks, but the guy who sets up a server to dump 100k/sec average of web pages shouldn't be bothered. To extend your logic, we might as well make it default to 4k, since that is the most resource conservative, and anyone who cares will increase it. -- Leo Bicknell - [EMAIL PROTECTED] - CCIE 3440 PGP keys at http://www.ufp.org/~bicknell/ Read TMBG List - [EMAIL PROTECTED], www.tmbg.org To Unsubscribe: send mail to [EMAIL PROTECTED] with unsubscribe freebsd-hackers in the body of the message
Re: TCP Performance Graphs
On Fri, Nov 30, 2001 at 02:11:00PM -0500, Leo Bicknell wrote: ... * Many people use FreeBSD as a desktop OS. Think the same people who use Win98, but only slightly smarter. These people are 'sysadmins' only in the sense that they have a root password. When FreeBSD can't fill their DSL line and Linux can, they will switch to Linux never knowing what the real problem was. we are going to gain/lose these people at any blow of wind, any spam that says X is better than Y will cause them to switch, and they'll never bother to read why or how to cure it. Do we care ? Maybe. Do we have the energy to fight FUD ? I doubt it. To extend your logic, we might as well make it default to 4k, since that is the most resource conservative, and anyone who cares will increase it. My logic is that I would like to increase the default to 32 or 64k, but if this involves starting an endless discussion to reach consensus on whether this can be done or not, I prefer to fight other battles (maybe equally pointless). cheers luigi To Unsubscribe: send mail to [EMAIL PROTECTED] with unsubscribe freebsd-hackers in the body of the message
RE: TCP Performance Graphs
Dude, the statement was that Luigi is in favor of _increasing_ the default size. How do you extend his logic to say it might as well be reduced to 4k? Please don't put words in people's mouths. Daniel D-man Manesajian -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]]On Behalf Of Leo Bicknell Sent: Friday, November 30, 2001 11:11 AM To: Luigi Rizzo Cc: [EMAIL PROTECTED] Subject: Re: TCP Performance Graphs On Fri, Nov 30, 2001 at 10:29:28AM -0800, Luigi Rizzo wrote: It is not a big deal to move the default to 32 or 64k, and I'd vote for that, but if a sysadmin is unable to have a look at this, then the problem is in the sysadmin, not in FreeBSD! I disagree, on two points: * Many people use FreeBSD as a desktop OS. Think the same people who use Win98, but only slightly smarter. These people are 'sysadmins' only in the sense that they have a root password. When FreeBSD can't fill their DSL line and Linux can, they will switch to Linux never knowing what the real problem was. * Most sysadmins shouldn't be bothered with this. People running news or IRC servers, or huge (100+ box) web farms might know these tricks, but the guy who sets up a server to dump 100k/sec average of web pages shouldn't be bothered. To extend your logic, we might as well make it default to 4k, since that is the most resource conservative, and anyone who cares will increase it. -- Leo Bicknell - [EMAIL PROTECTED] - CCIE 3440 PGP keys at http://www.ufp.org/~bicknell/ Read TMBG List - [EMAIL PROTECTED], www.tmbg.org
Re: more on jail - suitable for multi user system ?
Joesh Juphland writes:
> One thing I would like to do as a hobby is start a classic multi-user unix system and give out shell accounts to whoever wants one. Not a money maker, of course, but it would be fun. My question: does anyone have any comments on using `jail` in a public environment like this - that is, instead of giving away individual shell accounts, you would give away individual jails - basically a whole separate machine with its own IP and own root access, etc. ?

Full jails (that is, every jail runs its own sshd) require a different IP for every jail, so a big IP alias list on one interface is needed. I am thinking about assigning a whole network to the interface instead of only a host address. It is also possible to share the same IP on different ports. I usually mount /etc into the jail read-only to prevent changes to the port/jail mapping at startup, and restrict local_startup=/etc/rc.d. I have a startup script that automatically assigns the IP and the mounts for a starting jail. The downside of a jailed shell is the restrictions on raw sockets (no ping and traceroute) and shared memory.

> I am not asking about the commercial viability - it's just a hobby system. But in terms of limiting resources (so no one user bogs down the whole system) and in terms of security (nobody can turn rogue and bring down / compromise the system) is this a viable option ?

Jail is not ideal, but it is better than no jail. There is another answer on the list about resources.

> Or is jail best kept to environments where the users are in-house (trusted)

The best untrusted user is a dead user :-) The best live untrusted user is a jailed one.

> Another way of asking this would be, was jail developed for, and best used for, creating a safe area for daemons like httpd, or was it developed with running many full-blown independent systems on a single machine in mind ?

I don't know the developers' minds, but the safe area that jail creates for daemons like pop, smtpd (any kind), named, ntpd (paired with a non-jailed ntpd), and so on is good enough now. /bin/sh and friends are evil even in jail.

> _any_ comments appreciated.

Sorry, my English is worse than my knowledge.

-- 
@BABOLO http://links.ru/
Re: TCP Performance Graphs
* Luigi Rizzo [EMAIL PROTECTED] [011130 13:26] wrote: On Fri, Nov 30, 2001 at 02:11:00PM -0500, Leo Bicknell wrote: ... * Many people use FreeBSD as a desktop OS. Think the same people who use Win98, but only slightly smarter. These people are 'sysadmins' only in the sense that they have a root password. When FreeBSD can't fill their DSL line and Linux can, they will switch to Linux never knowing what the real problem was. we are going to gain/lose these people at any blow of wind, any spam that says X is better than Y will cause them to switch, and they'll never bother to read why or how to cure it. Do we care ? Maybe. Do we have the energy to fight FUD ? I doubt it. To extend your logic, we might as well make it default to 4k, since that is the most resource conservative, and anyone who cares will increase it. My logic is that I would like to increase the default to 32 or 64k, but if this involves starting an endless discussion to reach consensus on whether this can be done or not, I prefer to fight other battles (maybe equally pointless). I was about to set the default in -stable to Leo's suggested values, it seems that -current already has the delta he wants in it, my question is, was anything else changed along the lines of the number of nmbclusters allocated in -current to go along with this change? -- -Alfred Perlstein [[EMAIL PROTECTED]] 'Instead of asking why a piece of software is using 1970s technology, start asking why software is ignoring 30 years of accumulated wisdom.' http://www.morons.org/rants/gpl-harmful.php3 To Unsubscribe: send mail to [EMAIL PROTECTED] with unsubscribe freebsd-hackers in the body of the message
Re: TCP Performance Graphs
* Alfred Perlstein [EMAIL PROTECTED] [011130 13:51] wrote: I was about to set the default in -stable to Leo's suggested values, it seems that -current already has the delta he wants in it, my question is, was anything else changed along the lines of the number of nmbclusters allocated in -current to go along with this change? It seems not, I've committed the change. -- -Alfred Perlstein [[EMAIL PROTECTED]] 'Instead of asking why a piece of software is using 1970s technology, start asking why software is ignoring 30 years of accumulated wisdom.' http://www.morons.org/rants/gpl-harmful.php3 To Unsubscribe: send mail to [EMAIL PROTECTED] with unsubscribe freebsd-hackers in the body of the message
[OT] alarm() question
Apologies for this being more C than freebsd, but I did say OT in the subject...

In the most basic use of an alarm, like this:

    #include <stdio.h>
    #include <stdlib.h>
    #include <unistd.h>
    #include <signal.h>

    /* Runs when the 3-second timer expires. */
    static void bzzt(int sig)
    {
        printf("In routine bzzt now, timer expired after 3 seconds\n");
    }

    int main(void)
    {
        signal(SIGALRM, bzzt);
        alarm(3);
        system("/usr/bin/host -t soa 111.0.12.in-addr.arpa");
        printf("Done\n");
        return 0;
    }

Why does the alarm go off but not interrupt the system call? bzzt() is executed, but the program doesn't print Done and exit for a minute plus.

Pointers to FM to RT welcome.

Thanks,

--- David

To Unsubscribe: send mail to [EMAIL PROTECTED] with unsubscribe freebsd-hackers in the body of the message
Re: TCP Performance Graphs
On Fri, 30 Nov 2001, Leo Bicknell wrote: Since the topic has come up again, I'll provide some graphs, and go back to my suggestion to see if it gets some traction this time around. http://www.ufp.org/~bicknell/fbsdtcp.png I don't think anyone's doubting the importance of larger windows; it's just that we can't do much increasing until they're dynamic. That being said, Matt did post a patch which implements socket buffer autoscaling a few months back. I've been meaning to review it, but haven't had the time. If you can give it some good testing and prove that it provides better performance in most cases (and hopefully no regressions), I suspect that might provide the momentum to get it looked at by more people and committed. Mike Silby Silbersack To Unsubscribe: send mail to [EMAIL PROTECTED] with unsubscribe freebsd-hackers in the body of the message
Re: FreeBSD performing worse than Linux?
Well, this is embarrassing. I can reproduce this completely running 4.4-stable (Nov 17th kernel) on two machines. With newreno turned on, a TCP NFS mount only gets 80K/sec. With newreno turned off on the transmit side, a TCP NFS mount gets 7MB/sec. The state of the delayed-ack sysctl is irrelevant. This is without running any nfsiod's (which would mask the degradation of the synchronous messaging).

I am tracking it down now.

-Matt
Matthew Dillon [EMAIL PROTECTED]

: (I wrote)
:Hmm. I'll play with it a bit tomorrow. Er, later today. One thing
:I noticed recently with NFS/TCP is that I have to run 'nfsiod -n 4'
:on the client to get reasonable TCP performance. I don't think I
:had to do that before. It sure sounds similar... like a delayed-ack
:problem or improper transmit side backoff.
:
:It would be nice if someone able to reproduce the problem can test
:the TCP connection with newreno turned off (net.inet.tcp.newreno)
:and delayed acks turned off (net.inet.tcp.delayed_ack). If that
:fixes the problem it narrows down our search considerably.
:
:Hello, I am the submitter of PR 32141 mentioned above,
:
:I did check it with a 4.3 Release server and 4.2 Release client using
:'mount_nfs -3 -T ...':
:setting net.inet.tcp.newreno=0 gives fast performance (about 8 Mbyte/s, same
:for udp mount), setting net.inet.tcp.newreno=1 gives 80kbyte/s.
:Setting net.inet.tcp.delayed_ack=0 has no influence (I checked all 4
:combinations).
:There is 'nfsiod -n 4' running at clientside (default setting if enabling
:nfs via sysinstall). We did not play around with nfsiod settings so far.
:
:Hope that helps
:
: Alexander
:
:--
:Alexander Haderer Charite
Re: TCP Performance Graphs
:I don't think anyone's doubting the importance of larger windows; it's
:just that we can't do much increasing until they're dynamic.
:
:That being said, Matt did post a patch which implements socket buffer
:autoscaling a few months back. I've been meaning to review it, but
:haven't had the time. If you can give it some good testing and prove that
:it provides better performance in most cases (and hopefully no
:regressions), I suspect that might provide the momentum to get it
:looked at by more people and committed.
:
:Mike Silby Silbersack

One of the things that came out of that conversation, however, was that it should be safe to increase the receive-side window, because programs typically drain the receive buffers the moment data comes in. So I think we can safely increase the default net.inet.tcp.recvspace from 16384 to 32768 immediately.

The transmit side requires more thought. I did write that patch, and it does work, but it's too messy for my tastes. I would personally much rather rewrite it to (A) fix the RTT stored in the route tables and (B) adjust the transmit window based on that, which is a much less sophisticated patch (and less messy), but ought to work quite well in regards to transmit buffer management.

After I figure out this 80K/sec problem I'll revisit the transmit-side buffer limiting based on my new proposal above.

-Matt
Matthew Dillon [EMAIL PROTECTED]
Found the problem, w/patch (was Re: FreeBSD performing worse than Linux?)
I believe I have found the problem. The transmit side has a maximum burst count imposed by newreno. As far as I can tell, if this maxburst is hit (it defaults to 4 packets), the transmitter just stops - presumably until it receives an ack.

Now, theoretically this should work just fine... send four packets, receive the first ack and send the next four packets... it should allow us to fill the window geometrically. I believe the idea is to give transmit packets a chance to include acks for received data in a reasonable period of time... I'm not sure, it's J Lemon's commit (from the original newreno commits) so maybe he can work it out.

However, if the receiver has delayed-acks turned on, only one ack is returned for all four packets. The next four are then sent and one ack is returned. I believe this is the cause of the problem. It effectively destroys the TCP window, forcing it to around 1.5Kx4 = 6K. This also explains why performance is so weird... if more than one delayed ack happens to occur per burst you get 'bumps' in the performance.

Without the patch, two things will solve or partially solve the problem:

* Turn off delayed acks on the receiver (performance 80K->6.8MB/sec) OR
* Turn off newreno on the transmitter. (performance 80K->7.9MB/sec)

The patch below kills the burst limit on the transmit side and appears to solve the problem permanently. I'm sure I'm breaking something in the newreno RFC, but I am going to commit it to both branches now because our current situation is horrible.
-Matt

Index: tcp_output.c
===================================================================
RCS file: /home/ncvs/src/sys/netinet/tcp_output.c,v
retrieving revision 1.39.2.10
diff -u -r1.39.2.10 tcp_output.c
--- tcp_output.c	2001/07/07 04:30:38	1.39.2.10
+++ tcp_output.c	2001/11/30 21:18:10
@@ -912,7 +912,14 @@
 	tp->t_flags &= ~TF_ACKNOW;
 	if (tcp_delack_enabled)
 		callout_stop(tp->tt_delack);
+#if 0
+	/*
+	 * This completely breaks TCP if newreno is turned on
+	 */
 	if (sendalot && (!tcp_do_newreno || --maxburst))
+		goto again;
+#endif
+	if (sendalot)
 		goto again;
 	return (0);
 }
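The "1.5Kx4 = 6K" estimate in this message can be turned into a rough throughput figure. The C sketch below is only a back-of-the-envelope model: it assumes 1448-byte segments (the payload size visible in the tcpdump output elsewhere in this thread) and assumes the receiver's delayed-ACK timer fires roughly every 100 ms, which is not stated in the message. The function names are made up for illustration.

```c
/* Bytes the sender can have in flight when it stops after `burst_segments`
 * segments and waits for an ACK before sending more. */
double burst_window_bytes(int burst_segments, int segment_bytes)
{
    return (double)burst_segments * (double)segment_bytes;
}

/* Throughput (bytes/sec) if each burst is released only when an ACK arrives
 * and ACKs arrive once every `ack_interval_ms` milliseconds. */
double burst_limited_throughput(int burst_segments, int segment_bytes,
                                double ack_interval_ms)
{
    return burst_window_bytes(burst_segments, segment_bytes)
           / (ack_interval_ms / 1000.0);
}
```

Under those assumptions, burst_limited_throughput(4, 1448, 100.0) comes out a little under 60 KB/sec, the same order of magnitude as the 80K/sec reported in the thread. The gap is plausibly covered by the occasional extra ACK per burst that the message calls 'bumps', though that is a guess.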
Re: FreeBSD performing worse than Linux?
Weird, on my Dual PII 300 I'm getting around 8MB/sec to an 800MHz Athlon. The Athlon is using a 3Com 905B I believe, and the PII is using an Intel fxp type card. Granted this is from my living room to my bedroom so that may be part of what I see. Also, the Dual PII is running -STABLE as of a week ago, and the Athlon is running -CURRENT as of about a week ago.

Ken

On Fri, 30 Nov 2001, James C. Durham wrote:
> (snip...a large number of postings regarding slow performance by 4.x kernels with TCP/IP)
>
> A friend who works for a local university and I tried moving large files using various OSes and hardware. These are FTP transfers with file sizes from 100 to 300 megabytes.
>
> The conclusion we arrived at was that the TCP performance of FreeBSD 4.x and Linux is approximately the same and that processor speed makes the most difference. In one case, a fast laptop with a 16 bit PCMCIA NIC did poorly.
>
> Moving large files on 100Mb/s Ethernet backbones gave the following results...
>
>   Dual 800MHz PIII processors with Linux 6.1: 10MB/s
>   Sunblade 100's: 10MB/s
>   Single 1.4GHz processors (noname box) with 3C905 NICs, FreeBSD-stable (June 2001): 9.5MB/s
>
> In the case where we had only one machine of a type, we used the dual 800MHz machines as a sink, with the following results (this is probably questionable):
>
>   Dual 333, Linux 5.1: 5MB/s
>   Pentium 350 III with 3C905 NIC, Linux 5.1: 2MB/s
>   K6-2 400 with SMC NIC, Linux 5.1: 2.8MB/s
>   Dell 500MHz PowerEdge with 4.3 with 3C905 NIC to HP Netserver PII 266, both running 4.3-RELEASE: 3.0MB/s
>   Dell 500MHz PowerEdge with 4.3 to Dell 850MHz laptop running 4.4 with D-Link PCMCIA Ethernet card: 1.0MB/s (caused by PCMCIA NIC?)
>   PII MMX 200MHz box running 4.4-RELEASE with 3C905 to same Dell laptop: 500kB/s
>
> Unfortunately, we didn't have any 7.x Linux available or 3.X FreeBSD.
>
> FWIW...
>
> Jim Durham
Re: Found the problem, w/patch (was Re: FreeBSD performing worse than Linux?)
> I believe I have found the problem. The transmit side has a maximum burst count imposed by newreno. As far as I can tell, if this maxburst is hit (it defaults to 4 packets), the transmitter just stops - presumably until it receives an ack.

Note, my experiences (and John Capo's) are showing degraded performance when *NOT* on a LAN segment. In other words, when packet loss enters the mix, performance tends to fall off rather quickly.

This is with or without newreno (which should theoretically help with packet loss). John claims that disabling delayed_ack doesn't seem to affect his performance, and I've not been able to verify if delayed_ack helps/hurts in my situation, since the testers have been pressed for time so I can't get them to iterate through the different settings.

I do however have some packet dumps, although I'm not sure they will tell anything. :(

Nate
Re: FreeBSD performing worse than Linux?
Alfred Perlstein wrote:
> * Richard Sharpe [EMAIL PROTECTED] [011130 15:02] wrote:
> > The traffic in the tbench case is SMB traffic. Request/response, with a mixture of small requests and responses, and big request/small response or small request/big response, where big is 64K.
> >
> > I have switched off newreno, and it made no difference. I have switched off delayed_ack, and it reduced performance about 5 percent. I have made sure that SO_SNDBUF and SO_RCVBUF were set to 131072 (which seems to be the max), and it increased performance marginally (like about 2%), but consistently.
> >
> > I am still analysing the packet traces I have, but it seems to me that the crucial difference is Linux seems to delay longer before sending ACKs, and thus sends fewer ACKs. Since the ACK is piggybacked in the response (or the next request), it all works fine, and the response/request gets there sooner.
> >
> > However, I have not convinced myself that the saving of 20 us or so per request/response pair accounts for some 40+ Mb/s.
>
> Can you try these two commands:
>
>     sysctl -w net.inet.tcp.recvspace=65536
>     sysctl -w net.inet.tcp.sendspace=65536

Yes, that is what I did ...

-- 
Richard Sharpe, [EMAIL PROTECTED], LPIC-1
www.samba.org, www.ethereal.com, SAMS Teach Yourself Samba in 24 Hours, Special Edition, Using Samba
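For readers who want to see what setting SO_SNDBUF and SO_RCVBUF from inside an application looks like, here is a minimal C sketch using the standard setsockopt() and getsockopt() calls. The helper names are hypothetical and nothing here is taken from Samba or tbench. The kernel may clamp or round the requested size, so the value read back is not guaranteed to equal the value asked for.

```c
#include <sys/types.h>
#include <sys/socket.h>

/* Ask the kernel for send and receive buffers of `bytes` each on socket `fd`.
 * Returns 0 on success, -1 if either setsockopt() call fails (errno is set). */
int set_socket_buffers(int fd, int bytes)
{
    if (setsockopt(fd, SOL_SOCKET, SO_SNDBUF, &bytes, sizeof(bytes)) == -1)
        return -1;
    if (setsockopt(fd, SOL_SOCKET, SO_RCVBUF, &bytes, sizeof(bytes)) == -1)
        return -1;
    return 0;
}

/* Read back the buffer size the kernel actually granted for `optname`
 * (SO_SNDBUF or SO_RCVBUF). Returns the size in bytes, or -1 on error. */
int get_socket_buffer(int fd, int optname)
{
    int bytes = 0;
    socklen_t len = sizeof(bytes);

    if (getsockopt(fd, SOL_SOCKET, optname, &bytes, &len) == -1)
        return -1;
    return bytes;
}
```

A program would typically call set_socket_buffers() right after socket() and before connect() or listen(), so that the larger window can be advertised from the start of the connection. The sysctl commands quoted above change the system-wide default instead, which is what the rest of the thread argues for.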
Re: [OT] alarm() question
David Miller [EMAIL PROTECTED] types:
> Apologies for this being more C than freebsd, but I did say OT in the subject...
>
> In the most basic use of an alarm, like this:
>
>     #include <stdio.h>
>     #include <stdlib.h>
>     #include <unistd.h>
>     #include <signal.h>
>
>     /* Runs when the 3-second timer expires. */
>     static void bzzt(int sig)
>     {
>         printf("In routine bzzt now, timer expired after 3 seconds\n");
>     }
>
>     int main(void)
>     {
>         signal(SIGALRM, bzzt);
>         alarm(3);
>         system("/usr/bin/host -t soa 111.0.12.in-addr.arpa");
>         printf("Done\n");
>         return 0;
>     }
>
> Why does the alarm go off but not interrupt the system call? bzzt() is executed, but the program doesn't print Done and exit for a minute plus.
>
> Pointers to FM to RT welcome.

Try the system() man page. system() does a fork, then exec's a shell with the string. So in the child process, the ALARM handling will be done by the shell, and I'm pretty sure it ignores them.

As you noticed, the parent process gets the alarm. Checking the wait man page says that wait system calls - like the one done by the system() library routine - may either be interrupted, or restarted after the signal handler runs. Guess which one is happening here.

mike
-- 
Mike Meyer [EMAIL PROTECTED] http://www.mired.org/home/mwm/
Q: How do you make the gods laugh? A: Tell them your plans.
Re: Found the problem, w/patch (was Re: FreeBSD performing worse than Linux?)
:Note, my experiences (and John Capo's) are showing degraded performance
:when *NOT* on a LAN segment. In other words, when packet loss enters
:the mix, performance tends to fall off rather quickly.
:
:This is with or without newreno (which should theoretically help with
:packet loss). John claims that disabling delayed_ack doesn't seem to
:affect his performance, and I've not been able to verify if delayed_ack
:helps/hurts in my situation, since the testers have been pressed for
:time so I can't get them to iterate through the different settings.
:
:I do however have some packet dumps, although I'm not sure they will
:tell anything. :(
:
:Nate

Packet loss will screw up TCP performance no matter what you do. NewReno, assuming it is working properly, can improve performance for that case but it will not completely solve the problem (nothing will). Remember that our timers are only good to around 20ms by default, so even the best retransmission case is going to create a serious hiccup.

The question here is... is it actually packet loss that is creating this issue for you and John, or is it something else? The only way to tell for sure is to run tcpdump on BOTH the client and server and then observe whether packet loss is occurring by comparing the dumps.

I would guess that turning off delayed-acks will improve performance in the face of packet loss, since a lost ack packet in that case will not be as big an issue.

-Matt
Re: Found the problem, w/patch (was Re: FreeBSD performing worse than Linux?)
> :Note, my experiences (and John Capo's) are showing degraded performance
> :when *NOT* on a LAN segment. In other words, when packet loss enters
> :the mix, performance tends to fall off rather quickly.
> :
> :This is with or without newreno (which should theoretically help with
> :packet loss). John claims that disabling delayed_ack doesn't seem to
> :affect his performance, and I've not been able to verify if delayed_ack
> :helps/hurts in my situation, since the testers have been pressed for
> :time so I can't get them to iterate through the different settings.
> :
> :I do however have some packet dumps, although I'm not sure they will
> :tell anything. :(
> :
> :Nate
>
> Packet loss will screw up TCP performance no matter what you do.

I know, dealing with that issue is my day job. :) My point is that older FreeBSD releases (and newer Linux releases) seem to be dealing with it in a more sane manner. At least, it didn't affect performance nearly as much as it does in newer releases.

> NewReno, assuming it is working properly, can improve performance for that case but it will not completely solve the problem (nothing will). Remember that our timers are only good to around 20ms by default, so even the best retransmission case is going to create a serious hiccup.

See above.

> The question here is... is it actually packet loss that is creating this issue for you and John, or is it something else?

In my opinion, it's how the TCP stack recovers from packet loss that is the problem.

> The only way to tell for sure is to run tcpdump on BOTH the client and server and then observe whether packet loss is occurring by comparing the dumps.

Unfortunately, I'm unable to run tcpdump on the client, since it's running NT and we're not allowed to install any 3rd party apps on it (such as the WinDump package).

I'm not saying that I expect the same results as I get on the LAN segment, but I *am* expecting results that are equivalent to what we were seeing with FreeBSD 3.x, and that are in the same ballpark as (or better than) the Linux systems sitting next to it. Given that I get great LAN results, I no longer suspect I have an Ethernet autonegotiation problem, since I can get almost wire-speeds with local nodes, and close to maximum performance with our wireless products when the network segment the FreeBSD server is on is relatively idle.

> I would guess that turning off delayed-acks will improve performance in the face of packet loss, since a lost ack packet in that case will not be as big an issue.

I'm not sure I agree. I wouldn't expect it would help/hinder the performance assuming a correctly performing stack, *UNLESS* the packet loss was completely due to congestion. In that case, delayed-acks *may* improve things, but I doubt it would help much with TCP backoff and such.

Nate
Re: TCP Performance Graphs
The question that immediately comes to mind is, why not simply use as big a value as possible? The problem comes down to buffering the data, and busy servers may have to buffer a lot of data. Having a 1 meg window size may have you buffer 1 meg per connection. Note that FreeBSD's current buffer management is particularly stupid in that it will _always_ buffer 1 Meg, need it or not. Until we fix this we need an interim solution.

I thought that I heard a few months ago that Matt Dillon was looking at ways to dynamically size tcp windows from within the kernel. Maybe I'm on crack.

You're not on crack, I don't know if it was Matt Dillon, but someone was doing this, I was using the patches for about a month to test them out.

Ken

Maybe we should look at the Dynamic Right-Sizing work being done at LANL. See "Dynamic Adjustment of TCP Window Sizes" and "Dynamic Right-Sizing: A Simulation Study" at http://public.lanl.gov/radiant/publications.html

Cheers,

Drew
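The buffering concern raised in this message is simple multiplication, and a rough sketch makes the scale visible. The connection counts below are illustrative assumptions, not figures from the thread; only the 1 meg window comes from the message.

```python
# Worst-case socket buffer memory if every connection pins a full window.
# The connection counts are hypothetical; only the 1 MB window is from the thread.

MEG = 1024 * 1024


def worst_case_buffer_bytes(connections: int, window_bytes: int) -> int:
    """Memory needed if each connection always buffers one full window."""
    return connections * window_bytes


if __name__ == "__main__":
    for conns in (10, 1000, 10000):
        total = worst_case_buffer_bytes(conns, MEG)
        print(f"{conns:>6} connections x 1 MB window = {total // MEG} MB")
```

This only models the "always buffer the full window" case described above; a stack that sizes buffers to actual demand would need far less.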
Re: TCP Performance Graphs
First off, apologies to Luigi, I was shooting off my mouth.

Second off:

On Fri, Nov 30, 2001 at 01:50:42PM -0600, Alfred Perlstein wrote:
> I was about to set the default in -stable to Leo's suggested values, it seems that -current already has the delta he wants in it, my question is, was anything else changed along the lines of the number of nmbclusters allocated in -current to go along with this change?

On Fri, Nov 30, 2001 at 01:54:02PM -0600, Alfred Perlstein wrote:
> It seems not, I've committed the change.

When I proposed this before there was a bit of a debate about needing to increase clusters and MBUF's. To summarize, I think we took the following away from it:

* For most users it makes no difference, as they are far from the limits.
* This will make a small number of people who aren't hitting limits now hit an MBUF limit.
  - These people probably need increases anyway, as they are too close to the limit now.
  - Hitting the MBUF limit is fairly, well, harsh, and we might want to add syslog or other logged warnings at like 90% utilization or something.

At a minimum I think:

* There needs to be a note in the errata for the release this goes in mentioning more MBUF's might be needed.
* LINT should be updated with a comment and a value 2 to 4 times GENERIC's default as the default listed value.
* The logging at 90% usage should be investigated. I can probably generate patches for that over the weekend, provided I can find a good way to rate limit them.

-- 
Leo Bicknell - [EMAIL PROTECTED] - CCIE 3440
PGP keys at http://www.ufp.org/~bicknell/
Read TMBG List - [EMAIL PROTECTED], www.tmbg.org
Re: TCP Performance Graphs
On Fri, Nov 30, 2001 at 05:14:18PM -0500, Leo Bicknell wrote:
> First off, apologies to Luigi, I was shooting off my mouth.

no problem, and no need for apologies :)

cheers
luigi
Re: TCP Performance Graphs
On Fri, 30 Nov 2001, Leo Bicknell wrote:
> * The logging at 90% usage should be investigated. I can probably generate patches for that over the weekend, provided I can find a good way to rate limit them.

Luigi, Jonathan and I had already been discussing this idea before this thread even started. If you come up with a good patch to do this, I'd be happy to review and commit it. (Remember to target both -current and -stable though - the mbuf system differs a decent amount between the two.)

Mike Silby Silbersack
Re: FreeBSD performing worse than Linux?
As a side note, I turned off delayed ack on both machines, and had the sendsize and recvsize set at 32768... I'm talking about wire speed too, not measured incredibly accurately, but just measured using one of the windowmaker dockapps :-D

Ken

On Fri, 30 Nov 2001, Kenneth Wayne Culver wrote:

> Weird, on my Dual PII 300 I'm getting around 8MB/sec to an 800MHz Athlon. The Athlon is using a 3com 905b I believe, and the PII is using an intel fxp type card. Granted this is from my living room to my bedroom so that may be part of what I see. Also, the Dual PII is running -STABLE as of a week ago, and the Athlon is running -CURRENT as of about a week ago.
>
> Ken
>
> On Fri, 30 Nov 2001, James C. Durham wrote:
>
> > (snip...a large number of postings regarding slow performance by 4.x kernels with TCP/IP)
> >
> > A friend who works for a local university and I tried moving large files using various OS'es and hardware. These are FTP transfers with file sizes from 100 to 300 megabytes.
> >
> > The conclusion we arrived at was that the TCP performance of FreeBSD 4.x and Linux is approximately the same and that processor speed makes the most difference. In one case, a fast laptop with 16 bit pcmcia NIC did poorly.
> >
> > Moving large files on 100Mb/s ethernet backbones gave the following results...
> >
> > Dual 800 MHz PIII processors with Linux 6.1: 10 MB/s.
> > Sunblade 100's: 10 MB/s.
> > Single 1.4 GHz processors (noname box) with 3C905 NICs, FreeBSD-stable (June 2001): 9.5 MB/s.
> >
> > In the case where we had only one machine of a type, we used the dual 800 MHz machines as a sink, with the following results (this is probably questionable):
> >
> > Dual 333, Linux 5.1: 5 MB/s
> > Pentium 350 III with 3C905 NIC, Linux 5.1: 2 MB/sec
> > K6-2 400 with smc NIC, Linux 5.1: 2.8 MB/sec
> > Dell 500 MHz PowerEdge with 4.3 with 3C905 NIC to HP Netserver PII 266, both running 4.3-RELEASE: 3.0 MB/sec.
> > Dell 500 MHz PowerEdge with 4.3 to Dell 850 MHz laptop running 4.4 with Dlink PCMCIA ethernet card: 1.0 MB/sec. (caused by pcmcia NIC?)
> > PIIMMX 200 MHz box running 4.4-RELEASE with 3C905 to same Dell laptop: 500 kB/sec.
> >
> > Unfortunately, we didn't have any 7.x Linux available or 3.X FreeBSD.
> >
> > FWIW...
> >
> > Jim Durham
Re: TCP Performance Graphs
* Leo Bicknell [EMAIL PROTECTED] [011130 16:14] wrote: First off, apologies to Luigi, I was shooting off my mouth. Understandable, it's easy to get heated about an issue when it weighs so much in ones mind. I've done the same on several quite memorable occasions. Second off: On Fri, Nov 30, 2001 at 01:50:42PM -0600, Alfred Perlstein wrote: I was about to set the default in -stable to Leo's suggested values, it seems that -current already has the delta he wants in it, my question is, was anything else changed along the lines of the number of nmbclusters allocated in -current to go along with this change? On Fri, Nov 30, 2001 at 01:54:02PM -0600, Alfred Perlstein wrote: It seems not, I've committed the change. When I proposed this before there was a bit of a debate about needing to increase clusters and MBUF's. To summarize, I think we took the following away from it: * For most users it makes no difference, as they are far from the limits. Agreed. * This will make a small number of people who aren't hitting limits now hit an MBUF limit. - These people probably need increases anyway, as they are too close to the limit now. - Hitting the MBUF limit is fairly, well, harsh, and we might want to add syslog or other logged warnings at like 90% utilization or something. This is a very good idea. At a minimum I think: * There needs to be a note in the errata for the release this goes in mentioning more MBUF's might be needed. * LINT should be updated with a comment and a value 2 to 4 times GENERIC's default as the default listed value. Hmm, well the GENERIC default is some mathematical operation on maxusers. We really ought to make this scale as a default relative to the amount of ram in the system, rather than some low hardcoded value. NetBSD has some stuff for this in their buffercache sizing algorithm in netbsd-stable. 
It might be worth checking out, the formula is quite smart such that it has a decent size when system ram is low, then for each meg above X it increases it by some percentage. I find it to be too low, but whatever. :)

> * The logging at 90% usage should be investigated. I can probably generate patches for that over the weekend, provided I can find a good way to rate limit them.

Generating one message is usually a good idea, however you could investigate how the icmp response limit messages are buffered.

-- 
-Alfred Perlstein [[EMAIL PROTECTED]]
'Instead of asking why a piece of software is using 1970s technology, start asking why software is ignoring 30 years of accumulated wisdom.'
http://www.morons.org/rants/gpl-harmful.php3
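The rate-limiting question at the end of this message can be illustrated with a small sketch: emit at most one warning per interval and fold the suppressed ones into the next line. This is a generic pattern under assumed names (`RateLimitedLog`, `interval_s`); it is not a description of how the kernel's icmp response limit code actually works.

```python
import time


class RateLimitedLog:
    """Emit at most one message per interval and count the ones held back.

    Hypothetical helper for illustration only; the interval and message
    format are assumptions, not taken from any FreeBSD source.
    """

    def __init__(self, interval_s=1.0, clock=time.monotonic):
        self.interval_s = interval_s
        self.clock = clock          # injectable so the logic can be tested
        self.last_emit = None       # time of the last emitted message
        self.suppressed = 0         # messages dropped since the last emit

    def log(self, msg):
        """Return the line to write, or None if this message is suppressed."""
        now = self.clock()
        if self.last_emit is not None and now - self.last_emit < self.interval_s:
            self.suppressed += 1
            return None
        line = msg
        if self.suppressed:
            line += f" ({self.suppressed} similar messages suppressed)"
        self.last_emit = now
        self.suppressed = 0
        return line
```

Returning the line rather than writing it keeps the sketch independent of any particular logging facility; a caller would pass a non-None result to syslog or similar.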
Re: FreeBSD performing worse than Linux?
Hi,

I think that there are two different problems here. My situation involves a LAN (actually, a crossover cable).

I have captured a trace of a 1 client run between the Linux driver and the FreeBSD test system as well as between the Linux driver and the same test system running Linux.

I am noticing some interesting things. Linux uses the timestamp option in all the TCP segments I have looked at, so it is sending 12 more bytes per segment than FreeBSD.

However, more interesting is that for small messages (less than 1460), FreeBSD does not seem to delay sending ACKs, so we get the following pattern:

FREEBSD

Driver -> Test system: 94 byte IP DG with simulated command
Test System -> Driver: Ack after 83uS
Test System -> Driver: Psh Ack after 29uS with 79 total bytes in IP DG

LINUX

Driver -> Test system: 106 byte IP DG with simulated command
Test System -> Driver: Psh Ack after 89uS with 91 total bytes in IP DG

So, as you can see, Linux seems to shave some time off each transaction by avoiding sending extra ACKs.

Also, what I am seeing is that neither FreeBSD nor Linux is doing ACK coalescing (if that is possible).

While I understand that coalescing ACKs will mess up RTT calculations and SRTT a bit, it would serve to reduce the time taken until responses come back.

What I am seeing for large transmits is the following:

FreeBSD (Test)                            Linux (Driver)
                                          Request, 1500 bytes including
                                          request and some data
                                          More segments from the request
Some ACKs, about one every two segments
                                          Last data segment, usually less
                                          than 1500
Lots of ACKs, one per segment,
usually with large window (ie 16020
when the max window seems to be 16384)
Response, less than 1500

Now, I have seen something like 10+ ACKs after the driver has finished sending. They appear to be one per sent segment. Then the FreeBSD system sends its response.

The optimal would be for the FreeBSD system to delay the ack until it has data to send, which it probably already has.

What I see with the Linux trace is that Linux coalesces ACKs. However, the most I have seen it coalesce is two segments.

HTH.

-- 
Richard Sharpe, [EMAIL PROTECTED], LPIC-1
www.samba.org, www.ethereal.com, SAMS Teach Yourself Samba in 24 Hours, Special Edition, Using Samba
Re: Found the problem, w/patch (was Re: FreeBSD performing worse than Linux?)
On Fri, Nov 30, 2001 at 04:28:32PM -0600, Alfred Perlstein wrote:
> * Matthew Dillon [EMAIL PROTECTED] [011130 16:02] wrote:
> > Packet loss will screw up TCP performance no matter what you do. NewReno, assuming it is working properly, can improve performance for that case but it will not completely solve the problem (nothing will). Remember that our timers are only good to around 20ms by default, so even the best retransmission case is going to create a serious hiccup.
> >
> > The question here is... is it actually packet loss that is creating this issue for you and John, or is it something else? The only way to tell for sure is to run tcpdump on BOTH the client and server and then observe whether packet loss is occurring by comparing the dumps.
> >
> > I would guess that turning off delayed-acks will improve performance in the face of packet loss, since a lost ack packet in that case will not be as big an issue.
>
> I have an odd theory that makes use of my waning remembrance of the stack behavior, this may be totally off base but I'd appreciate it if you guys would consider this scenario if at all to put my mind at ease.
>
> I seem to remember several places in the stack that detect what looks like a hiccup and immediately begin sending a sequence of ACKs in order to trigger the other side's fast retransmit code.
>
> One of the things that I don't remember seeing is that state is persistent.

There isn't anything in the receiver side that does this; ACKs are sent in response to incoming packets. However, state is maintained on the sender side as to whether we are performing fast retransmit or not.

-- 
Jonathan
Re: Found the problem, w/patch (was Re: FreeBSD performing worse than Linux?)
* Jonathan Lemon [EMAIL PROTECTED] [011130 17:00] wrote:
> On Fri, Nov 30, 2001 at 04:28:32PM -0600, Alfred Perlstein wrote:
> > I have an odd theory that makes use of my waning remembrance of the stack behavior, this may be totally off base but I'd appreciate it if you guys would consider this scenario if at all to put my mind at ease.
> >
> > I seem to remember several places in the stack that detect what looks like a hiccup and immediately begin sending a sequence of ACKs in order to trigger the other side's fast retransmit code.
> >
> > One of the things that I don't remember seeing is that state is persistent.
>
> There isn't anything in the receiver side that does this; ACKs are sent in response to incoming packets. However, state is maintained on the sender side as to whether we are performing fast retransmit or not.

Either you don't follow or my concept of what happens is off. What I'm saying is this, consider each pair to be in some form of time:

h1 send: p1 p2 p3
h2 recv: p1 p3

h1 recv: (nothing acks lost)
h2 send: ack1 ack1 ack1 (dude, i missed a packet)

h1 send: (nothing, waiting for ack)
h2 send: (nothing, waiting for retransmit)

h1 send: p1 p2 p3 (ack timed out)
h2 send: (nothing, waiting for retransmit)

what should happen is this:

h1 send: p1 p2 p3
h2 recv: p1 p3

h1 recv: (nothing acks lost)
h2 send: ack1 ack1 ack1 (dude, i missed a packet)

h2 send: ack1 ack1 ack1 (dude, i missed a packet)
h1 recv: ack1 ack1 ack1
h1 send: p2 p3

Basically, will the receiver keep acking not if 'it detects packet loss', but rather 'as long as packets are lost'.

-- 
-Alfred Perlstein [[EMAIL PROTECTED]]
'Instead of asking why a piece of software is using 1970s technology, start asking why software is ignoring 30 years of accumulated wisdom.'
http://www.morons.org/rants/gpl-harmful.php3
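The disagreement in this exchange is easier to follow with a toy model of conventional duplicate-ACK behaviour: the receiver emits one cumulative ACK per arriving segment, and the sender fast-retransmits on the third duplicate (the usual threshold, as in RFC 2581). This is a simplified sketch that numbers whole segments instead of bytes; the function names are made up for illustration and it does not model the FreeBSD stack.

```python
DUP_ACK_THRESHOLD = 3  # conventional fast-retransmit threshold


def receiver_acks(arrivals):
    """Return one cumulative ACK per arriving segment number.

    Each ACK carries the highest segment received in order so far, so an
    out-of-order arrival repeats the previous ACK (a duplicate ACK).
    ACKs are only ever generated in response to an arrival.
    """
    received = set()
    highest_in_order = 0
    acks = []
    for seg in arrivals:
        received.add(seg)
        while highest_in_order + 1 in received:
            highest_in_order += 1
        acks.append(highest_in_order)
    return acks


def sender_fast_retransmits(acks):
    """Return the segments a sender would fast-retransmit for these ACKs."""
    retransmits = []
    last_ack = None
    dup_count = 0
    for ack in acks:
        if ack == last_ack:
            dup_count += 1
            if dup_count == DUP_ACK_THRESHOLD:
                retransmits.append(ack + 1)  # first segment still missing
        else:
            last_ack = ack
            dup_count = 0
    return retransmits


if __name__ == "__main__":
    # p2 lost out of p1..p3: only one duplicate ACK, so no fast retransmit.
    print(sender_fast_retransmits(receiver_acks([1, 3])))
    # p2 lost out of p1..p5: three duplicate ACKs, so p2 is retransmitted.
    print(sender_fast_retransmits(receiver_acks([1, 3, 4, 5])))
```

In this model a three-segment burst with the middle one lost never produces enough duplicates, so recovery falls back to the retransmit timer, which matches the first timeline in the message.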
Re: FreeBSD performing worse than Linux?
On Fri, Nov 30, 2001 at 12:47:29PM -0800, Matthew Dillon wrote:
> Well, this is embarrassing. I can reproduce this completely running 4.4-stable (Nov 17th kernel) on two machines. With newreno turned on, a TCP NFS mount only gets 80K/sec. With newreno turned off on the transmit side, a TCP NFS mount gets 7MB/sec. The state of the delayed-ack sysctl is irrelevant.
>
> This is without running any nfsiod's (which would mask the degradation of the synchronous messaging).
>
> I am tracking it down now.

Is this the same problem that I experience on ssh connections between my 5.0-current laptop and my releng_4 server? When I run an 'ls' from the shell on large directories I get the response back block delay block delay block. I assumed that it was a problem with -current.

Joe
Re: TCP Performance Graphs
Alfred Perlstein wrote:
> Hmm, well the GENERIC default is some mathematical operation on maxusers. We really ought to make this scale as a default relative to the amount of ram in the system, rather than some low hardcoded value. NetBSD has some stuff for this in their buffercache sizing algorithm in netbsd-stable. It might be worth checking out, the formula is quite smart such that it has a decent size when system ram is low, then for each meg above X it increases it by some percentage. I find it to be too low, but whatever. :)

This is an arbitrarily hard problem to solve correctly.

The problem boils down to the inability to do allocations at interrupt, unless there is a prereserved mapping backing store already in place.

This works for systems where the memory size is smaller than the KVA space, but when you stick 4G in a machine, the code in the kernel is way, way off (the machdep.c calculations for swap page mappings and others go off the scale, unfortunately).

I think that what needs to happen is a reconsideration of the memory allocation system, almost entirely. The separation needs to be into swap path and non-swap path, rather than into allocable at interrupt time, and not.

I hate to suggest it, but... perhaps a move away from type stable memory would not be a bad thing.

-- Terry
who is postmaster?
I've tried getting information about our (FreeBSD) mail system by mailing to postmaster but no-one answers.. so, who IS the postmaster at the moment?

I have the .elischer.org domain set up at Network Solutions with a contact address of [EMAIL PROTECTED], however whenever I try to change anything in the setup it sends an "Authorize this" email to the registered email address ([EMAIL PROTECTED]). However they never turn up in my mailbox so I can't do anything (like change the contact email address :-) to maintain my domain.

Is freebsd.org throwing away mail from Network Solutions? (are they blackhole'd?)

julian
Re: Found the problem, w/patch (was Re: FreeBSD performing worse than Linux?)
:what should happen is this:
:
:h1 send: p1 p2 p3
:h2 recv: p1 p3
:
:h1 recv: (nothing acks lost)
:h2 send: ack1 ack1 ack1 (dude, i missed a packet)
:
:h2 send: ack1 ack1 ack1 (dude, i missed a packet)
:h1 recv: ack1 ack1 ack1
:h1 send: p2 p3
:
:Basically, will the reciever keep acking not if 'it detects packet loss',
:but rather 'as long as packets are lost'.
:
:--
:-Alfred Perlstein [[EMAIL PROTECTED]]

Yuch. That won't help. Basically you are taking a brute-force approach: "send the ack a whole bunch of times in case some of them get lost". Such an approach does not typically work very well. For example, if the packet loss occurred due to link congestion your solution will actually make the link more congested rather than less. If there is significant latency in the path the acks can get into a following run with the transmitter, making the transmitter believe that the packet loss is worse than it actually is and responding in kind, resulting in even more incorrect acks.

-Matt
Matthew Dillon [EMAIL PROTECTED]
Re: who is postmaster?
On Fri, Nov 30, 2001 at 03:56:47PM -0800, Julian Elischer wrote:
> I've tried getting information about our (FreeBSD) mail system by mailing to postmaster but no-one answers.. so, who IS the postmaster at the moment?

Still jmb.

Kris
Re: FreeBSD performing worse than Linux?
:Hi,
:
:I think that there are two different problems here. My situation
:involves a LAN (actually, a crossover cable).
:
:I have captured a trace of a 1 client run between the Linux driver and
:the FreeBSD test system as well as between the Linux driver and the same
:test system running Linux.
:
:I am noticing some interesting things. Linux uses the timestamp option
:in all the TCP segments I have looked at, so it is sending 12 more bytes
:per segment than FreeBSD.
:
:However, more interesting is that for small messages (less than 1460),
:FreeBSD does not seem to delay sending ACKs, so we get the following
:pattern:
:
:FREEBSD
:
:Driver -> Test system: 94 byte IP DG with simulated command
:Test System -> Driver: Ack after 83uS
:Test System -> Driver: Psh Ack after 29uS with 79 total bytes in IP DG
:
:LINUX
:
:Driver -> Test system: 106 byte IP DG with simulated command
:Test System -> Driver: Psh Ack after 89uS with 91 total bytes in IP DG
:
:So, as you can see, Linux seems to shave some time off each transaction
:by avoiding sending extra ACKs.
:
:Also, what I am seeing is that neither FreeBSD nor Linux is doing ACK coalescing (if
:that is possible).
:
:
:While I understand that coalescing ACKs will mess up RTT calculations and SRTT a bit,
:it would serve to reduce the time taken until responses come back.

Hmm. If I ssh between two machines and use tcpdump to monitor the packets I see proper delayed-ack operation:

(^A - a keystroke that is not echo'd, a delayed-ack occurs)

16:15:36.673259 216.240.41.12.1025 > 216.240.41.11.22: P 200:220(20) ack 181 win 17520 (DF) [tos 0x10]
16:15:36.70 216.240.41.11.22 > 216.240.41.12.1025: . ack 220 win 17520 (DF) [tos 0x10]

('a' - a keystroke that is echo'd, delayed-ack, ack is returned in echo'd data)

16:15:49.143239 216.240.41.12.1025 > 216.240.41.11.22: P 240:260(20) ack 181 win 17520 (DF) [tos 0x10]
16:15:49.156878 216.240.41.11.22 > 216.240.41.12.1025: P 181:201(20) ack 260 win 17520 (DF) [tos 0x10]
16:15:49.251975 216.240.41.12.1025 > 216.240.41.11.22: . ack 201 win 17520 (DF) [tos 0x10]

The timestamp could be creating an issue in your tests, though I'm not entirely sure what the rules are for replies to timestamped packets. Are you sure delayed acks are turned on on the FreeBSD box? An actual tcpdump might be helpful here.

-Matt
Re: FreeBSD performing worse than Linux?
On Fri, Nov 30, 2001 at 11:49:13PM +, Josef Karthauser wrote:
> On Fri, Nov 30, 2001 at 03:45:21PM -0800, Matthew Dillon wrote:
> > :...
> > :    I am tracking it down now.
> > :
> > :Is this the same problem that I experience on ssh connections between
> > :my 5.0-current laptop and my releng_4 server? When I run an 'ls'
> > :from the shell on large directories I get the response back block
> > :delay block delay block. I assumed that it was a problem with
> > :-current.
> > :
> > :Joe
> >
> > It sounds like the same problem. In fact, I seem to recall observing something very similar from my laptop while ssh'd into one of my servers, but at the time I thought it was a hiccup in the wireless network. Now though I think it was this same issue.
>
> I'm just about to reboot the server now with your recently committed changes - I'll let you know if it fixed anything.

No, the problem remains after rebuilding the kernel on both boxes.

Joe
Re: FreeBSD performing worse than Linux?
:
:No, the problem remains after rebuilding the kernel on both boxes.
:
:Joe

Try to track down the sequence with a tcpdump.

-Matt
Matthew Dillon [EMAIL PROTECTED]
Re: Found the problem, w/patch (was Re: FreeBSD performing worse than Linux?)
Quoting Matthew Dillon ([EMAIL PROTECTED]):
> The question here is... is it actually packet loss that is creating this issue for you and John, or is it something else? The only way to tell for sure is to run tcpdump on BOTH the client and server and then observe whether packet loss is occurring by comparing the dumps.

Packet loss is the problem for sure. I am dumping on the server and client side.

http://www.irbs.net/server-dump.html
http://www.irbs.net/client-dump.html

In 60 ms the server pushed out about 200 segments. My test writes 1 byte at a time on an existing ssh connection so the payload per segment is small, 48 bytes. (48 + IP + TCP) * 200 is around 17KB in 60 ms, which probably overflowed the frame switch queue. The client is on a fractional T1, the server is on a 10Mb - OC3 connection 1200 network miles away.

Jonathan Lemon pointed out in the TCP Anomalies thread that slow start seems to be broken.

John Capo
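The burst estimate in this message can be checked with a few lines of arithmetic. The header sizes are assumptions (20-byte IPv4 and 20-byte TCP headers, no options, no link-layer framing), and the full T1 rate of 1.544 Mb/s is used as an upper bound since the message only says the client link is a fractional T1.

```python
# Figures from the message.
PAYLOAD_BYTES = 48       # payload per segment
SEGMENTS = 200           # segments sent in the burst
BURST_SECONDS = 0.060    # 60 ms

# Assumptions, not from the message.
IP_HEADER_BYTES = 20     # IPv4 header without options
TCP_HEADER_BYTES = 20    # TCP header without options
T1_BPS = 1_544_000       # full T1 line rate; a fractional T1 is slower

burst_bytes = (PAYLOAD_BYTES + IP_HEADER_BYTES + TCP_HEADER_BYTES) * SEGMENTS
burst_bps = burst_bytes * 8 / BURST_SECONDS

if __name__ == "__main__":
    print(f"burst size: {burst_bytes} bytes")
    print(f"burst rate: {burst_bps / 1e6:.2f} Mb/s")
    print(f"exceeds a full T1: {burst_bps > T1_BPS}")
```

Under these assumptions the burst is 17600 bytes, in line with the "around 17KB" figure, and its instantaneous rate is above even a full T1, so queue overflow somewhere ahead of the client link is plausible.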
Re: TCP Performance Graphs
On Fri, Nov 30, 2001 at 12:59:53PM -0800, Matthew Dillon wrote:
> The transmit side requires more thought. I did write that patch, and it does work, but it's too messy for my tastes. I would personally much rather rewrite it to (A) fix the RTT stored in the route tables and (B) adjust the transmit window based on that, which is a much less sophisticated patch (and less messy), but ought to work quite well in regards to transmit buffer management.

I think I tried this patch, and found some problems with it. As I recall the problems were with extremely high bandwidth connections. For example, I have two machines that can move 100Mbps FDX across country (70ms latency), and when I tried the patch with that case performance was bad, in the sense that I got like 20Mbps, rather than 100, like it should have allowed. I believe I theorized at the time that the calculation code had a term that ended up zero, or infinity or something with values that large.

Sadly, at this moment I don't have test boxes on that path, but I do now have test boxes behind an otherwise empty T1 that I can do interesting things with (eg, WRED on or off, short queues, long queues, artificial limits like CAR). I have a little time to do testing (emphasis on little), if one or two people would like to do more I can make the resources available, contact me privately.

-- 
Leo Bicknell - [EMAIL PROTECTED] - CCIE 3440
PGP keys at http://www.ufp.org/~bicknell/
Read TMBG List - [EMAIL PROTECTED], www.tmbg.org
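For the cross-country path described in this message, the window a sender needs follows from the bandwidth-delay product. The sketch below assumes the quoted 70ms figure is the round-trip time, which the message does not state; if it is one-way latency the required windows double. Integer arithmetic is used to keep the results exact.

```python
def window_for_rate_bytes(rate_bps: int, rtt_ms: int) -> int:
    """Bytes that must be in flight to sustain rate_bps over a path with rtt_ms."""
    return rate_bps * rtt_ms // (8 * 1000)


def rate_for_window_bps(window_bytes: int, rtt_ms: int) -> int:
    """Best-case throughput (bits/s) of one connection limited by its window."""
    return window_bytes * 8 * 1000 // rtt_ms


if __name__ == "__main__":
    RTT_MS = 70  # assumed to be round-trip time
    print(window_for_rate_bytes(100_000_000, RTT_MS))  # window to fill 100 Mb/s
    print(window_for_rate_bytes(20_000_000, RTT_MS))   # window implied by 20 Mb/s
```

Under that assumption, filling 100 Mb/s takes about 875000 bytes in flight, while the observed 20 Mb/s corresponds to an effective window of about 175000 bytes, which would be consistent with the window calculation being capped or miscomputed on that path.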
Re: who is postmaster?
Julian Elischer wrote:
> I've tried getting information about our (FreeBSD) mail system by mailing to postmaster but no-one answers.. so, who IS the postmaster at the moment?
>
> I have the .elischer.org domain set up at Network Solutions with a contact address of [EMAIL PROTECTED], however whenever I try to change anything in the setup it sends an "Authorize this" email to the registered email address ([EMAIL PROTECTED]). However they never turn up in my mailbox so I can't do anything (like change the contact email address :-) to maintain my domain.
>
> Is freebsd.org throwing away mail from Network Solutions? (are they blackhole'd?)

Would you believe me if I told you that Network Solutions don't know how to configure the DNS on their systems?

peter@hub[4:39pm]/etc/postfix-58# host opsmail.prod.netsol.com
Host not found.

Anyway, I have added a special override to allow Network Solutions' misconfigured systems to send mail.

Cheers,
-Peter
-- 
Peter Wemm - [EMAIL PROTECTED]; [EMAIL PROTECTED]; [EMAIL PROTECTED]
All of this is for nothing if we don't go to the stars - JMS/B5
Re: TCP Performance Graphs
On Fri, Nov 30, 2001 at 05:19:18PM -0500, Mike Silbersack wrote:
> On Fri, 30 Nov 2001, Leo Bicknell wrote:
> > * The logging at 90% usage should be investigated. I can probably
> ...
> Luigi, Jonathan and I had already been discussing this idea before this thread even started. If you come up with a good patch to do this,

I just committed to current (and soon to stable) some code to log _failures_ in mbuf allocations, but that is only meant as an aid to remove worse code in the drivers.

I'd be inclined to say that the XX% monitoring is better done by userlevel daemons periodically polling the mbuf stats, rather than doing some extra work every time you allocate or free an mbuf.

(Plus, just setting a threshold is not good, you also want some hysteresis, because you can easily conceive a system that runs at XX % mbuf occupation, whatever XX you pick.)

cheers
luigi
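The hysteresis point in this message can be sketched as a small userland monitor: warn once when usage crosses a high mark, then stay quiet until usage falls back below a lower mark. The 90% figure comes from the thread; the 80% clear level, the class name, and the message text are assumptions for illustration, and how the samples are obtained (for example by a daemon polling the mbuf statistics) is left out.

```python
class MbufUsageMonitor:
    """Threshold warning with hysteresis (hypothetical illustration).

    Warns when usage rises to high_pct, then stays silent until usage
    drops to low_pct, so a system hovering around the threshold does not
    generate a warning on every sample.
    """

    def __init__(self, high_pct=90, low_pct=80):
        if low_pct >= high_pct:
            raise ValueError("low_pct must be below high_pct")
        self.high_pct = high_pct
        self.low_pct = low_pct
        self.alarmed = False

    def sample(self, in_use, limit):
        """Feed one reading; return a message on a state change, else None."""
        pct = 100 * in_use / limit
        if not self.alarmed and pct >= self.high_pct:
            self.alarmed = True
            return f"mbuf usage at {pct:.0f}% of limit"
        if self.alarmed and pct <= self.low_pct:
            self.alarmed = False
            return f"mbuf usage back down to {pct:.0f}% of limit"
        return None
```

With a limit of 1000, readings of 850, 900, 890, 910, 790, 905 produce a warning at 900, silence at 890 and 910, a clear at 790, and a new warning at 905. Without the lower mark, the 890 to 910 wobble would have logged a second warning.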
Re: TCP Performance Graphs
On Fri, Nov 30, 2001 at 05:30:33PM -0800, Luigi Rizzo wrote:
> I just committed to current (and soon to stable) some code to log _failures_ in mbuf allocations, but that is only meant as an aid to remove worse code in the drivers.

Note that if we implement a 'fair share' buffering scheme we would never get a failure, which would be a good thing. Unfortunately fair share is relatively complicated.

> (Plus, just setting a threshold is not good, you also want some hysteresis, because you can easily conceive a system that runs at XX % mbuf occupation, whatever XX you pick.)

With fair share or some other type of setup I would agree. Given our current 'things fail badly if you run out' I think a warning at 90% or 200 left, or something would be a real good idea. With the current allocation scheme this is on par with "/foo file system is full" messages, we should have a "networking stack is full, build a kernel with more mbuf's" message.

-- 
Leo Bicknell - [EMAIL PROTECTED] - CCIE 3440
PGP keys at http://www.ufp.org/~bicknell/
Read TMBG List - [EMAIL PROTECTED], www.tmbg.org
Re: FreeBSD performing worse than Linux?
* Matthew Dillon [EMAIL PROTECTED] [011130 17:45] wrote:
> :...
> :    I am tracking it down now.
> :
> :Is this the same problem that I experience on ssh connections between
> :my 5.0-current laptop and my releng_4 server? When I run an 'ls'
> :from the shell on large directories I get the response back block
> :delay block delay block. I assumed that it was a problem with
> :-current.
> :
> :Joe
>
> It sounds like the same problem. In fact, I seem to recall observing something very similar from my laptop while ssh'd into one of my servers, but at the time I thought it was a hiccup in the wireless network. Now though I think it was this same issue.

This may be a server problem, I'm ssh'd into a FreeBSD box from a NetBSD one and don't see the issue, going FreeBSD - FreeBSD at home seems to cause stalls, I'll try my netbsd laptop at home and see if I can reproduce the problem. I thought it was my crappy DSL causing the issue, perhaps not.

-- 
-Alfred Perlstein [[EMAIL PROTECTED]]
'Instead of asking why a piece of software is using 1970s technology, start asking why software is ignoring 30 years of accumulated wisdom.'
http://www.morons.org/rants/gpl-harmful.php3
Re: TCP Performance Graphs
On Fri, Nov 30, 2001 at 08:39:05PM -0500, Leo Bicknell wrote:
> On Fri, Nov 30, 2001 at 05:30:33PM -0800, Luigi Rizzo wrote:
> > I just committed to current (and soon to stable) some code to log _failures_ in mbuf allocations, but that is only meant as an aid to remove worse code in the drivers.
>
> Note that if we implement a 'fair share' buffering scheme we would never get a failure, which would be a good thing. Unfortunately fair share is relatively complicated.

I don't get this. There is no relation between the max number of mbufs and their potential consumers, such as network interfaces, sockets, dummynet pipes, and others. And so it is unavoidable that, even giving 1 mbuf each, you'll eventually fail an allocation.

> > (Plus, just setting a threshold is not good; you also want some hysteresis, because you can easily conceive a system that runs at XX% mbuf occupation, whatever XX you pick.)
>
> With fair share or some other type of setup I would agree. Given our current 'things fail badly if you run out' I think a warning

I should have said any XX != 100. But note that what you say about bad failures is not really true. Many pieces of the kernel now are pretty robust in the face of failures -- certainly dummynet pipes, and the sis and dc drivers are, or I could not have tested the 'run out of mbuf' message code which I just committed. I just think that the latter should not become a further source of trouble by filling up /var/log.

cheers
luigi
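The threshold-plus-hysteresis idea discussed in this exchange can be sketched in a few lines of C. This is only an illustration, not the code that was committed: the struct, the function name, and the 400-buffer recovery level are assumptions made up for the example (the 200-buffer warning level is the figure floated earlier in the thread). The point is that a warning is emitted once when free buffers drop below a low mark, and not again until the pool has recovered past a higher mark, so a system hovering around the threshold does not fill /var/log.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical bookkeeping for a low-mbuf warning with hysteresis. */
struct mbuf_watch {
    size_t warn_below;   /* enter the warning state below this many free buffers */
    size_t clear_above;  /* leave it only after recovering past this many */
    bool   warning;      /* current state */
};

/*
 * Feed the current number of free buffers.
 * Returns true only on the transition into the warning state,
 * i.e. exactly when a single log message should be written.
 */
bool mbuf_watch_update(struct mbuf_watch *w, size_t free_mbufs)
{
    if (!w->warning && free_mbufs < w->warn_below) {
        w->warning = true;
        return true;
    }
    if (w->warning && free_mbufs > w->clear_above)
        w->warning = false;
    return false;
}
```

With `warn_below = 200` and `clear_above = 400`, a pool that oscillates between 150 and 250 free buffers logs once, not on every crossing of 200.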
SSH stalls (was: FreeBSD performing worse than Linux?)
JK Is this the same problem that I experience on ssh connections between
JK my 5.0-current laptop and my releng_4 server? When I run an 'ls'
JK from the shell on large directories I get the response back block
JK delay block delay block. I assumed that it was a problem with
JK -current.

I am quite sure that this is a problem introduced in OpenSSH v2.5 or earlier. When I upgraded a FreeBSD 4.2 box from OpenSSH v2.2.0 to a newer version (I don't remember exactly which one now) I noticed this stalling, which had never appeared before. If I used SSH Inc ssh-2.4 there was no stalling.

It's not FreeBSD-specific either: I am trying this now on a NetBSD 1.5.1 box that has OpenSSH v2.5.2, and if I do ten 'ls -l' as fast as I can, I get 14 retransmitted packets and stalling. If I try the same with SSH Inc ssh-3.0.0 I get no retransmitted packets.

Strangely enough, I get no stalling on either sshd if I cat a 3 megabyte text...

-Tomas
Re: TCP Performance Graphs
On Fri, Nov 30, 2001 at 05:48:16PM -0800, Luigi Rizzo wrote:
> On Fri, Nov 30, 2001 at 08:39:05PM -0500, Leo Bicknell wrote:
> > Note that if we implement a 'fair share' buffering scheme we would never get a failure, which would be a good thing. Unfortunately fair share is relatively complicated.
>
> I don't get this. There is no relation between the max number of mbufs and their potential consumers, such as network interfaces, sockets, dummynet pipes, and others. And so it is unavoidable that, even giving 1 mbuf each, you'll eventually fail an allocation.

Well, this is true. If the number of sockets exceeds the number of MBUFs you will run out, no matter how well you allocate them. That is a corner case that should be handled delicately, no doubt, but one much less likely to happen. If each client were limited to one, or even two, MBUFs, total throughput would be so slow that the admin of the box would notice. That, added to the fact that there are thousands of MBUFs by default, makes it nearly impossible that the ignorant sysadmin (aka the 'desktop, it should just work' user) would run into this case.

So, I will rephrase. I think a fair-share scheme would solve this for at least five 9's of the problem.

> But note that what you say about bad failures is not really true. Many pieces of the kernel now are pretty robust in the face of failures -- certainly dummynet pipes, and the sis and dc drivers

My point about 'bad failures' is not so much that the box would crash or otherwise completely break itself. Rather, my experience with exhausting MBUFs is that you can get a sort of capture situation, where one or more busy connections can essentially starve out inactive connections. Those inactive connections may well be your ssh session where you're trying to fix it. Today, network performance when MBUFs are exhausted is erratic at best, and at worst completely stopped for a large number of processes on the system.
The nasty QoS word popped up when we talked about this before: a QoS scheme could ensure some connections got MBUFs, or, even if there were more connections than MBUFs, ensure that connections got two at a time in a 'round robin' fashion, or some other scheme to keep everything moving.

If I could redesign buffering (from a TCP point of view) from the ground up I would:

- Make the buffer size dynamic. Perhaps not at interrupt time, but in a unified VM the network should be able to take resources if it is active.
- Make the buffers dynamically track individual connections.
- Implement a fair-share mechanism.
- Provide instrumentation to track when connections are slowed for lack of MBUFs.
- Provide tuning parameters and maybe QoS parameters to be able to manage total buffer usage, individual connection buffer usage, and connection priorities.

--
Leo Bicknell - [EMAIL PROTECTED] - CCIE 3440
PGP keys at http://www.ufp.org/~bicknell/
Read TMBG List - [EMAIL PROTECTED], www.tmbg.org
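One way to picture the fair-share mechanism in the list above is a max-min fair split of a fixed buffer pool across connections. The sketch below is an assumption-laden illustration, not anything from the FreeBSD tree: `fair_share`, `demand`, and `grant` are hypothetical names, and a real kernel would track buffers incrementally instead of recomputing the split. Each round divides the remaining pool evenly among connections that still want more; when there are fewer buffers than connections, it falls back to handing out one at a time, which also shows the corner case raised earlier in the thread (some connection ends up with nothing).

```c
#include <stddef.h>

/*
 * Hypothetical max-min fair split of `pool` buffers.
 * demand[i] is how many buffers connection i would like;
 * grant[i] receives how many it gets.
 */
void fair_share(const unsigned *demand, unsigned *grant, size_t n, unsigned pool)
{
    size_t i;
    size_t unsatisfied = n;  /* first-round upper bound; recounted each round */

    for (i = 0; i < n; i++)
        grant[i] = 0;

    while (pool > 0 && unsatisfied > 0) {
        unsigned share = pool / (unsigned)unsatisfied;

        if (share == 0)
            share = 1;  /* fewer buffers than takers: one each until empty */

        unsatisfied = 0;
        for (i = 0; i < n && pool > 0; i++) {
            unsigned want = demand[i] - grant[i];
            unsigned give;

            if (want == 0)
                continue;
            give = want < share ? want : share;
            if (give > pool)
                give = pool;
            grant[i] += give;
            pool -= give;
            if (grant[i] < demand[i])
                unsatisfied++;
        }
    }
}
```

For demands of 100, 10, and 50 with a pool of 90, the small connection gets its full 10 and the two busy ones split the rest evenly at 40 each, so a busy connection cannot starve a quiet one.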
Re: FreeBSD performing worse than Linux?
On Saturday, 1 December 2001 at 8:11:19 +1030, Richard Sharpe wrote:
> Matthew Dillon wrote:
> > Well, this is embarrassing. I can reproduce this completely running 4.4-stable (Nov 17th kernel) on two machines. With newreno turned on, a TCP NFS mount only gets 80K/sec. With newreno turned off on the transmit side, a TCP NFS mount gets 7MB/sec. The state of the delayed-ack sysctl is irrelevant. This is without running any nfsiod's (which would mask the degradation of the synchronous messaging).
>
> I have upgraded to 4.4-STABLE, and have hacked in some changes to ata-dma.c (provided by Greg Lehey, but I had to do it by hand)

What did you have to do by hand?

> so my drive is now running at UDMA 100.

Can you send me dmesg output? In particular, I had a printf output there to show what the BIOS had set.

Background for other people: Richard has an IDE chip which claims to be a SiS 5591, which according to the data sheet can't do better than UDMA 33. When he runs Linux on the box, however, it claims to be running at UDMA 100, and this hack seems to have had the same effect.

> I have also ensured that disk write caching is on, which it seems to be by default in 4.4.

Yes, I think this is correct.

> These changes have made a difference to the NetBench and dbench runs (improved them), but they have made no difference to the tbench runs, which only do network stuff.

I'd like to see the new dbench results.

> The traffic in the tbench case is SMB traffic. Request/response, with a mixture of small requests and responses, and big request/small response or small request/big response, where big is 64K. I have switched off newreno, and it made no difference. I have switched off delayed_ack, and it reduced performance about 5 percent. I have made sure that SO_SNDBUF and SO_RCVBUF were set to 131072 (which seems to be the max), and it increased performance marginally (about 2%), but consistently.

Have you tried Matt Dillon's patch?
> I am still analysing the packet traces I have, but it seems to me that the crucial difference is that Linux seems to delay longer before sending ACKs, and thus sends fewer ACKs. Since the ACK is piggybacked on the response (or the next request), it all works fine, and the response/request gets there sooner. However, I have not convinced myself that the saving of 20 us or so per request/response pair accounts for some 40+ Mb/s.

As long as the ack traffic isn't saturating the link, and you're not running half-duplex, I can't see how that would be the problem.

Greg
--
See complete headers for address and phone numbers
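For readers following the tuning steps in the message above, setting SO_SNDBUF and SO_RCVBUF looks roughly like this in C. It is a minimal sketch: the helper name `set_socket_buffers` is made up for the example, and the 131072 figure is simply the value quoted in the thread. The helper reads the sizes back with getsockopt() because the kernel may clamp or round the request, so the values actually in effect are worth checking before drawing conclusions from a benchmark.

```c
#define _POSIX_C_SOURCE 200809L
#include <sys/types.h>
#include <sys/socket.h>

/*
 * Hypothetical helper: request `bytes` for both socket buffers and
 * report what the kernel actually settled on.
 * Returns 0 on success, -1 if any call fails (errno is left set).
 */
int set_socket_buffers(int sock, int bytes, int *sndbuf_out, int *rcvbuf_out)
{
    socklen_t len;

    if (setsockopt(sock, SOL_SOCKET, SO_SNDBUF, &bytes, sizeof(bytes)) != 0)
        return -1;
    if (setsockopt(sock, SOL_SOCKET, SO_RCVBUF, &bytes, sizeof(bytes)) != 0)
        return -1;

    /* Read the values back; they may differ from the request. */
    len = sizeof(*sndbuf_out);
    if (getsockopt(sock, SOL_SOCKET, SO_SNDBUF, sndbuf_out, &len) != 0)
        return -1;
    len = sizeof(*rcvbuf_out);
    if (getsockopt(sock, SOL_SOCKET, SO_RCVBUF, rcvbuf_out, &len) != 0)
        return -1;
    return 0;
}
```

A caller would typically invoke `set_socket_buffers(sock, 131072, &snd, &rcv)` before connecting, then log `snd` and `rcv` alongside the benchmark results.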
need cdrw info
Does anyone know if there's a supported IDE CD-RW drive for FreeBSD 4.1? Is there any software on FreeBSD 4.1 to do the CD-RW work?

--
WWW.XGFORCE.COM - The Leader in System Clustering and Enterprise Firewall solution.
Re: TCP Performance Graphs
:I think I tried this patch, and found some problems with it. As
:I recall the problems were with extremely high bandwidth connections
:(e.g., I have two machines that can move 100Mbps FDX across country
:(70ms latency), and when I tried the patch with that case performance
:was bad, in the sense that I got like 20Mbps, rather than 100,
:like it should have allowed).

Yah. RTT noise probably did it in. At those bandwidths the algorithm would be very hard pressed to find the point, as it increases CWIN, where the RTT goes up.

-Matt
Re: [OT] alarm() question
> Why does the alarm go off but not interrupt the system call? bzzt() is executed, but the program doesn't print Done and exit for a minute plus. Pointers to FM to RT welcome.

The system call is being interrupted; it just gets restarted right away by default. See Stevens' UNIX Network Programming for a means of avoiding this behavior.
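For the case where a blocking system call really is involved, the usual way to avoid the automatic restart is to install the SIGALRM handler with sigaction() and leave SA_RESTART out of sa_flags, so the call fails with EINTR instead of being resumed. The sketch below assumes a hypothetical helper, `read_with_timeout`, and is not the original poster's program (which is not shown in the thread).

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <signal.h>
#include <string.h>
#include <unistd.h>

static volatile sig_atomic_t alarm_fired;

static void on_alarm(int signo)
{
    (void)signo;
    alarm_fired = 1;
}

/*
 * Hypothetical helper: read() with a timeout in whole seconds.
 * Returns 1 if the read was interrupted by the alarm,
 * 0 if the read completed, and -1 on any other error.
 */
int read_with_timeout(int fd, void *buf, size_t len, unsigned seconds)
{
    struct sigaction sa, old;
    ssize_t n;
    int saved_errno;

    memset(&sa, 0, sizeof(sa));
    sa.sa_handler = on_alarm;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = 0;            /* deliberately no SA_RESTART */
    if (sigaction(SIGALRM, &sa, &old) != 0)
        return -1;

    alarm_fired = 0;
    alarm(seconds);
    n = read(fd, buf, len);     /* fails with EINTR when the alarm fires */
    saved_errno = errno;
    alarm(0);                   /* cancel any pending alarm */
    sigaction(SIGALRM, &old, NULL);

    if (n < 0 && saved_errno == EINTR && alarm_fired)
        return 1;
    return n < 0 ? -1 : 0;
}
```

Handlers installed with signal() on BSD systems get restart semantics, which is why the call appears never to be interrupted. Note that there is still a small window between alarm() and read() in which the signal could arrive early; with timeouts of a second or more this rarely matters.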
Make RELEASE broken?
Okay, I'll ask again just in case nobody saw this! Is make release broken in 4.4-STABLE? Or is there a definitive guide/FAQ on how to properly use make release to cut a modified distribution? Because either I'm doing something wrong, or it's definitely broken.

Thanks in advance
Re: [OT] alarm() question
> The system call is being interrupted; it just gets restarted right away by default. See Stevens' UNIX Network Programming for a means of avoiding this behavior.

Of course, I'm completely wrong, because we're not even talking about a system call here. Mike Mired already posted what you need.
Possible libc_r pthread bug
If at first you don't succeed... I've encountered a problem using pthread_cancel, pthread_join and pthread_setcanceltype, I'm hoping someone can shed some light. (in a nutshell : pthread_setcanceltype doesn't seem to work in FreeBSD 4.4) (posted to -current and -hackers; if there's a more appropriate mailing list for this, please let me know) I recently encountered a situation where, after calling pthread_cancel to cancel a thread, the call to pthread_join hangs indefinitely. I quickly figured out that it was because the thread being cancelled was never reaching a cancellation point (in fact it was an infinite loop with no function calls at all). Sure enough, adding a pthread_testcancel() in the loop allowed pthread_join to return. However this solution isn't acceptable for my requirements. I discovered the pthread_setcanceltype function and its PTHREAD_CANCEL_ASYNCHRONOUS parameter, which looked like they would give me exactly what I needed : allow threads to be cancelled regardless of what they are doing (basically a pthread equivalent to TerminateThread). Unfortunately, my tests have been less than conclusive : pthread_setcanceltype doesn't seem to do anything at all. It tells me it succeeds, subsequent calls properly report the previous cancellation type as ASYNCHRONOUS. But pthread_join still hangs, and adding pthread_testcancel calls still makes it work... I'm working on a FreeBSD 4.4-release machine; I ran the same test under FreeBSD 4.3-release and got the same results. However, running it on a Linux box (Mandrake release, 2.4.x kernel), I get exactly the results I was expecting (that is, setting the cancellation type to asynchronous allows the thread to be cancelled at any time) see the end of this message for my test program So the questions are -am I doing something wrong or misinterpreting the man pages? -if not, is this a known bug? -if so, is there a workaround (or is it already fixed)? -if not, can someone investigate? 
(I once had a look at the libc_r code and ran away screaming)

If this turns out to be a bug in libc_r, a suggestion for a work-around (even a hack) would be much appreciated, even if a proper fix is found and committed to CVS (requiring an upgrade from 4.4-release installations is something we'd rather avoid).

now for some disclaimers :

I'm aware that asynchronous cancellations (TerminateThread-style) are an Evil Thing To Do. Unfortunately I have no choice in the matter.

I'm aware that there are some strict limitations on what a thread is allowed to do while its cancellation type is asynchronous. Specifically, it should only call cancel-safe functions. Note that in my test program, the thread being cancelled doesn't call any functions at all after setting its cancellation type, so this shouldn't be an issue.

now for the code :

    #include <stdio.h>
    #include <pthread.h>

    /* thread entry point */
    void *
    thread_entry(void *arg)
    {
        int i;

        if (0 != pthread_setcanceltype(PTHREAD_CANCEL_ASYNCHRONOUS, NULL)) {
            fprintf(stderr, "setcanceltype failed!\n");
        }
        fprintf(stderr, "thread_entry entering loop\n");
        while (1) {
            i++;
            /* uncomment this to insert a cancellation point */
            /* pthread_testcancel(); */
        }
        /* if we see this, it would mean the loop has been optimized out... */
        fprintf(stderr, "after loop\n");
    }

    int
    main(void)
    {
        pthread_t thread;
        pthread_attr_t attr;
        void *pthread_param;

        pthread_attr_init(&attr);
        fprintf(stderr, "creating thread\n");
        pthread_create(&thread, &attr, thread_entry, NULL);
        fprintf(stderr, "thread created; hit enter to cancel it...\n");
        getchar();
        fprintf(stderr, "cancelling...\n");
        if (0 != pthread_cancel(thread)) {
            fprintf(stderr, "cancel failed!\n");
        }
        fprintf(stderr, "after cancel, before join...\n");
        if (0 != pthread_join(thread, &pthread_param)) {
            fprintf(stderr, "join failed!\n");
        }
        fprintf(stderr, "after join\n");
    }

please ask if more details are needed

Thanks in advance,

Louis-Philippe Gagnon ~ [EMAIL PROTECTED]
Macadamian Technologies
Software experts for the world's leading technology companies.
http://www.macadamian.com
Re: Possible libc_r pthread bug
On Fri, 30 Nov 2001, Louis-Philippe Gagnon wrote: If at first you don't succeed... I've encountered a problem using pthread_cancel, pthread_join and pthread_setcanceltype, I'm hoping someone can shed some light. (in a nutshell : pthread_setcanceltype doesn't seem to work in FreeBSD 4.4) (posted to -current and -hackers; if there's a more appropriate mailing list for this, please let me know) I recently encountered a situation where, after calling pthread_cancel to cancel a thread, the call to pthread_join hangs indefinitely. I quickly figured out that it was because the thread being cancelled was never reaching a cancellation point (in fact it was an infinite loop with no function calls at all). Sure enough, adding a pthread_testcancel() in the loop allowed pthread_join to return. However this solution isn't acceptable for my requirements. I discovered the pthread_setcanceltype function and its PTHREAD_CANCEL_ASYNCHRONOUS parameter, which looked like they would give me exactly what I needed : allow threads to be cancelled regardless of what they are doing (basically a pthread equivalent to TerminateThread). Unfortunately, my tests have been less than conclusive : pthread_setcanceltype doesn't seem to do anything at all. It tells me it succeeds, subsequent calls properly report the previous cancellation type as ASYNCHRONOUS. But pthread_join still hangs, and adding pthread_testcancel calls still makes it work... I'm working on a FreeBSD 4.4-release machine; I ran the same test under FreeBSD 4.3-release and got the same results. However, running it on a Linux box (Mandrake release, 2.4.x kernel), I get exactly the results I was expecting (that is, setting the cancellation type to asynchronous allows the thread to be cancelled at any time) see the end of this message for my test program So the questions are -am I doing something wrong or misinterpreting the man pages? No, not really. -if not, is this a known bug? Or feature? 
> -if so, is there a workaround (or is it already fixed)?

Not fixed. A work-around could be to send the thread a signal (pthread_kill) and exit the thread from the handler.

> -if not, can someone investigate? (I once had a look at the libc_r code and ran away screaming)

Since your thread is compute bound, it is only woken up from the thread library's scheduling signal handler. In this case, it can only resume the thread from the interrupted context, and so there is no check for the thread being canceled.

--
Dan Eischen
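The signal-based work-around suggested above can be sketched as follows. This is one possible reading of the suggestion, not code from the thread, and it has not been tested against libc_r on FreeBSD 4.4. The names (`spin_worker`, `on_stop_signal`, `stop_spinning_thread`) and the choice of SIGUSR1 are assumptions. Instead of calling pthread_exit() directly inside the handler, the sketch uses siglongjmp() to jump back into the thread's entry function, which then returns normally; that variation is a swapped-in technique, chosen because the spinning loop calls no functions and so nothing is left half-done when the jump happens.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>
#include <sched.h>
#include <setjmp.h>
#include <signal.h>
#include <stdatomic.h>
#include <string.h>

#define STOPPED_BY_SIGNAL ((void *)1)   /* hypothetical exit marker */

static sigjmp_buf worker_exit_point;    /* only one worker in this sketch */
static atomic_int worker_ready;

/* Runs in the worker thread because pthread_kill() targets that thread. */
static void on_stop_signal(int signo)
{
    (void)signo;
    siglongjmp(worker_exit_point, 1);
}

static void *spin_worker(void *arg)
{
    volatile unsigned long spins = 0;

    (void)arg;
    /* Second return (via the handler) ends the thread; mask is restored. */
    if (sigsetjmp(worker_exit_point, 1) != 0)
        return STOPPED_BY_SIGNAL;

    atomic_store(&worker_ready, 1);     /* jump target is now valid */
    for (;;)
        spins++;                        /* compute bound, no cancellation point */
}

/* Start a spinning thread, stop it with a signal, and join it.
 * Returns 0 if the thread exited through the signal path. */
int stop_spinning_thread(void)
{
    struct sigaction sa;
    pthread_t thread;
    void *result = NULL;

    memset(&sa, 0, sizeof(sa));
    sa.sa_handler = on_stop_signal;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGUSR1, &sa, NULL) != 0)
        return -1;
    if (pthread_create(&thread, NULL, spin_worker, NULL) != 0)
        return -1;

    while (!atomic_load(&worker_ready))
        sched_yield();                  /* never signal before sigsetjmp() ran */

    if (pthread_kill(thread, SIGUSR1) != 0)
        return -1;
    if (pthread_join(thread, &result) != 0)
        return -1;
    assert(result == STOPPED_BY_SIGNAL || result == NULL);
    return result == STOPPED_BY_SIGNAL ? 0 : -1;
}
```

The same caveats as asynchronous cancellation apply: this is only reasonable when the interrupted code holds no locks and is not in the middle of a library call, since the jump abandons whatever the thread was doing.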