Re: [PATCH] PPPOE can kfree SKB twice (was Re: kernel panic problem. (smp, iptables?))

2001-07-19 Thread kuznet
Hello! > However, could we not have dev_queue_xmit behave as such (not free > frame on failure)? If you need to hold original skb, you may hold its refcnt. However, this feature inevitably results in big troubles: dev_queue_xmit() is allowed to change skb and you cannot assume anything about

Re: [PATCH] PPPOE can kfree SKB twice (was Re: kernel panic problem. (smp, iptables?))

2001-07-19 Thread kuznet
Hello! Some short comments on the patch: > - dev_queue_xmit(skb); > + /* The skb we are to transmit may be a copy (see above). If > + * this fails, then the caller is responsible for the original > + * skb, otherwise we must free it. Also if this fails we must > + * free

Re: softirq in pre3 and all linux ports

2001-06-20 Thread kuznet
Hello! > Soft irqs should definitely not be much heavier than an irq handler, > if they are then we have implemented them wrongly somehow. For example, all the networking nicely fits to this class. :-) Alexey - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body

Re: IPv6: the same address can be added multiple times

2001-05-18 Thread kuznet
Hello! > 2) no significant restrictions (==this) When user asks to create some object, the only required thing of any reasonable interface is to return an error when the object is not added. KAME's one is broken, ours is _one_ of right ones. Another example of bad mistake is mine: I have mad

Re: NETDEV_CHANGE events when __LINK_STATE_NOCARRIER is modified

2001-05-14 Thread kuznet
Hello! > Note that using dev->name during probe was always incorrect. Think > about the error case: ... > So, using interface name in this manner was always buggy because it > conveys no useful information to the user. I used to think about cases of success. 8) In any case the question follows:

Re: NETDEV_CHANGE events when __LINK_STATE_NOCARRIER is modified

2001-05-14 Thread kuznet
Hello! > Jeff has introduced `alloc_etherdev()' which allocates storage > for a netdev but doesn't register it. The one quirk with this > approach (and why it's vastly simpler than my thing) I do not see where it is simpler. The only difference is that name is unknown. 8) > Not many drivers

Re: skb->truesize > sk->rcvbuf == Dropped packets

2001-05-14 Thread kuznet
Hello! > Hmmm... I don't see how not touching buffer values can solve his > problem at all. His MTU is really HUGE, and in this case 300 byte > packet eats 10k or so space in receive buffer. Default rcvbuf is ~64K, it is enough to receive up to mtu of a bit less 64K. When application says rcvbu
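The arithmetic behind the quoted complaint can be sketched as follows. This is a simplified model, not the kernel's actual accounting: it assumes every queued packet is charged a fixed truesize against the receive buffer. The helper name `packets_that_fit` is hypothetical, and the 10k and 64K figures are the approximate ones from the message.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper: how many packets can be queued if each one is
 * charged `truesize` bytes against a receive buffer of `rcvbuf` bytes.
 * Simplified model; the real accounting has more cases.
 */
static size_t packets_that_fit(size_t rcvbuf, size_t truesize)
{
    assert(truesize > 0); /* a queued packet always occupies some memory */
    return rcvbuf / truesize;
}
```

With a 64K buffer and roughly 10K charged per packet, `packets_that_fit(65536, 10240)` gives 6, so only about 1800 bytes of 300-byte payloads are queued before drops begin.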

Re: skb->truesize > sk->rcvbuf == Dropped packets

2001-05-13 Thread kuznet
Hello! > > Any suggestions on heuristics for this ? Not to set rcvbuf to ridiculously low values. The best variant is not to touch SO_*BUF options at all. Alexey

Re: NETDEV_CHANGE events when __LINK_STATE_NOCARRIER is modified

2001-05-13 Thread kuznet
Hello! > I believe these events get sent to the cardmgr daemon and it does > all the ifconf magic to change the device state. Compare this also to the situation with netif_present(). After Linus said that it is called from thread context, I prepared corresponding code for netif_present (and for

Re: IPv6: the same address can be added multiple times

2001-05-13 Thread kuznet
Hello! > It appears you can add _exactly_ same IPv6 address on an interface many > times: Yes. BTW, look here: kuznet@dust:~ # ip -6 a ls sit0 7: sit0@NONE: mtu 1480 qdisc noqueue inet6 ::127.0.0.1/96 scope host inet6 ::193.233.7.100/96 scope global inet6 ::193.233.7.100/96

Re: 2.4.4: Kernel crash, possibly tcp related

2001-05-01 Thread kuznet
Hello! > If send_head doesn't point to skb then it is before it (and it cannot > advance under us of course because we hold the sock lock) and so in such > case we didn't clobbered the send_head at all in skb_entail, and so we > don't need to touch send_head in order to undo (we only need to unli

Re: 2.4.4: Kernel crash, possibly tcp related

2001-05-01 Thread kuznet
Hello! > zero and we are running in such slow path, it is obvious the send_head > _was_ NULL when we entered the critical section, so it's perfectly fine It is not only not obvious, it is not true almost always. On normally working tcp send_head is almost never NULL, it is NULL only when applica

Re: 2.4.4: Kernel crash, possibly tcp related

2001-05-01 Thread kuznet
Hello! > this is the strict fix: Andrea, you caught the problem! The fix is not right though (it is equivalent to straight tp->send_head=NULL, as you noticed. It also corrupts queue in an opposite manner.) Right fix is appended. Explanation: in do_fault we must undo effect of enqueueing new se

Re: 2.4.4: Kernel crash, possibly tcp related

2001-04-30 Thread kuznet
Hello! > My current theory is that tcpblast does something erratic when the > error occurs. It has buffer size of 32K, so that it faults at enough large chunk sizes. Erratic errno is because this applet prints errno on partial write. Oops is apparently because I did something wrong in do_fault

Re: Bug report: tcp staled when send-q != 0, timers == 0.

2001-04-21 Thread kuznet
Hello! > In my case P-MTU discovery Sorry, I lied. Not pmtu discovery but exactly opposite effect is important here: collapsing of small frames to larger ones. Each such merge results in loss of 1 "sack" in 2.2. > I only wrote that it was active when got stuck. It may be idle before - > I do

Re: CONFIG_PACKET_MMAP help

2001-04-20 Thread kuznet
Hello! > 1. for tp_frame_size, I dont want to truncate any data on ethernet, I > need 1514 bytes, is this the best way to do it and not waste space? To select small snapsize (obtained from later experiments), to set PACKET_COPY_THRESH to read larger packets via recvmsg(). > 2. what is tp_block_

Re: Bug report: tcp staled when send-q != 0, timers == 0.

2001-04-11 Thread kuznet
Hello! >mtu 382 + keepalive yes -> loss >mtu 382 + keepalive no -> ok Well, I ignored this because it looked as full sense. Sorry. 8) > such a picture? If the answer is "yes", I am almost satisfied. :-) No, the answer is strict "no". Until keepalive is triggered the first time, it ca

Re: Bug report: tcp staled when send-q != 0, timers == 0.

2001-04-11 Thread kuznet
Hello! > If your model does not cover such situation, pls, take it in mind. :) Taken. Is the machine UP? The only other known dubious place is smp specific... BTW if that cursed socket is still alive, try to make the experiment with filling window on it. It must stuck, or my theory is complet

Re: Bug report: tcp staled when send-q != 0, timers == 0.

2001-04-11 Thread kuznet
Hello! > In my experiments linux simply sets mss=mtu-40 at the start of ethernet > connections. I do not know why, but believe it's ok. How the version of > kernel and configuration options can affect mss later? You can figure out this yourself. In fact you measured this. With mss=1460 the pr

Re: Bug report: tcp staled when send-q != 0, timers == 0.

2001-04-11 Thread kuznet
Hello! > At last, I tried several MTUs on 3d computer, running "right" 2.2.17, and > could not find conditions, under which any loss of ACKs can be detected. 8)8)8) ppp also inclined to the mss/mtu bug, it allocates too large buffers and never breaks them. The difference between kernels looks

Re: Bug report: tcp staled when send-q != 0, timers == 0.

2001-04-11 Thread kuznet
Hello! > > If my guess is right, you can easily put this socket to funny state > > just catting a large file and kill -STOP'ing ssh. ssh will close window, > > but sshd will not send zero probes. > > [1] I have checked your statement on 2 different machines, running 2.2.17. > No confirmation.

Re: Bug report: tcp staled when send-q != 0, timers == 0.

2001-04-10 Thread kuznet
Hello! > In brief: a stale state of the tcp send queue was observed for 2.2.17 > while send-q counter and connection window sizes are not zero: I think I pinned down this. The patch is appended. > diagnostic, I'll try to get it. In any case, I plan to run something through > this connecti

Re: [PATCH] Re: softirq buggy

2001-04-09 Thread kuznet
Hello! > Btw, you don't schedule the ksoftirqd thread if do_softirq() returns > from the 'if(in_interrupt())' check. ksoftirqd will not be switched to before the first schedule or ret from syscall, when softirqs will be processed in any case. So, wake up in this case would be mistake. > I assu

Re: softirq buggy [Re: Serial port latency]

2001-04-08 Thread kuznet
Hello! > But with a huge overhead. I'd prefer to call it directly from within the > idle functions, the overhead of schedule is IMHO too high. + if (current->need_resched) { + return 0; + } + if (softirq_active(smp_processor_id()) & softi

Re: TCP stack misbehaviour?

2001-04-08 Thread kuznet
Hello! > empty, except for occasional ACKs. The utilization of the channel is about 4%. 1. tcpdump is required. 2. exact version of used kernel is required too. Alexey

Re: new queuing discipline

2001-04-08 Thread kuznet
Hello! > packet in the queue. No other conditions i found. But i need repeatedly test > the top packet in the queue. > > How to accomplish it? Look into sch_tbf.c for example. Hint: timer. Alexey

Re: softirq buggy [Re: Serial port latency]

2001-04-08 Thread kuznet
Hello! > + if (softirq_active(smp_processor_id()) & softirq_mask(smp_processor_id())) { > + do_softirq(); > + return 0; BTW you may delete do_softirq()... schedule() will call this. > + * > + * Isn't this identical to default_idle with the 'no-hlt' boot > + * option

Re: udp <-> tcp connect

2001-03-31 Thread kuznet
Hello! > I want to bind to non-local IP and send/receive UDP packets. This is impossible, apparently. > but in tcp_v4_connect: > tmp = ip_route_connect(&rt, nexthop, sk->saddr, > RT_TOS(sk->ip_tos)|RTO_CONN|sk->localroute, sk->bound_dev_if); > ^^

Re: IP layer bug?

2001-03-31 Thread kuznet
Hello! > Hm. But comment in linux/skbuff.h says: The comment is about more difficult case: transmit path, where cb is used both by top level protocol and lower layers: f.e. TCP -> IP -> device. cb is dirty from the moment of skb creation in this case. Also, note that the second sentence in the

Re: IP layer bug?

2001-03-30 Thread kuznet
Hello! >For now I workarounded it with filling skb->cb with zeroes before >netif_rx(), This is right. For another examples look into tunnels. > but I believe it is a kludge and networking layer should be fixed instead. No. alloc_skb() creates skb with clean cb. ip_rcv() and other prot

Re: rsync over ssh on 2.4.2 to 2.2.18

2001-03-19 Thread kuznet
Hello! > Well, since I moved the rsync to 5pm, and then back to 9pm, I haven't > seen this problem - everything is again working as expected (touch wood) > with 2.2.15pre13 and 2.4.0. > > This is odd, since it wasn't a one-off problem, but something that happened > each and every day of a partic

Re: poll() behaves differently in Linux 2.4.1 vs. Linux 2.2.14 (POLLHUP)

2001-03-15 Thread kuznet
Hello! > Sure, workarounds exist, but they just complicates > things. Working around --- what? An example of application hitting the case is enough to make me completely agreed. But generally we are not going to match any os and even yourselves yesterday or tomorrow in the cases when behaviour

Re: poll() behaves differently in Linux 2.4.1 vs. Linux 2.2.14 (POLLHUP)

2001-03-14 Thread kuznet
Hello! > True, this behavior was changed from 2.2.x. We now match the behavior > of other svr4 systems, in particular Solaris. Damn, we did not test behaviour on absolutely new clean never connected socket... Solaris really may return 0 on it. However, looking from other hand the issue looks a

Re: Feedback for fastselect and one-copy-pipe

2001-03-12 Thread kuznet
Hello! > freebsd-4.0 doesn't use direct transfers for PAGE_SIZE'd pipe write()s: > it uses MINDIRECT=8192. I see. > (and PIPE_BUF is 512, so 4096 was possible for > them) 8) I see. Thank you for patience. 8) Alexey

Re: Feedback for fastselect and one-copy-pipe

2001-03-12 Thread kuznet
Hello! > freebsd Very funny, the idea is borrowed from there. As you could understand your patch kills it. PAGE_SIZE is one of the most frequently used transfer units. Alexey

Re: Feedback for fastselect and one-copy-pipe

2001-03-12 Thread kuznet
Hello! > It returns immediately on all unix platforms I tested I see. It is essential moment. PAGE_SIZE was really bad threshold value. Sigh and alas. Alexey PS BTW "all unix" is unlikely to include freebsd. 8)

Re: Feedback for fastselect and one-copy-pipe

2001-03-12 Thread kuznet
Hello! > * davem's patch breaks apps that assume that write(,PIPE_BUF) after > poll(POLLOUT) never blocks, even for blocking pipes. Pardon, but PIPE_BUF <= PAGE_SIZE yet, so that fears have no reasons. Alexey
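The relation the reply relies on, PIPE_BUF not exceeding the page size, can be checked at run time on a given system. The following is a minimal sketch using standard POSIX calls (`pipe`, `fpathconf`, `sysconf`); the function name `pipe_buf_fits_in_page` is hypothetical, and the result describes only the machine it runs on, not any particular kernel version discussed in the thread.

```c
#include <assert.h>
#include <unistd.h>

/*
 * Hypothetical helper: returns 1 if the atomic pipe write limit
 * (PIPE_BUF for a freshly created pipe) is no larger than one page,
 * 0 if it is larger, and -1 if the limits cannot be queried.
 */
static int pipe_buf_fits_in_page(void)
{
    int fds[2];
    long pipe_buf, page_size;

    if (pipe(fds) != 0)
        return -1;
    pipe_buf = fpathconf(fds[0], _PC_PIPE_BUF);
    close(fds[0]);
    close(fds[1]);

    page_size = sysconf(_SC_PAGESIZE);
    assert(page_size > 0); /* the page size is always defined */

    if (pipe_buf < 0)
        return -1;
    return pipe_buf <= page_size;
}
```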

Re: Incoming TCP TOS: A simple question, I would have thought...

2001-03-07 Thread kuznet
Hello! > I've scrolled through various code in net/ipv4, and I can't see how to query > the TOS of an incoming TCP stream (or at the least, the TOS of the SYN which > initiated the connection). No way. Formally it is IP_RECVTOS, followed by IP_PKTOPTIONS. But getting TOS via IP_PKTOPTIONS is no

Re: Inadequate documentation: sockets

2001-03-06 Thread kuznet
Hello! > The manual specifies the following flag to be returned by the > kernel > > #define POLLHUP 0x0010 /* Hung up */ > > Hanging up is ambiguous. Does it mean that the client is dead, > that he closed his end of the socket, or that he shut down one or > both directions of the data flo

Re: Another rsync over ssh hang (repeatable, with 2.4.1 on both ends)

2001-03-03 Thread kuznet
Hello! > this kernel was compiled with GCC 2.95.2, This is a hint. Could you make the following things: 1. to disassemble tcp_poll() (the easiest way is to gdb vmlinux, to say x/i tcp_poll and to hold enter pressed long enough, copying screen to file) and to send the result to me. 2. to

Re: Another rsync over ssh hang (repeatable, with 2.4.1 on both ends)

2001-03-02 Thread kuznet
Hello! > same means its not the same bug? It is the same, I think. > If you still insist that it is purely a 2.2.15pre13 bug I never said this. I said that your strace is _wrong_, how can I be sure that tcpdump is not wrong too? You could understand this. 8) > together to put 2.2.18 on this

Re: Another rsync over ssh hang (repeatable, with 2.4.1 on both ends)

2001-03-02 Thread kuznet
Hello! > I've also reported The report by Scott Laird is sane unlike your one. It can be explained by bug rather than only by poltergeist. 8) > Thanks for confirming that 2.2.15pre13 is not the cause. Russel, you are warned that kernels<2.2.17 and rsync is an incompatible combination. Alexey

Re: What is 2.4 Linux networking performance like compared to BSD?

2001-03-01 Thread kuznet
Hello! > They know that iMimic's polymix performance on Linux 2.2.* is half what it is on > BSD. What is "iMimic's polymix"? I am almost sure, it is simply buggy and was not _debugged_ under linux. Alexey

Re: rsync over ssh on 2.4.2 to 2.2.18

2001-02-28 Thread kuznet
Hello! > I'll see if I can strace it from the start until it hangs tomorrow. Please... Also, try to make binary tcpdump. > I was running at one point a 2.4.0-test kernel, but I didn't see these Yes, it did not result in full stall. Lost wakeups were recovered f.e. by any keyboard activity. 8

Re: rsync over ssh on 2.4.2 to 2.2.18

2001-02-28 Thread kuznet
Hello! > I've seen hanging rsync over ssh more than once, while sending much data > from an x86 running Linux (late 2.3.x) to Sparc/Solaris2.5.1 I remember this your report. However, recent news force to suspect that the reason was in Solaris yet. Actually, if you send tcpdump of failed session,

Re: rsync over ssh on 2.4.2 to 2.2.18

2001-02-27 Thread kuznet
Hello! > netstat on isdn-gw shows the following: > > Proto Recv-Q Send-Q Local Address Foreign Address State > tcp 72868 0 isdn-gw.piltdown.a:1023 pilt-gw.piltdown.at:ssh ESTABLISHED plus > select(4, [3], [3], NULL, NULL) = 2 (in [3], out [3]) > Ma

Re: New net features for added performance

2001-02-27 Thread kuznet
Hello! > > 3) Enforce correct usage of it in all the networking :-) > > ,) -- the tricky part. No tricks, IP[v6] is already enforced to be clever; all the rest are free to do this, if they desire. And btw, driver need not to parse anything, but its internal stuff and even aligning eth II header

Re: possible bug x86 2.4.2 SMP in IP receive stack

2001-02-27 Thread kuznet
Hello! > Feb 23 12:42:30 rcc2 kernel: Warning: kfree_skb passed an skb still on a list (from >c01f58dc). BTW, that's didactic example of bug which results in similar behaviour. Alexey > From: [EMAIL PROTECTED] (Andrew Morton) > Subject: Re: Failed assertion > Date: 27 Feb 2001 04:15:01 +0300

Re: Very high bandwith packet based interface and performance problems

2001-02-23 Thread kuznet
Hello! > > Yes its a SHOULD in RFC1122, but in any normal environment pretty much a > > must and I know of no stack significantly violating it. > > I didn't know there was such a thing as a normal environment :) Jokes apart, such "normal" environments are rare today. From tcpdumps it is clear

Re: 2.4.1 under heavy network load - more info

2001-02-21 Thread kuznet
Hello! > OK! I actually expected 2.4 to be somewhat selftuning. Defaults for these numbers (X,Y,Z) are very conservative. > Interesting you say that, I looked at the logs and I see over 5000 sockets > used, does'nt look peaceful to me. But you are absolutely right about the > orphans. The erro

Re: 2.4.1 under heavy network load - more info

2001-02-20 Thread kuznet
Hello! > of errors a bit but I'm not sure I fully understand the implications of > doing so. Until these numbers do not exceed total amount of RAM, this is exactly the action required in this case. Dumps, which you sent to me, show nothing pathological. Actually, they are made in some period of

Re: MTU and 2.4.x kernel

2001-02-19 Thread kuznet
Hello! > We are implementing an IP stack. Alan, please, tell me what is wrong. And we will repair this. The implementation follows RFCs and even relaxes their requirements in the cases, when they are far from reality. Alexey

Re: SO_SNDTIMEO: 2.4 kernel bugs

2001-02-19 Thread kuznet
Hello! > You are right - our sendfile() implementation is broken. I have fixed it Thank you! > Investigation shows that the Linux network layer is behaving oddly. It > seems that we are writing 4096 bytes to a socket. This proceeds in 4096 > byte chunks until the send buffer on the socket is f

Re: MTU and 2.4.x kernel

2001-02-18 Thread kuznet
Hello! > Please cite an exact RFC reference. Imagine, I found this reference yet. This is rfc1191, of course. 8) in the MSS option. The MSS option should be 40 octets less than the size of the largest datagram the host is able to reassemble (MMS_R, as defined in [1]); in many cases, t

Re: MTU and 2.4.x kernel

2001-02-18 Thread kuznet
Hello! > This smells bad. Datagram protocol send sizes are only limited by > socket buffer size, nothing more. Fragmentation makes it work. The thread was started from the observation that fragmented frames do _not_ pass through router. See? 8) Path mtu discovery exists exactly to help to sol

Re: MTU and 2.4.x kernel

2001-02-18 Thread kuznet
Hello! > Wouldn't it be simpler to just fix the bugs There are no bugs. There is philosophical discussion about current state of internet communications. Alexey

Re: MTU and 2.4.x kernel

2001-02-18 Thread kuznet
Hello! > Message size != MTU. Alan, you misunderstand _sense_ of the problem. Fragmentation does _not_ work on poor internet more. At all. Look at original report. It failed _only_ because his intermediate node failed to forward fragmented packets. Alexey

Re: SO_SNDTIMEO: 2.4 kernel bugs

2001-02-18 Thread kuznet
Hello! > .. unless that page was partially written, in which case a short write > count is returned (rather than a timeout error), and the loop goes around > again. sendfile() does not return on partial write and tries to push more until error. On fast link it most likely succeeds, so that it is

Re: SO_SNDTIMEO: 2.4 kernel bugs

2001-02-18 Thread kuznet
Hello! > So the actual timeout would be 2 * SO_SNDTIMEO. It will timeout if write of some page blocks for SO_SNDTIMEO. If transmission of any page never takes more than SO_SNDTIMEO it never times out. You can think about sendfile() as subroutine doing: for (;;) { read(4
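The loop in the message is cut off in this listing, so the following is only a sketch of the general shape it describes: read one chunk, write it out, repeat. The 4096-byte chunk size and the name `copy_in_chunks` are assumptions for illustration, not the kernel's sendfile() code. Because each `write()` is a separate blocking call, a send timeout on the output descriptor would apply to each call on its own rather than to the transfer as a whole.

```c
#include <assert.h>
#include <errno.h>
#include <unistd.h>

#define CHUNK_SIZE 4096 /* assumed page-sized transfer unit */

/*
 * Hypothetical helper: copy everything from in_fd to out_fd one chunk
 * at a time. Returns the number of bytes copied, or -1 on error with
 * errno set by the failing read() or write().
 */
static long copy_in_chunks(int in_fd, int out_fd)
{
    char buf[CHUNK_SIZE];
    long total = 0;

    assert(in_fd >= 0 && out_fd >= 0);

    for (;;) {
        ssize_t n = read(in_fd, buf, sizeof buf);
        ssize_t off = 0;

        if (n == 0)
            return total;           /* end of input */
        if (n < 0) {
            if (errno == EINTR)
                continue;           /* interrupted, retry the read */
            return -1;
        }
        while (off < n) {           /* handle partial writes */
            ssize_t w = write(out_fd, buf + off, (size_t)(n - off));
            if (w < 0) {
                if (errno == EINTR)
                    continue;       /* interrupted, retry the write */
                return -1;
            }
            off += w;
        }
        total += n;
    }
}
```

In this model a transfer of many chunks never times out as long as no single write blocks longer than the timeout, which matches the behaviour described in the message.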

Re: 2.2.x: TCP lockups with tcp_timestamps

2001-02-18 Thread kuznet
Hello! > Yes. The 5.6.7.8 machine is connected to the Internet via a Linksys > "router" that is also performing masquerade. > > I will be very angry if this turns out to be the culprit. I am afraid it is. It corrupts packets preserving their checksum. Look: > Trace taken from 1.2.3.4 machi

Re: SO_SNDTIMEO: 2.4 kernel bugs

2001-02-18 Thread kuznet
Hello! > Unfortunately, I discovered a bug with SO_SNDTIMEO/sendfile(): None of the options apply to sendfile(). It is not socket level operation. You have to use alarm for it. BTW, if you have enough fast network, you probably can observe that sendfile() is even not interrupted by signals. 8)
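For comparison, this is how the option under discussion is set for ordinary socket writes. It is a minimal sketch using the standard `setsockopt()` call with `SO_SNDTIMEO` and a `struct timeval`; the wrapper name `set_send_timeout` is hypothetical. According to the message, this setting has no effect on sendfile(), for which alarm() is suggested instead.

```c
#include <assert.h>
#include <sys/socket.h>
#include <sys/time.h>

/*
 * Hypothetical wrapper: set the send timeout of a socket to a whole
 * number of seconds. Returns 0 on success, -1 on error (errno is set
 * by setsockopt).
 */
static int set_send_timeout(int sock_fd, long seconds)
{
    struct timeval tv;

    assert(seconds >= 0);
    tv.tv_sec = seconds;
    tv.tv_usec = 0;
    return setsockopt(sock_fd, SOL_SOCKET, SO_SNDTIMEO, &tv, sizeof tv);
}
```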

Re: SO_SNDTIMEO: 2.4 kernel bugs

2001-02-17 Thread kuznet
Hello! > Unfortunately, it seems to be very buggy. Here are two buggy scenarios. --- ../vger3-010210/linux/net/ipv4/tcp.c Sat Feb 10 23:16:51 2001 +++ linux/net/ipv4/tcp.c Sat Feb 17 23:27:43 2001 @@ -691,6 +691,8 @@ set_current_state(TASK_INTERRUPTIBLE); +

Re: MTU and 2.4.x kernel

2001-02-15 Thread kuznet
Hello! > I ran DNS reliably over AX.25 networks. They have an MTU of 216. They work. Please, Alan, distinguish two things: "works" and "works, until I ask X". The second is equal to "does not". 512 is maximal message size, which is transmitted without troubles, hardwired to almost all the datag

Re: strange tcp errors

2001-02-15 Thread kuznet
Hello! > Maybe someone want to say me what does it mean and how serious it is? It means that debugging messages are still not disabled in 2.4.x 8) > Any fixes? These ones can be ignored. Alexey

Re: MTU and 2.4.x kernel

2001-02-15 Thread kuznet
Hello! > Please cite an exact RFC reference. No need to cite RFC, this is plain syllogism. A. Datagram protocols do not work with mtus not allowing to send 512 byte frames (even DNS). B. Accounting, classification, resource reservation does not work on fragmented packets. -> IP suite is n

Re: MTU and 2.4.x kernel

2001-02-15 Thread kuznet
Hello! > Kernel 2.4.x apparently disregards my ppp options MTU setting of 552 > and sets mss=536 (=> MTU=576). Yes, default configuration is not allowed to advertise mss<536. The limit is controlled via /proc/sys/net/ipv4/route/min_adv_mss, you can change it to 256. Default of 536 is sadistic (
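The numbers in this exchange follow from simple arithmetic, sketched below. This is an illustration, not kernel code: it assumes 40 bytes of IPv4 and TCP headers with no options, and the function name `advertised_mss` is hypothetical. The floor corresponds to the min_adv_mss limit mentioned in the message.

```c
#include <assert.h>

#define IP_TCP_HEADERS 40 /* 20-byte IPv4 header + 20-byte TCP header, no options */

/*
 * Hypothetical helper: the MSS a host would advertise for a given MTU
 * when it refuses to advertise less than min_adv_mss.
 */
static int advertised_mss(int mtu, int min_adv_mss)
{
    int mss;

    assert(mtu > IP_TCP_HEADERS);
    mss = mtu - IP_TCP_HEADERS;
    return mss < min_adv_mss ? min_adv_mss : mss;
}
```

With the reporter's MTU of 552 and the default floor of 536, `advertised_mss(552, 536)` yields 536, which is the observed value. Lowering the floor to 256 gives `advertised_mss(552, 256) == 512`, the value the MTU alone would imply.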

Re: 2.4.1 errors under heavy network load

2001-02-12 Thread kuznet
Hello! > Then we did the following tuning attempt: Please, cat /proc/net/tcp and /proc/net/sockstat and send result to me (gzipped), together with /proc/sys/net/ipv4/tcp_* values > echo "20480" > /proc/sys/net/ipv4/tcp_max_orphans Also, you want to increase memory allowed for TCP echo "X/2 X/

Re: BUG: SO_LINGER + shutdown() does not block?

2001-02-11 Thread kuznet
Hello! > I'm not seeing shutdown(2) block on a TCP socket. This is Linux kernel > 2.2.16 (RH7.0). Is this a kernel bug, a documentation bug, Man page is wrong. What's about kernel... Hmm, actually, it is worth to test genuine bsd. Such feature could be useful. Alexey

Re: [PATCH] to deal with bad dev->refcnt in unregister_netdevice()

2001-02-08 Thread kuznet
Hello! > Here is a patch which may not solve the underlying This does not. refcnt cannot be <1 at this point. > assuming that the latter messages aren't serious? They are fatal. Machine must be rebooted after them. > I hope the networking gurus can find the real bugs here. Well, someone fo

Re: 2.4.1 tcp ack bug ?

2001-02-07 Thread kuznet
Hello! > 20:56:26.172532 ppp0 < mc105-v-2.royaume.com.6699 > ppp12.shiny.it.33148: . > 88073:89533(1460) ack 77 win 8684 (DF) > Ok, it has just received the missing part, so why it does not ack 98313 ? Apparently, because this segment has not been received. Look at checksum errors statisti

Re: Bug in tcp_time_to_recover

2001-02-07 Thread kuznet
Hello! >/* Not-A-Trick#2 : Classic rule... */ >if (tcp_fackets_out(tp) > tp->reordering) > ^ >return 1; ... > Shouldn't it be a >= instead of > ? No. fackets_out is equivalent of Reno dupacks+1. F.e. look at the most common case, where FACK is equiv
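The reply's point can be checked with a small worked example. The sketch below assumes the relation stated in the message, that fackets_out equals the Reno duplicate-ACK count plus one in the common case, and a reordering threshold of 3 (the classic three-duplicate-ACK rule). The function name `fack_time_to_recover` is hypothetical and models only this one condition.

```c
#include <assert.h>

/*
 * Hypothetical model of the quoted check: recovery starts when
 * fackets_out exceeds the reordering threshold, where fackets_out is
 * taken to be dupacks + 1 as the message states for the common case.
 */
static int fack_time_to_recover(int dupacks, int reordering)
{
    int fackets_out;

    assert(dupacks >= 0 && reordering >= 0);
    fackets_out = dupacks + 1;
    return fackets_out > reordering;
}
```

With a threshold of 3, the strict comparison fires on the third duplicate ACK (`fack_time_to_recover(3, 3)` is 1) and not on the second (`fack_time_to_recover(2, 3)` is 0). Changing `>` to `>=` would start recovery one duplicate ACK earlier.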

Re: TCP_NOPUSH on FreeBSD, TCP_CORK on Linux (was: Is sendfile all that

2001-02-06 Thread kuznet
Hello! > > How close is TCP_NOPUSH to behaving identically to TCP_CORK now? They have not so much of common. TCP_NOPUSH enables T/TCP and its presence used to mean that T/TCP is possible on this system. Linux headers cannot even contain TCP_NOPUSH. Alexey

Re: AF_UNIX hangs

2001-02-01 Thread kuznet
Hello! > Looking at net/core/datagram.c:wait_for_packet the code will return 0 Yep... Damn, specially split errno and ready values and forgot to use this. 8) Sorry. Alexey --- ../vger3-010130/linux/net/core/datagram.c Thu Dec 28 22:44:08 2000 +++ linux/net/core/datagram.c Thu Feb 1 22:4

Re: PROBLEM: small socket send/receive buffers on TCP stream result in data not being transferred

2001-02-01 Thread kuznet
Hello! > 1. small socket send/receive buffers result in data not being transferred I know why your test does not work in 2.4 (sort of bug, fix is appended). But I have no idea, why it does not work with 2.2. Please, make tcpdump in this case. > and SO_RCVBUF (see small attached programs), I f

Re: SBF queueing?

2001-01-27 Thread kuznet
Hello! > Has anyone decided to code a SFB (Stochastic Fair Blue) queue implementation > for Linux? I did not hear anything about this. > (http://www.eecs.umich.edu/~wuchang/blue/). The paper for it shows it > performing very well in comparison to RED. Yes, the algorithm looks interesting. Ale

Re: [UPDATE] Zerocopy patches, against 2.4.1-pre10

2001-01-27 Thread kuznet
Hello! > verify this? The only way I can think of is to verify that the checksum > field is zero initially, correct? It is not zero. It contains checksum of pseudoheader. > fits the new Linux model a bit better, as it has one descriptor per > packet, not one per fragment (like the current impl

Re: Linux 2.2.16 through 2.2.18preX TCP hang bug triggered by rsync

2001-01-27 Thread kuznet
Hello! > Why is it a bug to accept the ACK from it? RFC793 page 69 says > > If the RCV.WND is zero, no segments will be acceptable, but > special allowance should be made to accept valid ACKs, URGs and > RSTs. 8) This obscure place is discussed for ages. The question is: What is "

Re: [UPDATE] Zerocopy patches, against 2.4.1-pre10

2001-01-26 Thread kuznet
Hello! > drivers use it at this time, I see a grand total of 2 (hamachi and hme) in Plus acenic in zerocopy. Plus patch to do this is available for eepro100. > I'm just wondering, if a card supports sg but *not* TX csum, is it worth > it to make use of sg? eepro100 falls into this category..

Re: [UPDATE] Zerocopy patches, against 2.4.1-pre10

2001-01-25 Thread kuznet
Hello! > Starfire card does, maybe the 3com is different. :-) 3com _is_ different. 8) It is not an issue, we do not make zerocopy on IP fragments. > Are we even bothering with the partial checksums at this point, or > are we falling back to CPU checksumming if the packet is fragmented? Of cou

Re: [UPDATE] Zerocopy, last one today I promise :-)

2001-01-25 Thread kuznet
Hello! > > What exactly were the issues with the intel cards and sg+csum? > > > > Any idea how much work it'd require to surmount them? > > Getting Intel to release full specs on how to make use of > TX hardware checksum assist with the eepro100. It simply does not exist for 82559* in all t

Re: [UPDATE] Zerocopy patches, against 2.4.1-pre10

2001-01-25 Thread kuznet
Hello! > no problems. I simply mounted an NFS server with rsize=wsize=8192 > and read a few files - I assume this is sufficient? This is orthogonal. Only TCP uses this and you need not to do something special to test it. Any TCP connection going through 3c tests it. > rather than using the I

Re: Linux 2.2.16 through 2.2.18preX TCP hang bug triggered by rsync

2001-01-25 Thread kuznet
Hello! I take my words back. Manfred is right, this requirement is not a MUST. Real problem is much worse, and it is wholly on the shame of solaris. Tcpdump shows at least two different bugs there. 2060 16:31:42.879337 eth0 < dynamic.ih.lucent.com.39406 > static.8664: . 67580:67580(0) ack

Re: Linux 2.2.16 through 2.2.18preX TCP hang bug triggered by rsync

2001-01-24 Thread kuznet
Hello! > must be). Is there another RFC? It is exactly this place. As soon as BSD uses this feature, it is must for us. > Could you check what happened in line 2066 of this tcpdump? > 2066 16:31:43.108759 eth0 > static.8664 > dynamic.ih.lucent.com.39406: > . 1583720:1583720(0) ack 69041

Re: Linux 2.2.16 through 2.2.18preX TCP hang bug triggered by rsync

2001-01-24 Thread kuznet
Hello! > I read through the tcpdump, and it seems that Linux completely ignores > packets with out-of-window sequence numbers: Yes, Linux is __very__ not right doing this. RFC requires to accept ACK, URG and RST on any segment adjacent to window, even if window is zero. Solaris also does thing,

Re: [Fwd: [Fwd: Is sendfile all that sexy? (fwd)]]

2001-01-21 Thread kuznet
Hello! > So now the question is: when does this new nagle algorithm delay packets in the > write queue? It _must_ do something, otherwise TCP_NODELAY would obviously be a > noop. It allows _one_ incomplete segment to fly. Minshall and BSD behave absolutely similarly in all the circumstances exce

Re: Is sendfile all that sexy?

2001-01-21 Thread kuznet
Hello! > "struct page" tricks, some macros etc WILL NOT WORK. In particular, we do > not currently have a good "page_to_bus/phys()" function. That means that > anybody trying to do DMA to this page is currently screwed, simply because > he has no good way of getting the physical address. We alre

Re: [Fwd: [Fwd: Is sendfile all that sexy? (fwd)]]

2001-01-20 Thread kuznet
Hello! > > write(10*MSS) > > write(1) > > write(1) ... > As far as I can tell, the second "write(1)" will always merge with the > first one This would be true, if Andrea wrote not exactly 10*MSS, but 10*MSS+1 or just write(). In some exceptional situations (sort of writi

Re: [Fwd: [Fwd: Is sendfile all that sexy? (fwd)]]

2001-01-20 Thread kuznet
Hello! > So this mean if I do: Yes. It is cost, which we have to pay. Look into Minshall's draft, by the way (draft-minshall-nagle-*), it discusses pros and contras. Much saner behaviour wrt latency (and perfect clarity) outweighs a bit worse coalescing. Alexey

Re: [Fwd: [Fwd: Is sendfile all that sexy? (fwd)]]

2001-01-20 Thread kuznet
Hello! > semantics of snd_sml), maybe it makes the difference but then I don't see how. It does. One small packet is allowed to fly, regardless of packets_out. This is Minshall's idea. "Classic" Nagle also does not prohibit this, but it is difficult to formulate it in terms of presegmented
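The difference between the two rules can be sketched with a toy sender. This is a simplified model under stated assumptions, not the kernel logic: the `Sender` class and its methods are hypothetical, congestion and receiver windows are ignored, held data is not coalesced, and "classic" here means the per-segment reading in which a sub-MSS segment waits while anything at all is unacknowledged.

```python
from dataclasses import dataclass, field


@dataclass
class Sender:
    mss: int
    minshall: bool                      # True: Minshall's rule, False: classic per-segment rule
    in_flight: list = field(default_factory=list)  # lengths of unacknowledged segments

    def may_send(self, seg_len):
        if seg_len >= self.mss:
            return True                 # full-sized segments are never delayed by Nagle
        if self.minshall:
            # Only an unacknowledged *small* segment blocks another small one.
            return not any(n < self.mss for n in self.in_flight)
        # Per-segment classic rule: any unacknowledged data blocks a small segment.
        return not self.in_flight

    def offer(self, seg_len):
        """Transmit the segment if the rule allows it; return whether it was sent."""
        if self.may_send(seg_len):
            self.in_flight.append(seg_len)
            return True
        return False

    def ack_all(self):
        self.in_flight.clear()


def demo(minshall):
    s = Sender(mss=1460, minshall=minshall)
    full = [s.offer(1460) for _ in range(10)]   # like write(10*MSS)
    first_small = s.offer(1)                    # like write(1)
    second_small = s.offer(1)                   # like a second write(1)
    return all(full), first_small, second_small
```

In this model `demo(True)` lets the first one-byte segment go out behind ten full segments and holds the second, while `demo(False)` holds both until an ACK arrives.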

Re: Is sendfile all that sexy?

2001-01-20 Thread kuznet
Hello! > Actually, as long as there is no "struct page" there _are_ problems. > This is why the NUMA stuff was brought up - it would require that there > be a mem_map for the PCI pages.. (to do ref-counting etc). I see. Is this a strong "no-no-no"? What is the obstacle to allowing "struct page" to sit o

Re: [Fwd: [Fwd: Is sendfile all that sexy? (fwd)]]

2001-01-20 Thread kuznet
Hello! > is there really > much value in the second request flowing to the server before the first > byte of the reply has hit? Yes, of course, it makes a lot of sense: e.g. all the icons referenced by the parent page are batched into a single well-coales

Re: [Fwd: [Fwd: Is sendfile all that sexy? (fwd)]]

2001-01-20 Thread kuznet
Hello! > My argument applies to 2.4. The uncork _won't_ push on the wire the last > not mss-sized fragment until it's the last one in the write queue even once > cwnd and receiver window allows that. I think Look at the code again. You misread it. > wouldn't be setting nonagle unconditionally

Re: Is sendfile all that sexy?

2001-01-19 Thread kuznet
Hello! > It's about direct i/o from/to pages, Yes. Formally, there is no problem sending to tcp directly from io space. But could someone explain one thing to me? Does bus-mastering from io really work? And if it does, is it fast enough? At least, looking at my book on pci, I do not understand

Re: [Fwd: [Fwd: Is sendfile all that sexy? (fwd)]]

2001-01-19 Thread kuznet
Hello! > the business about the last 1100ish bytes of a 4096 byte send being > delayed by nagle only implies that the stack's implementation of nagle > was broken and interpreting it on a per-segment rather than a per-send > basis. + > software, or the host TCP stack. otherwise, the persistent c

Re: [Fwd: [Fwd: Is sendfile all that sexy? (fwd)]]

2001-01-19 Thread kuznet
Hello! > The "uncork" won't push the last skb on the wire if there is not acknowledged > data in the write_queue and the payload of the last skb in the write_queue > isn't large MSS. This because the `uncork' will only re-evaluate the > write_queue in function of the _nagle_ algorithm, quite corr

Re: [Fwd: [Fwd: Is sendfile all that sexy? (fwd)]]

2001-01-19 Thread kuznet
Hello! > I thought setsockopt is meant to set an option in the socket, It is not. setsockopt() is simply a slightly more clever extension of ioctl(), which is adapted (in bsd style though) to understand layering and has an explicit data length. It is preferred for all the operations on sockets, a
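The two properties named in that reply, layering and an explicit data length, show up directly in the call signature. The following is a small sketch using Python's standard `socket` module; the helper name is hypothetical, and the `"ii"` layout for `struct linger` (two C ints) is an assumption that holds on Linux and the BSDs.

```python
import socket
import struct


def set_and_read_options():
    """Set one option per protocol layer and read both back."""
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        # Layering: the first argument selects which protocol level owns the option.
        s.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)
        nodelay = s.getsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY)

        # Explicit length: the option value is an opaque buffer whose size is
        # passed to the kernel; here it is a packed struct linger {onoff, seconds}.
        s.setsockopt(socket.SOL_SOCKET, socket.SO_LINGER, struct.pack("ii", 1, 5))
        raw = s.getsockopt(socket.SOL_SOCKET, socket.SO_LINGER, struct.calcsize("ii"))
        onoff, seconds = struct.unpack("ii", raw)
        return nodelay, onoff, seconds
    finally:
        s.close()
```

No connection is needed for this; the options are stored on the socket as soon as it exists.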

Re: [Fwd: [Fwd: Is sendfile all that sexy? (fwd)]]

2001-01-18 Thread kuznet
Hello! > So if I understand all this correctly... > > The difference in ACK generation CORK does not affect the receive direction and, hence, ACK generation. The problem is that TCP does not know when the full request has been received, and it must ack instantly at connection start and after some idle peri

Re: [Fwd: [Fwd: Is sendfile all that sexy? (fwd)]]

2001-01-18 Thread kuznet
Hello! > Doing PUSH from setsockopt(TCP_CORK) looked obviously wrong because it isn't > setting any socket state, ? 8) > and also because the SIOCPUSH has nothing specific > with TCP_CORK, as said it can be useful also to flush the last fragment of data > pending in the send queue without havi

Re: [Fwd: [Fwd: Is sendfile all that sexy? (fwd)]]

2001-01-18 Thread kuznet
Hello! > Ingo, you should realize Ingo did not try to argue. Neither do I. This is right, no doubt. > Mantra: "everything is a stream of bytes". Repeat until enlightened. ... but the devil invented record marks and pushes, seduced mankind, and we were evicted from paradise. 8) Alexey

Re: [Fwd: [Fwd: Is sendfile all that sexy? (fwd)]]

2001-01-18 Thread kuznet
Hello! > I'm all for TCP_CORK but it has the disadvantage of two syscalls for doing the MSG_MORE was invented to allow collapsing this to 0 syscalls. 8) > A new ioctl on the socket should be able to do that (and ioctl looks lighter > than a setsockopt, ok ignoring actually the VFS is grabbi
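The syscall-count point can be illustrated from user space. This is a hedged sketch, assuming Linux (where Python's `socket` module exposes `TCP_CORK` and `MSG_MORE`) and a usable loopback interface; the helper names are hypothetical. The corked variant brackets the writes with two `setsockopt()` calls, while the `MSG_MORE` variant carries the hint on the send itself and needs no extra call.

```python
import socket


def loopback_pair():
    """Return a connected (client, server-side) TCP socket pair on 127.0.0.1."""
    srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    srv.bind(("127.0.0.1", 0))
    srv.listen(1)
    cli = socket.create_connection(srv.getsockname())
    conn, _ = srv.accept()
    srv.close()
    return cli, conn


def send_corked(sock, header, body):
    """Two extra syscalls: cork before the writes, uncork after them."""
    sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_CORK, 1)
    try:
        sock.sendall(header)
        sock.sendall(body)
    finally:
        # Clearing the cork lets any pending partial segment be sent.
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_CORK, 0)


def send_with_more(sock, header, body):
    """No extra syscalls: MSG_MORE on the first send says more data follows."""
    sock.sendall(header, socket.MSG_MORE)
    sock.sendall(body)


def recv_exact(sock, n):
    """Read exactly n bytes, or fewer if the peer closes first."""
    buf = b""
    while len(buf) < n:
        chunk = sock.recv(n - len(buf))
        if not chunk:
            break
        buf += chunk
    return buf
```

Both helpers deliver the same byte stream to the receiver; the difference is only in how the sender tells the stack that the header should not go out on its own.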
