Re: unresolved symbols on SPARC with depmod -ae
Jeff Layton writes:

> Anyway here's what I get, should I be concerned about this?
> ...
> caladan:~# /sbin/depmod -ae -F /boot/System.map-2.4.2
> depmod: *** Unresolved symbols in
> /lib/modules/2.4.2/kernel/drivers/block/loop.o
> depmod: .div
> depmod: .urem
> depmod: .umul
> depmod: .udiv
> depmod: .rem
> depmod: *** Unresolved symbols in

Try to load one of the modules which show the problem, does it work?
If so, it is a bug in depmod's handling of these ".foo" symbols.

Later,
David S. Miller
[EMAIL PROTECTED]
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/

Re: asm/unistd.h
Sardañ[EMAIL PROTECTED], Eliel writes:

> I'm taking a look at the linux code and I don't understand how do you
> programm...mmm (?) may be i'm a stupid why in include/asm/unistd.h in some
> macros you use this:

Two reasons:

1) Empty statements give a warning from the compiler so this is why
   you see "#define FOO do { } while(0)"

2) It gives you a basic block in which to declare local variables.

Later,
David S. Miller
[EMAIL PROTECTED]

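A minimal user-space sketch of both reasons, assuming nothing about the actual macros in unistd.h. SWAP_INTS and order_pair are hypothetical names made up for this illustration:

```c
#include <assert.h>

/* Hypothetical macro, not taken from the kernel.  The do { } while (0)
 * wrapper turns the expansion into exactly one statement and provides
 * a block in which the local variable tmp can be declared. */
#define SWAP_INTS(a, b) do { int tmp = (a); (a) = (b); (b) = tmp; } while (0)

/* Orders the pair so that *lo <= *hi afterwards. */
static void order_pair(int *lo, int *hi)
{
	assert(lo && hi);
	/* The macro expands to one statement that still takes its
	 * trailing semicolon, so an unbraced if/else stays valid. */
	if (*lo > *hi)
		SWAP_INTS(*lo, *hi);
	else
		(void)0;	/* already ordered, nothing to do */
}
```

With a bare `{ ... }` body instead of `do { ... } while (0)`, the semicolon after the macro call would end the `if` statement and the `else` would no longer compile.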
Re: asm/unistd.h
Steve Grubb writes:

> It would seem to me that after hearing how the macros are used in practice,
> wouldn't turning them into inline functions be an improvement? This is
> something gcc supports, it accomplishes the same thing, and has the added
> advantage of type checking.
> http://gcc.gnu.org/onlinedocs/gcc-2.95.3/gcc_4.html#SEC92

Two reasons:

1) Sometimes I don't want any type checking because it would
   create the necessity of adding a new include to a file -->
   a circular dependency to resolve.  Macros hide the types
   except in the cases where they are actually invoked :-)

2) Historically GCC was very bad with code generation with
   inline functions, so at that time the GCC manual statement
   "inline functions are just like a macro" was technically
   false :-)

   Yes, I know this is much different in today's gcc tree, but
   there hasn't been a gcc release in over 2 years so...

Later,
David S. Miller
[EMAIL PROTECTED]

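The first reason can be seen in a small sketch. Everything here (TASK_PID, struct task, task_pid) is a hypothetical stand-in, not a kernel definition:

```c
#include <assert.h>

/* Hypothetical macro.  It is defined before struct task exists; its
 * body is only type-checked at the place where it is expanded. */
#define TASK_PID(t) ((t)->pid)

struct task {
	int pid;
};

/* The inline function needs the complete struct definition at this
 * point, which in a real tree can mean pulling in another header. */
static inline int task_pid(const struct task *t)
{
	assert(t);
	return t->pid;
}
```

A header could carry TASK_PID without including whatever defines struct task; the inline version could not.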
Re: 2.4.3 tcp window id causes problems talking to windows clients
Kevin Stone writes:

> Is there any plan to include the zerocopy patches into the stock kernel?
> The win2k dial-up/window id problem is really a showstopper but hasn't
> generated much traffic on lkml or the digests.

I submitted the patch to Linus, it will likely go into 2.4.4
but if not I'll submit the ID patch separately.

Later,
David S. Miller
[EMAIL PROTECTED]

Re: [CHECKER] __init functions called by non-__init
Rusty Russell writes:

> It's incredibly poor taste, though, and if we ever implement __init
> dropping for modules (Keith?),

Jakub Jelinek implemented this about 2 years ago, right before we hit
2.2.x.  Linus thought it was too late at the time so we dropped that
work from our trees.

It was really good at finding __init bugs though...

Later,
David S. Miller
[EMAIL PROTECTED]

Re: goodbye
Rik van Riel writes:

> Anyway, since linux-kernel has chosen to not receive email from me

Funny how this posting went through then...

If it is specifically when you are sending mail from some other place,
state so, don't make blanket statements which obviously are not wholly
true.

Later,
David S. Miller
[EMAIL PROTECTED]

Re: [CHECKER] amusing copy_from_user bug
Petru Paler writes:

> On Tue, Apr 10, 2001 at 06:41:28AM -0400, Jakub Jelinek wrote:
> > some architectures don't care at all, because verify_area is a noop
> > (sparc64).
>
> Why (and how) is this?

On sparc64, the user lives in an entirely different address space.
The user cannot even generate addresses in kernel space.

Basically, addresses are prefixed by an 8-bit tag called an ASI
(Address Space Identifier), which tells the cpu which TLB context
to use etc.  When running in user space or accessing user space in
kernel mode we make the cpu use the special userspace ASI.

In fact the user can be given the complete 32-bit or 64-bit virtual
address space, the kernel takes up none of it.

Later,
David S. Miller
[EMAIL PROTECTED]

Re: [race][RFC] d_flags use
Alexander Viro writes:

> If nobody objects I'll go for test_bit/set_bit/clear_bit here.

Be sure to make d_flags an unsigned long when you do this! :-)

Later,
David S. Miller
[EMAIL PROTECTED]

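A rough user-space sketch of why the field width matters, on the assumption that the bit helpers work on whole unsigned long words. The sketch_* names and struct sketch_dentry are hypothetical, and unlike the kernel's helpers these are not atomic:

```c
#include <assert.h>
#include <limits.h>

#define BITS_PER_ULONG (sizeof(unsigned long) * CHAR_BIT)

/* Non-atomic stand-ins for set_bit/clear_bit/test_bit.  Each one reads
 * or writes a full unsigned long, never anything narrower. */
static void sketch_set_bit(unsigned int nr, unsigned long *addr)
{
	assert(addr);
	addr[nr / BITS_PER_ULONG] |= 1UL << (nr % BITS_PER_ULONG);
}

static void sketch_clear_bit(unsigned int nr, unsigned long *addr)
{
	assert(addr);
	addr[nr / BITS_PER_ULONG] &= ~(1UL << (nr % BITS_PER_ULONG));
}

static int sketch_test_bit(unsigned int nr, const unsigned long *addr)
{
	assert(addr);
	return (int)((addr[nr / BITS_PER_ULONG] >> (nr % BITS_PER_ULONG)) & 1UL);
}

/* The flags field is a full word, so the helpers above can never
 * touch a neighbouring member of the structure. */
struct sketch_dentry {
	unsigned long d_flags;
};
```

If d_flags were a 32-bit int on a 64-bit machine, a word-sized access through these helpers would spill into the next field, and on a big-endian machine bit 0 would not even land in the int.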
Re: [CFT][PATCH] Re: Fwd: Re: memory usage - dentry_cache
Alexander Viro writes:

> OK, how about wider testing? Theory: prune_dcache() goes through the
> list of immediately killable dentries and tries to free given amount.
> It has a "one warning" policy - it kills dentry if it sees it twice without
> lookup finding that dentry in the interval. Unfortunately, as implemented
> it stops when it had freed _or_ warned given amount. As the result, memory
> pressure on dcache is less than expected.

The reason the code is how it is right now is there used to be a bug
where that goto spot would --count but not check against zero, making
count possibly go negative and then you'd be there for a _long_
time :-)

Just a FYI...

Later,
David S. Miller
[EMAIL PROTECTED]

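A toy model of the loop shape being discussed, not the real prune_dcache(). The function name, the array of "referenced" flags and the visited counter are all assumptions made for illustration. The budget is charged on both the "warn" path and the "free" path, and the zero check sits in the loop condition so neither path can skip it:

```c
#include <assert.h>
#include <stddef.h>

/* referenced[i] != 0 means entry i only gets its one warning;
 * otherwise the entry is freed.  Returns how many entries were freed
 * and stores how many were looked at in *visited. */
static size_t sketch_prune(const int *referenced, size_t n, int count,
			   size_t *visited)
{
	size_t i = 0, freed = 0;

	assert(referenced && visited);
	/* If the zero test lived only on the free path (for example
	 * "if (!--count) break;"), a decrement on the warn path could
	 * take count past zero and the scan would keep going. */
	while (i < n && count > 0) {
		if (referenced[i]) {
			i++;
			count--;	/* warned: still charged to the budget */
			continue;
		}
		freed++;
		i++;
		count--;		/* freed */
	}
	*visited = i;
	return freed;
}
```

This also shows the effect described in the quoted mail: with a budget of 3 and two referenced entries among the first three, only one entry is actually freed.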
Re: Kernel 2.5 Workshop RealVideo streams -- next time, please get better audio.
Miles Lane writes:

> There is one major shortcoming of the recordings.
> Usually, only the comments of the presenter(s)
> can be heard.

The problem is that nobody wants to wait for one of the microphones to
go across the entire room before they can begin speaking, this is what
was happening.  Sometimes there was a dialogue going on between three
people sitting at tables, there were 2 microphones to go around...

One solution I've seen sort of work is to have 2 standing fixed
microphones in the aisles, but this only really functions correctly for
a Q&A type session after a presentation.  It does not work in a relaxed
"people sit at tables and comment at arbitrary points in time during a
talk" setting such as the kernel summit.

Besides putting a microphone at every table (which isn't all that
practical honestly) I can't come up with a solution.

Later,
David S. Miller
[EMAIL PROTECTED]

Re: Possible problem with zero-copy TCP and sendfile()
Jesse S Sipprell writes:

> A patch will be coming out soon, as it is a fairly trivial fix.

Thank you for tracking this down.

One more subtle note, for the case of error handling.  There is a
change to sendfile() in the zerocopy patches which causes sendfile()
to act more like sendmsg() when errors occur.

Specifically, sendmsg() works roughly like the following when an
error happens:

handle_error:
	if (sent_something)
		return how_much_we_sent;
	else
		return ERROR_CODE;

So when an error happens, and the kernel was able to send some of
the data, you see something like this in the trace:

	sendmsg() = N
	...
	sendmsg() = ERROR_CODE

sendfile() used to act differently, and this made it difficult to
directly transform a sendmsg()+local_buffer based server into a
sendfile() one because the error handling was so different.

Previously, sendfile() wouldn't give you the partial transfer length,
you'd just get the error _regardless_ of whether any data was sent
successfully during that call.  Alexey, myself, and others considered
this behavior bogus and inconsistent.  So it was changed.

The long and short of it is that sendfile() now acts just like
sendmsg() when errors happen mid-send.

Later,
David S. Miller
[EMAIL PROTECTED]

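The handle_error pseudocode in the message can be written as a tiny helper. This is only a sketch of the return-value rule; sketch_send_result is a hypothetical name, and the convention of passing a positive errno value and returning its negation is an assumption for the example:

```c
#include <assert.h>
#include <errno.h>
#include <sys/types.h>

/* Decides what a send call reports after an error occurred at some
 * point during the transfer.  err is a positive errno value. */
static ssize_t sketch_send_result(size_t sent, int err)
{
	assert(err > 0);
	if (sent > 0)
		return (ssize_t)sent;	/* partial transfer is reported first */
	return -err;			/* nothing went out: report the error */
}
```

A caller that loops until all data is written therefore sees the short count on one call and only gets the error on the following call, which is the trace pattern shown in the message.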
Re: Possible problem with zero-copy TCP and sendfile()
Jesse S Sipprell writes:

> On error, -1 is returned in the usual fashion and offset is purported to be
> updated to point to the next byte following the last one sent.
>
> Will the zerocopy patches break this?

No, they should not.

Later,
David S. Miller
[EMAIL PROTECTED]

Re: Very bad behavior of kswapd
Rik van Riel writes:

> > Watch top: when this program needs the memory that kswapd keep
> > in cache they go both at 100% cpu (on SMP) but still the size of
> > the program only grows at about 100KB/s, why is kswapd releasing
> > it so slowly and taking so much CPU ?
>
> Because kswapd still has to scan all the (unfreeable) memory
> of the big process to determine it isn't freeable.

This is not the only badly performing case actually.

When one of one's swap partitions gets close to full, kswapd
basically sits endlessly in get_swap_page() due to all the
broken linear scan algorithms.

I've tried to fix some of this with the patch below.  This may
not be the case Laurent is seeing but it is a problem that needs
fixing.

--- ../vanilla/linux/include/linux/swap.h	Fri Apr 13 17:08:16 2001
+++ include/linux/swap.h	Sat Apr 14 00:10:02 2001
@@ -43,8 +43,22 @@
 
 #define SWAP_CLUSTER_MAX 32
 
-#define SWAP_MAP_MAX	0x7fff
-#define SWAP_MAP_BAD	0x8000
+#define SWAPFILE_CLUSTER 256
+
+struct swap_cluster_struct {
+	struct list_head	list;
+	int			nr_free;	/* 0 --> SWAPFILE_CLUSTER */
+	unsigned int		start_offset;
+};
+
+#define SWAP_MAP_MAX	0x7fff
+#define SWAP_MAP_BAD	0x8000
+
+struct swap_map_struct {
+	struct list_head	list;
+	unsigned int		offset;
+	unsigned int		count;
+};
 
 struct swap_info_struct {
 	unsigned int flags;
@@ -52,11 +66,15 @@
 	spinlock_t sdev_lock;
 	struct dentry * swap_file;
 	struct vfsmount *swap_vfsmnt;
-	unsigned short * swap_map;
-	unsigned int lowest_bit;
-	unsigned int highest_bit;
-	unsigned int cluster_next;
-	unsigned int cluster_nr;
+	struct swap_map_struct * swap_maps;
+	struct list_head swap_map_free_list;
+
+	struct swap_cluster_struct * swap_clusters;
+	struct list_head swap_cluster_free_list;
+
+	struct swap_cluster_struct * curr_cluster;
+	unsigned int cluster_offset_next;
+
 	int prio;			/* swap priority */
 	int pages;
 	unsigned long max;
--- ../vanilla/linux/mm/swapfile.c	Thu Mar 22 09:22:15 2001
+++ mm/swapfile.c	Sat Apr 14 01:07:40 2001
@@ -24,62 +24,60 @@
 
 struct swap_info_struct swap_info[MAX_SWAPFILES];
 
-#define SWAPFILE_CLUSTER 256
-
-static inline int scan_swap_map(struct swap_info_struct *si, unsigned short count)
+static unsigned int scan_swap_map(struct swap_info_struct *si, unsigned int count)
 {
-	unsigned long offset;
-	/*
-	 * We try to cluster swap pages by allocating them
-	 * sequentially in swap.  Once we've allocated
-	 * SWAPFILE_CLUSTER pages this way, however, we resort to
-	 * first-free allocation, starting a new cluster.  This
-	 * prevents us from scattering swap pages all over the entire
-	 * swap partition, so that we reduce overall disk seek times
-	 * between swap pages.  -- sct */
-	if (si->cluster_nr) {
-		while (si->cluster_next <= si->highest_bit) {
-			offset = si->cluster_next++;
-			if (si->swap_map[offset])
-				continue;
-			si->cluster_nr--;
-			goto got_page;
-		}
-	}
-	si->cluster_nr = SWAPFILE_CLUSTER;
+	struct swap_map_struct *map;
+	struct list_head *head, *tmp;
 
-	/* try to find an empty (even not aligned) cluster. */
-	offset = si->lowest_bit;
- check_next_cluster:
-	if (offset+SWAPFILE_CLUSTER-1 <= si->highest_bit)
-	{
-		int nr;
-		for (nr = offset; nr < offset+SWAPFILE_CLUSTER; nr++)
-			if (si->swap_map[nr])
-			{
-				offset = nr+1;
-				goto check_next_cluster;
-			}
-		/* We found a completly empty cluster, so start
-		 * using it.
+	/* Any swap entries left at all? */
+	if (list_empty(&si->swap_map_free_list))
+		return 0;
+
+get_from_cluster:
+
+	/* Currently allocating from a cluster? */
+	if (si->curr_cluster != NULL) {
+		struct swap_cluster_struct *cluster = si->curr_cluster;
+		unsigned int offset = si->cluster_offset_next;
+
+		/* Note that this test cannot be made with cluster->nr_free
+		 * because it is possible for a swap entry to be freed before
+		 * we are done allocating from this cluster.
 		 */
-		goto got_page;
+		if (si->cluster_offset_next++ == SWAPFILE_CLUSTER)
+			si->curr_cluster = NULL;
+
+		cluster->nr_free--;
+
+		map = &si->swap_maps[offset];
+		goto finish_alloc;
 	}
-	/* No luck, so now go finegrined

Re: [PATCH] IP forwarded checksum, kernel 2.2.18-19
Martin Gadbois writes:

> Hi there!
> I realized that some tests were failing due to dropped IP packets. I
> traced and discovered the following:

Thanks, I've put your patch into my 2.2.x source and will push this
to Alan once he starts doing 2.2.20pre patches.

Later,
David S. Miller
[EMAIL PROTECTED]

RE: Let init know user wants to shutdown
Grover, Andrew writes:

> IMHO an abstracted interface at this point is overengineering.

ACPI is the epitome of overengineering.

An abstracted interface would allow simpler systems to avoid all of
the bloated garbage ACPI brings with it.  Sorry, Alan hit it right on
the head, ACPI is not much more than keeping speedstep proprietary.

Later,
David S. Miller
[EMAIL PROTECTED]

Re: sk->state_chage is not called for listening sockets
Pete Zaitcev writes:

> With that in mind, would the following chage have any ill effects?
> It does not seem to break anything obvious, but I am worried about
> a performance degradation for some retarded benchmark.
>
> diff -u -U 4 linux-2.4.3/net/ipv4/tcp_input.c linux-2.4.3-nfs/net/ipv4/tcp_input.c
> --- linux-2.4.3/net/ipv4/tcp_input.c Fri Feb 9 11:34:13 2001
> +++ linux-2.4.3-nfs/net/ipv4/tcp_input.c Thu Apr 12 23:23:59 2001

I've applied this patch, thanks.

Later,
David S. Miller
[EMAIL PROTECTED]

Re: [PATCH] generic rw_semaphores, compile warnings patch
D.W.Howells writes:

> This patch (made against linux-2.4.4-pre4) gets rid of some warnings obtained
> when using the generic rwsem implementation.

Have a look at pre5, this is already fixed.

Later,
David S. Miller
[EMAIL PROTECTED]

Re: ANNOUNCE New Open Source X server
James Simmons writes:

> The Linux GFX project grew out the need for a higher performance X
> server that has a much faster developement cycle. In the last few years
> the graphics card and multimedia environments have grow at such a rate
> the current X solutions can no longer keep pace nor do they focus on
> producing high performance X servers specifically for linux. Also the
> community has demanded for specific functionality which has never come to
> light.

And this specific functionality is?

I think this is not a worthwhile project at all.  The X tree, its
associated protocols and APIs, are complicated enough as it is, and
the xfree86 project has some of the most talented and capable people
in this area.  It would be a step backwards to do things outside of
xfree86 development.

If the issue is that "things don't happen fast enough in the xfree86
tree", why not lend them a hand by submitting patches to them instead
of complaining?

Later,
David S. Miller
[EMAIL PROTECTED]

Re: [PATCH] generic rw_semaphores, compile warnings patch
David Howells writes:

> There's also a missing "struct rw_semaphore;" declaration in linux/rwsem.h. It
> needs to go in the gap below "#include ". Otherwise the
> declarations for the contention handling functions will give warnings about
> the struct being declared in the parameter list.

Indeed, I didn't see this in my setup on sparc64 for some reason.

Later,
David S. Miller
[EMAIL PROTECTED]

Re: [PATCH] Longstanding elf fix (2.4.3 fix)
Eric W. Biederman writes:

> In building a patch for 2.4.3 I also discovered that we are not taking
> the mmap_sem around do_brk in the exec paths.

Does that really matter?  Who else can get at the address space?

We are a singly referenced address space at that point... perhaps
ptrace?

Later,
David S. Miller
[EMAIL PROTECTED]

Re: All architecture maintainers: pgd_alloc()
Russell King writes:

> There are various options here:
>
> 1. Either I can fix up all architectures, and send a patch to this list, or

Fixup all the architectures and send this and the ARM bits to Linus.

I really would wish folks would not choose Alan as the first place to
send the patch.  I'm not directly accusing anyone of it, but it does
appear that often AC is used as a "back door" to get a change in.
While this scheme works most of the time, often it unnecessarily
overworks Alan which I think is unfair.

Sending it to Linus first also eliminates 2 levels of indirection each
time Linus wants something done differently in the change.

	person --> alan --> linus --> needs change
	alan BCC's person, person codes new version
	person --> alan --> linus --> etc. etc.

Sure Alan could fix it up himself, but...

My main point is that for changes like this, sending stuff to Alan
first is often an ineffective mechanism.

If someone were to reply to this "Linus is hard to push changes to,
or takes too long" my reply is "if this is really the problem, should
the burden be entirely placed on Alan's shoulders?"

The AC patches are huge, but they have substantially decreased in size
during the recent 2.4.4-preX series.  And sure, Alan makes conscious
decisions to apply patches and eventually work to push them to Linus,
but honestly people should consider ways to help decrease his load.

Later,
David S. Miller
[EMAIL PROTECTED]

Re: Linux 2.4.3ac13
Alan Cox writes:

> 2.4.3-ac13
> o	Switch to NOVERS symbols for rwsem (me)
> 	| Called from asm blocks so they can't be versioned

Yes they most certainly can be versioned inside of an asm.  Use the
"i" constraint, we've been doing this on sparc64 for ages.

Later,
David S. Miller
[EMAIL PROTECTED]

Re: 2.2.19pre10 doesn't compile on alphas (sunrpc)
Alan Cox writes:

> I suspect adding
>
> #define BUG() __asm__ __volatile__("call_pal 129 # bugchk")
>
> to include/asm-alpha/page.h will do the right thing, since it works on 2.4

You have to add a few bits to arch/alpha/kernel/traps.c

I could be wrong though...

Later,
David S. Miller
[EMAIL PROTECTED]

Re: [UPDATE] zerocopy patch against 2.4.2-pre2
Andrew Morton writes:

> Changing the memory copy function did make some difference
> in my setup. But the performance drop on send(8k) is only approx 10%,
> partly because I changed the way I'm testing it - `cyclesoak' is
> now penalised more heavily by cache misses, and amount of cache
> missing which networking causes cyclesoak is basically the same,
> whether or not the ZC patch is applied.

Ok ok ok, but are we at the point where there are no sizable
"over the wire" performance anomalies anymore?  That is what is
important, what are the localhost bandwidth measurements looking
like for you now with/without the patch applied?

I want to reach a known state where we can conclude "over the wire is
about as good or better than before, but there is a cpu/cache usage
penalty from the zerocopy stuff".

This is important.  It lets us get to the next stage which is to use
your tools, numbers, and some profiling to see if we can get some of
that cpu overhead back.

Later,
David S. Miller
[EMAIL PROTECTED]

[UPDATE] zerocopy + powder rule
The only change is to update things to 2.4.2-pre3:

ftp://ftp.kernel.org/pub/linux/kernel/people/davem/zerocopy-2.4.2p3-1.diff.gz

All the reports I am getting now appear to be consistent, and they
all basically show me that:

1) There are no known bugs (as in things that crash the kernel
   or corrupt data)

2) The loopback etc. raw performance anomalies have been killed
   by the P-II Mendocino unaligned memcpy workaround.

3) The acenic/gbit performance anomalies have been cured
   by reverting the PCI mem_inval tweaks.

4) The zerocopy patches have a small yet non-negligible cpu usage
   cost for normal write/send/sendmsg.

If this truly is the current state of affairs, then I am pretty happy
as this is where I wanted things to be when I first began to publish
these zerocopy diffs.  The next step is to begin profiling things
heavily to see if we can get back some of that extra cpu usage the
paged SKBs afford us.

Due to the powder rule (Lake Tahoe received 6 or so feet of snow this
past weekend) I will be a bit quiet until Friday night.  However, I'll
be doing my own profiling of the zerocopy stuff on my laptop while I'm
up there.

Later,
David Snowboard Miller
[EMAIL PROTECTED]

Re: [PATCH] starfire reads irq before pci_enable_device.
Jeff Garzik writes:

> And in another message, On Mon, 12 Feb 2001, David S. Miller wrote:
> > 3) The acenic/gbit performance anomalies have been cured
> >    by reverting the PCI mem_inval tweaks.
>
> Just to be clear, acenic should or should not use MWI?
>
> And can a general rule be applied here? Newer Tulip hardware also
> has the ability to enable/disable MWI usage, IIRC.

I think this is an Acenic specific issue.  The second processor on
the Acenic board is only there to work around bugs in their DMA
controller.

Later,
David S. Miller
[EMAIL PROTECTED]

Re: MTU and 2.4.x kernel
[EMAIL PROTECTED] writes:

> A. Datagram protocols do not work with mtus not allowing to send
>    512 byte frames (even DNS).

This smells bad.  Datagram protocol send sizes are only limited by
socket buffer size, nothing more.  Fragmentation makes it work.

If you are really talking about side effects of UDP path-mtu, then
I will turn off UDP path-mtu by default in 2.4.x because it is
obviously very broken either conceptually or in our implementation.
:-)

Later,
David S. Miller
[EMAIL PROTECTED]

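To make "fragmentation makes it work" concrete, here is a small arithmetic sketch, assuming a 20-byte IPv4 header with no options and an 8-byte UDP header. The function name and constants are made up for the example; it computes the minimum number of fragments, using the IPv4 rule that every fragment except the last carries a multiple of 8 data bytes:

```c
#include <assert.h>

#define IPV4_HDR_LEN 20u	/* header without options (assumption) */
#define UDP_HDR_LEN   8u

/* Minimum number of IPv4 fragments needed to carry a UDP payload of
 * payload_len bytes over a link with the given MTU. */
static unsigned int sketch_udp_fragments(unsigned int payload_len,
					 unsigned int mtu)
{
	unsigned int total = payload_len + UDP_HDR_LEN;	/* IP payload bytes */
	unsigned int max_last, per_frag;

	assert(mtu >= IPV4_HDR_LEN + 8u);
	max_last = mtu - IPV4_HDR_LEN;	/* last fragment may be any size */
	per_frag = max_last & ~7u;	/* others: multiple of 8 bytes */

	if (total <= max_last)
		return 1;		/* fits without fragmentation */
	return 1 + (total - max_last + per_frag - 1) / per_frag;
}
```

So a 512-byte DNS-sized datagram still goes out over a 296-byte MTU link, just as two fragments instead of one packet.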
[UPDATE] Zerocopy BETA 1, against 2.4.2-pre4
I'm calling this "BETA 1" because I currently feel that all
performance and other issues have been addressed and that the patch is
up for serious consideration for inclusion into a future 2.4.x
release:

ftp://ftp.kernel.org/pub/linux/kernel/people/davem/zerocopy-2.4.2p4-1.diff.gz

Besides merging to 2.4.2-pre4 the main change in this release is a
totally revamped paged-SKB sendmsg implementation by Alexey.  I truly
believe now that bandwidth/latency is back to where we were before the
zerocopy patches, and preliminary testing done by Andrew Morton
supports this.  (actually, in my own testing, latency over loopback
seems to have improved)

Some verbose TCP debugging is enabled in this release, most of the
messages are harmless 99% of the time.  If these messages bother you
just set "FASTRETRANS_DEBUG" back to "1" in include/net/tcp.h

Thanks.

Later,
David S. Miller
[EMAIL PROTECTED]

Re: 2.4 tcp very slow under certain circumstances (Re: netdev issues (3c905B))
Ookhoi writes:

> We have exactly the same problem but in our case it depends on the
> following three conditions: 1, kernel 2.4 (2.2 is fine), 2, windows ip
> header compression turned on, 3, a free internet access provider in
> Holland called 'Wish' (which seemes to stand for 'I Wish I had a faster
> connection').
> If we remove one of the three conditions, the connection is oke. It is
> only tcp which is affected.
> A packet on its way from linux server to windows client seems to get
> dropped once and retransmitted. This makes the connection _very_ slow.

:-(  I hate these buggy systems.

Does this patch below fix the performance problem and are the windows
clients win2000 or win95?

--- include/net/ip.h.~1~	Mon Feb 19 00:12:31 2001
+++ include/net/ip.h	Wed Feb 21 02:56:15 2001
@@ -190,9 +190,11 @@
 
 static inline void ip_select_ident(struct iphdr *iph, struct dst_entry *dst)
 {
+#if 0
 	if (iph->frag_off&__constant_htons(IP_DF))
 		iph->id = 0;
 	else
+#endif
 		__ip_select_ident(iph, dst);
 }
 

Re: Problem with 2.2.19pre9 (Connection closed.)
Alan Cox writes:

> Dave - any ideas, shall we back it out and work on it for 2.2.20 ?

The one change which is probably causing this is non-critical, so let
me study things quickly tonight and if I come up with nothing I'll
show you what you can revert safely.

Later,
David S. Miller
[EMAIL PROTECTED]

Re: 2.4 tcp very slow under certain circumstances (Re: netdev issues (3c905B))
Jordan Mendelson writes:

> Now, if it didn't have the side effect of dropping packets left and
> right after ~4000 open connections (simultaneously), I could finally
> move our production system to 2.4.x.

There is no reason my patch should have this effect.

All of this is what appears to be a bug in Windows TCP header
compression, if the ID field of the IPv4 header does not change then
it drops every other packet.

The change I posted as-is, is unacceptable because it adds unnecessary
cost to a fast path.  The final change I actually use will likely
involve using the TCP sequence numbers to calculate an "always
changing" ID number in the IPv4 headers to placate these broken
windows machines.

Later,
David S. Miller
[EMAIL PROTECTED]

[UPDATE] Zerocopy BETA 2 against 2.4.2 final.
Usual place:

ftp://ftp.kernel.org/pub/linux/kernel/people/davem/zerocopy-2.4.2-1.diff.gz

Besides merging to the 2.4.2-final release there are two bug fixes:

1) New TCP receive queue collapser could trigger assertion
   failures in tcp_recvmsg(), reason: uninitialized skb->used
   field in fresh SKB allocated for collapsing.

2) IP header IDs are generated differently on big vs. little
   endian systems, added htons() to fix.

Some have asked why this isn't pushed to Alan for his AC patches yet,
the reason is that I want to fully resolve the final few performance
issues that remain (1.5K mtu on gbit still has some warts).  Once
those are cleared and everyone involved is satisfied that there are
no performance regressions against vanilla 2.4.2, I will ask Alan to
consider including it.

Later,
David S. Miller
[EMAIL PROTECTED]

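As a sketch of what the htons() fix in item 2 buys (not the actual kernel change; sketch_store_id and the two-byte buffer are hypothetical), converting the host value to network byte order before storing it makes the bytes on the wire identical on big and little endian machines:

```c
#include <assert.h>
#include <arpa/inet.h>
#include <stdint.h>
#include <string.h>

/* Stores a 16-bit identification value into a header buffer in
 * network (big endian) byte order, whatever the host byte order is. */
static void sketch_store_id(uint8_t out[2], uint16_t host_id)
{
	uint16_t wire = htons(host_id);

	assert(out);
	memcpy(out, &wire, sizeof(wire));
}
```

Without the conversion, a little endian host would emit the two bytes swapped relative to a big endian one for the same counter value.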
Re: [PATCH] drivers/net/sunhme.c, unbalanced and unchecked ioremap()
Andrey Panin writes:

> I found that sunhme.c doesn't check ioremap() return value and doesn't
> call iounmap() on module unload. Attached patch (for 2.4.1-ac20) should fix it,
> compiles clearly, but untested (I have no such hardware).

Thanks, I've applied this patch.

Later,
David S. Miller
[EMAIL PROTECTED]

Re: [Patch] 2.4.2: af_unix.c warnings
Russell King writes:
> The following patch fixes these warnings:

Thanks, applied.

Later,
David S. Miller
[EMAIL PROTECTED]
Re: ipv4: 2.4.2: unused static variables
Russell King writes:
> With CONFIG_SYSCTL=n, I get the following warnings:
>
> sysctl_net_ipv4.c:50: warning: `tcp_retr1_max' defined but not used
> sysctl_net_ipv4.c:52: warning: `ip_local_port_range_min' defined but not used
> sysctl_net_ipv4.c:53: warning: `ip_local_port_range_max' defined but not used
>
> These are defined static in sysctl_net_ipv4.c, and appear to only be
> exported via procfs. In other words, you can set them to whatever you
> like and the IPv4 stack couldn't care less.
>
> Why do we have them? If they're not used, can we either eliminate them,
> or else move their definition within the '#ifdef CONFIG_SYSCTL' to
> eliminate the warning?

They aren't set to anything because they are not sysctl "values", they are sysctl "limits". I.e. they tell the sysctl layer what legal range the user's setting of a particular sysctl must reside in.

The fix is to enclose these things in CONFIG_SYSCTL, which I have done in my tree. Thanks for bringing this to my attention.

Later,
David S. Miller
[EMAIL PROTECTED]
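The "values" versus "limits" distinction can be sketched roughly as follows. This is a userspace illustration only: struct int_tunable and tunable_set() are made-up names, not the kernel's sysctl structures, and the limits here play the role that variables such as ip_local_port_range_min play in the message above.

```c
#include <errno.h>
#include <stddef.h>

/*
 * Hypothetical tunable: the value is what the rest of the code reads,
 * while min/max are only consulted when a user tries to change it.
 */
struct int_tunable {
    int value;
    const int *min;  /* lowest legal value, or NULL for no lower bound */
    const int *max;  /* highest legal value, or NULL for no upper bound */
};

/* Accept new_value only if it lies inside the configured limits. */
int tunable_set(struct int_tunable *t, int new_value)
{
    if (t->min != NULL && new_value < *t->min)
        return -EINVAL;
    if (t->max != NULL && new_value > *t->max)
        return -EINVAL;
    t->value = new_value;
    return 0;
}
```

Since the limits are only touched by the code that handles user writes, it makes sense that they appear unused when that code is compiled out.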
[UPDATE] zerocopy BETA 3
Usual spot:

ftp://ftp.kernel.org/pub/linux/kernel/people/davem/zerocopy-2.4.2-2.diff.gz

Changes since last installment:

1) More errors in TCP receive queue collapser are discovered and fixed.
2) Several URG handling details on receive side are made more consistent and sane.
3) Workaround for win2000/95 VJ header compression bugs is implemented.
4) Update to latest 3c59x driver from Andrew, this should cure some link type detection problems.
5) IP conntrack fix from Rusty.

Please test. To my knowledge the only issues remaining now are the gbit performance issues, which are being discussed by Pekka and Alexey.

Later,
David S. Miller
[EMAIL PROTECTED]
A plea for help, forwarded message from postmaster@morotsmedia.se
Unless someone can tell me who is the recipient on the linux-kernel list generating these bogus virus bounces back to me, I am going to have no choice but to unsubscribe the entire *.se domain to try and get rid of this guy.

Thanks.

Your mail was received, but looked like it might contain a virus and was not delivered. Please do not respond to this mail, it is only an autoreply.
Re: A plea for help, forwarded message from postmaster@morotsmedia.se
Mohammad A. Haque writes:
> From autoreply headers...
> Message-Id: <[EMAIL PROTECTED]>
> From: [EMAIL PROTECTED]
> Sender: [EMAIL PROTECTED]
>
> Other posts from jborg...
> From: Jakob Borg <[EMAIL PROTECTED]>
> ..
> --
> Jakob Borg            mailto:[EMAIL PROTECTED] (personal)
> UNIX/network admin    mailto:[EMAIL PROTECTED] (development)
> systems programmer    mailto:[EMAIL PROTECTED] (work)
>                       http://jakob.borg.pp.se/

Thanks a lot, he's gone.

Later,
David S. Miller
[EMAIL PROTECTED]
Re: possible bug x86 2.4.2 SMP in IP receive stack
Sounds like a bug wrt. SKB allocations in the Myrinet driver. You're the author of most of that code, so I'm sure you're the best one to audit it :-)

Later,
David S. Miller
[EMAIL PROTECTED]
RE: UDP attack? How to suppress kernel msgs?
This should fix your problem:

--- include/net/sock.h.~1~  Thu Feb 22 21:12:12 2001
+++ include/net/sock.h  Sun Feb 25 21:26:16 2001
@@ -1279,7 +1279,7 @@
  * Enable debug/info messages
  */
 
-#if 0
+#if 1
 #define NETDEBUG(x) do { } while (0)
 #else
 #define NETDEBUG(x) do { x; } while (0)
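For readers unfamiliar with the idiom, here is a small self-contained sketch of how a macro in the style of NETDEBUG can be compiled away. NETDEBUG_ENABLED, debug_calls and handle_short_packet() are illustrative names only and do not exist in the kernel; the two macro bodies mirror the ones in the patch above.

```c
/* Assumption: an illustrative compile-time switch, not a kernel option. */
#define NETDEBUG_ENABLED 0

#if NETDEBUG_ENABLED
#define NETDEBUG(x) do { x; } while (0)   /* run the debug statement */
#else
#define NETDEBUG(x) do { } while (0)      /* discard it entirely */
#endif

int debug_calls;  /* counts debug statements that actually executed */

/*
 * The do { } while (0) wrapper makes the macro behave as one statement,
 * so it is safe in an unbraced if/else like this one.
 */
int handle_short_packet(int len)
{
    if (len < 8)
        NETDEBUG(debug_calls++);  /* vanishes when debugging is off */
    else
        return 0;
    return -1;
}
```

With the switch set to 0 the argument is never evaluated, which is how flipping the `#if` in the patch silences the messages without touching any call site.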
Re: [UPDATE] zerocopy BETA 3
Chris Wedgwood writes:
> --- linux-2.4.2/include/net/ip.h Sun Feb 25 01:15:19 2001
> +++ linux-2.4.2+zc-2/include/net/ip.h Sun Feb 25 01:53:52 2001

You need the part that adds "id" to the sock struct too. This won't build "as-is".

Besides, I'd like people to have to test the zerocopy stuff for me, they'll get the ID fix if they do that :-)

Later,
David S. Miller
[EMAIL PROTECTED]
Re: New net features for added performance
Jeff Garzik writes:
> 1) Rx Skb recycling.
 ...
> Advantages: A de-allocation immediately followed by a reallocation is
> eliminated, less L1 cache pollution during interrupt handling.
> Potentially less DMA traffic between card and host.
 ...
> Disadvantages?

It simply cannot work. As Alexey stated, in normal circumstances netif_rx() queues until the user reads the data. This is the whole basis of our receive packet processing model within softint/user context.

Secondly, I can argue that skb recycling can give _worse_ cache performance. If the next use and access by the card to the skb data is deferred, this gives the cpu a chance to displace those lines in its cache naturally instead of being forced quickly to do so when the device touches that data.

If the device forces the cache displacement, those cache lines become empty until filled with something later (smaller utilization of total cache contents) whereas natural displacement puts useful data into the cache at the time of the displacement (larger utilization of total cache contents).

It is an NT/windows driver API rubbish idea, and it is full crap.

> 2) Tx packet grouping.
 ...
> Disadvantages?

See Torvalds vs. world discussion on this list about API entry points which pass multiple pages at a time versus simpler ones which pass only a single page at a time. :-)

> 3) Slabbier packet allocation.
 ...
> Disadvantages? Doing this might increase cache pollution due to
> increased code and data size, but I think the hot path is much improved
> (dequeue a properly sized, initialized, skb-reserved'd skb off a list)
> and would help mitigate the impact of sudden bursts of traffic.

I don't know what I think about this one, but my hunch is that it will lead to worse data packing via such an allocator.

Later,
David S. Miller
[EMAIL PROTECTED]
Re: [UPDATE] zerocopy.. While working on ip.h stuff
Benjamin C.R. LaHaise writes:
> Since the ip header fits in the cache of some CPUs (like the P4),
> this is becoming a cheaper operation than ever before.

At gigapacket rates, it becomes an issue.

This guy is talking about tinkering with new IP _options_, not just the header. So even if the IP header itself fits totally in a cache line, the options afterwards likely will not and thus require another cache miss.

Later,
David S. Miller
[EMAIL PROTECTED]
Re: [UPDATE] zerocopy.. While working on ip.h stuff
Michael Peddemors writes:
> A few things.. why is ip.h not part of the linux/include/net rather than
> linux/include/linux hierarchy?

Exported to older userlands...

> Defined items that are not used anywhere in the source..
> Can any of them be deleted now?

So what, userland makes use of them :-)

> Also, I was looking into some RFC 1812 stuff. (Thanks for nothing Dave :) and
> was looking at 4.2.2.6 where it mentions that a router MUST implement the End
> of Option List option.. Haven't figured out where that is implemented yet..

egrep "IPOPT_END" net/ipv4/ip_options.c

You just aren't looking hard enough.

> Also was trying to figure out some things.
> I want to create a new ip_option for use in some DOS protection experiments.
> I have a whole 40 bytes (+/-) to share... Now although I don't see anything
> explicitly prohibiting the use of unused IP Header option space, I know that
> it really was designed for use by the sending parties, and not routers in
> between.. Has anyone seen any RFC that explicitly says I MUST NOT?

Not to my knowledge. Routers already change the time to live field, so I see no reason why they can't do smart things with special IP options either (besides efficiency concerns :-).

Later,
David S. Miller
[EMAIL PROTECTED]
Re: New net features for added performance
Andi Kleen writes:
> 4) Better support for aligned RX by only copying the header

Andi, you can make this now:

1) Add new "post-header data pointer" field in SKB.

2) Change drivers to copy into aligned headroom as you mention, and they set this new post-header pointer as appropriate. For normal drivers without alignment problems, generic code sets the pointer up just like it does the rest of the SKB header pointers now.

3) Enforce correct usage of it in all the networking :-)

I would definitely accept such a patch for the 2.5.x series. It seems to be a nice idea and I currently see no holes in it.

Later,
David S. Miller
[EMAIL PROTECTED]
Re: 2.4 tcp very slow under certain circumstances (Re: netdev issues (3c905B))
Simon Kirby writes:
> Has such a patch gone in to the kernel yet?

Yep, it is in both the zerocopy and AC patches. (Linus is away at the moment)

Later,
David S. Miller
[EMAIL PROTECTED]
Re: New net features for added performance
Jeff Garzik writes:
> I only want to know if more are coming, not actually pass multiples..

Ok, then my only concern is that the path from "I know more is coming" down to hard_start_xmit invocation is long. It would mean passing a new piece of state a long distance inside the stack from SKB origin to device.

Later,
David S. Miller
[EMAIL PROTECTED]
Re: New net features for added performance
Andi Kleen writes:
> Or did I misunderstand you?

What is wrong with making methods, keyed off of the ethernet protocol ID, that can do the "I know where/how-long headers are" stuff for that protocol? Only cards with the problem call into this function vector or however we arrange it, and then for those that don't have these problems at all we can make NULL a special value for this "post-header" pointer.

You can pick some arbitrary number, sure, that is another way to do it. Such a size would need to be chosen very carefully though.

Later,
David S. Miller
[EMAIL PROTECTED]
Re: RFC: vmalloc improvements
Reto Baettig writes:
> The RPC server needs lots of 2MB receive buffers which are
> allocated using vmalloc because the NIC has its own pagetables.

Why not just allocate the pages separately and keep track of where they are? Since the NIC has all the page tabling facilities on its end, the cpu side is just a software issue. You can keep an array of pages however large you need to keep track of that.

vmalloc() was never meant to be used on this level and doing so is asking for trouble (it's also deadly expensive on SMP due to the cross-cpu tlb invalidates using vmalloc() causes).

Later,
David S. Miller
[EMAIL PROTECTED]
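The "array of pages" bookkeeping suggested above might look roughly like this. It is a userspace sketch under stated assumptions: calloc() stands in for whatever per-page allocator the driver would really use, PAGE_SZ is assumed to be 4096, and struct page_buf and its functions are hypothetical names.

```c
#include <stddef.h>
#include <stdlib.h>

#define PAGE_SZ 4096u  /* assumption: 4 KB pages */

/* A large logical buffer built from separately allocated pages. */
struct page_buf {
    size_t npages;
    unsigned char **pages;  /* pages[i] points at the i-th page */
};

void page_buf_free(struct page_buf *b)
{
    size_t i;

    for (i = 0; i < b->npages; i++)
        free(b->pages[i]);          /* free(NULL) is harmless */
    free(b->pages);
    b->pages = NULL;
    b->npages = 0;
}

/* Allocate enough individual pages to cover 'bytes'; 0 on success. */
int page_buf_alloc(struct page_buf *b, size_t bytes)
{
    size_t i;

    if (bytes == 0)
        return -1;
    b->npages = (bytes + PAGE_SZ - 1) / PAGE_SZ;
    b->pages = calloc(b->npages, sizeof(*b->pages));
    if (b->pages == NULL) {
        b->npages = 0;
        return -1;
    }
    for (i = 0; i < b->npages; i++) {
        b->pages[i] = calloc(1, PAGE_SZ);
        if (b->pages[i] == NULL) {
            page_buf_free(b);
            return -1;
        }
    }
    return 0;
}

/* Translate a logical byte offset into (page index, offset in page). */
unsigned char *page_buf_at(struct page_buf *b, size_t off)
{
    return &b->pages[off / PAGE_SZ][off % PAGE_SZ];
}
```

The pages need not be contiguous, so no virtually contiguous mapping has to be set up or torn down; the device would be handed the individual page addresses instead.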
Re: Linux 2.4.1 network (socket) performance
Richard B. Johnson writes:
> > unix socket sends eat into memory reserved for atomic allocs.

OK (Manfred is being quoted here, to be clear).

I'm still talking with Alexey about how to fix this. I might just prefer killing this fallback mechanism of sock_alloc_send_skb and then make AF_UNIX act just like everyone else.

This was always just a performance hack, and one which makes less and less sense as time goes on.

Later,
David S. Miller
[EMAIL PROTECTED]
Re: rsync over ssh on 2.4.2 to 2.2.18
Russell King writes:
> Please note: although I am using 2.2.15pre13, it is _not_ the cause of
> this problem

How do you know this? There are so many deadly TCP bugs fixed since 2.2.15pre13 I don't know how you can assert this.

Later,
David S. Miller
[EMAIL PROTECTED]
Re: rx_copybreak value for non-i386 architectures
Jun Sun writes:
> I notice that many net drivers set rx_copybreak to 1518 (the max packet size)
> for non-i386 architectures. Once I thought I understood it and it seems
> related to cache line alignment. However, I am not sure exactly about the
> reason now. Can someone enlighten me a little bit?

Most non-x86 architectures take a large hit for unaligned accesses. If the ethernet chip cannot land the beginning of the packet at an arbitrary byte offset (a modulo 2 offset for ethernet is needed for an aligned IP header) then the rx_copybreak is set to the ethernet MTU so that all packets get copied into new buffers where they can have their header aligned.

Later,
David S. Miller
[EMAIL PROTECTED]
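The alignment arithmetic behind this can be sketched as below. The function names are hypothetical, the receive buffer itself is assumed to start on a 4-byte boundary, and the copy test is written the way such drivers commonly express it (copy when the packet is shorter than rx_copybreak), which is an assumption rather than a quote from any particular driver.

```c
#include <stddef.h>

enum { ETH_HEADER_LEN = 14 };  /* dst MAC (6) + src MAC (6) + type (2) */

/* Offset of the IP header when the frame starts 'pad' bytes into the buffer. */
size_t ip_header_offset(size_t pad)
{
    return pad + ETH_HEADER_LEN;
}

/* IP header fields want 4-byte alignment (buffer assumed 4-byte aligned). */
int ip_header_is_aligned(size_t pad)
{
    return ip_header_offset(pad) % 4 == 0;
}

/* Hypothetical copybreak test: copy packets shorter than the threshold. */
int should_copy(size_t pkt_len, size_t rx_copybreak)
{
    return pkt_len < rx_copybreak;
}
```

With no padding the IP header lands at offset 14, which is misaligned; with 2 bytes of padding it lands at offset 16. A chip that cannot apply that 2-byte offset itself leaves the driver to copy every packet, which is what a threshold of 1518 achieves for normal-sized frames.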
Re: [RFC] pci_dma_set_mask()
Zach Brown writes:
> > extremely minor nit that I think pci_set_dma_mask should return ENODEV
> > or EIO or something on error, and zero on success.
>
> I agree, though I'd like to leave the decision up to people who live and
> breathe this stuff.
>
> please feel free to make minor adjustments and submit :)

Jeff/Zach, I agree, I'm fully for such a patch, but please update the documentation! It is the most important part of the patch.

Later,
David S. Miller
[EMAIL PROTECTED]
Re: The IO problem on multiple PCI busses
Dan Malek writes:
> "David S. Miller" wrote:
>
> > I played around with something akin to this, and some of the necessary
> > Xfree86-4.0.x hackery needed, some time ago. But I never finished
> > this.
>
> Sounds pretty sweet. How about we finish it? Any complaints (well
> reasonable ones :-) or concerns that came out of discussions or
> your testing we need to consider?

There is only one sticking point, and that is how to convey to the mmap() call whether you want I/O or Memory space. In the end, my analysis came up with basically an ioctl() on the same PCI device node to set this, and you could keep track of this state in the filp private area.

I thought originally you could do this with the lower bits of the mmap() offset, but that won't work in 2.4.x because they are stripped out and you only get a page number by the time the driver mmap call runs.

I really like this solution because it does not involve any new syscalls to be added to glibc and/or the Xfree86 arch/os specific code. Just opening files, mmap, and an ioctl number or two. All of this can be shared between ports.

As a side note, Alpha has a special PCI syscall to get the "PCI controller number" a given PCI device is behind. We could add another ioctl number which does the same thing on /proc/bus/pci/*/* nodes. This way sparc64 and Alpha could have the same user visible API for this as well.

Later,
David S. Miller
[EMAIL PROTECTED]
Re: Kernel is unstable
Andrea Arcangeli writes:
> If it happened to be buggy it didn't look unfixable from a design standpoint
> and I think it was a very worthwhile feature, not just for memory but also to
> avoid growing the size of the avl that we would have to pay later all the time
> at each page fault.

Linus didn't find it to be such a gain, and in fact the one place that does gain from such merging (sys_brk()) does the merging by hand :-)

Later,
David S. Miller
[EMAIL PROTECTED]
Re: The IO problem on multiple PCI busses
Benjamin Herrenschmidt writes:
> Also, the problem of finding where the legacy ISA IOs of a given PCI bus
> are is a bit different than simply mmap'ing a BAR. Some video cards
> require some access to their VGA IOs without having a BAR covering them,
> in some cases it's necessary to switch the chip from VGA to MMIO mode.

Many platforms, sparc64 included, do not have an ISA IO space nor do they provide VGA accesses at all.

If things such as XFree86 are coded for such platforms to not require VGA accesses (the 'ati' driver is already like this when certain build time defines are set), this could become a non-issue in this case.

> So what would be a preferred way ? Create that fake ISA bus number and
> provide functions for looking them up, getting their IO and mem bases,
> and eventually mapping PCI busses to ISA busses ? Or does someone have a
> better idea ? The goal is to try not to change the semantics of inb/outb
> and friends so that most legacy drivers can still work using the
> "default" IO bus if they are not upgraded to the new scheme.

There is no 'fake' ISA bus number you need. There is a 'real' one, the one on which the PCI-->ISA bridge lives, why not use that one :-)

Then you could find such an ISA bridge, open that PCI device, then finally perform the PCI_IOCTL_GETIOBASE thingy on it. But I don't like this get-iobase idea at all, see my next email in this thread for why.

Later,
David S. Miller
[EMAIL PROTECTED]
Re: [RFC] pci_set_dma_mask() + doc :)
Zach Brown writes:
> please feel free to flame or apply, I'm not sure I'm really fond of the
> code example..

Seems fine to me.

Later,
David S. Miller
[EMAIL PROTECTED]
Re: The IO problem on multiple PCI busses
Benjamin Herrenschmidt writes:
> I'm, of course open to any comments about this (in fact, I'd really like
> some feedback). One thing is that we also need to find a way to pass
> those infos to userland. Currently, we implement an arch-specific syscall
> that allows us to retrieve the IO physical base of a given PCI bus. That may
> be enough, but we may also want something that matches more closely what we
> do in the kernel.

Same problem on sparc64. Using a special PCI syscall is fine, _if_ we all end up using the same one. However, I would prefer another mechanism...

I think a cleaner scheme is to allow mmap() on /proc/bus/pci/${BUS}/${DEVICE} nodes. That is much cleaner and transparently solves any "different word size between userland and kernel" issues (specifically 32-bit userlands executing on 64-bit kernels).

I played around with something akin to this, and some of the necessary Xfree86-4.0.x hackery needed, some time ago. But I never finished this.

Later,
David S. Miller
[EMAIL PROTECTED]
Re: The IO problem on multiple PCI busses
Dan Malek writes:
> It actually caused me to think of something else. I have cards
> with multiple memory and I/O spaces (rare, but I have them).

So what? All such BARs within mem/io space are part of unique regions of the total MEM/IO space. Thus you can pass non-conflicting offset/size pairs, based upon the BAR value of interest, to mmap and everything is fine.

Later,
David S. Miller
[EMAIL PROTECTED]
Re: The IO problem on multiple PCI busses
Grant Grundler writes:
> A nice side effect of this bloat is it will discourage use of I/O
> Port space. That's good for everyone, AFAICT. (I know some devices
> *only* support I/O port space and I personally don't care about
> them. If someone who does care about one wants to talk to me about
> it...fine...I'll help)

There is another case you are ignoring. Some devices support memory space as well as I/O space, but only operate reliably when their I/O space window is used to access them.

It just sounds to me like the hppa pci controllers are crap, especially the GSC one. At least the rope one does something reasonable when you have a 64-bit kernel. The horrors you've told me about the IOMMUs and stream-caches on these chips further confirm my theory :-)

Later,
David S. Miller
[EMAIL PROTECTED]
Re: The IO problem on multiple PCI busses
Benjamin Herrenschmidt writes:
> Also, an ioctl to retrieve the iobase would be useful too

No, the whole point of my suggested mmap() interface is to _ENTIRELY_ eliminate any reason for the user to even see what the physical addressing of the machine looks like. If you start pushing iobases to the user, you break this.

I do not want an interface where the user still has to do grotty stuff like mmap() on /dev/{mem,kmem}. This was the core of the problem I had with the syscall idea, don't bring it back.

Make mmap()'s on a PCI-->ISA bridge do something special, for example. The user doesn't need to know anything about physical addressing of the machine, it all can and should be abstracted away.

This is why I really detest the XFree86 PCI bus probing layer, it should not need to poke around at so much of the config space information of devices :-( It is the reason why, at least still today in Xfree86 CVS, it simply cannot cope with multiple PCI controllers in a machine because it assumes a flat MEM/IO space. They know about the problem and are working on fixes, but my point is that making this overly knowledgeable PCI prober in the first place is what created these problems.

Later,
David S. Miller
[EMAIL PROTECTED]
Re: Q: explicit alignment control for the slab allocator
Manfred, why are you changing the cache alignment to SMP_CACHE_BYTES? If you read the original SLAB papers and other documents, the code intends to color the L1 cache not the L2 or subsidiary caches.

Later,
David S. Miller
[EMAIL PROTECTED]
Re: 2.4.2 TCP window shrinking
Jim Woodward writes:
> This has probably been covered but I saw this message in my logs and
> wondered what it meant?
>
> TCP: peer xxx.xxx.1.11:41154/80 shrinks window 2442047470:1072:2442050944.

Bad, what else can I say?

> Is it potentially bad? - Ive only ever seen it twice with 2.4.x

We need desperately to know exactly what OS the xxx.xxx.1.11 machine is running. Because you've masked out the first two octets, I cannot check this myself using nmap.

Later,
David S. Miller
[EMAIL PROTECTED]
Re: The IO problem on multiple PCI busses
Benjamin Herrenschmidt writes:
> What I call ISA IOs here doesn't necessarily mean there's an ISA bridge
> on the PCI.

Ok.

> On PPC, we don't have an "IO" space either, all we have is a range of
> memory addresses that will cause IO cycles to happen on the PCI bus.

This is precisely what the "next MMAP is XXX space" ioctl I've suggested is for. I think I've addressed this concern in my proposal already. Look:

	fd = open("/proc/bus/pci/${BUS}/${DEV}", ...);
	if (fd < 0)
		return -errno;
	err = ioctl(fd, PCI_MMAP_IO, 0);
	if (err < 0) {
		err = -errno;	/* save before close() can clobber errno */
		close(fd);
		return err;
	}
	ptr = mmap(NULL, pdev->bar[3].size, PROT_READ | PROT_WRITE,
		   MAP_PRIVATE, fd, pdev->bar[3].start);

Something like that.

> Without that, we need to create new versions of inb/outb that take a bus
> number.

No, don't do this, it is evil. Use mappings, specify the device related info somehow when creating the mapping (in the userspace variant you do this by opening a specific device to mmap, in the kernel variant you can encode the bus/dev/etc. info in the device's resource and decode this at ioremap() time, see?).

Later,
David S. Miller
[EMAIL PROTECTED]
Re: The IO problem on multiple PCI busses
Benjamin Herrenschmidt writes:
> There is still the need, in the ioctl we use to "select" what needs to be
> mapped by the next mmap, to ask for the "legacy IO range of the bus where
> the card resides" (if it exists of course). That would be the 0-64k (or less,
> actually a couple of pages would probably be enough) that generates IO cycles
> in the "low" addresses used for VGA registers on the card.

As I've stated in another email, this is perfectly fine and is precisely the kind of thing implied by my original proposal in this thread. You can even have arch-specific "next mmap is" ioctl values to do "special things".

The generic part of the ioctl()/mmap() bits the PCI driver will have added won't care about these ioctl's all that much, the include/asm/pcimmap.h header will deal with all such details. This header is also where the physical address and the actual creation of the page table mappings will occur. The generic PCI code will only provide the skeletal parts of the mmap() method and call into the arch-specific hooks coded in asm/pcimmap.h

Later,
David S. Miller
[EMAIL PROTECTED]
Re: TCP Congestion Window Bug?
Mark Reginald James writes:
> TCP only sends a packet if:
>
> tcp_packets_in_flight(tp) < tp->snd_cwnd
>
> (function tcp_snd_test in include/net/tcp.h)
>
> but regards transmission as application-limited if
>
> tp->packets_out < tp->snd_cwnd
>
> (function tcp_cwnd_validate in include/net/tcp.h)
>
> So the kernel _always_ thinks the connection is
> application-limited

Why? After the final "send a packet if" test, tp->packets_out will be incremented and thus be equal to tp->snd_cwnd, marking the connection as _not_ application limited.

Later,
David S. Miller
[EMAIL PROTECTED]
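The argument can be checked with a toy model. This is a simplified sketch, not the kernel's code: struct tcp_state and the function names are made up, and it assumes no losses or retransmits, so packets in flight equals packets_out.

```c
/* Toy model of the two counters discussed above (hypothetical names). */
struct tcp_state {
    unsigned packets_out;  /* packets sent and not yet acknowledged */
    unsigned snd_cwnd;     /* congestion window, in packets */
};

/* Send test: another packet may go out only while below the window. */
int can_send(const struct tcp_state *tp)
{
    return tp->packets_out < tp->snd_cwnd;
}

/* Send as many of 'queued' packets as the window allows. */
unsigned send_until_blocked(struct tcp_state *tp, unsigned queued)
{
    unsigned sent = 0;

    while (queued > 0 && can_send(tp)) {
        tp->packets_out++;  /* incremented after the test passes */
        queued--;
        sent++;
    }
    return sent;
}

/* Application-limited check, evaluated after sending has stopped. */
int is_app_limited(const struct tcp_state *tp)
{
    return tp->packets_out < tp->snd_cwnd;
}
```

When the sender has plenty of data queued, the loop stops only once packets_out has reached snd_cwnd, so the later check reports the connection as not application-limited. It reports application-limited only when the queue ran dry first.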
Re: [PATCH] tiny MM performance and typo patches for 2.4.2
Ulrich Kunitz writes:
> patch-uk6   In 2.4.x _page_hashfn divides struct address_space pointer
>             with a parameter derived from the size of struct
>             inode. Deriving this parameter from the size of struct
>             address_space makes more sense -- at least for me.

The address_space is 99% of the time (unless swapping, and in that case the address is constant :-)) inside of an inode struct, so this change actually makes the hash worse.

I looked at this one time myself...

Later,
David S. Miller
[EMAIL PROTECTED]
Re: SLAB vs. pci_alloc_xxx in usb-uhci patch
Russell King writes:
> A while ago, I looked at what was required to convert the OHCI driver
> to pci_alloc_consistent, and it turns out that the current interface is
> highly sub-optimal. It looks good on the face of it, but it _really_
> does need sub-page allocations to make sense for USB.
>
> At the time, I didn't feel like creating a custom sub-allocator just
> for USB, and since then I haven't had the inclination nor motivation
> to go back to trying to get my USB mouse or iPAQ communicating via USB.
> (I've not used this USB port for 3 years anyway).

Gerard Roudier wrote for the sym53c8xx driver the exact thing UHCI/OHCI need for this.

I think people are pissing their pants over the pci_alloc_consistent interface for no reason. It gives PAGE
Re: So, what about kwhich on RH6.2?
Date: Wed, 03 Jan 2001 22:08:33 -0800 From: Pete Zaitcev <[EMAIL PROTECTED]> Are we going to use Miquel's patch? I cannot build fresh 2.2.x on plain RH6.2 without it. The 2.2.19-pre6 comes out without it. Or is "install new bash" the official answer? Alan? I do not understand, I just got a working 2.2.19-pre6 build on one of my 6.2 Sparc64 systems, what kind of failure do you see? Later, David S. Miller [EMAIL PROTECTED]
Re: [RFC, PATCH] TLB flush changes for S/390
From: Ulrich Weigand <[EMAIL PROTECTED]> Date: Mon, 1 Jan 2001 23:15:26 +0100 (MET) * Is there some reason why ptep_test_and_clear_young should *not*, after all, flush the TLB? Yes, because the accuracy of that state bit is not required to be 100% perfect. Less SMP tlb flushing traffic from vmscan runs is desirable, thus no flush. Later, David S. Miller [EMAIL PROTECTED]
Re: 2.4.0 on sparc64 build problems
The sparc64 config should never allow you to build the amd7930 and dbri sbus sound drivers, that is a bug, and I'll fix that. Later, David S. Miller [EMAIL PROTECTED]
Re: PROBLEM: 2.4.0 Kernel Fails to compile when CONFIG_IP_NF_FTP is selected
You need to enable both CONNTRACK and full NAT in your configuration. Rusty, why doesn't the Config stuff just enforce this if it is necessary when enabling FTP support etc.? Later, David S. Miller [EMAIL PROTECTED]
Re: Error building 2.4.0-prerelease
The netfilter configuration allowed you to illegally specify FTP support as non-modular, yet NAT support modular. That cannot work. I would suggest changing NAT support to be non-modular if you want FTP support non-modular. Rusty, I think this is another case where the netfilter config should be more stringent and disallow illegal combinations such as this one. Later, David S. Miller [EMAIL PROTECTED]
Re: 2.4.0 on sparc64 build problems
Date: Fri, 5 Jan 2001 16:00:21 -0800 From: Joshua Uziel <[EMAIL PROTECTED]> Basically, those two should be removed from the config options for sparc64... and in the meantime, you should build without 'em. :) Note that 2.2.x has this exact fix already, and that 2.2.x fix came from a similar bug report from Horst von Brand :-) Later, David S. Miller [EMAIL PROTECTED]
Re: reset_xmit_timer errors with 2.4.0
   Date: Fri, 5 Jan 2001 19:22:39 +0100
   From: Arkadiusz Miskiewicz <[EMAIL PROTECTED]>

   On/Dnia Fri, Jan 05, 2001 at 06:52:52AM -0800, Patrick Michael Kane wrote
   > With 2.4.0 installed, I've started to see the following errors:
   >
   > reset_xmit_timer sk=cfd889a0 1 when=0x3b4a, caller=c01e0748
   > reset_xmit_timer sk=cfd889a0 1 when=0x3a80, caller=c01e0748
   >
   the same problem here

Does the following patch fix this for people?

--- net/ipv4/tcp_input.c.~1~	Wed Dec 13 10:31:48 2000
+++ net/ipv4/tcp_input.c	Fri Jan 5 17:01:53 2001
@@ -1705,7 +1705,7 @@
 		if ((__s32)when < (__s32)tp->rttvar)
 			when = tp->rttvar;
-		tcp_reset_xmit_timer(sk, TCP_TIME_RETRANS, when);
+		tcp_reset_xmit_timer(sk, TCP_TIME_RETRANS, min(when, TCP_RTO_MAX));
 	}
 }
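A tiny sketch of the clamp the patch above introduces. The constants are assumptions made for illustration: a tick rate of 100 per second and a ceiling of 120 seconds; toy_min() and toy_clamped_timeout() are hypothetical helpers, not kernel functions.

```c
#define TOY_HZ          100U            /* assumption: 100 timer ticks per second */
#define TOY_TCP_RTO_MAX (120U * TOY_HZ) /* assumption: ceiling of 120 seconds */

static unsigned int toy_min(unsigned int a, unsigned int b)
{
    return a < b ? a : b;
}

/* Timeout that would be handed to the retransmit timer once clamped. */
static unsigned int toy_clamped_timeout(unsigned int when)
{
    return toy_min(when, TOY_TCP_RTO_MAX);
}
```

If those assumed constants hold, both values from the report (0x3b4a and 0x3a80 ticks) lie above the ceiling of 12000 ticks and would be clamped to it, while ordinary short timeouts pass through unchanged.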
Re: 2.4.0 TCP SYN problem
From: Marek Gresko <[EMAIL PROTECTED]> Date: Fri, 5 Jan 2001 18:16:34 +0100 When I initiate connection from Solaris machine everything goes OK. TCP/SYN,ACK segments are OK. Can anyone help me? Does: bash# echo "0" >/proc/sys/net/ipv4/tcp_ecn Fix the problem? If so, please send a bug report to Sun telling them that they improperly discard IP packets using ECN. Later, David S. Miller [EMAIL PROTECTED]
Re: [PATCH] hashed device lookup (Does NOT meet Linus' sumission policy!)
Unified diffs only please... Thanks. Later, David S. Miller [EMAIL PROTECTED]
Re: [patch] single copy pipe rewrite
Date: Sun, 07 Jan 2001 00:25:16 +0100 From: Manfred <[EMAIL PROTECTED]> Last March David Miller proposed using kiobuf for these data transfers, I've written a new patch for 2.4. (David's original patch contained 2 bugs: it doesn't protect properly against multiple writers and it causes a BUG() in pipe_read() when data is stored in both the kiobuf and the normal buffer) A couple months ago David posted a revised version of his patch which fixed both these and some other problems. Most of the fixes were done by Alexey Kuznetsov. Later, David S. Miller [EMAIL PROTECTED]
Re: ip_conntrack locks up hard on 2.4.0 after about 10 hours
   Date: Sat, 06 Jan 2001 10:37:54 -0500
   From: safemode <[EMAIL PROTECTED]>

   Jan 6 06:18:10 icebox kernel: reset_xmit_timer sk=c17fd040 1 when=0x5d9e, caller=c01a6bf1

I posted a fix for this on Linux-kernel yesterday, had you tested it you would have seen at least this part of your problem report go away. I'm reposting the fix for your convenience:

--- net/ipv4/tcp_input.c.~1~	Wed Dec 13 10:31:48 2000
+++ net/ipv4/tcp_input.c	Fri Jan 5 17:01:53 2001
@@ -1705,7 +1705,7 @@
 		if ((__s32)when < (__s32)tp->rttvar)
 			when = tp->rttvar;
-		tcp_reset_xmit_timer(sk, TCP_TIME_RETRANS, when);
+		tcp_reset_xmit_timer(sk, TCP_TIME_RETRANS, min(when, TCP_RTO_MAX));
 	}
 }
Re: PROBLEM: 2.4.0 Kernel Fails to compile when CONFIG_IP_NF_FTP is selected
From: Rusty Russell <[EMAIL PROTECTED]> Date: Sat, 06 Jan 2001 13:40:35 +1100 CONFIG_IP_NF_FTP controls BOTH the ftp connection tracking and NAT code. The correct fix is below (untested, but you get the idea). I've applied this, it seems fine. (I've also adapted it to the pending IRC stuff, so you don't need to send me a fix for that under separate cover). Later, David S. Miller [EMAIL PROTECTED]
Re: [patch] single copy pipe rewrite
   Date: Sun, 07 Jan 2001 01:36:22 +0100
   From: Manfred <[EMAIL PROTECTED]>

   Do you still have that patch?

I think so, see below.

   Was it posted to linux-kernel?

Yes, it was. I just found a copy, enjoy:

diff -ur ../vger3-001101/linux/fs/pipe.c linux/fs/pipe.c
--- ../vger3-001101/linux/fs/pipe.c	Sat Oct 14 18:38:24 2000
+++ linux/fs/pipe.c	Wed Nov 1 21:39:53 2000
@@ -8,6 +8,8 @@
 #include
 #include
 #include
+#include
+#include
 #include
 #include
@@ -22,6 +24,18 @@
  * -- Julian Bradfield 1999-06-07.
  */
+#define PIPE_UMAP(inode)	((inode).i_pipe->umap)
+#define PIPE_UMAPOFF(inode)	((inode).i_pipe->umap_offset)
+#define PIPE_UMAPLEN(inode)	((inode).i_pipe->umap_length)
+
+#define PIPE_UMAP_EMPTY(inode) \
+	((PIPE_UMAP(inode) == NULL) || \
+	 (PIPE_UMAPOFF(inode) >= PIPE_UMAPLEN(inode)))
+
+#define PIPE_EMPTY(inode) \
+	((PIPE_LEN(inode) == 0) && PIPE_UMAP_EMPTY(inode))
+
+
 /* Drop the inode semaphore and wait for a pipe event, atomically */
 void pipe_wait(struct inode * inode)
 {
@@ -36,6 +50,65 @@
 }
 static ssize_t
+pipe_copy_from_kiobuf(char *buf, size_t count, struct kiobuf *kio, int kio_offset)
+{
+	struct page **cur_page;
+	unsigned long cur_offset, remains_this_page;
+	char *cur_buf;
+	int kio_remains;
+
+	kio_remains = kio->length;
+	cur_page = kio->maplist;
+	cur_offset = kio->offset;
+	while (kio_offset > 0 && kio_remains > 0) {
+		remains_this_page = PAGE_SIZE - cur_offset;
+		if (kio_offset < remains_this_page) {
+			cur_offset += kio_offset;
+			kio_remains -= kio_offset;
+			break;
+		}
+		kio_offset -= remains_this_page;
+		kio_remains -= remains_this_page;
+		cur_offset = 0;
+		cur_page++;
+	}
+
+	cur_buf = buf;
+	while (kio_remains > 0) {
+		unsigned long kvaddr;
+		int err;
+
+		remains_this_page = PAGE_SIZE - cur_offset;
+		if (remains_this_page > count)
+			remains_this_page = count;
+		if (remains_this_page > kio_remains)
+			remains_this_page = kio_remains;
+
+		kvaddr = kmap(*cur_page);
+		err = copy_to_user(cur_buf, (void *)(kvaddr + cur_offset),
+				   remains_this_page);
+		kunmap(*cur_page);
+
+		if (err)
+			return -EFAULT;
+
+		cur_buf += remains_this_page;
+		count -= remains_this_page;
+		if (count <= 0)
+			break;
+
+		kio_remains -= remains_this_page;
+		if (kio_remains <= 0)
+			break;
+
+		cur_offset = 0;
+		cur_page++;
+	}
+
+	return cur_buf - buf;
+}
+
+static ssize_t
 pipe_read(struct file *filp, char *buf, size_t count, loff_t *ppos)
 {
 	struct inode *inode = filp->f_dentry->d_inode;
@@ -84,29 +157,44 @@
 	/* Read what data is available. */
 	ret = -EFAULT;
-	while (count > 0 && (size = PIPE_LEN(*inode))) {
-		char *pipebuf = PIPE_BASE(*inode) + PIPE_START(*inode);
-		ssize_t chars = PIPE_MAX_RCHUNK(*inode);
-
-		if (chars > count)
-			chars = count;
-		if (chars > size)
-			chars = size;
+	if (PIPE_UMAP(*inode)) {
+		ssize_t chars;
-		if (copy_to_user(buf, pipebuf, chars))
+		chars = pipe_copy_from_kiobuf(buf, count,
+					      PIPE_UMAP(*inode),
+					      PIPE_UMAPOFF(*inode));
+		if (chars < 0)
 			goto out;
 		read += chars;
-		PIPE_START(*inode) += chars;
-		PIPE_START(*inode) &= (PIPE_SIZE - 1);
-		PIPE_LEN(*inode) -= chars;
 		count -= chars;
 		buf += chars;
-	}
+		PIPE_UMAPOFF(*inode) += chars;
+	} else {
+		while (count > 0 && (size = PIPE_LEN(*inode))) {
+			char *pipebuf = PIPE_BASE(*inode) + PIPE_START(*inode);
+			ssize_t chars = PIPE_MAX_RCHUNK(*inode);
-	/* Cache behaviour optimization */
-	if (!PIPE_LEN(*inode))
-		PIPE_START(*inode) = 0;
+			if (chars > count)
+				chars = count;
+			if (chars > size)
+				chars = size;
+
+			if (copy_to_user(buf, pipebuf, chars))
+				goto out;
+
+			read += chars;
+			PIPE_START(*inode) += chars;
+			PIPE_START(*inode) &= (PIPE_SIZE - 1);
+
Re: [PATCH] hashed device lookup (Does NOT meet Linus' sumission policy!)
Date: Sat, 06 Jan 2001 21:06:54 -0700 From: Ben Greear <[EMAIL PROTECTED]> "David S. Miller" wrote: > > Unified diffs only please... Thanks. Hrm, here's one with a -u option, this what you're looking for? Yes, thanks a lot. Later, David S. Miller [EMAIL PROTECTED]
Re: [PATCH] hashed device lookup (Does NOT meet Linus' sumission policy!)
On Sat, Jan 06, 2001 at 02:33:27PM -0700, Ben Greear wrote: I'm hoping that I can get a few comments on this code. It was added to (significantly) speed up things like 'ifconfig -a' when running with 4000 or so VLAN devices. It should also help other instances with lots of (virtual) devices, like FrameRelay, ATM, and possibly virtual IP interfaces. It probably won't help 'normal' users much, and in its final form, should probably be a selectable option in the config process. Ben, if ifconfig uses /proc/net/dev to list devices, how can your changes speed up ifconfig? Andi mentioned in another email how he has fixed the quadratic behavior in ifconfig, you should check if it fixes your problem. Jamal has suggested dumping ifconfig and making a dummy "ifconfig" which is just a wrapper around "ip". I like this idea the most. Really, what I'm concerned about is what calls dev_get_by_{name,index} so often and in such critical places that optimizing it makes any sense? I don't mind optimizing stuff like this where needed, in fact I'm the most guilty of this, check out the complex TCP hash tables we have :-) But if it's only a problem because of poorly implemented user applications, let's fix the apps instead of adding the complexity to the kernel. Later, David S. Miller [EMAIL PROTECTED]
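A user-space sketch of where quadratic cost can come from when a listing tool performs one by-name lookup per device against a linear list. This is an assumption about the access pattern, not a description of the actual ifconfig or kernel code; struct toy_dev and the toy_* functions are hypothetical.

```c
#include <stdio.h>
#include <string.h>

#define TOY_MAX_DEVS 4096

struct toy_dev {
    char name[16];
};

static struct toy_dev toy_devs[TOY_MAX_DEVS];
static int toy_ndevs;
static unsigned long toy_compares;

/* Fill the table with n devices named vlan0, vlan1, ... (n is capped). */
static void toy_register(int n)
{
    int i;

    if (n > TOY_MAX_DEVS)
        n = TOY_MAX_DEVS;
    for (i = 0; i < n; i++)
        snprintf(toy_devs[i].name, sizeof(toy_devs[i].name), "vlan%d", i);
    toy_ndevs = n;
}

/* Linear by-name lookup that counts every string comparison it makes. */
static struct toy_dev *toy_get_by_name(const char *name)
{
    int i;

    for (i = 0; i < toy_ndevs; i++) {
        toy_compares++;
        if (strcmp(toy_devs[i].name, name) == 0)
            return &toy_devs[i];
    }
    return NULL;
}

/* One by-name lookup per device, as a naive listing tool might do. */
static unsigned long toy_list_all(void)
{
    int i;

    toy_compares = 0;
    for (i = 0; i < toy_ndevs; i++)
        toy_get_by_name(toy_devs[i].name);
    return toy_compares;
}
```

Listing n devices this way costs n(n+1)/2 comparisons: 500,500 for 1000 devices and 8,002,000 for 4000. Either a hashed lookup or a tool that walks the list once removes the quadratic term, which is why the question of who is doing the lookups matters.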
Re: [PATCH] hashed device lookup (Does NOT meet Linus' sumission policy!)
Date: Sat, 6 Jan 2001 23:00:10 -0500 (EST) From: jamal <[EMAIL PROTECTED]> I think someone should just flush ifconfig down some toilet. a wrapper around "ip" to give the same look and feel as ifconfig would be a good thing so that some stupid program that depends on ifconfig look and feel would be a good start. I could not agree more. This reminds me to do something I could not justify before, making netlink be enabled in the kernel and non-configurable. I could almost, but not quite, justify it right now just because "ip" is becoming standard and needs it. Not to stray from the subject, Ben's effort is still needed. I think real numbers are useful instead of claims like it "displayed faster" See my previous email, if it's just slow because of some poorly coded version of ifconfig, it does not justify the patch. If only a forcefully created "benchmark" can show some performance problem, that is not an acceptable reason to champion this patch. Ok? Later, David S. Miller [EMAIL PROTECTED]
Re: [little bit OT] ip _IS_ _NOT_ ifconfig and route ! (was Re: [PATCH] hashed device lookup (Does NOT meet Linus' sumission policy!))
Date: Sun, 7 Jan 2001 11:40:10 +0000 (UTC) From: "Henning P. Schmiedehausen" <[EMAIL PROTECTED]> As long as "man ip" on my machines returns "ip(7) - ip - Linux IPv4 protocol implementation", using "ip" exclusively instead of ifconfig and route is IMHO not an option for anyone else than bleeding edge hackers and linux gurus. As long as "man printf" gives me that damn shell command manpage, I will not use printf in my C applications. :-) Yes, I do understand, "ip" needs some more documentation perhaps. Nobody has suggested getting rid of ifconfig, rather we have suggested to implement it in terms of "ip" because, as you even mention, "ip" is powerful and can do everything ifconfig can do thus ifconfig can be implemented as a wrapper on top of "ip". Nobody has suggested to use "ip" exclusively, you will not invoke "ip" with the suggestion I am making. Ifconfig indirectly will, but you won't even notice nor should you care. They will be packaged together, so even that won't be an issue. Later, David S. Miller [EMAIL PROTECTED]
Re: [PATCH] hashed device lookup (Does NOT meet Linus' sumission policy!)
Date: Mon, 8 Jan 2001 01:13:08 +1300 From: Chris Wedgwood <[EMAIL PROTECTED]> OK, I'm a liar -- bind does handle this. Cool. Standard BSD allows it, what do you expect :-) This is good news, because it means there is a precedent for multiple addresses on a single interface so we can kill the : syntax in favor of the above which is cleaner and more accurately represents what is happening. If this is really true, 2.5.x is an appropriate time to make this change, no sooner. Later, David S. Miller [EMAIL PROTECTED]
[PLEASE-TESTME] Zerocopy networking patch, 2.4.0-1
I've put a patch up for testing on the kernel.org mirrors: /pub/linux/kernel/people/davem/zerocopy-2.4.0-1.diff.gz It provides a framework for zerocopy transmits and delayed receive fragment coalescing. TUX-1.01 uses this framework. Zerocopy transmit requires some driver support, things run as they did before for drivers which do not have the support added. Currently sg+csum driver support has been added to Acenic, 3c59x, sunhme, and loopback drivers. We had eepro100 support coded at one point, but it was removed because we didn't know how to identify the cards which support hw csum assist vs. ones which could not. I would like people to test this hard and report bugs they may discover. _PLEASE_ try to see if 2.4.0 without this patch produces the same problem, and if so report it as a 2.4.0 bug, _not_ as a bug in the zerocopy patch. Thank you. In particular, I am interested in hearing about any new breakage caused by the zerocopy patches when using netfilter. When reporting bugs, please note what networking cards you are using, as whether the card actually is using hw csum assist and sg support is an important data point. Finally, regardless of networking card, there should be a measurable performance boost for NFS clients with this patch due to the delayed fragment coalescing. KNFSD does not take full advantage of this facility yet. Later, David S. Miller [EMAIL PROTECTED]
Re: 2.4.0-ac3 write() to tcp socket returning errno of -3 (ESRCH: "No such process")
Date: Sun, 7 Jan 2001 23:55:28 -0600 (CST) From: Paul Cassella <[EMAIL PROTECTED]> [1.] One line summary of the problem: write() returns -1 and sets errno non-sensically. 2.4.0{,-ac[23]} What you describe I can only say is "impossible". There are only four cases when _ANY_ part of the ipv4 networking stack can return ESRCH. These four cases are: 1) Adding a route 2) Deleting a route 3) Adding a FIB routing rule 4) Removing a FIB routing rule None of them can occur via TCP socket writes (only netlink socket operations or socket control calls). Therefore I suspect you are instead getting some form of memory corruption or similar. Really, please search the networking code for ESRCH value usage, you will see. Later, David S. Miller [EMAIL PROTECTED]
Re: [PATCH] hashed device lookup (New Benchmarks)
Date: Mon, 08 Jan 2001 01:12:21 -0700 From: Ben Greear <[EMAIL PROTECTED]> http://grok.yi.org/~greear/hashed_dev.png (If you can't get to it, let me know and I'll email it to you...some cable modem networks have I firewalled.) It just seems that this shows that the implementation of ifconfig can be improved, since "ip" can do the same thing several orders of magnitude better (ie. non-quadratic system time complexity). This is the argument I started with when this thread began, so my position hasn't changed, it has in fact been well supported by your tests :-) Later, David S. Miller [EMAIL PROTECTED]
Re: 2.4.0-ac3 write() to tcp socket returning errno of -3 (ESRCH: "No such process")
Date: Mon, 8 Jan 2001 01:16:27 -0600 (CST) From: Paul Cassella <[EMAIL PROTECTED]> Would it be more helpful if I were to check something like socki_lookup(file->f_dentry->f_inode)->ops == tcp_prot instead? No, helpful would be for you to present us with a test case program and the network device configuration you are using. Are you using netfilter? Are you using tunneling, these sorts of things. Basically, the things we would need to know to be able to duplicate your precise setup here locally in hopes of triggering the problem ourselves. Later, David S. Miller [EMAIL PROTECTED]
Re: [PLEASE-TESTME] Zerocopy networking patch, 2.4.0-1
Date: Mon, 8 Jan 2001 11:39:15 +0100 From: Christoph Hellwig <[EMAIL PROTECTED]> don't you think the writepage file operation is rather hackish? Not at all, it's simply direct sendfile support. It does not try to be any fancier than that. Later, David S. Miller [EMAIL PROTECTED]
[Linux-IrDA]Re: Delay in authentication.
Date: Mon, 08 Jan 2001 18:39:34 +0500 From: Ansari <[EMAIL PROTECTED]> I just installed Redhat 6.0. When I run "su" command it takes much time to appear passwd prompt. Its also taking much time in authentication after entering the password. This definitely seems like the classic "/etc/nsswitch.conf is told to look for YP servers and you are not using YP", so have a look and fix nsswitch.conf if this is in fact the problem. Later, David S. Miller [EMAIL PROTECTED] ___ Linux-IrDA mailing list - [EMAIL PROTECTED] http://www.pasta.cs.UiT.No/mailman/listinfo/linux-irda
Re: [PLEASE-TESTME] Zerocopy networking patch, 2.4.0-1
From: Jes Sorensen <[EMAIL PROTECTED]> Date: 08 Jan 2001 23:32:48 +0100 All I am asking is that someone lets me know if they make major changes to my code so I can keep track of what's happening. We have not made any major changes to your code, given that this is not code which is actually being submitted yet. If it bothers you that publicly someone has published changes to your driver which you disagree with, oh well... :-) This "please check things out" phase is precisely what you are asking of us, it is how we are saying "here is what we need to do with your driver, please comment". Later, David S. Miller [EMAIL PROTECTED]
Re: [PLEASE-TESTME] Zerocopy networking patch, 2.4.0-1
Date: Mon, 8 Jan 2001 16:05:23 -0200 (BRDT) From: Rik van Riel <[EMAIL PROTECTED]> I really think the zerocopy network stuff should be ported to kiobuf proper. That is how it could be done in 2.5.x, sure. But this patch is intended for 2.4.x so "minimum impact" applies. Later, David S. Miller [EMAIL PROTECTED]
Re: [PLEASE-TESTME] Zerocopy networking patch, 2.4.0-1
From: Jes Sorensen <[EMAIL PROTECTED]> Date: 08 Jan 2001 22:56:48 +0100 I don't think it's too much to ask that one actually tries to communicate with an author of a piece of code before making such major changes and submitting them opting for inclusion in the kernel. Jes, I have not submitted this for inclusion into the kernel. This is the "everyone, including driver authors, take a look" part of the development process. We _had_ to change some drivers to show how to support this new SKB api for transmit sg+csum support. If you can think of a way for us to effectively do this work without changing at least a few drivers as examples (and proof of concept), please let us know. In the process we hit real bugs in your driver, and tried to deal with them as best we could so that we could continue testing and debugging our own code. As a side note, as much as you may hate some of Alexey's changes to your driver, several things he does fix long standing real bugs in the Acenic driver that you've been papering over with workarounds for quite some time. I would even go so far as to say that in many regards Alexey understands the Acenic much better than you, and you would be wise to work with Alexey and not against him. Thanks. Later, David S. Miller [EMAIL PROTECTED]
Re: Delay in authentication.
Date: Mon, 8 Jan 2001 22:01:26 +0000 (GMT) From: Alan Cox <[EMAIL PROTECTED]> > Solaris and other systems act identically. And have identical bad problems with auth failures. Actually, I believe their sunrpc library uses an extended error facility via the streams APIs that works similarly to what is available under Linux to solve this problem. Later, David S. Miller [EMAIL PROTECTED]
Re: Delay in authentication.
Date: Mon, 08 Jan 2001 15:24:55 -0600 From: "M.H.VanLeeuwen" <[EMAIL PROTECTED]> Was this behavior intentionally changed and why? Looks like 2.2.X gives ECONNREFUSED, but 2.4.X doesn't and times out. It was intentionally changed because there is no way for the "ICMP port unreachable" message coming back to be uniquely matched to that UDP socket. It can reset sockets illegally in high load scenarios. Solaris and other systems act identically. Later, David S. Miller [EMAIL PROTECTED]
Re: [PLEASE-TESTME] Zerocopy networking patch, 2.4.0-1
Date: Mon, 8 Jan 2001 17:43:56 -0500 From: Stephen Frost <[EMAIL PROTECTED]> Perhaps you missed it, but I believe Dave's intent is for this to only be a proof-of-concept idea at this time. Thank you Stephen, this is the point Jes continues to miss. Later, David S. Miller [EMAIL PROTECTED]
Re: [PLEASE-TESTME] Zerocopy networking patch, 2.4.0-1
Date: Tue, 9 Jan 2001 11:31:45 +0100 From: Christoph Hellwig <[EMAIL PROTECTED]> Yuck. A new file_op just to get a few benchmarks right ... I hope the writepages stuff will not be merged in Linus' tree (but I wish the code behind it!) It's a "I know how to send a page somewhere via this filedescriptor all by myself" operation. I don't see why people need to take painkillers over this for 2.4.x. I think f_op->write is stupid, such a special case file operation just to get a few benchmarks right. This is the kind of argument I am hearing. Orthogonal to f_op->write being for specifying a low-level implementation of sys_write, f_op->writepage is for specifying a low-level implementation of sys_sendfile. Can you grok that? Linus has already seen this. Originally he had a gripe because an older revision of the code used to allow multiple pages to be passed in an array to the writepage(s) operation. He didn't like that, so I made it take only one page as he requested. He had no other major objections to the infrastructure. Later, David S. Miller [EMAIL PROTECTED]
Re: [PLEASE-TESTME] Zerocopy networking patch, 2.4.0-1
Date: Tue, 9 Jan 2001 12:28:10 +0100 From: Christoph Hellwig <[EMAIL PROTECTED]> Sure. But sendfile is not one of the fundamental UNIX operations... It's a fundamental Linux interface and VFS-->networking interface. An alloc_kiovec before and a free_kiovec after the actual call and the memory overhead of a kiobuf won't hurt so much that it stands against a clean interface, IMHO. This whole exercise is pointless unless it performs well. The overhead _DOES_ matter, we've tested and profiled all of this with full specweb99 runs, zerocopy ftp server loads, etc. Removing one word of information from anything involved in these code paths makes enormous differences. Have you run such tests with your suggested kiobuf scheme? Know what I really hate? People who are talking, "almost done", and "designing" the "real solution" to a problem and have no code to show for it. Ie. a total working implementation. Often they have not one line of code to show. Then the folks who actually get off their lazy asses and make something real, which works, and in fact exceeded most of our personal performance expectations, are the ones who are getting told that what they did was crap. What was the first thing out of people's mouths? Not "nice work", but "I think writepage is ugly and an eyesore, I hope nobody seriously considers this code for inclusion." Keep designing... like Linus says, "show me the code". Later, David S. Miller [EMAIL PROTECTED]