Re: [9fans] GCC/G++: some stress testing
On 2008-Mar-1, at 08:41 , ron minnich wrote: very little f77 left in my world, maybe somebody else has some. And also in response to Pietro's comments ... I have lots of dusty but still valid F77 code I use for antenna and RF circuit design (i.e. NEC and SPICE). Yes, there are newer versions of this stuff in C, but none of the C code fixes any bugs I'm not aware of or cannot otherwise deal with, or, more importantly, that my models aren't already aware of and don't compensate for. And then there is Dungeon. Netlib's f2c + libraries are ready to go with Plan 9, given the correct compiler defines. Building a colossal cave out of old card decks does not require outsourcing to the dwarves ;-) --lyndon
Re: [9fans] GCC/G++: some stress testing
> Yes. Although I work for a company that prides itself on its cache > coherence know-how, I'm very much a believer in networked > multiprocessors, even on a chip. I like Cell better than Opteron, > for example. They are harder to program up front, however, which > causes difficulties in adoption. Flip-side, once you've overcome > your startup hurdles the networked model seems to provide more > predictable performance management. tell me about it. a certain (nameless) vendor makes a pcie ethernet chipset with its descriptor rings in system memory, not pci space. it's bizarre watching the performance vs. the number of buffers loaded into the ring between head ptr updates. slight tweaks to the algorithm can result in 35% performance differences. surprisingly, another (also nameless) vendor makes a similar chipset with rings in pci space. this chipset has very stable performance in the face of tuning of the reloading loop. this chip performs just as well as the former even though each 32-bit write to the ring buffer results in a round trip over the pcie bus to the card. - erik
Re: [9fans] plan9port build failure on Linux (debian)
We tracked this down off-list. Given these types:

	char *buf;
	uint len;

gcc-4.2 assumes that buf+len >= buf. The test for wraparound when computing len in sprint looks like:

	len = 1<<30;  /* big number, but sprint is deprecated anyway */
	/*
	 * on PowerPC, the stack is near the top of memory, so
	 * we must be sure not to overflow a 32-bit pointer.
	 */
	if(buf+len < buf)
		len = -(uintptr)buf-1;

gcc-4.2 compiles this away. Adding some uintptr casts works around the problem. This change is checked into the plan9port code and will be in tomorrow's tar file. Because David is running 32-bit code on a 64-bit kernel, the stack is near the very top of the 32-bit address space and tickles the gcc-4.2 bug. It would not surprise me if there are some exploitable buffer overflows (in standard code, not p9p) now that gcc is silently discarding checks like this one. Russ
Re: [9fans] GCC/G++: some stress testing
On Mar 3, 2008, at 1:12 AM, Philippe Anel wrote: So, does this mean the latency is only required by the I/O system of your program ? If so, maybe I'm wrong, what you need is to be able to interrupt working cores and I'm afraid libthread doesn't help here. If not and your algorithm requires (a lot of) fast IPC, maybe this is the reason why it doesn't scale well ? No, the whole simulation has to run in the low-latency space - it's a video game and its rendering engine, which together form a highly heterogeneous workload. And that heterogeneity means that there are many points of contact between various subsystems. And the (semi-) real-time constraint means that you can't just scale the problem up to cover overhead costs. I don't know what you mean by "CSP system itself takes care about memory hierarchy". Do you mean that the CSP implementation does something about it, or do you mean that the code using the CSP approach takes care of it? Both :) I agree with you about the fact programming for the memory hierarchy is way more important than optimizing CPU clocks. But I also think synchronization primitives used in CSP systems are the main reason why CSP programs do not scale well (except badly designed algorithms, of course). I meant that a different CSP implementation, based on a different synchronisation primitive (IPI), can help here. I'm more interested just now in working with lock-free algorithms; I've not made any good measurements of how badly our kernels would hit channels as the number of threads increases. Perhaps some could be mitigated through a better channel implementation. IPI isn't free either - apart from the OS switch, it generates bus traffic that competes with the cache coherence protocols and memory traffic; in a well designed compute kernel that saturates both compute and bandwidth the latency hiccups so introduced can propagate really badly. This is very interesting. For sure IPI is not free. 
But I thought the bus traffic generated by IPI was less important than cache coherence protocols such as MESI, mainly because it is a one way message. It depends immensely on the hardware implementation of your IPI. If you wind up having to pay for MESI as well, then the advantage becomes less. I think now IPI are sent through the system bus (local APIC used to talk through a separate bus), so I agree with you about the fact it can saturate the bandwidth. But I wonder if locking primitives are not worse. It would be interesting to test this. Agreed! Paul
Re: [9fans] GCC/G++: some stress testing
On Mar 3, 2008, at 4:49 PM, erik quanstrom wrote: really? to out-predict the cache hardware, you have to have pretty complete knowledge of everything running on all cores and be pretty good at guessing what will want scheduling next. not to mention, you'd need to keep close tabs on which memory is cachable/wc/etc. On the flip side, ignoring the cache leads to algorithms whose working sets can't fit in cache, wasting a considerable amount of processing to cache misses. Being able to parameterize your algorithms to work comfortably in one or two ways of your cache can bring *huge* performance improvements without dropping to the traditional assembly. I'm arguing that being aware of the caches lets the system better schedule your work because you aren't preventing it from doing something smart. aren't these arguments for networked rather than shared memory multiprocessors? Yes. Although I work for a company that prides itself on its cache coherence know-how, I'm very much a believer in networked multiprocessors, even on a chip. I like Cell better than Opteron, for example. They are harder to program up front, however, which causes difficulties in adoption. Flip-side, once you've overcome your startup hurdles the networked model seems to provide more predictable performance management. Paul - erik
Re: [9fans] Xen and new venti
> I'm trying to install plan9 under Xen 3.2.0 with venti > but the kernel avaiable on the web is too old to support nventi. perhaps you mean that this kernel has an old venti linked in? > I had a problem of type in xendat.h fixed by replacing > uint8 with uint at line 1540 i suspect you mean uchar. (or uvlong if they're counting bytes.) > mk now gives two errors which I do not know how to fix: > ...omissis... > size 9xenpcf > v4parsecidr: undefined: memcpy in v4parsecidr replace memcpy with memmove. > _strayintrx: _ctype: not defined _ctype is used by the is* functions like isascii. - erik
Re: [9fans] GCC/G++: some stress testing
> In fact the more I think about it, the more it seems like having > a direct way of manipulating L1/L2 caches would be more of a benefit > than a curse at this point. Prefetches are nothing but a patchwork > over the fundamental need for programming memory hierarchy in an > efficient way. But, I guess, there's more to it on the hardware side > than just crafting additional opcodes. > really? to out-predict the cache hardware, you have to have pretty complete knowledge of everything running on all cores and be pretty good at guessing what will want scheduling next. not to mention, you'd need to keep close tabs on which memory is cachable/wc/etc. maybe system management mode makes too much of an impression on me. but on an intel system, there's no way to prevent the cpu/bios/ipmi from issuing an smm interrupt anytime it pleases and taking over your hardware. my conclusion is we don't have as much control over the hardware as we think we do. > > decompositions that keep the current working set in cache (at L3, L2, > > or L1 granularity, depending), while simultaneously avoiding having > > multiple processors chewing on the same data (which leads to vast > > amounts of cache synchronization bus traffic). Successful algorithms > > in this space work on small bundles of data that either get flushed > > back to memory uncached (to keep more cache for streaming in), or in > > small bundles that can be passed from compute kernel to compute > > kernel cheaply. Having language structures to help with these > > decompositions and caching decisions is a great help - that's one of > > the reasons why functional programming keeps rearing its head in this > > space. Without aliasing and global (serializing) state it's much > > easier to analyze the program and choose how to break up the > > computation into kernels that can be streamed, pipelined, or > > otherwise separated to allow better cache utilization and parallelism. 
aren't these arguments for networked rather than shared memory multiprocessors? - erik
Re: [9fans] awk, not utf aware...
> > On the LINUX machines running utf-8 the ä is coded as $C3A4 which is > > in utf-8 equal to the character E4. The ä occupies in that way 2 bytes. > > > > I was very astonished, when I copied a mac-filename, pasted into a > > texteditor and looked at the file: > > > > In the mac-filename the letter ä is coded as: $61CC88, which in utf-8 > > means the letter "a" followed by a $0308. (Combining diacritical marks) > > So the Mac combines the letter a with the two points above it instead > > of using the E4 letter > > Now the things are clear: The filenames are different, in spite of > > looking equal. > > So, if folding codepoints is a reasonable tactic, how many > representations do you need to fold? How many binary representations > are needed to fold íïìîi -> i? i didn't make my point very well. in this case i was suggesting a -f flag for grep that would map codepoints into their base codepoint. the match result would be the original text --- in the manner of the -i flag. separately, however ... utf combining characters are a really unfortunate choice, imho. there is no limit to the number of combining codepoints one can add to a base codepoint. you can, for example, build a single letter like this U+0061 U+0302 ... U+0302 i don't think it's possible to build legible glyphs from bitmaps using combining diacriticals. therefore, i would argue for reducing letters made up of base+combiners to a precombined codepoint whenever possible. it would be helpful if tcs did this. unfortunately some transliterations of russian into the roman alphabet use characters with no precombined form in unicode. rob probably has a more informed opinion on this than i. - erik
Re: [9fans] awk, not utf aware...
On Thu, Feb 28, 2008 at 6:10 AM, erik quanstrom <[EMAIL PROTECTED]> wrote: > perhaps it would be more effective to break down the concept > a bit. instead of a general locale hammer, why not expose some > operations that could go into a locale? for example, have a base- > character folding switch that allows regexps to fold codepoints into > base codepoints so that íïìîi -> i. this information is in the unicode > tables. perhaps the language-dependent character mapping should > be specified explicitly. &c. Loosely-related tangent: http://www.mail-archive.com/[EMAIL PROTECTED]/msg20395.html > On the LINUX machines running utf-8 the ä is coded as $C3A4 which is > in utf-8 equal to the character E4. The ä occupies in that way 2 bytes. > > I was very astonished, when I copied a mac-filename, pasted into a > texteditor and looked at the file: > > In the mac-filename the letter ä is coded as: $61CC88, which in utf-8 > means the letter "a" followed by a $0308. (Combining diacritical marks) > So the Mac combines the letter a with the two points above it instead > of using the E4 letter > Now the things are clear: The filenames are different, in spite of > looking equal. So, if folding codepoints is a reasonable tactic, how many representations do you need to fold? How many binary representations are needed to fold íïìîi -> i? -Jack
[9fans] Xen and new venti
Hi, I'm trying to install plan9 under Xen 3.2.0 with venti but the kernel available on the web is too old to support nventi. I decided to try to compile it but there were some troubles I was able to fix (with the help of #plan9). Now there's a problem (or two) I do not know how to solve; I must inform you that I'm not an expert in either plan9 or C. This is what I did:

	9fs sources
	cpr /n/sources/xen/xen3/9 /sys/src/9
	cd /sys/src/9/xen3

then I got /usr/local/xen from my xen installation and put it under xen-public. I edited xenpcf to remove il support and ran mk. I had a problem of type in xendat.h, fixed by replacing uint8 with uint at line 1540. I ran mk again and had some other problems which I fixed by:

1) adding void mfence(void); into fns.h
2) editing line 286 in sdxen.c to look like
	xenbio(SDunit* unit, int lun, int write, void* data, long nb, uvlong bno)
3) adding /$objtype/lib/libip.a\ to LIB in mkfile
4) adding in l.s
	TEXT mfence(SB), $0
	BYTE $0x0f
	BYTE $0xae
	BYTE $0xf0
	RET

mk now gives two errors which I do not know how to fix:

	...omissis...
	size 9xenpcf
	v4parsecidr: undefined: memcpy in v4parsecidr
	_strayintrx: _ctype: not defined
	mk: 8c -FVw '-DKERNDATE='`{date ... : exit status=rc 7851: 8l 7855: error

any idea? Thank you S.
Re: [9fans] migrating from oventi to nventi
don't know, but these are 2 physically different machines... i'm not putting nventi on oventi's partitions... this is all from scratch... before every try i reformatted all arenas, isects and bloom-filters of nventi. but i'll try the rd/wrarena thing again and this time rebuild the whole index first. thanks :-) cinap --- Begin Message --- On Mon, Mar 3, 2008 at 3:26 PM, <[EMAIL PROTECTED]> wrote: > first tried to extract arenas with oventis venti/rdarena and then > pumping them into nventi with nventis venti/wrarena. > when tried to format fossil with the last root-score, it failed to > find the block. > I have found the need to rebuild the index a couple of times with nventi since i switched from old to new. The first was just after switching in-place from old to new. The second was a month later when I started to see block-rot again, although most likely caused by something I did. --- End Message ---
Re: [9fans] migrating from oventi to nventi
On Mon, Mar 3, 2008 at 3:26 PM, <[EMAIL PROTECTED]> wrote: > first tried to extract arenas with oventis venti/rdarena and then > pumping them into nventi with nventis venti/wrarena. > when tried to format fossil with the last root-score, it failed to > find the block. > I have found the need to rebuild the index a couple of times with nventi since i switched from old to new. The first was just after switching in-place from old to new. The second was a month later when I started to see block-rot again, although most likely caused by something I did.
Re: [9fans] plan9port build failure on Linux (debian)
On Mon, Mar 03, 2008 at 02:02:04PM -0500, erik quanstrom wrote:

> > For what it's worth, I just added the following lines to yacc.c at the top of the file:
> >
> > 	#include <stdio.h>
> > 	#define sprint sprintf
> >
> > The build of plan9port just completed with no errors, the problem is somewhere in sprint(). I'll try and find time tonight to test out the plan9port build to verify it works. Let me know if I can provide any other useful information. I might try tracking down the bug later this week, but not certain I'll have much time to do so.
>
> it is very likely that you have broken yacc in a different way by doing this. stdio formats are not compatible with plan 9 print formats. for example, u is a flag when used with sprint but a verb when used with printf. (not to mention the fact that other programs than yacc use sprint.)

Just ran a quick test. While the applications compiled, they were non-functional as you suspected. I tried replacing the content of sprint() with vsprintf(). All applications compiled and the functionality I've tried so far (that used by the wmii window manager) seems to work. Entirely possible, though, I've just been lucky in not hitting a string parsed differently by sprint().

> have you verified that a standalone program with a similar print statement has the same problems?

I just gave it a try using the following:

	#define FILED "tab.h"
	#define stem "bc"

	int main(int argc, char** argv)
	{
		/* Lines copied from yacc.c */
		char buf[256];
		int result = sprint(buf, "%s.%s", stem, FILED);
		printf("%i: %s\n", result, buf);
		/* End code from yacc.c */
		return 0;
	}

The result was the same as in yacc: return value of 0 and 'buf' is empty. --David
[9fans] migrating from oventi to nventi
I have problems migrating oventi data to nventi on another machine. first tried to extract arenas with oventi's venti/rdarena and then pumping them into nventi with nventi's venti/wrarena. when i tried to format fossil with the last root-score, it failed to find the block. i tried oventi's venti/copy with -f mode... this time formatting fossil with the rootscore worked, but when doing du -a a venti read error appeared. the missing scores had the last digit always set to zero iirc... the same happened with nventi's venti/copy -m. (exact same read errors) then tried with -m -r (rewrite) option and this got me to where i started... fossil can't find the rootscore. the old venti/fossil system where i'm copying from runs just fine, no missing blocks or disk errors. any further ideas? cinap
Re: [9fans] [OT] interesting hardware project for gsoc
Alexander Sychev wrote: In the Russian army: Sgt.: Who is a painter here? Soldier: I'm the painter. Sgt.: Well, take this axe and draw me a stack of firewood in the morning. That one sounds like it was written by Milligan. Martin
Re: [9fans] plan9port build failure on Linux (debian)
> For what it's worth, I just added the following lines to yacc.c at the top of the file:
>
> 	#include <stdio.h>
> 	#define sprint sprintf
>
> The build of plan9port just completed with no errors, the problem is somewhere in sprint(). I'll try and find time tonight to test out the plan9port build to verify it works. Let me know if I can provide any other useful information. I might try tracking down the bug later this week, but not certain I'll have much time to do so.

it is very likely that you have broken yacc in a different way by doing this. stdio formats are not compatible with plan 9 print formats. for example, u is a flag when used with sprint but a verb when used with printf. (not to mention the fact that other programs than yacc use sprint.) have you verified that a standalone program with a similar print statement has the same problems? - erik
[9fans] o/mero and o/live
I just updated the distrib of the octopus as exported from lsub, to include o/mero and o/live (the file system for omero and the viewer). I made today a silly package to use it standalone on Plan 9 (without the rest of the octopus) only to discover that somehow mounting via #s made omero go s l o o o w. Thus, I won't bother to publish a standalone package for use on Plan 9 and will fix this problem instead. In any case, if anyone wanted to try it, the version as distributed today has been working for me and man pages are updated. Once the octopus tar has been extracted on inferno (mostly /dis/o files), running o/mero is a matter of executing:

	o/ports			# event delivery
	o/mero			# the file system proper
	mkdir /mnt/ui/s0	# create a screen
	o/x			# ~ to acme and sam language
	o/live s0		# the viewer (at screen s0)

olive(1) is an introduction and is needed to learn how to use our weird menus. The next millennium, once this thing runs fast enough, I'll drop here a line to let others know.
Re: [9fans] plan9port build failure on Linux (debian)
On Mon, Mar 03, 2008 at 11:17:47AM -0500, Russ Cox wrote:

> > I am trying to install plan9port on a Linux system (Debian), and am getting the following error:
> >
> > 	9 yacc -d -s bc bc.y
> >
> > 	fatal error:can't create , :1
> > 	mk: 9 yacc -d ... : exit status=exit(1)
> > 	mk: for i in ... : exit status=exit(1)
>
> The missing name is just because the error has happened very early and yacc hasn't opened the input file yet. If you poke around in the code you'll find that it was trying to create bc.tab.h (or should have been) but somehow this code (stem="bc", FILED = "tab.h"):
>
> 	sprint(buf, "%s.%s", stem, FILED);
> 	fdefine = Bopen(buf, OWRITE);
> 	if(fdefine == 0)
> 		error("can't create %s", buf);
>
> ended up with an empty string in buf instead of bc.tab.h.

At that point in the code, stem is set to "bc" as expected.

> > So, any ideas on how to fix the build process? The problem stems from yacc.c at line #2173 in the sprint() function. Could I replace that with the standard library sprintf() function as a stop-gap measure?
>
> It would be interesting to know if that makes it work, but more interesting would be why the Plan 9 sprint is broken. This is a pretty simple sprint call and should work.

For what it's worth, I just added the following lines to yacc.c at the top of the file:

	#include <stdio.h>
	#define sprint sprintf

The build of plan9port just completed with no errors, the problem is somewhere in sprint(). I'll try and find time tonight to test out the plan9port build to verify it works. Let me know if I can provide any other useful information. I might try tracking down the bug later this week, but not certain I'll have much time to do so.

> Can you reproduce the problem if you just run:
>
> 	cd /usr/local/plan9/src/cmd
> 	9 yacc -s bc bc.y
>
> ?

Yes, I get the exact same output. 
> I'm also interested to see the output of:
>
> 	nm /usr/local/plan9/bin/yacc | grep sprint

Here is the result:

	0804fbe0 T sprint

> @erik:
> > once that is fixed, it would be interesting to see if yacc prints a usage statement instead of printing the garbage.
>
> The command line passed in the mkfile has worked in thousands of other builds. Even if stem was nil, buf should at least end up being ".tab.h" or, at the very worst, if %s was broken, ".".

Makes sense.

> I doubt the command line is being misparsed, but I don't have any justifiable alternate theories.

My first thought was also that the command line was not being parsed, but gdb shows stem is set to "bc" as expected. It's why I suspect the problem is somewhere under sprint(). --David
Re: [9fans] plan9port build failure on Linux (debian)
On Tue, Mar 04, 2008 at 12:09:27AM +0900, sqweek wrote: > On Mon, Mar 3, 2008 at 5:46 PM, David Morris <[EMAIL PROTECTED]> wrote: > > This time the build worked, so looks like some lenny/sid > > packages don't work well together. Hmm, or another > > possibility occurs to me. I use the AMD64 kernel > > (2.6.22-3-amd64) on my desktop, but i686 on the laptop > > (2.6.22-3-i686). Any chance that could cause a problem? > > I run p9p on x86_64 at work (CentOS), so no. There are some problems > with 9pfuse under x86_64 (which look like fuse's fault to me), but the > only problem I had at build was missing dependencies (some X11 > development packages). > I'll try and remember to cvs update and see if it still builds. Good to know. I was thinking more along the lines of a problem because I'm using a 64-bit kernel with a 32-bit userspace, a setup other applications have had problems with, though in that case it was a binary distribution I simply had to recompile. --David
Re: [9fans] plan9port build failure on Linux (debian)
> I am trying to install plan9port on a Linux system (Debian), and am getting the following error:
>
> 	9 yacc -d -s bc bc.y
>
> 	fatal error:can't create , :1
> 	mk: 9 yacc -d ... : exit status=exit(1)
> 	mk: for i in ... : exit status=exit(1)

The missing name is just because the error has happened very early and yacc hasn't opened the input file yet. If you poke around in the code you'll find that it was trying to create bc.tab.h (or should have been) but somehow this code (stem="bc", FILED = "tab.h"):

	sprint(buf, "%s.%s", stem, FILED);
	fdefine = Bopen(buf, OWRITE);
	if(fdefine == 0)
		error("can't create %s", buf);

ended up with an empty string in buf instead of bc.tab.h.

> So, any ideas on how to fix the build process? The problem stems from yacc.c at line #2173 in the sprint() function. Could I replace that with the standard library sprintf() function as a stop-gap measure?

It would be interesting to know if that makes it work, but more interesting would be why the Plan 9 sprint is broken. This is a pretty simple sprint call and should work. Can you reproduce the problem if you just run:

	cd /usr/local/plan9/src/cmd
	9 yacc -s bc bc.y

? I'm also interested to see the output of:

	nm /usr/local/plan9/bin/yacc | grep sprint

@erik:
> once that is fixed, it would be interesting to see if yacc prints a usage statement instead of printing the garbage.

The command line passed in the mkfile has worked in thousands of other builds. Even if stem was nil, buf should at least end up being ".tab.h" or, at the very worst, if %s was broken, ".". I doubt the command line is being misparsed, but I don't have any justifiable alternate theories. Russ
Re: [9fans] plan9port build failure on Linux (debian)
On Mon, Mar 3, 2008 at 5:46 PM, David Morris <[EMAIL PROTECTED]> wrote: > This time the build worked, so looks like some lenny/sid > packages don't work well together. Hmm, or another > possibility occurs to me. I use the AMD64 kernel > (2.6.22-3-amd64) on my desktop, but i686 on the laptop > (2.6.22-3-i686). Any chance that could cause a problem? I run p9p on x86_64 at work (CentOS), so no. There are some problems with 9pfuse under x86_64 (which look like fuse's fault to me), but the only problem I had at build was missing dependencies (some X11 development packages). I'll try and remember to cvs update and see if it still builds. -sqweek
Re: [9fans] plan9port build failure on Linux (debian)
> install.log was no help, the message I quoted was everything relevant.
>
> I took a stab at running gdb through yacc, but the compiler optimized the code to the point finding the problem was nearly impossible. Best I can say is it's somewhere in the dofmt() function (lib9/fmt/dofmt.c) or something it calls.

i trust you ran yacc under gdb not gdb through yacc. :-) the problem is unlikely to be with the print. it likely occurred in argument parsing. one thing that should be fixed in p9p is the ARGF() calls should be replaced with EARGF(usage()) in setup(). the definition of usage should be

	void
	usage(void)
	{
		fprint(2, "usage: yacc [-Dn] [-vdS] [-o outputfile] [-s stem] grammar\n");
		exits("usage");
	}

once that is fixed, it would be interesting to see if yacc prints a usage statement instead of printing the garbage. assuming that things are still broken, i would suggest adding fprint(2, "...") statements in setup to understand where things are going wrong. - erik
Re: [9fans] [OT] interesting hardware project for gsoc
In the Russian army: Sgt.: Who is a painter here? Soldier: I'm the painter. Sgt.: Well, take this axe and draw me a stack of firewood in the morning. On Mon, 03 Mar 2008 09:32:32 +0300, Skip Tavakkolian <[EMAIL PROTECTED]> wrote: here's an idea from brucee for a gsoc project. he did a proof-of-concept earlier today. basically take something like this: http://www.rangboom.com/images/brucee/heap.jpg and transform it into this: http://www.rangboom.com/images/brucee/emptyarena.jpg with this as the byproduct: http://www.rangboom.com/images/brucee/contentaddressablestacks.jpg he is searching for the right graduate student to mentor for this fine art. -- Best regards, santucco
Re: [9fans] GCC/G++: some stress testing
Please note I'm not an expert in this domain. I am only interested in this area, and have only read a few papers. It is interesting to talk with you about these 'real world' problems. Latency is quite important in the application domain I have to target: the target is to produce a new image every 60th of a second, including all the simulation effort to get there. In addition, we have user input which needs to be processed, and usually network delays to worry about as well. Every bit of latency between user input and display breaks the illusion of control. And though TVs are getting better, it's not atypical to see 4-6 frames of latency introduced by the display subsystem, once you've finished generating a frame buffer. So, does this mean the latency is only required by the I/O system of your program ? If so, maybe I'm wrong, what you need is to be able to interrupt working cores and I'm afraid libthread doesn't help here. If not and your algorithm requires (a lot of) fast IPC, maybe this is the reason why it doesn't scale well ? I don't know what you mean by "CSP system itself takes care about memory hierarchy". Do you mean that the CSP implementation does something about it, or do you mean that the code using the CSP approach takes care of it? Both :) I agree with you about the fact programming for the memory hierarchy is way more important than optimizing CPU clocks. But I also think synchronization primitives used in CSP systems are the main reason why CSP programs do not scale well (except badly designed algorithms, of course). I meant that a different CSP implementation, based on a different synchronisation primitive (IPI), can help here. IPI isn't free either - apart from the OS switch, it generates bus traffic that competes with the cache coherence protocols and memory traffic; in a well designed compute kernel that saturates both compute and bandwidth the latency hiccups so introduced can propagate really badly. This is very interesting. 
For sure IPI is not free. But I thought the bus traffic generated by IPI was less important than cache coherence protocols such as MESI, mainly because it is a one way message. I think now IPI are sent through the system bus (local APIC used to talk through a separate bus), so I agree with you about the fact it can saturate the bandwidth. But I wonder if locking primitives are not worse. It would be interesting to test this. Phil;
Re: [9fans] GCC/G++: some stress testing
Ron, I thought Paul was talking about a cache coherent system on which a high-contention lock can become a huge problem. Although the work done by Jim Taft on the NASA project looks very interesting (and if you have pointers to papers about locking primitives on such a system, I would appreciate them), it seems this system is memory coherent, not cache coherent (coherency maintained by SGI NUMALink interconnect fabric). And I agree with you. I also think (global) shared memory for IPC is more efficient than passing copied data across the nodes, and I suppose several papers tend to confirm this is the case: today's interconnect fabrics are a lot faster than memory access. My conjecture (I only have access to a simple dual core machine) is about the locking primitive used in CSP (and IPC), I mean libthread which is based on the rendezvous system call (which does use locking primitives 9/proc.c:sysrendezvous() ). I think this is the only reason why CSP would not scale well. Regarding my (other) conjecture about IPI, please read my answer to Paul. Phil; If CSP system itself takes care about memory hierarchy and uses no synchronisation (using IPI to send message to another core by example), CSP scales very well. Is this something you have measured or is this conjecture? Of course IPI mechanism requires a switch to kernel mode which costs a lot. But this is necessary only if the destination thread is running on another core, and I don't think latency is very important in algorithms requiring a lot of cpus. same question. For a look at an interesting library that scaled well on a 1024-node SMP at NASA Ames, see the work by Jim Taft. Short form: use shared memory for IPC, not data sharing. he's done very well this way. ron
Re: [9fans] plan9port build failure on Linux (debian)
On Mon, Mar 03, 2008 at 01:23:06PM +0800, Hongzheng Wang wrote: > Hi, > > I have just cvsed and rebuilt the whole system. No error > occurs. And my system is Debian sid. So, I think the > problem you encountered might due to some missing or > mismatched packages on your debian box. Perhaps, the > install.log in /usr/local/plan9/ would be helpful to > discover what's wrong during installation. Well, one step forward, one step back. install.log was no help, the message I quoted was everything relevant. I took a stab at running gdb through yacc, but the compiler optimized the code to the point finding the problem was nearly impossible. Best I can say is it's somewhere in the dofmt() function (lib9/fmt/dofmt.c) or something it calls. So I pulled out my VERY slow laptop and spent a few hours letting it compile plan9port. This time the build worked, so looks like some lenny/sid packages don't work well together. Hmm, or another possibility occurs to me. I use the AMD64 kernel (2.6.22-3-amd64) on my desktop, but i686 on the laptop (2.6.22-3-i686). Any chance that could cause a problem? I tried copying the "pure lenny" install to the main system file structure, but that clearly does not work because wmii (the application I need plan9port for) does not run. So, any ideas on how to fix the build process? The problem stems from yacc.c at line #2173 in the sprint() function. Could I replace that with the standard library sprintf() function as a stop-gap measure? --David