Sorry to kick this rotting horse but I just got back
You've got to feed in 2 hours of source material - 820Gb per stream, how?
I suppose some sort of parallel bus of wires or optic fibres.
we call that hand waving
If I have
massively parallel processing I would want massively
I wrote:
I calculated roughly that encoding a 2-hour video could be parallelized by a
factor of perhaps 2 billion, using pipelining and divide-and-conquer
On Tue, Oct 20, 2009 at 03:16:22AM +0100, matt wrote:
I know you are using video / audio encoding as an example and there are probably
Can you give one example of a slow task that you think cannot benefit much from parallel processing?
Rebuilding a venti index is almost entirely I/O bound.
You can have as many cores as you want and they will all be sitting idle
waiting for the disks. Parallel processing helps only to the
On Wed, Oct 21, 2009 at 09:11:10AM -0700, Russ Cox wrote:
Can you give one example of a slow task that you think cannot benefit much from parallel processing?
Rebuilding a venti index is almost entirely I/O bound.
Perhaps I should have specified a processor-bound task. I don't know much
On Wed, Oct 21, 2009 at 8:43 AM, Sam Watkins s...@nipl.net wrote:
People do this stuff every day.
Have you heard of a render-farm?
Yes, and some of them are on this list, and have actually done this
sort of work, as you clearly have not. Else you would understand where
the limits on
Add into that the data rate of full 10-bit uncompressed 1920x1080/60i HD
is 932 Mbit/s, so your 1GHz clock speed might not be fast enough to play it :)
Not sure I agree, I think it's worse than that:
1920 pixels * 1080 lines * 30 frames/sec * 20 bits/sample in YUV
= 1.244 Gbps
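The raw-bitrate arithmetic above can be spelled out directly; a quick sketch using the figures quoted in the message (frame size, frame rate, and 20 bits/pixel for 10-bit YUV 4:2:2 are taken from the thread, not independently verified):

```python
# Raw (uncompressed) bitrate for the HD format discussed above.
width, height = 1920, 1080   # pixels per frame
fps = 30                     # frames per second
bits_per_pixel = 20          # 10-bit YUV 4:2:2 -> 20 bits per pixel

bits_per_second = width * height * fps * bits_per_pixel
print(bits_per_second)       # 1244160000, i.e. ~1.244 Gbps

# Total raw data for a 2-hour stream, in bytes:
seconds = 2 * 60 * 60        # 7200 seconds
total_bytes = bits_per_second * seconds // 8
print(total_bytes)           # 1119744000000, i.e. ~1.12 TB
```

By this arithmetic a 2-hour uncompressed stream is on the order of a terabyte, which is what makes the "how do you feed it" objection earlier in the thread bite.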
Also, if you want to
At the hardware level we do have message passing between a
processor and the memory controller -- this is exactly the
same as talking to a shared server and has the same issues of
scaling etc. If you have very few clients, a single shared
server is indeed a cost effective solution.
just to
The misinterpretation of Moore's Law is to blame here, of course: Moore
is a smart guy and he was talking about transistor density, but pop culture
made it sound like he was talking about speedup. For some time the two were
in lock-step. Not anymore.
I ran the numbers the other day based on
On Fri, Oct 16, 2009 at 12:18:47PM -0600, Latchesar Ionkov wrote:
How do you plan to feed data to these 31 thousand processors so they
can be fully utilized? Have you done the calculations and checked what
memory bandwidth would you need for that?
I would use a pipelining + divide-and-conquer approach, with some RAM on chip.
Units would be smaller than a 6502, more like an adder.
you mean like the Thinking Machines CM-1 and CM-2?
it's not like it hasn't been done before :)
On Mon, Oct 19, 2009 at 8:26 AM, Sam Watkins s...@nipl.net wrote:
On Fri, Oct 16, 2009 at 12:18:47PM -0600, Latchesar Ionkov wrote:
How do you plan to feed data to these 31 thousand processors so they
can be fully utilized? Have you done the calculations and checked what
memory bandwidth would
On Sat, Oct 17, 2009 at 07:45:40PM +0100, Eris Discordia wrote:
Another embarrassingly parallel problem, as Sam Watkins pointed out, arises
in digital audio processing.
The pipelining + divide-and-conquer method which I would use for parallel
systems is much like a series of production lines
On Sun, Oct 18, 2009 at 01:12:58AM +, Roman Shaposhnik wrote:
I would appreciate if the folks who were in the room correct me, but if I'm
not mistaken Ken was alluding to some FPGA work/ideas that he had done
and my interpretation of his comments was that if we *really* want to
make things
Details of the calculation: 7200 seconds * 30 fps * 12*16 (50*50 pixel chunks) *
50 elementary arithmetic/logical operations in a pipeline (unrolled).
7200*30*12*16*50 = 2,073,600,000 (about 2 billion) processing units.
This is only a very rough estimate and does not consider all the
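Multiplying out the factors stated in the calculation gives the total directly (the factors themselves are the ones from the message; whether they model the encoding task well is a separate question):

```python
# Multiply out the stated factors for the parallel-encoding estimate.
seconds = 7200        # 2 hours of source material
fps = 30              # frames per second
chunks = 12 * 16      # 50x50-pixel chunks per frame, as stated
pipeline_ops = 50     # unrolled elementary operations per pipeline

units = seconds * fps * chunks * pipeline_ops
print(units)          # 2073600000 -> about 2 billion processing units
```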
I ran the numbers the other day based on speed doubling every 2 years: a
60MHz Pentium would be running at 16GHz by now
I think it was the 1GHz that should be 35GHz
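The extrapolation being argued over is easy to reproduce; a sketch assuming a clean doubling every 2 years (the start years are my assumptions, not figures from the thread):

```python
def extrapolate(start_hz, years, doubling_period=2.0):
    """Projected clock speed if it doubled every doubling_period years."""
    return start_hz * 2 ** (years / doubling_period)

# 60 MHz Pentium (1993) projected 16 years to 2009: 8 doublings.
print(extrapolate(60e6, 16) / 1e9)   # 15.36 GHz, roughly the "16GHz" above

# A 1 GHz part (circa 2000) projected 9 years to 2009:
print(extrapolate(1e9, 9) / 1e9)     # ~22.6 GHz
```

Getting all the way to a 35GHz figure needs either a shorter doubling period or a longer span than these assumptions give.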
you motivated me to find my copy of _high speed
semiconductor devices_, s.m. sze, ed., 1990.
there might be one or two little
My point is, one can design systems to solve practical problems that use
almost arbitrarily large numbers of processing units running in parallel.
design != build
russ
erik quanstrom wrote:
you motivated me to find my copy of _high speed
semiconductor devices_, s.m. sze, ed., 1990.
which motivated me to dig out the post I made elsewhere :
Moore's law doesn't say anything about speed or power. It says
manufacturing costs will lower from technological
Eris Discordia wrote:
Moore's law doesn't say anything about speed or power.
But why'd you assume people in the wrong (w.r.t. their understanding
of Moore's law) would measure speed in gigahertz rather than MIPS or
FLOPS?
because that's what the discussion I was having was about
you motivated me to find my copy of _high speed
semiconductor devices_, s.m. sze, ed., 1990.
which motivated me to dig out the post I made elsewhere :
Moore's law doesn't say anything about speed or power. It says
manufacturing costs will lower from technological improvements such
this is quite an astounding thread. you brought
up clock speed doubling and now refute yourself.
i just noted that 48ghz is not possible with silicon
non-quantum-effect tech.
- erik
I think I've been misunderstood, I wasn't asserting the clock speed
increase in the first place, I was
I'm a tiny fish, this is the ocean. Nevertheless, I venture: there are
already Cell-based expansion cards out there for real-time
H.264/VC-1/MPEG-4 AVC encoding. Meaning, 1080p video in, H.264 stream out,
real-time.
Interesting, 1080p? you have a link?
-Steve
On Thu, Oct 15, 2009 at 12:50:48PM +0100, Richard Miller wrote:
It's easy to write good code that will take advantage of arbitrarily many
processors to run faster / smoother, if you have a proper language for the
task.
... and if you can find a way around Amdahl's law (qv).
The speedup
There is a vast range of applications that cannot
be managed in real time using existing single-core technology.
please name one.
Your apparent lack of imagination surprises me.
Surely you can see that a whole range of applications becomes possible when
using a massively parallel system,
On Thu, Oct 15, 2009 at 04:21:16PM +0100, roger peppe wrote:
BTW it seems the gates quote is false:
http://en.wikiquote.org/wiki/Bill_Gates
maybe the Ken quote is false too - hard to believe he's that out of touch
How do you plan to feed data to these 31 thousand processors so they
can be fully utilized? Have you done the calculations and checked what
memory bandwidth would you need for that?
There are reasons the Pentium 4 has the performance you mention, but these
reasons don't necessarily include the great
ron minnich wrote:
Insignificant bits of code that were not even visible suddenly dominate the time.
Reminds me of some project development teams.
Maybe Marvin Minsky was on to something.
Instantaneous building of a complex project from source.
(I'm defining instantaneous as less than 1 second for this.)
Depends on how complex. I spent two years retrofitting a commercial
parallel make (which only promises a 20x speedup, even with dedicated
hardware) into the build system of a
i missed this the first time
On Fri Oct 16 17:19:36 EDT 2009, jason.cat...@gmail.com wrote:
Instantaneous building of a complex project from source.
(I'm defining instantaneous as less than 1 second for this.)
Depends on how complex.
good story. it's hard to know when to rewrite.
gcc
maybe the Ken quote is false too - hard to believe he's that out of touch
The whole table was ganging up on Roman and his crazy idea, I believe
;). The objection mostly was to Intel dumping the complexity of
another core on the programmer after it ran out of steam in containing
parallelism
Richard Miller wrote:
It's easy to write good code that will take advantage of arbitrarily many
processors to run faster / smoother, if you have a proper language for the
task.
... and if you can find a way around Amdahl's law (qv).
http://www.cis.temple.edu/~shi/docs/amdahl/amdahl.html
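Amdahl's law is the ceiling being invoked here: the serial fraction of a program bounds the speedup no matter how many processors you throw at the rest. A minimal illustration (the 95% parallel fraction is an arbitrary example, not a figure from the thread):

```python
def amdahl_speedup(parallel_fraction, n_processors):
    """Overall speedup when only parallel_fraction of the work scales with n."""
    serial = 1.0 - parallel_fraction
    return 1.0 / (serial + parallel_fraction / n_processors)

# Even with 95% of the work perfectly parallel, the ceiling is 1/0.05 = 20x.
print(amdahl_speedup(0.95, 1024))    # ~19.6
print(amdahl_speedup(0.95, 10**9))   # ~20.0 -- a billion cores buy almost nothing more
```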
There is a vast range of applications that cannot
be managed in real time using existing single-core technology.
I'm sorry to interrupt your discussion, but what is real time?
On Thu Oct 15 06:55:24 EDT 2009, s...@nipl.net wrote:
task. With respect to Ken, Bill Gates said something along the lines of "who
would need more than 640K?"
on the other hand, there were lots of people using computers with 4mb
of memory when bill gates said this. it was quite easy to see how
On Thu Oct 15 08:01:29 EDT 2009, w...@conducive.org wrote:
Richard Miller wrote:
It's easy to write good code that will take advantage of arbitrarily many
processors to run faster / smoother, if you have a proper language for the
task.
... and if you can find a way around Amdahl's law
On Thu Oct 15 09:41:29 EDT 2009, 9f...@hamnavoe.com wrote:
in fact, i believe i used an apple ][ around
that time that had ~744k.
Are you sure that was an apple II? When I bought mine I remember
wrestling with the decision over whether to get the standard 48k of
RAM or upgrade to the full
in fact, i believe i used an apple ][ around
that time that had ~744k.
Are you sure that was an apple II? When I bought mine I remember
wrestling with the decision over whether to get the standard 48k of
RAM or upgrade to the full 64k. This was long before the IBM PC.
On Thu, Oct 15, 2009 at 6:11 AM, hiro 23h...@googlemail.com wrote:
There is a vast range of applications that cannot
be managed in real time using existing single-core technology.
I'm sorry to interrupt your discussion, but what is real time?
Real time just means fast enough to work
On Thu, Oct 15, 2009 at 6:52 AM, erik quanstrom quans...@quanstro.netwrote:
On Thu Oct 15 09:41:29 EDT 2009, 9f...@hamnavoe.com wrote:
in fact, i believe i used an apple ][ around
that time that had ~744k.
Are you sure that was an apple II? When I bought mine I remember
wrestling
it sounds like the kernel (L4-like, supposedly tuned to the specific
hardware) and the monitor (userland, portable) are shared, from
the paper.
I'm confused what you mean by shared.
ugh, I completely botched that. I meant replicated, not shared.
-eric
Tim Newsham
I think this is an interesting approach.
There are several interesting ideas being pursued here. The focus of
the discussion has been on the multikernel approach, which I think has
merit.
Something that has not been discussed here is the wide use of DSLs for
systems programming, and using
Rethinking multi-core systems as distributed heterogeneous
systems. Thoughts?
http://www.sigops.org/sosp/sosp09/papers/baumann-sosp09.pdf
Tim Newsham
http://www.thenewsh.com/~newsham/
On Wed, Oct 14, 2009 at 12:09 PM, Tim Newsham news...@lava.net wrote:
Rethinking multi-core systems as distributed heterogeneous
systems. Thoughts?
Somehow this feels related to the work that came out of Berkeley a year
or so ago. I'm still not convinced what the benefits of multiple
kernels are. If you are managing a couple hundred cores a single kernel
would do just fine; once the industry is ready for a couple dozen
thousand PUs --
I'm not familiar with the berkeley work.
Me either. Any chance of some references to this?
And how does one deal with heterogeneous cores and complex on-chip
interconnect topologies? Barrelfish also has a nice benefit in that
it could span coherence domains.
There's no real evidence that single kernels do well with hundreds of
real cores (as opposed to hw threads) - in fact most
http://ramp.eecs.berkeley.edu/
Tim: Andrew Baumann is aware of Plan 9 but their approach is quite a
bit different. They are consciously avoiding the networking issue as
well (they've been asked to extend their messaging model to the network
and have actively said they're not interested).
On Wed,
http://ramp.eecs.berkeley.edu/
Tim: Andrew Baumann is aware of Plan 9 but their approach is quite a
bit different. They are consciously avoiding the networking issue as
well(they've been asked to extend their messaging model to the network
and have actively said they're not interested).
Have you read the paper? I don't think you understand the difference
in scope or goals here.
On Wed, Oct 14, 2009 at 11:45 PM, erik quanstrom quans...@coraid.com wrote:
http://ramp.eecs.berkeley.edu/
Tim: Andrew Baumann is aware of Plan 9 but their approach is quite a
bit different. They are
On Oct 14, 2009, at 3:42 PM, Noah Evans wrote:
http://ramp.eecs.berkeley.edu/
Tim: Andrew Baumann is aware of Plan 9 but their approach is quite a
bit different. They are consciously avoiding the networking issue as
well(they've been asked to extend their messaging model to the network
and
Do want.
On Thu, Oct 15, 2009 at 12:10 AM, Eric Van Hensbergen eri...@gmail.com wrote:
On Oct 14, 2009, at 3:42 PM, Noah Evans wrote:
http://ramp.eecs.berkeley.edu/
Tim: Andrew Baumann is aware of Plan 9 but their approach is quite a
bit different. They are consciously avoiding the
Did you find any ideas there particularly engaging?
I'm still digesting it. My first thoughts were: if my pc is a
distributed heterogeneous computer, what lessons can it borrow from earlier
work on distributed heterogeneous computing (i.e. plan9)?
I found the discussion on cache
On Wed, Oct 14, 2009 at 2:21 PM, Tim Newsham news...@lava.net wrote:
I'm not familiar with the berkeley work.
Sorry I can't readily find the paper (the URL is somewhere on IMAP @Sun :-()
But it got presented at the Berkeley ParLab overview given to us by
Dave Patterson.
They were talking thin
And how does one deal with heterogeneous cores and complex on chip
interconnect topologies?
Good question. Do they have to be heterogeneous? My opinion is that the
future of big multicore will be more Cell-like.
There's no real evidence that single kernels do well with hundreds of real
cores