Re: Opportunity for speedup

2009-03-11 Thread Bobby Powers
On Wed, Mar 11, 2009 at 4:13 PM, Daniel Drake wrote:

> 2009/3/1 Bobby Powers:
> > I can't seem to get ul-warning to come up properly, so if anyone can
> > tell me what I'm doing wrong that would be great.  I've got it to work
> > by manually placing some symlinks in /etc/rc0.d and /etc/rc6.d, but
> > neither Scott's nor my chkconfig comments seem to work.
>
> Here's a fixed ul-warning initscript.
>

thanks, the fix is pushed.
___
Devel mailing list
Devel@lists.laptop.org
http://lists.laptop.org/listinfo/devel


Re: Opportunity for speedup

2009-03-11 Thread Daniel Drake
2009/3/1 Bobby Powers:
> I can't seem to get ul-warning to come up properly, so if anyone can
> tell me what I'm doing wrong that would be great.  I've got it to work
> by manually placing some symlinks in /etc/rc0.d and /etc/rc6.d, but
> neither Scott's nor my chkconfig comments seem to work.

Here's a fixed ul-warning initscript.


[Attachment: ul-warning (binary data)]


Re: Opportunity for speedup

2009-03-03 Thread Gary C Martin
Hi Bobby,

On 1 Mar 2009, at 21:44, Bobby Powers wrote:

> I've fixed a few issues, packaged up bootanim-2.3-1, and (finally)
> actually ran some benchmarks.  Results (all times in seconds):
>
> fresh os801, from pressing the power button to appearance of sugar's
> prompt for name screen
> 80
> 79
> 78
>
> with rhgb-client renamed so that init can't find it:
> 69
> 68
>
> and with bootanim-2.(1-3) rpm installed:
> 67
> 67
> 67
> 68
> 67
>
> If anyone is unconvinced, I could run more tests, but this seems
> pretty good to me.  It's a 15% overall speedup in the boot process.

I've just run a test here with candidate 801, averaged over 5 runs,
starting on button press and stopping when the XO first appears in the
user's colours:

Before bootanim-2.3-1.i386.rpm:

85.9 seconds

After patching:

74.6 seconds

Booting in ugly text mode (includes the 3 sec ok wait):

72.2 seconds

So, if this 10 sec boot saving gets accepted in a future build, you've  
just gained the world 1,400 extra hours of XO usage from the time this  
patch lands, and for every day thereafter (assumes a conservative 500K  
kids boot their XO just once a day on average).
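
A quick check of the arithmetic behind the 1,400-hour figure, as a small C sketch. The function name is made up for illustration; the inputs (500,000 boots a day, 10 seconds saved per boot) are the assumptions stated in the paragraph above.

```c
#include <stdio.h>

/* Hours of use regained per day across a fleet, given the seconds saved
 * on each boot.  hours_saved_per_day is a hypothetical helper name. */
double hours_saved_per_day(double boots_per_day, double seconds_saved)
{
    return boots_per_day * seconds_saved / 3600.0;
}

/* Print the estimate for the numbers used in this message. */
void print_estimate(void)
{
    printf("%.0f hours/day\n", hours_saved_per_day(500000.0, 10.0));
}
```

500,000 boots times 10 seconds is 5,000,000 seconds, or about 1,389 hours, which rounds to the 1,400 quoted. Plugging in the measured difference of 11.3 seconds (85.9 minus 74.6) instead would give roughly 1,570 hours.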

Fantastic work, what an impressive butterfly effect!! :-)

--Gary


Re: Opportunity for speedup

2009-03-01 Thread James Cameron
On Sun, Mar 01, 2009 at 04:44:01PM -0500, Bobby Powers wrote:
> I can't seem to get ul-warning to come up properly, so if anyone can
> tell me what I'm doing wrong that would be great.

What actually goes wrong?  Is ul-warning executed?

-- 
James Cameron    mailto:qu...@us.netrek.org    http://quozl.netrek.org/


Re: Opportunity for speedup

2009-03-01 Thread Gary C Martin
On 1 Mar 2009, at 21:44, Bobby Powers wrote:

> I've fixed a few issues, packaged up bootanim-2.3-1, and (finally)
> actually ran some benchmarks.  Results (all times in seconds):
>
> fresh os801, from pressing the power button to appearance of sugar's
> prompt for name screen
> 80
> 79
> 78
>
> with rhgb-client renamed so that init can't find it:
> 69
> 68
>
> and with bootanim-2.(1-3) rpm installed:
> 67
> 67
> 67
> 68
> 67
>
> If anyone is unconvinced, I could run more tests, but this seems
> pretty good to me.  It's a 15% overall speedup in the boot process.

Hey Bobby, that sounds great, many thanks for putting the effort in!  
I'll try your rpm on one of the XOs here and ping back with some  
additional measurements.

Regards,
--Gary


Re: Opportunity for speedup

2009-03-01 Thread Bobby Powers
I've fixed a few issues, packaged up bootanim-2.3-1, and (finally)
actually ran some benchmarks.  Results (all times in seconds):

fresh os801, from pressing the power button to appearance of sugar's
prompt for name screen
80
79
78

with rhgb-client renamed so that init can't find it:
69
68

and with bootanim-2.(1-3) rpm installed:
67
67
67
68
67

If anyone is unconvinced, I could run more tests, but this seems
pretty good to me.  It's a 15% overall speedup in the boot process.
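
For anyone who wants to check the 15% figure, here is a minimal sketch in C that averages the stopwatch timings listed above and computes the relative saving. The helper names are hypothetical; only the timings come from this message.

```c
#include <stddef.h>

/* Arithmetic mean of n stopwatch timings, in seconds. */
double mean_seconds(const double *t, size_t n)
{
    double sum = 0.0;
    for (size_t i = 0; i < n; i++)
        sum += t[i];
    return n ? sum / (double)n : 0.0;
}

/* Percentage of boot time saved going from `before` to `after`. */
double percent_saved(double before, double after)
{
    return (before - after) / before * 100.0;
}
```

The stock runs (80, 79, 78) average 79 seconds and the bootanim runs (67, 67, 67, 68, 67) average 67.2 seconds, so percent_saved(79, 67.2) comes out just under 15%. The two no-animation runs average 68.5 seconds, which is within stopwatch error of the bootanim result.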

Interesting notes:
chkconfig doesn't like binary services - it parses services in
/etc/init.d to look for metadata in comments, and the mechanism to
override this data (sticking a file with the same name in
/etc/chkconfig.d with appropriate comments) doesn't seem to work if
the original script can't be parsed.  So I had to make small wrappers
for ul-warning, boot-anim-start and boot-anim-stop.  This doesn't seem
to affect performance.

I can't seem to get ul-warning to come up properly, so if anyone can
tell me what I'm doing wrong that would be great.  I've got it to work
by manually placing some symlinks in /etc/rc0.d and /etc/rc6.d, but
neither Scott's nor my chkconfig comments seem to work.

source:
http://dev.laptop.org/git?p=users/bobbyp/bootanim
koji-built rpms:
http://dev.laptop.org/~bobbyp/bootanim/
(koji task https://koji.fedoraproject.org/koji/taskinfo?taskID=1211738 )

I don't know if this could make it into 8.2.1, or what the process
would be toward getting it at least in the Rawhide/SOAS images, but it
seems pretty low risk (assuming someone can tell me what I'm doing
wrong w.r.t. ul-warning).

yours,
Bobby

On Thu, Feb 19, 2009 at 3:03 AM, Mitch Bradley  wrote:
> Cool!
>
> Bobby Powers wrote:
>>
>> On Wed, Feb 11, 2009 at 2:01 AM, Mitch Bradley  wrote:
>>
>>>
>>> I just measured the time taken by the boot animation by the simple
>>> technique of renaming /usr/bin/rhgb-client so the initscripts can't find
>>> it.
>>>
>>
>> how did you measure exactly? stopwatch? I'd like to recreate the
>> tests.  It sounds like you did this on a freshly flashed system?
>>
>
> Yes on both counts.  Stopwatch on freshly-flashed os7.img .
>
>
>>
>>>
>>> With boot animation, OS build 7 (an older 8.2.1 candidate) takes 60
>>> seconds from first dot (indicating OFW transfer to Linux) to Sugar
>>> "prompt for your name".   Without it, 53 seconds.  I repeated the test
>>> several times with consistent results.
>>>
>>> Clearly, it should be possible to display that amount of information in
>>> much less than 7 seconds.
>>>
>>> The boot animation code is in the OLPC domain, not the upstream domain,
>>> so replacing it should be relatively free of upstream politics.
>>>
>>> So if anybody is interested in implementing a relatively simple
>>> boot-time speedup, I offer this as low-hanging fruit.
>>>
>>> I suggest 1 second (differential time between animation and no-animation
>>> cases) as a reasonable target goal, assuming images of the complexity of
>>> the current ones.  Arbitrary full-screen graphics might require more
>>> time, but speeding up the baseline case is a good starting point.
>>>
>>> Go wild.
>>>
>>
>> So I've taken a first cut at this, implemented with the following
>> design considerations (mostly from a conversation with Mitch)
>> - the Python client/server was reimplemented as several standalone C
>> programs (boot-anim-start, boot-anim-client, and some cleanup in
>> boot-anim-stop)
>> - a client and server were used before because there is state
>> information that needs to be saved: we need to keep track of where in
>> the animation we are.  We can keep track of this by using offscreen
>> memory in the framebuffer (it's 16MB in size, and only the first 2ish
>> MB is used for the onscreen graphics (my terminology might be off
>> here)).  For state we really only need to keep track of 2 integers,
>> one for the current frame number and another to store the offset of
>> the next diff to apply.
>> - on startup we load an initial image into the framebuffer (the first
>> 1200*900*2 bytes, since we use 2 bytes per pixel for color
>> information), and then load in a series of changes to the framebuffer
>> image (<300KB).  This takes the form of a series of diffs.
>> - for each update (a valid call to boot-anim-client) we apply the next
>> diff in the series to the onscreen image and update our state
>> information
>> - after applying the last diff we have (the end of the animation
>> series), freeze the DCON (when I first attempted to freeze the DCON
>> when z-boot-anim-stop was called it left the screen in an inconsistent
>> state, I believe because of X startup)
>> - it's designed to be as light as possible, using syscalls instead of
>> libc functions as much as possible (the only thing we use libc for is
>> string comparison, which could be replaced with a local function).
>> While it's written like this, I haven't worked on cutting down the
>> linking (I need some guidance for that)
>>
>
> To reduce the execution footprint, you could tr

Re: Opportunity for speedup

2009-02-19 Thread Mitch Bradley
da...@lang.hm wrote:
>
> d) compile the delta set into the client program.

That works, but

1) It requires more work from the VM system on each invocation of the 
client program, which is now 1.x MB instead of 4K.
2) If a deployment wants to change the image set, it needs a compiler 
toolchain instead of a (small) delta-encoding program.

Speed-wise, (d) might be a wash, or perhaps even a slight win.  It 
depends on how efficient the VM system is, and the effectiveness of the 
filesystem buffer cache at preventing re-reads of the client process 
image (paging directly from JFFS2 is not possible).

The framebuffer hack avoids numerous assumptions about the effectiveness 
of clever but complex subsystems (e.g. the VM system, the filesystem 
buffer cache, the shared library mechanisms, zlib, JFFS2 compression, ...).



Re: Opportunity for speedup

2009-02-19 Thread david
On Thu, 19 Feb 2009, Mitch Bradley wrote:

> da...@lang.hm wrote:
>> 
>> right, but why read the current framebuffer? you don't touch most of it, 
>> you aren't going to do anything different based on what's there (you are 
>> just going to overlay your new info there) so all you really need to do is 
>> to write the parts that need to change.
>
> You don't read the on-screen part of the framebuffer.  You copy delta data 
> from off-screen framebuffer memory to portions of the on-screen framebuffer 
> memory.
>
> On-screen vs. off-screen is irrelevant to the speed - read access to the 
> memory that is reserved for display controller use is similarly "slow" in 
> both cases.  But considering that the delta data is small compared to the 
> full images, it's worth it to store the deltas there, thus avoiding the 
> overhead of the other alternatives for maintaining the context from one call 
> to the next.
>
> Those alternatives are:
>
> a) Server process maintains context on behalf of repeatedly-executed client 
> process.  This incurs the complexity of client-server architectures - 
> setup/teardown, library overhead, interprocess communication, scheduling.
>
> b) Client program reads new delta data from a file on each invocation.  This 
> incurs the filesystem overhead of opening a file on each invocation (in 
> comparison, the off-screen framebuffer solution requires only a single open() 
> and a single read() on the first invocation).
>
> c) Client program reads delta set into a shared memory segment and then 
> reattaches to that segment on subsequent invocations.  This is similar to the 
> framebuffer approach except that it uses faster memory for the persistent 
> storage.  It might be a win from a speed perspective, but it is a bit more 
> complex, requiring the program to deal with two memory objects instead of 
> just one.  The total amount of time that it could possibly save is about 50
> ms, since that is the time it takes to read the delta set from the off-screen
> framebuffer.  And if we use the RLE encoding suggested by Wade, the amount of
> off-screen data is halved, so the best-case savings are reduced to 25 ms
> total.

d) compile the delta set into the client program.

does this really need to be a general-purpose solution here, or is this
really only used for this specific purpose?

David Lang


Re: Opportunity for speedup

2009-02-19 Thread Mitch Bradley
da...@lang.hm wrote:
>
> right, but why read the current framebuffer? you don't touch most of 
> it, you aren't going to do anything different based on what's there 
> (you are just going to overlay your new info there) so all you really 
> need to do is to write the parts that need to change.

You don't read the on-screen part of the framebuffer.  You copy delta 
data from off-screen framebuffer memory to portions of the on-screen 
framebuffer memory.

On-screen vs. off-screen is irrelevant to the speed - read access to the 
memory that is reserved for display controller use is similarly "slow" 
in both cases.  But considering that the delta data is small compared to 
the full images, it's worth it to store the deltas there, thus avoiding 
the overhead of the other alternatives for maintaining the context from 
one call to the next.

Those alternatives are:

a) Server process maintains context on behalf of repeatedly-executed 
client process.  This incurs the complexity of client-server 
architectures - setup/teardown, library overhead, interprocess 
communication, scheduling.

b) Client program reads new delta data from a file on each invocation.  
This incurs the filesystem overhead of opening a file on each invocation 
(in comparison, the off-screen framebuffer solution requires only a 
single open() and a single read() on the first invocation).

c) Client program reads delta set into a shared memory segment and then 
reattaches to that segment on subsequent invocations.  This is similar 
to the framebuffer approach except that it uses faster memory for the 
persistent storage.  It might be a win from a speed perspective, but it 
is a bit more complex, requiring the program to deal with two memory 
objects instead of just one.  The total amount of time that it could
possibly save is about 50 ms, since that is the time it takes to read
the delta set from the off-screen framebuffer.  And if we use the RLE
encoding suggested by Wade, the amount of off-screen data is halved, so
the best-case savings are reduced to 25 ms total.
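
To make the off-screen to on-screen copy concrete, here is a minimal sketch in C. The record layout (a destination pixel index and a pixel count, followed by the pixels) is invented for illustration; the real D565 layout is not described in this thread. A plain byte buffer stands in for the mapped framebuffer.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical delta record header; not the actual D565 format. */
struct delta_hdr {
    uint32_t dst_pixel;   /* index of first on-screen pixel to overwrite */
    uint32_t npixels;     /* number of 16-bit pixels that follow */
};

/* Copy one delta record stored at byte offset `off` within `fb` into the
 * on-screen region (the first `visible_pixels` pixels of `fb`).
 * Returns the byte offset of the next record, or 0 if the record is
 * truncated or would write outside the visible region. */
size_t apply_delta(uint8_t *fb, size_t fb_bytes, size_t visible_pixels,
                   size_t off)
{
    struct delta_hdr h;

    if (off > fb_bytes || fb_bytes - off < sizeof h)
        return 0;
    memcpy(&h, fb + off, sizeof h);
    off += sizeof h;

    if (h.dst_pixel > visible_pixels ||
        h.npixels > visible_pixels - h.dst_pixel)
        return 0;

    size_t payload = (size_t)h.npixels * 2;
    if (fb_bytes - off < payload)
        return 0;

    /* Source and destination both live inside fb; memmove stays correct
     * even if a caller places records where the two ranges overlap. */
    memmove(fb + (size_t)h.dst_pixel * 2, fb + off, payload);
    return off + payload;
}
```

Each invocation of the client would call something like this once, starting from the offset saved by the previous invocation, so only a few kilobytes are read from the slow display memory per update.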




Re: Opportunity for speedup

2009-02-19 Thread david
On Thu, 19 Feb 2009, Mitch Bradley wrote:

> da...@lang.hm wrote:
>> 
>> if you have the diff of the images, do you need to read from the 
>> framebuffer at all? since you know what you put there, and know what you 
>> want to change, can't you just write your changed information to the right 
>> place?
>
> The framebuffer in this case is serving as persistent shared memory, thus 
> avoiding the extra complexity of a client/server architecture to maintain the 
> sequencing state.
>
> The extremely-tiny (4K - 1 memory page) client program initially reads the 
> first frame into the on-screen framebuf and the delta set into off-screen 
> framebuffer memory.  On subsequent invocations, the client copies another 
> delta into the on-screen framebuf.
>
> If it is statically linked and uses only direct syscalls, the exec() overhead 
> is minimal - no shell process instantiation, no script startup, no ld.so 
> invocations, no mapping in shared libraries, no relocation.

right, but why read the current framebuffer? you don't touch most of it, 
you aren't going to do anything different based on what's there (you are 
just going to overlay your new info there) so all you really need to do is 
to write the parts that need to change.

David Lang


Re: Opportunity for speedup

2009-02-19 Thread Mitch Bradley
da...@lang.hm wrote:
>
> if you have the diff of the images, do you need to read from the 
> framebuffer at all? since you know what you put there, and know what 
> you want to change, can't you just write your changed information to 
> the right place?

The framebuffer in this case is serving as persistent shared memory, 
thus avoiding the extra complexity of a client/server architecture to 
maintain the sequencing state.

The extremely-tiny (4K - 1 memory page) client program initially reads 
the first frame into the on-screen framebuf and the delta set into 
off-screen framebuffer memory.  On subsequent invocations, the client 
copies another delta into the on-screen framebuf.

If it is statically linked and uses only direct syscalls, the exec() 
overhead is minimal - no shell process instantiation, no script startup, 
no ld.so invocations, no mapping in shared libraries, no relocation.
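
A minimal sketch of the "framebuffer as persistent shared memory" idea, in C. It assumes the two integers of state mentioned elsewhere in this thread (current frame number and offset of the next diff) are kept immediately after the visible image of 1200*900*2 bytes. That placement, the struct, and the function names are assumptions for illustration, and a plain byte buffer stands in for the mapped framebuffer.

```c
#include <stdint.h>
#include <string.h>

enum { VISIBLE_BYTES = 1200 * 900 * 2 };   /* 16-bit pixels */

/* Hypothetical state block kept just past the visible image. */
struct anim_state {
    uint32_t frame;       /* number of the frame currently shown */
    uint32_t next_diff;   /* byte offset (from fb start) of the next diff */
};

/* First invocation: record that frame 0 is on screen and that the first
 * diff sits immediately after the state block. */
void anim_state_init(uint8_t *fb)
{
    struct anim_state s;
    s.frame = 0;
    s.next_diff = (uint32_t)(VISIBLE_BYTES + sizeof(struct anim_state));
    memcpy(fb + VISIBLE_BYTES, &s, sizeof s);
}

/* Later invocations: load the state, advance it past a diff of
 * `diff_bytes` bytes, store it back, and return the new frame number. */
uint32_t anim_state_advance(uint8_t *fb, uint32_t diff_bytes)
{
    struct anim_state s;
    memcpy(&s, fb + VISIBLE_BYTES, sizeof s);
    s.frame += 1;
    s.next_diff += diff_bytes;
    memcpy(fb + VISIBLE_BYTES, &s, sizeof s);
    return s.frame;
}
```

Because the state survives in display memory between runs, each short-lived client process can pick up where the last one stopped without a server process or a state file.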



Re: Opportunity for speedup

2009-02-19 Thread david
On Thu, 19 Feb 2009, Bobby Powers wrote:

> On Thu, Feb 19, 2009 at 1:22 PM, C. Scott Ananian  wrote:
>> I'd suggest just uncompressing the various image files and re-timing
>> as a start.  The initial implementation was uncompressed, but people
>> complained about space usage on the emulator images (which are
>> uncompressed).  The current code supports both uncompressed and
>> compressed image formats.  For uncompressed images, putting the bits
>> on the screen is an mmap and memcpy, so I can't imagine any
>> implementation being faster than that (it's possible, of course, that
>> what's stealing CPU is the shell's invocation of the client program;
>> recoding just that little part in C should be trivial, since it does
>> nothing but write to a socket IIRC.)
>>
>> Anyway, further benchmarking of the current implementation is probably
>> worthwhile before a complete reimplementation is called for.  But if
>> you want to reimplement it from scratch, go nuts.
>>  --scott
>
> I already re-implemented it - it was a fun optimization project and
> introduction to lower level systems programming.  Using Mitch's D565
> format to keep track of only the parts of the image that change cut
> down the implementation size significantly.  It's now only 2
> uncompressed images (frame00.565 and ul-warning.565), and <300KB of
> differences for the animation sequence.  I understand reads from video
> memory (which I think is what the framebuffer is?) can be extremely
> slow, so it could turn out faster to open a D565 file, mmap it and
> memcpy the several tens of kilobytes of differences to the framebuffer
> than it is to read those differences from one part of video memory to
> another.
>
> This is where benchmarking should give some clearer answers.

if you have the diff of the images, do you need to read from the 
framebuffer at all? since you know what you put there, and know what you 
want to change, can't you just write your changed information to the right 
place?

David Lang


Re: Opportunity for speedup

2009-02-19 Thread Mitch Bradley
Bobby Powers wrote:
> On Thu, Feb 19, 2009 at 1:56 PM, Wade Brainerd  wrote:
>   
>> RLE (run length encoding) compresses sequences of identical pixels ("runs")
>> as count/value pairs.
>> So abbbbbbbbbbccc would be stored as 1a 10b 3c.
>> The decompressor looks like:
>> while (cur < end)
>> {
>>     unsigned short count = *cur++;
>>     unsigned short value = *cur++;
>>     while (count--)
>>         *dest++ = value;
>> }
>> This can be faster than memcpy because you are reading significantly less
>> memory than you would with memcpy, thus fewer cache misses are incurred.
>> Because the startup images are mostly spans of solid colors, this kind of
>> compression works very well.  If that were not the case, say if there were a
>> left-to-right gradient in the background, RLE would probably make things
>> worse, thus you have to be careful when choosing it.
>> But the smaller size on disk and in memory would probably improve
>> performance in other ways as well.
>> Best,
>> Wade
>> 
>
> thanks, that makes sense
>   
We are already getting some portion of the possible compression by doing
the "iframe style" delta encoding of the second and subsequent frames,
but RLE is still of some use.  It does a good job of shrinking the
first frame, and it halves the size of the delta wad.

The first-frame-shrink could also be accomplished by the trick of
assuming an initial solid background and representing the first frame as
a delta from that.

In either case, it looks like RLE decoding might be a nice addition, as
it reduces the size of the frames on disk from 1.2 MB to about 140 KB.
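
The "delta from a solid background" trick can be sketched in C as a scan that records only the spans of pixels differing from the assumed background colour. The struct and function names are hypothetical, and this is only one way the first frame might be represented.

```c
#include <stddef.h>
#include <stdint.h>

/* One run of consecutive pixels that differ from the background. */
struct span {
    size_t start;   /* index of first differing pixel */
    size_t len;     /* number of consecutive differing pixels */
};

/* Scan `n` pixels and record up to `max` spans that differ from `bg`.
 * Returns the total number of spans found (which may exceed `max`). */
size_t diff_from_background(const uint16_t *px, size_t n, uint16_t bg,
                            struct span *out, size_t max)
{
    size_t found = 0;
    size_t i = 0;

    while (i < n) {
        if (px[i] == bg) {
            i++;
            continue;
        }
        size_t start = i;
        while (i < n && px[i] != bg)
            i++;
        if (found < max) {
            out[found].start = start;
            out[found].len = i - start;
        }
        found++;
    }
    return found;
}
```

For an image that is mostly background, the spans plus their pixel data are far smaller than the full frame, which is the same saving RLE gets on the first frame by a different route.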





Re: Opportunity for speedup

2009-02-19 Thread Bobby Powers
On Thu, Feb 19, 2009 at 1:56 PM, Wade Brainerd  wrote:
> RLE (run length encoding) compresses sequences of identical pixels ("runs")
> as count/value pairs.
> So abbbbbbbbbbccc would be stored as 1a 10b 3c.
> The decompressor looks like:
> while (cur < end)
> {
>     unsigned short count = *cur++;
>     unsigned short value = *cur++;
>     while (count--)
>         *dest++ = value;
> }
> This can be faster than memcpy because you are reading significantly less
> memory than you would with memcpy, thus fewer cache misses are incurred.
> Because the startup images are mostly spans of solid colors, this kind of
> compression works very well.  If that were not the case, say if there were a
> left-to-right gradient in the background, RLE would probably make things
> worse, thus you have to be careful when choosing it.
> But the smaller size on disk and in memory would probably improve
> performance in other ways as well.
> Best,
> Wade

thanks, that makes sense


Re: Opportunity for speedup

2009-02-19 Thread Wade Brainerd
Oh, and you can feed one of the 565 files through my 'rle.c' program to see
the compression ratio firsthand.



Re: Opportunity for speedup

2009-02-19 Thread Wade Brainerd
RLE (run length encoding) compresses sequences of identical pixels ("runs")
as value/count pairs.
So abbbbbbbbbbccc would be stored as 1a 10b 3c.

The decompressor looks like:

while (cur < end)
{
   unsigned short count = *cur++;
   unsigned short value = *cur++;
   while (count--)
      *dest++ = value;
}

This can be faster than memcpy because you are reading significantly less
memory than you would with memcpy, thus fewer cache misses are incurred.

Because the startup images are mostly spans of solid colors, this kind of
compression works very well.  If that were not the case, say if there were a
left-to-right gradient in the background, RLE would probably make things
worse, thus you have to be careful when choosing it.

But the smaller size on disk and in memory would probably improve
performance in other ways as well.
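
To make the trade-off concrete, here is a rough sketch of a matching encoder next to a bounds-checked variant of the decoder loop shown earlier in this message. It is not the code from rle.c or unrle.c: the names rle_encode and rle_decode are made up, and the (count, value) word layout is assumed only from that loop.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Encode n 16-bit pixels as (count, value) pairs.
 * out needs room for 2*n words (worst case: no two neighbours are equal).
 * Returns the number of 16-bit words written. */
static size_t rle_encode(const uint16_t *src, size_t n, uint16_t *out)
{
    size_t w = 0;
    size_t i = 0;

    while (i < n) {
        uint16_t value = src[i];
        uint16_t count = 0;

        /* Extend the run, capped at what a 16-bit count can hold. */
        while (i < n && src[i] == value && count < UINT16_MAX) {
            count++;
            i++;
        }
        out[w++] = count;
        out[w++] = value;
    }
    return w;
}

/* Decode `words` 16-bit words of (count, value) pairs into dest,
 * never writing more than dest_cap pixels.
 * Returns the number of pixels written. */
static size_t rle_decode(const uint16_t *cur, size_t words,
                         uint16_t *dest, size_t dest_cap)
{
    const uint16_t *end = cur + words;
    size_t written = 0;

    assert(words % 2 == 0);   /* input must be whole pairs */
    while (cur < end) {
        uint16_t count = *cur++;
        uint16_t value = *cur++;

        while (count-- && written < dest_cap)
            dest[written++] = value;
    }
    return written;
}
```

A row of one solid color encodes to a single pair, however long the row is. A row where every pixel differs from its neighbour (the gradient case) encodes to twice as many words as it has pixels, which is why the choice depends on the images.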

Best,
Wade


On Thu, Feb 19, 2009 at 1:49 PM, Bobby Powers  wrote:

> 2009/2/19 Wade Brainerd :
> > On Thu, Feb 19, 2009 at 1:22 PM, C. Scott Ananian wrote:
> >>
> >> I'd suggest just uncompressing the various image files and re-timing
> >> as a start.  The initial implementation was uncompressed, but people
> >> complained about space usage on the emulator images (which are
> >> uncompressed).  The current code supports both uncompressed and
> >> compressed image formats.  For uncompressed images, putting the bits
> >> on the screen is an mmap and memcpy, so I can't imagine any
> >> implementation being faster than that (it's possible, of course, that
> >> what's stealing CPU is the shell's invocation of the client program;
> >> recoding just that little part in C should be trivial, since it does
> >> nothing but write to a socket IIRC.)
> >
> > I implemented a RLE compressor specifically for these 16bit image files the
> > last time this question came up.  This can certainly be faster than memcpy
> > since we are talking memory performance.
>
> Can you explain this?  I don't think I have enough knowledge to
> evaluate your claim.
>
> bobby
>


Re: Opportunity for speedup

2009-02-19 Thread Mitch Bradley
Bobby Powers wrote:
> On Thu, Feb 19, 2009 at 1:22 PM, C. Scott Ananian  wrote:
>   
>> I'd suggest just uncompressing the various image files and re-timing
>> as a start.  The initial implementation was uncompressed, but people
>> complained about space usage on the emulator images (which are
>> uncompressed).  The current code supports both uncompressed and
>> compressed image formats.  For uncompressed images, putting the bits
>> on the screen is an mmap and memcpy, so I can't imagine any
>> implementation being faster than that (it's possible, of course, that
>> what's stealing CPU is the shell's invocation of the client program;
>> recoding just that little part in C should be trivial, since it does
>> nothing but write to a socket IIRC.)
>>
>> Anyway, further benchmarking of the current implementation is probably
>> worthwhile before a complete reimplementation is called for.  But if
>> you want to reimplement it from scratch, go nuts.
>>  --scott
>> 
>
> I already re-implemented it - it was a fun optimization project and
> introduction to lower level systems programming.  Using Mitch's D565
> format to keep track of only the parts of the image that change cut
> down the implementation size significantly.  It's now only 2
> uncompressed images (frame00.565 and ul-warning.565), and <300KB of
> differences for the animation sequence.  I understand reads from video
> memory (which I think is what the framebuffer is?) can be extremely
> slow, so it could turn out faster to open a D565 file, mmap it and
> memcpy the several tens of kilobytes of differences to the framebuffer
> than it is to read those differences from one part of video memory to
> another.
>   

It is easy to measure just how "slow" video memory reads are.  Let's test
256K (0x40000):

ok screen-ih iselect
ok t(  frame-buffer-adr   frame-buffer-adr 4.0000 +  4.0000  move  )t
56,272 uS

Conversely, for memory to frame buffer:

ok t(  load-base   frame-buffer-adr 4.0000 +  4.0000  move  )t
05,407 uS

So frame buffer reads are slower.  But the total amount of time that we 
have "wasted" is 50 milliseconds over the whole procedure.  I suspect 
that it  will be difficult to come up with a way to save those 50 mS 
that doesn't cost a similar amount of time in setup.
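
The arithmetic behind that 50 millisecond figure, as a small sketch. The constants are the two timings quoted above and the 256K copy size; the function names are made up for illustration.

```c
#include <assert.h>

/* Figures taken from the measurements quoted above. */
enum {
    COPY_BYTES   = 0x40000, /* 256K */
    FB_TO_FB_US  = 56272,   /* frame buffer -> frame buffer */
    MEM_TO_FB_US = 5407     /* main memory  -> frame buffer */
};

/* Extra microseconds spent because the copy reads from video memory. */
static long fb_read_penalty_us(void)
{
    return (long)FB_TO_FB_US - (long)MEM_TO_FB_US;
}

/* Throughput in bytes per second for `bytes` copied in `us` microseconds. */
static double bytes_per_second(long bytes, long us)
{
    assert(us > 0);
    return (double)bytes * 1e6 / (double)us;
}
```

fb_read_penalty_us() comes to 50,865 uS, i.e. the roughly 50 mS mentioned above. Put as throughput, the two copies run at about 4.7 MB/s and about 48 MB/s respectively.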

For ongoing stuff like run-time graphics operations, it's clearly 
important to avoid "slow" operations, but in this case, we are trading 
off slow FB accesses against the complexity of maintaining persistent 
state in main memory.


> This is where benchmarking should give some clearer answers.
>
> yours,
> Bobby
>   



Re: Opportunity for speedup

2009-02-19 Thread Bobby Powers
2009/2/19 Wade Brainerd :
> On Thu, Feb 19, 2009 at 1:22 PM, C. Scott Ananian  wrote:
>>
>> I'd suggest just uncompressing the various image files and re-timing
>> as a start.  The initial implementation was uncompressed, but people
>> complained about space usage on the emulator images (which are
>> uncompressed).  The current code supports both uncompressed and
>> compressed image formats.  For uncompressed images, putting the bits
>> on the screen is an mmap and memcpy, so I can't imagine any
>> implementation being faster than that (it's possible, of course, that
>> what's stealing CPU is the shell's invocation of the client program;
>> recoding just that little part in C should be trivial, since it does
>> nothing but write to a socket IIRC.)
>
> I implemented a RLE compressor specifically for these 16bit image files the
> last time this question came up.  This can certainly be faster than memcpy
> since we are talking memory performance.

Can you explain this?  I don't think I have enough knowledge to
evaluate your claim.

bobby


Re: Opportunity for speedup

2009-02-19 Thread Bobby Powers
On Thu, Feb 19, 2009 at 1:22 PM, C. Scott Ananian  wrote:
> I'd suggest just uncompressing the various image files and re-timing
> as a start.  The initial implementation was uncompressed, but people
> complained about space usage on the emulator images (which are
> uncompressed).  The current code supports both uncompressed and
> compressed image formats.  For uncompressed images, putting the bits
> on the screen is an mmap and memcpy, so I can't imagine any
> implementation being faster than that (it's possible, of course, that
> what's stealing CPU is the shell's invocation of the client program;
> recoding just that little part in C should be trivial, since it does
> nothing but write to a socket IIRC.)
>
> Anyway, further benchmarking of the current implementation is probably
> worthwhile before a complete reimplementation is called for.  But if
> you want to reimplement it from scratch, go nuts.
>  --scott

I already re-implemented it - it was a fun optimization project and
introduction to lower level systems programming.  Using Mitch's D565
format to keep track of only the parts of the image that change cut
down the implementation size significantly.  It's now only 2
uncompressed images (frame00.565 and ul-warning.565), and <300KB of
differences for the animation sequence.  I understand reads from video
memory (which I think is what the framebuffer is?) can be extremely
slow, so it could turn out faster to open a D565 file, mmap it and
memcpy the several tens of kilobytes of differences to the framebuffer
than it is to read those differences from one part of video memory to
another.

This is where benchmarking should give some clearer answers.
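
For anyone who wants to time the mmap-and-memcpy path on its own, a minimal sketch follows. It is not the bootanim source: copy_file_via_mmap is a hypothetical helper, and the destination here is any writable buffer, where the real program would pass a pointer into the mapped framebuffer.

```c
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Map `path` read-only and copy up to `cap` bytes of it into `dst`.
 * Returns the number of bytes copied, or -1 on any error
 * (including an empty file or cap == 0). */
static long copy_file_via_mmap(const char *path, void *dst, size_t cap)
{
    struct stat sb;
    void *src;
    size_t n;
    int fd;

    assert(path != NULL && dst != NULL);
    fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;
    if (fstat(fd, &sb) < 0 || sb.st_size <= 0) {
        close(fd);
        return -1;
    }
    n = (size_t)sb.st_size < cap ? (size_t)sb.st_size : cap;

    src = mmap(NULL, n, PROT_READ, MAP_PRIVATE, fd, 0);
    close(fd);                /* the mapping stays valid after close */
    if (src == MAP_FAILED)
        return -1;

    memcpy(dst, src, n);      /* source is ordinary memory, not video memory */
    munmap(src, n);
    return (long)n;
}
```

Wrapping the call in a timer and pointing dst first at main memory and then at the framebuffer would give the two numbers to compare.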

yours,
Bobby


Re: Opportunity for speedup

2009-02-19 Thread Wade Brainerd
On Thu, Feb 19, 2009 at 1:22 PM, C. Scott Ananian  wrote:

> I'd suggest just uncompressing the various image files and re-timing
> as a start.  The initial implementation was uncompressed, but people
> complained about space usage on the emulator images (which are
> uncompressed).  The current code supports both uncompressed and
> compressed image formats.  For uncompressed images, putting the bits
> on the screen is an mmap and memcpy, so I can't imagine any
> implementation being faster than that (it's possible, of course, that
> what's stealing CPU is the shell's invocation of the client program;
> recoding just that little part in C should be trivial, since it does
> nothing but write to a socket IIRC.)


I implemented a RLE compressor specifically for these 16bit image files the
last time this question came up.  This can certainly be faster than memcpy
since we are talking memory performance.

GZip+RLE also beats plain GZip on size, again due to the contents of the
images.

http://wadeb.com/rle.c
http://wadeb.com/unrle.c

-Wade


Re: Opportunity for speedup

2009-02-19 Thread Mitch Bradley
C. Scott Ananian wrote:
> I'd suggest just uncompressing the various image files and re-timing
> as a start.  The initial implementation was uncompressed, but people
> complained about space usage on the emulator images (which are
> uncompressed).  The current code supports both uncompressed and
> compressed image formats.  For uncompressed images, putting the bits
> on the screen is an mmap and memcpy, so I can't imagine any
> implementation being faster than that (it's possible, of course, that
> what's stealing CPU is the shell's invocation of the client program;
> recoding just that little part in C should be trivial, since it does
> nothing but write to a socket IIRC.)
>
> Anyway, further benchmarking of the current implementation is probably
> worthwhile before a complete reimplementation is called for.  But if
> you want to reimplement it from scratch, go nuts.
>  --scott
>
>   
It has already been reimplemented.

The "disk" I/O time for 26 full-screen images is several seconds.



Re: Opportunity for speedup

2009-02-19 Thread C. Scott Ananian
I'd suggest just uncompressing the various image files and re-timing
as a start.  The initial implementation was uncompressed, but people
complained about space usage on the emulator images (which are
uncompressed).  The current code supports both uncompressed and
compressed image formats.  For uncompressed images, putting the bits
on the screen is an mmap and memcpy, so I can't imagine any
implementation being faster than that (it's possible, of course, that
what's stealing CPU is the shell's invocation of the client program;
recoding just that little part in C should be trivial, since it does
nothing but write to a socket IIRC.)

Anyway, further benchmarking of the current implementation is probably
worthwhile before a complete reimplementation is called for.  But if
you want to reimplement it from scratch, go nuts.
 --scott

-- 
 ( http://cscott.net/ )


Re: Opportunity for speedup

2009-02-19 Thread Peter Robinson
>> I just measured the time taken by the boot animation by the simple
>> technique of renaming /usr/bin/rhgb-client so the initscripts can't find it.
>
> how did you measure exactly? stopwatch? I'd like to recreate the
> tests.  It sounds like you did this on a freshly flashed system?

There were a number of tools used by some of the Fedora devs for measuring
boot speed when developing plymouth to replace the old RHGB system. It
would be interesting to try plymouth in this (both text and graphical) to
see what the comparison is like. It might be possible to get a lot of
the wins that Fedora got with very little work, as plymouth has a full
plugin system so it shouldn't be hard to add the OLPC boot logos in.

Peter


Re: Opportunity for speedup

2009-02-19 Thread pgf
mitch wrote:
 > Bobby Powers wrote:
 > > - it's designed to be as light as possible, using syscalls instead of
 > > libc functions as much as possible (the only thing we use libc for is
 > > string comparison, which could be replaced with a local function).
 > > while it's written like this, I haven't worked on cutting down the
 > > linking (I need some guidance for that)

great stuff bobby -- i'm happy to help with any remaining details if
you like.

 > >   
 > 
 > To reduce the execution footprint, you could try linking it against 
 > dietlibc, http://www.fefe.de/dietlibc/
 > 
 > I'm not sure just how much time that would save; maybe it wouldn't be 
 > significant.  But it's worth a try.

my gut says that using the already-present glibc shared lib will be cheaper
than introducing a new library, even if it's small and static.  but
you're right it's worth a try.

 > > and source is avail at
 > > http://dev.laptop.org/git?p=users/bobbyp/bootanim

i took a very brief look.  as a favor to future maintainers,
i think you could either a) merge boot-anim-start/client/stop and
ul-warning into a single executable (much of the code is the
same) or b) extract the common parts (e.g. initial_setup(), and the
code that mmaps the framebuffer) into a boot-anim-utils.c or
something like that.

(and while i'm all for reducing dependencies, the XO has so much
else going on that i don't think linking against string libraries
or even stdio will affect things much in the greater scheme of
things.  so i'd have used fputs rather than write(2,...) for
errors.  but i understand the intent.)
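
for comparison, a sketch of the two styles side by side, plus the kind of local string compare bobby mentions.  none of this is taken from the bootanim source; str_eq, err_write and err_fputs are made-up names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <unistd.h>

/* Local stand-in for strcmp(a, b) == 0. */
static int str_eq(const char *a, const char *b)
{
    assert(a != NULL && b != NULL);
    while (*a && *a == *b) {
        a++;
        b++;
    }
    return *a == *b;
}

/* Syscall style: the length has to be computed by hand.
 * Returns the number of bytes written, or -1 on error. */
static long err_write(const char *msg)
{
    size_t len = 0;

    while (msg[len])
        len++;
    return (long)write(2, msg, len);
}

/* stdio style: shorter to write, at the cost of linking stdio.
 * Returns a non-negative value on success, EOF on error. */
static int err_fputs(const char *msg)
{
    return fputs(msg, stderr);
}
```

both error helpers end up in the same write(2) system call; the difference is only in what gets linked and how much code the caller has to carry.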

paul
=-
 paul fox, p...@laptop.org


Re: Opportunity for speedup

2009-02-19 Thread Mitch Bradley
Cool!

Bobby Powers wrote:
> On Wed, Feb 11, 2009 at 2:01 AM, Mitch Bradley  wrote:
>   
>> I just measured the time taken by the boot animation by the simple
>> technique of renaming /usr/bin/rhgb-client so the initscripts can't find it.
>> 
>
> how did you measure exactly? stopwatch? I'd like to recreate the
> tests.  It sounds like you did this on a freshly flashed system?
>   

Yes on both counts.  Stopwatch on freshly-flashed os7.img.


>   
>> With boot animation, OS build 7 (an older 8.2.1 candidate) takes 60
>> seconds from first dot (indicating OFW transfer to Linux) to Sugar
>> "prompt for your name".   Without it, 53 seconds.  I repeated the test
>> several times with consistent results.
>>
>> Clearly, it should be possible to display that amount of information in
>> much less than 7 seconds.
>>
>> The boot animation code is in the OLPC domain, not the upstream domain,
>> so replacing it should be relatively free of upstream politics.
>>
>> So if anybody is interested in implementing a relatively simple
>> boot-time speedup, I offer this as low-hanging fruit.
>>
>> I suggest 1 second (differential time between animation and no-animation
>> cases) as a reasonable target goal, assuming images of the complexity of
>> the current ones.  Arbitrary full-screen graphics might require more
>> time, but speeding up the baseline case is a good starting point.
>>
>> Go wild.
>> 
>
> So I've taken a first cut at this, implemented with the following
> design considerations (mostly from a conversation with Mitch)
> - the Python client/server was reimplemented as several standalone C
> programs (boot-anim-start, boot-anim-client, and some cleanup in
> boot-anim-stop)
> - a client and server were used before because there is state
> information that needs to be saved: we need to keep track of where in
> the animation we are.  We can keep track of this by using offscreen
> memory in the framebuffer (it's 16MB in size, and only the first 2ish
> MB is used for the onscreen graphics (my terminology might be off
> here)).  For state we really only need to keep track of 2 integers,
> one for the current frame number and another to store the offset of
> the next diff to apply.
> - on startup we load an initial image into the framebuffer (the first
> 1200*900*2 bytes, since we use 2 bytes per pixel for color
> information), and then load in a series of changes to the framebuffer
> image (<300KB).  This takes the form of a series of diffs
> - for each update (a valid call to boot-anim-client) we apply the next
> diff in the series to the onscreen image and update our state
> information
> - after applying the last diff we have (the end of the animation
> series), freeze the DCON (when I first attempted to freeze the DCON
> when z-boot-anim-stop was called it left the screen in an inconsistent
> state, I believe because of X startup)
> - it's designed to be as light as possible, using syscalls instead of
> libc functions as much as possible (the only thing we use libc for is
> string comparison, which could be replaced with a local function).
> while it's written like this, I haven't worked on cutting down the
> linking (I need some guidance for that)
>   

To reduce the execution footprint, you could try linking it against 
dietlibc, http://www.fefe.de/dietlibc/

I'm not sure just how much time that would save; maybe it wouldn't be 
significant.  But it's worth a try.


> comments and suggestions welcome :)
>
> I'd appreciate any testing as well as any code review.  (the shutdown
> image appears to be broken, FYI.  i haven't looked at that in depth,
> it's probably a one-line fix.)
> rpms (built with mock) are available at
> http://dev.laptop.org/~bobbyp/bootanim/
> and source is avail at
> http://dev.laptop.org/git?p=users/bobbyp/bootanim
>
> -Bobby
>   



Re: Opportunity for speedup

2009-02-18 Thread Bobby Powers
On Wed, Feb 11, 2009 at 2:01 AM, Mitch Bradley  wrote:
> I just measured the time taken by the boot animation by the simple
> technique of renaming /usr/bin/rhgb-client so the initscripts can't find it.

how did you measure exactly? stopwatch? I'd like to recreate the
tests.  It sounds like you did this on a freshly flashed system?

> With boot animation, OS build 7 (an older 8.2.1 candidate) takes 60
> seconds from first dot (indicating OFW transfer to Linux) to Sugar
> "prompt for your name".   Without it, 53 seconds.  I repeated the test
> several times with consistent results.
>
> Clearly, it should be possible to display that amount of information in
> much less than 7 seconds.
>
> The boot animation code is in the OLPC domain, not the upstream domain,
> so replacing it should be relatively free of upstream politics.
>
> So if anybody is interested in implementing a relatively simple
> boot-time speedup, I offer this as low-hanging fruit.
>
> I suggest 1 second (differential time between animation and no-animation
> cases) as a reasonable target goal, assuming images of the complexity of
> the current ones.  Arbitrary full-screen graphics might require more
> time, but speeding up the baseline case is a good starting point.
>
> Go wild.

So I've taken a first cut at this, implemented with the following
design considerations (mostly from a conversation with Mitch)
- the Python client/server was reimplemented as several standalone C
programs (boot-anim-start, boot-anim-client, and some cleanup in
boot-anim-stop)
- a client and server were used before because there is state
information that needs to be saved: we need to keep track of where in
the animation we are.  We can keep track of this by using offscreen
memory in the framebuffer (it's 16MB in size, and only the first 2ish
MB is used for the onscreen graphics (my terminology might be off
here)).  For state we really only need to keep track of 2 integers,
one for the current frame number and another to store the offset of
the next diff to apply.
- on startup we load an initial image into the framebuffer (the first
1200*900*2 bytes, since we use 2 bytes per pixel for color
information), and then load in a series of changes to the framebuffer
image (<300KB).  This takes the form of a series of diffs
- for each update (a valid call to boot-anim-client) we apply the next
diff in the series to the onscreen image and update our state
information
- after applying the last diff we have (the end of the animation
series), freeze the DCON (when I first attempted to freeze the DCON
when z-boot-anim-stop was called it left the screen in an inconsistent
state, I believe because of X startup)
- it's designed to be as light as possible, using syscalls instead of
libc functions as much as possible (the only thing we use libc for is
string comparison, which could be replaced with a local function).
while it's written like this, I haven't worked on cutting down the
linking (I need some guidance for that)
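
To illustrate the state handling and diff application described in the list above, here is a rough sketch. It is not the bootanim source and not the actual D565 layout: struct diff_hdr and apply_next_diff are hypothetical, one diff record per frame is assumed, and the state lives in an ordinary struct here where the real programs keep it in offscreen framebuffer memory.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

enum {
    SCREEN_W = 1200,
    SCREEN_H = 900,
    BYTES_PER_PIXEL = 2,
    ONSCREEN_BYTES = SCREEN_W * SCREEN_H * BYTES_PER_PIXEL /* 2,160,000 */
};

/* The two integers of persistent state. */
struct anim_state {
    uint32_t frame;      /* current frame number */
    uint32_t next_diff;  /* byte offset of the next diff to apply */
};

/* Hypothetical diff header: `count` pixels follow it, to be written
 * starting `offset` pixels into the screen. */
struct diff_hdr {
    uint32_t offset;
    uint32_t count;
};

/* Apply one diff, as a single boot-anim-client call would.
 * Returns 1 if a diff was applied, 0 if the series is exhausted
 * or the record does not fit. */
static int apply_next_diff(uint16_t *screen, size_t screen_pixels,
                           const unsigned char *diffs, size_t diffs_len,
                           struct anim_state *st)
{
    struct diff_hdr h;
    size_t pos = st->next_diff;
    size_t payload;

    assert(screen != NULL && diffs != NULL);
    if (pos > diffs_len || diffs_len - pos < sizeof h)
        return 0;
    memcpy(&h, diffs + pos, sizeof h);   /* avoids unaligned reads */
    pos += sizeof h;

    if (h.offset > screen_pixels || h.count > screen_pixels - h.offset)
        return 0;
    payload = (size_t)h.count * sizeof(uint16_t);
    if (diffs_len - pos < payload)
        return 0;

    memcpy(screen + h.offset, diffs + pos, payload);
    st->frame++;
    st->next_diff = (uint32_t)(pos + payload);
    return 1;
}
```

The onscreen image at 1200*900*2 bytes is 2,160,000 bytes, which is the "2ish MB" of the 16MB framebuffer, so there is plenty of offscreen room for two integers and <300KB of diffs.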

comments and suggestions welcome :)

I'd appreciate any testing as well as any code review.  (the shutdown
image appears to be broken, FYI.  i haven't looked at that in depth,
it's probably a one-line fix.)
rpms (built with mock) are available at
http://dev.laptop.org/~bobbyp/bootanim/
and source is avail at
http://dev.laptop.org/git?p=users/bobbyp/bootanim

-Bobby


Opportunity for speedup

2009-02-10 Thread Mitch Bradley
I just measured the time taken by the boot animation by the simple 
technique of renaming /usr/bin/rhgb-client so the initscripts can't find it.

With boot animation, OS build 7 (an older 8.2.1 candidate) takes 60 
seconds from first dot (indicating OFW transfer to Linux) to Sugar 
"prompt for your name".   Without it, 53 seconds.  I repeated the test 
several times with consistent results.

Clearly, it should be possible to display that amount of information in 
much less than 7 seconds.

The boot animation code is in the OLPC domain, not the upstream domain, 
so replacing it should be relatively free of upstream politics.

So if anybody is interested in implementing a relatively simple 
boot-time speedup, I offer this as low-hanging fruit.

I suggest 1 second (differential time between animation and no-animation 
cases) as a reasonable target goal, assuming images of the complexity of 
the current ones.  Arbitrary full-screen graphics might require more 
time, but speeding up the baseline case is a good starting point.

Go wild.
