Re: [sugar] Trial-2 & pushing out bugs...

2007-07-20 Thread Jean Piché
>
> 1) bugs that you believe absolutely must get fixed for Trial-2


I am not certain why, but http://dev.laptop.org/ticket/2319 was
milestoned to B-Test-2. This is a really bad hardware bug. It's going
to physically bust machines. We want to make sure this is fixed
before anything goes out the door.


jp
TamTam


___
Devel mailing list
Devel@lists.laptop.org
http://lists.laptop.org/listinfo/devel


Re: Trial-2 & pushing out bugs...

2007-07-20 Thread Eben Eliason
> 1) bugs that you believe absolutely must get fixed for Trial-2
> 2) bugs that you believe are very important, and for which you think we
> should fix if we can, for which you believe non-invasive fixes are
> doable or in hand.
> 3) bugs we should definitely push later.
> 4) "one liner" cosmetic fixes, with high bang for the buck.  The risk
> had better be very low.

1 ---

null

2 ---

#2014: This should be a quick fix once I get Marco a screenshot. It
will make sharing more intuitive, and collaboration is important to
this trial.

#1129:  This isn't as essential as the above

3 ---

#1276:  I thought that this was mostly handled in ticket #663, but in
either case, we need full APIs and guidelines on this before it can be
implemented correctly. Related: #1441, #1815.  Jim, please help me out
on the status of these. Thanks.

#1502:  Design aside, this will require a lot of implementation, which
will happen post T2

4 ---

#2148:  If #2014 is done (listed above), this only requires one new icon


Re: Trial-2 & pushing out bugs...

2007-07-20 Thread Tomeu Vizoso
On Wed, 2007-07-18 at 16:17 -0400, Jim Gettys wrote:
> Could each of you working on Trial-2 please go over your assigned track
> bugs and reply to this mail categorizing *all* of your bugs into four
> bins, by Friday?
> 
> 1) bugs that you believe absolutely must get fixed for Trial-2
> 2) bugs that you believe are very important, and for which you think we
> should fix if we can, for which you believe non-invasive fixes are
> doable or in hand.
> 3) bugs we should definitely push later.
> 4) "one liner" cosmetic fixes, with high bang for the buck.  The risk
> had better be very low.

1)

#2057, #2309: Making progress on the datastore side. Will test in the
next build.

2)

#2285: Shouldn't be too hard.

3)

#1281: Looks like too risky a change to make now, as it messes with the
frame activation logic.
#2056: We can copy files between devices by drag and drop. Additional
mechanisms can be implemented after trial2.
#2044: We are appending Today or Yesterday to the date. If Eben thinks
it's important for the user experience to add some more complex support,
he needs to comment on this.
#2245: Not easy, better to just update the preview on activity close for
Trial2.

4)

#2167: Should be easy to just add the icon to the left of the entry.
Putting it inside the entry looks too risky to me.

Tomeu



Re: Activation and new builds.

2007-07-20 Thread C. Scott Ananian
On 7/19/07, Zephaniah E. Hull <[EMAIL PROTECTED]> wrote:
> How does that impact those of us wishing to boot from SD for development
> purposes?

Make sure you're using the jffs2 image (either directly, or unpacked
from the tree.tar file).  Copy the activation lease at
/security/activate.key from NAND to the SD if necessary, or ensure
that the USB key that had the lease written during activation is
present during first boot.  The activation ramdisk currently mounts
root from mtd0; I'll update it to mount root from the source of the
ramdisk, wherever that was.  Watch the build changelog for this
change.
 --scott

-- 
 ( http://cscott.net/ )


Re: Dramatic image size reduction

2007-07-20 Thread Dan Williams
On Fri, 2007-07-20 at 10:23 -0400, Bernardo Innocenti wrote:
> Build 511 was 290MB, build 513 dropped to 218MB, but
> I couldn't find anything in the ChangeLog to justify
> this dramatic improvement.  Does anyone have a clue?

The library was broken and was removed.  I should have noted that in the
changelog.

Dan




Re: OOM manager project

2007-07-20 Thread Marc E. Fiuczynski
Hello Jim,

We (PlanetLab) have found that OOM does some relatively bad things, causing
a system to get into an unusable state.  We replaced OOM with something 
that just panics and reboots rather than letting the system get into an 
unrecoverable state, which we need as many of our PL servers are in 
remote locations and unattended (kinda like your mini-servers will be, 
but unlike your laptops).  And we then introduced a user-level OOM 
governor, which is probably something far more rudimentary than what you 
are after.  Our governor, called pl_mom because "she cleans up your 
mess", assumes that separate applications/services are instantiated in 
separate vservers (slices).  From what I gather, this is definitely the 
direction that OLPC is going for the laptop and mini-server-gateways, so
our approach might be applicable, at least conceptually.

What does pl_mom do?  At the moment she kills the vserver with the
largest aggregate VSZ (i.e., summed over all processes within that
vserver).  This works for PlanetLab, but might not be the best approach
for OLPC.  We have found that most OOM scenarios are caused by a slow
leaker whose pages get swapped out by kswapd (this happens over a few
hours and is hard to detect with the current vm metrics we peek at).
Since pl_mom does the trick for our usage scenario on PlanetLab, for now
we have not had an incentive to improve it further.  However, one should
definitely look at better vm statistics to make a better choice than
largest aggregate VSZ.
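
To make the "largest aggregate VSZ" rule concrete, here is a minimal
Python sketch.  It is not taken from pl_mom; the function name and the
(slice, VSZ) input format are assumptions made up for illustration.

```python
from collections import defaultdict
from typing import Dict, Iterable, Optional, Tuple


def largest_aggregate_vsz(procs: Iterable[Tuple[str, int]]) -> Optional[str]:
    """Return the slice whose processes have the largest summed VSZ.

    `procs` yields (slice_name, vsz_kb) pairs, one per process.  This
    input shape is hypothetical; a real governor would gather it from
    the system.
    """
    totals: Dict[str, int] = defaultdict(int)
    for slice_name, vsz_kb in procs:
        totals[slice_name] += vsz_kb
    if not totals:
        return None  # nothing is running, so there is nothing to kill
    return max(totals, key=totals.get)
```

A slice with many medium-sized processes can outrank a slice with one
big process, which is the point of summing per vserver instead of
looking at single processes.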

The code for pl_mom is available via anon cvs from:

cvs -d :pserver:[EMAIL PROTECTED]:/cvs co pl_mom

Take a peek at swap_mom.py and its helper functions in pl_mom.py.

I'm cc'ing Faiyaz Ahmed, who is the person at Princeton who is currently 
maintaining pl_mom.

Best regards,
Marc

Jim Gettys wrote:
> OLPC needs an OOM governor, so that the "right" process gets shot when we
> run low on RAM, and so that processes that might get shot know enough to
> save state for restart.  As you know, various problems appear if the
> wrong process is killed, usually resulting in needing a restart.
> 
> Note that the kernel has to be able to recover memory when it needs it,
> or it will deadlock: this is a situation where the kernel must be in
> control, but user space could cooperate much better than it does today,
> by providing appropriate hints.  So don't say: "the kernel shouldn't
> kill processes: user space should"; that design doesn't fly.
> 
> Here's Kimmo Hämäläinen's description of the (current) kernel OOM killer.
> 
> The OOM killer selects a process to kill by assigning a score to each
> process; the process with the highest score is the lucky winner that
> will be killed. The current OOM score for a process is visible in proc.
> The entry is in /proc/PID/oom_score. The starting point of the score is
> the amount of memory consumed by the process and its children. This
> value is adjusted as follows:
> • It is set to zero if the process has no memory management or if the
> process has a negative nice value (this can be used for protecting
> processes from the killer).
> • Divided by the square root of the CPU time consumed by the process.
> • Divided by the square root of the square root of the run time of the
> process.
> • Multiplied by 2 if it is a process with a positive nice value.
> • Divided by 4 if it is a superuser process.
> • Divided by 4 if it is a process with direct hardware access.
> • Finally, the value is adjusted (shifted either left or right) by the
> oom_adj value. It is shifted left in case the value is positive and
> right in case the value is negative.
>
> This means that a negative oom_adj value will decrease the score and
> also decrease the risk that the particular process will be killed. A
> positive value will have the opposite effect. The value should be no
> smaller than -16 and no larger than 15.
> 
> Please note that you can set the oom_adj value in the proc file system.
> It is located at /proc/PID/oom_adj. For more information about how the
> OOM killer behaves, see the Linux kernel source code, mm/oom_kill.c in
> particular.
> 
> So we need an OOM killer helper.  
> 
> We have the ability to provide the kernel with much of the 
> information it needs for much better behavior, if we choose.
> 
> I see this project evolving through the following incremental
> improvements (and incremental difficulty) as set out below:
> 
> 1) start by setting the oom_adj appropriately so that the processes we
> really care about don't get shot.
> 
> 2) make this a window manager plug in (plug in, as people including us
> may end up using other window managers) that uses the stacking order on
> the screen to rank order the activities that are running.
> 
> 3) provide a mechanism by which applications may be given a hint that
> they might find it good to save enough state for a checkpoint restart,
> because they are likely a good candidate for shooting.
> 
> 4) use the XRes facilities in X (and/or m

Re: OOM manager project

2007-07-20 Thread Jim Gettys
On Fri, 2007-07-20 at 17:05 +0200, Carl-Daniel Hailfinger wrote:
> On 20.07.2007 16:37, Jim Gettys wrote:
> > Note that the kernel has to be able to recover memory when it needs it,
> > or it will deadlock: this is a situation where the kernel must be in
> > control, but user space could cooperate much better than it does today,
> > by providing appropriate hints.  So don't say: "the kernel shouldn't
> > kill processes: user space should"; that design doesn't fly.
> > [...]
> > So we need an OOM killer helper.  
> > 
> > We have the ability to provide the kernel with much of the 
> > information it needs for much better behavior, if we choose.
> > [...]
> > 5) see if there are better OOM algorithms than Linux presently has.
> 
> How does the new proposal relate to the OOM discussion here:
> http://www.redhat.com/archives/olpc-software/2006-March/msg00197.html
> 
> Have we simply given up trying to beat userspace into shape?


Of course not, at least for what we ship.

But the set of stuff people will run will include stuff that is bloated
garbage.

My real point is that there is lots of useful information in various
parts of the system, and we're currently using almost *none* of it
for choosing what to do.

Two comments: 

1) don't prejudge what the long term solution should be, including my
straw man, which is just to try to spur some real action.

2) the perfect is the enemy of the good; doing nothing, which is what
we're doing now, isn't a great position.  I suspect we can do a lot
better than we do now, without taking on the entire problem up front.
- Jim



-- 
Jim Gettys
One Laptop Per Child




Re: OOM manager project

2007-07-20 Thread Carl-Daniel Hailfinger
On 20.07.2007 16:37, Jim Gettys wrote:
> Note that the kernel has to be able to recover memory when it needs it,
> or it will deadlock: this is a situation where the kernel must be in
> control, but user space could cooperate much better than it does today,
> by providing appropriate hints.  So don't say: "the kernel shouldn't
> kill processes: user space should"; that design doesn't fly.
> [...]
> So we need an OOM killer helper.  
> 
> We have the ability to provide the kernel with much of the 
> information it needs for much better behavior, if we choose.
> [...]
> 5) see if there are better OOM algorithms than Linux presently has.

How does the new proposal relate to the OOM discussion here:
http://www.redhat.com/archives/olpc-software/2006-March/msg00197.html

Have we simply given up trying to beat userspace into shape?

Regards,
Carl-Daniel


Re: Dramatic image size reduction

2007-07-20 Thread Chris Ball
Hi,

   > Build 511 was 290MB, build 513 dropped to 218MB, but I couldn't
   > find anything in the ChangeLog to justify this dramatic
   > improvement.  Does anyone have a clue?

The difference is that the library (/home/olpc/Library/) got taken out.
(Then it got put back in, and we went up past 300M, and then it got
taken out again.)

- Chris.
-- 
Chris Ball   <[EMAIL PROTECTED]>


OOM manager project

2007-07-20 Thread Jim Gettys
OLPC needs an OOM governor, so that the "right" process gets shot when we
run low on RAM, and so that processes that might get shot know enough to
save state for restart.  As you know, various problems appear if the
wrong process is killed, usually resulting in needing a restart.

Note that the kernel has to be able to recover memory when it needs it,
or it will deadlock: this is a situation where the kernel must be in
control, but user space could cooperate much better than it does today,
by providing appropriate hints.  So don't say: "the kernel shouldn't
kill processes: user space should"; that design doesn't fly.

Here's Kimmo Hämäläinen's description of the (current) kernel OOM killer.

The OOM killer selects a process to kill by assigning a score to each
process; the process with the highest score is the lucky winner that
will be killed. The current OOM score for a process is visible in proc.
The entry is in /proc/PID/oom_score. The starting point of the score is
the amount of memory consumed by the process and its children. This
value is adjusted as follows:
• It is set to zero if the process has no memory management or if the
process has a negative nice value (this can be used for protecting
processes from the killer).
• Divided by the square root of the CPU time consumed by the process.
• Divided by the square root of the square root of the run time of the
process.
• Multiplied by 2 if it is a process with a positive nice value.
• Divided by 4 if it is a superuser process.
• Divided by 4 if it is a process with direct hardware access.
• Finally, the value is adjusted (shifted either left or right) by the
oom_adj value. It is shifted left in case the value is positive and
right in case the value is negative.

This means that a negative oom_adj value will decrease the score and
also decrease the risk that the particular process will be killed. A
positive value will have the opposite effect. The value should be no
smaller than -16 and no larger than 15.
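
The adjustments listed above can be sketched in Python as follows.
This is only an illustration of the description, not the kernel's
code: the Proc fields are hypothetical names, and clamping each
divisor to at least 1 (so a brand-new process does not divide by
zero) is my own assumption.

```python
import math
from dataclasses import dataclass


@dataclass
class Proc:
    """Hypothetical snapshot of one process; field names are illustrative."""
    mem_pages: int               # memory used by the process and its children
    cpu_seconds: float           # CPU time consumed so far
    run_seconds: float           # time since the process started
    nice: int = 0
    has_mm: bool = True          # False if it has no memory management
    superuser: bool = False
    raw_hw_access: bool = False  # direct hardware access
    oom_adj: int = 0             # expected range: -16..15


def oom_score(p: Proc) -> int:
    """Approximate the OOM score using the adjustments described above."""
    if not p.has_mm or p.nice < 0:
        return 0
    score = float(p.mem_pages)
    # Assumption: clamp divisors to >= 1 to avoid dividing by zero.
    score /= max(1.0, math.sqrt(p.cpu_seconds))
    score /= max(1.0, math.sqrt(math.sqrt(p.run_seconds)))
    if p.nice > 0:
        score *= 2
    if p.superuser:
        score /= 4
    if p.raw_hw_access:
        score /= 4
    points = int(score)
    # oom_adj shifts the result: left if positive, right if negative.
    if p.oom_adj > 0:
        points <<= p.oom_adj
    elif p.oom_adj < 0:
        points >>= -p.oom_adj
    return points
```

Because oom_adj is applied as a bit shift, each step doubles or halves
the score, so even small values swing the outcome a lot.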

Please note that you can set the oom_adj value in the proc file system.
It is located at /proc/PID/oom_adj. For more information about how the
OOM killer behaves, see the Linux kernel source code, mm/oom_kill.c in
particular.
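
As a minimal sketch of setting the value from user space: the helper
below just writes an integer to /proc/PID/oom_adj after checking it
against the -16..15 range given above.  The function names and the
proc_root parameter (there so the code can be tried against a scratch
directory) are assumptions for illustration.  Lowering the value for
another user's process normally requires root.

```python
from pathlib import Path

# Range taken from the description above.
OOM_ADJ_MIN, OOM_ADJ_MAX = -16, 15


def oom_adj_path(pid: int, proc_root: str = "/proc") -> Path:
    """Build the path to a process's oom_adj file."""
    return Path(proc_root) / str(pid) / "oom_adj"


def set_oom_adj(pid: int, value: int, proc_root: str = "/proc") -> None:
    """Write `value` to the process's oom_adj file."""
    if not OOM_ADJ_MIN <= value <= OOM_ADJ_MAX:
        raise ValueError(
            f"oom_adj must be in [{OOM_ADJ_MIN}, {OOM_ADJ_MAX}], got {value}"
        )
    oom_adj_path(pid, proc_root).write_text(f"{value}\n")
```

For example, set_oom_adj(pid, -16) would give a process we really care
about the strongest protection within that range.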

So we need an OOM killer helper.  

We have the ability to provide the kernel with much of the 
information it needs for much better behavior, if we choose.

I see this project evolving through the following incremental
improvements (and incremental difficulty) as set out below:

1) start by setting the oom_adj appropriately so that the processes we
really care about don't get shot.

2) make this a window manager plug in (plug in, as people including us
may end up using other window managers) that uses the stacking order on
the screen to rank order the activities that are running.

3) provide a mechanism by which applications may be given a hint that
they might find it good to save enough state for a checkpoint restart,
because they are likely a good candidate for shooting.

4) use the XRes facilities in X (and/or modify X) to provide the kernel
with the pixmap usage on a process ID basis, for local
applications/activities.

5) see if there are better OOM algorithms than Linux presently has.
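
To give a feel for steps 1 and 2 together, here is a rough Python
sketch that turns a window stacking order into oom_adj values, so that
the topmost activity is the least likely to be shot.  Everything in it
is an assumption for discussion: the function name, the linear spread,
and the default endpoints of -8 and 8 are invented, and getting the
stacking order out of the window manager is left out entirely.

```python
from typing import Dict, Sequence


def adj_from_stacking(
    pids_top_to_bottom: Sequence[int],
    top_adj: int = -8,      # hypothetical value for the topmost activity
    bottom_adj: int = 8,    # hypothetical value for the bottommost activity
) -> Dict[int, int]:
    """Spread oom_adj values linearly across the stacking order."""
    n = len(pids_top_to_bottom)
    if n == 0:
        return {}
    if n == 1:
        return {pids_top_to_bottom[0]: top_adj}
    span = bottom_adj - top_adj
    return {
        pid: top_adj + round(span * i / (n - 1))
        for i, pid in enumerate(pids_top_to_bottom)
    }
```

The resulting values would then be written to each process's
/proc/PID/oom_adj whenever the stacking order changes.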

Discussion?  Anyone want to take on this project, or parts of this
project?
- Jim

-- 
Jim Gettys
One Laptop Per Child




Dramatic image size reduction

2007-07-20 Thread Bernardo Innocenti
Build 511 was 290MB, build 513 dropped to 218MB, but
I couldn't find anything in the ChangeLog to justify
this dramatic improvement.  Does anyone have a clue?

-- 
   // Bernardo Innocenti
 \X/  http://www.codewiz.org/


ChangeLog

2007-07-20 Thread Bernardo Innocenti
How about automatically posting the ChangeLog files
to the devel@ mailing-list in the fashion of Fedora?

Changelog entries sometimes are the starting point for
interesting discussion, and a form of post-mortem review
process.

-- 
   // Bernardo Innocenti
 \X/  http://www.codewiz.org/