Bug#474947: the state of Bug#474947

2008-10-29 Thread Neil McGovern
clone 474947 -1
reassign -1 release-notes
retitle -1 Update information about apt MMap problem in release notes
thanks

On Tue, Oct 21, 2008 at 12:03:49PM +0200, A Mennucc wrote:
> hi bug, hi people, hi d-release
> 
> I studied bug 474947, which is grave/RC and is filed against APT.
> 
> Since I was told that the APT team is understaffed, I decided to take
> action myself.
>

Firstly, thank you for all your hard work on this problem.

> So my conclusion is that the forthcoming release notes do address the
> problem some people may encounter in upgrading from Etch to Lenny.
> 

I agree. This issue isn't RC for lenny.

> I propose the attached patch, though, since it is funny to suggest a
> value of 1250 (bytes) when the internal value in Lenny is 20MB.
> I hope someone on the d-release team can apply it.
> 

Thanks, I'll clone this and re-assign.

Neil
-- 
* stockholm bangs head against budget
 outsch
 h01ger: it is still very soft, i did not hurt myself
 stockholm: But you bled on the budget, and now it's red again!




Bug#474947: the state of Bug#474947

2008-10-27 Thread Eugene V. Lyubimkin
Elliott Mitchell wrote:
>> Yes. So, if you claim this has to be fixed before Lenny, go ahead and ask the
>> Debian release team what they think about changes in the internals of apt and
>> additional month(s) of testing.
>>
> 
> I thought that was the point of copying the messages to their list.
And, nevertheless, there has been no answer yet. I suggest you ask them directly
by mail to get a 'yes' or 'no' answer.

>> Yes, it will change the ABI and API. This will require recompiling packages
>> that rely on apt against the new apt, and would break some apt-dependent
>> tools (such as aptitude and the perl and python bindings). Another big pain
>> for other developers.
> 
> Adding a level of indirection isn't a very big change. Yes, it has
> effects all over the place, but 95% of those are pretty simple (can
> mostly be done with `sed`). The difficult part is changing the allocator,
> which I presume is the portion you did?
Rather simple, but in many places. Yes, it's exactly what I did. But this part
may contain bugs too, as I cannot test it without all the redirections
implemented.

>> My conclusion: please do not force fixing this bug before Lenny until the
>> release team agrees to change the internals of apt at this stage.
> 
> My point with the above is to keep working on it. However slight, there
> is a chance it might be possible to complete in time.
I hope apt in Squeeze will have significantly fewer bugs than in Lenny.
But apt in Lenny is better than apt in Etch. And, again, this bug is important,
but not so important that it must be fixed before the Lenny release.

> As for combining bug reports, #474947 is distinct from #380509,
> #413024, #429171, #431410 and #451526. None of those includes a
[some investigations snipped]
I assume this investigation will also be done post-Lenny. Thanks for the
triaging and attention.

-- 
Eugene V. Lyubimkin aka JackYF, Ukrainian C++ developer.





Bug#474947: the state of Bug#474947

2008-10-26 Thread Elliott Mitchell
>From: Eugene V. Lyubimkin <[EMAIL PROTECTED]>
> Elliott Mitchell wrote:
> > I have made no such claims. I am merely stating that this is a serious
> > bug. Severe enough to seriously consider delaying the release. This is
> > what the release team gets to decide, which is worse (neither option is
> > good)?
> Yes. So, if you claim this has to be fixed before Lenny, go ahead and ask the
> Debian release team what they think about changes in the internals of apt and
> additional month(s) of testing.
> 

I thought that was the point of copying the messages to their list.

> > It might be found that fixing it isn't anywhere near as bad
> > as you thought. Even though it changes the API/ABI, if no one has ever
> > touched that field, the impact on other packages will be zero.

> Yes, it will change the ABI and API. This will require recompiling packages
> that rely on apt against the new apt, and would break some apt-dependent
> tools (such as aptitude and the perl and python bindings). Another big pain
> for other developers.

Adding a level of indirection isn't a very big change. Yes, it has
effects all over the place, but 95% of those are pretty simple (can
mostly be done with `sed`). The difficult part is changing the allocator,
which I presume is the portion you did?

> > Perhaps
> > the release team will decide it is worth delaying the release, in which
> > case a head start in testing will be of great value. Perhaps some other
> > issue will force a delay of the release, in which case the extra time
> > might allow sufficient testing.

> Perhaps. And perhaps not.
> 
> My conclusion: please do not force fixing this bug before Lenny until the
> release team agrees to change the internals of apt at this stage.

My point with the above is to keep working on it. However slight, there
is a chance it might be possible to complete in time.



As for combining bug reports, #474947 is distinct from #380509,
#413024, #429171, #431410 and #451526. None of those includes a
segmentation violation. #474947 might get fixed simply because fixing
the little core piece will prevent it from being tickled or the rewrite
might just squash it; but I still think it is distinct.

I also think #429173 should be separate from that grouping. Again, this
one might never show up, if not for the MMap issue; but this issue is
that locks are left behind on error, not the MMap issue.

On the flip side, #474947 might be the same as #443564. The MMap issue
shows up, followed by a segmentation violation. Similar fixes work, but
that is due to aggravation by the MMap issue; without that bug these
might be clearly distinct.


-- 
(\___(\___(\__  --=> 8-) EHM <=--  __/)___/)___/)
 \BS (| [EMAIL PROTECTED] PGP F6B23DE0 |)   /
  \_CS\   |  _  -O #include  O-   _  |   /  _/
2477\___\_|_/DC21 03A0 5D61 985B <-PGP-> F2BE 6526 ABD2 F6B2\_|_/___/3DE0





-- 
To UNSUBSCRIBE, email to [EMAIL PROTECTED]
with a subject of "unsubscribe". Trouble? Contact [EMAIL PROTECTED]



Bug#474947: the state of Bug#474947

2008-10-26 Thread Eugene V. Lyubimkin
Dropped debian-release from CC.

Elliott Mitchell wrote:
>>  - this patch reduces apt speed (not serious though, as I see) on most
>> operations with the cache;
> 
> I guess I should ask, do you have any less relevant issues waiting in
> the wings? While more speed is good, that is worthless if it is badly
> broken.
Yes.

>>  - the fix requires a big patch (a small part of it was written by me, see
>> the #474947 thread);
>>  - this patch has to change the internals of apt;
>>  - this patch can break the apt API and ABI (not checked);
> 
>>  - this patch definitely requires thorough review and testing;
> 
> Here it is a matter of the weightings. The flipside is without this
> fixed:
> 
>  - Almost certainly a significant number of users will run into this
> issue during the lifetime of Lenny (the history is 5-10 bugs/year; plus
> an unknown and likely large number of people who do not report it since
> they see it has already been reported and therefore presume work has
> already begun on fixing it).
The release notes already have an item about upgrading APT first and setting
Cache-Limit, if I recall correctly.

>  - This complicates debugging, as it escalates otherwise harmless issues
> to major severity (see #400768; while certainly an otherwise unrelated
> bug, if the MMap issue wasn't present, this bug would never have caused
> any problems).
It seems that 400768 was caused by other limits. Though I may be wrong.

>  - It is quite likely that upon upgrading to a version of Debian after
> Lenny, APT will again break due to this issue and again have to include
> a major warning in the release notes.
Yes :(

>  - As far as actual impact of the change we still do not know. Despite
> knowing about the first problem for at least 5 years (#178623 is the
> oldest report I have found), and knowing that it was still very
> definitely an issue for a minimum of nearly 2 years (#400768); we still
> do not have anything but rough guesstimates. It might be that this is the
> time your estimate is wildly wrong, but we do not know since no patch has
> ever been tried. Why not try this in experimental? Then we would have
> real experience to judge how much work it will take to fix.
Please leave this until after Lenny. Testing in experimental is definitely not
enough.

Yes, it's bad that this bug wasn't touched for such a long time.

>> I don't think this would be acceptable to the release team.
> 
> To a point I think I can summarize your position as: It is too dangerous
> to fix this in Lenny.
Yes.
> 
> Correspondingly I can summarize my position as: It is too dangerous NOT
> to fix in Lenny.
> 
> There is no clearly right answer here. The issue is which will damage
> Debian more; delaying the release, or releasing with another serious
> issue?
Well, an old bug with clear instructions on how to fix it is better than an
undiscovered bunch of new ones which, if not discovered before Lenny, will
affect users during the whole Lenny lifetime. You also know that Debian had its
full freeze in July 2008; the freeze of core components was several months
before that.

>> Elliott, the reason for this bug is the apt architecture. Do you think we can
>> easily change the architecture of a core package at the freeze stage?
> 
> I have made no such claims. I am merely stating that this is a serious
> bug. Severe enough to seriously consider delaying the release. This is
> what the release team gets to decide, which is worse (neither option is
> good)?
Yes. So, if you claim this has to be fixed before Lenny, go ahead and ask the
Debian release team what they think about changes in the internals of apt and
additional month(s) of testing.

> Yet, since you've got an initial patch, why not put that out in
> experimental?
I have not written an initial patch. I've written a small part of the patch.

> It might be found that fixing it isn't anywhere near as bad
> as you thought. Even though it changes the API/ABI, if no one has ever
> touched that field, the impact on other packages will be zero.
Yes, it will change the ABI and API. This will require recompiling packages
that rely on apt against the new apt, and would break some apt-dependent tools
(such as aptitude and the perl and python bindings). Another big pain for other
developers.

> Perhaps
> the release team will decide it is worth delaying the release, in which
> case a head start in testing will be of great value. Perhaps some other
> issue will force a delay of the release, in which case the extra time
> might allow sufficient testing.
Perhaps. And perhaps not.

My conclusion: please do not force fixing this bug before Lenny until the
release team agrees to change the internals of apt at this stage.

-- 
Eugene V. Lyubimkin aka JackYF, Ukrainian C++ developer.





Bug#474947: the state of Bug#474947

2008-10-25 Thread Elliott Mitchell
>From: Eugene V. Lyubimkin <[EMAIL PROTECTED]>
> A Mennucc wrote:
> > IMHO one way to decide whether to accept a patch during the freeze is to
> > see how large and "important" it is. Does anybody have an example
> > patch, or a description of what code changes would be necessary?

> I had a look at this bug, so it seems I have. I don't understand why
> Elliott ignored my previous 2 mails about it. So I am repeating my humble
> view here. The reasons for not touching this bug any more before the Lenny
> release are:

I ignored nothing. This does not mean I will come to the same conclusion.

>  - this patch reduces apt speed (not serious though, as I see) on most
> operations with the cache;

I guess I should ask, do you have any less relevant issues waiting in
the wings? While more speed is good, that is worthless if it is badly
broken.


>  - the fix requires a big patch (a small part of it was written by me, see
> the #474947 thread);
>  - this patch has to change the internals of apt;
>  - this patch can break the apt API and ABI (not checked);

>  - this patch definitely requires thorough review and testing;

Here it is a matter of the weightings. The flipside is without this
fixed:

 - Almost certainly a significant number of users will run into this
issue during the lifetime of Lenny (the history is 5-10 bugs/year; plus
an unknown and likely large number of people who do not report it since
they see it has already been reported and therefore presume work has
already begun on fixing it).
 - This complicates debugging, as it escalates otherwise harmless issues
to major severity (see #400768; while certainly an otherwise unrelated
bug, if the MMap issue wasn't present, this bug would never have caused
any problems).
 - It is quite likely that upon upgrading to a version of Debian after
Lenny, APT will again break due to this issue and again have to include
a major warning in the release notes.


 - As far as actual impact of the change we still do not know. Despite
knowing about the first problem for at least 5 years (#178623 is the
oldest report I have found), and knowing that it was still very
definitely an issue for a minimum of nearly 2 years (#400768); we still
do not have anything but rough guesstimates. It might be that this is the
time your estimate is wildly wrong, but we do not know since no patch has
ever been tried. Why not try this in experimental? Then we would have
real experience to judge how much work it will take to fix.

> I don't think this would be acceptable to the release team.

To a point I think I can summarize your position as: It is too dangerous
to fix this in Lenny.

Correspondingly I can summarize my position as: It is too dangerous NOT
to fix in Lenny.

There is no clearly right answer here. The issue is which will damage
Debian more; delaying the release, or releasing with another serious
issue?

> Elliott, the reason for this bug is the apt architecture. Do you think we can
> easily change the architecture of a core package at the freeze stage?

I have made no such claims. I am merely stating that this is a serious
bug. Severe enough to seriously consider delaying the release. This is
what the release team gets to decide, which is worse (neither option is
good)?



Yet, since you've got an initial patch, why not put that out in
experimental? It might be found that fixing it isn't anywhere near as bad
as you thought. Even though it changes the API/ABI, if no one has ever
touched that field, the impact on other packages will be zero. Perhaps
the release team will decide it is worth delaying the release, in which
case a head start in testing will be of great value. Perhaps some other
issue will force a delay of the release, in which case the extra time
might allow sufficient testing.


-- 
(\___(\___(\__  --=> 8-) EHM <=--  __/)___/)___/)
 \BS (| [EMAIL PROTECTED] PGP F6B23DE0 |)   /
  \_CS\   |  _  -O #include  O-   _  |   /  _/
2477\___\_|_/DC21 03A0 5D61 985B <-PGP-> F2BE 6526 ABD2 F6B2\_|_/___/3DE0








Bug#474947: the state of Bug#474947

2008-10-23 Thread Eugene V. Lyubimkin
A Mennucc wrote:
> On Wed, Oct 22, 2008 at 10:09:58PM -0700, Elliott Mitchell wrote:
>> I must therefore suggest that at the very least, the first part of this
>> bug is too severe to allow to continue on to yet another release. Despite
>> the pain now, it is better to solve this issue and avoid yet more
>> pain down the road.
> 
> IMHO one way to decide whether to accept a patch during the freeze is to
> see how large and "important" it is. Does anybody have an example
> patch, or a description of what code changes would be necessary?
I had a look at this bug, so it seems I have. I don't understand why
Elliott ignored my previous 2 mails about it. So I am repeating my humble
view here. The reasons for not touching this bug any more before the Lenny
release are:

 - the fix requires a big patch (a small part of it was written by me, see the
#474947 thread);
 - this patch has to change the internals of apt;
 - this patch can break the apt API and ABI (not checked);
 - this patch reduces apt speed (not serious though, as I see) on most
operations with the cache;
 - this patch definitely requires thorough review and testing;

I don't think this would be acceptable to the release team.

Elliott, the reason for this bug is the apt architecture. Do you think we can
easily change the architecture of a core package at the freeze stage?

-- 
Eugene V. Lyubimkin aka JackYF





Bug#474947: the state of Bug#474947

2008-10-23 Thread A Mennucc
On Wed, Oct 22, 2008 at 10:09:58PM -0700, Elliott Mitchell wrote:
> I must therefore suggest that at the very least, the first part of this
> bug is too severe to allow to continue on to yet another release. Despite
> the pain now, it is better to solve this issue and avoid yet more
> pain down the road.

IMHO one way to decide whether to accept a patch during the freeze is to
see how large and "important" it is. Does anybody have an example
patch, or a description of what code changes would be necessary?

a.






Bug#474947: the state of Bug#474947

2008-10-22 Thread Elliott Mitchell
>From: A Mennucc <[EMAIL PROTECTED]>
> In this same bug there were reported two different issues, a "Dynamic
> MMap error"  and a segmentation fault; moreover some people were using
> APT in Etch, and some others in Lenny.

I believe this is a good estimation of how it breaks down.

> The only way to trigger a segmentation fault was instead to set
> Cache-Limit to a ridiculously low value.

Perhaps I was lucky. It is difficult to probe deeper with the first bug
complicating #474947.

Bugs #409336 and #443564 may be similar, or perhaps the exact same issue.

> So my conclusion is that the forthcoming release notes do address the
> problem some people may encounter in upgrading from Etch to Lenny.
> 
> I propose the attached patch, though, since it is funny to suggest a
> value of 1250 (bytes) when the internal value in Lenny is 20MB.
> I hope someone on the d-release team can apply it.

One thing comes to mind here, is this an amount of space reserved or an
amount allocated? If the latter, then smaller (embedded) systems will
have problems.



> Some months go by; in Sept. JackYF offers to help fixing the problem by
> changing the code.
> (Unfortunately we are already in deep freeze, and I am afraid deep
> changes to APT would not be accepted by the release team.)

I must suggest that the release team think very carefully about this.

The earliest manifestation of the former (MMap problem without the core
dump) is bug #178623. For the past 5 years, every release has had this
bug multiple times as an important issue and had to have it documented in
release notes. I'm estimating 25-50 reports, plus a large number of
people who found the workaround via Google or simply looking at the
reports and decided that since it was reported and a workaround was
known, a fix must be under way.

This affects a rather large number of people.

Given past history, no matter how high the limit (workaround) is set now,
a significant number of people will encounter it within the lifespan of
Lenny. Worse, when the next version beyond Lenny is produced, upgrades
will very likely trigger this bug yet again and a horde of people will
encounter it.


I must therefore suggest that at the very least, the first part of this
bug is too severe to allow to continue on to yet another release. Despite
the pain now, it is better to solve this issue and avoid yet more
pain down the road.


-- 
(\___(\___(\__  --=> 8-) EHM <=--  __/)___/)___/)
 \BS (| [EMAIL PROTECTED] PGP F6B23DE0 |)   /
  \_CS\   |  _  -O #include  O-   _  |   /  _/
2477\___\_|_/DC21 03A0 5D61 985B <-PGP-> F2BE 6526 ABD2 F6B2\_|_/___/3DE0








Bug#474947: the state of Bug#474947

2008-10-21 Thread A Mennucc
retitle 474947 "fix Dynamic MMaps error"
severity 474947 important
tag 474947 -unreproducible
thanks

hi bug, hi people, hi d-release

I studied bug 474947, which is grave/RC and is filed against APT.

Since I was told that the APT team is understaffed, I decided to take
action myself.

---

First of all, I tried to pin down the problem.

In this same bug two different issues were reported, a "Dynamic
MMap error" and a segmentation fault; moreover some people were using
APT in Etch, and some others in Lenny.

Let's summarize it briefly.

The first poster is Elliott Mitchell, in Apr 08; he is using APT 0.6.46 inside
Etch. He claims that he cannot work around a "Dynamic MMap" error; he
then sets the severity to grave.
 It turns out that he is using the wrong option: he is using '-o
APT::Cache-File=2' instead of '-o APT::Cache-Limit=2' .
(And that confuses "Joe Nahmias" and me as well; the mistake is noted
by JackYF quite a bit later.) So the first report is flawed.

jasen reports a similar problem, but I don't see enough details to comment.

Then Elliott Mitchell also posts a report about a segmentation fault.

Some months go by; in Sept. JackYF offers to help fix the problem by
changing the code.
(Unfortunately we are already in deep freeze, and I am afraid deep
changes to APT would not be accepted by the release team.)



I did some more research and tests, and this is what I found out.

The default in APT (inside the code) for Cache-Limit in Lenny is 20MB;
in Etch it is 8MB.
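
Since Cache-Limit is given in bytes, the two defaults can be converted for comparison with whatever value ends up in apt.conf. This is only a minimal sketch: `mb_to_bytes` is a hypothetical helper, and it assumes that "MB" means 1024 * 1024 bytes, which is not stated anywhere in this thread.

```shell
#!/bin/sh
# Hypothetical helper: convert a size in MB to bytes.
# Assumption: 1 MB = 1024 * 1024 bytes (not confirmed by the thread).
mb_to_bytes() {
    echo $(( $1 * 1024 * 1024 ))
}

lenny_default=$(mb_to_bytes 20)   # default reported for Lenny
etch_default=$(mb_to_bytes 8)     # default reported for Etch

echo "Lenny default: $lenny_default bytes"
echo "Etch default:  $etch_default bytes"
```

Any value suggested in documentation as a workaround would presumably need to be larger than these numbers to have an effect.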

Note also that the Lenny release notes
http://svn.debian.org/viewsvn/ddp/manuals/branches/release-notes/lenny/en/upgrading.dbk
suggest setting Cache-Limit to 1250 in case of a Dynamic MMap
error.


I tested APT 0.6.46 in Etch. I tried with 3 different sources.list, see
attachment. The "fat" list is the union of all lists of all reporters
(minus obsolete and duplicates).
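
A combined list of this kind could be assembled roughly as follows. This is an illustration only, not the procedure actually used for the attachment: `merge_sources` is a hypothetical helper, it drops exact duplicate lines only, and obsolete entries would still have to be removed by hand.

```shell
#!/bin/sh
# Hypothetical helper: merge several sources.list files into one,
# keeping only "deb" / "deb-src" lines and dropping exact duplicates.
merge_sources() {
    # grep -h suppresses file names; sort -u removes duplicate lines.
    grep -h -E '^(deb|deb-src)[[:space:]]' "$@" | sort -u
}

# Usage: merge_sources reporter1.list reporter2.list > fat.list
```

Comment lines and blank lines are skipped, so the output contains only active entries.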

The two smaller files work perfectly well; the "fat" sources file does
trigger the DynamicMMap problem, which I can however work around by using
 'apt-get -o APT::Cache-Limit=1 update'

The only way to trigger a segmentation fault, by contrast, was to set
Cache-Limit to a ridiculously low value.

So my conclusion is that the forthcoming release notes do address the
problem some people may encounter in upgrading from Etch to Lenny.

I propose the attached patch, though, since it is funny to suggest a
value of 1250 (bytes) when the internal value in Lenny is 20MB.
I hope someone on the d-release team can apply it.

a.

Index: manuals/branches/release-notes/lenny/en/upgrading.dbk
===
--- manuals/branches/release-notes/lenny/en/upgrading.dbk	(revisione 5426)
+++ manuals/branches/release-notes/lenny/en/upgrading.dbk	(copia locale)
@@ -957,10 +957,11 @@
 to a value that should be sufficient for the upgrade:
 
 
-# echo 'APT::Cache-Limit 1250;' >> /etc/apt/apt.conf
+# echo 'APT::Cache-Limit 2100;' >> /etc/apt/apt.conf
 
 
-This assumes that you do not yet have this variable set in that file.
+This assumes that you do not yet have this variable set in that file;
+otherwise you may manually edit the file to set the above variable.
 
 
 Sometimes it's necessary to enable the APT::Force-LoopBreak
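
The caveat in the patched text (the variable must not already be set in the file) could also be checked before appending. A minimal sketch under stated assumptions: `set_cache_limit` is a hypothetical helper, the written line mirrors the format used in the patch, and only a plain one-line `APT::Cache-Limit` entry is detected (a nested `APT { ... }` block would be missed).

```shell
#!/bin/sh
# Hypothetical helper: append an APT::Cache-Limit line to a config file
# only if the file does not already mention that variable.
set_cache_limit() {
    conf_file=$1
    limit=$2
    if [ -f "$conf_file" ] && grep -q 'APT::Cache-Limit' "$conf_file"; then
        # Already set: leave the file alone so it can be edited by hand.
        echo "APT::Cache-Limit already present in $conf_file" >&2
        return 1
    fi
    echo "APT::Cache-Limit $limit;" >> "$conf_file"
}

# Usage (as root): set_cache_limit /etc/apt/apt.conf <value-in-bytes>
```

Returning a non-zero status instead of rewriting the existing entry keeps the helper conservative, in line with the advice to edit the file manually in that case.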

deb http://debian.oregonstate.edu/debian/ stable main contrib non-free
deb http://security.debian.org stable/updates main contrib non-free
deb http://www.debian-multimedia.org stable main
deb http://volatile.debian.org/debian-volatile stable/volatile main contrib non-free
deb http://debian.oregonstate.edu/debian/ testing main contrib non-free



deb-src http://debian.oregonstate.edu/debian/ stable main contrib non-free
deb-src http://security.debian.org stable/updates main contrib non-free
deb-src http://www.debian-multimedia.org stable main
deb-src http://volatile.debian.org/debian-volatile stable/volatile main contrib non-free
deb-src http://debian.oregonstate.edu/debian/ testing main contrib non-free


#-

deb http://ftp.egr.msu.edu/debian/ unstable main contrib
deb-src http://ftp.egr.msu.edu/debian/ unstable main contrib

deb http://ftp.us.debian.org/debian/ unstable main contrib non-free
deb-src http://ftp.us.debian.org/debian/ unstable main contrib non-free


deb http://ftp.us.debian.org/debian/ etch main contrib non-free
deb http://ftp.us.debian.org/debian/ lenny main contrib non-free

deb-src http://mentors.debian.net/debian unstable main contrib

#
deb ftp://ftp.nz.debian.org/debian etch main non-free contrib
deb-src http://ftp.nz.debian.org/debian stable main non-free contrib
deb http://ftp.nz.debian.org/debian lenny main contrib non-free
deb http://ftp.nz.debian.org/debian unstable main contrib
deb http://www.debian-multimedia.org etch main


deb ftp://ftp.au.debian.org/debian etch main non-free contrib
deb-src http://ftp.au.debian.org/debian etch main non-free contrib

deb-src http://ftp.nz.debian.org/debian testing main contrib