On Tue, 29 May 2018 23:27:14 -0400 Cedric Bail <[email protected]> said:

> I have been wondering where to answer on this email thread, but I think that
> the first one kind of set the tone and trend for the rest and overall it is a
> very good example of bad communication and expectation to say the least.

what tone? that there is a problem and there needs to be a solution quickly for
efl release? that i'm unhappy with the testing/qa for such large and major
changes?

> On May 26, 2018 11:55 PM, Carsten Haitzler <[email protected]> wrote:
> > i've been dreading updating efl for the past few days. the dread proved
> > founded. the short version:
> > 
> > the batch below is pretty horrible. it leaves enlightenment glitching with
> > garbage windows after a desktop switch or a second or 2. it makes
> > enlightenment's restarts significantly slower (like from 1 second on this
> > machine to like 2-3) with more visual by-products (screen with just
> > wallpaper visible for 0.5-1sec). the batch is also essentially unbisectable.
> 
> You have been doing open source work for how many years now ? How do you
> expect anyone to deal with such a bug report ? There is no task where I could
> find details about them. It is just a rambling with the expectation that
> everyone is seeing the problem.

the e devel mailing list is expected to be subscribed to and read by all
developers. you know this full well. so i expect "dealing with it" as
reading the email like a human, some kind of response of "oh wait - don't revert
it i can fix it" or "what? i tested it and it works perfectly for me? why does
it break for you?". the details are in the email like they would be in any
ticket too.

what about an email makes it impossible to be dealt with? i use emails when
there is more to be said than just point to a bug and track that, or for
anything major which needs wider distribution (like don't update beyond rev
X, and that i think testing is not good enough). i would like to point out that
for example kernel developers exclusively use email for both review and bug
reports and discussion, so they obviously can deal with email...

> > this batch leaves efl in a far worse state than before.
> 
> Thanks. Getting rid of technical debt is underrated.

before, things worked and were faster on restart (well, seemingly faster).
after, they are broken to the point where things are not usable at all. and
it's not just me seeing it. others hit the same problems and started to send
mails and pop up on irc... i do value the work behind it, but it's created a
very visible problem just prior to an efl release. and just prior to you
disappearing.

http://www.enlightenment.org/ss/e-5b0e303f8d8576.82049191.jpg

i described the issue in my mail. there is no backtrace to point to in detail
nor could i even detail what commit broke it.

i have since tried this on 4 systems, and the same result as above. default
theme, flat theme... single or multiple screens. intel or nvidia gpu. all 64bit
arch though and all x11.

> > the batch itself is horrible because of the unbisectability. i have tried
> > now about 15 commits (i lost count as i ended up in text console unable to
> > write notes where i was writing them) and out of them only 1 compiled and
> > ran at all without major issues. the largest amount of these just left e
> > crashing on startup along with another group having edje_cc crash during
> > compilation. while the crashes were gone by the end of the batch the above
> > issues (short version above) remain.
> 
> I guess you didn't know we had people overriding efl_del and a lot of other
> hacks. So as soon as you fix the lifecycle everything falls apart. Leaving

then as q66 said: it should have been merged, or divided so each commit is
"runnable" on its own. you have done open source long enough to know that
introducing a large batch of "unbisectable" commits makes digging through
history very hard. 

> two options. Having one giant commit which doesn't explain anything or small
> ones that explain the reasoning of the change. There might have been a few
> that could have maybe not broken the rest, but overall, with all the hacks we
> had in place, it was impossible until late in the series to make anything able
> to run.

one way to do it: flag a class to have old or new style destroy and so object
type by type they behave differently, cleaning up one at a time per commit. at
the end remove flag feature as it's not needed anymore. an idea for doing it
step by step.
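
a rough sketch of that flag idea in c, purely illustrative: every name here
(Demo_Class, destroy_style, demo_object_del) is made up for the example and is
not the actual eo api. each class carries a temporary flag that picks the old
or new teardown path, so a single commit can convert a single class and the
tree still runs in between:

```c
#include <assert.h>
#include <stdlib.h>

/* hypothetical names throughout - this is NOT the real eo api */
typedef enum {
   DESTROY_STYLE_OLD, /* class not converted yet: keep the previous teardown */
   DESTROY_STYLE_NEW  /* class converted: use the fixed lifecycle */
} Destroy_Style;

typedef struct {
   const char   *name;
   Destroy_Style destroy_style; /* temporary flag, dropped at end of series */
} Demo_Class;

typedef struct {
   const Demo_Class *klass;
} Demo_Object;

/* counters only exist so the two paths can be told apart in a test */
int old_path_calls = 0;
int new_path_calls = 0;

static void
_destroy_old(Demo_Object *obj)
{
   old_path_calls++; /* stand-in for the legacy teardown */
   free(obj);
}

static void
_destroy_new(Demo_Object *obj)
{
   new_path_calls++; /* stand-in for the cleaned-up teardown */
   free(obj);
}

Demo_Object *
demo_object_new(const Demo_Class *klass)
{
   Demo_Object *obj = malloc(sizeof(Demo_Object));
   if (!obj) return NULL;
   obj->klass = klass;
   return obj;
}

void
demo_object_del(Demo_Object *obj)
{
   assert(obj && obj->klass);
   /* dispatch per class, so converted and unconverted classes coexist */
   if (obj->klass->destroy_style == DESTROY_STYLE_NEW) _destroy_new(obj);
   else _destroy_old(obj);
}
```

each commit in the series would then flip one class from DESTROY_STYLE_OLD to
DESTROY_STYLE_NEW (and fix whatever that class was hacking around), and the
last commit would drop the flag and the old path entirely.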

> > finding out the causes of these issues is nigh impossible. i've already
> > spent over 4hrs on the above trying to find them.
> 
> Well, Marcel did seem to have found something quickly, I have no idea if that
> does fix any other of your issues, as I obviously have not seen them nor
> gotten any way to reproduce them.

no fix. i only know of:

https://phab.enlightenment.org/D6222 - abandoned (doesn't fix)
https://phab.enlightenment.org/D6223 - committed but doesn't fix it
https://phab.enlightenment.org/D6224 - just adding test, not a "fix" (but good)

i'm looking at git master and i see no fixes in there and i don't know of any
pending fixes... :/

> > so... i've gone back to 0090384ef5ac9f9e939874a1bbf233298c9db930 which is
> > good/works. i'm sitting on this (and i suggest anyone else do the same for
> > now).
> 
> Seems a good advice along with not making a bug report.

an email here is both a bug report and a broadcast for people to hold off on
updating and where to stay to avoid problems - the mail detailed them. it's
pretty decent advice given people already hit the problem.

i was hoping to at least get a "oh shit. really?" response. maybe a "please
don't revert - i think i know the source" or "it worked for me? why is it now
broken for you and others?". instead i get this "well i did everything fine and
it's the only way it can be done".

this is the problem. you know full well you were quitting and going to vanish
and just before that we end up with this commit. i was incredibly restrained
and nice given the nature of what i have seen. since you sent a mail about your
sabbatical now, it's public knowledge but i knew it was coming and knew it was
end of this month and this lands right before the end of the month... if it was
good without any major issues i wouldn't have said peep.

> > i'm going to wait until my wednesday morning here in seoul. if the above
> > issues are not fixed by then i have no choice but to revert every patch from
> > 36f8a70041a8a16249a07d5b7131d57a8a6ea95b to
> > 75bb7c049f05176aef635bddcfb320c306b196bf from cedric because tbh - this is
> > the problem batch as described. it's not personal. it's the reality of the
> > situation. i have to do this because there is no way an efl release can go
> > out with these in place in their current condition. this is 115 commits
> > btw... so going over every single one to figure out what may or may not be
> > involved is going to be a major time sink that i don't think can be done
> > well other than maybe trim the start of this series and keep some of those.
> > i might do so in the meantime.
> 
> I don't know. What about T6879 ? It completely borked one of my computers and
> took a month to get fixed. Would that have been corrected faster if I did
> write the same kind of email as you did here ?

if you had reverted i would have understood. it didn't seem important at least
to you due to the slowness of providing a backtrace and no other reports i
heard of.

it's a code path that isn't common. it luckily was a single very small commit
(not 100+). it took 3 weeks for you to provide a backtrace from original bug
report (and i'd kind of forgotten about it due to reply gaps until i came back
to it), then about 3 weeks or so until i read it again and saw the bt's and
saw what code path it might be and fixed it. my fix was within hours of seeing
the replies. my bad that i added the bug. i did address it once i had enough
info on what it might be (and noticed the response).

> > i'm a bit disappointed on the lack of testing of these. :( also this is a
> > perfect example of drive-by commit batches causing major issues which is
> > why i keep pushing against branches or hoarding of commits because they
> > lead to this again and again. it also reinforces my take that work needs to
> > be done in small units and shared frequently and i am certain the more and
> > more common issues efl etc. are having is a result of the change to
> > branches and hoarding of patches and dumping them in large numbers. review
> > doesn't work because i already reverted one patch earlier today that was
> > reviewed that totally broke e. review obviously doesn't involve any testing
> > and that is the most basic thing to do. :(
> 
> The idea of associating drive by commit and branches is kind of interesting.
> The fact is that many people did help on this branch and did test it. There
> wasn't a way to land a more atomic version of it (If you just read the
> commit message you will understand how much technical debt there was that
> couldn't be dealt with in small changes). Also your entire email and following
> thread is built on the assumption that I haven't done any tests and just yolo

when there's a massive batch of commits going in and i know full well that a
few days after you plan to disappear... i do call this a drive-by commit. it's
a worse version of "commit on friday at 5pm then go away for the weekend".
it's 100+ commits messing with deep internals a few days before disappearing
for a few months (and perhaps forever). call it a drive-by branch-landing if
you want - the result is the same. massive amount of code delivered to master
all at once with problems just before disappearing.

indeed i do assume lack of testing because i have tried it now on 4 machines.
i know you have avoided testing e before and you directly admitted
that, so i see results that indicate similar again.

> landed 121 patches. Which is an absolutely ridiculous idea that shouldn't
> even need to be discussed/debated, but well, I have run make check and
> updated/added tests as necessary. I have checked E and terminology on my
> laptop and have been using it for more than a week before landing it. My
> branch has been buildable and testable in public for weeks. I got people to

see above. edje_cc segv's in maybe a bit under half of the commits i landed on.
e going into hangs and infinite loops inside destructor on maybe another 30%
of them. i fail to fathom how it could have been tested when i ran into problem
after problem. the infinite loops were in common code paths (destruction of
objects). if you have indeed tested it then i would dearly love to know why
everything is fine for you and seemingly not fine for quite a few others. 

> run and check it. So at some point, you might realize that not everyone,
> including you, has the same configuration, and test environments have limits
> that will only be caught when a larger population uses it.

that is true, but how can it be so bad when i look at it and all the commits
inside and it works so wonderfully for you and others? edje_cc is not
configuration related. perhaps race or libc mem layout related... but if i got
such reliable crashes how did things just all work like a charm for you?

> Anyway, this type of email is clearly not useful and tiring to deal with. It
> would be better to come back to more constructive ones. And I hope in the
> future to see better bug reports and involvement with testing. Otherwise it is
> unlikely that any more important work that gets rid of our technical debt will
> take place and efl will just be on life support.

i'll happily file a ticket if it needs tracking. one was filed within like
a few hours or so of the email anyway so not sure if there is a point:
https://phab.enlightenment.org/D6220

i was not happy after many hours of building, rebuilding, rebuilding, resetting
my bisect to a new range with new manually chosen commits to bypass the "it
doesn't even compile" problems so i don't know if the bisect is good or bad.

let me put it this way. let's say you were a cook and the head chef walks in
and notices part of the kitchen is on fire. he shouts "wtf? what is going on
here? i'm running to get the extinguisher in a second if you don't put it out,
and that'll ruin all food prepared for the day because of the fumes", and then
your email here is "well it was necessary, but ... i was somewhere else and
didn't notice - also the other cooks were helping me out" ... do you think that
the chef isn't justified in yelling and jumping up and down? i didn't even do
that. i actually was pretty calm about it.

i made it clear in following mails that i valued the work and content. it's why
i didn't just insta-revert. i know it's hard. but testing is not good enough.

now i have since narrowed things down a bit. it only happens with native
surface/gl rendering. software is fine. i now am baffled how lifecycle changes
affect a specific rendering engine or native surface path... but they do. this
says to me that testing is indeed not good. not testing accelerated
rendering/compositing in this day and age should have few if any excuses at
least on devices with such full support which is just about every development
machine devs work on...

> Cedric
> 
> ------------------------------------------------------------------------------
> Check out the vibrant tech community on one of the world's most
> engaging tech sites, Slashdot.org! http://sdm.link/slashdot
> _______________________________________________
> enlightenment-devel mailing list
> [email protected]
> https://lists.sourceforge.net/lists/listinfo/enlightenment-devel
> 


-- 
------------- Codito, ergo sum - "I code, therefore I am" --------------
Carsten Haitzler - [email protected]

