[389-devel] Re: Mapping tree rework

2020-10-19 Thread Pierre Rogier
Hi William,

Things are not black and white:
   there is a huge difference between a fix with limited impact (like
adding some check in configuration tools or in the server) and redesigning
something that is used in many different contexts for every request handled
by the server ...

In the first case we could easily mitigate the risk by testing and be
fairly confident, in the second case the tests are too complex to achieve
the same confidence and we should take this kind of risk only if there were
a serious benefit to balance it, but in this case, there are other
solutions with less risks.

I can understand it could seem too conservervative and frustrating but that
is the price when working on mature projects. If you do not do that, the
product becomes unstable, and users quickly abandon it.

Regards,
   Pierre


On Mon, Oct 19, 2020 at 1:27 AM William Brown  wrote:

>
>
> > On 16 Oct 2020, at 17:48, Pierre Rogier  wrote:
> >
> > Hi William,
> > I agree with your architecture points and that is why I said my proposal
> is a less appealing trade off.
> >
> > My real concern is your last point:
> >  we just do not know and IMHO we are unable to predict what (or if)
> config will cause problems, and I am afraid we will only discover it when
> people start to complain.
> > So I still think that the benefit/risk ratio is bad)
> >
>
> I think this wasn't my point. The thing is *any* change will have that
> "unknown" risk. Our job is to qualify and identify as many of those risks
> as we can, to remove them as unknowns. Think about the work recently to
> merge the changelog to the main db, or BDB to LMDB work, even changing from
> perl to python for installation. These are all significantly larger
> changes, which would be "much riskier" but all of them have been managed
> effectively by the team communicating, coordinating, analysing, designing
> and testing changes.
>
> So I really don't accept this "unknown" risk argument. I have laid out a
> design that explores the configuration, how it works today and how the
> values are currently trusted, and a process to manage and understand this
> change in a way to minimise the risk. There are associated tests, and it
> passes with address sanitiser, and other test cases for mapping trees,
> replication and others.
>
> If we just say "unknown risk" at every change we make we'd never progress.
> We may as well packup and go home, the project is completed.
>
> So I still stand by my design and the PR I have submitted in this case,
> and if there are concerns about esoteric configurations, then we should
> identify and understand them too beyond the testing I have already provided.
>
> —
> Sincerely,
>
> William Brown
>
> Senior Software Engineer, 389 Directory Server
> SUSE Labs, Australia
> ___
> 389-devel mailing list -- 389-devel@lists.fedoraproject.org
> To unsubscribe send an email to 389-devel-le...@lists.fedoraproject.org
> Fedora Code of Conduct:
> https://docs.fedoraproject.org/en-US/project/code-of-conduct/
> List Guidelines: https://fedoraproject.org/wiki/Mailing_list_guidelines
> List Archives:
> https://lists.fedoraproject.org/archives/list/389-devel@lists.fedoraproject.org
>


-- 
--

389 Directory Server Development Team
___
389-devel mailing list -- 389-devel@lists.fedoraproject.org
To unsubscribe send an email to 389-devel-le...@lists.fedoraproject.org
Fedora Code of Conduct: 
https://docs.fedoraproject.org/en-US/project/code-of-conduct/
List Guidelines: https://fedoraproject.org/wiki/Mailing_list_guidelines
List Archives: 
https://lists.fedoraproject.org/archives/list/389-devel@lists.fedoraproject.org


[389-devel] Re: Mapping tree rework

2020-10-19 Thread Mark Reynolds

Hi,

So some of the arguments here is that we are introducing risk for 
something that is not really a big problem.  Or, simply not worth 
investing in. From a Red Hat perspective "we" would _never_ fix this, 
it's just not a problem that comes up enough to justify the work and 
time.  But...  The initial work has been done by the upstream community 
(William).  So from a RH perspective we are getting this work for free.  
Personally I don't see this code change as "very" risky, but this is a 
very sensitive area of the code.  That being said, I am not opposed to 
adding it, but...  I think we need much more testing around it to build 
confidence in the patch.  I would want tests that deal with suffixes of 
varying size, names, nested levels/complexity:


    o=my_server.com

    dc=example,dc=com

    dc=abcdef,dc=abc  (same length as suffix above - since the patch 
uses sizing as a way of sorting)


    dc=test,dc=this,dc=patch


I want tests that are adding and removing subsuffixes, and 
sub-subsuffixes, and making sure ldap ops work, and replication, etc.  I 
want tests that use many different suffixes at the same time and many 
subsuffixes - some customers have 50 subsuffixes. Our current CI test 
suite does not have these kinds of tests, and we need them.


As of today I'm not comfortable with the current CI tests to ack this 
patch, but if we can ramp it up and cover more scenarios it would be a 
step in the right direction.  This is all just my humble opinion, we are 
all still just talking :-)


Mark




On 10/19/20 6:47 AM, Pierre Rogier wrote:

Hi William,

Things are not black and white:
   there is a huge difference between a fix with limited impact (like 
adding some check in configuration tools or in the server) and 
redesigning something that is used in many different contexts for 
every request handled by the server ...


In the first case we could easily mitigate the risk by testing and be 
fairly confident, in the second case the tests are too complex to 
achieve the same confidence and we should take this kind of risk only 
if there were a serious benefit to balance it, but in this case, there 
are other solutions with less risks.


I can understand it could seem too conservervative and frustrating but 
that is the price when working on mature projects. If you do not do 
that, the product becomes unstable, and users quickly abandon it.


Regards,
   Pierre


On Mon, Oct 19, 2020 at 1:27 AM William Brown > wrote:




> On 16 Oct 2020, at 17:48, Pierre Rogier mailto:prog...@redhat.com>> wrote:
>
> Hi William,
> I agree with your architecture points and that is why I said my
proposal is a less appealing trade off.
>
> My real concern is your last point:
>  we just do not know and IMHO we are unable to predict what (or
if) config will cause problems, and I am afraid we will only
discover it when people start to complain.
> So I still think that the benefit/risk ratio is bad)
>

I think this wasn't my point. The thing is *any* change will have
that "unknown" risk. Our job is to qualify and identify as many of
those risks as we can, to remove them as unknowns. Think about the
work recently to merge the changelog to the main db, or BDB to
LMDB work, even changing from perl to python for installation.
These are all significantly larger changes, which would be "much
riskier" but all of them have been managed effectively by the team
communicating, coordinating, analysing, designing and testing changes.

So I really don't accept this "unknown" risk argument. I have laid
out a design that explores the configuration, how it works today
and how the values are currently trusted, and a process to manage
and understand this change in a way to minimise the risk. There
are associated tests, and it passes with address sanitiser, and
other test cases for mapping trees, replication and others.

If we just say "unknown risk" at every change we make we'd never
progress. We may as well packup and go home, the project is completed.

So I still stand by my design and the PR I have submitted in this
case, and if there are concerns about esoteric configurations,
then we should identify and understand them too beyond the testing
I have already provided.

—
Sincerely,

William Brown

Senior Software Engineer, 389 Directory Server
SUSE Labs, Australia
___
389-devel mailing list -- 389-devel@lists.fedoraproject.org

To unsubscribe send an email to
389-devel-le...@lists.fedoraproject.org

Fedora Code of Conduct:
https://docs.fedoraproject.org/en-US/project/code-of-conduct/
List Guidelines:
https://fedoraproject.org/wiki/Mailing_list_guidelines
List Archives:

https://lists

[389-devel] Re: Mapping tree rework

2020-10-19 Thread William Brown
> In the first case we could easily mitigate the risk by testing and be fairly 
> confident, in the second case the tests are too complex to achieve the same 
> confidence and we should take this kind of risk only if there were a serious 
> benefit to balance it, but in this case, there are other solutions with less 
> risks.

Actually, I think testing the lib389 tooling would be even harder. You would 
need to recreate the logic of the mapping tree and sorting in python, which may 
have subtle differences compared to the C version. So it would be harder to 
test and gain confidence in. It also doesn't solve the issue that may come 
about from manual misconfiguration.

> I can understand it could seem too conservervative and frustrating but that 
> is the price when working on mature projects. If you do not do that, the 
> product becomes unstable, and users quickly abandon it.

I have worked on this project for a number of years, so I'm well aware of the 
culture in the team. We are a team who values the highest quality of code, with 
customers who demand the very best. To satisfy this as engineers we need to be 
confident in what we do and the work we create. But every day we make changes 
that are bigger than this, or have "more unknowns" and more. It's out attitude 
as a team to quality, our attention to testing, and designs, that make us 
excellent at effectively making changes with confidence.

Because just as easily, when a product has subtle traps, unknown configuration 
bugs and lets people mishandle it, then they also abandon us. Our user 
experience is paramount, and part of that experience is not just stability, but 
reliability and correctness, that changes performed by administrators will work 
and not "silently fail". This bug is just as much a risk for people to abandon 
us because when the server allows misconfiguration to exist that is hard to 
isolate and understand that too can cause a negative user experience.

So here, I think we are going to have to "agree to disagree", but as Mark has 
stated - the fix is created, the PR is open. If you have more configuration 
cases to contribute to the test suite, that would benefit the project 
significantly to ensure the quality of the change, and the quality of the 
mapping tree in general. Our job is to qualify and create scenarios that were 
"unknown" and turn them to "knowns" so we can control changes and have 
confidence in our work.

> On 20 Oct 2020, at 06:10, Mark Reynolds  wrote:
> 
> Hi,
> 
> So some of the arguments here is that we are introducing risk for something 
> that is not really a big problem.  Or, simply not worth investing in. From a 
> Red Hat perspective "we" would never fix this, it's just not a problem that 
> comes up enough to justify the work and time.  But...  The initial work has 
> been done by the upstream community (William).

With a corporate interest too, we have a customer at SUSE who has hit this :).

>  So from a RH perspective we are getting this work for free.  Personally I 
> don't see this code change as "very" risky, but this is a very sensitive area 
> of the code.  That being said, I am not opposed to adding it, but...  I think 
> we need much more testing around it to build confidence in the patch.  I 
> would want tests that deal with suffixes of varying size, names, nested 
> levels/complexity:
> 
> o=my_server.com
> 
> dc=example,dc=com
> 
> dc=abcdef,dc=abc  (same length as suffix above - since the patch uses 
> sizing as a way of sorting)
> 
> dc=test,dc=this,dc=patch


Yep, these are some great test ideas. I can add these.

> 
> 
> 
> I want tests that are adding and removing subsuffixes, and sub-subsuffixes, 
> and making sure ldap ops work, and replication, etc.  I want tests that use 
> many different suffixes at the same time and many subsuffixes - some 
> customers have 50 subsuffixes.  Our current CI test suite does not have these 
> kinds of tests, and we need them.

I have already checked with replication suite too, and of course, with ASAN. I 
think that these also are good to have added in general, so I can expand the 
testing to include more suffixes too. 

Do you see 50 subsuffixes in a single level nesting or deeper? I can do some 
shallow nesting and deep nesting hierarchies with that kind of number if you 
want. I think an interesting test would also be to have 

ou=x,ou=y,dc=example,dc=com

dc=example,dc=com

and then add ou=y,dc=example,dc=com in between. Today I think the pre-patched 
MT code would actually not handle this either, but that's a pretty big edge 
case IMO. The real guarantee is that we do assemble the tree correctly.

We thankfully gain confidence already because the CN is already relied on for 
routing and query matching anyway, so we know these values *must* be correct, 
we just need to guarantee the sorting order and tree assembly.

Thanks for the ideas Mark :) 


> 
> As of today I'm not comfortable with the current CI tests to ack this patch