On Fri, Aug 17, 2018 at 9:19 AM, Lyor Goldstein <lgoldst...@apache.org> wrote:
> >>> We should be careful when trying to replace existing code with
> external libraries because there is rarely a guarantee that it will work
> exactly as the old code does.
>
> I agree in principle, but am not sure about "rarely a guarantee" -
> especially in this case where the code is a 100% duplicate of the external
> library.

Depends on whether the library which has the original then depends on
other libraries or not. I have this problem often in Node.js, where one
library depends on 16 other libraries and I could recreate the entire
functionality I need without a single dependency in half a day.

> >>> Dependencies create problems when the project we depend on decides to
> slightly change the behavior of class X (for some reason); then our
> project starts showing random bugs every 16 hours because of it.
>
> True, but isn't that why we have unit tests in place? Granted that there
> is no 100% guarantee, but what is the alternative? Repeat work that has
> already been done elsewhere again and again and again?

Do we have unit tests for everything? Every possible scenario? I think a
lot of the tests are more functional.

> I have seen quite a few libraries that have adopted the avoidance
> principle and if one looks at their code, it is the opposite of D.R.Y.
> (Don't Repeat Yourself) in its purest form. Everybody seems to have their
> "own" logging, *StringUtils*, *CollectionUtils*, *MapUtils*, *IOUtils*,
> etc., that virtually do the exact same thing all over again.
>
> >>> Unless the dependent code is huge (like Bouncycastle), I think it
> rarely works out as an energy-time-saver.
>
> In this case I could even argue (just for the sake of this discussion)
> "what if the 3rd-party developers went so far as to add malware code?"
> Does this mean that we are doomed to always write all our code in-house
> because we can't trust the other developers to be good or honest?
> In that case let's ditch *slf4j* and write our own logger wrapper
> framework (it is not such a huge task)....
>
> :-)) I am not really suggesting this, just making a point ...
>
> Seems to me that the same points can be made of *any* code - huge, medium
> or small - you either trust it or not. So we
>
> * choose libraries that are widely popular, thus minimizing bug risks due
>   to their widespread usage, and increasing the chances that existing/new
>   bugs are discovered and fixed quickly - if only due to "peer pressure"
>   from the developer community
>
> * choose libraries that are open-source - so we can at least debug
>   problems, and if push comes to shove, perhaps devise a workaround if
>   the bug in the dependency is not fixed
>
> * choose libraries from "reputable" sources - ones that are known to
>   produce high quality code, and at the same time are quick to address
>   bugs (I am happy to say that I consider the Apache Foundation such an
>   organization ...)
>
> To summarize, all the points you have raised are indeed very valid
> concerns that need to be addressed and their pros and cons considered
> very carefully. I am not advocating any kind of "extremist" policy in
> this case - neither 100% reliance on 3rd-party libraries nor 100%
> avoidance of them. My own opinion is that it is indeed very much a matter
> of *measure*
>
> - how much code are you repeating
>
> - are you repeating trivial code that is basically "write once and
>   forget" or complex code that is likely to require frequent maintenance
>
> Thanks a lot for the insightful remarks

I agree with your sentiments. At the end of the day, all we can do is
weigh the pros and cons and make informed decisions on a case-by-case
basis.

> Lyor