On Jan 21, 9:24 pm, Phlip <phlip2...@gmail.com> wrote:
> On Jan 20, 11:20 pm, Michele Simionato <michele.simion...@gmail.com>
> wrote:
>
> > pylint does too many things, I want something fast that just counts
> > the lines and can be run on thousands of files at once.
> > cloc seems fine, I have just tried on 2,000 files and it gives me a
> > report in just a few seconds.
>
> In my experience with Python codebases that big...
>
> ...how many of those lines are duplicated, and might merge together
> into a better design?
>
> The LOC would go down, too.

Actually 2,000 files is a very small portion of our code base, just the
part I am working on now. I have spent the last couple of months on a
big refactoring project (which is still only at the beginning) and I
wanted to count the difference between the lines of code before and
after the refactoring. I guess the new code is less than half the size
of the old one. There was no cut and paste in the old code but a lot of
subtle duplication, i.e. code that could be unified in common libraries,
but only after a lot of grunt work. The core parts were written 10 years
ago, with the wrong architecture from the very beginning, and then
things kept growing and growing on that monster.

Just for fun I have run cloc on our trunk:

Language            files   blank  comment     code   scale   3rd gen. equiv
----------------------------------------------------------------------------
C++                  1528   67150    48251   304365 x  1.51 =      459591.15
XML                   560    2769     2517   223223 x  1.90 =      424123.70
ASP                   731   40136     4630   216713 x  1.29 =      279559.77
Python               2027   38825    47261   179532 x  4.20 =      754034.40
C/C++ Header         2150   51352    72619   141356 x  1.00 =      141356.00
Javascript            153   26196     9819   115311 x  1.48 =      170660.28
C                     332   14147    12871    97918 x  0.77 =       75396.86
SQL                   426   16432     4214    93598 x  2.29 =      214339.42
CSS                   110    1493     1013    23087 x  1.00 =       23087.00
C#                     83    3301     1990    19827 x  1.36 =       26964.72
Visual Basic           35    4363     5927    14633 x  2.76 =       40387.08
make                  259    1617      650     8339 x  2.50 =       20847.50
Bourne Shell           52     598     1282     6557 x  3.81 =       24982.17
m4                     28     611      627     5612 x  1.00 =        5612.00
IDL                    23     560        0     3895 x  3.80 =       14801.00
HTML                   33     354       76     3834 x  1.90 =        7284.60
MSBuild scripts         3       2        7     3419 x  1.90 =        6496.10
Lisp                   33     562      648     2695 x  1.25 =        3368.75
Ruby                   13     272       97     1141 x  4.20 =        4792.20
DOS Batch              77     790      410     1034 x  0.63 =         651.42
Java                    4     148      181      972 x  1.36 =        1321.92
Perl                    6     104      131      922 x  4.00 =        3688.00
XSD                     6       0        0      506 x  1.90 =         961.40
awk                     5      65       17      366 x  3.81 =        1394.46
DTD                     4     117       50      351 x  1.90 =         666.90
ASP.Net                36     153      561      280 x  1.29 =         361.20
Bourne Again Shell     12      63        8      245 x  3.81 =         933.45
XSLT                    1      15       14      196 x  1.90 =         372.40
NAnt scripts            3      27        0      119 x  1.90 =         226.10
Teamcenter def         10      16        0       93 x  1.00 =          93.00
----------------------------------------------------------------------------
SUM:                 8743  272238   215871  1470139 x  1.84 =     2708354.95
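
For the before/after comparison mentioned above, something along these lines could serve as a rough cross-check in pure Python. It is only a sketch and not how cloc works: it looks at .py files only, treats a line as a comment only when its first non-blank character is '#', and therefore counts docstrings as code. The function names (classify_lines, count_tree, code_reduction) are made up for illustration.

```python
import os


def classify_lines(source):
    """Return (blank, comment, code) counts for one Python source string.

    Naive on purpose: a line is a comment only if its first non-space
    character is '#'. Docstrings end up counted as code.
    """
    blank = comment = code = 0
    for line in source.splitlines():
        stripped = line.strip()
        if not stripped:
            blank += 1
        elif stripped.startswith("#"):
            comment += 1
        else:
            code += 1
    return blank, comment, code


def count_tree(root):
    """Sum (blank, comment, code) over every .py file below root."""
    totals = [0, 0, 0]
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if name.endswith(".py"):
                path = os.path.join(dirpath, name)
                # errors="replace" keeps odd encodings from aborting the run
                with open(path, encoding="utf-8", errors="replace") as f:
                    counts = classify_lines(f.read())
                totals = [t + c for t, c in zip(totals, counts)]
    return tuple(totals)


def code_reduction(before_root, after_root):
    """Fraction of code lines removed between two checkouts (0.0 to 1.0)."""
    before = count_tree(before_root)[2]
    after = count_tree(after_root)[2]
    return (before - after) / before if before else 0.0
```

With two checkouts side by side (the directory names here are hypothetical), code_reduction("old_trunk", "new_trunk") returning a value above 0.5 would match the "less than half" guess.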