Re: [python-committers] Codecov and PR

2017-04-28 Thread Brett Cannon
On Fri, 28 Apr 2017 at 02:19 Michael Foord  wrote:

>
>
> On 28/04/17 01:49, Terry Reedy wrote:
> > On 4/27/2017 3:44 PM, Brett Cannon wrote:
> >>
> >>
> >> On Wed, 26 Apr 2017 at 22:36 Terry Reedy wrote:
> >>
> >> On 4/26/2017 1:45 PM, Brett Cannon wrote:
> >
> >>  > E.g. I don't expect
> >>  > test_importlib to be directly responsible for exercising all
> >> code in
> >>  > importlib, just that Python's entire test suite exercise
> >> importlib as
> >>  > much as possible as a whole.
> >>
> >> The advantage for importlib in this respect is that import
> >> statements
> >> cannot be mocked; only the objects imported, after importlib is
> >> finished.
> >>
> >>
> >> Oh, you can mock import statements. :)
> >
> > Other than by pre-loading a mock module into sys.modules?
> > If so, please give a hint, as this could be useful to me.
>
>
> https://docs.python.org/3/library/unittest.mock-examples.html#mocking-imports-with-patch-dict


The other option is to stub out __import__() itself.
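For the curious, here is a minimal sketch of both approaches; the module name
'mylib' and its attribute are invented for illustration, and real tests would
normally use the patch.dict recipe from the docs linked above.

import builtins
import sys
import types
from unittest import mock

fake = types.ModuleType('mylib')   # stand-in for the module we want to fake
fake.answer = 42

# Approach 1: pre-load the fake into sys.modules so "import mylib" finds it.
with mock.patch.dict(sys.modules, {'mylib': fake}):
    import mylib
    assert mylib.answer == 42

# Approach 2: stub out __import__() itself.
real_import = builtins.__import__

def fake_import(name, *args, **kwargs):
    return fake if name == 'mylib' else real_import(name, *args, **kwargs)

with mock.patch('builtins.__import__', side_effect=fake_import):
    import mylib as m
    assert m.answer == 42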


Re: [python-committers] Codecov and PR

2017-04-28 Thread Michael Foord



On 28/04/17 01:49, Terry Reedy wrote:

On 4/27/2017 3:44 PM, Brett Cannon wrote:



On Wed, 26 Apr 2017 at 22:36 Terry Reedy wrote:


On 4/26/2017 1:45 PM, Brett Cannon wrote:



> E.g. I don't expect
> test_importlib to be directly responsible for exercising all code in
> importlib, just that Python's entire test suite exercise importlib as
> much as possible as a whole.

The advantage for importlib in this respect is that import statements
cannot be mocked; only the objects imported, after importlib is
finished.


Oh, you can mock import statements. :)


Other than by pre-loading a mock module into sys.modules?
If so, please give a hint, as this could be useful to me.


https://docs.python.org/3/library/unittest.mock-examples.html#mocking-imports-with-patch-dict



At the moment, I am the only one pushing idlelib patches, except when it
gets included in one of Serhiy's multi-module refactoring patches (and
he always nosies me).


It turns out that Louie Lu's new tool revealed a couple of other 
patches, though just to tests that started failing.


I had not thought about the issue that way.  I should add a test_module
for each remaining module, import the module, and at least create an
instance of every tkinter widget defined therein, and see what other
classes could be easily instantiated and what functions easily run.


That seems like a good starting point. Kind of like test_sundry but 
with class instantiation on top of it.


I looked and saw that bdb is in 'untested'.  I also discovered
https://bugs.python.org/issue19417
to change that, with a 3+ year-old patch.  I plan to review it.


> I view 100% coverage as aspirational, not attainable. But if we want an
> attainable goal, what should we aim for? We're at 83.44% now

On what system?



Travis, where the Codecov run is driven from.


I meant OS, because


I suspect that Tkinter, ttk, turtle, and IDLE
GUI-dependent tests make at least a 2% difference on GUI Windows versus
no-GUI *nix.


--
Terry Jan Reedy



Re: [python-committers] Codecov and PR

2017-04-27 Thread Terry Reedy

On 4/27/2017 3:44 PM, Brett Cannon wrote:



On Wed, 26 Apr 2017 at 22:36 Terry Reedy wrote:


On 4/26/2017 1:45 PM, Brett Cannon wrote:



 > E.g. I don't expect
 > test_importlib to be directly responsible for exercising all code in
 > importlib, just that Python's entire test suite exercise importlib as
 > much as possible as a whole.

The advantage for importlib in this respect is that import statements
cannot be mocked; only the objects imported, after importlib is
finished.


Oh, you can mock import statements. :)


Other than by pre-loading a mock module into sys.modules?
If so, please give a hint, as this could be useful to me.


At the moment, I am the only one pushing idlelib patches, except when it
gets included in one of Serhiy's multi-module refactoring patches (and
he always nosies me).


It turns out that Louie Lu's new tool revealed a couple of other 
patches, though just to tests that started failing.



I had not thought about the issue that way.  I should add a test_module
for each remaining module, import the module, and at least create an
instance of every tkinter widget defined therein, and see what other
classes could be easily instantiated and what functions easily run.


That seems like a good starting point. Kind of like test_sundry but with 
class instantiation on top of it.


I looked and saw that bdb is in 'untested'.  I also discovered
https://bugs.python.org/issue19417
to change that, with a 3+ year-old patch.  I plan to review it.


 > I view 100% coverage as aspirational, not attainable. But if we
want an
 > attainable goal, what should we aim for? We're at 83.44% now

On what system?



Travis, where the Codecov run is driven from.


I meant OS, because


   I suspect that Tkinter, ttk, turtle, and IDLE
GUI-dependent tests make at least a 2% difference on GUI Windows versus
no-GUI *nix.


--
Terry Jan Reedy



Re: [python-committers] Codecov and PR

2017-04-27 Thread Brett Cannon
On Wed, 26 Apr 2017 at 22:36 Terry Reedy  wrote:

> On 4/26/2017 1:45 PM, Brett Cannon wrote:
> >
> > On Tue, 25 Apr 2017 at 17:00 Terry Reedy wrote:
>
> > While I use code coverage to improve automated unittesting, I am opposed
> > to turning a usable but limited and sometimes faulty tool into a blind
> > robotic master that blocks improvements.  The prospect of this being
> > done has discouraged me from learning the new system.  (More on 'faulty
> > tool' later.)
> >
> > It should be stated that code coverage is not a blocking status check
> > for merging from our perspective (the **only** required check is that
> > Travis pass with its test run).
>
> I have the impression that at one time you hoped to make it blocking.
> If that was wrong, I apologize for misunderstanding.  If you have
> changed your mind, then I am happy.
>

"Hope", sure; "currently plan to", no. IOW I have said a lot of things over
the 2.5 years I've been leading this workflow shift and I don't expect all
of them to pan out.


>
> I am otherwise in favor of both the measurement and report of coverage
> being improved.
>
> > The temptation to write artificial tests to satisfy an artificial goal
> > is real.  Doing so can eat valuable time better used for something else.
> > For instance:
> >
> >   def meth(self, arg):
> >   mod.inst.meth(arg, True, ob=self, kw='cut')
> >
> > Mocking mod.class.meth, calling meth, and checking that the mock is
> > called will satisfy the robot, but does not contribute much to the goal
> > of providing a language that people can use to solve problems.
>
> > My assumption is that there will be a test that meth() does the right
> > thing, mock or not. If we provide an API there should be some test for
> > it; using a mock or something else to do the test is an implementation
> > detail.
>
> My impression is that default mocks have a generic signature, so that
> merely checking that the mock is called will not catch an invalid call.
> I presume that one can do better with mocks, and I have with custom
> mocks I have written, but my point above was that coverage does not care.
>
> > >> If it's not important enough to require tests it's not
> > >> important enough to be in Python.  ;)
> >
> > Modules in the test package are mostly not tested. ;)
> >
> >
> > :) But they are at least executed which is what we're really measuring
> > here and, I think, all Ethan and I are advocating for.
>
> I thought Ethan was advocating for more -- a specific unittest for each
> line.
>
> > E.g. I don't expect
> > test_importlib to be directly responsible for exercising all code in
> > importlib, just that Python's entire test suite exercise importlib as
> > much as possible as a whole.
>
> The advantage for importlib in this respect is that import statements
> cannot be mocked; only the objects imported, after importlib is finished.
>

Oh, you can mock import statements. :)


>
> There is lots of interaction between idlelib modules, but I would still
> like direct tests of each idlelib.xyz with a test_xyz.py.  Three years
> ago, there was no test.test_idle.  There now is, and it runs 35
> > idlelib.idle_test.test_* files.  (There are 60 idlelib modules.)
>
> > The problem I have with just doing manual testing for things that can
> > easily be covered by a unit test -- directly or indirectly -- is there
> > are technically 85 people who can change CPython today. That means
> > there's potentially 85 different people who can screw up the code ;) .
>
> At the moment, I am the only one pushing idlelib patches, except when it
> gets included in one of Serhiy's multi-module refactoring patches (and
> he always nosies me).  That skews my view a bit.  However, with most of
> the critical issues fixed, I am anxious to automate what I can of what I
> now do by hand.
>
> > Making sure code is exercised somehow by tests at least minimizes how
> > badly someone like me might mess something up thanks to me not realizing I
> > broke the code.
>
> I had not thought about the issue that way.  I should add a test_module
> for each remaining module, import the module, and at least create an
> instance of every tkinter widget defined therein, and see what other
> classes could be easily instantiated and what functions easily run.
>

That seems like a good starting point. Kind of like test_sundry but with
class instantiation on top of it.


> > Some practical issues with coverage and CodeCov:
>
> > 2. Some statements are only intended to run on certain systems, making
> > 100% coverage impossible unless one carefully puts all system-specific
> > code in "if system == 'xyz'" statements and uses system-specific
> > .coveragerc files to exclude code for 'other' systems.
>
> > True. We could have a discussion as to whether we want to use
> > Coverage.py's pragmas ... I'm sure we could discuss things with Ned
> > Batchelder if we needed some functionality in coverage.py for our needs).

Re: [python-committers] Codecov and PR

2017-04-26 Thread Terry Reedy

On 4/27/2017 1:52 AM, Ethan Furman wrote:

On 04/26/2017 10:35 PM, Terry Reedy wrote:

On 4/26/2017 1:45 PM, Brett Cannon wrote:


:) But they are at least executed which is what we're really
measuring here and, I think, all Ethan and I are advocating for.


I thought Ethan was advocating for more -- a specific unittest for 
each line.


I'm hoping that's just poor wording, or each module would need hundreds 
of tests.  ;)


Poor wording, but I was wrong in any case.  An aspirational goal is that 
each line of each function be exercised for correctness by at least 
one unittest.  For top-level code not in a function or class, we mostly 
have to settle for compile and run without error.



Re: [python-committers] Codecov and PR

2017-04-26 Thread Ethan Furman

On 04/26/2017 10:35 PM, Terry Reedy wrote:

On 4/26/2017 1:45 PM, Brett Cannon wrote:



:) But they are at least executed which is what we're really measuring here
and, I think, all Ethan and I are advocating for.


I thought Ethan was advocating for more -- a specific unittest for each line.


I'm hoping that's just poor wording, or each module would need hundreds of 
tests.  ;)

What I am advocating is to have 100% test coverage -- but as Brett said, that's an aspirational goal.  Said a different 
way:  all code in Python is important enough to test; whether we can write those tests with our limited resources is a 
different matter.


I am well aware that writing good tests is hard.  I am painfully aware of the frustration of having error-reporting code 
error out because it wasn't tested.


I agree that we shouldn't block core developers from merging because of coverage 
statistics.

--
~Ethan~


Re: [python-committers] Codecov and PR

2017-04-26 Thread Terry Reedy

On 4/26/2017 1:45 PM, Brett Cannon wrote:


On Tue, 25 Apr 2017 at 17:00 Terry Reedy wrote:



While I use code coverage to improve automated unittesting, I am opposed
to turning a usable but limited and sometimes faulty tool into a blind
robotic master that blocks improvements.  The prospect of this being
done has discouraged me from learning the new system.  (More on 'faulty
tool' later.)

It should be stated that code coverage is not a blocking status check 
for merging from our perspective (the **only** required check is that 
Travis pass with its test run).


I have the impression that at one time you hoped to make it blocking. 
If that was wrong, I apologize for misunderstanding.  If you have 
changed your mind, then I am happy.


I am otherwise in favor of both the measurement and report of coverage 
being improved.



The temptation to write artificial tests to satisfy an artificial goal
is real.  Doing so can eat valuable time better used for something else.
   For instance:

  def meth(self, arg):
  mod.inst.meth(arg, True, ob=self, kw='cut')

Mocking mod.class.meth, calling meth, and checking that the mock is
called will satisfy the robot, but does not contribute much to the goal
of providing a language that people can use to solve problems.


My assumption is that there will be a test that meth() does the right 
thing, mock or not. If we provide an API there should be some test for 
it; using a mock or something else to do the test is an implementation 
detail.


My impression is that default mocks have a generic signature, so that 
merely checking that the mock is called will not catch an invalid call. 
I presume that one can do better with mocks, and I have with custom 
mocks I have written, but my point above was that coverage does not care.
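For instance, a spec'ed mock enforces the real signature, so an invalid call
fails instead of passing silently.  A minimal sketch, with real_api as an
invented stand-in for the callable being mocked:

import unittest
from unittest import mock

def real_api(path, mode, *, encoding=None):
    "Stand-in for the real function whose signature we want enforced."
    return (path, mode, encoding)

class AutospecDemo(unittest.TestCase):
    def test_autospec_rejects_invalid_call(self):
        fake = mock.create_autospec(real_api)
        fake('settings.cfg', 'r')                  # matches the real signature
        fake.assert_called_once_with('settings.cfg', 'r')
        with self.assertRaises(TypeError):         # a plain MagicMock would accept this
            fake('settings.cfg')                   # missing required 'mode'

if __name__ == '__main__':
    unittest.main()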



>> If it's not important enough to require tests it's not
>> important enough to be in Python.  ;)

Modules in the test package are mostly not tested. ;)


:) But they are at least executed which is what we're really measuring 
here and, I think, all Ethan and I are advocating for.


I thought Ethan was advocating for more -- a specific unittest for each 
line.


E.g. I don't expect 
test_importlib to be directly responsible for exercising all code in 
importlib, just that Python's entire test suite exercise importlib as 
much as possible as a whole.


The advantage for importlib in this respect is that import statements 
cannot be mocked; only the objects imported, after importlib is finished.


There is lots of interaction between idlelib modules, but I would still 
like direct tests of each idlelib.xyz with a test_xyz.py.  Three years 
ago, there was no test.test_idle.  There now is, and it runs 35 
idlelib.idle_test.test_* files.  (There are 60 idlelib modules.)


The problem I have with just doing manual testing for things that can 
easily be covered by a unit test -- directly or indirectly -- is there 
are technically 85 people who can change CPython today. That means 
there's potentially 85 different people who can screw up the code ;) . 


At the moment, I am the only one pushing idlelib patches, except when it 
gets included in one of Serhiy's multi-module refactoring patches (and 
he always nosies me).  That skews my view a bit.  However, with most of 
the critical issues fixed, I am anxious to automate what I can of what I 
now do by hand.


Making sure code is exercised somehow by tests at least minimizes how 
badly someone like me might mess something up thanks to me not realizing I 
broke the code.


I had not thought about the issue that way.  I should add a test_module 
for each remaining module, import the module, and at least create an 
instance of every tkinter widget defined therein, and see what other 
classes could be easily instantiated and what functions easily run.
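A rough sketch of the shape such a test_xyz smoke test might take; the ttk
widgets below are stand-ins for the classes a real test would import from
idlelib.xyz.

import unittest
import tkinter as tk
from tkinter import ttk
from test.support import requires

class WidgetSmokeTest(unittest.TestCase):

    @classmethod
    def setUpClass(cls):
        requires('gui')          # skipped where the 'gui' resource is unavailable
        cls.root = tk.Tk()
        cls.root.withdraw()      # keep the window off-screen

    @classmethod
    def tearDownClass(cls):
        cls.root.destroy()
        del cls.root

    def test_widgets_instantiate(self):
        # The bar is deliberately low: does each widget construct at all?
        for widget_class in (ttk.Frame, ttk.Button, ttk.Entry):
            widget = widget_class(self.root)
            widget.destroy()

if __name__ == '__main__':
    unittest.main()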

Some practical issues with coverage and CodeCov:



2. Some statements are only intended to run on certain systems, making
100% coverage impossible unless one carefully puts all system-specific
code in "if system == 'xyz'" statements and uses system-specific
.coveragerc files to exclude code for 'other' systems.


True. We could have a discussion as to whether we want to use 
Coverage.py's pragmas ... I'm sure we could discuss things with Ned 
Batchelder if we needed some functionality in coverage.py for our needs).


Let's skip this for now.


3. Some tests required extended resources.  Statements that are only
covered by such tests will be seen as uncovered when coverage is run on
a system lacking the resources.  As far as I know, all non-Windows
buildbots and CodeCov are run on systems lacking the 'gui' resource.  So
patches to gui code will be seen as uncovered.

I view 100% coverage as aspirational, not attainable. But if we want an 
attainable goal, what should we aim for? We're at 83.44% now


On what system?  I suspect that Tkinter, ttk, turtle, and IDLE
GUI-dependent tests make at least a 2% difference on GUI Windows versus
no-GUI *nix.

Re: [python-committers] Codecov and PR

2017-04-26 Thread Brett Cannon
On Tue, 25 Apr 2017 at 17:00 Terry Reedy  wrote:

> On 4/25/2017 11:00 AM, Barry Warsaw wrote:
> > On Apr 24, 2017, at 09:32 PM, Ethan Furman wrote:
> >>On 04/21/2017 03:29 PM, Victor Stinner wrote:
>
> (In the context of having a patch blocked by the blind Codecov robot ...)
>
> >>> I dislike code coverage because there is a temptation to write artificial
> >>> tests whereas the code is tested indirectly or the code is not important
> >>> enough to *require* tests.
>
> While I use code coverage to improve automated unittesting, I am opposed
> to turning a usable but limited and sometimes faulty tool into a blind
> robotic master that blocks improvements.  The prospect of this being
> done has discouraged me from learning the new system.  (More on 'faulty
> tool' later.)
>

It should be stated that code coverage is not a blocking status check for
merging from our perspective (the **only** required check is that Travis
pass with its test run). The only reason Victor ran into it as a blocker
is because he tried to do a merge over his phone where GitHub apparently
prevents merging if any failing check exists, required or not (I assume
this is because reviewing code on a small screen raises the chances of
overlooking a failing check that one actually cared about).

I'm on vacation until tomorrow so I've been off of GitHub since last
Thursday, but it's possible there's already a PR out there to turn off the
status checks. If there is still no such PR I'm still fine with turning off
the status checks for Codecov.


>
> The temptation to write artificial tests to satisfy an artificial goal
> is real.  Doing so can eat valuable time better used for something else.
>   For instance:
>
>  def meth(self, arg):
>  mod.inst.meth(arg, True, ob=self, kw='cut')
>
> Mocking mod.class.meth, calling meth, and checking that the mock is
> called will satisfy the robot, but does not contribute much to the goal
> of providing a language that people can use to solve problems.
>

My assumption is that there will be a test that meth() does the right
thing, mock or not. If we provide an API there should be some test for it;
using a mock or something else to do the test is an implementation detail.


>
> Victor, can you explain 'tested indirectly' and perhaps give an example?
>
> As used here,'whereas' is incorrect English and a bit confusing.  I
> believe Victor meant something more like 'even when'.
>
> For the last clause, I believe he meant "the code change is not
> complicated enough to *require* automated unit test coverage for the
> changed line(s)".  If I change a comment or add missing spaces, I don't
> think I should be forced to write a missing test to make the improvement.
>

Agreed.


>
> A less trivial example: on IDLE's menu, Options => Configure IDLE opens
> a dialog with a font size widget that when clicked opens a list of font
> sizes.  I recently added 4 larger sizes to the tuple in
> idlelib.configdialog.ConfigDialog.LoadFontCfg, as requested, I think, by
> at least 2 people.  I tested manually by clicking until the list was
> displayed.  As I remember, I did not immediately worry about automated
> testing, let alone line coverage, and I do not think I should have had
> to in order to get the change into 3.6.0.
>
> That line may or may not be covered by the current minimal test that
> simply creates a ConfigDialog instance.  But this gets back to what I
> think is Victor's point.  This minimal test 'covers' 46% of the file,
> but it only tests that 46% of the lines run without raising.  This is
> useful, but does not test that the lines are really correct.  (For GUI
> display code, human eyeballing is required.)  This would remain true
> even if all the other code were moved to a new module, making the
> coverage of configdialog the magical ***100%***.
>
> >> If it's not important enough to require tests it's not important
> >> enough to be in Python.  ;)
>
> Modules in the test package are mostly not tested. ;)
>

:) But they are at least executed which is what we're really measuring here
and, I think, all Ethan and I are advocating for. E.g. I don't expect
test_importlib to be directly responsible for exercising all code in
importlib, just that Python's entire test suite exercise importlib as much
as possible as a whole.


>
> If 'test' means 'line coverage test for new or changed lines', then as a
> practical matter, I disagree, as explained above.  So, in effect, did
> the people who committed untested lines.
>
> In the wider sense of 'test', there is no real argument.  Each statement
> written should be mentally tested both when written and when reviewed.
> Code should be manually tested, preferably by someone in addition to the
> author. Automated testing is more than nice, but not everything.  Ditto
> for unit testing.
>

The problem I have with just doing manual testing for things that can
easily be covered by a unit test -- directly or indirectly -- is there are
technically 85 people who can change CPython today. That means there's
potentially 85 different people who can screw up the code ;) .

Re: [python-committers] Codecov and PR

2017-04-26 Thread Ethan Furman

On 04/25/2017 11:05 PM, Victor Stinner wrote:

On 04/25/2017 10:15 PM, INADA Naoki wrote:



`self.initfp()` is very unlikely to raise exceptions.  But MemoryError,
KeyboardInterrupt or
other rare exceptions may happen.


unittest.mock helps a lot to test such a corner case: mock initfp() with a
MemoryError side effect, maybe also close() to
check that the method was called... The new problem is that of the 3 instructions,
2 are mocked... Does the test now check the
test itself or the real application?


That application -- specifically, the except clause of the application.  There should be other tests for the non-except 
clause portions.


--
~Ethan~


Re: [python-committers] Codecov and PR

2017-04-26 Thread Ethan Furman

On 04/25/2017 10:15 PM, INADA Naoki wrote:


This is one example where I merged an "untested line of code".
https://github.com/python/cpython/pull/162/files#diff-0ad86c44e7866421ecaa5ad2c0edb0e2R552

+file_object = builtins.open(f, 'wb')
+try:
+self.initfp(file_object)
+except:
+file_object.close()
+raise

`self.initfp()` is very unlikely to raise exceptions.  But MemoryError,
KeyboardInterrupt or
other rare exceptions may happen.
The test didn't cover this except clause.  But I merged it because:

* this code is simple enough.
* I could write a test for it by mocking `self.initfp()` to raise an
  exception, but such test code has a significant maintenance cost.  I
  don't think this except clause is important enough to maintain such code.


I'm curious as to the maintenance cost -- surely it's write once, forget until you have to change that method again? 
How often are we changing that method?



If I remove the except clause, all lines are tested, but there is a
(very unlikely) leak of an unclosed file.  Which is "broken"?


Definitely the (unlikely) leaking of a file, possibly the untested closing of 
an otherwise leaked file.


Coverage is very useful information for finding code which should be
tested but is not.
But I don't think "all code should be tested".


All code should be tested.  Doesn't mean we can, but that should be the goal.


It may make it hard to improve Python.


Adding missing tests _is_ improving Python.

--
~Ethan~


Re: [python-committers] Codecov and PR

2017-04-25 Thread Victor Stinner
`self.initfp()` is very unlikely to raise exceptions.  But MemoryError,
KeyboardInterrupt or
other rare exceptions may happen.


unittest.mock helps a lot to test such a corner case: mock initfp() with a
MemoryError side effect, maybe also close() to check that the method was
called... The new problem is that of the 3 instructions, 2 are mocked... Does the
test now check the test itself or the real application?

Victor


Re: [python-committers] Codecov and PR

2017-04-25 Thread INADA Naoki
> While I use code coverage to improve automated unittesting, I am opposed to
> turning a usable but limited and sometimes faulty tool into a blind robotic
> master that blocks improvements.  The prospect of this being done has
> discouraged me from learning the new system.  (More on 'faulty tool' later.)
>
> The temptation to write artificial tests to satisfy an artificial goal is
> real.  Doing so can eat valuable time better used for something else.  For
> instance:
>
> def meth(self, arg):
> mod.inst.meth(arg, True, ob=self, kw='cut')
>
> Mocking mod.class.meth, calling meth, and checking that the mock is called
> will satisfy the robot, but does not contribute much to the goal of
> providing a language that people can use to solve problems.
>
> Victor, can you explain 'tested indirectly' and perhaps give an example?
>

This is one example where I merged an "untested line of code".
https://github.com/python/cpython/pull/162/files#diff-0ad86c44e7866421ecaa5ad2c0edb0e2R552

+file_object = builtins.open(f, 'wb')
+try:
+self.initfp(file_object)
+except:
+file_object.close()
+raise

`self.initfp()` is very unlikely to raise exceptions.  But MemoryError,
KeyboardInterrupt or
other rare exceptions may happen.
The test didn't cover this except clause.  But I merged it because:

* this code is simple enough.
* I could write a test for it by mocking `self.initfp()` to raise an
  exception, but such test code has a significant maintenance cost.  I
  don't think this except clause is important enough to maintain such code.
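For illustration, such a mock-based test might look roughly like this; Writer
is an invented, self-contained stand-in for the class touched in the PR.

import builtins
import unittest
from unittest import mock

class Writer:
    "Invented stand-in mirroring the pattern in the diff above."
    def __init__(self, f):
        file_object = builtins.open(f, 'wb')
        try:
            self.initfp(file_object)
        except:                      # bare except mirrors the quoted diff
            file_object.close()
            raise

    def initfp(self, file_object):
        self._file = file_object

class InitfpFailureTest(unittest.TestCase):
    def test_file_closed_when_initfp_raises(self):
        fake_file = mock.MagicMock()
        with mock.patch('builtins.open', return_value=fake_file), \
             mock.patch.object(Writer, 'initfp', side_effect=MemoryError):
            with self.assertRaises(MemoryError):
                Writer('dummy.dat')
        # The except clause ran and closed the (fake) file object.
        fake_file.close.assert_called_once_with()

if __name__ == '__main__':
    unittest.main()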

If I remove the except clause, all lines are tested, but there is a
(very unlikely) leak of an unclosed file.  Which is "broken"?

Coverage is very useful information for finding code which should be
tested but is not.
But I don't think "all code should be tested".
It may make it hard to improve Python.

So I agree with Victor and Terry.

Regards,


Re: [python-committers] Codecov and PR

2017-04-25 Thread Terry Reedy

On 4/25/2017 11:00 AM, Barry Warsaw wrote:

On Apr 24, 2017, at 09:32 PM, Ethan Furman wrote:

On 04/21/2017 03:29 PM, Victor Stinner wrote:


(In the context of having a patch blocked by the blind Codecov robot ...)


I dislike code coverage because there is a temptation to write artificial
tests whereas the code is tested indirectly or the code is not important
enough to *require* tests.


While I use code coverage to improve automated unittesting, I am opposed 
to turning a usable but limited and sometimes faulty tool into a blind 
robotic master that blocks improvements.  The prospect of this being 
done has discouraged me from learning the new system.  (More on 'faulty 
tool' later.)


The temptation to write artificial tests to satisfy an artificial goal 
is real.  Doing so can eat valuable time better used for something else. 
 For instance:


def meth(self, arg):
mod.inst.meth(arg, True, ob=self, kw='cut')

Mocking mod.class.meth, calling meth, and checking that the mock is 
called will satisfy the robot, but does not contribute much to the goal 
of providing a language that people can use to solve problems.
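Spelled out, such a robot-satisfying test might look like this; mod, inst, and
Thing are invented stand-ins so the sketch is self-contained.

import types
import unittest
from unittest import mock

mod = types.SimpleNamespace(
    inst=types.SimpleNamespace(meth=lambda arg, flag, ob=None, kw=None: None))

class Thing:
    def meth(self, arg):
        mod.inst.meth(arg, True, ob=self, kw='cut')

class ArtificialCoverageTest(unittest.TestCase):
    def test_meth_calls_inst(self):
        with mock.patch.object(mod, 'inst') as fake_inst:
            thing = Thing()
            thing.meth('x')
        # meth() is now "covered", but the assertion merely restates the call;
        # nothing checks that the real inst.meth accepts these arguments.
        fake_inst.meth.assert_called_once_with('x', True, ob=thing, kw='cut')

if __name__ == '__main__':
    unittest.main()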


Victor, can you explain 'tested indirectly' and perhaps give an example?

As used here, 'whereas' is incorrect English and a bit confusing.  I 
believe Victor meant something more like 'even when'.


For the last clause, I believe he meant "the code change is not 
complicated enough to *require* automated unit test coverage for the 
changed line(s)".  If I change a comment or add missing spaces, I don't 
think I should be forced to write a missing test to make the improvement.


A less trivial example: on IDLE's menu, Options => Configure IDLE opens 
a dialog with a font size widget that when clicked opens a list of font 
sizes.  I recently added 4 larger sizes to the tuple in 
idlelib.configdialog.ConfigDialog.LoadFontCfg, as requested, I think, by 
at least 2 people.  I tested manually by clicking until the list was 
displayed.  As I remember, I did not immediately worry about automated 
testing, let alone line coverage, and I do not think I should have had 
to in order to get the change into 3.6.0.


That line may or may not be covered by the current minimal test that
simply creates a ConfigDialog instance.  But this gets back to what I
think is Victor's point.  This minimal test 'covers' 46% of the file,
but it only tests that 46% of the lines run without raising.  This is 
useful, but does not test that the lines are really correct.  (For GUI 
display code, human eyeballing is required.)  This would remain true 
even if all the other code were moved to a new module, making the 
coverage of configdialog the magical ***100%***.



If it's not important enough to require tests it's not important enough to
be in Python.  ;)


Modules in the test package are mostly not tested. ;)

If 'test' means 'line coverage test for new or changed lines', then as a 
practical matter, I disagree, as explained above.  So, in effect, did 
the people who committed untested lines.


In the wider sense of 'test', there is no real argument.  Each statement 
written should be mentally tested both when written and when reviewed. 
Code should be manually tested, preferably by someone in addition to the 
author. Automated testing is more than nice, but not everything.  Ditto 
for unit testing.


Some practical issues with coverage and CodeCov:

1. A Python module is composed of statements but the coverage module counts 
physical lines.  This is good for development, but not for gating.  The 
number of physical lines comprising a statement can change without 
changing or with only trivially changing the compiled run code.  So if 
coverage is not 100%, it can vary without a real change in statement 
coverage.


2. Some statements are only intended to run on certain systems, making 
100% coverage impossible unless one carefully puts all system-specific 
code in "if system == 'xyz'" statements and uses system-specific 
.coveragerc files to exclude code for 'other' systems.


3. Some tests require extended resources.  Statements that are only 
covered by such tests will be seen as uncovered when coverage is run on 
a system lacking the resources.  As far as I know, all non-Windows 
buildbots and CodeCov are run on systems lacking the 'gui' resource.  So 
patches to gui code will be seen as uncovered.


4. As I explained in a post on the core-workflow list, IDLE needs the 
following added to the 'exclude_lines' item of .coveragerc.

.*# htest #
if not _utest:
The mechanism behind these would also be useful for testing any other 
modules, scripts, or demos that use tkinter GUIs.
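For readers unfamiliar with the idiom, here is a small invented example (not
actual idlelib code) of the kind of code those two exclude patterns are meant
to match.

import tkinter as tk

class SampleDialog(tk.Toplevel):
    def __init__(self, parent, _utest=False):
        super().__init__(parent)
        # ... widgets would be built here ...
        if not _utest:
            # Reached only when a human drives the dialog, never from unit
            # tests; the "if not _utest:" pattern excludes this block.
            self.wait_window(self)

def _sample_dialog(parent):  # htest #
    # Human-test helper; the ".*# htest #" pattern on the def line makes
    # coverage exclude the whole function body.
    SampleDialog(parent)

if __name__ == '__main__':
    root = tk.Tk()
    _sample_dialog(root)
    root.mainloop()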


There seem to be other issues too.


"Untested code is broken code" :)


Most of CPython, including IDLE, has been pretty thoroughly tested.  And
we have heaps of bug reports to show for it.  What's more important is
that even code that is tested, by whatever means, may still have bugs.
However, obscure bugs are still found.

Re: [python-committers] Codecov and PR

2017-04-25 Thread Barry Warsaw
On Apr 24, 2017, at 09:32 PM, Ethan Furman wrote:

>> I dislike code coverage because there is a temptation to write artificial
>> tests whereas the code is tested indirectly or the code is not important
>> enough to *require* tests.
>
>If it's not important enough to require tests it's not important enough to be
>in Python.  ;)

"Untested code is broken code" :)

-Barry


Re: [python-committers] Codecov and PR

2017-04-24 Thread Ethan Furman

On 04/21/2017 03:29 PM, Victor Stinner wrote:


I dislike code coverage because there is a temptation to write artificial tests 
whereas the code is tested indirectly or
the code is not important enough to *require* tests.


If it's not important enough to require tests it's not important enough to be 
in Python.  ;)

--
~Ethan~


Re: [python-committers] Codecov and PR

2017-04-22 Thread Victor Stinner
Thank you, I will take a look and see if I can help.

Victor

On 22 Apr 2017 at 6:43 PM, "Brett Cannon" wrote:

>
>
> On Fri, 21 Apr 2017 at 15:33 Victor Stinner 
> wrote:
>
>> Ah, I found a workaround: Firefox on Android has a "[x] See the computer
>> version" option which allows the merge!?
>>
>> Victor
>>
>> On 22 Apr 2017 at 12:29 AM, "Victor Stinner" wrote:
>>
>>> Hi,
>>>
>>> I tried to merge a pull request on my phone, but I got the error:
>>>
>>> "Pull requests that have a failing status can’t be merged on a phone."
>>>
>>
> Well that's annoying.
>
>
>>
>>> The GitHub PEP announced that it will be possible to merge a change from
>>> the beach. Well, it's doable but only if you bring a laptop, not a phone :-)
>>>
>>> All tests pass except Codecov which is unstable. On a computer, I can
>>> merge such PR.
>>>
>>> What is the status of Codecov? Is someone actively working on fixing it
>>> to make it more reliable?
>>>
>>
> https://github.com/python/core-workflow/issues/38 is tracking trying to
> fix the inconsistency problem and https://github.com/python/
> core-workflow/issues/18 is tracking getting a more complete coverage
> report.
>
> I would like to see this fixed and it's on my workflow todo list, but
> obviously that list is not short so I don't know when I will get to it
> (especially since at some point I will need to take a workflow break and
> actually write code and review PRs again ;) .
>
>
>> I dislike code coverage in general, even more when it's run in a CI.
>>>
>>> Can we change the merge policy to allow merge on a phone? Or can we fix
>>> Codecov?
>>>
>>
> We might be able to turn off the status check. The config file is
> https://github.com/python/cpython/blob/master/.codecov.yml and the docs
> are at https://docs.codecov.io/docs/codecov-yaml (I'm on vacation so I
> don't have time to look up the exact config change since I really shouldn't
> even be replying to this email ;) .
>
>
>>
>>> Note: the PR is
>>> https://github.com/python/cpython/pull/1237#pullrequestreview-34044762
>>>
>>> Codecov says "10% of diff hit (target: 100%)". The newly added code is
>>> tested on Windows on release build. Maybe Codecov only tests on Windows?
>>>
>>
> Codecov actually only runs under Travis, so it's only testing on Linux.
>
>
>>
>>> I dislike code coverage because there is a temptation to write artificial
>>> tests whereas the code is tested indirectly or the code is not important
>>> enough to *require* tests.
>>>
>>
> I personally disagree as code that isn't tested isn't checked that at
> least some semblance of backwards-compatibility is being kept. Now I'm not
> advocating we must have explicit tests for every line of code, but I think
> we should somehow exercise as much code as possible.
>


Re: [python-committers] Codecov and PR

2017-04-22 Thread Victor Stinner
Oh, I forgot something about Codecov: it took me 2 minutes to understand
why a PR gets the red icon whereas all tests pass and the merge button was
waiting for my click. In fact, Codecov failed but the test isn't blocking.
I would expect the green icon on the overall list of PRs.

Well, it's not blocking, just surprising the first time.

It may be annoying when you really want to list PRs where tests fail, for
example to reschedule Travis jobs if we fix a bug in master (which caused a
test failure).

Victor


Re: [python-committers] Codecov and PR

2017-04-22 Thread Brett Cannon
On Fri, 21 Apr 2017 at 15:33 Victor Stinner 
wrote:

> Ah, I found a workaround: Firefox on Android has a "[x] See the computer
> version" option which allows the merge!?
>
> Victor
>
> On 22 Apr 2017 at 12:29 AM, "Victor Stinner" wrote:
>
>> Hi,
>>
>> I tried to merge a pull request on my phone, but I got the error:
>>
>> "Pull requests that have a failing status can’t be merged on a phone."
>>
>
Well that's annoying.


>
>> The GitHub PEP announced that it will be possible to merge a change from
>> the beach. Well, it's doable but only if you bring a laptop, not a phone :-)
>>
>> All tests pass except Codecov which is unstable. On a computer, I can
>> merge such PR.
>>
>> What is the status of Codecov? Is someone actively working on fixing it
>> to make it more reliable?
>>
>
https://github.com/python/core-workflow/issues/38 is tracking trying to fix
the inconsistency problem and
https://github.com/python/core-workflow/issues/18 is tracking getting a
more complete coverage report.

I would like to see this fixed and it's on my workflow todo list, but
obviously that list is not short so I don't know when I will get to it
(especially since at some point I will need to take a workflow break and
actually write code and review PRs again ;) .


> I dislike code coverage in general, even more when it's run in a CI.
>>
>> Can we change the merge policy to allow merge on a phone? Or can we fix
>> Codecov?
>>
>
We might be able to turn off the status check. The config file is
https://github.com/python/cpython/blob/master/.codecov.yml and the docs are
at https://docs.codecov.io/docs/codecov-yaml (I'm on vacation so I don't
have time to look up the exact config change since I really shouldn't even
be replying to this email ;) .


>
>> Note: the PR is
>> https://github.com/python/cpython/pull/1237#pullrequestreview-34044762
>>
>> Codecov says "10% of diff hit (target: 100%)". The newly added code is
>> tested on Windows on release build. Maybe Codecov only tests on Windows?
>>
>
Codecov actually only runs under Travis, so it's only testing on Linux.


>
>> I dislike code coverage because there is a temptation to write artificial
>> tests whereas the code is tested indirectly or the code is not important
>> enough to *require* tests.
>>
>
I personally disagree as code that isn't tested isn't checked that at least
some semblance of backwards-compatibility is being kept. Now I'm not
advocating we must have explicit tests for every line of code, but I think
we should somehow exercise as much code as possible.


Re: [python-committers] Codecov and PR

2017-04-21 Thread Victor Stinner
Ah, I found a workaround: Firefox on Android has a "[x] See the computer
version" option which allows the merge!?

Victor

On 22 Apr 2017 at 12:29 AM, "Victor Stinner" wrote:

> Hi,
>
> I tried to merge a pull request on my phone, but I got the error:
>
> "Pull requests that have a failing status can’t be merged on a phone."
>
> The GitHub PEP announced that it will be possible to merge a change from
> the beach. Well, it's doable but only if you bring a laptop, not a phone :-)
>
> All tests pass except Codecov which is unstable. On a computer, I can
> merge such PR.
>
> What is the status of Codecov? Is someone actively working on fixing it to
> make it more reliable? I dislike code coverage in general, even more when
> it's run in a CI.
>
> Can we change the merge policy to allow merge on a phone? Or can we fix
> Codecov?
>
> Note: the PR is
> https://github.com/python/cpython/pull/1237#pullrequestreview-34044762
>
> Codecov says "10% of diff hit (target: 100%)". The newly added code is
> tested on Windows on release build. Maybe Codecov only tests on Windows?
>
> I dislike code coverage because there is a temptation to write artificial
> tests whereas the code is tested indirectly or the code is not important
> enough to *require* tests.
>
> Victor
>