[issue42958] filecmp.cmp(shallow=True) isn't actually shallow when only mtime differs

2021-08-04 Thread Łukasz Langa

Łukasz Langa  added the comment:

Thanks all for your effort on improving this! ✨ 🍰 ✨

--
assignee:  -> docs@python
components: +Documentation -Library (Lib)
nosy: +docs@python
resolution:  -> fixed
stage: patch review -> resolved
status: open -> closed
versions: +Python 3.10, Python 3.11, Python 3.9

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue42958] filecmp.cmp(shallow=True) isn't actually shallow when only mtime differs

2021-08-04 Thread Łukasz Langa

Łukasz Langa  added the comment:


New changeset 7dad0337518a0d599caf8f803a5bf45db67cbe9b by Miss Islington (bot) 
in branch '3.9':
bpo-42958: Improve description of shallow= in filecmp.cmp docs (GH-27166) 
(GH-27608)
https://github.com/python/cpython/commit/7dad0337518a0d599caf8f803a5bf45db67cbe9b


--




[issue42958] filecmp.cmp(shallow=True) isn't actually shallow when only mtime differs

2021-08-04 Thread miss-islington


miss-islington  added the comment:


New changeset c2593b4d06712cefcdeae93b32f88faa4772bc3a by Miss Islington (bot) 
in branch '3.10':
bpo-42958: Improve description of shallow= in filecmp.cmp docs (GH-27166)
https://github.com/python/cpython/commit/c2593b4d06712cefcdeae93b32f88faa4772bc3a


--




[issue42958] filecmp.cmp(shallow=True) isn't actually shallow when only mtime differs

2021-08-04 Thread miss-islington


Change by miss-islington :


--
pull_requests: +26103
pull_request: https://github.com/python/cpython/pull/27608




[issue42958] filecmp.cmp(shallow=True) isn't actually shallow when only mtime differs

2021-08-04 Thread Łukasz Langa

Łukasz Langa  added the comment:


New changeset a8dc4893d2b28827e82447326ea47759c161a722 by andrei kulakov in 
branch 'main':
bpo-42958: Improve description of shallow= in filecmp.cmp docs (GH-27166)
https://github.com/python/cpython/commit/a8dc4893d2b28827e82447326ea47759c161a722


--
message_count: 10.0 -> 11.0
nosy: +lukasz.langa, miss-islington
nosy_count: 4.0 -> 5.0
pull_requests: +26102
pull_request: https://github.com/python/cpython/pull/27607




[issue42958] filecmp.cmp(shallow=True) isn't actually shallow when only mtime differs

2021-07-15 Thread Andrei Kulakov


Andrei Kulakov  added the comment:

Alexander: sorry, I didn't notice your PR because I was first working on a 
different issue, then found this one as a duplicate, and only then noticed 
there's already an open PR here. If you prefer, we can close my PR and work on 
updating yours.

--




[issue42958] filecmp.cmp(shallow=True) isn't actually shallow when only mtime differs

2021-07-15 Thread Andrei Kulakov


Andrei Kulakov  added the comment:

I've put up the doc update PR here:
https://github.com/python/cpython/pull/27166

I've also reviewed a few dozen SO questions about filecmp.cmp and the shallow 
arg. Many of them find the docs confusing or lacking, so I think updating the 
docs is definitely useful.

I haven't seen any of them asking for "force shallow" behavior, but there are 
many of them and I haven't read them all; searching for "force shallow" doesn't 
find any.

I think it might be useful to also add a new 'force shallow' arg, but I'd like 
to see more people ask for it, or a list of SO questions asking for it, first; 
otherwise it might not be worth the additional complexity.

--




[issue42958] filecmp.cmp(shallow=True) isn't actually shallow when only mtime differs

2021-07-15 Thread Andrei Kulakov


Change by Andrei Kulakov :


--
pull_requests: +25702
pull_request: https://github.com/python/cpython/pull/27166




[issue42958] filecmp.cmp(shallow=True) isn't actually shallow when only mtime differs

2021-07-12 Thread Christof Hanke


Christof Hanke  added the comment:

Andrei,
cmp is the deep-compare part of filecmp. I thought we were talking about the 
shallow one.

Thus,
- shallow like rsync "quick": size + mtime.
- deep like cmp

--




[issue42958] filecmp.cmp(shallow=True) isn't actually shallow when only mtime differs

2021-07-12 Thread Andrei Kulakov


Andrei Kulakov  added the comment:

But rsync is a much more specific tool. For example, the Unix cmp command does 
not guess based on any part of the `stat` sig. That's a much closer command in 
scope to 'filecmp'.

--




[issue42958] filecmp.cmp(shallow=True) isn't actually shallow when only mtime differs

2021-07-12 Thread Christof Hanke

Christof Hanke  added the comment:

Hi Andrei,

I would follow rsync. 
From the man page:
"""
[...]
 -c, --checksum
  This changes the way rsync checks if the files have been changed and
  are in need of a transfer. Without this option, rsync uses a "quick
  check" that (by default) checks if each file’s size and time of last
  modification match between the sender and receiver.
[...]
"""

So yes, you can have false positives with a shallow comparison of size + mtime 
only. But that's usually ok for e.g. incremental backups. 
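
For illustration, a rsync-style "quick check" could be sketched as below. This is only a sketch under assumptions: quick_check is a hypothetical helper name, not part of filecmp and not rsync's actual code. It compares size and mtime and never opens either file.

```python
import os


def quick_check(f1, f2):
    """Hypothetical 'shallow only' comparison: size and mtime, never contents."""
    s1, s2 = os.stat(f1), os.stat(f2)
    return (s1.st_size, s1.st_mtime) == (s2.st_size, s2.st_mtime)
```

Such a check reports a match for two different files that happen to share size and mtime, and a mismatch for identical files whose mtimes differ, which is the trade-off described above.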


Wow, the bug is that old...

--




[issue42958] filecmp.cmp(shallow=True) isn't actually shallow when only mtime differs

2021-07-12 Thread Andrei Kulakov


Andrei Kulakov  added the comment:

Also see this 16-year-old issue: https://bugs.python.org/issue1234674

--




[issue42958] filecmp.cmp(shallow=True) isn't actually shallow when only mtime differs

2021-07-12 Thread Andrei Kulakov


Andrei Kulakov  added the comment:

Christof and Alexander, as you both said your preference is to have some way of 
enforcing "only shallow" compare, I want to clarify -- it seems if you do that, 
you will get the answer of True if the files are most likely the same (sig is 
equal), but if the sig is not equal, I'm not sure what answer you expect or 
why. The sig may be different because of mtime, and then the files may be 
different or the same, it's anyone's guess.

I wonder if both of you expect the same behavior, and if so, for the same use 
case or not?

--
nosy: +andrei.avk




[issue42958] filecmp.cmp(shallow=True) isn't actually shallow when only mtime differs

2021-07-12 Thread Christof Hanke


Christof Hanke  added the comment:

Hi,

this is also discussed in https://bugs.python.org/issue41354.

Ciao,
  Christof

--
nosy: +chanke




[issue42958] filecmp.cmp(shallow=True) isn't actually shallow when only mtime differs

2021-01-19 Thread Josh Rosenberg


Josh Rosenberg  added the comment:

This is a problem with the docstring. The actual docs for it are a bit more 
clear, https://docs.python.org/3/library/filecmp.html#filecmp.cmp :

"If shallow is true, files with identical os.stat() signatures are taken to be 
equal. Otherwise, the contents of the files are compared."

Your patch can't be used because it changes longstanding documented behavior. 
If you'd like to submit a patch to fix the docstring, that's fine, but we're 
not going to break existing code to make the function less accurate.

The patch should probably make the documentation more clear while it's at it.

1. The original wording could be misinterpreted as having the "Otherwise" apply 
to shallow=False only, not to the case where shallow=True but os.stat doesn't 
match.
2. The existing wording isn't clear on what an os.stat() "signature" is, which 
can leave the impression that the entirety of os.stat is compared (which would 
only match for hardlinks of the same file), when in fact it's just the file 
type (stat.S_IFMT(st.st_mode), file vs. directory vs. symlink, etc.), size and 
mtime.

Proposed rewording of main docs would be:

"If shallow is true, files with identical os.stat() signatures (file type, 
size, and modification time) are taken to be equal. When shallow is false, or 
the file signatures are not identical, the contents of the files are compared."

A similar wording (or at least, a shorter version of the above, rather than a 
strictly wrong description of the shallow parameter) could be applied to the 
docstring.
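
To make the signature concrete, here is a minimal sketch of the decision described above. It reflects my reading of the behavior, not the filecmp source itself; sig and will_read_contents are hypothetical names used only for illustration, and the regular-file and size checks are assumptions about how the function short-circuits.

```python
import os
import stat


def sig(st):
    """Reduce an os.stat() result to (file type, size, mtime)."""
    return (stat.S_IFMT(st.st_mode), st.st_size, st.st_mtime)


def will_read_contents(f1, f2, shallow=True):
    """Return True if a comparison following the rules above would have to
    read file contents (illustrative helper, not part of filecmp)."""
    s1, s2 = sig(os.stat(f1)), sig(os.stat(f2))
    if s1[0] != stat.S_IFREG or s2[0] != stat.S_IFREG:
        return False  # assumption: non-regular files are reported unequal
    if shallow and s1 == s2:
        return False  # identical signatures: taken to be equal
    if s1[1] != s2[1]:
        return False  # assumption: differing sizes settle it without reading
    return True  # same size, but shallow=False or signatures differ
```

With shallow=True, the only way to reach the last line is a signature mismatch in mtime alone, which is the case the report is about.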

--
nosy: +josh.r




[issue42958] filecmp.cmp(shallow=True) isn't actually shallow when only mtime differs

2021-01-18 Thread Alexander Vandenbulcke


Change by Alexander Vandenbulcke :


--
keywords: +patch
pull_requests: +23067
stage:  -> patch review
pull_request: https://github.com/python/cpython/pull/24246




[issue42958] filecmp.cmp(shallow=True) isn't actually shallow when only mtime differs

2021-01-18 Thread Alexander Vandenbulcke


New submission from Alexander Vandenbulcke :

Consider the case where 2 files are shallow compared and only the mtime 
differs (i.e. the mode and size are identical).

With filecmp.cmp(f1, f2, shallow=True) a deep compare is performed behind 
the scenes because the guard clauses fall through. This discrepancy is either a 
problem in the docstring or a problem in the comparison itself.

Three options remain:
- the documentation is altered and describes that, in case only the mtime 
differs (i.e. mode and size are equal) a deep compare is performed
- the behaviour is restricted in that effectively only a shallow compare is 
performed
- a shallow compare becomes more restrictive (do not consider the mtime during 
the compare)

My preference is to adjust the function to safeguard the meaning of 'shallow' 
in this context.
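
The case can be reproduced with a short script along these lines. It is a sketch: compare_with_shifted_mtime is a hypothetical helper, and the file names and timestamps are arbitrary.

```python
import filecmp
import os
import tempfile


def compare_with_shifted_mtime(content_a, content_b):
    """Write two files, force their mtimes apart, and shallow-compare them."""
    with tempfile.TemporaryDirectory() as tmp:
        f1 = os.path.join(tmp, "f1")
        f2 = os.path.join(tmp, "f2")
        for path, data in ((f1, content_a), (f2, content_b)):
            with open(path, "wb") as fh:
                fh.write(data)
        # Same mode and size, but mtimes 100 seconds apart.
        os.utime(f1, (1_000_000_000, 1_000_000_000))
        os.utime(f2, (1_000_000_100, 1_000_000_100))
        filecmp.clear_cache()
        return filecmp.cmp(f1, f2, shallow=True)


# Same contents, different mtime: the contents are read and found equal.
print(compare_with_shifted_mtime(b"same bytes", b"same bytes"))  # True
# Same size, different contents and mtime: the contents are read and differ.
print(compare_with_shifted_mtime(b"same bytes", b"SAME BYTES"))  # False
```

Both results depend on the file contents, not on the stat signature alone, even though shallow=True was passed.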

--
components: Library (Lib)
messages: 385225
nosy: AlexVndnblcke
priority: normal
severity: normal
status: open
title: filecmp.cmp(shallow=True) isn't actually shallow when only mtime differs
type: behavior
