D3964: macosx: fixing macOS version generation after db9d1dd01bf0
glandium added a comment. This is python code, why is it parsing the version file instead of importing it? REPOSITORY rHG Mercurial REVISION DETAIL https://phab.mercurial-scm.org/D3964 To: rdamazio, #hg-reviewers Cc: glandium, mercurial-devel ___ Mercurial-devel mailing list Mercurial-devel@mercurial-scm.org https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel
D3958: Allow to run setup.py with python 3 without a mercurial checkout
glandium created this revision.
glandium added a reviewer: indygreg.
Herald added a subscriber: mercurial-devel.
Herald added a reviewer: hg-reviewers.

REVISION SUMMARY
  Some people may want to test mercurial in a python 3 environment through
  e.g. pip, in which case setup.py doesn't run in a mercurial checkout, so
  the check in setup.py that only allows python 3 from a source checkout
  cannot be satisfied. This change allows a manual override with the
  HGPYTHON3 environment variable.

  Additionally, when for some reason the version is unknown (for crazy
  people like me, who have a git checkout of the mercurial repo), the
  version variable ends up being a unicode string, which fails the
  `isinstance(version, bytes)` assertion. So fix that at the same time.

REPOSITORY
  rHG Mercurial

REVISION DETAIL
  https://phab.mercurial-scm.org/D3958

AFFECTED FILES
  setup.py

CHANGE DETAILS

diff --git a/setup.py b/setup.py
--- a/setup.py
+++ b/setup.py
@@ -74,7 +74,7 @@
     badpython = True

 # Allow Python 3 from source checkouts.
-if os.path.isdir('.hg'):
+if os.path.isdir('.hg') or 'HGPYTHON3' in os.environ:
     badpython = False

 if badpython:
@@ -369,7 +369,7 @@
         from mercurial import __version__
         version = __version__.version
     except ImportError:
-        version = 'unknown'
+        version = b'unknown'
     finally:
         if oldpolicy is None:
             del os.environ['HGMODULEPOLICY']

To: glandium, indygreg, #hg-reviewers
Cc: mercurial-devel
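The two behaviours touched by D3958 can be tried in isolation. The following is only a sketch, not the real setup.py: `python3_allowed`, `resolve_version` and `failing_import` are hypothetical helpers made up for illustration, and only the `HGPYTHON3` variable, the `.hg` check and the `b'unknown'` fallback come from the revision itself.

```python
import os


def python3_allowed(environ, in_hg_checkout):
    """Mirror the patched condition: a source checkout or an explicit opt-in."""
    return in_hg_checkout or 'HGPYTHON3' in environ


def resolve_version(importer):
    """Return the version as bytes, falling back to b'unknown'.

    `importer` stands in for "from mercurial import __version__".
    """
    try:
        version = importer()
    except ImportError:
        # A plain 'unknown' literal is a unicode str on Python 3 and would
        # fail the isinstance(version, bytes) check below.
        version = b'unknown'
    assert isinstance(version, bytes)
    return version


def failing_import():
    """Simulate a tree without a generated mercurial/__version__.py."""
    raise ImportError("no mercurial.__version__ available")


if __name__ == '__main__':
    # Evaluate the condition for the current process, as setup.py would.
    print(python3_allowed(os.environ, os.path.isdir('.hg')))
    print(resolve_version(failing_import))
```

With this shape, something like `HGPYTHON3=1 pip install .` would pass the gate even though no `.hg` directory exists, which is the pip scenario the summary describes.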
D2057: rust implementation of hg status
glandium added a comment. Doesn't mononoke have code to read revlogs already? REPOSITORY rHG Mercurial REVISION DETAIL https://phab.mercurial-scm.org/D2057 To: Ivzhh, #hg-reviewers Cc: glandium, krbullock, indygreg, durin42, kevincox, mercurial-devel
D1846: rust: avoid redundant 'static lifetime
glandium added a comment. > but it's okay to depend on backports or rustup to /build/ packages The only exception where it's okay is, essentially, Firefox. For that, a backport of the rust compiler lands in stable about once a year, and even then it doesn't replace the version already in stable. See the gcc-mozilla package in Debian wheezy. REPOSITORY rHG Mercurial REVISION DETAIL https://phab.mercurial-scm.org/D1846 To: indygreg, #hg-reviewers, durin42 Cc: glandium, durin42, yuja, mercurial-devel
D1846: rust: avoid redundant 'static lifetime
glandium added a comment. > To my understanding, as long as we're only using the stable channel, we should be fine for the binaries we're building being packageable even on slower-moving distros like Debian. Slow-moving distros like Debian don't update the rust compiler. Debian stable is stuck on rustc 1.14 until Debian buster (next year?). REPOSITORY rHG Mercurial REVISION DETAIL https://phab.mercurial-scm.org/D1846 To: indygreg, #hg-reviewers, durin42 Cc: glandium, durin42, yuja, mercurial-devel
D329: setup: Fix installing in a mingw environment
This revision was automatically updated to reflect the committed changes. Closed by commit rHG7686cbb0ba41: setup: fix installing in a mingw environment (authored by glandium). REPOSITORY rHG Mercurial CHANGES SINCE LAST UPDATE https://phab.mercurial-scm.org/D329?vs=751&id=828 REVISION DETAIL https://phab.mercurial-scm.org/D329 AFFECTED FILES setup.py CHANGE DETAILS diff --git a/setup.py b/setup.py --- a/setup.py +++ b/setup.py @@ -784,11 +784,11 @@ from distutils import cygwinccompiler # the -mno-cygwin option has been deprecated for years -compiler = cygwinccompiler.Mingw32CCompiler +mingw32compilerclass = cygwinccompiler.Mingw32CCompiler class HackedMingw32CCompiler(cygwinccompiler.Mingw32CCompiler): def __init__(self, *args, **kwargs): -compiler.__init__(self, *args, **kwargs) +mingw32compilerclass.__init__(self, *args, **kwargs) for i in 'compiler compiler_so linker_exe linker_so'.split(): try: getattr(self, i).remove('-mno-cygwin') @@ -809,11 +809,11 @@ # effect. from distutils import msvccompiler -compiler = msvccompiler.MSVCCompiler +msvccompilerclass = msvccompiler.MSVCCompiler class HackedMSVCCompiler(msvccompiler.MSVCCompiler): def initialize(self): -compiler.initialize(self) +msvccompilerclass.initialize(self) # "warning LNK4197: export 'func' specified multiple times" self.ldflags_shared.append('/ignore:4197') self.ldflags_shared_debug.append('/ignore:4197') To: glandium, #hg-reviewers, quark Cc: durin42, mercurial-devel ___ Mercurial-devel mailing list Mercurial-devel@mercurial-scm.org https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel
D329: setup: Fix installing in a mingw environment
glandium added a comment. 4.3.1 can't currently be installed with MINGW64 python because of this, so I'd say yes. REPOSITORY rHG Mercurial REVISION DETAIL https://phab.mercurial-scm.org/D329 To: glandium, #hg-reviewers, quark Cc: durin42, mercurial-devel
D330: Backed out changeset c34532365b38
This revision was automatically updated to reflect the committed changes. Closed by commit rHG1814ca418b30: branchmap: revert c34532365b38 for Python 2.7 compatibility (authored by glandium). REPOSITORY rHG Mercurial CHANGES SINCE LAST UPDATE https://phab.mercurial-scm.org/D330?vs=752&id=755 REVISION DETAIL https://phab.mercurial-scm.org/D330 AFFECTED FILES mercurial/branchmap.py CHANGE DETAILS diff --git a/mercurial/branchmap.py b/mercurial/branchmap.py --- a/mercurial/branchmap.py +++ b/mercurial/branchmap.py @@ -406,7 +406,8 @@ # fast path: extract data from cache, use it if node is matching reponode = changelog.node(rev)[:_rbcnodelen] -cachenode, branchidx = unpack_from(_rbcrecfmt, self._rbcrevs, rbcrevidx) +cachenode, branchidx = unpack_from( +_rbcrecfmt, util.buffer(self._rbcrevs), rbcrevidx) close = bool(branchidx & _rbccloseflag) if close: branchidx &= _rbcbranchidxmask To: glandium, #hg-reviewers, quark, indygreg Cc: indygreg, quark, mercurial-devel ___ Mercurial-devel mailing list Mercurial-devel@mercurial-scm.org https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel
D330: Backed out changeset c34532365b38
glandium created this revision.
Herald added a subscriber: mercurial-devel.
Herald added a reviewer: hg-reviewers.

REVISION SUMMARY
  Old versions of python 2.7 do not accept a bytearray as the second
  argument to struct.unpack_from, so the change that removed the
  util.buffer wrapper around that argument in branchmap broke running on
  those older versions of python 2.7.

REPOSITORY
  rHG Mercurial

REVISION DETAIL
  https://phab.mercurial-scm.org/D330

AFFECTED FILES
  mercurial/branchmap.py

CHANGE DETAILS

diff --git a/mercurial/branchmap.py b/mercurial/branchmap.py
--- a/mercurial/branchmap.py
+++ b/mercurial/branchmap.py
@@ -406,7 +406,8 @@

         # fast path: extract data from cache, use it if node is matching
         reponode = changelog.node(rev)[:_rbcnodelen]
-        cachenode, branchidx = unpack_from(_rbcrecfmt, self._rbcrevs, rbcrevidx)
+        cachenode, branchidx = unpack_from(
+            _rbcrecfmt, util.buffer(self._rbcrevs), rbcrevidx)
         close = bool(branchidx & _rbccloseflag)
         if close:
             branchidx &= _rbcbranchidxmask

To: glandium, #hg-reviewers
Cc: mercurial-devel
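The pattern restored by this backout, wrapping the bytearray before handing it to `struct.unpack_from`, can be sketched on its own. This is an illustration for Python 3, where `memoryview` stands in for what `util.buffer` does in Mercurial. The record format `'>4sI'` and the flag and mask values are assumptions chosen for the example, not values taken from the diff, and `readrecord` is a hypothetical helper.

```python
import struct

# Assumed record layout: 4-byte node prefix + 32-bit big-endian branch index.
RECFMT = '>4sI'
RECSIZE = struct.calcsize(RECFMT)
CLOSEFLAG = 0x80000000      # assumed "branch closed" bit
BRANCHIDXMASK = 0x7fffffff  # assumed mask for the branch index proper


def readrecord(revs, rev):
    """Decode record number `rev` from a bytearray-backed cache."""
    # The wrapper exposes the bytearray through the buffer protocol without
    # copying it, which is the role util.buffer plays in the patch.
    view = memoryview(revs)
    node, branchidx = struct.unpack_from(RECFMT, view, rev * RECSIZE)
    close = bool(branchidx & CLOSEFLAG)
    if close:
        branchidx &= BRANCHIDXMASK
    return node, branchidx, close


cache = bytearray()
cache += struct.pack(RECFMT, b'\x12\x34\x56\x78', 3)
cache += struct.pack(RECFMT, b'\xab\xcd\xef\x01', 5 | CLOSEFLAG)

print(readrecord(cache, 0))
print(readrecord(cache, 1))
```

On current Python versions `unpack_from` accepts the bytearray directly, so the wrapper is harmless there; it only matters for the older 2.7 releases the summary mentions.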
D328: setup: Fix installing in a mingw environment
glandium added a comment. Sorry, I rebased to stable, and that created a new differential: https://phab.mercurial-scm.org/D329, even though I updated the local tag. REPOSITORY rHG Mercurial REVISION DETAIL https://phab.mercurial-scm.org/D328 To: glandium, #hg-reviewers, quark Cc: quark, mercurial-devel
D329: setup: Fix installing in a mingw environment
glandium created this revision. Herald added a subscriber: mercurial-devel. Herald added a reviewer: hg-reviewers. REVISION SUMMARY The addition, in https://phab.mercurial-scm.org/rHG9a4adc76c88a1a217983f051766b3009c0bca3aa, of a hack for the MSVC compiler class was overwriting the original class for the Mingw32CCompiler class, leading to an error when the HackedMingw32CCompiler is instantiated. REPOSITORY rHG Mercurial REVISION DETAIL https://phab.mercurial-scm.org/D329 AFFECTED FILES setup.py CHANGE DETAILS diff --git a/setup.py b/setup.py --- a/setup.py +++ b/setup.py @@ -784,11 +784,11 @@ from distutils import cygwinccompiler # the -mno-cygwin option has been deprecated for years -compiler = cygwinccompiler.Mingw32CCompiler +mingw32compilerclass = cygwinccompiler.Mingw32CCompiler class HackedMingw32CCompiler(cygwinccompiler.Mingw32CCompiler): def __init__(self, *args, **kwargs): -compiler.__init__(self, *args, **kwargs) +mingw32compilerclass.__init__(self, *args, **kwargs) for i in 'compiler compiler_so linker_exe linker_so'.split(): try: getattr(self, i).remove('-mno-cygwin') @@ -809,11 +809,11 @@ # effect. from distutils import msvccompiler -compiler = msvccompiler.MSVCCompiler +msvccompilerclass = msvccompiler.MSVCCompiler class HackedMSVCCompiler(msvccompiler.MSVCCompiler): def initialize(self): -compiler.initialize(self) +msvccompilerclass.initialize(self) # "warning LNK4197: export 'func' specified multiple times" self.ldflags_shared.append('/ignore:4197') self.ldflags_shared_debug.append('/ignore:4197') To: glandium, #hg-reviewers Cc: mercurial-devel ___ Mercurial-devel mailing list Mercurial-devel@mercurial-scm.org https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel
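The bug described in the summary is a late-binding problem: the method body looks up the module-level name `compiler` when it is called, not when the class is defined, so a later rebinding of that name changes which base class gets initialized. A minimal sketch with stand-in classes follows; `MingwBase`, `MsvcBase`, `BuggyMingw` and `FixedMingw` are hypothetical names, not distutils classes.

```python
class MingwBase:
    """Stand-in for cygwinccompiler.Mingw32CCompiler."""
    def __init__(self):
        self.flags = ['-mno-cygwin', '-O2']


class MsvcBase:
    """Stand-in for msvccompiler.MSVCCompiler."""
    def __init__(self):
        self.ldflags = []


# Before the fix: one shared module-level name.
compiler = MingwBase


class BuggyMingw(MingwBase):
    def __init__(self):
        # `compiler` is resolved at call time, after the rebinding below.
        compiler.__init__(self)
        self.flags.remove('-mno-cygwin')


# The MSVC hack added later reuses the same name and overwrites it.
compiler = MsvcBase

# After the fix: each hack keeps its own, distinct name.
mingw32compilerclass = MingwBase


class FixedMingw(MingwBase):
    def __init__(self):
        mingw32compilerclass.__init__(self)
        self.flags.remove('-mno-cygwin')


try:
    BuggyMingw()
except AttributeError as exc:
    print("buggy:", exc)
print("fixed:", FixedMingw().flags)
```

Giving each saved base class its own name, as the patch does, removes the coupling between the two hacks.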
D328: setup: Fix installing in a mingw environment
glandium created this revision. Herald added a subscriber: mercurial-devel. Herald added a reviewer: hg-reviewers. REVISION SUMMARY The addition, in https://phab.mercurial-scm.org/rHG9a4adc76c88a1a217983f051766b3009c0bca3aa, of a hack for the MSVC compiler class was overwriting the original class for the Mingw32CCompiler class, leading to an error when the HackedMingw32CCompiler is instantiated. REPOSITORY rHG Mercurial REVISION DETAIL https://phab.mercurial-scm.org/D328 AFFECTED FILES setup.py CHANGE DETAILS diff --git a/setup.py b/setup.py --- a/setup.py +++ b/setup.py @@ -784,11 +784,11 @@ from distutils import cygwinccompiler # the -mno-cygwin option has been deprecated for years -compiler = cygwinccompiler.Mingw32CCompiler +mingw32_compiler_class = cygwinccompiler.Mingw32CCompiler class HackedMingw32CCompiler(cygwinccompiler.Mingw32CCompiler): def __init__(self, *args, **kwargs): -compiler.__init__(self, *args, **kwargs) +mingw32_compiler_class.__init__(self, *args, **kwargs) for i in 'compiler compiler_so linker_exe linker_so'.split(): try: getattr(self, i).remove('-mno-cygwin') @@ -809,11 +809,11 @@ # effect. from distutils import msvccompiler -compiler = msvccompiler.MSVCCompiler +msvc_compiler_class = msvccompiler.MSVCCompiler class HackedMSVCCompiler(msvccompiler.MSVCCompiler): def initialize(self): -compiler.initialize(self) +msvc_compiler_class.initialize(self) # "warning LNK4197: export 'func' specified multiple times" self.ldflags_shared.append('/ignore:4197') self.ldflags_shared_debug.append('/ignore:4197') To: glandium, #hg-reviewers Cc: mercurial-devel ___ Mercurial-devel mailing list Mercurial-devel@mercurial-scm.org https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel
Re: A possible explanation for random "stream ended unexpectedly (got m bytes, expected n)"
On Sat, Mar 25, 2017 at 06:24:26PM -0700, Jun Wu wrote: > May I ask what version of Python are you using? If it's < 2.7.4, the EINTR > issue is expected. 2.7.13. > > Excerpts from Mike Hommey's message of 2017-03-26 09:08:25 +0900: > > On Sat, Mar 25, 2017 at 10:34:02AM -0700, Gregory Szorc wrote: > > > On Sat, Mar 25, 2017 at 4:19 AM, Mike Hommey wrote: > > > > > > > Hi, > > > > > > > > I don't know about you, but occasionally, I've hit "stream ended > > > > unexpectedly (got m bytes, expected n)" errors that didn't make sense. > > > > Retrying would always work. > > > > > > > > Recently, I was trying to use signal.setitimer and a signal handler for > > > > some memory profiling on git-cinnabar, which uses mercurial as a > > > > library, and got "stream ended 4 unexpectedly (got m bytes, expected n)" > > > > *very* reproducibly. Like, with an interval timer firing every second, > > > > it took only a few seconds to hit the error during a clone. > > > > > > > > I'm pretty sure this can be reproduced with a similar setup in mercurial > > > > itself. > > > > > > > > Now, the reason this happens in this case is that, the code that fails > > > > does: > > > > > > > > def readexactly(stream, n): > > > > '''read n bytes from stream.read and abort if less was available''' > > > > s = stream.read(n) > > > > if len(s) < n: > > > > raise error.Abort(_("stream ended unexpectedly" > > > >" (got %d bytes, expected %d)") > > > > % (len(s), n)) > > > > return s > > > > > > > > ... and thanks to POSIX, interrupted reads can lead to short reads. So, > > > > you request n bytes, and get less, just because something interrupted > > > > the process. The problem then is that python doesn't let you know why > > > > you just got a short read, and you have to figure that out on your own. > > > > > > > > The same kind of problem is also true to some extent on writes. 
> > > > > > > > Now, the problem is that this piece of code is likely the most visible > > > > place where the issue exists, but there are many other places in the > > > > mercurial code base that are likely affected. > > > > > > > > And while the signal.setitimer case is a corner case (and I ended up > > > > using a separate thread to work around the problem ; my code wasn't > > > > interruption safe either anyways), I wonder if those random "stream > > > > ended unexpectedly (got m bytes, expected n)" errors I was getting under > > > > normal circumstances are not just a manifestation of the same underlying > > > > issue, which is that the code doesn't like interrupted reads. > > > > > > > > Disclaimer: I'm not going to work on fixing this issue, but I figured > > > > I'd let you know, in case someone wants to look into it more deeply. > > > > > > > > > > Thank you for writing this up. This "stream ended unexpectedly" has been a > > > thorn in my side for a while, as it comes up frequently in Mozilla's CI > > > with a frequency somewhere between 1 in 100-1000. Even retrying failed > > > operations multiple times isn't enough to overcome it > > > > > > I have long suspected interrupted system calls as a likely culprit. > > > However, when I initially investigated this a few months ago, I found that > > > Python's I/O APIs retry automatically for EINTR. See > > > https://hg.python.org/cpython/file/54c93e0fe79b/Lib/socket.py#l365 for > > > example. This /should/ make e.g. socket._fileobject.read() resilient > > > against signal interruption. (If Python's I/O APIs didn't do this, tons of > > > programs would break. Also, the semantics of .read() are such that it is > > > always supposed to retrieve all available bytes until EOF - at least for > > > some io ABCs. read1() exists to perform at most 1 system call.) 
> > > > Note that EINTR is not the only way read() can end from interruption: > > > >If a read() is interrupted by a signal before it reads any data, it > >shall return -1 with errno set to [EINTR]. > > > >If a rea
Re: A possible explanation for random "stream ended unexpectedly (got m bytes, expected n)"
On Sat, Mar 25, 2017 at 10:34:02AM -0700, Gregory Szorc wrote: > On Sat, Mar 25, 2017 at 4:19 AM, Mike Hommey wrote: > > > Hi, > > > > I don't know about you, but occasionally, I've hit "stream ended > > unexpectedly (got m bytes, expected n)" errors that didn't make sense. > > Retrying would always work. > > > > Recently, I was trying to use signal.setitimer and a signal handler for > > some memory profiling on git-cinnabar, which uses mercurial as a > > library, and got "stream ended 4 unexpectedly (got m bytes, expected n)" > > *very* reproducibly. Like, with an interval timer firing every second, > > it took only a few seconds to hit the error during a clone. > > > > I'm pretty sure this can be reproduced with a similar setup in mercurial > > itself. > > > > Now, the reason this happens in this case is that, the code that fails > > does: > > > > def readexactly(stream, n): > > '''read n bytes from stream.read and abort if less was available''' > > s = stream.read(n) > > if len(s) < n: > > raise error.Abort(_("stream ended unexpectedly" > >" (got %d bytes, expected %d)") > > % (len(s), n)) > > return s > > > > ... and thanks to POSIX, interrupted reads can lead to short reads. So, > > you request n bytes, and get less, just because something interrupted > > the process. The problem then is that python doesn't let you know why > > you just got a short read, and you have to figure that out on your own. > > > > The same kind of problem is also true to some extent on writes. > > > > Now, the problem is that this piece of code is likely the most visible > > place where the issue exists, but there are many other places in the > > mercurial code base that are likely affected. 
> > > > And while the signal.setitimer case is a corner case (and I ended up > > using a separate thread to work around the problem ; my code wasn't > > interruption safe either anyways), I wonder if those random "stream > > ended unexpectedly (got m bytes, expected n)" errors I was getting under > > normal circumstances are not just a manifestation of the same underlying > > issue, which is that the code doesn't like interrupted reads. > > > > Disclaimer: I'm not going to work on fixing this issue, but I figured > > I'd let you know, in case someone wants to look into it more deeply. > > > > Thank you for writing this up. This "stream ended unexpectedly" has been a > thorn in my side for a while, as it comes up frequently in Mozilla's CI > with a frequency somewhere between 1 in 100-1000. Even retrying failed > operations multiple times isn't enough to overcome it > > I have long suspected interrupted system calls as a likely culprit. > However, when I initially investigated this a few months ago, I found that > Python's I/O APIs retry automatically for EINTR. See > https://hg.python.org/cpython/file/54c93e0fe79b/Lib/socket.py#l365 for > example. This /should/ make e.g. socket._fileobject.read() resilient > against signal interruption. (If Python's I/O APIs didn't do this, tons of > programs would break. Also, the semantics of .read() are such that it is > always supposed to retrieve all available bytes until EOF - at least for > some io ABCs. read1() exists to perform at most 1 system call.) Note that EINTR is not the only way read() can end from interruption: If a read() is interrupted by a signal before it reads any data, it shall return -1 with errno set to [EINTR]. If a read() is interrupted by a signal after it has successfully read some data, it shall return the number of bytes read. From http://pubs.opengroup.org/onlinepubs/9699919799/functions/read.html But that's POSIX, Windows is another story. Recv is different too. 
On Sat, Mar 25, 2017 at 12:00:42PM -0700, Gregory Szorc wrote: > Can you please provide more detailed steps to reproduce? > > I added the following code at the top of exchange.pull: > > def sighandler(sig, stack): > pass > > import signal > signal.signal(signal.SIGALRM, sighandler) > signal.setitimer(signal.ITIMER_REAL, 1.0, 1.0) > > However, I was unable to reproduce the "stream ended unexpectedly" failure > when cloning a Firefox repo from hg.mozilla.org. And I even tried with the > interval set to 1ms. So, I tried to reproduce again with my original testcase, and failed to. In fact, instead I was getting urllib2.URLError: err
A possible explanation for random "stream ended unexpectedly (got m bytes, expected n)"
Hi,

I don't know about you, but occasionally, I've hit "stream ended
unexpectedly (got m bytes, expected n)" errors that didn't make sense.
Retrying would always work.

Recently, I was trying to use signal.setitimer and a signal handler for
some memory profiling on git-cinnabar, which uses mercurial as a
library, and got "stream ended unexpectedly (got m bytes, expected n)"
*very* reproducibly. Like, with an interval timer firing every second,
it took only a few seconds to hit the error during a clone.

I'm pretty sure this can be reproduced with a similar setup in mercurial
itself.

Now, the reason this happens in this case is that the code that fails
does:

    def readexactly(stream, n):
        '''read n bytes from stream.read and abort if less was available'''
        s = stream.read(n)
        if len(s) < n:
            raise error.Abort(_("stream ended unexpectedly"
                               " (got %d bytes, expected %d)")
                              % (len(s), n))
        return s

... and thanks to POSIX, interrupted reads can lead to short reads. So,
you request n bytes, and get less, just because something interrupted
the process. The problem then is that python doesn't let you know why
you just got a short read, and you have to figure that out on your own.

The same kind of problem is also true, to some extent, of writes.

Now, the problem is that this piece of code is likely the most visible
place where the issue exists, but there are many other places in the
mercurial code base that are likely affected.

And while the signal.setitimer case is a corner case (and I ended up
using a separate thread to work around the problem; my code wasn't
interruption safe either anyways), I wonder if those random "stream
ended unexpectedly (got m bytes, expected n)" errors I was getting under
normal circumstances are not just a manifestation of the same underlying
issue, which is that the code doesn't like interrupted reads.

Disclaimer: I'm not going to work on fixing this issue, but I figured
I'd let you know, in case someone wants to look into it more deeply.
Cheers,

Mike
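One way to make a helper like readexactly tolerate short reads is to loop until either n bytes have arrived or the stream reports a real end of file. The sketch below is not a patch against Mercurial: it raises EOFError instead of error.Abort, it assumes a blocking stream whose read() returns an empty bytes object only at EOF, and `ShortReadStream` is a hypothetical test double that simulates interrupted reads.

```python
import io


class ShortReadStream:
    """Hypothetical stream returning at most `chunk` bytes per read()."""

    def __init__(self, data, chunk):
        self._buf = io.BytesIO(data)
        self._chunk = chunk

    def read(self, n):
        return self._buf.read(min(n, self._chunk))


def readexactly(stream, n):
    """Read n bytes, retrying after short reads; fail only on real EOF."""
    parts = []
    remaining = n
    while remaining > 0:
        s = stream.read(remaining)
        if not s:
            # Empty result: assumed to mean end of stream, not interruption.
            break
        parts.append(s)
        remaining -= len(s)
    data = b''.join(parts)
    if len(data) < n:
        raise EOFError("stream ended unexpectedly"
                       " (got %d bytes, expected %d)" % (len(data), n))
    return data


print(readexactly(ShortReadStream(b'abcdefghij', 3), 8))
```

Whether a loop like this is appropriate everywhere depends on the stream type, which is part of why the message above notes that many other call sites may be affected.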
Re: hg-git and round-tripping (and file copies?)
On Thu, Mar 16, 2017 at 01:38:18PM -0700, Gregory Szorc wrote: > On Thu, Mar 16, 2017 at 1:05 PM, Danek Duvall > wrote: > > > In trying to convert > > > > https://hg.java.net/hg/solaris-userland~gate > > > > to a git repo and back, I'm seeing issues at changeset 34, where the hash > > changes for reasons I can't see. If I do a diff of the debug log, I see > > it's due to the manifest: > > > > $ diff -u =(hg log -R userland-more --debug -r 34) =(hg log -R > > userland-more.hgagain --debug -r 34 | grep -v "^phase:") > > --- /tmp/zshhHyEIb 2017-03-16 11:37:57.601340643 -0700 > > +++ /tmp/zshlyqHbd 2017-03-16 11:37:57.793642372 -0700 > > @@ -1,12 +1,10 @@ > > no terminfo entry for sitm > > -changeset: 34:d20b10eba31725ad8954aa6d20374da512f0e636 > > -tag: build-149 > > +changeset: 34:2ccb817b85926f410df2a6bd23000265805088df > > parent: 33:371c8e56136d19872ae7db8d273f9de78c8fa783 > > parent: -1: > > -manifest:34:e031f26e68549dadb3dfb4705d429c75622a58b4 > > +manifest:34:5a12a2a1bf3e7c0f7c30d01bd09a2e37185bcfb6 > > user:Norm Jacobs > > date:Sun Sep 19 13:50:53 2010 -0700 > > -phase: public > > files: > > components/Makefile > > make-rules/prep.mk > > > > and if I use debugdata to look at the manifest at changeset 34, I see: > > > > $ gdiff -a -u =(hg -R userland-more debugdata -m 34) =(hg -R > > userland-more.hgagain debugdata -m 34) > > --- /tmp/zshOdnjza 2017-03-16 11:53:16.971130878 + > > +++ /tmp/zshzoTzmc 2017-03-16 11:53:17.118194061 + > > @@ -24,12 +24,12 @@ > > make-rules/setup.py.mk302733d738cc7c6cceb63457442f24f931867472 > > make-rules/shared-macros.mk03dd5df583b6e39a17ba66fc6ed6205df7f6be49 > > tools/Makefilecc964766028e3b963b4a321c88815d211415006b > > -tools/bass-o-matica618ef38ceda467b9a09680dd8b94debcd303037x > > +tools/bass-o-matic349f9611499fddf1a110f9488a84fb110c90b7bfx > > tools/build-watch.df69b9a2b6a265c06268733430bbf3f9aa7d5e160x > > tools/build-watch.pl5e23340c7a84ac555e630a5ccdc28eceda95f4b6x > > tools/time.ca0a1f64ff8ac947ce9d045e0448f8ee72f9fd273 > 
> -tools/userland-fetch851170bb5cebf2648c53d4909eac26ac2055cdd3x > > -tools/userland-unpack0977e35fa356d4cfab889b93613dc75d90d89b6bx > > +tools/userland-fetchbae023e70db29fd07f6f989aaa858cfaed09238ax > > +tools/userland-unpackb3800b9db86df38a644a653b3095805b269b6ac6x > > transforms/actuatorsc9d84677229efde5f89b1d985de5cd1b09267b56 > > transforms/archive-libraries-drop5b346a0133242f460ff66f6689 > > 790da094ce27f6 > > transforms/comparison-cleanupde1288c586594a171d43a3da5234cb920be408cc > > > > Now, those three files were copied in that changeset, but they're not the > > first to be copied, so it's not that, strictly. But it is the first > > changeset in which files were copied without being modified. > > > > The index data is off-by-one, if that makes any difference: > > > > $ hg -R userland-more debugrevlog -d tools/bass-o-matic > > # rev p1rev p2rev start end deltastart base p1 p2 rawsize > > totalsize compression heads chainlen > > 0-1-1 0 2175 00006005 > > 6005 2 10 > > 1 0-1 2175 2228 00005929 > > 11934 5 11 > > > > $ hg -R userland-more.hgagain debugrevlog -d tools/bass-o-matic > > # rev p1rev p2rev start end deltastart base p1 p2 rawsize > > totalsize compression heads chainlen > > 0-1-1 0 2174 00006005 > > 6005 2 10 > > 1 0-1 2174 2227 00005929 > > 11934 5 11 > > > > Any thoughts on how to further debug this? > > > > Or is this just > > > > https://bitbucket.org/durin42/hg-git/issues/46 Note that bug is about git->hg conversion where the original repository is git. > > > > and I'm out of luck? > > > > It is effectively impossible to round-trip between Git and Mercurial when > file copies are involved. This is because Mercurial's filelog hashes > include copy metadata and the parent nodes. Git's blob hashes, by contrast, > are effectively content only. When you convert from Mercurial to Git, it > will drop copy metadata (because Git doesn't track it explicitly). 
Then > when you convert back to Mercurial, the copies have to be detected "just > right" by hg-git for the hashes to align. Furthermore, the files have to be > reintroduced in the same order, or the filelog parents may not align and > the hashes may diverge. If a repo isn't linear, there's a non-zero chance > of that happening. hg-git actually "stores" copy/rename in the commit messages, but that's assuming the commit was done in mercurial and pushed to git with hg-git in the first place.
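The point about hashes can be made concrete with a small sketch. It assumes the usual description of Mercurial's node computation (SHA-1 over the two sorted parent nodes followed by the revision text, with copy metadata wrapped in a `\x01\n` envelope at the start of that text) and of Git's blob ids (SHA-1 over a `blob <size>\0` header plus the content). `filelog_node` and `git_blob_id` are hypothetical helper names, and the copy source node below is a placeholder, not a real hash.

```python
import hashlib

NULLID = b'\0' * 20


def filelog_node(text, p1=NULLID, p2=NULLID, copy=None):
    """Hash a file revision the way a Mercurial filelog is described to."""
    if copy is not None:
        source, sourcenode_hex = copy
        meta = b'copy: %s\ncopyrev: %s\n' % (source, sourcenode_hex)
        # Copy metadata is stored in front of the content and is hashed.
        text = b'\x01\n' + meta + b'\x01\n' + text
    a, b = sorted([p1, p2])
    return hashlib.sha1(a + b + text).hexdigest()


def git_blob_id(content):
    """Hash file content the way a Git blob is described to."""
    return hashlib.sha1(b'blob %d\0' % len(content) + content).hexdigest()


content = b'#!/bin/sh\necho hello\n'
placeholder_source = (b'tools/original', b'0' * 40)  # made-up copy source

plain = filelog_node(content)
copied = filelog_node(content, copy=placeholder_source)
reparented = filelog_node(content, p1=b'\x11' * 20)

print(plain)
print(copied)
print(reparented)
print(git_blob_id(content))
```

Same content, three different filelog nodes, one Git blob id: if a round trip loses or changes the copy record or the parent, the manifest entry changes and so does the changeset hash.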
Re: [PATCH RFC] similar: allow similarity detection to use sha256 for digesting file contents
On Wed, Mar 01, 2017 at 04:34:43PM -0800, Gregory Szorc wrote: > On Wed, Mar 1, 2017 at 7:02 AM, FUJIWARA Katsunori > wrote: > > > # HG changeset patch > > # User FUJIWARA Katsunori > > # Date 1488380487 -32400 > > # Thu Mar 02 00:01:27 2017 +0900 > > # Node ID 018d9759cb93f116007d4640341a82db6cf2d45c > > # Parent 0bb3089fe73527c64f1afc40b86ecb8dfe7fd7aa > > similar: allow similarity detection to use sha256 for digesting file > > contents > > > > Before this patch, similarity detection logic (used for addremove and > > automv) uses SHA-1 digesting. But this cause incorrect rename > > detection, if: > > > > - removing file A and adding file B occur at same committing, and > > - SHA-1 hash values of file A and B are same > > > > This may prevent security experts from managing sample files for > > SHAttered issue in Mercurial repository, for example. > > > > https://security.googleblog.com/2017/02/announcing-first- > > sha1-collision.html > > https://shattered.it/ > > > > Hash collision itself isn't so serious for core repository > > functionality of Mercurial, described by mpm as below, though. > > > > https://www.mercurial-scm.org/wiki/mpm/SHA1 > > > > HOW ABOUT: > > > > - which should we use default algorithm SHA-1, SHA-256 or SHA-512 ? > > > > SHA-512 should be faster than SHA-256 on 64-bit hardware. So, there's > likely no good reason to use SHA-256 for simple identity checks. > > > > > > ease (handling problematic files safely by default) or > > performance? > > > > > On my Skylake at 4.0 GHz, SHA-1 is capable of running at ~975 MB/s and > SHA-512 at ~700 MB/s. Both are fast enough that for simple one-time content > identity checks, hashing shouldn't be a bottleneck, at least not for most > repos. > > So I think it is fine to change this function from SHA-1 to SHA-512 > assuming the hashes don't "leak" into storage. If they end up being stored > or used for something other than identity checks, then we need to bloat > scope to discuss our general hashing future. 
And that needs its own thread > ;)

With hashing, there is *always* the risk of collision. It might be tiny,
but it still exists. Why not just compare the contents when the hashes
match? Then it doesn't really matter what the hash is. The hash is just
a shortcut to avoid comparing full contents in an O(n^2) fashion. There
aren't going to be that many hash matches anyway, so comparing the
content at that point should not make a significant difference in speed,
but it would guarantee that the "similarity" is real.

(BTW, interestingly, in terms of similarity detection, while the
SHAttered PDFs are not 100% identical, they are 80%+ similar)

Mike
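The suggestion above can be sketched as follows: use the digest only to find candidates, then confirm with a byte-for-byte comparison. This is an illustration, not the code in mercurial/similar.py; `find_exact_renames` and `weak_digest` are hypothetical names, and the deliberately weak digest exists only to force a collision without needing real SHAttered files.

```python
import hashlib
from collections import defaultdict


def sha1_digest(data):
    return hashlib.sha1(data).digest()


def find_exact_renames(removed, added, digest=sha1_digest):
    """Pair removed and added files with identical content.

    `removed` and `added` map file names to bytes. The digest narrows the
    candidates; equality of the contents decides.
    """
    by_digest = defaultdict(list)
    for name in sorted(removed):
        by_digest[digest(removed[name])].append(name)

    matches = []
    for name in sorted(added):
        data = added[name]
        for old in by_digest.get(digest(data), ()):
            if removed[old] == data:  # the hash was only a shortcut
                matches.append((old, name))
                break
    return matches


def weak_digest(data):
    """Collides for any two inputs of equal length; for demonstration."""
    return bytes([len(data) % 256])


removed = {'a.txt': b'hello', 'b.txt': b'world'}
added = {'c.txt': b'world', 'd.txt': b'other'}

print(find_exact_renames(removed, added))
print(find_exact_renames(removed, added, digest=weak_digest))
```

Both calls report only the genuine pair even though every file collides under the weak digest, so the choice between SHA-1, SHA-256 and SHA-512 becomes purely a speed question.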
Re: [PATCH] convert: add config option to control saving Git committer in message
On Fri, Jan 06, 2017 at 11:36:32AM -0800, Sean Farley wrote:
> Gregory Szorc writes:
>
> > # HG changeset patch
> > # User Gregory Szorc
> > # Date 1483729033 28800
> > # Fri Jan 06 10:57:13 2017 -0800
> > # Node ID 1901566ab484a56b177b88ff080d635840e0912c
> > # Parent 3de9df6ee5bf7601aa3870f18304bbeb3ce351af
> > convert: add config option to control saving Git committer in message
> >
> > As part of converting a Git repository to Mercurial at Mozilla, I
> > encountered a scenario where I didn't want `hg convert` to
> > automatically add the "committer: " line to commit messages.
> > While I can hack around it downstream by rewriting the Git commit
> > before feeding it into `hg convert`, I'd prefer to just specify a
> > config flag to turn it off. This patch adds that flag.
>
> I'm fine with this as-is but what about maybe storing it in the
> metadata? Just a thought.

... like hg-git does.

Mike
Re: [PATCH 8 of 8 zstd-revlogs] [RFC] localrepo: support non-zlib compression engines in revlogs
On Wed, Jan 04, 2017 at 11:18:21PM -0800, Gregory Szorc wrote:
> * The lz4 performance note in the commit message isn't very accurate. There
> is a small subset of operations where the zstd python bindings are as fast
> as lz4. I'll strike the comment from the next version.
>
> * zlib has checksums built into the compression format with how it is used
> in hg today. The patches as written do not have zstd writing checksums.
>
> * Enabling checksums in zstd appears to have a negligible impact on
> performance.
>
> * Reusing zstd compression and decompression "contexts" can make a
> significant difference to performance. Having a reusable "compressor"
> object that allows "context" reuse should increase performance for zstd.
>
> * For the changelog, zstd level=1 versus level=3 makes almost no difference
> on compression ratio but does speed up compression a bit. Now I'm
> considering per-revlog settings for the compressors.
>
> * zstd compression dictionaries speed up *both* compression and
> decompression. On changelog chunks, dictionaries improve decompress
> throughput from ~180 MB/s to ~300 MB/s. That's nothing to sneeze at.
>
> * When dictionaries are used, zstd level=1 compresses the changelog
> considerably faster than level=3. ~160 MB/s vs ~27 MB/s.
>
> * I was going to hold off seriously investigating compression dictionaries,
> but since there are massive perf win potentials, I think it should be done
> sooner than later.

All this perf information wrt dictionaries makes me wonder if there is a
corpus of non-English changesets that could be used for some different
performance measurements. It's nice that we know things are better for
English content, but version control is not exclusive to people writing
everything in English.

Mike
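The concern about non-English content can be illustrated with a small experiment. Note the substitution: the sketch uses zlib preset dictionaries from the Python standard library rather than zstd dictionaries, so it says nothing about zstd throughput; it only shows how much a dictionary built from English text can help an English message compared with a message that shares no byte sequences with it. The dictionary and both messages are made up for the example.

```python
import zlib

# Made-up "trained" dictionary: fragments of earlier English messages.
DICTIONARY = (b"Fix crash when reading revlog index\n"
              b"Add test for revlog index corruption\n"
              b"Fix crash when writing revlog index\n")

ENGLISH = b"Fix crash when reading revlog index on Windows"
JAPANESE = "リビジョンログの読み込み時のクラッシュを修正".encode("utf-8")


def compress(data, zdict=None):
    """Compress with zlib, optionally with a preset dictionary."""
    if zdict is None:
        c = zlib.compressobj()
    else:
        c = zlib.compressobj(zdict=zdict)
    return c.compress(data) + c.flush()


def decompress(blob, zdict=None):
    """Inverse of compress(); the same dictionary must be supplied."""
    if zdict is None:
        d = zlib.decompressobj()
    else:
        d = zlib.decompressobj(zdict=zdict)
    return d.decompress(blob) + d.flush()


def saving(message):
    """Bytes saved by using the dictionary (negative means it cost bytes)."""
    return len(compress(message)) - len(compress(message, DICTIONARY))


for label, message in (("english", ENGLISH), ("japanese", JAPANESE)):
    print(label, len(message), len(compress(message)),
          len(compress(message, DICTIONARY)), saving(message))
```

A real measurement would need an actual corpus of non-English changesets, which is exactly what the message above asks about.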
Re: [PATCH 7 of 7 v2] bdiff: give slight preference to removing trailing lines
On Thu, Nov 24, 2016 at 05:52:29PM +, Jun Wu wrote:
> Excerpts from Augie Fackler's message of 2016-11-17 12:42:26 -0500:
> > My own cursory perfbdiff runs suggest this is a perf wash (using
> > `perfbdiff -m 3041e4d59df2` in the mozilla repo). Queued. Thanks!
>
> I'd mention this series changes the behavior of the diff output. The
> difference was caught by fastannotate test.
>
> See the below table (old: e1d6aa0e4c3a, new: 8836f13e3c5b):
>
>    a | b | old | new
>
>    a | a |  a  | -a
>    a | z | +z  |  a
>    a | a |  a  | +z
>      |   | -a  |  a
>
>    a | a |  a
>    a | a |  a
>    a |   | -a
>
> I think we would always prefer putting deletions at the end, to be consistent.

Wouldn't

   a
  -a
  +z
   a

be preferable to both old and new? That's what plain diff does, by the way.

Mike
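All three outputs in this exchange (old, new, and the one suggested in the reply) are valid descriptions of the same change; they differ only in where the deletion is placed. A minimal sketch that checks this, assuming the first table's inputs are the three-line file `a a a` and the three-line file `a z a`. The helper `apply_script` and the tuple representation are made up for illustration and are unrelated to bdiff's implementation.

```python
def apply_script(old_lines, script):
    """Apply a line-based edit script and return the resulting lines.

    Each entry is (op, line): ' ' keeps a line, '-' deletes it, '+' inserts
    one. Keep and delete entries must match the next unread old line.
    """
    result = []
    pos = 0
    for op, line in script:
        if op in (' ', '-'):
            if pos >= len(old_lines) or old_lines[pos] != line:
                raise ValueError('script does not match input at line %d' % pos)
            pos += 1
            if op == ' ':
                result.append(line)
        elif op == '+':
            result.append(line)
        else:
            raise ValueError('unknown op %r' % op)
    if pos != len(old_lines):
        raise ValueError('script does not consume all input')
    return result


A = ['a', 'a', 'a']
B = ['a', 'z', 'a']

# The three candidate outputs, transcribed from the thread.
OLD = [(' ', 'a'), ('+', 'z'), (' ', 'a'), ('-', 'a')]
NEW = [('-', 'a'), (' ', 'a'), ('+', 'z'), (' ', 'a')]
SUGGESTED = [(' ', 'a'), ('-', 'a'), ('+', 'z'), (' ', 'a')]

for name, script in [('old', OLD), ('new', NEW), ('suggested', SUGGESTED)]:
    assert apply_script(A, script) == B, name
```

Since every script produces the same result, the choice between them is purely about which output reads best and stays consistent.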
Re: [PATCH 8 of 9 RFC] wireproto: introduce listkeys2 command
On Sun, Aug 14, 2016 at 04:59:50PM -0700, Gregory Szorc wrote:
> On Sun, Aug 14, 2016 at 3:12 PM, Mike Hommey wrote:
>
> > On Sun, Aug 14, 2016 at 02:10:07PM -0700, Gregory Szorc wrote:
> > > # HG changeset patch
> > > # User Gregory Szorc
> > > # Date 1471208237 25200
> > > # Sun Aug 14 13:57:17 2016 -0700
> > > # Node ID d2870bcbc43041909e9f637b294cb889f7ed4933
> > > # Parent eb2bc1ac7869ad255965d16004524a95cea83c9d
> > > wireproto: introduce listkeys2 command
> > >
> > > The command behaves exactly like "listkeys" except it uses a more
> > > efficient and more robust binary encoding mechanism.
> >
> > Nowhere in the patch queue I see mentioned why you want this. Not saying
> > that this shouldn't be done, but it's really not clear what the expected
> > benefit is of all this refactoring and this new command.
>
>
> I said it concisely in the commit message you just quoted: the wire
> encoding is smaller and is able to represent all binary values. I offer
> more detail at
> https://www.mercurial-scm.org/pipermail/mercurial-devel/2016-August/087243.html.

... and a large part of that message should be in this commit message.

Mike
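To make the "able to represent all binary values" point concrete: the listkeys2 wire format itself is not shown in this thread, so the following is a hypothetical length-prefixed framing, contrasted with a tab- and newline-delimited text framing (assumed here as a stand-in for the older encoding). All function names are made up, and this sketch does not attempt to demonstrate the size claim, only the robustness one.

```python
import struct

# Two big-endian unsigned 32-bit lengths: key length, then value length.
HEADER = struct.Struct('>II')


def encode_text(pairs):
    """Delimiter-based framing: key TAB value, one pair per line."""
    return b'\n'.join(k + b'\t' + v for k, v in pairs)


def decode_text(data):
    """Split on newlines and the first tab; breaks if data contains either."""
    result = []
    for line in data.split(b'\n'):
        key, _sep, value = line.partition(b'\t')
        result.append((key, value))
    return result


def encode_binary(pairs):
    """Hypothetical framing: each pair is prefixed by its two lengths."""
    out = []
    for key, value in pairs:
        out.append(HEADER.pack(len(key), len(value)))
        out.append(key)
        out.append(value)
    return b''.join(out)


def decode_binary(data):
    """Read length-prefixed pairs back; no byte value is special."""
    result = []
    offset = 0
    while offset < len(data):
        klen, vlen = HEADER.unpack_from(data, offset)
        offset += HEADER.size
        key = data[offset:offset + klen]
        offset += klen
        value = data[offset:offset + vlen]
        offset += vlen
        result.append((key, value))
    return result


# A pair whose bytes collide with the text framing's delimiters.
tricky = [(b'key\twith\ttab', b'value\nwith\nnewline')]

assert decode_binary(encode_binary(tricky)) == tricky
assert decode_text(encode_text(tricky)) != tricky
```

With delimiter-based framing, any key or value containing a tab or newline is misparsed on the receiving side, whereas the length-prefixed variant round-trips arbitrary bytes.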
Re: [PATCH 8 of 9 RFC] wireproto: introduce listkeys2 command
On Sun, Aug 14, 2016 at 02:10:07PM -0700, Gregory Szorc wrote:
> # HG changeset patch
> # User Gregory Szorc
> # Date 1471208237 25200
> # Sun Aug 14 13:57:17 2016 -0700
> # Node ID d2870bcbc43041909e9f637b294cb889f7ed4933
> # Parent eb2bc1ac7869ad255965d16004524a95cea83c9d
> wireproto: introduce listkeys2 command
>
> The command behaves exactly like "listkeys" except it uses a more
> efficient and more robust binary encoding mechanism.

Nowhere in the patch queue I see mentioned why you want this. Not saying that this shouldn't be done, but it's really not clear what the expected benefit is of all this refactoring and this new command.

Mike