Antw: Re: Terrible bad performance for it blame --date=iso -C -C master --
>>> Samuel Lijin schrieb am 03.05.2017 um 09:12 in Nachricht : > On Mon, May 1, 2017 at 7:59 PM, Junio C Hamano wrote: >> Samuel Lijin writes: >> >>> On Fri, Mar 31, 2017 at 10:52 AM, Junio C Hamano wrote: >>> It might not be a bad idea to teach "blame" not to pay attention to any path that is marked as "-diff" (e.g. binary files) when trying to see if remaining contents appeared by borrowing from them. We do not have that heuristics (yet). >>> >>> Could you elaborate on this? Do you mean to tell diffcore-rename to >>> ignore diff_filespec objects if they're binary? >> >> No and yes ;-). I do not think it is a good idea to unconditionally >> ignore binary in diffcore-rename. >> >> But when we know that the rename detection is called from inside >> blame.c, where by definition we would be digging line-oriented >> contents, there is no point checking if the contents we are looking >> for came from an existing binary file. > > A followup thought: perhaps it would make sense for blame.c rename > detection to only consider files with the same extension? As traditionally file type and file name extension had little to do with each other in UNIX, I thing that would not be a good solution (to assume the opposite). I also know that Git does not care too much about hierarchies also, but my idea was so find candidates "nearby", i.e. look in the subdirectories of the first level and in the parent directory; then, if needed, widen the radius of the search... Regards, Ulrich
Re: Terrible bad performance for it blame --date=iso -C -C master --
On Mon, May 1, 2017 at 7:59 PM, Junio C Hamano wrote: > Samuel Lijin writes: > >> On Fri, Mar 31, 2017 at 10:52 AM, Junio C Hamano wrote: >> >>> It might not be a bad idea to teach "blame" not to pay attention to >>> any path that is marked as "-diff" (e.g. binary files) when trying >>> to see if remaining contents appeared by borrowing from them. We do >>> not have that heuristics (yet). >> >> Could you elaborate on this? Do you mean to tell diffcore-rename to >> ignore diff_filespec objects if they're binary? > > No and yes ;-). I do not think it is a good idea to unconditionally > ignore binary in diffcore-rename. > > But when we know that the rename detection is called from inside > blame.c, where by definition we would be digging line-oriented > contents, there is no point checking if the contents we are looking > for came from an existing binary file. A followup thought: perhaps it would make sense for blame.c rename detection to only consider files with the same extension?
Re: Terrible bad performance for it blame --date=iso -C -C master --
Samuel Lijin writes: > On Fri, Mar 31, 2017 at 10:52 AM, Junio C Hamano wrote: > >> It might not be a bad idea to teach "blame" not to pay attention to >> any path that is marked as "-diff" (e.g. binary files) when trying >> to see if remaining contents appeared by borrowing from them. We do >> not have that heuristics (yet). > > Could you elaborate on this? Do you mean to tell diffcore-rename to > ignore diff_filespec objects if they're binary? No and yes ;-). I do not think it is a good idea to unconditionally ignore binary in diffcore-rename. But when we know that the rename detection is called from inside blame.c, where by definition we would be digging line-oriented contents, there is no point checking if the contents we are looking for came from an existing binary file.
Re: Terrible bad performance for it blame --date=iso -C -C master --
On Fri, Mar 31, 2017 at 10:52 AM, Junio C Hamano wrote: > "Ulrich Windl" writes: > >> I was running "vc-annotate" in Emacs for a file from a large >> repository (>4 files, a big percentage being binary, about 10 >> commits). For the first file the result was presented rather soon, but >> for a second file the command did not finish even after about 10 >> minutes! >> >> The file in question is a rather short text file (124 kB), and >> according to git log it has one commit. >> >> While being bored, I did an strace of the command to find out that a >> huge number of files is inspected. > > With -C -C the user (vc-annotate?) is asking to inspect huge number > of files, to find if the contents of the file (except for the part > that came from its own previous version) came from other existing > files. So this is very much expected. > > It might not be a bad idea to teach "blame" not to pay attention to > any path that is marked as "-diff" (e.g. binary files) when trying > to see if remaining contents appeared by borrowing from them. We do > not have that heuristics (yet). Could you elaborate on this? Do you mean to tell diffcore-rename to ignore diff_filespec objects if they're binary?
Antw: Re: Terrible bad performance for it blame --date=iso -C
>>> Jakub Narebski schrieb am 03.04.2017 um 17:16 in Nachricht <0ccc5cab-26b7-4b02-b964-452b61e92...@gmail.com>: > W dniu 03.04.2017 o 12:56, SZEDER Gábor pisze: >> Ulrich Windl wrote: > >>> In the other case (for the user bored of waiting seeking for some >>> entertainment ;-)) a "-v (verbose) option could be useful. Or at the >>> very least: If git is expecting that some operation will take (or >>> already did take) a lot of time, give some message explaining why it >>> is taking a lot of time, and maybe how to avoid that. >> >> It already does so by default since v2.8.0, see aba37f495 (blame: add >> support for --[no-]progress option, 2015-12-12). >> >> $ time git blame sha1_file.c |wc -l >> 4026 >> >> real0m1.744s >> user0m1.672s >> sys 0m0.068s >> $ time git blame -C -C sha1_file.c |wc -l >> Blaming lines: 100% (4026/4026), done. >> 4026 >> >> real0m3.832s >> user0m3.716s >> sys 0m0.112s >> >> However, after a short peek at that commit, it only displays progress >> by default when stderr is a terminal, which might not be the case when >> invoked from emacs. > > Emacs (magit?) should use `git blame --porcelain`, and do its own > progress report, just like 'git gui blame' and incremental blame mode > of gitweb. I was thinking similar: The pain vc-annotate obviously should work without those "-C" options, and with prefix argument (C-u in Emacs) it could start looking for copied stuff. HMO... Worse than no progress reporting is the inability to kill the process (if you run out of patience) with C-g (that stops most commands in Emacs). > > Actually... there already is git-blamed - Minor mode for incremental > blame for Git, and mo-git-blame - An interactive, iterative 'git blame' > mode for Emacs, both available on ELPA (Emacs Lisp Package Archive). I confess taht is still use RCS from time to time, and I prefer the higher-level Emacs commands ;-) Regards, Ulrich
Re: Terrible bad performance for it blame --date=iso -C
W dniu 03.04.2017 o 12:56, SZEDER Gábor pisze: > Ulrich Windl wrote: >> In the other case (for the user bored of waiting seeking for some >> entertainment ;-)) a "-v (verbose) option could be useful. Or at the >> very least: If git is expecting that some operation will take (or >> already did take) a lot of time, give some message explaining why it >> is taking a lot of time, and maybe how to avoid that. > > It already does so by default since v2.8.0, see aba37f495 (blame: add > support for --[no-]progress option, 2015-12-12). > > $ time git blame sha1_file.c |wc -l > 4026 > > real0m1.744s > user0m1.672s > sys 0m0.068s > $ time git blame -C -C sha1_file.c |wc -l > Blaming lines: 100% (4026/4026), done. > 4026 > > real0m3.832s > user0m3.716s > sys 0m0.112s > > However, after a short peek at that commit, it only displays progress > by default when stderr is a terminal, which might not be the case when > invoked from emacs. Emacs (magit?) should use `git blame --porcelain`, and do its own progress report, just like 'git gui blame' and incremental blame mode of gitweb. Actually... there already is git-blamed - Minor mode for incremental blame for Git, and mo-git-blame - An interactive, iterative 'git blame' mode for Emacs, both available on ELPA (Emacs Lisp Package Archive). HTH -- Jakub Narębski
Re: Terrible bad performance for it blame --date=iso -C
> In the other case (for the user bored of waiting seeking for some > entertainment ;-)) a "-v (verbose) option could be useful. Or at the > very least: If git is expecting that some operation will take (or > already did take) a lot of time, give some message explaining why it > is taking a lot of time, and maybe how to avoid that. It already does so by default since v2.8.0, see aba37f495 (blame: add support for --[no-]progress option, 2015-12-12). $ time git blame sha1_file.c |wc -l 4026 real0m1.744s user0m1.672s sys 0m0.068s $ time git blame -C -C sha1_file.c |wc -l Blaming lines: 100% (4026/4026), done. 4026 real0m3.832s user0m3.716s sys 0m0.112s However, after a short peek at that commit, it only displays progress by default when stderr is a terminal, which might not be the case when invoked from emacs.
Antw: Re: Terrible bad performance for it blame --date=iso -C -C master --
Junio, thanks for explaining! So if there are at least two commits, blame is fast, but with only one commit blame tries hard to find another commit that might have contributed to the one file? I verified that without those "-C" options the result is very quick. In the other case (for the user bored of waiting seeking for some entertainment ;-)) a "-v (verbose) option could be useful. Or at the very least: If git is expecting that some operation will take (or already did take) a lot of time, give some message explaining why it is taking a lot of time, and maybe how to avoid that. Regards, Ulrich >>> Junio C Hamano schrieb am 31.03.2017 um 17:52 in >>> Nachricht : > "Ulrich Windl" writes: > >> I was running "vc-annotate" in Emacs for a file from a large >> repository (>4 files, a big percentage being binary, about 10 >> commits). For the first file the result was presented rather soon, but >> for a second file the command did not finish even after about 10 >> minutes! >> >> The file in question is a rather short text file (124 kB), and >> according to git log it has one commit. >> >> While being bored, I did an strace of the command to find out that a >> huge number of files is inspected. > > With -C -C the user (vc-annotate?) is asking to inspect huge number > of files, to find if the contents of the file (except for the part > that came from its own previous version) came from other existing > files. So this is very much expected. > > It might not be a bad idea to teach "blame" not to pay attention to > any path that is marked as "-diff" (e.g. binary files) when trying > to see if remaining contents appeared by borrowing from them. We do > not have that heuristics (yet).
Re: Terrible bad performance for it blame --date=iso -C -C master --
Hi Ulrich, Is there any chance you could share the repo where this is coming from? This is actually something a colleague and I are looking into seeing if we can crunch out some performance gains since -C -C isn't threaded. Sam On Fri, Mar 31, 2017 at 10:52 AM, Junio C Hamano wrote: > "Ulrich Windl" writes: > >> I was running "vc-annotate" in Emacs for a file from a large >> repository (>4 files, a big percentage being binary, about 10 >> commits). For the first file the result was presented rather soon, but >> for a second file the command did not finish even after about 10 >> minutes! >> >> The file in question is a rather short text file (124 kB), and >> according to git log it has one commit. >> >> While being bored, I did an strace of the command to find out that a >> huge number of files is inspected. > > With -C -C the user (vc-annotate?) is asking to inspect huge number > of files, to find if the contents of the file (except for the part > that came from its own previous version) came from other existing > files. So this is very much expected. > > It might not be a bad idea to teach "blame" not to pay attention to > any path that is marked as "-diff" (e.g. binary files) when trying > to see if remaining contents appeared by borrowing from them. We do > not have that heuristics (yet).
Re: Terrible bad performance for it blame --date=iso -C -C master --
"Ulrich Windl" writes: > I was running "vc-annotate" in Emacs for a file from a large > repository (>4 files, a big percentage being binary, about 10 > commits). For the first file the result was presented rather soon, but > for a second file the command did not finish even after about 10 > minutes! > > The file in question is a rather short text file (124 kB), and > according to git log it has one commit. > > While being bored, I did an strace of the command to find out that a > huge number of files is inspected. With -C -C the user (vc-annotate?) is asking to inspect huge number of files, to find if the contents of the file (except for the part that came from its own previous version) came from other existing files. So this is very much expected. It might not be a bad idea to teach "blame" not to pay attention to any path that is marked as "-diff" (e.g. binary files) when trying to see if remaining contents appeared by borrowing from them. We do not have that heuristics (yet).