Re: inotify to minimize stat() calls

2013-02-08 Thread Junio C Hamano
Ramkumar Ramachandra writes: > ... Will Git ever > consider using inotify on Linux? What is the downside? I think this has come up from time to time, but my understanding is that nobody thought things through to find a good layer in the codebase to interface to an external daemon that listens

Re: inotify to minimize stat() calls

2013-02-08 Thread Junio C Hamano
Junio C Hamano writes: > Ramkumar Ramachandra writes: > >> ... Will Git ever >> consider using inotify on Linux? What is the downside? > > I think this has come up from time to time, but my understanding is > that nobody thought things through to find a good layer in the > codebase to interfac

Re: inotify to minimize stat() calls

2013-02-08 Thread Duy Nguyen
On Sat, Feb 9, 2013 at 5:45 AM, Junio C Hamano wrote: > Junio C Hamano writes: > >> Ramkumar Ramachandra writes: >> >>> ... Will Git ever >>> consider using inotify on Linux? What is the downside? >> >> I think this has come up from time to time, but my understanding is >> that nobody thought

Re: inotify to minimize stat() calls

2013-02-08 Thread Junio C Hamano
Duy Nguyen writes: > Can we replace "open a socket to our daemon" with "open a special file > in .git to get stat data written by our daemon"? TCP/IP socket means > system-wide daemon, not attractive. UNIX socket is not available on > Windows (although there may be named pipe, I don't know). I d

Re: inotify to minimize stat() calls

2013-02-08 Thread Junio C Hamano
Junio C Hamano writes: > I checked read-cache.c and preload-index.c code. To get the > discussion rolling, I think something like the outline below may be > a good starting point and a feasible weekend hack for somebody > competent: > > * At the beginning of preload_index(), instead of spawning

Re: inotify to minimize stat() calls

2013-02-08 Thread Robert Zeh
The delay for commands like git status is much worse on Windows than Linux; for my workflow I would be happy with a Windows only implementation. >From the description so far, I have some question: how does the daemon get >started and stopped? Is there one per repository --- this seems to be im

Re: inotify to minimize stat() calls

2013-02-09 Thread Duy Nguyen
On Sat, Feb 9, 2013 at 7:53 PM, Ramkumar Ramachandra wrote: > Ramkumar Ramachandra wrote: >> What about getting systemd to watch everything for us? > > systemd is the perfect candidate! How about this as a start? I did not really check what it does, but it does not look complicate enough to pull

Re: inotify to minimize stat() calls

2013-02-09 Thread Ramkumar Ramachandra
Duy Nguyen wrote: > How about this as a start? I did not really check what it does, but it > does not look complicate enough to pull systemd in. > > http://article.gmane.org/gmane.comp.version-control.git/151934 Clever hack. I didn't know that there was a switch called core.ignoreStat which will

Re: inotify to minimize stat() calls

2013-02-09 Thread Junio C Hamano
Ramkumar Ramachandra writes: > This is much better than Junio's suggestion to study possible > implementations on all platforms and designing a generic daemon/ > communication channel. That's no weekend project. It appears that you misunderstood what I wrote. That was not "here is a design; I

Re: inotify to minimize stat() calls

2013-02-09 Thread Duy Nguyen
On Sun, Feb 10, 2013 at 12:10 AM, Ramkumar Ramachandra wrote: > Finn notes in the commit message that it offers no speedup, because > .gitignore files in every directory still have to be read. I think > this is silly: we really should be caching .gitignore, and touching it > only when lstat() rep

Re: inotify to minimize stat() calls

2013-02-10 Thread Duy Nguyen
On Sun, Feb 10, 2013 at 12:24:58PM +0700, Duy Nguyen wrote: > On Sun, Feb 10, 2013 at 12:10 AM, Ramkumar Ramachandra > wrote: > > Finn notes in the commit message that it offers no speedup, because > > .gitignore files in every directory still have to be read. I think > > this is silly: we really

Re: inotify to minimize stat() calls

2013-02-10 Thread Duy Nguyen
On Sun, Feb 10, 2013 at 06:17:32PM +0700, Duy Nguyen wrote: > The following patch eliminates untracked search code. As we can see, > open+getdents also disappears with this patch: > > 0.462909 40950 lstat 0.462909 40950 lstat > 0.003417 129 brk 0.003417 129 brk > 0.000762 53 read 0.0

Re: inotify to minimize stat() calls

2013-02-10 Thread demerphq
On 10 February 2013 12:17, Duy Nguyen wrote: > Bear in mind though this is Linux, where lstat is fast. On systems > with slow lstat, these timings could look very different due to the > large number of lstat calls compared to open+getdents. I really like > to see similar numbers on Windows. Is wi

Re: inotify to minimize stat() calls

2013-02-10 Thread Duy Nguyen
On Sun, Feb 10, 2013 at 8:26 PM, demerphq wrote: > On 10 February 2013 12:17, Duy Nguyen wrote: >> Bear in mind though this is Linux, where lstat is fast. On systems >> with slow lstat, these timings could look very different due to the >> large number of lstat calls compared to open+getdents. I

Re: inotify to minimize stat() calls

2013-02-10 Thread Ramkumar Ramachandra
On Sun, Feb 10, 2013 at 4:47 PM, Duy Nguyen wrote: > On Sun, Feb 10, 2013 at 12:24:58PM +0700, Duy Nguyen wrote: >> On Sun, Feb 10, 2013 at 12:10 AM, Ramkumar Ramachandra >> wrote: >> > Finn notes in the commit message that it offers no speedup, because >> > .gitignore files in every directory st

Re: inotify to minimize stat() calls

2013-02-10 Thread Erik Faye-Lund
On Sun, Feb 10, 2013 at 12:17 PM, Duy Nguyen wrote: > On Sun, Feb 10, 2013 at 12:24:58PM +0700, Duy Nguyen wrote: >> On Sun, Feb 10, 2013 at 12:10 AM, Ramkumar Ramachandra >> wrote: >> > Finn notes in the commit message that it offers no speedup, because >> > .gitignore files in every directory s

Re: inotify to minimize stat() calls

2013-02-10 Thread Robert Zeh
On Sat, Feb 9, 2013 at 1:35 PM, Junio C Hamano wrote: > Ramkumar Ramachandra writes: > >> This is much better than Junio's suggestion to study possible >> implementations on all platforms and designing a generic daemon/ >> communication channel. That's no weekend project. > > It appears that you

Re: inotify to minimize stat() calls

2013-02-10 Thread Martin Fick
On Sunday, February 10, 2013 12:03:00 pm Robert Zeh wrote: > On Sat, Feb 9, 2013 at 1:35 PM, Junio C Hamano wrote: > > Ramkumar Ramachandra writes: > >> This is much better than Junio's suggestion to study > >> possible implementations on all platforms and > >> designing a generic daemon/ commun

Re: inotify to minimize stat() calls

2013-02-10 Thread Junio C Hamano
Duy Nguyen writes: > On Sun, Feb 10, 2013 at 06:17:32PM +0700, Duy Nguyen wrote: >> The following patch eliminates untracked search code. As we can see, >> open+getdents also disappears with this patch: >> >> 0.462909 40950 lstat 0.462909 40950 lstat >> 0.003417 129 brk 0.003417 129 brk

Re: inotify to minimize stat() calls

2013-02-10 Thread Robert Zeh
On Feb 10, 2013, at 1:26 PM, Martin Fick wrote: > On Sunday, February 10, 2013 12:03:00 pm Robert Zeh wrote: >> On Sat, Feb 9, 2013 at 1:35 PM, Junio C Hamano > wrote: >>> Ramkumar Ramachandra writes: This is much better than Junio's suggestion to study possible implementations on a

Re: inotify to minimize stat() calls

2013-02-10 Thread Duy Nguyen
On Mon, Feb 11, 2013 at 3:16 AM, Junio C Hamano wrote: > The other "lstat()" experiment was a very interesting one, but this > is not yet an interesting experiment to see where in the "ignore" > codepath we are spending times. > > We know that we can tell wt_status_collect_untracked() not to bothe

Re: inotify to minimize stat() calls

2013-02-10 Thread Duy Nguyen
On Sun, Feb 10, 2013 at 11:45 PM, Ramkumar Ramachandra wrote: > So you're skipping the rest of refresh_cache_ent(), which contains our > lstat() and returning immediately. Instead of marking paths with the > "assume unchanged" bit, as core.ignoreStat does, you're directly > attacking the function

Re: inotify to minimize stat() calls

2013-02-10 Thread Duy Nguyen
On Mon, Feb 11, 2013 at 2:03 AM, Robert Zeh wrote: > On Sat, Feb 9, 2013 at 1:35 PM, Junio C Hamano wrote: >> Ramkumar Ramachandra writes: >> >>> This is much better than Junio's suggestion to study possible >>> implementations on all platforms and designing a generic daemon/ >>> communication c

Re: inotify to minimize stat() calls

2013-02-10 Thread Duy Nguyen
On Sun, Feb 10, 2013 at 11:58 PM, Erik Faye-Lund wrote: > Karsten Blees has done something similar-ish on Windows, and he posted > the results here: > > https://groups.google.com/forum/#!topic/msysgit/fL_jykUmUNE/discussion > > I also seem to remember he doing a ReadDirectoryChangesW version, but

Re: inotify to minimize stat() calls

2013-02-11 Thread Duy Nguyen
On Mon, Feb 11, 2013 at 9:56 AM, Duy Nguyen wrote: > Yeah, it did not cut out syscall cost, I also cut a lot of user-space > processing (plus .gitignore content access). From the timings I posted > earlier, > >> unmodified dir.c >> real0m0.550s0m0.287s >> user0m0.305s0m0.2

Re: inotify to minimize stat() calls

2013-02-11 Thread Robert Zeh
On Sun, Feb 10, 2013 at 9:21 PM, Duy Nguyen wrote: > On Mon, Feb 11, 2013 at 2:03 AM, Robert Zeh > wrote: >> On Sat, Feb 9, 2013 at 1:35 PM, Junio C Hamano wrote: >>> Ramkumar Ramachandra writes: >>> This is much better than Junio's suggestion to study possible implementations on all

Re: inotify to minimize stat() calls

2013-02-12 Thread Karsten Blees
Am 11.02.2013 04:53, schrieb Duy Nguyen: > On Sun, Feb 10, 2013 at 11:58 PM, Erik Faye-Lund wrote: >> Karsten Blees has done something similar-ish on Windows, and he posted >> the results here: >> >> https://groups.google.com/forum/#!topic/msysgit/fL_jykUmUNE/discussion >> The new hashtable imple

Re: inotify to minimize stat() calls

2013-02-13 Thread Duy Nguyen
On Tue, Feb 12, 2013 at 09:48:18PM +0100, Karsten Blees wrote: > However, the difference between git status -uall and -uno was always > about 1.3 s in all fscache versions, even though > opendir/readdir/closedir was served entirely from the cache. I added > a bit of performance tracing to find the

Re: inotify to minimize stat() calls

2013-02-13 Thread Jeff King
On Wed, Feb 13, 2013 at 07:15:47PM +0700, Nguyen Thai Ngoc Duy wrote: > On Wed, Feb 13, 2013 at 3:48 AM, Karsten Blees > wrote: > > 2.) 0.135 s is spent in name-hash.c/hash_index_entry_directories, > > reindexing the same directories over and over again. In the end, the > > hashtable contains

Re: inotify to minimize stat() calls

2013-02-13 Thread Jeff King
On Wed, Feb 13, 2013 at 01:18:51PM -0500, Jeff King wrote: > I think the best way forward is to actually create a separate hash table > for the directory lookups. I note that we only care about these entries > in directory_exists_in_index_icase, which is really about whether > something is there,

Re: inotify to minimize stat() calls

2013-02-13 Thread Karsten Blees
Am 13.02.2013 19:18, schrieb Jeff King: > Moreover, looking at it again, I > don't think my patch produces the right behavior: we have a single > dir_next pointer, even though the same ce_entry may appear under many > directory hashes. So the cache_entries that has to "dir/foo/" and those > that ha

Re: inotify to minimize stat() calls

2013-02-13 Thread Jeff King
On Wed, Feb 13, 2013 at 09:25:59PM +0100, Karsten Blees wrote: > Am 13.02.2013 19:18, schrieb Jeff King: > > Moreover, looking at it again, I > > don't think my patch produces the right behavior: we have a single > > dir_next pointer, even though the same ce_entry may appear under many > > directo

Re: inotify to minimize stat() calls

2013-02-13 Thread Karsten Blees
Am 13.02.2013 23:55, schrieb Jeff King: > On Wed, Feb 13, 2013 at 09:25:59PM +0100, Karsten Blees wrote: > >> Alternatively, we could simply create normal cache_entries for the >> directories that are linked via ce->next, but have a trailing '/' in >> their name? >> >> Reference counting sounds go

Re: inotify to minimize stat() calls

2013-02-14 Thread Magnus Bäck
On Sunday, February 10, 2013 at 08:26 EST, demerphq wrote: > Is windows stat really so slow? Well, the problem is that there is no such thing as "Windows stat" :-) > I encountered this perception in windows Perl in the past, and I know > that on windows Perl stat *appears* slow compared to

Re: inotify to minimize stat() calls

2013-02-14 Thread Ævar Arnfjörð Bjarmason
On Fri, Feb 8, 2013 at 10:10 PM, Ramkumar Ramachandra wrote: > For large repositories, many simple git commands like `git status` > take a while to respond. I understand that this is because of large > number of stat() calls to figure out which files were changed. I > overheard that Mercurial wa

Re: inotify to minimize stat() calls

2013-02-14 Thread Junio C Hamano
Ævar Arnfjörð Bjarmason writes: > I looked into this a long while ago and remembered that rebase was > doing something like a git status for every commit that it made to > check the dirtyness. > > This could be vastly improved by having an unsafe option to git-rebase > where it just assumes that

Re: inotify to minimize stat() calls

2013-02-19 Thread Ramkumar Ramachandra
Ævar Arnfjörð Bjarmason wrote: > On Fri, Feb 8, 2013 at 10:10 PM, Ramkumar Ramachandra > wrote: >> For large repositories, many simple git commands like `git status` >> take a while to respond. I understand that this is because of large >> number of stat() calls to figure out which files were cha

Re: inotify to minimize stat() calls

2013-02-19 Thread Ramkumar Ramachandra
Karsten Blees wrote: > Am 11.02.2013 04:53, schrieb Duy Nguyen: >> On Sun, Feb 10, 2013 at 11:58 PM, Erik Faye-Lund wrote: >>> Karsten Blees has done something similar-ish on Windows, and he posted >>> the results here: >>> >>> https://groups.google.com/forum/#!topic/msysgit/fL_jykUmUNE/discussion

Re: inotify to minimize stat() calls

2013-02-19 Thread Ramkumar Ramachandra
Robert Zeh wrote: > On Sun, Feb 10, 2013 at 9:21 PM, Duy Nguyen wrote: >> On Mon, Feb 11, 2013 at 2:03 AM, Robert Zeh >> wrote: >>> On Sat, Feb 9, 2013 at 1:35 PM, Junio C Hamano wrote: Ramkumar Ramachandra writes: > This is much better than Junio's suggestion to study possible >

Re: inotify to minimize stat() calls

2013-02-19 Thread Drew Northup
On Sun, Feb 10, 2013 at 12:24 AM, Duy Nguyen wrote: > On Sun, Feb 10, 2013 at 12:10 AM, Ramkumar Ramachandra > wrote: >> Finn notes in the commit message that it offers no speedup, because >> .gitignore files in every directory still have to be read. I think >> this is silly: we really should be

Re: inotify to minimize stat() calls

2013-02-19 Thread Duy Nguyen
On Tue, Feb 19, 2013 at 8:16 PM, Drew Northup wrote: > Did your testing turn up anything about the amount of time spent > parsing the .gitignore/.gitattributes files? Not the syscall count, > but the actual time spent running the parser (which I presume is > largely CPU-bound). The other notable b

Re: inotify to minimize stat() calls

2013-02-19 Thread Karsten Blees
Am 19.02.2013 10:49, schrieb Ramkumar Ramachandra: > Karsten Blees wrote: >> Am 11.02.2013 04:53, schrieb Duy Nguyen: >>> On Sun, Feb 10, 2013 at 11:58 PM, Erik Faye-Lund >>> wrote: Karsten Blees has done something similar-ish on Windows, and he posted the results here: https:

Re: inotify to minimize stat() calls

2013-03-07 Thread Torsten Bögershausen
On 11.02.13 03:56, Duy Nguyen wrote: > On Mon, Feb 11, 2013 at 3:16 AM, Junio C Hamano wrote: >> The other "lstat()" experiment was a very interesting one, but this >> is not yet an interesting experiment to see where in the "ignore" >> codepath we are spending times. >> >> We know that we can tel

Re: inotify to minimize stat() calls

2013-03-07 Thread Junio C Hamano
Torsten Bögershausen writes: > diff --git a/builtin/commit.c b/builtin/commit.c > index d6dd3df..6a5ba11 100644 > --- a/builtin/commit.c > +++ b/builtin/commit.c > @@ -1158,6 +1158,8 @@ int cmd_status(int argc, const char **argv, const char > *prefix) > unsigned char sha1[20]; > stat

Re: inotify to minimize stat() calls

2013-03-07 Thread Torsten Bögershausen
On 08.03.13 01:04, Junio C Hamano wrote: > Torsten Bögershausen writes: > >> diff --git a/builtin/commit.c b/builtin/commit.c >> index d6dd3df..6a5ba11 100644 >> --- a/builtin/commit.c >> +++ b/builtin/commit.c >> @@ -1158,6 +1158,8 @@ int cmd_status(int argc, const char **argv, const char >> *p

Re: inotify to minimize stat() calls

2013-03-08 Thread Junio C Hamano
Torsten Bögershausen writes: >> Doesn't this make one wonder why a separate bit and implementation >> is necessary to say "I am not interested in untracked files" when >> "-uno" option is already there? > ... > I need to admit that I wasn't aware about "git status -uno". Not so fast. I did not

Re: inotify to minimize stat() calls

2013-03-08 Thread Torsten Bögershausen
On 08.03.13 09:15, Junio C Hamano wrote: > Torsten Bögershausen writes: > >>> Doesn't this make one wonder why a separate bit and implementation >>> is necessary to say "I am not interested in untracked files" when >>> "-uno" option is already there? >> ... >> I need to admit that I wasn't aware

Re: inotify to minimize stat() calls

2013-03-08 Thread Duy Nguyen
On Fri, Mar 8, 2013 at 3:15 PM, Junio C Hamano wrote: >> The possible options are: >> + >> - - 'no' - Show no untracked files >> + - 'no' - Show no untracked files (this is fastest) > > There is a trade-off around the use of -uno between safety and > performance. The default

Re: inotify to minimize stat() calls

2013-03-10 Thread Ramkumar Ramachandra
Duy Nguyen wrote: > On Fri, Mar 8, 2013 at 3:15 PM, Junio C Hamano wrote: >>> The possible options are: >>> + >>> - - 'no' - Show no untracked files >>> + - 'no' - Show no untracked files (this is fastest) >> >> There is a trade-off around the use of -uno between safety and >

Re: inotify to minimize stat() calls

2013-04-29 Thread Duy Nguyen
On Tue, Apr 30, 2013 at 1:05 AM, Robert Zeh wrote: > The call to lstat is only there for testing and should not be in there for > the final version. Is there an easy way to only enable it for tests? The usual trick is invent a new GIT_ environment variable. Then check it and do something differen

Re: [PATCH] inotify to minimize stat() calls

2013-04-24 Thread Robert Zeh
Here is a patch that creates a daemon that tracks file state with inotify, writes it out to a file upon request, and changes most of the calls to stat to use said cache. It has bugs, but I figured it would be smarter to see if the approach was acceptable at all before spending the time to root th

Re: [PATCH] inotify to minimize stat() calls

2013-04-24 Thread Duy Nguyen
On Thu, Apr 25, 2013 at 3:20 AM, Robert Zeh wrote: > Here is a patch that creates a daemon that tracks file > state with inotify, writes it out to a file upon request, > and changes most of the calls to stat to use said cache. > > It has bugs, but I figured it would be smarter to see > if the appr

Re: [PATCH] inotify to minimize stat() calls

2013-04-25 Thread Thomas Rast
Robert Zeh writes: > Here is a patch that creates a daemon that tracks file > state with inotify, writes it out to a file upon request, > and changes most of the calls to stat to use said cache. > > It has bugs, but I figured it would be smarter to see > if the approach was acceptable at all befo

Re: [PATCH] inotify to minimize stat() calls

2013-04-25 Thread Robert Zeh
On Thu, Apr 25, 2013 at 3:18 AM, Thomas Rast wrote: > > Robert Zeh writes: > > > Here is a patch that creates a daemon that tracks file > > state with inotify, writes it out to a file upon request, > > and changes most of the calls to stat to use said cache. > > > > It has bugs, but I figured it

Re: [PATCH] inotify to minimize stat() calls

2013-04-25 Thread Robert Zeh
On Wed, Apr 24, 2013 at 4:32 PM, Duy Nguyen wrote: > On Thu, Apr 25, 2013 at 3:20 AM, Robert Zeh > wrote: >> Here is a patch that creates a daemon that tracks file >> state with inotify, writes it out to a file upon request, >> and changes most of the calls to stat to use said cache. >> >> It ha

Re: [PATCH] inotify to minimize stat() calls

2013-04-25 Thread Thomas Rast
Robert Zeh writes: > On Thu, Apr 25, 2013 at 3:18 AM, Thomas Rast wrote: >> >> I don't get this bit. The lstat() are run over all files listed in the >> index. So shouldn't your daemon watch exactly those (or rather, all >> dirnames of such files)? > I believe that fill_directory is handling w

Re: [PATCH] inotify to minimize stat() calls

2013-04-25 Thread Duy Nguyen
On Fri, Apr 26, 2013 at 2:44 AM, Robert Zeh wrote: >> Can you just replace lstat/stat with cached_lstat/stat inside >> git-compat-util.h and not touch all files at once? I think you may >> need to deal with paths outside working directory. But because you're >> using lookup table, that should be n

Re: [PATCH] inotify to minimize stat() calls

2013-04-26 Thread Robert Zeh
On Thu, Apr 25, 2013 at 4:20 PM, Duy Nguyen wrote: > On Fri, Apr 26, 2013 at 2:44 AM, Robert Zeh > wrote: >>> Can you just replace lstat/stat with cached_lstat/stat inside >>> git-compat-util.h and not touch all files at once? I think you may >>> need to deal with paths outside working directory

Re: [PATCH] inotify to minimize stat() calls

2013-04-27 Thread Thomas Rast
Thomas Rast writes: > Robert Zeh writes: > >> On Thu, Apr 25, 2013 at 3:18 AM, Thomas Rast wrote: >>> >>> I don't get this bit. The lstat() are run over all files listed in the >>> index. So shouldn't your daemon watch exactly those (or rather, all >>> dirnames of such files)? >> I believe th

Re: [PATCH] inotify to minimize stat() calls

2013-04-27 Thread Duy Nguyen
On Thu, Apr 25, 2013 at 12:20 AM, Robert Zeh wrote: > +int cached_lstat(const char *path, struct stat *buf) > +{ > + int stat_return_value = 0; > + struct stat_cache_entry *entry = 0; > + > + read_stat_cache(); > + > + entry = get_stat_cache_entry(path); > + > + stat_