[PATCH v2 4/9] object: add clear_commit_marks_all()

2017-12-25 Thread René Scharfe
Add a function for clearing the commit marks of all in-core commit
objects.  It's similar to clear_object_flags(), but more precise, since
it leaves the other object types alone.  It still has to iterate through
them, though.

Signed-off-by: Rene Scharfe 
---
 object.c | 11 +++
 object.h |  5 +
 2 files changed, 16 insertions(+)

diff --git a/object.c b/object.c
index b9a4a0e501..0afdfd19b7 100644
--- a/object.c
+++ b/object.c
@@ -434,3 +434,14 @@ void clear_object_flags(unsigned flags)
obj->flags &= ~flags;
}
 }
+
+void clear_commit_marks_all(unsigned int flags)
+{
+   int i;
+
+   for (i = 0; i < obj_hash_size; i++) {
+   struct object *obj = obj_hash[i];
+   if (obj && obj->type == OBJ_COMMIT)
+   obj->flags &= ~flags;
+   }
+}
diff --git a/object.h b/object.h
index df8abe96f7..f64dd9 100644
--- a/object.h
+++ b/object.h
@@ -148,4 +148,9 @@ void object_array_clear(struct object_array *array);
 
 void clear_object_flags(unsigned flags);
 
+/*
+ * Clear the specified object flags from all in-core commit objects.
+ */
+extern void clear_commit_marks_all(unsigned int flags);
+
 #endif /* OBJECT_H */
-- 
2.15.1


Re: [PATCH v2 4/9] object: add clear_commit_marks_all()

2018-01-09 Thread Jeff King
On Mon, Dec 25, 2017 at 06:44:58PM +0100, René Scharfe wrote:

> Add a function for clearing the commit marks of all in-core commit
> objects.  It's similar to clear_object_flags(), but more precise, since
> it leaves the other object types alone.  It still has to iterate through
> them, though.

Makes sense.

Is it worth having:

  void clear_object_flags_from_type(int type, unsigned flags);

rather than having two near-identical functions? I guess we'd need some
way of saying "all types" to reimplement clear_object_flags() as a
wrapper (OBJ_NONE, I guess?).

The run-time check is maybe a little bit slower in the middle of a tight
loop, but I'm not sure it would matter much (I'd actually be curious if
this approach is faster than the existing traversal code, too).

-Peff


Re: [PATCH v2 4/9] object: add clear_commit_marks_all()

2018-01-11 Thread René Scharfe
Am 10.01.2018 um 08:58 schrieb Jeff King:
> On Mon, Dec 25, 2017 at 06:44:58PM +0100, René Scharfe wrote:
> 
>> Add a function for clearing the commit marks of all in-core commit
>> objects.  It's similar to clear_object_flags(), but more precise, since
>> it leaves the other object types alone.  It still has to iterate through
>> them, though.
> 
> Makes sense.
> 
> Is it worth having:
> 
>void clear_object_flags_from_type(int type, unsigned flags);
> 
> rather than having two near-identical functions? I guess we'd need some
> way of saying "all types" to reimplement clear_object_flags() as a
> wrapper (OBJ_NONE, I guess?).

I don't know if there is a demand.  Perhaps the two callers of
clear_object_flags() should be switched to clear_commit_marks_all()?
They look like they only care about commits as well.  Or is it safe to
stomp over the flags of objects of other types?  Then we'd only need
to keep clear_object_flags()..

> The run-time check is maybe a little bit slower in the middle of a tight
> loop, but I'm not sure it would matter much (I'd actually be curious if
> this approach is faster than the existing traversal code, too).

I don't know how to measure this properly.  With 100 runs each I get
this for the git repo and the silly test program below, which measures
the duration of the respective function call:

   meanstddev method
   --- -- --
   5.89763e+06 613106 clear_commit_marks
   2.72572e+06 507689 clear_commit_marks_all
   1.96582e+06 494753 clear_object_flags

So these are noisy numbers, but kind of in the expected range.

René

---
 Makefile   |  1 +
 t/helper/test-clear-commit-marks.c | 67 ++
 2 files changed, 68 insertions(+)
 create mode 100644 t/helper/test-clear-commit-marks.c

diff --git a/Makefile b/Makefile
index 1a9b23b679..7db2c6ca7f 100644
--- a/Makefile
+++ b/Makefile
@@ -648,6 +648,7 @@ X =
 PROGRAMS += $(patsubst %.o,git-%$X,$(PROGRAM_OBJS))
 
 TEST_PROGRAMS_NEED_X += test-chmtime
+TEST_PROGRAMS_NEED_X += test-clear-commit-marks
 TEST_PROGRAMS_NEED_X += test-ctype
 TEST_PROGRAMS_NEED_X += test-config
 TEST_PROGRAMS_NEED_X += test-date
diff --git a/t/helper/test-clear-commit-marks.c 
b/t/helper/test-clear-commit-marks.c
new file mode 100644
index 00..296ca0286f
--- /dev/null
+++ b/t/helper/test-clear-commit-marks.c
@@ -0,0 +1,67 @@
+#include "cache.h"
+#include "commit.h"
+#include "revision.h"
+
+static void start(struct timespec *start)
+{
+   clock_gettime(CLOCK_MONOTONIC, start);
+}
+
+static void print_duration(struct timespec *start)
+{
+   struct timespec end;
+   uint64_t d;
+   clock_gettime(CLOCK_MONOTONIC, &end);
+   d = end.tv_sec - start->tv_sec;
+   d *= 10;
+   d += end.tv_nsec - start->tv_nsec;
+   printf("%lu", d);
+}
+
+int cmd_main(int argc, const char **argv)
+{
+   struct rev_info revs;
+   struct object_id oid;
+   struct commit *commit;
+   struct timespec ts;
+
+   setup_git_directory();
+
+   if (get_oid("HEAD", &oid))
+   die("No HEAD?");
+   commit = lookup_commit(&oid);
+   if (!commit)
+   die("HEAD is not a committish?");
+
+   init_revisions(&revs, NULL);
+   add_pending_object(&revs, &commit->object, "HEAD");
+   if (prepare_revision_walk(&revs))
+   die("revision walk setup failed");
+   while (get_revision(&revs))
+   ; /* nothing */
+
+   if (argc != 2)
+   return 0;
+
+   if (!strcmp(argv[1], "clear_commit_marks")) {
+   start(&ts);
+   clear_commit_marks(commit, ALL_REV_FLAGS);
+   print_duration(&ts);
+   }
+
+   if (!strcmp(argv[1], "clear_commit_marks_all")) {
+   start(&ts);
+   clear_commit_marks_all(ALL_REV_FLAGS);
+   print_duration(&ts);
+   }
+
+   if (!strcmp(argv[1], "clear_object_flags")) {
+   start(&ts);
+   clear_object_flags(ALL_REV_FLAGS);
+   print_duration(&ts);
+   }
+
+   printf(" %s\n", argv[1]);
+
+   return 0;
+}
-- 
2.15.1


Re: [PATCH v2 4/9] object: add clear_commit_marks_all()

2018-01-12 Thread Jeff King
On Thu, Jan 11, 2018 at 07:57:42PM +0100, René Scharfe wrote:

> > Is it worth having:
> > 
> >void clear_object_flags_from_type(int type, unsigned flags);
> > 
> > rather than having two near-identical functions? I guess we'd need some
> > way of saying "all types" to reimplement clear_object_flags() as a
> > wrapper (OBJ_NONE, I guess?).
> 
> I don't know if there is a demand.  Perhaps the two callers of
> clear_object_flags() should be switched to clear_commit_marks_all()?
> They look like they only care about commits as well.  Or is it safe to
> stomp over the flags of objects of other types?  Then we'd only need
> to keep clear_object_flags()..

I'd worry that the call in reset_revision_walk() might want to clear
non-commits if the revisions have "--objects" passed to them.

I do suspect that clearing flags from all objects would just work in the
general case (since we're limiting ourselves to only a particular set of
flags). But it's probably not worth the risk of unintended fallout,
since there's not much benefit after your series.

> > The run-time check is maybe a little bit slower in the middle of a tight
> > loop, but I'm not sure it would matter much (I'd actually be curious if
> > this approach is faster than the existing traversal code, too).
> 
> I don't know how to measure this properly.  With 100 runs each I get
> this for the git repo and the silly test program below, which measures
> the duration of the respective function call:
> 
>meanstddev method
>--- -- --
>5.89763e+06 613106 clear_commit_marks
>2.72572e+06 507689 clear_commit_marks_all
>1.96582e+06 494753 clear_object_flags
> 
> So these are noisy numbers, but kind of in the expected range.

That's about what I'd expect. The "bad" case for looking at all objects
is when there are a bunch of objects loaded that _weren't_ part of this
particular traversal. I have no idea how often that happens, but we can
guess at the impact in the worst case: having done a previous --objects
traversal in the process and then traversing all of the commits a second
time, we'd probably have about 5-10x as many objects to look through for
that second path. So clear_commit_marks() would win there.

The absolute numbers are small enough that it probably doesn't matter
either way.

-Peff