Re: [PATCH v4 02/10] commit: add generation number to struct commmit

2018-04-30 Thread Derrick Stolee

On 4/28/2018 6:35 PM, Jakub Narebski wrote:

Derrick Stolee  writes:


The generation number of a commit is defined recursively as follows:

* If a commit A has no parents, then the generation number of A is one.
* If a commit A has parents, then the generation number of A is one
   more than the maximum generation number among the parents of A.

Very minor nitpick: it would be more readable wrapped differently:

   * If a commit A has parents, then the generation number of A is
 one more than the maximum generation number among parents of A.

Very minor nitpick: possibly "parents", not "the parents", but I am
not native English speaker.


Add a uint32_t generation field to struct commit so we can pass this
information to revision walks. We use three special values to signal
the generation number is invalid:

GENERATION_NUMBER_INFINITY 0x
GENERATION_NUMBER_MAX 0x3FFF
GENERATION_NUMBER_ZERO 0

The first (_INFINITY) means the generation number has not been loaded or
computed. The second (_MAX) means the generation number is too large to
store in the commit-graph file. The third (_ZERO) means the generation
number was loaded from a commit graph file that was written by a version
of git that did not support generation numbers.

Good explanation; I wonder if we want to have it in some shortened form
also in comments, and not only in the commit message.


Signed-off-by: Derrick Stolee 
---
  alloc.c| 1 +
  commit-graph.c | 2 ++
  commit.h   | 4 
  3 files changed, 7 insertions(+)

I have reordered patches to make it easier to review.


diff --git a/commit.h b/commit.h
index 23a3f364ed..aac3b8c56f 100644
--- a/commit.h
+++ b/commit.h
@@ -10,6 +10,9 @@
  #include "pretty.h"
  
  #define COMMIT_NOT_FROM_GRAPH 0x

+#define GENERATION_NUMBER_INFINITY 0x
+#define GENERATION_NUMBER_MAX 0x3FFF
+#define GENERATION_NUMBER_ZERO 0

I wonder if it wouldn't be good to have some short in-line comments
explaining those constants, or a block comment above them.

  
  struct commit_list {

struct commit *item;
@@ -30,6 +33,7 @@ struct commit {
 */
struct tree *maybe_tree;
uint32_t graph_pos;
+   uint32_t generation;
  };
  
  extern int save_commit_buffer;

All right, simple addition of the new field.  Nothing to go wrong here.

Sidenote: With 0x7FFF being (if I am not wrong) maximum graph_pos
and maximum number of nodes in commit graph, we won't hit 0x3FFF
generation number limit for all except very, very linear histories.


Both of these limits are far away from being realistic. But we could 
extend the maximum graph_pos independently from the maximum generation 
number now that we have the "capped" logic.





diff --git a/alloc.c b/alloc.c
index cf4f8b61e1..e8ab14f4a1 100644
--- a/alloc.c
+++ b/alloc.c
@@ -94,6 +94,7 @@ void *alloc_commit_node(void)
c->object.type = OBJ_COMMIT;
c->index = alloc_commit_index();
c->graph_pos = COMMIT_NOT_FROM_GRAPH;
+   c->generation = GENERATION_NUMBER_INFINITY;
return c;
  }

All right, start with initializing it with "not from commit-graph" value
after allocation.

  
diff --git a/commit-graph.c b/commit-graph.c

index 70fa1b25fd..9ad21c3ffb 100644
--- a/commit-graph.c
+++ b/commit-graph.c
@@ -262,6 +262,8 @@ static int fill_commit_in_graph(struct commit *item, struct 
commit_graph *g, uin
date_low = get_be32(commit_data + g->hash_len + 12);
item->date = (timestamp_t)((date_high << 32) | date_low);
  
+	item->generation = get_be32(commit_data + g->hash_len + 8) >> 2;

+

I guess we should not worry about these "magical constants" sprinkled
here, like "+ 8" above.

Let's examine how it goes, taking a look at commit-graph-format.txt
in Documentation/technical/commit-graph-format.txt

  * The first H (g->hash_len) bytes are for the OID of the root tree.
  * The next 8 bytes are for the positions of the first two parents [...]

So 'commit_data + g->hash_len + 8' is our offset from the start of
commit data.  All right.

   * The next 8 bytes store the generation number of the commit and
 the commit time in seconds since EPOCH.  The generation number
 uses the higher 30 bits of the first 4 bytes. [...]

The higher 30 bits of the 4 bytes, which is 32 bits, means that we need
to shift 32-bit value 2 bits right, so that we get lower 30 bits of
32-bit value.  All right.

   All 4-byte numbers are in network order.

Shouldn't it be ntohl() to convert from network order to host order, and
not get_be32()?  I guess they are the same (network order is big-endian
order), and get_be32() is what rest of git uses...


ntohl() takes a 32-bit value, while get_be32() takes a pointer. This 
makes pulling network-bytes out of streams much cleaner with get_be32(), 
so I try to use that whenever possible.




Re: [PATCH v4 02/10] commit: add generation number to struct commmit

2018-04-28 Thread Jakub Narebski
Derrick Stolee  writes:

> The generation number of a commit is defined recursively as follows:
>
> * If a commit A has no parents, then the generation number of A is one.
> * If a commit A has parents, then the generation number of A is one
>   more than the maximum generation number among the parents of A.

Very minor nitpick: it would be more readable wrapped differently:

  * If a commit A has parents, then the generation number of A is
one more than the maximum generation number among parents of A.

Very minor nitpick: possibly "parents", not "the parents", but I am
not native English speaker.

>
> Add a uint32_t generation field to struct commit so we can pass this
> information to revision walks. We use three special values to signal
> the generation number is invalid:
>
> GENERATION_NUMBER_INFINITY 0x
> GENERATION_NUMBER_MAX 0x3FFF
> GENERATION_NUMBER_ZERO 0
>
> The first (_INFINITY) means the generation number has not been loaded or
> computed. The second (_MAX) means the generation number is too large to
> store in the commit-graph file. The third (_ZERO) means the generation
> number was loaded from a commit graph file that was written by a version
> of git that did not support generation numbers.

Good explanation; I wonder if we want to have it in some shortened form
also in comments, and not only in the commit message.

>
> Signed-off-by: Derrick Stolee 
> ---
>  alloc.c| 1 +
>  commit-graph.c | 2 ++
>  commit.h   | 4 
>  3 files changed, 7 insertions(+)

I have reordered patches to make it easier to review.

> diff --git a/commit.h b/commit.h
> index 23a3f364ed..aac3b8c56f 100644
> --- a/commit.h
> +++ b/commit.h
> @@ -10,6 +10,9 @@
>  #include "pretty.h"
>  
>  #define COMMIT_NOT_FROM_GRAPH 0x
> +#define GENERATION_NUMBER_INFINITY 0x
> +#define GENERATION_NUMBER_MAX 0x3FFF
> +#define GENERATION_NUMBER_ZERO 0

I wonder if it wouldn't be good to have some short in-line comments
explaining those constants, or a block comment above them.

>  
>  struct commit_list {
>   struct commit *item;
> @@ -30,6 +33,7 @@ struct commit {
>*/
>   struct tree *maybe_tree;
>   uint32_t graph_pos;
> + uint32_t generation;
>  };
>  
>  extern int save_commit_buffer;

All right, simple addition of the new field.  Nothing to go wrong here.

Sidenote: With 0x7FFF being (if I am not wrong) maximum graph_pos
and maximum number of nodes in commit graph, we won't hit 0x3FFF
generation number limit for all except very, very linear histories.

>
> diff --git a/alloc.c b/alloc.c
> index cf4f8b61e1..e8ab14f4a1 100644
> --- a/alloc.c
> +++ b/alloc.c
> @@ -94,6 +94,7 @@ void *alloc_commit_node(void)
>   c->object.type = OBJ_COMMIT;
>   c->index = alloc_commit_index();
>   c->graph_pos = COMMIT_NOT_FROM_GRAPH;
> + c->generation = GENERATION_NUMBER_INFINITY;
>   return c;
>  }

All right, start with initializing it with "not from commit-graph" value
after allocation.

>  
> diff --git a/commit-graph.c b/commit-graph.c
> index 70fa1b25fd..9ad21c3ffb 100644
> --- a/commit-graph.c
> +++ b/commit-graph.c
> @@ -262,6 +262,8 @@ static int fill_commit_in_graph(struct commit *item, 
> struct commit_graph *g, uin
>   date_low = get_be32(commit_data + g->hash_len + 12);
>   item->date = (timestamp_t)((date_high << 32) | date_low);
>  
> + item->generation = get_be32(commit_data + g->hash_len + 8) >> 2;
> +

I guess we should not worry about these "magical constants" sprinkled
here, like "+ 8" above.

Let's examine how it goes, taking a look at commit-graph-format.txt
in Documentation/technical/commit-graph-format.txt

 * The first H (g->hash_len) bytes are for the OID of the root tree.
 * The next 8 bytes are for the positions of the first two parents [...]

So 'commit_data + g->hash_len + 8' is our offset from the start of
commit data.  All right.

  * The next 8 bytes store the generation number of the commit and
the commit time in seconds since EPOCH.  The generation number
uses the higher 30 bits of the first 4 bytes. [...]

The higher 30 bits of the 4 bytes, which is 32 bits, means that we need
to shift 32-bit value 2 bits right, so that we get lower 30 bits of
32-bit value.  All right.

  All 4-byte numbers are in network order.

Shouldn't it be ntohl() to convert from network order to host order, and
not get_be32()?  I guess they are the same (network order is big-endian
order), and get_be32() is what rest of git uses...

Looks all right.

>   pptr = >parents;
>  
>   edge_value = get_be32(commit_data + g->hash_len);


[PATCH v4 02/10] commit: add generation number to struct commmit

2018-04-25 Thread Derrick Stolee
The generation number of a commit is defined recursively as follows:

* If a commit A has no parents, then the generation number of A is one.
* If a commit A has parents, then the generation number of A is one
  more than the maximum generation number among the parents of A.

Add a uint32_t generation field to struct commit so we can pass this
information to revision walks. We use three special values to signal
the generation number is invalid:

GENERATION_NUMBER_INFINITY 0x
GENERATION_NUMBER_MAX 0x3FFF
GENERATION_NUMBER_ZERO 0

The first (_INFINITY) means the generation number has not been loaded or
computed. The second (_MAX) means the generation number is too large to
store in the commit-graph file. The third (_ZERO) means the generation
number was loaded from a commit graph file that was written by a version
of git that did not support generation numbers.

Signed-off-by: Derrick Stolee 
---
 alloc.c| 1 +
 commit-graph.c | 2 ++
 commit.h   | 4 
 3 files changed, 7 insertions(+)

diff --git a/alloc.c b/alloc.c
index cf4f8b61e1..e8ab14f4a1 100644
--- a/alloc.c
+++ b/alloc.c
@@ -94,6 +94,7 @@ void *alloc_commit_node(void)
c->object.type = OBJ_COMMIT;
c->index = alloc_commit_index();
c->graph_pos = COMMIT_NOT_FROM_GRAPH;
+   c->generation = GENERATION_NUMBER_INFINITY;
return c;
 }
 
diff --git a/commit-graph.c b/commit-graph.c
index 70fa1b25fd..9ad21c3ffb 100644
--- a/commit-graph.c
+++ b/commit-graph.c
@@ -262,6 +262,8 @@ static int fill_commit_in_graph(struct commit *item, struct 
commit_graph *g, uin
date_low = get_be32(commit_data + g->hash_len + 12);
item->date = (timestamp_t)((date_high << 32) | date_low);
 
+   item->generation = get_be32(commit_data + g->hash_len + 8) >> 2;
+
pptr = >parents;
 
edge_value = get_be32(commit_data + g->hash_len);
diff --git a/commit.h b/commit.h
index 23a3f364ed..aac3b8c56f 100644
--- a/commit.h
+++ b/commit.h
@@ -10,6 +10,9 @@
 #include "pretty.h"
 
 #define COMMIT_NOT_FROM_GRAPH 0x
+#define GENERATION_NUMBER_INFINITY 0x
+#define GENERATION_NUMBER_MAX 0x3FFF
+#define GENERATION_NUMBER_ZERO 0
 
 struct commit_list {
struct commit *item;
@@ -30,6 +33,7 @@ struct commit {
 */
struct tree *maybe_tree;
uint32_t graph_pos;
+   uint32_t generation;
 };
 
 extern int save_commit_buffer;
-- 
2.17.0.39.g685157f7fb