Re: [PATCH 05/23] midx: write header information to lockfile
On 6/19/2018 10:59 AM, Duy Nguyen wrote: On Tue, Jun 19, 2018 at 2:54 PM Derrick Stolee wrote: On 6/12/2018 11:00 AM, Duy Nguyen wrote: On Thu, Jun 7, 2018 at 7:01 PM Derrick Stolee wrote: diff --git a/midx.c b/midx.c index 616af66b13..3e55422a21 100644 --- a/midx.c +++ b/midx.c @@ -1,9 +1,62 @@ #include "git-compat-util.h" #include "cache.h" #include "dir.h" +#include "csum-file.h" +#include "lockfile.h" #include "midx.h" +#define MIDX_SIGNATURE 0x4d494458 /* "MIDX" */ +#define MIDX_VERSION 1 +#define MIDX_HASH_VERSION 1 /* SHA-1 */ ... +static size_t write_midx_header(struct hashfile *f, + unsigned char num_chunks, + uint32_t num_packs) +{ + char byte_values[4]; + hashwrite_be32(f, MIDX_SIGNATURE); + byte_values[0] = MIDX_VERSION; + byte_values[1] = MIDX_HASH_VERSION; Quoting from "State of NewHash work, future directions, and discussion" [1] * If you need to serialize an algorithm identifier into your data format, use the format_id field of struct git_hash_algo. It's designed specifically for that purpose. [1] https://public-inbox.org/git/20180612024252.ga141...@aiede.svl.corp.google.com/T/#m5fdd09dcaf31266c45343fb6c0beaaa3e928bc60 Thanks! I'll also use the_hash_algo->rawsz to infer the length of the hash function. BTW, since you're the author of commit-graph.c and may notice it has the same problem. Don't touch that code. Brian already has some WIP changes [1]. We just make sure new code does not add extra work for him. I expect he'll send all those patches out soon. [1] https://github.com/bk2204/git/commit/3f9031e06cfb21534eb7dfff7b54e7598ac1149f Thanks for the link. It seems he is creating an oid_version() method that returns a 1-byte version for the hash version instead of the 4-byte signature of the_hash_algo->format_id. I look forward to incorporating that into the MIDX format. I'll keep my macros for now, as we work out the other details, and while Brain's patch is cooking. Thanks, -Stolee
Re: [PATCH 05/23] midx: write header information to lockfile
On Tue, Jun 19, 2018 at 2:54 PM Derrick Stolee wrote: > > On 6/12/2018 11:00 AM, Duy Nguyen wrote: > > On Thu, Jun 7, 2018 at 7:01 PM Derrick Stolee wrote: > >> diff --git a/midx.c b/midx.c > >> index 616af66b13..3e55422a21 100644 > >> --- a/midx.c > >> +++ b/midx.c > >> @@ -1,9 +1,62 @@ > >> #include "git-compat-util.h" > >> #include "cache.h" > >> #include "dir.h" > >> +#include "csum-file.h" > >> +#include "lockfile.h" > >> #include "midx.h" > >> > >> +#define MIDX_SIGNATURE 0x4d494458 /* "MIDX" */ > >> +#define MIDX_VERSION 1 > >> +#define MIDX_HASH_VERSION 1 /* SHA-1 */ > > ... > >> +static size_t write_midx_header(struct hashfile *f, > >> + unsigned char num_chunks, > >> + uint32_t num_packs) > >> +{ > >> + char byte_values[4]; > >> + hashwrite_be32(f, MIDX_SIGNATURE); > >> + byte_values[0] = MIDX_VERSION; > >> + byte_values[1] = MIDX_HASH_VERSION; > > Quoting from "State of NewHash work, future directions, and discussion" [1] > > > > * If you need to serialize an algorithm identifier into your data > >format, use the format_id field of struct git_hash_algo. It's > >designed specifically for that purpose. > > > > [1] > > https://public-inbox.org/git/20180612024252.ga141...@aiede.svl.corp.google.com/T/#m5fdd09dcaf31266c45343fb6c0beaaa3e928bc60 > > Thanks! I'll also use the_hash_algo->rawsz to infer the length of the > hash function. BTW, since you're the author of commit-graph.c and may notice it has the same problem. Don't touch that code. Brian already has some WIP changes [1]. We just make sure new code does not add extra work for him. I expect he'll send all those patches out soon. [1] https://github.com/bk2204/git/commit/3f9031e06cfb21534eb7dfff7b54e7598ac1149f -- Duy
Re: [PATCH 05/23] midx: write header information to lockfile
On 6/12/2018 11:00 AM, Duy Nguyen wrote: On Thu, Jun 7, 2018 at 7:01 PM Derrick Stolee wrote: diff --git a/midx.c b/midx.c index 616af66b13..3e55422a21 100644 --- a/midx.c +++ b/midx.c @@ -1,9 +1,62 @@ #include "git-compat-util.h" #include "cache.h" #include "dir.h" +#include "csum-file.h" +#include "lockfile.h" #include "midx.h" +#define MIDX_SIGNATURE 0x4d494458 /* "MIDX" */ +#define MIDX_VERSION 1 +#define MIDX_HASH_VERSION 1 /* SHA-1 */ ... +static size_t write_midx_header(struct hashfile *f, + unsigned char num_chunks, + uint32_t num_packs) +{ + char byte_values[4]; + hashwrite_be32(f, MIDX_SIGNATURE); + byte_values[0] = MIDX_VERSION; + byte_values[1] = MIDX_HASH_VERSION; Quoting from "State of NewHash work, future directions, and discussion" [1] * If you need to serialize an algorithm identifier into your data format, use the format_id field of struct git_hash_algo. It's designed specifically for that purpose. [1] https://public-inbox.org/git/20180612024252.ga141...@aiede.svl.corp.google.com/T/#m5fdd09dcaf31266c45343fb6c0beaaa3e928bc60 Thanks! I'll also use the_hash_algo->rawsz to infer the length of the hash function.
Re: [PATCH 05/23] midx: write header information to lockfile
On Thu, Jun 7, 2018 at 7:01 PM Derrick Stolee wrote: > diff --git a/midx.c b/midx.c > index 616af66b13..3e55422a21 100644 > --- a/midx.c > +++ b/midx.c > @@ -1,9 +1,62 @@ > #include "git-compat-util.h" > #include "cache.h" > #include "dir.h" > +#include "csum-file.h" > +#include "lockfile.h" > #include "midx.h" > > +#define MIDX_SIGNATURE 0x4d494458 /* "MIDX" */ > +#define MIDX_VERSION 1 > +#define MIDX_HASH_VERSION 1 /* SHA-1 */ ... > +static size_t write_midx_header(struct hashfile *f, > + unsigned char num_chunks, > + uint32_t num_packs) > +{ > + char byte_values[4]; > + hashwrite_be32(f, MIDX_SIGNATURE); > + byte_values[0] = MIDX_VERSION; > + byte_values[1] = MIDX_HASH_VERSION; Quoting from "State of NewHash work, future directions, and discussion" [1] * If you need to serialize an algorithm identifier into your data format, use the format_id field of struct git_hash_algo. It's designed specifically for that purpose. [1] https://public-inbox.org/git/20180612024252.ga141...@aiede.svl.corp.google.com/T/#m5fdd09dcaf31266c45343fb6c0beaaa3e928bc60 > + byte_values[2] = num_chunks; > + byte_values[3] = 0; /* unused */ > + hashwrite(f, byte_values, sizeof(byte_values)); > + hashwrite_be32(f, num_packs); > + > + return MIDX_HEADER_SIZE; > +} -- Duy
Re: [PATCH 05/23] midx: write header information to lockfile
On Thu, Jun 7, 2018 at 4:03 PM, Derrick Stolee wrote: > +static char *get_midx_filename(const char *object_dir) > +{ > + struct strbuf midx_name = STRBUF_INIT; > + strbuf_addstr(_name, object_dir); > + strbuf_addstr(_name, "/pack/multi-pack-index"); > + return strbuf_detach(_name, NULL); > +} I think this whole function can be written as xstrfmt("%s/pack/multi-pack-index", object_dir); > + > +static size_t write_midx_header(struct hashfile *f, > + unsigned char num_chunks, > + uint32_t num_packs) > +{ > + char byte_values[4]; unsigned char just to be on the safe side? 'char' is signed on ARM if I remember correctly. -- Duy
[PATCH 05/23] midx: write header information to lockfile
As we begin writing the multi-pack-index format to disk, start with the basics: the 12-byte header and the 20-byte checksum footer. Start with these basics so we can add the rest of the format in small increments. As we implement the format, we will use a technique to check that our computed offsets within the multi-pack-index file match what we are actually writing. Each method that writes to the hashfile will return the number of bytes written, and we will track that those values match our expectations. Currently, write_midx_header() returns 12, but is not checked. We will check the return value in a later commit. Signed-off-by: Derrick Stolee --- midx.c | 53 + t/t5319-midx.sh | 5 +++-- 2 files changed, 56 insertions(+), 2 deletions(-) diff --git a/midx.c b/midx.c index 616af66b13..3e55422a21 100644 --- a/midx.c +++ b/midx.c @@ -1,9 +1,62 @@ #include "git-compat-util.h" #include "cache.h" #include "dir.h" +#include "csum-file.h" +#include "lockfile.h" #include "midx.h" +#define MIDX_SIGNATURE 0x4d494458 /* "MIDX" */ +#define MIDX_VERSION 1 +#define MIDX_HASH_VERSION 1 /* SHA-1 */ +#define MIDX_HEADER_SIZE 12 + +static char *get_midx_filename(const char *object_dir) +{ + struct strbuf midx_name = STRBUF_INIT; + strbuf_addstr(_name, object_dir); + strbuf_addstr(_name, "/pack/multi-pack-index"); + return strbuf_detach(_name, NULL); +} + +static size_t write_midx_header(struct hashfile *f, + unsigned char num_chunks, + uint32_t num_packs) +{ + char byte_values[4]; + hashwrite_be32(f, MIDX_SIGNATURE); + byte_values[0] = MIDX_VERSION; + byte_values[1] = MIDX_HASH_VERSION; + byte_values[2] = num_chunks; + byte_values[3] = 0; /* unused */ + hashwrite(f, byte_values, sizeof(byte_values)); + hashwrite_be32(f, num_packs); + + return MIDX_HEADER_SIZE; +} + int write_midx_file(const char *object_dir) { + unsigned char num_chunks = 0; + uint32_t num_packs = 0; + char *midx_name; + struct hashfile *f; + struct lock_file lk; + + midx_name = get_midx_filename(object_dir); + if (safe_create_leading_directories(midx_name)) { + UNLEAK(midx_name); + die_errno(_("unable to create leading directories of %s"), + midx_name); + } + + hold_lock_file_for_update(, midx_name, LOCK_DIE_ON_ERROR); + f = hashfd(lk.tempfile->fd, lk.tempfile->filename.buf); + FREE_AND_NULL(midx_name); + + write_midx_header(f, num_chunks, num_packs); + + finalize_hashfile(f, NULL, CSUM_FSYNC | CSUM_HASH_IN_STREAM); + commit_lock_file(); + return 0; } diff --git a/t/t5319-midx.sh b/t/t5319-midx.sh index a590137af7..80f9389837 100755 --- a/t/t5319-midx.sh +++ b/t/t5319-midx.sh @@ -3,8 +3,9 @@ test_description='multi-pack-indexes' . ./test-lib.sh -test_expect_success 'write midx with no pakcs' ' - git midx --object-dir=. write +test_expect_success 'write midx with no packs' ' + git midx --object-dir=. write && + test_path_is_file pack/multi-pack-index ' test_done -- 2.18.0.rc1