Re: [PATCH v4 12/24] read-cache: read index-v5
On Wed, Nov 27, 2013 at 7:00 PM, Thomas Gummerer t.gumme...@gmail.com wrote: --- a/cache.h +++ b/cache.h @@ -132,11 +141,17 @@ struct cache_entry { char name[FLEX_ARRAY]; /* more */ }; +#define CE_NAMEMASK (0x0fff) CE_NAMEMASK is redefined in read-cache-v2.c in read-cache: move index v2 specific functions to their own file. My gcc is smart enough to see the two defines are about the same value and does not warn me. But we should remove one (likely this one as I see no use of this macro outside read-cache-v2.c) #define CE_STAGEMASK (0x3000) #define CE_EXTENDED (0x4000) #define CE_VALID (0x8000) +#define CE_SMUDGED (0x0400) /* index v5 only flag */ #define CE_STAGESHIFT 12 +#define CONFLICT_CONFLICTED (0x8000) +#define CONFLICT_STAGESHIFT 13 +#define CONFLICT_STAGEMASK (0x6000) + /* * Range 0x in ce_flags is divided into * two parts: in-memory flags and on-disk ones. diff --git a/read-cache-v5.c b/read-cache-v5.c new file mode 100644 index 000..9d8c8f0 --- /dev/null +++ b/read-cache-v5.c +static int read_index_v5(struct index_state *istate, void *mmap, +unsigned long mmap_size, struct filter_opts *opts) +{ + unsigned int entry_offset, foffsetblock, nr = 0, *extoffsets; + unsigned int dir_offset, dir_table_offset; + int need_root = 0, i; + uint32_t *offset; + struct directory_entry *root_directory, *de, *last_de; + const char **paths = NULL; + struct pathspec adjusted_pathspec; + struct cache_header *hdr; + struct cache_header_v5 *hdr_v5; + + hdr = mmap; + hdr_v5 = ptr_add(mmap, sizeof(*hdr)); + istate-cache_alloc = alloc_nr(ntohl(hdr-hdr_entries)); + istate-cache = xcalloc(istate-cache_alloc, sizeof(struct cache_entry *)); + extoffsets = xcalloc(ntohl(hdr_v5-hdr_nextension), sizeof(int)); + for (i = 0; i ntohl(hdr_v5-hdr_nextension); i++) { + offset = ptr_add(mmap, sizeof(*hdr) + sizeof(*hdr_v5)); + extoffsets[i] = htonl(*offset); + } + + /* Skip size of the header + crc sum + size of offsets to extensions + size of offsets */ + dir_offset = sizeof(*hdr) + sizeof(*hdr_v5) + ntohl(hdr_v5-hdr_nextension) * 4 + 4 + + (ntohl(hdr_v5-hdr_ndir) + 1) * 4; + dir_table_offset = sizeof(*hdr) + sizeof(*hdr_v5) + ntohl(hdr_v5-hdr_nextension) * 4 + 4; + root_directory = read_directories(dir_offset, dir_table_offset, + mmap, mmap_size); + + entry_offset = ntohl(hdr_v5-hdr_fblockoffset); + foffsetblock = dir_offset; + + if (opts opts-pathspec opts-pathspec-nr) { + paths = xmalloc((opts-pathspec-nr + 1)*sizeof(char *)); + paths[opts-pathspec-nr] = NULL; Put this statement here GUARD_PATHSPEC(opts-pathspec, PATHSPEC_FROMTOP | PATHSPEC_MAXDEPTH | PATHSPEC_LITERAL | PATHSPEC_GLOB | PATHSPEC_ICASE); This says the mentioned magic is safe in this code. New magic may or may not be and needs to be checked (soonest by me, I'm going to add negative pathspec and I'll need to look into how it should be handled in this code block). + for (i = 0; i opts-pathspec-nr; i++) { + char *super = strdup(opts-pathspec-items[i].match); + int len = strlen(super); You should only check as far as items[i].nowildcard_len, not strlen(). The rest could be wildcards and stuff and not so reliable. + while (len super[len - 1] == '/' super[len - 2] == '/') + super[--len] = '\0'; /* strip all but one trailing slash */ + while (len super[--len] != '/') + ; /* scan backwards to next / */ + if (len = 0) + super[len--] = '\0'; + if (len = 0) { + need_root = 1; + break; + } + paths[i] = super; + } And maybe put the comment FIXME: consider merging this code with create_simplify() in dir.c somewhere. It's for me to look for things to do when I'm bored ;-) + } + + if (!need_root) + parse_pathspec(adjusted_pathspec, PATHSPEC_ALL_MAGIC, PATHSPEC_PREFER_CWD, NULL, paths); I would go with PATHSPEC_PREFER_FULL instead of _CWD as it's safer. Looking only at this function without caller context, it's hard to say if _CWD is the right choice. + + de = root_directory; + last_de = de; This statement is redundant. last_de is only used in one code block below and it's always re-initialized before entering the loop to skip subdirs. + while (de) { + if (need_root || +
Re: [PATCH v4 12/24] read-cache: read index-v5
Duy Nguyen pclo...@gmail.com writes: On Wed, Nov 27, 2013 at 7:00 PM, Thomas Gummerer t.gumme...@gmail.com wrote: --- a/cache.h +++ b/cache.h @@ -132,11 +141,17 @@ struct cache_entry { char name[FLEX_ARRAY]; /* more */ }; +#define CE_NAMEMASK (0x0fff) CE_NAMEMASK is redefined in read-cache-v2.c in read-cache: move index v2 specific functions to their own file. My gcc is smart enough to see the two defines are about the same value and does not warn me. But we should remove one (likely this one as I see no use of this macro outside read-cache-v2.c) Thanks for catching that, there's no need to have it here. I'll remove it in the re-roll. #define CE_STAGEMASK (0x3000) #define CE_EXTENDED (0x4000) #define CE_VALID (0x8000) +#define CE_SMUDGED (0x0400) /* index v5 only flag */ #define CE_STAGESHIFT 12 +#define CONFLICT_CONFLICTED (0x8000) +#define CONFLICT_STAGESHIFT 13 +#define CONFLICT_STAGEMASK (0x6000) + /* * Range 0x in ce_flags is divided into * two parts: in-memory flags and on-disk ones. diff --git a/read-cache-v5.c b/read-cache-v5.c new file mode 100644 index 000..9d8c8f0 --- /dev/null +++ b/read-cache-v5.c +static int read_index_v5(struct index_state *istate, void *mmap, +unsigned long mmap_size, struct filter_opts *opts) +{ + unsigned int entry_offset, foffsetblock, nr = 0, *extoffsets; + unsigned int dir_offset, dir_table_offset; + int need_root = 0, i; + uint32_t *offset; + struct directory_entry *root_directory, *de, *last_de; + const char **paths = NULL; + struct pathspec adjusted_pathspec; + struct cache_header *hdr; + struct cache_header_v5 *hdr_v5; + + hdr = mmap; + hdr_v5 = ptr_add(mmap, sizeof(*hdr)); + istate-cache_alloc = alloc_nr(ntohl(hdr-hdr_entries)); + istate-cache = xcalloc(istate-cache_alloc, sizeof(struct cache_entry *)); + extoffsets = xcalloc(ntohl(hdr_v5-hdr_nextension), sizeof(int)); + for (i = 0; i ntohl(hdr_v5-hdr_nextension); i++) { + offset = ptr_add(mmap, sizeof(*hdr) + sizeof(*hdr_v5)); + extoffsets[i] = htonl(*offset); + } + + /* Skip size of the header + crc sum + size of offsets to extensions + size of offsets */ + dir_offset = sizeof(*hdr) + sizeof(*hdr_v5) + ntohl(hdr_v5-hdr_nextension) * 4 + 4 + + (ntohl(hdr_v5-hdr_ndir) + 1) * 4; + dir_table_offset = sizeof(*hdr) + sizeof(*hdr_v5) + ntohl(hdr_v5-hdr_nextension) * 4 + 4; + root_directory = read_directories(dir_offset, dir_table_offset, + mmap, mmap_size); + + entry_offset = ntohl(hdr_v5-hdr_fblockoffset); + foffsetblock = dir_offset; + + if (opts opts-pathspec opts-pathspec-nr) { + paths = xmalloc((opts-pathspec-nr + 1)*sizeof(char *)); + paths[opts-pathspec-nr] = NULL; Put this statement here GUARD_PATHSPEC(opts-pathspec, PATHSPEC_FROMTOP | PATHSPEC_MAXDEPTH | PATHSPEC_LITERAL | PATHSPEC_GLOB | PATHSPEC_ICASE); This says the mentioned magic is safe in this code. New magic may or may not be and needs to be checked (soonest by me, I'm going to add negative pathspec and I'll need to look into how it should be handled in this code block). Thanks, I'll add the statement in the re-roll. + for (i = 0; i opts-pathspec-nr; i++) { + char *super = strdup(opts-pathspec-items[i].match); + int len = strlen(super); You should only check as far as items[i].nowildcard_len, not strlen(). The rest could be wildcards and stuff and not so reliable. Ok, will change it in the re-roll. + while (len super[len - 1] == '/' super[len - 2] == '/') + super[--len] = '\0'; /* strip all but one trailing slash */ + while (len super[--len] != '/') + ; /* scan backwards to next / */ + if (len = 0) + super[len--] = '\0'; + if (len = 0) { + need_root = 1; + break; + } + paths[i] = super; + } And maybe put the comment FIXME: consider merging this code with create_simplify() in dir.c somewhere. It's for me to look for things to do when I'm bored ;-) Heh, thanks, will do. + } + + if (!need_root) + parse_pathspec(adjusted_pathspec, PATHSPEC_ALL_MAGIC, PATHSPEC_PREFER_CWD, NULL, paths); I would go with PATHSPEC_PREFER_FULL instead of _CWD as it's safer. Looking only at this function without caller context, it's hard to say if _CWD is the right choice. Ok, thanks, will change. + + de = root_directory;
Re: [PATCH v4 12/24] read-cache: read index-v5
On Wed, Nov 27, 2013 at 1:00 PM, Thomas Gummerer t.gumme...@gmail.com wrote: Make git read the index file version 5 without complaining. This version of the reader reads neither the cache-tree nor the resolve undo data, however, it won't choke on an index that includes such data. Helped-by: Junio C Hamano gits...@pobox.com Helped-by: Nguyen Thai Ngoc Duy pclo...@gmail.com Helped-by: Thomas Rast tr...@student.ethz.ch Signed-off-by: Thomas Gummerer t.gumme...@gmail.com --- [...] +static struct directory_entry *read_directories(unsigned int *dir_offset, + unsigned int *dir_table_offset, + void *mmap, int mmap_size) Minor nit: why is this mmap_size int while all others are unsigned long ? [...] +static int read_entry(struct cache_entry **ce, char *pathname, size_t pathlen, + void *mmap, unsigned long mmap_size, + unsigned int first_entry_offset, + unsigned int foffsetblock) [...] +static int read_entries(struct index_state *istate, struct directory_entry *de, + unsigned int first_entry_offset, void *mmap, + unsigned long mmap_size, unsigned int *nr, + unsigned int foffsetblock) [...] +static int read_index_v5(struct index_state *istate, void *mmap, +unsigned long mmap_size, struct filter_opts *opts) -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v4 12/24] read-cache: read index-v5
On Wed, Nov 27, 2013 at 1:00 PM, Thomas Gummerer t.gumme...@gmail.com wrote: +static int verify_hdr(void *mmap, unsigned long size) +{ + uint32_t *filecrc; + unsigned int header_size; + struct cache_header *hdr; + struct cache_header_v5 *hdr_v5; + + if (size sizeof(struct cache_header) + + sizeof (struct cache_header_v5) + 4) + die(index file smaller than expected); + + hdr = mmap; + hdr_v5 = ptr_add(mmap, sizeof(*hdr)); + /* Size of the header + the size of the extensionoffsets */ + header_size = sizeof(*hdr) + sizeof(*hdr_v5) + hdr_v5-hdr_nextension * 4; + /* Initialize crc */ + filecrc = ptr_add(mmap, header_size); + if (!check_crc32(0, hdr, header_size, ntohl(*filecrc))) + return error(bad index file header crc signature); + return 0; +} I find it curious that we actually need a value from the header (and use it for pointer arithmetic) to check that the header is valid. The application will crash before the crc is checked if hdr_v5-hdr_nextensions is corrupted. Or am I missing something ? -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v4 12/24] read-cache: read index-v5
Antoine Pelisse apeli...@gmail.com writes: On Wed, Nov 27, 2013 at 1:00 PM, Thomas Gummerer t.gumme...@gmail.com wrote: Make git read the index file version 5 without complaining. This version of the reader reads neither the cache-tree nor the resolve undo data, however, it won't choke on an index that includes such data. Helped-by: Junio C Hamano gits...@pobox.com Helped-by: Nguyen Thai Ngoc Duy pclo...@gmail.com Helped-by: Thomas Rast tr...@student.ethz.ch Signed-off-by: Thomas Gummerer t.gumme...@gmail.com --- [...] +static struct directory_entry *read_directories(unsigned int *dir_offset, + unsigned int *dir_table_offset, + void *mmap, int mmap_size) Minor nit: why is this mmap_size int while all others are unsigned long ? Thanks for catching that. It should be unsigned long here to. Will fix in the re-roll. [...] +static int read_entry(struct cache_entry **ce, char *pathname, size_t pathlen, + void *mmap, unsigned long mmap_size, + unsigned int first_entry_offset, + unsigned int foffsetblock) [...] +static int read_entries(struct index_state *istate, struct directory_entry *de, + unsigned int first_entry_offset, void *mmap, + unsigned long mmap_size, unsigned int *nr, + unsigned int foffsetblock) [...] +static int read_index_v5(struct index_state *istate, void *mmap, +unsigned long mmap_size, struct filter_opts *opts) -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v4 12/24] read-cache: read index-v5
Antoine Pelisse apeli...@gmail.com writes: On Wed, Nov 27, 2013 at 1:00 PM, Thomas Gummerer t.gumme...@gmail.com wrote: +static int verify_hdr(void *mmap, unsigned long size) +{ + uint32_t *filecrc; + unsigned int header_size; + struct cache_header *hdr; + struct cache_header_v5 *hdr_v5; + + if (size sizeof(struct cache_header) + + sizeof (struct cache_header_v5) + 4) + die(index file smaller than expected); + + hdr = mmap; + hdr_v5 = ptr_add(mmap, sizeof(*hdr)); + /* Size of the header + the size of the extensionoffsets */ + header_size = sizeof(*hdr) + sizeof(*hdr_v5) + hdr_v5-hdr_nextension * 4; + /* Initialize crc */ + filecrc = ptr_add(mmap, header_size); + if (!check_crc32(0, hdr, header_size, ntohl(*filecrc))) + return error(bad index file header crc signature); + return 0; +} I find it curious that we actually need a value from the header (and use it for pointer arithmetic) to check that the header is valid. The application will crash before the crc is checked if hdr_v5-hdr_nextensions is corrupted. Or am I missing something ? Good catch, I'm the one that was missing something here. We still need to use the value from the header before calculating the crc, but should check if header_size - 4 is less than the total size of the index file. Then even if the header is corrupted we won't read anything that is not mmap'ed and thus won't crash. This guard should also be included for everything else that checks the crc checksum, as that has the same problems and the calculated place in the file for the crc might be after the end of the file. Thanks, will fix in the re-roll. -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html