Re: [PATCH v2 1/1] Use size_t instead of 'unsigned long' for data in memory

2018-11-21 Thread Derrick Stolee

On 11/20/2018 12:04 AM, tbo...@web.de wrote:

From: Torsten Bögershausen 

Currently the length of data which is stored in memory is stored
in "unsigned long" at many places in the code base.
This is OK when both "unsigned long" and size_t are 32 bits,
(and is OK when both are 64 bits).
On a 64 bit Windows system am "unsigned long" is 32 bit, and
that may be too short to measure the size of objects in memory,
a size_t is the natural choice.


Is this change enough to store 4GB files on Windows? Or is there more to 
do?



Thanks for all the comments on V1.
Changes since V1:
- Make the motivation somewhat clearer in the commit message
- Rebase to the November 19 pu

What we really need for this patch to fly are this branches:
mk/use-size-t-in-zlib
tb/print-size-t-with-uintmax-format


I believe communicating these direct dependencies is the correct way to 
go, and the rebase you mentioned is unnecessary (instead, test with a 
merge).


Hopefully the patch applies on top of a merge of those branches.


@@ -529,7 +530,7 @@ static void *unpack_raw_entry(struct object_entry *obj,
  }
  
  static void *unpack_data(struct object_entry *obj,

-int (*consume)(const unsigned char *, unsigned long, 
void *),
+int (*consume)(const unsigned char *, size_t, void *),
 void *cb_data)


This is the only instance that is not paired directly with a variable 
name like "size", "sz", or "len". However, it appears to be correct from 
context.


Thanks for this!

Reviewed-by: Derrick Stolee 



[PATCH v2 1/1] Use size_t instead of 'unsigned long' for data in memory

2018-11-19 Thread tboegi
From: Torsten Bögershausen 

Currently the length of data which is stored in memory is stored
in "unsigned long" at many places in the code base.
This is OK when both "unsigned long" and size_t are 32 bits,
(and is OK when both are 64 bits).
On a 64 bit Windows system am "unsigned long" is 32 bit, and
that may be too short to measure the size of objects in memory,
a size_t is the natural choice.

Improve the code base in "small steps", as small as possible.
The smallest step seems to be much bigger than expected.
See this code-snippet from convert.c:

const char *ret;
unsigned long sz;
void *data = read_blob_data_from_index(istate, path, );
ret = gather_convert_stats_ascii(data, sz);

The corrected version looks like this:
const char *ret;
size_t sz;
void *data = read_blob_data_from_index(istate, path, );
ret = gather_convert_stats_ascii(data, sz);

However, when the Git code base is compiled with a compiler that
complains that "unsigned long" is different from size_t, we end
up in this huge patch, before the code base cleanly compiles.

Signed-off-by: Torsten Bögershausen 
---

Thanks for all the comments on V1.
Changes since V1:
- Make the motivation somewhat clearer in the commit message
- Rebase to the November 19 pu

What we really need for this patch to fly are this branches:
mk/use-size-t-in-zlib
tb/print-size-t-with-uintmax-format

And then it is rebased on top of all cooking stuff, too many branches
to be mentioned here.

It may be usefull to examine all "unsigned long" which are left after
this patch and turn them into (what ? unsigned int? size_t? uint32_t ?).
And once they are settled, re-do this patch with help of a coccinelle script.
I don't know.
I probably will rebase it until Junio says stop or someone else comes with
a better solution.

apply.c  | 14 -
 archive-tar.c| 18 +--
 archive-zip.c|  2 +-
 archive.c|  2 +-
 archive.h|  2 +-
 bisect.c |  2 +-
 blame.c  |  6 ++--
 blame.h  |  2 +-
 builtin/cat-file.c   | 10 +++---
 builtin/difftool.c   |  2 +-
 builtin/fast-export.c|  6 ++--
 builtin/fmt-merge-msg.c  |  3 +-
 builtin/fsck.c   |  6 ++--
 builtin/grep.c   |  8 ++---
 builtin/index-pack.c | 27 
 builtin/log.c|  4 +--
 builtin/ls-tree.c|  2 +-
 builtin/merge-tree.c |  6 ++--
 builtin/mktag.c  |  4 +--
 builtin/notes.c  |  6 ++--
 builtin/pack-objects.c   | 56 +-
 builtin/reflog.c |  2 +-
 builtin/replace.c|  2 +-
 builtin/tag.c|  4 +--
 builtin/unpack-file.c|  2 +-
 builtin/unpack-objects.c | 35 ++---
 builtin/verify-commit.c  |  4 +--
 bundle.c |  2 +-
 cache.h  | 10 +++---
 combine-diff.c   | 11 ---
 commit.c | 22 +++---
 commit.h | 10 +++---
 config.c |  2 +-
 convert.c| 18 +--
 delta.h  | 20 ++--
 diff-delta.c |  4 +--
 diff.c   | 30 +-
 diff.h   |  2 +-
 diffcore-pickaxe.c   |  4 +--
 diffcore.h   |  2 +-
 dir.c|  6 ++--
 dir.h|  2 +-
 entry.c  |  4 +--
 fast-import.c| 26 
 fsck.c   | 12 
 fsck.h   |  2 +-
 fuzz-pack-headers.c  |  4 +--
 grep.h   |  2 +-
 http-push.c  |  2 +-
 list-objects-filter.c|  2 +-
 mailmap.c|  2 +-
 match-trees.c|  4 +--
 merge-blobs.c|  6 ++--
 merge-blobs.h|  2 +-
 merge-recursive.c|  4 +--
 notes-cache.c|  2 +-
 notes-merge.c|  4 +--
 notes.c  |  6 ++--
 object-store.h   | 20 ++--
 object.c |  4 +--
 object.h |  2 +-
 pack-check.c |  2 +-
 pack-objects.h   | 14 -
 pack.h   |  2 +-
 packfile.c   | 40 
 packfile.h   |  8 ++---
 patch-delta.c|  8 ++---
 range-diff.c |  2 +-
 read-cache.c | 48 ++---
 ref-filter.c | 30 +-
 remote-testsvn.c |  4 +--
 rerere.c |  2 +-
 sha1-file.c  | 66 
 sha1dc_git.c |  2 +-
 sha1dc_git.h |  2 +-
 streaming.c  | 12 
 streaming.h  |  2 +-
 submodule-config.c   |  2 +-
 t/helper/test-delta.c|  2 +-
 tag.c|  6 ++--
 tag.h|  2 +-
 tree-walk.c  | 14 -
 tree.c