Re: [PATCH] Rewrite bulk-checkin.c:finish_bulk_checkin() to use a strbuf for handling packname

2014-02-27 Thread Junio C Hamano
Sun He  writes:

> Signed-off-by: Sun He 
> ---
>  bulk-checkin.c |   10 +-
>  1 files changed, 9 insertions(+), 1 deletions(-)
>
> diff --git a/bulk-checkin.c b/bulk-checkin.c
> index 118c625..8c47d71 100644
> --- a/bulk-checkin.c
> +++ b/bulk-checkin.c
> @@ -23,7 +23,8 @@ static struct bulk_checkin_state {
>  static void finish_bulk_checkin(struct bulk_checkin_state *state)
>  {
>   unsigned char sha1[20];
> - char packname[PATH_MAX];
> + char *packname;
> +struct strbuf sb;

Funny indentation.

>   int i;
>  
>   if (!state->f)
> @@ -43,6 +44,10 @@ static void finish_bulk_checkin(struct bulk_checkin_state 
> *state)
>   close(fd);
>   }
>  
> +/* 64-1 is more than the sum of len(sha1_to_hex(sha1)) and len(".pack") 
> */
> +strbuf_init(&sb,strlen(get_object_directory())+64);
> +packname = sb.buf;
> +
>   sprintf(packname, "%s/pack/pack-", get_object_directory());

If you are using strbuf why not use strbuf_addf() instead?  Then you
do not have to worry about "Is 64-1 enough?" and things like that.

>   finish_tmp_packfile(packname, state->pack_tmp_name,
>   state->written, state->nr_written,
> @@ -54,6 +59,9 @@ clear_exit:
>   free(state->written);
>   memset(state, 0, sizeof(*state));
>  
> +/* release sb space */
> +strbuf_release(&sb);

The function name is more than enough to explain what it does.  Drop
that comment.

>   /* Make objects we just wrote available to ourselves */
>   reprepare_packed_git();
>  }
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] Rewrite bulk-checkin.c:finish_bulk_checkin() to use a strbuf for handling packname

2014-02-28 Thread Eric Sunshine
On Fri, Feb 28, 2014 at 3:28 AM, Sun He  wrote:
> Signed-off-by: Sun He 
> ---

Nicely done.

Due to the necessary changes to finish_tmp_packfile(), the focus of
this patch has changed slightly from that suggested by the
microproject. It's really now about finish_tmp_packfile() rather than
finish_bulk_checkin(). As such, it may make sense to adjust the patch
subject to reflect this. For instance:

  Subject: finish_tmp_packfile(): use strbuf for pathname construction

More comments below.

]> diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c
> index c733379..72fb82b 100644
> --- a/builtin/pack-objects.c
> +++ b/builtin/pack-objects.c
> @@ -803,7 +803,7 @@ static void write_pack_file(void)
>
> if (!pack_to_stdout) {
> struct stat st;
> -   char tmpname[PATH_MAX];
> +   struct strbuf tmpname = STRBUF_INIT;
>
> /*
>  * Packs are runtime accessed in their mtime
> @@ -823,26 +823,22 @@ static void write_pack_file(void)
> utb.modtime = --last_mtime;
> if (utime(pack_tmp_name, &utb) < 0)
> warning("failed utime() on %s: %s",
> -   tmpname, strerror(errno));
> +   pack_tmp_name, 
> strerror(errno));

Good catch finding this bug (as your commentary below states).
Ideally, each conceptual change should be presented distinctly from
others, so this bug should have its own patch (even though it's just a
one-liner).

> }
>
> -   /* Enough space for "-.pack"? */
> -   if (sizeof(tmpname) <= strlen(base_name) + 50)
> -   die("pack base name '%s' too long", 
> base_name);
> -   snprintf(tmpname, sizeof(tmpname), "%s-", base_name);
> +   strbuf_addf(&tmpname, "%s-", base_name);
>
> if (write_bitmap_index) {
> bitmap_writer_set_checksum(sha1);
> bitmap_writer_build_type_index(written_list, 
> nr_written);
> }
>
> -   finish_tmp_packfile(tmpname, pack_tmp_name,
> +   finish_tmp_packfile(&tmpname, pack_tmp_name,
> written_list, nr_written,
> &pack_idx_opts, sha1);
>
> if (write_bitmap_index) {
> -   char *end_of_name_prefix = strrchr(tmpname, 
> 0);
> -   sprintf(end_of_name_prefix, "%s.bitmap", 
> sha1_to_hex(sha1));
> +   strbuf_addf(&tmpname, "%s.bitmap" 
> ,sha1_to_hex(sha1));
>
> stop_progress(&progress_state);
>
> diff --git a/pack-write.c b/pack-write.c
> index 9b8308b..2204aa9 100644
> --- a/pack-write.c
> +++ b/pack-write.c
> @@ -336,7 +336,7 @@ struct sha1file *create_tmp_packfile(char **pack_tmp_name)
> return sha1fd(fd, *pack_tmp_name);
>  }
>
> -void finish_tmp_packfile(char *name_buffer,
> +void finish_tmp_packfile(struct strbuf *name_buffer,
>  const char *pack_tmp_name,
>  struct pack_idx_entry **written_list,
>  uint32_t nr_written,
> @@ -344,7 +344,7 @@ void finish_tmp_packfile(char *name_buffer,
>  unsigned char sha1[])
>  {
> const char *idx_tmp_name;
> -   char *end_of_name_prefix = strrchr(name_buffer, 0);
> +   int pre_len = name_buffer->len;

Minor: Perhaps basename_len (or some such) would convey a bit more
meaning than pre_len.

> if (adjust_shared_perm(pack_tmp_name))
> die_errno("unable to make temporary pack file readable");
> @@ -354,17 +354,21 @@ void finish_tmp_packfile(char *name_buffer,
> if (adjust_shared_perm(idx_tmp_name))
> die_errno("unable to make temporary index file readable");
>
> -   sprintf(end_of_name_prefix, "%s.pack", sha1_to_hex(sha1));
> -   free_pack_by_name(name_buffer);
> +   strbuf_addf(name_buffer, "%s.pack", sha1_to_hex(sha1));
> +   free_pack_by_name(name_buffer->buf);
>
> -   if (rename(pack_tmp_name, name_buffer))
> +   if (rename(pack_tmp_name, name_buffer->buf))
> die_errno("unable to rename temporary pack file");
>
> -   sprintf(end_of_name_prefix, "%s.idx", sha1_to_hex(sha1));
> -   if (rename(idx_tmp_name, name_buffer))
> +   /* remove added suffix string(sha1.pack) */
> +   strbuf_remove(name_buffer, pre_len, name_buffer->len - pre_len);

Since you are merely truncating the buffer back to pre_len, perhaps

strbuf_setlen(name_buffer, pre_len)

would be more idiomatic and easier to read (and would allow you to
drop

Re: [PATCH] Rewrite bulk-checkin.c:finish_bulk_checkin() to use a strbuf for handling packname

2014-02-28 Thread Eric Sunshine
On Fri, Feb 28, 2014 at 4:46 AM, Eric Sunshine  wrote:
> On Fri, Feb 28, 2014 at 3:28 AM, Sun He  wrote:
>> Signed-off-by: Sun He 
>> ---
>
> Nicely done.
>
> Due to the necessary changes to finish_tmp_packfile(), the focus of
> this patch has changed slightly from that suggested by the
> microproject. It's really now about finish_tmp_packfile() rather than
> finish_bulk_checkin(). As such, it may make sense to adjust the patch
> subject to reflect this. For instance:
>
>   Subject: finish_tmp_packfile(): use strbuf for pathname construction

You may also want your commit message to explain why you chose this
approach over other legitimate approaches. For instance, your change
places extra burden on callers of finish_tmp_packfile() by leaking a
detail of its implementation: namely that it wants to manipulate
pathnames easily (via strbuf). An equally valid and more encapsulated
approach would be for finish_tmp_packfile() to accept a 'const char *'
and construct its own strbuf internally.

If the extra strbuf creation in the alternate approach is measurably
slower, then you could use that fact to justify your implementation
choice in the commit message. On the other hand, if it's not
measurably slower, then perhaps the more encapsulated approach with
cleaner API might be preferable.

> More comments below.
>
> ]> diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c
>> index c733379..72fb82b 100644
>> --- a/builtin/pack-objects.c
>> +++ b/builtin/pack-objects.c
>> @@ -803,7 +803,7 @@ static void write_pack_file(void)
>>
>> if (!pack_to_stdout) {
>> struct stat st;
>> -   char tmpname[PATH_MAX];
>> +   struct strbuf tmpname = STRBUF_INIT;
>>
>> /*
>>  * Packs are runtime accessed in their mtime
>> @@ -823,26 +823,22 @@ static void write_pack_file(void)
>> utb.modtime = --last_mtime;
>> if (utime(pack_tmp_name, &utb) < 0)
>> warning("failed utime() on %s: %s",
>> -   tmpname, strerror(errno));
>> +   pack_tmp_name, 
>> strerror(errno));
>
> Good catch finding this bug (as your commentary below states).
> Ideally, each conceptual change should be presented distinctly from
> others, so this bug should have its own patch (even though it's just a
> one-liner).
>
>> }
>>
>> -   /* Enough space for "-.pack"? */
>> -   if (sizeof(tmpname) <= strlen(base_name) + 50)
>> -   die("pack base name '%s' too long", 
>> base_name);
>> -   snprintf(tmpname, sizeof(tmpname), "%s-", base_name);
>> +   strbuf_addf(&tmpname, "%s-", base_name);
>>
>> if (write_bitmap_index) {
>> bitmap_writer_set_checksum(sha1);
>> bitmap_writer_build_type_index(written_list, 
>> nr_written);
>> }
>>
>> -   finish_tmp_packfile(tmpname, pack_tmp_name,
>> +   finish_tmp_packfile(&tmpname, pack_tmp_name,
>> written_list, nr_written,
>> &pack_idx_opts, sha1);
>>
>> if (write_bitmap_index) {
>> -   char *end_of_name_prefix = strrchr(tmpname, 
>> 0);
>> -   sprintf(end_of_name_prefix, "%s.bitmap", 
>> sha1_to_hex(sha1));
>> +   strbuf_addf(&tmpname, "%s.bitmap" 
>> ,sha1_to_hex(sha1));
>>
>> stop_progress(&progress_state);
>>
>> diff --git a/pack-write.c b/pack-write.c
>> index 9b8308b..2204aa9 100644
>> --- a/pack-write.c
>> +++ b/pack-write.c
>> @@ -336,7 +336,7 @@ struct sha1file *create_tmp_packfile(char 
>> **pack_tmp_name)
>> return sha1fd(fd, *pack_tmp_name);
>>  }
>>
>> -void finish_tmp_packfile(char *name_buffer,
>> +void finish_tmp_packfile(struct strbuf *name_buffer,
>>  const char *pack_tmp_name,
>>  struct pack_idx_entry **written_list,
>>  uint32_t nr_written,
>> @@ -344,7 +344,7 @@ void finish_tmp_packfile(char *name_buffer,
>>  unsigned char sha1[])
>>  {
>> const char *idx_tmp_name;
>> -   char *end_of_name_prefix = strrchr(name_buffer, 0);
>> +   int pre_len = name_buffer->len;
>
> Minor: Perhaps basename_len (or some such) would convey a bit more
> meaning than pre_len.
>
>> if (adjust_shared_perm(pack_tmp_name))
>> die_errno("unable to make temporary pack file readable");
>> @@ -354,17 +354,21 @@ void finish_tmp_packfile(char *name_buffer,
>> if (adjust_shared_pe

Re: [PATCH] Rewrite bulk-checkin.c:finish_bulk_checkin() to use a strbuf for handling packname

2014-02-28 Thread 孙赫
2014-02-28 21:12 GMT+08:00 Eric Sunshine [via git]
:
> On Fri, Feb 28, 2014 at 4:46 AM, Eric Sunshine <[hidden email]> wrote:
>
>> On Fri, Feb 28, 2014 at 3:28 AM, Sun He <[hidden email]> wrote:
>>> Signed-off-by: Sun He <[hidden email]>
>>> ---
>>
>> Nicely done.
>>
>> Due to the necessary changes to finish_tmp_packfile(), the focus of
>> this patch has changed slightly from that suggested by the
>> microproject. It's really now about finish_tmp_packfile() rather than
>> finish_bulk_checkin(). As such, it may make sense to adjust the patch
>> subject to reflect this. For instance:
>>
>>   Subject: finish_tmp_packfile(): use strbuf for pathname construction
>
> You may also want your commit message to explain why you chose this
> approach over other legitimate approaches. For instance, your change
> places extra burden on callers of finish_tmp_packfile() by leaking a
> detail of its implementation: namely that it wants to manipulate
> pathnames easily (via strbuf). An equally valid and more encapsulated
> approach would be for finish_tmp_packfile() to accept a 'const char *'
> and construct its own strbuf internally.
>
> If the extra strbuf creation in the alternate approach is measurably
> slower, then you could use that fact to justify your implementation
> choice in the commit message. On the other hand, if it's not
> measurably slower, then perhaps the more encapsulated approach with
> cleaner API might be preferable.
>

Thank you for your explaination. I get the point.
And I think if it is proved that the strbuf way is measurably slower.
We should add a check for the length of string before we sprintf().

>> More comments below.
>>
>> ]> diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c
>>> index c733379..72fb82b 100644
>>> --- a/builtin/pack-objects.c
>>> +++ b/builtin/pack-objects.c
>>> @@ -803,7 +803,7 @@ static void write_pack_file(void)
>>>
>>> if (!pack_to_stdout) {
>>> struct stat st;
>>> -   char tmpname[PATH_MAX];
>>> +   struct strbuf tmpname = STRBUF_INIT;
>>>
>>> /*
>>>  * Packs are runtime accessed in their mtime
>>> @@ -823,26 +823,22 @@ static void write_pack_file(void)
>>> utb.modtime = --last_mtime;
>>> if (utime(pack_tmp_name, &utb) < 0)
>>> warning("failed utime() on %s:
>>> %s",
>>> -   tmpname,
>>> strerror(errno));
>>> +   pack_tmp_name,
>>> strerror(errno));
>>
>> Good catch finding this bug (as your commentary below states).
>> Ideally, each conceptual change should be presented distinctly from
>> others, so this bug should have its own patch (even though it's just a
>> one-liner).
>>
>>> }
>>>
>>> -   /* Enough space for "-.pack"? */
>>> -   if (sizeof(tmpname) <= strlen(base_name) + 50)
>>> -   die("pack base name '%s' too long",
>>> base_name);
>>> -   snprintf(tmpname, sizeof(tmpname), "%s-",
>>> base_name);
>>> +   strbuf_addf(&tmpname, "%s-", base_name);
>>>
>>> if (write_bitmap_index) {
>>> bitmap_writer_set_checksum(sha1);
>>>
>>> bitmap_writer_build_type_index(written_list, nr_written);
>>> }
>>>
>>> -   finish_tmp_packfile(tmpname, pack_tmp_name,
>>> +   finish_tmp_packfile(&tmpname, pack_tmp_name,
>>> written_list, nr_written,
>>> &pack_idx_opts, sha1);
>>>
>>> if (write_bitmap_index) {
>>> -   char *end_of_name_prefix =
>>> strrchr(tmpname, 0);
>>> -   sprintf(end_of_name_prefix, "%s.bitmap",
>>> sha1_to_hex(sha1));
>>> +   strbuf_addf(&tmpname, "%s.bitmap"
>>> ,sha1_to_hex(sha1));
>>>
>>> stop_progress(&progress_state);
>>>
>>> diff --git a/pack-write.c b/pack-write.c
>>> index 9b8308b..2204aa9 100644
>>> --- a/pack-write.c
>>> +++ b/pack-write.c
>>> @@ -336,7 +336,7 @@ struct sha1file *create_tmp_packfile(char
>>> **pack_tmp_name)
>>> return sha1fd(fd, *pack_tmp_name);
>>>  }
>>>
>>> -void finish_tmp_packfile(char *name_buffer,
>>> +void finish_tmp_packfile(struct strbuf *name_buffer,
>>>  const char *pack_tmp_name,
>>>  struct pack_idx_entry **written_list,
>>>  uint32_t nr_written,
>>> @@ -344,7 +344,7 @@ void finish_tmp_packfile(char *name_buffer,
>>>  unsigned char sha1[])
>>>  {
>>> const char *idx_tmp_name;
>>> -   char *end_of_name_prefix = strrchr

Re: [PATCH] Rewrite bulk-checkin.c:finish_bulk_checkin() to use a strbuf for handling packname

2014-02-28 Thread Eric Sunshine
On Fri, Feb 28, 2014 at 9:17 AM, 孙赫  wrote:
> 2014-02-28 21:12 GMT+08:00 Eric Sunshine [via git]
> :
>> On Fri, Feb 28, 2014 at 4:46 AM, Eric Sunshine <[hidden email]> wrote:
>>
>>> On Fri, Feb 28, 2014 at 3:28 AM, Sun He <[hidden email]> wrote:
 Signed-off-by: Sun He <[hidden email]>
 ---
>>>
>>> Nicely done.
>>>
>>> Due to the necessary changes to finish_tmp_packfile(), the focus of
>>> this patch has changed slightly from that suggested by the
>>> microproject. It's really now about finish_tmp_packfile() rather than
>>> finish_bulk_checkin(). As such, it may make sense to adjust the patch
>>> subject to reflect this. For instance:
>>>
>>>   Subject: finish_tmp_packfile(): use strbuf for pathname construction
>>
>> You may also want your commit message to explain why you chose this
>> approach over other legitimate approaches. For instance, your change
>> places extra burden on callers of finish_tmp_packfile() by leaking a
>> detail of its implementation: namely that it wants to manipulate
>> pathnames easily (via strbuf). An equally valid and more encapsulated
>> approach would be for finish_tmp_packfile() to accept a 'const char *'
>> and construct its own strbuf internally.
>>
>> If the extra strbuf creation in the alternate approach is measurably
>> slower, then you could use that fact to justify your implementation
>> choice in the commit message. On the other hand, if it's not
>> measurably slower, then perhaps the more encapsulated approach with
>> cleaner API might be preferable.
>>
>
> Thank you for your explaination. I get the point.
> And I think if it is proved that the strbuf way is measurably slower.
> We should add a check for the length of string before we sprintf().

I'm not sure what you mean by checking the string length before sprintf().

My point was that if there are multiple ways of solving the same
problem, it can be helpful for reviewers if your commit message
explains why the solution you chose is better than the others.

Slowness and/or cleanliness of API were just examples you might use in
your commit message for justifying why you chose one approach over
another.
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] Rewrite bulk-checkin.c:finish_bulk_checkin() to use a strbuf for handling packname

2014-02-28 Thread Eric Sunshine
On Fri, Feb 28, 2014 at 10:54 AM, 孙赫  wrote:
> 2014-02-28 17:47 GMT+08:00 Eric Sunshine [via git]
> :
>> On Fri, Feb 28, 2014 at 3:28 AM, Sun He <[hidden email]> wrote:
>>> Signed-off-by: Sun He <[hidden email]>
>>> ---
>> > diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c
>>
>>> index c733379..72fb82b 100644
>>> --- a/builtin/pack-objects.c
>>> +++ b/builtin/pack-objects.c
>>> @@ -803,7 +803,7 @@ static void write_pack_file(void)
>>>
>>> if (!pack_to_stdout) {
>>> struct stat st;
>>> -   char tmpname[PATH_MAX];
>>> +   struct strbuf tmpname = STRBUF_INIT;
>>>
>>> /*
>>>  * Packs are runtime accessed in their mtime
>>> @@ -823,26 +823,22 @@ static void write_pack_file(void)
>>> utb.modtime = --last_mtime;
>>> if (utime(pack_tmp_name, &utb) < 0)
>>> warning("failed utime() on %s:
>>> %s",
>>> -   tmpname, strerror(errno));
>>> +   pack_tmp_name,
>>> strerror(errno));
>>
>> Good catch finding this bug (as your commentary below states).
>> Ideally, each conceptual change should be presented distinctly from
>> others, so this bug should have its own patch (even though it's just a
>> one-liner).
>>
>
> OK. Should I send this patch?

Yes, it is perfectly acceptable (and encouraged) to send this patch as
a submission distinct from your microproject.
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] Rewrite bulk-checkin.c:finish_bulk_checkin() to use a strbuf for handling packname

2014-02-28 Thread He Sun
2014-03-01 4:42 GMT+08:00 Eric Sunshine :
> On Fri, Feb 28, 2014 at 9:17 AM, 孙赫  wrote:
>> 2014-02-28 21:12 GMT+08:00 Eric Sunshine [via git]
>> :
>>> On Fri, Feb 28, 2014 at 4:46 AM, Eric Sunshine <[hidden email]> wrote:
>>>
 On Fri, Feb 28, 2014 at 3:28 AM, Sun He <[hidden email]> wrote:
> Signed-off-by: Sun He <[hidden email]>
> ---

 Nicely done.

 Due to the necessary changes to finish_tmp_packfile(), the focus of
 this patch has changed slightly from that suggested by the
 microproject. It's really now about finish_tmp_packfile() rather than
 finish_bulk_checkin(). As such, it may make sense to adjust the patch
 subject to reflect this. For instance:

   Subject: finish_tmp_packfile(): use strbuf for pathname construction
>>>
>>> You may also want your commit message to explain why you chose this
>>> approach over other legitimate approaches. For instance, your change
>>> places extra burden on callers of finish_tmp_packfile() by leaking a
>>> detail of its implementation: namely that it wants to manipulate
>>> pathnames easily (via strbuf). An equally valid and more encapsulated
>>> approach would be for finish_tmp_packfile() to accept a 'const char *'
>>> and construct its own strbuf internally.
>>>
>>> If the extra strbuf creation in the alternate approach is measurably
>>> slower, then you could use that fact to justify your implementation
>>> choice in the commit message. On the other hand, if it's not
>>> measurably slower, then perhaps the more encapsulated approach with
>>> cleaner API might be preferable.
>>>
>>
>> Thank you for your explaination. I get the point.
>> And I think if it is proved that the strbuf way is measurably slower.
>> We should add a check for the length of string before we sprintf().
>
> I'm not sure what you mean by checking the string length before sprintf().
>

That's because one is not certain of the length of get_object_directory()

Micheal Mhaggerty explained this to me.
Saving stack space is nice, though given that it takes more time to
allocate space on the heap, it is nonetheless usually preferred to use
the stack for temporary variables of this size.

The problem has more to do with the fact that the old version fixes a
maximum length on the buffer, which could be a problem if one is not
certain of the length of get_object_directory().

The other point of strbuf is that you don't have to do the error-prone
bookkeeping yourself.  So it would be preferable to use strbuf_addf().

It is the same as yours about the space and time costs. ^_^

> My point was that if there are multiple ways of solving the same
> problem, it can be helpful for reviewers if your commit message
> explains why the solution you chose is better than the others.
>
> Slowness and/or cleanliness of API were just examples you might use in
> your commit message for justifying why you chose one approach over
> another.

OK, OK, Got it. Thank you very much.

Cheers,
He Sun
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html