Re: [PATCH v5 0/4] Improve abbreviation disambiguation

2017-10-12 Thread Jeff King
On Thu, Oct 12, 2017 at 09:21:15PM +0900, Junio C Hamano wrote:

> Derrick Stolee  writes:
> 
> > On 10/12/2017 8:02 AM, Derrick Stolee wrote:
> >> Changes since previous version:
> >>
> >> * Make 'pos' unsigned in get_hex_char_from_oid()
> >>
> >> * Check response from open_pack_index()
> >>
> >> * Small typos in commit messages
> >>
> >> Thanks,
> >>   Stolee
> >>
> > I forgot to mention that I rebased on master this morning to be sure
> > this doesn't conflict with the binary-search patch that was queued
> > this week.
> 
> Thanks for being extra careful.  
> 
> I've re-applied minor style fix s/(void\*)/(void \*)/ to 2/4 and 4/4
> but other than that, the difference between this round and what has
> been queued looked all reasonable.
> 
> Will replace.

This looks good to me, too.

At first I was going to point out the nit that unique_in_pack() is
quietly fixed in 4/4 for the empty-pack case. But I don't think it's
actually buggy. The examination of nth_packed_object_oid() would never
be triggered if "num" is 0. So it probably is fine simply to fix it
quietly along with adding the new function.

-Peff


Re: [PATCH v5 0/4] Improve abbreviation disambiguation

2017-10-12 Thread Junio C Hamano
Derrick Stolee  writes:

> On 10/12/2017 8:02 AM, Derrick Stolee wrote:
>> Changes since previous version:
>>
>> * Make 'pos' unsigned in get_hex_char_from_oid()
>>
>> * Check response from open_pack_index()
>>
>> * Small typos in commit messages
>>
>> Thanks,
>>   Stolee
>>
> I forgot to mention that I rebased on master this morning to be sure
> this doesn't conflict with the binary-search patch that was queued
> this week.

Thanks for being extra careful.  

I've re-applied minor style fix s/(void\*)/(void \*)/ to 2/4 and 4/4
but other than that, the difference between this round and what has
been queued looked all reasonable.

Will replace.



Re: [PATCH v5 0/4] Improve abbreviation disambiguation

2017-10-12 Thread Derrick Stolee

On 10/12/2017 8:02 AM, Derrick Stolee wrote:

Changes since previous version:

* Make 'pos' unsigned in get_hex_char_from_oid()

* Check response from open_pack_index()

* Small typos in commit messages

Thanks,
  Stolee

I forgot to mention that I rebased on master this morning to be sure 
this doesn't conflict with the binary-search patch that was queued this 
week.


Thanks,
 Stolee


[PATCH v5 0/4] Improve abbreviation disambiguation

2017-10-12 Thread Derrick Stolee
Changes since previous version:

* Make 'pos' unsigned in get_hex_char_from_oid()

* Check response from open_pack_index()

* Small typos in commit messages 

Thanks,
 Stolee

---

When displaying object ids, we frequently want to see an abbreviation
for easier typing. That abbreviation must be unambiguous among all
object ids.

The current implementation of find_unique_abbrev() performs a loop
checking if each abbreviation length is unambiguous until finding one
that works. This causes multiple round-trips to the disk when starting
with the default abbreviation length (usually 7) but needing up to 12
characters for an unambiguous short-sha. For very large repos, this
effect is pronounced and causes issues with several commands, from
obvious consumers `status` and `log` to less obvious commands such as
`fetch` and `push`.

This patch improves performance by iterating over objects matching the
short abbreviation only once, inspecting each object id, and reporting
the minimum length of an unambiguous abbreviation.

Add a new perf test for testing the performance of log while computing
OID abbreviations. Using --oneline --raw and --parents options maximizes
the number of OIDs to abbreviate while still spending some time
computing diffs. Below we report performance statistics for perf test
4211.6 from p4211-line-log.sh using three copies of the Linux repo:

| Packs | Loose  | Base Time | New Time | Rel%  |
|---||---|--|---|
|  1|  0 |   41.27 s |  38.93 s | -4.8% |
| 24|  0 |   98.04 s |  91.35 s | -5.7% |
| 23| 323952 |  117.78 s | 112.18 s | -4.8% |

Derrick Stolee (4):
  p4211-line-log.sh: add log --online --raw --parents perf test
  sha1_name: unroll len loop in find_unique_abbrev_r
  sha1_name: parse less while finding common prefix
  sha1_name: minimize OID comparisons during disambiguation
 sha1_name.c  | 135 +--
 t/perf/p4211-line-log.sh |   4 ++
 2 files changed, 123 insertions(+), 16 deletions(-)

-- 
2.14.1.538.g56ec8fc98.dirty