Re: Delta compression not so effective

2017-03-07 Thread Thomas Braun
> Marius Storm-Olsen wrote on March 4, 2017 at 09:27: [...] > I really don't want the files on the mailing list, so I'll send you a > link directly. However, small snippets for public discussions about > potential issues would be fine, obviously.

Re: Delta compression not so effective

2017-03-06 Thread Marius Storm-Olsen
On 3/5/2017 19:14, Linus Torvalds wrote: On Sat, Mar 4, 2017 at 12:27 AM, Marius Storm-Olsen wrote: I guess you could do the printout a bit earlier (on the "to_pack.objects[]" array - to_pack.nr_objects is the count there). That should show all of them. But the small objects

Re: Delta compression not so effective

2017-03-05 Thread Linus Torvalds
On Sat, Mar 4, 2017 at 12:27 AM, Marius Storm-Olsen wrote: > > I reran the repack with the options above (dropping the zlib=9, as you > suggested) > > $ time git -c pack.threads=4 repack -a -d -F \ > --window=350 --depth=250 --window-memory=30g > > and ended

Re: Delta compression not so effective

2017-03-04 Thread Marius Storm-Olsen
On 3/1/2017 18:43, Linus Torvalds wrote: So, this repo must be knocking several parts of Git's insides. I was curious about why it was so slow on the writing objects part, since the whole repo is on a 4x RAID 5, 7k spindles. Now, they are not SSDs sure, but the thing has ~400MB/s continuous

Re: Delta compression not so effective

2017-03-01 Thread Linus Torvalds
On Wed, Mar 1, 2017 at 4:12 PM, Marius Storm-Olsen wrote: > > No, the list of git verify-objects in the previous post was from the bottom > of the sorted list, so those are the largest blobs, ~249MB.. .. so with a 6GB window, you should easily still have 20+ objects. Not a huge

Re: Delta compression not so effective

2017-03-01 Thread Marius Storm-Olsen
On 3/1/2017 12:30, Linus Torvalds wrote: On Wed, Mar 1, 2017 at 9:57 AM, Marius Storm-Olsen wrote: Indeed, I did do a -c pack.threads=20 --window-memory=6g to 'git repack', since the machine is a 20-core (40 threads) machine with 126GB of RAM. So I guess with these

Re: Delta compression not so effective

2017-03-01 Thread Marius Storm-Olsen
On 3/1/2017 14:19, Martin Langhoff wrote: On Wed, Mar 1, 2017 at 8:51 AM, Marius Storm-Olsen wrote: BUT, even still, I would expect Git's delta compression to be quite effective, compared to the compression present in SVN. jar files are zipfiles. They don't delta in any

Re: Delta compression not so effective

2017-03-01 Thread Martin Langhoff
On Wed, Mar 1, 2017 at 8:51 AM, Marius Storm-Olsen wrote: > BUT, even still, I would expect Git's delta compression to be quite > effective, compared to the compression present in SVN. jar files are zipfiles. They don't delta in any useful form, and in fact they differ even
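
The point about zip-style archives not delta-compressing well can be illustrated with a small sketch. This is not code from the thread: the sample data, the helper name shared_chunk_ratio, and the 16-byte chunk size are all assumptions chosen for illustration, and the chunk-overlap measure is only a rough stand-in for what a delta algorithm can reuse. It compares two nearly identical inputs before and after zlib compression (the deflate format that zip and jar files typically use).

```python
import zlib


def shared_chunk_ratio(a: bytes, b: bytes, size: int = 16) -> float:
    """Fraction of fixed-size chunks of b that occur anywhere in a.

    A crude proxy for how much of b a delta against a could copy.
    """
    # Every size-byte substring of a, at any offset.
    chunks_a = {a[i:i + size] for i in range(len(a) - size + 1)}
    # Non-overlapping size-byte chunks of b.
    chunks_b = [b[i:i + size] for i in range(0, len(b) - size + 1, size)]
    hits = sum(1 for chunk in chunks_b if chunk in chunks_a)
    return hits / len(chunks_b)


# Hypothetical "file": repetitive source-like text (made up for this sketch).
base = b"".join(
    b"class Foo%d { int field%d; }\n" % (i, i * 7) for i in range(2000)
)
# Second version: identical except for one short line inserted near the start.
edited = base[:100] + b"// one inserted comment line #!?\n" + base[100:]

raw_ratio = shared_chunk_ratio(base, edited)
zip_ratio = shared_chunk_ratio(zlib.compress(base), zlib.compress(edited))

print(f"shared chunks, uncompressed: {raw_ratio:.3f}")
print(f"shared chunks, compressed:   {zip_ratio:.3f}")
```

Under these assumptions the uncompressed versions share almost all of their chunks, while the compressed streams should share far fewer, because a small edit changes the encoding of everything after it. Real jar files add further variation, such as per-entry metadata, that this sketch does not model.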

Re: Delta compression not so effective

2017-03-01 Thread Martin Langhoff
On Wed, Mar 1, 2017 at 1:30 PM, Linus Torvalds wrote: > For example, the sorting code thinks that objects with the same name > across the history are good sources of deltas. Marius has indicated he is working with jar files. IME jar and war files, which are

Re: Delta compression not so effective

2017-03-01 Thread Linus Torvalds
On Wed, Mar 1, 2017 at 9:57 AM, Marius Storm-Olsen wrote: > > Indeed, I did do a > -c pack.threads=20 --window-memory=6g > to 'git repack', since the machine is a 20-core (40 threads) machine with > 126GB of RAM. > > So I guess with these sized objects, even at 6GB per

Re: Delta compression not so effective

2017-03-01 Thread Marius Storm-Olsen
On 3/1/2017 11:36, Linus Torvalds wrote: On Wed, Mar 1, 2017 at 5:51 AM, Marius Storm-Olsen wrote: When first importing, I disabled gc to avoid any repacking until completed. When done importing, there was 209GB of all loose objects (~670k files). With the hopes of quick

Re: Delta compression not so effective

2017-03-01 Thread Linus Torvalds
On Wed, Mar 1, 2017 at 5:51 AM, Marius Storm-Olsen wrote: > > When first importing, I disabled gc to avoid any repacking until completed. > When done importing, there was 209GB of all loose objects (~670k files). > With the hopes of quick consolidation, I did a > git -c

Re: Delta compression not so effective

2017-03-01 Thread Junio C Hamano
On Wed, Mar 1, 2017 at 5:51 AM, Marius Storm-Olsen wrote: > ... which brought it down to 206GB in a single pack. I then ran > git repack -a -d -F --window=350 --depth=250 > which took it down to 203GB, where I'm at right now. Just a hunch. s/F/f/ perhaps? "-F" does not

Re: Delta compression not so effective

2017-03-01 Thread Junio C Hamano
On Wed, Mar 1, 2017 at 8:06 AM, Junio C Hamano wrote: > Just a hunch. s/F/f/ perhaps? "-F" does not allow Git to recover from poor Nah, sorry for the noise. Between -F and -f there shouldn't be any difference.

Delta compression not so effective

2017-03-01 Thread Marius Storm-Olsen
I have just converted an SVN repo to Git (using SubGit), where I feel delta compression has let me down :) Suffice it to say, this is a "traditional" SVN repo, with an extern/ blown out of proportion with many binary check-ins. BUT, even still, I would expect Git's delta compression to be