On 22/03/13 09:59, Johan Hake wrote:
> On 03/22/2013 10:57 AM, Anders Logg wrote:
>> On Fri, Mar 22, 2013 at 10:52:25AM +0100, Johan Hake wrote:
>>> On 03/22/2013 10:36 AM, Anders Logg wrote:
>>>> On Fri, Mar 22, 2013 at 10:32:50AM +0100, Johan Hake wrote:
>>>>>>
>>>>>> Not exactly:
>>>>>>
>>>>>> - Meshes in demos --> remove (already done)
>>>>> I suggest we keep these. There aren't any big files
>>>>> anyhow, are there?
>>>>
>>>> They have already been removed and there's a good system in
>>>> place for handling them. Keeping the meshes elsewhere will
>>>> encourage use of the mesh gallery and keeping better track
>>>> of which meshes to use. There were lots of meshes named
>>>> 'mesh.xml' or 'mesh2d.xml' which were really copies of other
>>>> meshes used in other demos, some of them were gzipped, some
>>>> not etc. That's all very clean now. Take a look at how it's
>>>> done in trunk. I think it looks quite nice.
>>>
>>> Nice and clean, but it really is just 30 meshes. Duplications
>>> are mostly related to dolfin_fine.xml.gz, which there are 7
>>> copies of, and that file is 86K.

If they're bit-by-bit identical git will only store a single copy in the
repository anyway, regardless of how many copies you happen to have in
the working tree.

On the note of storing gzipped meshes: Do they change frequently? Why
are they stored gzipped? Compressed files have a few issues:
1) they're treated as binary i.e. any change requires a new copy of the
   entire file to be stored
2) they can't be diffed
3) git compresses its packfiles anyway, so there is little (if any)
   space gain through compression

>>>> Most of the example meshes are not that big, but multiply
>>>> that by 30 and then some when meshes are moved around or
>>>> renamed.
>>>
>>> I just question if it is worth it. Seems convenient to just
>>> have the meshes there.
>>
>> Keeping the meshes there will put a limit on which demos we can
>> add. I think it would be good to allow for more complex demos
>> requiring bigger meshes (not necessarily run on the buildbot
>> every day).
>
> Ok.
>
>>> If we keep them out of the repo I think we should include some
>>> automagic downloading when building the demos.
>>
>> Yes, or at least a message stating: "You have not downloaded demo
>> data. Please run the script foo."
>>
>>> Also should we rename the script to download-demo-meshes, or
>>> something more descriptive, as this is what that script now
>>> basically does?
>>
>> It is not only meshes, but also markers and velocity fields.
>> Perhaps it can be renamed download-demo-data?
>
> Sounds good.
>
> Johan

I did some more experimenting:

1) Repository size: there is quite some mileage repacking the repos with
   the following steps:

   $ git reflog expire --expire=now --all
   $ git gc --aggressive --prune=now
   $ git repack -ad

   e.g. DOLFIN: 372MiB -> 94MiB

2) Stripping out the files suggested by Anders
   (https://gist.github.com/alogg/5213171#file-files_to_strip-txt)
   brings the repo size down to 172MiB and 24MiB after repacking.

3) I haven't yet found a reliable way to migrate feature branches to the
   filtered repository. Filtering the repository rewrites its history
   and therefore changes/invalidates all commit ids (SHA1s), and with
   them the marks files created when initially converting the
   repository.

   There are 2 possible options for filtering the repository during
   conversion:

   a) bzr fast-import-filter: seems to be a pain to use with many files
      (need to pass each path individually as an argument) and seems
      not to support writing marks files, therefore I haven't tried it.

   b) git_fast_filter: when using it to filter the converted git repo,
      the exported marks file in the last step contains 83932 marks
      instead of the expected 14399 - I can't say why. Unfortunately I
      haven't been able to use it directly in the conversion pipeline;
      it's not compatible with a bzr fast-export stream. That's
      probably fixable, but I can't estimate how much work it would be
      to fix it since I'm not familiar enough with the details of the
      fast-import format.

TL;DR: Repacking repos saves a lot of space already without stripping
large files. Stripping files is easy to do and saves considerably more
space, but I haven't been able to reliably import feature branches into
a filtered repository.

Florian
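
PS: To illustrate the point above that bit-by-bit identical files are
stored only once, here is a minimal Python sketch of how git derives a
blob id, assuming a repository using the default SHA-1 object format.
The helper name git_blob_id and the sample bytes are made up for
illustration; the bytes are only a stand-in for a real mesh file. The
id depends on the content alone, not on the path, so several copies of
the same file in the working tree map to one object.

```python
import hashlib


def git_blob_id(content: bytes) -> str:
    """Return the SHA-1 id git assigns to a blob holding `content`.

    git hashes the header "blob <size>\\0" followed by the raw bytes.
    The file name and location are not part of the hash.
    """
    header = b"blob " + str(len(content)).encode("ascii") + b"\0"
    return hashlib.sha1(header + content).hexdigest()


# Hypothetical stand-in for the bytes of a mesh file such as
# dolfin_fine.xml.gz (not the real file content).
mesh_bytes = b"<dolfin><mesh/></dolfin>\n"

# Seven copies of the same bytes, as they might appear in seven demos.
copies = [bytes(mesh_bytes) for _ in range(7)]
unique_ids = {git_blob_id(c) for c in copies}

# A single changed byte yields a different blob id, i.e. a new object.
modified_id = git_blob_id(mesh_bytes + b" ")

print(len(unique_ids), modified_id in unique_ids)
```

The same id is what `git hash-object <file>` reports, so this is easy
to cross-check against a real checkout.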
_______________________________________________
Mailing list: https://launchpad.net/~fenics
Post to     : [email protected]
Unsubscribe : https://launchpad.net/~fenics
More help   : https://help.launchpad.net/ListHelp

