Martin Desruisseaux created SIS-422:
---------------------------------------

             Summary: Migrate from SVN to Git as the main SIS code repository
                 Key: SIS-422
                 URL: https://issues.apache.org/jira/browse/SIS-422
             Project: Spatial Information Systems
          Issue Type: Improvement
            Reporter: Martin Desruisseaux
            Assignee: Martin Desruisseaux
             Fix For: 1.0


Migrate to git as the main source code repository. After this work:

* The source code repository will become 
https://gitbox.apache.org/repos/asf/sis.
* The https://svn.apache.org/repos/asf/sis/trunk/ repository will become 
read-only.

We will continue to use Subversion for {{site}}, {{sis-data}} and 
{{non-free}}. Before making the new Git repository available for use, we will 
try to clean up its history by removing large files, especially:

* {{California_Restaurants.csv}} (19 MB)
* {{DEPARTEMENT.SHP}} (3 MB)
* {{ANC90Ply_4326.shp}} (0.7 MB)

Those large files were identified as follows (source: 
[Stack Overflow|https://stackoverflow.com/questions/10622179/how-to-find-identify-large-files-commits-in-git-history]):

{code:Bash}
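# List every object reachable from any ref, together with its path.
git rev-list --objects --all | sort -k 2 > allfileshas.txt
# List the blobs stored in the pack, sorted by size (third column), largest first.
git gc && git verify-pack -v .git/objects/pack/pack-*.idx | grep -E '^\w+ blob\W+[0-9]+ [0-9]+ [0-9]+$' | sort -k 3 -n -r > bigobjects.txt
# Join the two lists: print SHA, size and path for each blob.
for SHA in $(cut -f 1 -d ' ' < bigobjects.txt); do
    echo $(grep "$SHA" bigobjects.txt) $(grep "$SHA" allfileshas.txt) | awk '{print $1,$3,$7}' >> bigtosmall.txt
done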
{code}
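
The same kind of listing can also be obtained without repacking first. The following is only a sketch, assuming a Git version that supports {{git -C}} and {{git cat-file --batch-check}} with a format string; {{list_large_blobs}} is a hypothetical helper name, not the command used for this issue.

```shell
#!/usr/bin/env bash
# list_large_blobs <repository-dir> [count]   (hypothetical helper)
# Prints "size SHA path" for the largest blobs reachable from any ref,
# largest first. Sizes are uncompressed object sizes in bytes.
list_large_blobs() {
    local repo="$1" count="${2:-10}"
    git -C "$repo" rev-list --objects --all |
        git -C "$repo" cat-file --batch-check='%(objecttype) %(objectsize) %(objectname) %(rest)' |
        awk '$1 == "blob"' |      # keep blobs only, drop commits and trees
        cut -d ' ' -f 2- |        # drop the object type column
        sort -k 1,1 -n -r |       # sort by size, descending
        head -n "$count"
}
```

For example, {{list_large_blobs . 20}} run from the top of a clone would list its 20 largest blobs.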

Commands executed for removing them:

{code:Bash}
git filter-branch --tree-filter 'find . -name "California_Restaurants.csv" -delete' -- --all
# -f overwrites the refs/original backup left by the previous filter-branch run.
git filter-branch -f --tree-filter 'find . -name "DEPARTEMENT.*" -delete' -- --all
git filter-branch -f --tree-filter 'find . -name "ANC90Ply_4326*" -delete' -- --all
git filter-branch -f --tree-filter 'find . -name "*~" -delete' -- --all
git filter-branch -f --tree-filter 'rm -rf "sis-data"' -- --all

git update-ref -d refs/original/refs/heads/master
git reflog expire --expire=now --all
git gc --prune=now
{code}
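
One possible way to check that the files are really gone from the history is sketched below. It assumes a Git version that supports {{git -C}}; {{history_contains}} is a hypothetical helper name, not a Git command and not part of the procedure above. Because {{--all}} covers every ref, backups under {{refs/original}} still count until they are deleted.

```shell
#!/usr/bin/env bash
# history_contains <repository-dir> <pathspec>   (hypothetical helper)
# Succeeds if at least one commit reachable from any ref touched a path
# matching the pathspec; fails if no such commit exists.
history_contains() {
    local repo="$1" pathspec="$2"
    [ -n "$(git -C "$repo" log --all --format=%H -- "$pathspec" | head -n 1)" ]
}
```

For example, {{history_contains . '*California_Restaurants.csv'}} is expected to fail once the file has been removed from all refs, and {{git count-objects -vH}} reports the resulting repository size.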




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
