OK, this seems to have succeeded without any big problems! I've re-enabled the git mirrors and the Hudson builds. Feel free to commit to the new trees.
Here are some instructions for the migration:

=== SVN users ===

Next time you "svn up" in your "common" working directory you'll end up
seeing the combined tree - ie a mapreduce/, hdfs/, and common/
subdirectory. This is probably the easiest place from which to work, now.

The URLs for the combined SVN trees are:

trunk: https://svn.apache.org/repos/asf/hadoop/common/trunk/
branch-0.22: http://svn.apache.org/repos/asf/hadoop/common/branches/branch-0.22
branch-0.21: http://svn.apache.org/repos/asf/hadoop/common/branches/branch-0.21
yahoo-merge: http://svn.apache.org/repos/asf/hadoop/common/branches/yahoo-merge
  (this one has the yahoo-merge branches from common, hdfs, and mapred)
MR-279: http://svn.apache.org/repos/asf/hadoop/common/branches/MR-279
  (this one has the yahoo-merge common and hdfs, and the MR-279 mapred)

The same kind of thing happened for HDFS-1073 and branch-0.21-old.
Pre-project-split branches like branch-0.20 should have remained
untouched.

You can proceed to delete your checkouts of the individual mapred and
hdfs trees, since they exist within the combined trees above. If for some
reason you prefer to 'svn switch' an old MR or HDFS-specific checkout to
point to its new location, you can use the following incantation:

  svn sw $(svn info | grep URL | awk '{print $2}' | sed 's,\(hdfs\|mapreduce\|common\)/\(.*\),common/\2/\1,')

=== Git Users ===

The git mirrors of the above 7 branches should now have a set of 4
commits near the top that look like this:

Merge: 928d485 cd66945 77f628f
Author: Todd Lipcon <t...@apache.org>
Date:   Sun Jun 12 22:53:28 2011 +0000

    HADOOP-7106. Reorganize SVN layout to combine HDFS, Common, and MR in a single tree (project unsplit)

    git-svn-id: https://svn.apache.org/repos/asf/hadoop/common/trunk@1134994 13f79535-47bb-0310-9956-ffa450edef68

commit 77f628ff5925c25ba2ee4ce14590789eb2e7b85b
Author: Todd Lipcon <t...@apache.org>
Date:   Sun Jun 12 22:53:27 2011 +0000

    Relocate mapreduce into mapreduce/

commit cd66945f62635f589ff93468e94c0039684a8b6d
Author: Todd Lipcon <t...@apache.org>
Date:   Sun Jun 12 22:53:26 2011 +0000

    Relocate hdfs into hdfs/

commit 928d485e2743115fe37f9d123ce9a635c5afb91a
Author: Todd Lipcon <t...@apache.org>
Date:   Sun Jun 12 22:53:25 2011 +0000

    Relocate common into common/

The first of these 4 is a 3-parent "octopus" merge commit of the
pre-project-unsplit branches. In theory, git is smart enough to track
changes through this merge, so long as you pass the right flags (eg
--follow). For example:

todd@todd-w510:~/git/hadoop-common$ git log --pretty=oneline --abbrev-commit --follow mapreduce/src/java/org/apache/hadoop/mapred/JobTracker.java | head -10
77f628f Relocate mapreduce into mapreduce/
90df0cb MAPREDUCE-2455. Remove deprecated JobTracker.State in favour of JobTrackerStatus.
ca2aba0 MAPREDUCE-2490. Add logging to graylist and blacklist activity to aid diagnosis of related issues. Contributed by Jonathan Eagles
32aaa2a MAPREDUCE-2515. MapReduce code references some deprecated options. Contributed by Ari Rabkin.

If you want to be able to have git follow renames all the way through the
project split back to the beginning of time, put the following in
hadoop-common/.git/info/grafts:

5128a9a453d64bfe1ed978cf9ffed27985eeef36 6c16dc8cf2b28818c852e95302920a278d07ad0c
6a3ac690e493c7da45bbf2ae2054768c427fd0e1 6c16dc8cf2b28818c852e95302920a278d07ad0c
546d96754ffee3142bcbbf4563c624c053d0ed0d 6c16dc8cf2b28818c852e95302920a278d07ad0c

In terms of rebasing git branches, git is actually pretty smart. For
example, I have a local "HDFS-1073" branch in my hdfs repo.
To transition it to the new combined repo, I did the following:

  # Add my project-split hdfs git repo as a remote:
  git remote add splithdfs /home/todd/git/hadoop-hdfs/
  git fetch splithdfs
  # Checkout a branch in my combined repo
  git checkout -b HDFS-1073 splithdfs/HDFS-1073
  # Rebase it on the combined 1073 branch
  git rebase origin/HDFS-1073

...and it actually applies my patches inside the appropriate subdirectory
(I was surprised and impressed by this!)

If the branch you're rebasing has added or moved files, it might not be
smart enough and you'll have to manually rename them in your branch
inside of the appropriate subtree.. but for simple patches this seems to
work. For less simple things, the best bet may be to use
"git filter-branch" on the patch series to relocate it inside a
subdirectory, and then try to rebase. Let me know if you need a hand with
any git cleanup, happy to help.

== Outstanding issues ==

The one outstanding issue I'm aware of is that the test-patch builds
should be smart enough to be able to deal with patches that are relative
to the combined root instead of the original project. Right now, if you
export a diff from git, it will include "hdfs/" or "mapreduce/" in the
changed file names, and the QA bot won't know how to apply it.

The workaround for this is to change directory into the relative
subproject dir, and then pass "--relative" to "git diff" or "git show",
for example:

todd@todd-w510:~/git/hadoop-common/mapreduce$ git diff --relative --no-prefix
diff --git CHANGES.txt CHANGES.txt
...

I imagine there are probably some other things that fell through the
cracks. Please get in touch if there's anything that seems amiss.

-Todd

On Sun, Jun 12, 2011 at 2:50 PM, Todd Lipcon <t...@cloudera.com> wrote:

> All of the nits I ran into should be resolved and we should be good to go.
> I will start this in just about 10 minutes (3pm PST).
>
> ***Please hold all commits until further notice!*** I anticipate that this
> should take under an hour, but if there are any bumps along the way it might
> stretch into the evening. I'll send out an "all clear" email when things are
> ready to go on the new layout.
>
> I've disabled all of the Hudson builds for now and will be re-enabling them
> one by one after reconfiguring their SVN URLs.
>
> -Todd
>
> On Sat, Jun 11, 2011 at 8:25 PM, Todd Lipcon <t...@cloudera.com> wrote:
>
>> Hi all,
>>
>> I'm figuring out one more small nit I noticed in my testing this evening.
>> Hopefully I will figure out what's going wrong and be ready to press the big
>> button tomorrow.
>>
>> Assuming I don't have to "abort mission", my hope is to do this at around
>> 3PM PST tomorrow (Sunday). I'll send out a message asking folks to please
>> hold commits to all branches while the move is in progress.
>>
>> Thanks
>> -Todd
>>
>> On Fri, Jun 10, 2011 at 11:20 AM, Todd Lipcon <t...@cloudera.com> wrote:
>>
>>> Hi all,
>>>
>>> Pending any unforeseen issues, I am planning on committing HADOOP-7106
>>> this weekend. I have the credentials from Jukka to take care of the git
>>> trees as well, and have done a "practice" move several times on a local
>>> mirror of the svn.
>>>
>>> I'll send out an announcement of the exact time in advance of when I
>>> actually do the commit.
>>>
>>> Thanks
>>> -Todd
>>> --
>>> Todd Lipcon
>>> Software Engineer, Cloudera
>>
>> --
>> Todd Lipcon
>> Software Engineer, Cloudera
>
> --
> Todd Lipcon
> Software Engineer, Cloudera

--
Todd Lipcon
Software Engineer, Cloudera