GitHub user markhamstra commented on the pull request:

https://github.com/apache/spark/pull/288#issuecomment-39485564

Rebasing is no big deal if the commits that you are rebasing are effectively private. For example, even though https://github.com/markhamstra/spark is technically a public repo, and people could have forked it and be depending on the history in my repo, as a practical matter nobody is going to use it that way; they will rely instead on the canonical upstream repo, https://github.com/apache/spark.

That means that if I am working on a PR in a feature branch Foo, the easiest and cleanest way to keep my Foo work mergeable with Spark's master branch is to only 'pull --rebase' the upstream master rather than do a merging pull. That avoids merge-commit clutter and the interleaving of my work with other commits to master. Instead, my PR commits appear last and are applied on top of the current state of master.

The SHAs of my commits will change in my repo (those of commits already in master will not), but nobody should be relying on my SHAs until they are in master anyway. Each rebase of an open PR will also cause Jenkins/Travis to re-run the tests, but that's no big deal unless you are rebasing really frequently and unnecessarily.
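To make the effect concrete, here is a minimal sketch that reproduces the workflow in a throwaway sandbox. Everything in it is an assumption made for illustration: the local repos named upstream and fork, the file names, the demo identity, and the helper function g are all hypothetical stand-ins, not part of Spark or of the comment above. It only shows that 'pull --rebase' puts the Foo commit on top of the new master, changes its SHA, and creates no merge commit.

```shell
#!/bin/sh
set -eu

# Throwaway sandbox: local stand-ins for the repos, no network access needed.
work=$(mktemp -d)
cd "$work"

# Hypothetical helper: run git with a fixed identity so commits work
# without relying on any global configuration.
g() { git -c user.name=demo -c user.email=demo@example.com "$@"; }

# "upstream" plays the role of the canonical repo with a master branch.
g init -q upstream
g -C upstream symbolic-ref HEAD refs/heads/master
echo a > upstream/a.txt
g -C upstream add a.txt
g -C upstream commit -q -m "master: A"

# "fork" plays the role of a personal clone holding feature branch Foo.
g clone -q upstream fork
g -C fork checkout -q -b Foo
echo foo > fork/foo.txt
g -C fork add foo.txt
g -C fork commit -q -m "Foo: my work"
before=$(g -C fork rev-parse HEAD)

# Meanwhile, upstream master moves ahead with someone else's commit.
echo b > upstream/b.txt
g -C upstream add b.txt
g -C upstream commit -q -m "master: B"
upstream_tip=$(g -C upstream rev-parse master)

# Replay Foo on top of the new master instead of creating a merge commit.
# (In this sandbox the clone's remote is called "origin"; in a real setup
# the remote pointing at the canonical repo may have a different name.)
g -C fork pull -q --rebase origin master

after=$(g -C fork rev-parse HEAD)
parent=$(g -C fork rev-parse HEAD^)
merges=$(g -C fork rev-list --merges --count HEAD)

echo "Foo commit before rebase: $before"
echo "Foo commit after rebase:  $after"
echo "Parent of Foo commit:     $parent"
echo "Upstream master tip:      $upstream_tip"
echo "Merge commits in history: $merges"
```

After the run, the parent of the Foo commit is the upstream master tip, the Foo commit has a new SHA, and the history contains no merge commits. If the Foo branch had already been pushed somewhere, updating it after such a rebase would require a forced push, which is exactly why this is only comfortable for branches nobody else depends on.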