Hi all,

It sounds like we will need a redirect server that issues 301s from each
druid.io page to the corresponding druid.apache.org page. Charles and I
spoke offline and thought that something like Jon's original proposal is
the best way to go. I suggest we get started on this, as it's the last
major piece of infra to move to the ASF. The steps would be:

1) Set up a redirect server to perform 301 redirects to druid.apache.org
2) Post all druid.io content on druid.apache.org
3) Update druid.io DNS to point to the redirect server
4) Shut down GitHub Pages hosting for druid.io

Steps (2) and (3) should be done as close in time as possible so there is
no confusion as to which version of the pages is canonical.

For the redirect server, two viable options are an nginx server or an S3
webpage redirect (
https://docs.aws.amazon.com/AmazonS3/latest/dev/how-to-page-redirect.html).
Just like we did with the HTML-level redirect, I suggest we test this first
on a single page. We can do that by having the redirect server initially
host all druid.io content (so it's indistinguishable from the
GitHub-Pages-based site) except for a single page, which it redirects to
druid.apache.org with an HTTP 301.
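
To make the single-page test concrete, here is a rough sketch in Python of
the behavior I have in mind. It is only an illustration, not a proposal to
run Python in production. The choice of /community as the test page, the
names redirect_target, SinglePageRedirectHandler and serve, and port 8080
are all assumptions made for the example. It assumes the built druid.io
site is in the current directory.

```python
from http.server import HTTPServer, SimpleHTTPRequestHandler
from urllib.parse import urlsplit

# Target site from this thread.
NEW_ORIGIN = "https://druid.apache.org"
# Assumption: the community page is the single page under test.
REDIRECTED_PATHS = {"/community"}


def redirect_target(request_path):
    """Return the new URL for a redirected page, or None to serve it locally."""
    parts = urlsplit(request_path)
    # Treat /community and /community/ as the same page.
    page = parts.path.rstrip("/") or "/"
    if page not in REDIRECTED_PATHS:
        return None
    # Keep the requested path and query string on the new origin.
    target = NEW_ORIGIN + parts.path
    if parts.query:
        target += "?" + parts.query
    return target


class SinglePageRedirectHandler(SimpleHTTPRequestHandler):
    """Serve static files from the current directory, except redirected pages."""

    def _redirect_if_needed(self):
        target = redirect_target(self.path)
        if target is None:
            return False
        self.send_response(301)  # permanent redirect
        self.send_header("Location", target)
        self.send_header("Content-Length", "0")
        self.end_headers()
        return True

    def do_GET(self):
        if not self._redirect_if_needed():
            super().do_GET()

    def do_HEAD(self):
        if not self._redirect_if_needed():
            super().do_HEAD()


def serve(port=8080):
    """Run the test server; port 8080 is an arbitrary choice for local testing."""
    HTTPServer(("", port), SinglePageRedirectHandler).serve_forever()
```

Calling serve() and requesting /community should return a 301 with a
Location header pointing at druid.apache.org, while every other path is
served from the local copy as before. Whichever of nginx or S3 we pick
would need an equivalent rule, and growing the redirected set to cover the
whole site would be the final cutover.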

I'm planning to start looking into this, so anyone around please speak up
if you have any advice or alternative approaches to suggest.

On Mon, Apr 29, 2019 at 4:01 PM Jonathan Wei <jon...@apache.org> wrote:

> Thanks for checking the SEO state, that's somewhat disappointing.
>
> For Bing, it sounds like they really want you to use 301s (
> https://www.bing.com/webmaster/help/webmaster-guidelines-30fba23a):
>
> > Bing prefers you use a 301 permanent redirect when moving content, should
> > the move be permanent.  If the move is temporary, then a 302 temporary
> > redirect will work fine.  Do not use the rel=canonical tag in place of a
> > proper redirect.
>
> I wasn't able to find similar guidance re: this issue for DuckDuckGo.
>
> On Mon, Apr 29, 2019 at 10:42 AM Gian Merlino <g...@apache.org> wrote:
>
> > Another update: SEO is not looking great after another day passed. For a
> > search for "druid community", both http://druid.io/community and
> > https://druid.apache.org/community/ have dropped off the front page of
> > Bing completely. On Google, the legacy version is gone (as expected) but
> > the Apache version has dropped to the #3 spot (down from #2 yesterday, and
> > down from where the legacy page was pre-migration, which was #1).
> >
> > I think this means we do need to try to get 301s figured out.
> >
> > On Sun, Apr 28, 2019 at 3:06 PM Gian Merlino <g...@apache.org> wrote:
> >
> > > Google has picked up the new URL as of today but Bing hasn't. Neither
> > > has DuckDuckGo for that matter.
> > >
> > > Currently, Google is showing https://druid.apache.org/community/ in
> > > the #2 spot and Bing/DDG are showing http://druid.io/community in the
> > > top spot. Ominously, the latter two _have_ picked up a page title
> > > change to "Redirecting..."
> > >
> > > On Fri, Apr 26, 2019 at 11:00 AM Gian Merlino <g...@apache.org> wrote:
> > >
> > >> An update: this has been done for a couple of days now, but Google and
> > >> Bing are still showing http://druid.io/community for a search for
> > >> "druid community" or even "apache druid community":
> > >>
> > >> - https://www.google.com/search?q=druid+community
> > >> - https://www.bing.com/search?q=druid+community
> > >>
> > >> I suggest we keep an eye on the search engines and make sure they can
> > >> figure out that the site has changed (I'm not sure how often they
> > >> crawl). If they can, then it would make sense to me to move forward
> > >> with migrating the entire web site.
> > >>
> > >> On Mon, Apr 22, 2019 at 7:49 PM Jonathan Wei <jon...@apache.org> wrote:
> > >>
> > >>> Correction: Xavier was suggesting we use
> > >>> https://github.com/druid-io/druid-io.github.io/blob/src/_layouts/redirect_page.html
> > >>> , the existing redirect system used by the Druid website.
> > >>>
> > >>> I've opened PRs to do the community page migration test:
> > >>> https://github.com/apache/incubator-druid-website/pull/3
> > >>> https://github.com/druid-io/druid-io.github.io/pull/591
> > >>>
> > >>> On Mon, Apr 22, 2019 at 3:04 PM Gian Merlino <g...@apache.org> wrote:
> > >>>
> > >>> > That sounds good to me. I would also consider adding canonical tags
> > >>> > to all druid.apache.org pages so we don't have
> > >>> > druid.incubator.apache.org and druid.apache.org both floating around
> > >>> > (not to mention http/https versions of both).
> > >>> >
> > >>> > On Mon, Apr 22, 2019 at 2:59 PM Jonathan Wei <jon...@apache.org> wrote:
> > >>> >
> > >>> > > For redirects, Xavier has suggested using
> > >>> > > https://help.github.com/en/articles/redirects-on-github-pages to
> > >>> > > redirect to druid.apache.org as a way to transition before the
> > >>> > > domain migration occurs, and believes that it would have the same
> > >>> > > SEO effects as a 301 redirect after the new pages are indexed.
> > >>> > >
> > >>> > > I think we could try migrating the current Community page to
> > >>> > > druid.apache.org with GitHub redirects and canonical links
> > >>> > > pointing to the https://druid.apache.org version. If that goes
> > >>> > > well, we could continue migrating more pages.
> > >>> > >
> > >>> > > What are the community's thoughts on that?
> > >>> > >
> > >>> > > Thanks,
> > >>> > > Jon
> > >>> > >
> > >>> > > On Tue, Mar 12, 2019 at 7:19 PM Gian Merlino <g...@apache.org> wrote:
> > >>> > >
> > >>> > > > OpenOffice and Groovy both chose to sort of "meld" their classic
> > >>> > > > and Apache sites together: https://www.openoffice.org/,
> > >>> > > > http://groovy-lang.org/. Note how when you click around, you get
> > >>> > > > shuttled between the classic domain and the Apache domain. Some
> > >>> > > > pages are available on both sites, like
> > >>> > > > http://groovy-lang.org/download.html and
> > >>> > > > https://groovy.apache.org/download.html (which don't use
> > >>> > > > canonical link tags -- does not seem like a good example to
> > >>> > > > follow!).
> > >>> > > >
> > >>> > > > NetBeans (still incubating) also has a "melded" site at
> > >>> > > > https://netbeans.org/ but doesn't seem to consider itself done
> > >>> > > > yet. They are discussing plans on their lists & wiki to do
> > >>> > > > redirects from netbeans.org to netbeans.apache.org:
> > >>> > > >
> > >>> > > > https://cwiki.apache.org/confluence/display/NETBEANS/netbeans.org+Transition+Process
> > >>> > > > https://lists.apache.org/thread.html/ad10fb9d4c8fee571a2f6232b268a3b835f7b823d3a0983b84aeb18a@%3Cdev.netbeans.apache.org%3E
> > >>> > > >
> > >>> > > > As of today the domain has been donated to ASF, but the server is
> > >>> > > > still run by Oracle, so the plan doesn't seem to be finished yet.
> > >>> > > > (WHOIS for netbeans.org shows ASF as the registrant; netbeans.org
> > >>> > > > resolves to lb-netbeans-cms-adc.oracle.com.)
> > >>> > > >
> > >>> > > > The melded sites don't really seem better to me than redirecting
> > >>> > > > all URLs on the domain. I guess it depends on if we want to keep
> > >>> > > > druid.io as the official domain forever, or if we think
> > >>> > > > druid.apache.org is cooler. I definitely think druid.apache.org
> > >>> > > > is cooler so my vote is there :). It's also nice that it
> > >>> > > > supports https. (druid.io does not today, since it's on GitHub
> > >>> > > > pages, which doesn't support https for custom domains.)
> > >>> > > >
> > >>> > > > On Tue, Mar 12, 2019 at 7:47 PM Charles Allen
> > >>> > > > <charles.al...@snap.com.invalid> wrote:
> > >>> > > >
> > >>> > > > > Are there other projects that have transitioned an independently
> > >>> > > > > successful domain name to an Apache one?
> > >>> > > > >
> > >>> > > > > On Tue, Mar 5, 2019 at 2:13 PM David Lim <david...@apache.org> wrote:
> > >>> > > > >
> > >>> > > > > > Who has control over the druid.io domain? Charles, would that
> > >>> > > > > > be you?
> > >>> > > > > >
> > >>> > > > > > We'd need support from them for the DNS redirect.
> > >>> > > > > >
> > >>> > > > > > On Tue, Mar 5, 2019 at 2:04 PM Jonathan Wei <jon...@apache.org> wrote:
> > >>> > > > > >
> > >>> > > > > > > We still need to complete the website migration to Apache
> > >>> > > > > > > infrastructure.
> > >>> > > > > > >
> > >>> > > > > > > I'll propose the following plan:
> > >>> > > > > > >
> > >>> > > > > > > Proposed Apache Druid website migration plan
> > >>> > > > > > > ========================================
> > >>> > > > > > >
> > >>> > > > > > > These links have some previous discussion on the website
> > >>> > > > > > > migration:
> > >>> > > > > > >
> > >>> > > > > > > https://lists.apache.org/thread.html/7cae100b684e0b33e0adda993efea3d6088978700988a0ae632fdd80@%3Cdev.druid.apache.org%3E
> > >>> > > > > > > https://issues.apache.org/jira/browse/INFRA-17340
> > >>> > > > > > >
> > >>> > > > > > > From the discussions above, the recommendation is to have 2
> > >>> > > > > > > separate repos for the website: one for source and another
> > >>> > > > > > > for built content that will be served.
> > >>> > > > > > >
> > >>> > > > > > > Generating site files
> > >>> > > > > > > =======================
> > >>> > > > > > >
> > >>> > > > > > > The Apache site update process will be similar to our
> > >>> > > > > > > current process.
> > >>> > > > > > >
> > >>> > > > > > > Current process:
> > >>> > > > > > > 1. Push changes to
> > >>> > > > > > >    https://github.com/druid-io/druid-io.github.io/tree/src
> > >>> > > > > > > 2. metamx bot picks up changes, builds, and commits to
> > >>> > > > > > >    https://github.com/druid-io/druid-io.github.io/tree/master
> > >>> > > > > > > 3. https://github.com/druid-io/druid-io.github.io/tree/master
> > >>> > > > > > >    is served by GitHub Pages
> > >>> > > > > > >
> > >>> > > > > > > Apache process:
> > >>> > > > > > > 1. Push changes to
> > >>> > > > > > >    https://github.com/apache/incubator-druid-website-src
> > >>> > > > > > > 2. Jenkins bot from Apache will build the website from the
> > >>> > > > > > >    source repo and commit to
> > >>> > > > > > >    https://github.com/apache/incubator-druid-website
> > >>> > > > > > > 3. Apache Druid website will be served from the content in
> > >>> > > > > > >    https://github.com/apache/incubator-druid-website
> > >>> > > > > > >    (asf-site branch)
> > >>> > > > > > >
> > >>> > > > > > >
> > >>> > > > > > > Hosting and SEO
> > >>> > > > > > > ================
> > >>> > > > > > >
> > >>> > > > > > > The Apache site will be hosted at druid.apache.org on Apache
> > >>> > > > > > > infrastructure:
> > >>> > > > > > > http://www.apache.org/dev/project-site.html
> > >>> > > > > > >
> > >>> > > > > > > To preserve our search rankings, we can set up 301 redirects
> > >>> > > > > > > from the old druid.io site to the corresponding pages on the
> > >>> > > > > > > druid.apache.org site.
> > >>> > > > > > > (https://moz.com/learn/seo/redirection)
> > >>> > > > > > >
> > >>> > > > > > > However, GitHub Pages (which currently hosts the druid.io
> > >>> > > > > > > site) does not support 301 redirects, so we propose the
> > >>> > > > > > > following:
> > >>> > > > > > > - Set up a new Nginx server that will perform 301 redirects
> > >>> > > > > > >   from druid.io to druid.apache.org. Imply can host this if
> > >>> > > > > > >   needed.
> > >>> > > > > > > - Update the druid.io DNS entry to point to this new Nginx
> > >>> > > > > > >   server
> > >>> > > > > > > - Shut down GitHub Pages hosting for druid.io
> > >>> > > > > > >
> > >>> > > > > > > In addition, we can also set canonical tags on our pages:
> > >>> > > > > > > https://moz.com/learn/seo/canonicalization
> > >>> > > > > > >
> > >>> > > > > > >
> > >>> > > > > > > Action items
> > >>> > > > > > > ===============
> > >>> > > > > > > - Set up a Jenkins bot that builds the Apache website
> > >>> > > > > > >   content from source
> > >>> > > > > > > - Get the Apache website up
> > >>> > > > > > > - Set up Nginx redirect server for druid.io
> > >>> > > > > > > - Shut down GitHub Pages and redirect DNS for druid.io to
> > >>> > > > > > >   the Nginx redirect server
> > >>> > > > > > > - Add canonical tags to pages
