TLDR:
- PAWS can now connect to the new replicas. For more info, see
  News/Wiki Replicas 2020 Redesign#How should I connect to databases in PAWS?
  <https://wikitech.wikimedia.org/wiki/News/Wiki_Replicas_2020_Redesign#How_should_I_connect_to_databases_in_PAWS%3F>
- Report issues here: T276284 Establish a working setup for PAWS with
  multi-instance wikireplicas <https://phabricator.wikimedia.org/T276284>

Hi!

I'm forwarding this message from the cloud lists, in case you use PAWS and
didn't see the message.

PAWS is now capable of connecting to and using the new replicas. For
background on the new replicas, please see News/Wiki_Replicas_2020_Redesign
<https://wikitech.wikimedia.org/wiki/News/Wiki_Replicas_2020_Redesign>.

Here are some resources you can check:

- News/Wiki Replicas 2020 Redesign#How should I connect to databases in
PAWS?
<https://wikitech.wikimedia.org/wiki/News/Wiki_Replicas_2020_Redesign#How_should_I_connect_to_databases_in_PAWS%3F>
- Accessing the new replicas, changes from the previous cluster
<https://public.paws.wmcloud.org/User:JHernandez_(WMF)/Accessing%20the%20new%20replicas,%20changes%20from%20the%20previous%20cluster.ipynb>
- Using Wikireplicas from PAWS with Python
<https://public.paws.wmcloud.org/User:JHernandez_(WMF)/Accessing%20Wikireplicas%20from%20PAWS.ipynb>

In summary, due to issues with mysql-proxy and the new architecture,
connecting to the replicas will be more in line with the Toolforge approach.

There is a credentials file at $HOME/.my.cnf that you can use when
connecting, instead of the environment variables. For the host name, you
can use the same ones you would use when connecting from Toolforge
("{wiki}.{analytics,web}.db.svc.wikimedia.cloud").

To update a notebook, here is an example of the couple of changes needed
when connecting:

- import os
  import pymysql

  conn = pymysql.connect(
-     host = os.environ['MYSQL_HOST'],
+     host = "eswiki.analytics.db.svc.wikimedia.cloud",

-     user = os.environ['MYSQL_USERNAME'],
-     password = os.environ['MYSQL_PASSWORD'],
+     read_default_file = ".my.cnf",
      database = "eswiki_p"
  )

Note that you have to connect to the host that corresponds to the database
you are going to query.
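
As a rough illustration of the note above, here is a minimal sketch that
derives the host and database name from a single wiki name, so the two
cannot drift apart. The function names replica_connection_args and
connect_to_replica are hypothetical and not part of PAWS or pymysql. The
"{wiki}_p" database name is an assumption generalized from the eswiki
example above. The credentials path is expanded from the home directory so
that it should also resolve when a notebook lives in a subdirectory.

```python
import os


def replica_connection_args(wiki, cluster="analytics"):
    """Build keyword arguments for pymysql.connect() for one wiki.

    Hypothetical helper: the host follows the
    {wiki}.{analytics,web}.db.svc.wikimedia.cloud pattern, and the
    database name is assumed to be "<wiki>_p" (as in eswiki -> eswiki_p).
    """
    if cluster not in ("analytics", "web"):
        raise ValueError("cluster must be 'analytics' or 'web'")
    return {
        "host": f"{wiki}.{cluster}.db.svc.wikimedia.cloud",
        "database": f"{wiki}_p",
        # Credentials come from the file in the home directory instead of
        # the old MYSQL_USERNAME / MYSQL_PASSWORD environment variables.
        "read_default_file": os.path.expanduser("~/.my.cnf"),
    }


def connect_to_replica(wiki, cluster="analytics"):
    """Open a connection to the replica host that serves the given wiki."""
    # Imported here so the helper above can be used without the driver.
    import pymysql

    return pymysql.connect(**replica_connection_args(wiki, cluster))
```

In a notebook, connect_to_replica("eswiki") would then replace the
pymysql.connect(...) call shown in the diff above. To query a different
wiki, open a separate connection for it.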

Existing notebooks remain readable with the output cached, and we are
working on updating the documentation.

In two weeks (April 15), the old cluster will be migrated to use the new
replication hosts, at which point replication may stop and running PAWS
notebooks that connect to the old cluster may get stale results.

In about four weeks (April 28), the old hostnames will be redirected to the
new cluster. Notebooks that connect to MYSQL_HOST will then stop working and
will need their credentials and DB host name updated.
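
To find which notebooks still need updating, something like the following
sketch could help. It is not an official PAWS tool. The function name
notebooks_using_old_variables is hypothetical, and it assumes notebooks are
standard .ipynb JSON files with a top-level "cells" list. It only reports
notebooks whose code cells mention the old environment variables, and
changes nothing.

```python
import json
from pathlib import Path

# Environment variables used by the old connection method.
OLD_VARIABLES = ("MYSQL_HOST", "MYSQL_USERNAME", "MYSQL_PASSWORD")


def notebooks_using_old_variables(root):
    """Return sorted paths of notebooks under root that mention OLD_VARIABLES."""
    matches = []
    for path in Path(root).rglob("*.ipynb"):
        try:
            notebook = json.loads(path.read_text(encoding="utf-8"))
        except (OSError, ValueError):
            continue  # skip unreadable or malformed files
        if not isinstance(notebook, dict):
            continue
        for cell in notebook.get("cells", []):
            if cell.get("cell_type") != "code":
                continue
            source = cell.get("source", "")
            # The notebook format stores source as a string or a list of lines.
            if isinstance(source, list):
                source = "".join(source)
            if any(name in source for name in OLD_VARIABLES):
                matches.append(path)
                break  # one hit is enough to flag this notebook
    return sorted(matches)
```

Calling notebooks_using_old_variables(Path.home()) from a notebook would
list candidates to update. Checkpoint copies under .ipynb_checkpoints may
show up in the result as well.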

If you find any issues or need help, please reach out via IRC on
#wikimedia-cloud, the mailing list (cl...@lists.wikimedia.org), or the
Phabricator task T276284 Establish a working setup for PAWS with
multi-instance wikireplicas <https://phabricator.wikimedia.org/T276284>.

Feel free to forward this as needed to spread the word to PAWS users, thank
you!
-- 
Joaquin Oltra Hernandez
Developer Advocate - Wikimedia Foundation
_______________________________________________
Analytics mailing list
Analytics@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/analytics
