TLDR:
- PAWS can now connect to the new replicas. For more info, see
News/Wiki Replicas 2020 Redesign#How should I connect to databases in PAWS?
<https://wikitech.wikimedia.org/wiki/News/Wiki_Replicas_2020_Redesign#How_should_I_connect_to_databases_in_PAWS%3F>
- Report issues here: T276284 Establish a working setup for PAWS with
multi-instance wikireplicas <https://phabricator.wikimedia.org/T276284>

Hi!

PAWS can now connect to and use the new replicas.

Here are some resources you can check:

- News/Wiki Replicas 2020 Redesign#How should I connect to databases in
PAWS?
<https://wikitech.wikimedia.org/wiki/News/Wiki_Replicas_2020_Redesign#How_should_I_connect_to_databases_in_PAWS%3F>
- Accessing the new replicas, changes from the previous cluster
<https://public.paws.wmcloud.org/User:JHernandez_(WMF)/Accessing%20the%20new%20replicas,%20changes%20from%20the%20previous%20cluster.ipynb>
- Using Wikireplicas from PAWS with Python
<https://public.paws.wmcloud.org/User:JHernandez_(WMF)/Accessing%20Wikireplicas%20from%20PAWS.ipynb>

In summary, due to issues with mysql-proxy and the new architecture,
connecting to the replicas will be more in line with the Toolforge approach.

There is a credentials file in $HOME/.my.cnf that you can use when
connecting, instead of the environment variables. For the host name, you
can use the same ones you would use when connecting from Toolforge
("{wiki}.{analytics,web}.db.svc.wikimedia.cloud").
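
As a rough illustration of the host name pattern above, here is a small
Python sketch that builds the name from a wiki database name and a cluster.
The helper name replica_host is hypothetical; it is not something PAWS or
Toolforge provides.

```python
# Hypothetical helper, not part of PAWS or Toolforge: it only formats the
# host name pattern quoted above.
VALID_CLUSTERS = ("analytics", "web")


def replica_host(wiki: str, cluster: str = "analytics") -> str:
    """Return the Wiki Replicas host name for a wiki and a cluster."""
    if cluster not in VALID_CLUSTERS:
        raise ValueError(f"cluster must be one of {VALID_CLUSTERS}, got {cluster!r}")
    return f"{wiki}.{cluster}.db.svc.wikimedia.cloud"


print(replica_host("eswiki"))         # eswiki.analytics.db.svc.wikimedia.cloud
print(replica_host("enwiki", "web"))  # enwiki.web.db.svc.wikimedia.cloud
```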

To update a notebook, you only need a couple of changes to the connection
code. Here is an example:

- import os
  import pymysql

  conn = pymysql.connect(
-     host = os.environ['MYSQL_HOST'],
+     host = "eswiki.analytics.db.svc.wikimedia.cloud",

-     user = os.environ['MYSQL_USERNAME'],
-     password = os.environ['MYSQL_PASSWORD'],
+     read_default_file = ".my.cnf",
      database = "eswiki_p"
  )

Note that you have to connect to the host name that matches the database
you are going to query.
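
Since each wiki has its own host name, a notebook that queries several wikis
needs one connection per wiki. The sketch below is one possible way to
arrange that; the function names are hypothetical, and the "{wiki}_p"
database name is assumed by generalizing from the eswiki_p example above.

```python
import os


def replica_connect_args(wiki: str, cluster: str = "analytics") -> dict:
    """Build keyword arguments for pymysql.connect for a single wiki.

    Assumption: the database is named "<wiki>_p", as with eswiki_p above.
    """
    return {
        "host": f"{wiki}.{cluster}.db.svc.wikimedia.cloud",
        "database": f"{wiki}_p",
        # Credentials are read from the file in the home directory.
        "read_default_file": os.path.expanduser("~/.my.cnf"),
    }


def connect(wiki: str, cluster: str = "analytics"):
    """Open a connection to the host that serves the given wiki."""
    # Imported here so that building the arguments does not require pymysql.
    import pymysql

    return pymysql.connect(**replica_connect_args(wiki, cluster))
```

For example, to query both eswiki and enwiki you would call
connect("eswiki") and connect("enwiki") and keep the two connections
separate, rather than switching databases on a single connection.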

Existing notebooks remain readable with the output cached, and we are
working on updating the documentation.

In two weeks (April 15) the old cluster will be migrated to use the new
replication hosts, at which point replication may stop and running PAWS
notebooks that connect to the old cluster may get stale results.

In about four weeks (April 28) the old hostnames will be redirected to the
new cluster. Notebooks that connect to MYSQL_HOST will stop working and
will need their credentials and DB host name updated.
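
If you have many notebooks, a quick scan can help find the ones that still
rely on the old environment variables. This is only a sketch: it assumes
notebooks stored as nbformat 4 JSON (a top-level "cells" list), and the
function names are made up for this example.

```python
import json
from pathlib import Path

# Environment variables used by the old connection setup.
OLD_NAMES = ("MYSQL_HOST", "MYSQL_USERNAME", "MYSQL_PASSWORD")


def cells_using_old_settings(notebook_path):
    """Return (cell_index, name) pairs for code cells mentioning an old variable."""
    notebook = json.loads(Path(notebook_path).read_text(encoding="utf-8"))
    hits = []
    for index, cell in enumerate(notebook.get("cells", [])):
        if cell.get("cell_type") != "code":
            continue
        source = cell.get("source", "")
        # nbformat stores the source either as a string or a list of lines.
        if isinstance(source, list):
            source = "".join(source)
        for name in OLD_NAMES:
            if name in source:
                hits.append((index, name))
    return hits


def scan(directory="."):
    """Print every notebook cell under directory that uses an old variable."""
    for path in sorted(Path(directory).rglob("*.ipynb")):
        for index, name in cells_using_old_settings(path):
            print(f"{path}: cell {index} uses {name}")
```

Calling scan() from your PAWS home directory lists the cells to update.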

If you find any issues or need help, please reach out via IRC, the
mailing list, or the Phabricator task T276284 Establish a working setup
for PAWS with multi-instance wikireplicas
<https://phabricator.wikimedia.org/T276284>.

-- 
Joaquin Oltra Hernandez
Developer Advocate - Wikimedia Foundation
_______________________________________________
Wikimedia Cloud Services announce mailing list
cloud-annou...@lists.wikimedia.org (formerly labs-annou...@lists.wikimedia.org)
https://lists.wikimedia.org/mailman/listinfo/cloud-announce
_______________________________________________
Wikimedia Cloud Services mailing list
Cloud@lists.wikimedia.org (formerly lab...@lists.wikimedia.org)
https://lists.wikimedia.org/mailman/listinfo/cloud