Re: [Wikitech-l] [Wikidata-tech] Normalization of change tag schema

2018-11-29 Thread Amir Sarabadani
Hey,
The last update on this thread.

The day has finally come: no part of MediaWiki refers to the ct_tag column
any more. This column will be dropped in the next couple of weeks. It has
held only empty values for several weeks already.

Best

On Thu, 11 Oct 2018 at 13:47, Amir Sarabadani 
wrote:

> Hello,
> One other update regarding this: we just switched reads to the new change
> tag backend (and stopped writing to the ct_tag column) on mediawiki.org,
> test wikis and several other small wikis. This means that if you depend on
> the ct_tag column, your tool/service will break soon.
>
> Also, since this changes queries of recentChanges, watchlist, User
> contributions, history action, a handful of API modules and some other
> special pages, let us know if anything in that regard doesn't look right to
> you. Dear developers, keep this in mind if something pops up in logs or
> tendril.
>
> We are planning to move forward on bigger wikis next week.
>
> Best
>
> On Tue, 31 Jul 2018 at 09:19, Jon Robson  wrote:
>
>> 👏
>>
>> On Tue, Jul 31, 2018, 3:42 AM Derk-Jan Hartman <
>> d.j.hartman+wmf...@gmail.com>
>> wrote:
>>
>> > That is an impressive difference !
>> >
>> > On Mon, Jul 30, 2018 at 6:22 PM Amir Sarabadani <
>> > amir.sarabad...@wikimedia.de> wrote:
>> >
>> > > And this is the load on vslow database nodes on s7:
>> > >
>> > >
>> >
>> https://grafana.wikimedia.org/dashboard/db/mysql?panelId=3&fullscreen&orgId=1&from=1532794373712&to=1532967173714&var-dc=eqiad%20prometheus%2Fops&var-server=db1090&var-port=13317
>> > >
>> > > You can see similar drops on other sections from exactly the moment it
>> > > got deployed:
>> > > s1:
>> > >
>> > >
>> >
>> https://grafana.wikimedia.org/dashboard/db/mysql?panelId=3&fullscreen&orgId=1&from=1531757700702&to=1532967300702&var-dc=eqiad%20prometheus%2Fops&var-server=db1106&var-port=9104
>> > > s2:
>> > >
>> > >
>> >
>> https://grafana.wikimedia.org/dashboard/db/mysql?panelId=3&fullscreen&orgId=1&from=1532794561870&to=1532967361872&var-dc=eqiad%20prometheus%2Fops&var-server=db1090&var-port=13312
>> > >
>> > > Best
>> > >
>> > > On Mon, 30 Jul 2018 at 13:13, Amir Sarabadani <
>> > > amir.sarabad...@wikimedia.de>
>> > > wrote:
>> > >
>> > > > Hey,
>> > > > Using the new table as the backend of Special:Tags (and similar APIs)
>> > > > is now enabled everywhere. Contact me if there are any issues with that.
>> > > >
>> > > > Best
>> > > >
>> > > > On Wed, 25 Jul 2018 at 19:17, Amir Sarabadani <
>> > > > amir.sarabad...@wikimedia.de> wrote:
>> > > >
>> > > >> Hello,
>> > > >> One update regarding this.
>> > > >> We enabled using the new table for Special:Tags on several large
>> > > >> wikis, which caused a massive improvement in the performance of the
>> > > >> page. For example, loading Special:Tags on Wikidata used to take
>> > > >> around a minute and now takes less than a second. English Wikipedia
>> > > >> is down from ten seconds to less than one, and so on.
>> > > >>
>> > > >> There is a lot of work that still needs to be done. Maintenance
>> > > >> scripts are being run to backpopulate the ct_tag_id column in the
>> > > >> change_tag table (if you want to follow the progress, see
>> > > >> https://phabricator.wikimedia.org/T193873), then we need to start
>> > > >> reading from the new table in MediaWiki, and finally we can drop the
>> > > >> ct_tag column entirely. If you want to help with review, writing code
>> > > >> or anything else, just let me know.
>> > > >>
>> > > >> Best
>> > > >>
>> > > >> On Wed, 27 Jun 2018 at 15:15, Léa Lacroix <lea.lacr...@wikimedia.de>
>> > > >> wrote:
>> > > >>
>> > > >>> Hello all,
>> > > >>>
>> > > >>> Our team is refactoring some code around the change tags on Recent
>> > > >>> Changes. This can impact people using the database on ToolForge.
>> > > >>>
>> > > >>> Currently, the tags are stored in the table change_tag in the
>> > > >>> column ct_tag.
>> > > >>>
>> > > >>> In the next few days, we will add a column ct_tag_id with a unique
>> > > >>> identifier for these tags. A new table, change_tag_def, will store
>> > > >>> the tag id, the message, and further information such as how many
>> > > >>> times the tag is used on the local wiki.
>> > > >>>
>> > > >>> In the long term, we plan to drop the column ct_tag, since the tag
>> > > >>> will be identified by ct_tag_id.
>> > > >>>
>> > > >>> This change will happen on:
>> > > >>> - French Wikipedia: Monday July 2nd
>> > > >>> - All other wikis: from July 9th
>> > > >>>
>> > > >>> If there is any problem (trouble with saving edits, slowdown of
>> > > >>> recent changes…), please create a subtask of T185355 or contact
>> > > >>> Ladsgroup.

[Wikitech-l] Deprecation of tag_summary table

2018-11-29 Thread Amir Sarabadani
Hello,
The tag_summary table was introduced in 2009 as a roll-up table for
change_tag. One of the reasons for it was that MySQL versions earlier than
4.1 (released on 15 February 2005) could not use the GROUP_CONCAT feature.

Around five years ago, developers started to replace usages of tag_summary
with change_tag, primarily because GROUP_CONCAT had become available and in
most cases it was faster (for example [1]). However, this was never
completed, which left us with discrepancies. For example,
Special:RecentChanges uses the change_tag table but its API counterpart uses
the tag_summary table. Maintaining two extremely large tables is technical
debt that has been biting us since its deployment. Also, with the
normalization of the change_tag table in place [2], change_tag is now more
performant than tag_summary.

So we are replacing usages of this table with change_tag over the next
couple of weeks, and then we will drop the whole table. If you're using it
on the cloud replicas, please switch to change_tag. If you have any concerns
or notes, feel free to chime in at https://phabricator.wikimedia.org/T209525
(Also, review of the patches would be much appreciated.)
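
For tool authors, here is a rough sketch of what building the per-revision
tag list from change_tag (instead of reading tag_summary) might look like.
It is only an illustration: it runs against an in-memory SQLite database
standing in for a replica, and the column names ct_rev_id, ctd_id and
ctd_name are assumptions that are not spelled out in this thread, so check
the actual schema before relying on them.

```python
import sqlite3


def build_demo_db():
    """Create a tiny in-memory stand-in for change_tag and change_tag_def.

    Column names other than ct_tag_id are assumed for illustration only.
    """
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE change_tag_def (ctd_id INTEGER PRIMARY KEY, ctd_name TEXT);
        CREATE TABLE change_tag (ct_rev_id INTEGER, ct_tag_id INTEGER);
        INSERT INTO change_tag_def VALUES (1, 'mobile edit'), (2, 'visualeditor');
        INSERT INTO change_tag VALUES (101, 1), (101, 2), (102, 1);
        """
    )
    return conn


def tags_per_revision(conn):
    """Return {revision id: sorted list of tag names}.

    GROUP_CONCAT rolls the tag names up per revision, which is the job the
    tag_summary roll-up table used to do.
    """
    rows = conn.execute(
        """
        SELECT ct_rev_id, GROUP_CONCAT(ctd_name, ',')
        FROM change_tag
        JOIN change_tag_def ON ctd_id = ct_tag_id
        GROUP BY ct_rev_id
        """
    )
    # Sort in Python, since the order inside GROUP_CONCAT is not guaranteed.
    return {rev: sorted(tags.split(",")) for rev, tags in rows}


if __name__ == "__main__":
    for rev, tags in sorted(tags_per_revision(build_demo_db()).items()):
        print(rev, tags)
```

The join through change_tag_def is needed because, after normalization,
change_tag only carries the numeric ct_tag_id rather than the tag name.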

Thank you and sorry for any inconvenience.

[1]: https://gerrit.wikimedia.org/r/c/mediawiki/core/+/95584
[2]: https://phabricator.wikimedia.org/T185355

Best
-- 
Amir Sarabadani
Software Engineer

Wikimedia Deutschland e. V. | Tempelhofer Ufer 23-24 | 10963 Berlin
Tel. (030) 219 158 26-0
http://wikimedia.de

Imagine a world in which every person can freely share in the sum of all
knowledge. Help us make it happen!
http://spenden.wikimedia.de/

Wikimedia Deutschland – Gesellschaft zur Förderung Freien Wissens e. V.
Registered in the register of associations of the Amtsgericht
Berlin-Charlottenburg under number 23855 B. Recognized as a charitable
organization by the Finanzamt für Körperschaften I Berlin, tax number
27/029/42207.
___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l

[Wikitech-l] TechCom Radar 2018-11-28

2018-11-29 Thread Kate Chapman
Hi All,

Here are the minutes from this week's TechCom meeting:

* Hosting IRC discussion: Proposal for partial opt-out method for
Content security policy on Wednesday December 5th 11pm PST (December 6th
07:00 UTC, 08:00 CET) in #wikimedia-office

* Reviewed: Introduce a new namespace for collaborative judgments
about wiki entities: determined that it needs further review from DBAs
to decide whether more discussion is needed, since updates to the
proposal add filtering, which changes the scope.

* On Last Call: RfC: Session storage service interface: last call ends
Wednesday December 12th 1pm PST (21:00 UTC, 22:00 CET)

* On Last Call: RFC: Modern Event Platform: Schema Registry: last call
ends Wednesday December 5th 10pm PST (December 6th 06:00 UTC, 07:00 CET)

* On Last Call: RFC: Modern Event Platform: Stream Intake Service: last
call ends Wednesday December 5th 10pm PST (December 6th 06:00 UTC, 07:00 CET)

You can also find our meeting minutes at


See also the TechCom RFC board.

If you prefer you can subscribe to our newsletter here


Thanks,
Kate
--
Kate Chapman
Senior Program Manager, Core Platform
Wikimedia Foundation
kchap...@wikimedia.org
