Addshore added a comment.

Okay, I'm getting very confused as to where these 'tags' have come from.

the NULL values in the title field where the title is the same as the normalized title (to cut back on duplicated data)

This is the kind of thing that makes SELECT not simple: you have to do an IF in the code based on whether the value is NULL or not. That is not normalized. Even if it takes less space, it is not good practice.

Actual selects never use title, they always use the normalized title. Deletes and will use title, but naturally there is a much lower flow of deletes.

And despite having more tables, it avoids duplication. Renaming a title or a tag changes a single row. And it saves a lot of space by storing references instead of full contents repeated many times.

If a title is renamed, the normalized title also has to be updated. If it is in 2 tables, 2 tables / 2 rows must be updated.

99.9% of the time the normalization step outputs the exact same title value

If that is the case, store only the ones that have been normalized, do not add NULL values.
And rename the table to something like cognate_normalized_titles.

This is exactly what I proposed in T148988#2759329, but it makes the selects very hard, requiring either the UNION query in that comment or the multi-join query in T148988#2762718, both of which you have said are bad.

As the selects always happen on the normalized titles, all normalized values should be stored (no NULLs in that field) to keep the selects easy. In cases where the normalized title and the original title match, we then do not need to store the original title. This is what is proposed in T148988#2763922.
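
To make that layout concrete, here is a rough sketch using Python's built-in sqlite3 module. The table name, column names, site names and the normalize() function are all made up for illustration; they are assumptions, not the actual schema or normalization from the task. It only shows the idea: the normalized title is always present, the original title is NULL when it would be a duplicate, and lookups never touch the nullable column.

```python
import sqlite3

# Hypothetical schema: names are illustrative, not taken from the task.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE titles (
        site             TEXT NOT NULL,
        normalized_title TEXT NOT NULL,  -- always stored, used by every lookup
        title            TEXT            -- NULL when equal to normalized_title
    )
    """
)


def normalize(title):
    # Stand-in normalization (an assumption): fold a typographic
    # apostrophe into the ASCII one.
    return title.replace("\u2019", "'")


def add_title(conn, site, title):
    normalized = normalize(title)
    # Only keep the original title when normalization changed it.
    stored = None if normalized == title else title
    conn.execute(
        "INSERT INTO titles (site, normalized_title, title) VALUES (?, ?, ?)",
        (site, normalized, stored),
    )


def sites_with_title(conn, title):
    # The select only looks at normalized_title, so NULLs never matter here.
    rows = conn.execute(
        "SELECT site FROM titles WHERE normalized_title = ? ORDER BY site",
        (normalize(title),),
    )
    return [row[0] for row in rows]


def original_title(conn, site, normalized):
    # The rarer paths that need the original title fall back to the
    # normalized one when nothing separate was stored.
    row = conn.execute(
        "SELECT COALESCE(title, normalized_title) FROM titles "
        "WHERE site = ? AND normalized_title = ?",
        (site, normalized),
    ).fetchone()
    return row[0] if row else None


add_title(conn, "enwiktionary", "don't")       # already normalized, title is NULL
add_title(conn, "frwiktionary", "don\u2019t")  # differs, original is kept
matches = sites_with_title(conn, "don\u2019t")
```

In this sketch the common lookup is a single-table SELECT with no UNION or join; only the less frequent code paths that need the original title have to deal with the NULL, here via COALESCE.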


TASK DETAIL
https://phabricator.wikimedia.org/T148988

To: Addshore
Cc: daniel, Tobi_WMDE_SW, hoo, Aklapper, jcrespo, Addshore, Marostegui, Minhnv-2809, D3r1ck01, Izno, Luke081515, Wikidata-bugs, aude, Darkdadaah, Mbch331, Jay8g, Krenair