Re: [Wikidata] [Wikidata-tech] wb_terms redesign

2019-05-04 Thread Maarten Dammers

Hi Alaa,

On 25-04-19 16:38, Alaa Sarhan wrote:
> > This is really a defective redesign. It reintroduces the numeric IDs
> > that were to be removed by T114902. See also T179928. We should
> > reconsider and introduce a new table to link unprefixed and prefixed
> > entity IDs.
>
> The new schema has been optimized as much as possible for scalability,
> as it will hold a massive amount of data that we hope will double or
> even triple in size as soon as possible.


The new schema has been optimized for your use cases and completely breaks
any tools combining page table data with Wikibase data. If you really
cared about tool developers, you wouldn't trash the unprefixed ID.


Maarten


___
Wikidata mailing list
Wikidata@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata


Re: [Wikidata] [Wikidata-tech] wb_terms redesign

2019-04-25 Thread Léa Lacroix
Hello all,
Since this email thread is shared across various mailing lists, and in
order not to create noise for too many people, I kindly ask you to continue
the discussion on Phabricator, where a task is dedicated to questions and
issues.
Thanks for your understanding :)
Léa

On Thu, 25 Apr 2019 at 18:04, Federico Leva (Nemo) wrote:

> Alaa Sarhan, 25/04/19 17:38:
> > Full migration is not possible unfortunately due to the current capacity
> > of database master node.
>
> Can you clarify whether it would also be too much load to write both to
> the new table and the old wb_terms table for a transition period
> (controlled by a configuration setting)?
>
> (I'm not advocating for it, just asking because we did something of the
> sort in the past for other transitions.)
>
> Federico


-- 
Léa Lacroix
Project Manager Community Communication for Wikidata

Wikimedia Deutschland e.V.
Tempelhofer Ufer 23-24
10963 Berlin
www.wikimedia.de

Wikimedia Deutschland - Gesellschaft zur Förderung Freien Wissens e. V.

Registered in the register of associations of the Amtsgericht
Berlin-Charlottenburg under number 23855 Nz. Recognized as a charitable
organization by the Finanzamt für Körperschaften I Berlin, tax number
27/029/42207.


Re: [Wikidata] [Wikidata-tech] wb_terms redesign

2019-04-25 Thread Federico Leva (Nemo)

Alaa Sarhan, 25/04/19 17:38:
> Full migration is not possible unfortunately due to the current capacity
> of database master node.


Can you clarify whether it would also be too much load to write both to 
the new table and the old wb_terms table for a transition period 
(controlled by a configuration setting)?


(I'm not advocating for it, just asking because we did something of the 
sort in the past for other transitions.)


Federico



Re: [Wikidata] [Wikidata-tech] wb_terms redesign

2019-04-25 Thread Alaa Sarhan
> This is really a defective redesign. It reintroduces the numeric IDs that
> were to be removed by T114902. See also T179928. We should reconsider and
> introduce a new table to link unprefixed and prefixed entity IDs.

The new schema has been optimized as much as possible for scalability, as
it will hold a massive amount of data that we hope will double or even
triple in size as soon as possible.
The IDs were changed to integers here because we have separate tables at
the top level, so prefixes in those tables are redundant and only take up
space, which adds up to a large amount very quickly.
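
As a rough illustration of the difference between prefixed and numeric IDs, the following minimal Python sketch converts between the two forms. The helper names (`split_entity_id`, `join_entity_id`) are hypothetical and not part of Wikibase, and the sketch assumes only the two common prefixes, Q for items and P for properties.

```python
import re
from typing import Tuple

# Assumption for illustration: only items (Q) and properties (P) are handled.
PREFIX_TO_TYPE = {"Q": "item", "P": "property"}
TYPE_TO_PREFIX = {entity_type: prefix for prefix, entity_type in PREFIX_TO_TYPE.items()}

# A prefixed ID is one prefix letter followed by a positive integer, e.g. "Q42".
_PREFIXED_ID = re.compile(r"([QP])([1-9][0-9]*)")


def split_entity_id(prefixed_id: str) -> Tuple[str, int]:
    """Split a prefixed ID such as 'Q42' into (entity type, numeric ID)."""
    match = _PREFIXED_ID.fullmatch(prefixed_id)
    if match is None:
        raise ValueError(f"not a recognised entity ID: {prefixed_id!r}")
    return PREFIX_TO_TYPE[match.group(1)], int(match.group(2))


def join_entity_id(entity_type: str, numeric_id: int) -> str:
    """Rebuild the prefixed ID from an entity type and a numeric ID."""
    return f"{TYPE_TO_PREFIX[entity_type]}{numeric_id}"
```

A tool that stores or joins on prefixed IDs could use a conversion along these lines when reading from a table that keeps only the integer part.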

The old `wb_terms`, as well as the new schema, are not actually designed
for public use unless really necessary for your needs. If your needs can be
addressed via the available Wikidata APIs, you are very much encouraged to
switch to using those instead. In that case, you need not worry about
migrations and schema changes ever.
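
For tools that only need labels, descriptions, or aliases, one API-based alternative is the `wbgetentities` module. The sketch below only builds the request URL and sends nothing; the helper name `build_terms_url` and its defaults are assumptions for illustration, not a recommended client.

```python
from urllib.parse import urlencode

API_ENDPOINT = "https://www.wikidata.org/w/api.php"


def build_terms_url(entity_ids, languages=("en",)):
    """Build a wbgetentities URL requesting terms for the given entity IDs."""
    params = {
        "action": "wbgetentities",
        "ids": "|".join(entity_ids),             # e.g. "Q42|P31"
        "props": "labels|descriptions|aliases",  # terms only, no claims
        "languages": "|".join(languages),
        "format": "json",
    }
    # urlencode percent-encodes the "|" separators in the values.
    return f"{API_ENDPOINT}?{urlencode(params)}"


if __name__ == "__main__":
    print(build_terms_url(["Q42", "P31"], languages=("en", "de")))
```

The resulting URL can be fetched with any HTTP client; the response is keyed by prefixed entity ID, so no knowledge of the underlying term tables is needed.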

> Also oppose any "partial migration for first XXX items" process in
> T221765: this makes queries much more complicated. Please first fill all
> data in the new schema, then discontinue the old table.

Unfortunately, a full migration is not possible due to the current capacity
of the database master node.
We tried to find a trade-off between the overhead we introduce in disk
usage and the complexity of the application logic that will access those
schemas.

This will also make our life at Wikidata a little less pleasant during the
migration period. We understand that this is an unusual migration and that
it won't be easy for a little while, which is why we want to help out with
those queries and other inquiries as much as we can.

If you are running queries on `wb_terms`, it would be of great help if you
added them to a new Phab task on the Tool Builders migration board, in the
Backlog column.
https://phabricator.wikimedia.org/project/view/4014/

If you have any concrete suggestions regarding making this migration
easier, we would also love to hear them. Please go ahead and add them on
the same board in their own Phabricator tasks so that we can keep track of
things more easily and follow up on them as soon as possible.

On Thu, 25 Apr 2019 at 15:25, data_querier wrote:

> This is really a defective redesign. It reintroduces the numeric IDs that
> were to be removed by T114902. See also T179928. We should reconsider and
> introduce a new table to link unprefixed and prefixed entity IDs.
>
> Also oppose any "partial migration for first XXX items" process in
> T221765: this makes queries much more complicated. Please first fill all
> data in the new schema, then discontinue the old table.
> ___
> Wikidata-tech mailing list
> wikidata-t...@lists.wikimedia.org
> https://lists.wikimedia.org/mailman/listinfo/wikidata-tech
>


-- 

Alaa Sarhan
Full Stack Developer

Wikimedia Deutschland e. V. | Tempelhofer Ufer 23-24 | 10963 Berlin
Phone: +49 (0)30 219 158 26-0
https://wikimedia.de
