I have a large table with ~350k records for which I'm in the process of
standardising data.  The fields I'm most interested in standardising are
albumartist, artist, performer and composer.  I'm dealing with one at a
time and the logic for each is the same.   I'll state in advance that I'm a
SQL and SQLite newbie.  All my code is being run through SQLiteStudio ...
there's no program that this code forms part of.

For the purposes of discussion the main table is called audio and I've
created an albumartists table using an export query using distinct
albumartist from audio.

tables:
audio (unique_id, albumartist)
albumartists (albumartist, newalbumartist)

I copied albumartist to newalbumartist for each record and then set about
the task of vetting each entry in newalbumartist, changing it where
necessary.

The intent is now to run an update query on audio matching the
corresponding albumartist entry in albumartists and then replacing
albumartist in audio with its corresponding newalbumartist value.  For
illustration:

albumartists:
albumartist  newalbumartist
Al di Meola   Al Di Meola


Every record in audio where albumartist matches "Al di Meola" should be
replaced with "Al Di Meola" from albumartists

I thought I'd construct the match as a select query first to see that it's
doing what's expected:

SELECT t.albumartist,

s.albumartist,

s.newalbumartist

FROM audio t

INNER JOIN

albumartists s ON t.albumartist = s.albumartist

WHERE t.albumartist IS NOT NULL

ORDER BY t.albumartist;


The query returns 273,378 rows.  Looking at some records I can see that the
query appears to return what I want and if I'm not mistaken turning that
into an update query using that match would replace all non null
albumartist entries in t with the newalbumartist equivalent in s

If I'm not mistaken the update query would look like this:

UPDATE audio

SET albumartist = (

SELECT newalbumartist

FROM albumartists

WHERE albumartist = audio.albumartist

)

WHERE EXISTS (

SELECT 1

FROM albumartists

WHERE albumartist = audio.albumartist AND

albumartist IS NOT NULL

);


Am I on the right track before I unleash this on my table?

Reply via email to