[SlimDevices: Ripping] New normalized database for classical music tagging

nwestbury Sun, 08 Jan 2006 16:45:40 -0800

I am interested in the tagging of classical music files and starting up
a new database for classical music tagging.  I find freedb to be quite
useless for reasons that are well known so I won't repeat them here.  I
have a Squeezebox and I can see that it would be a great machine if I
had an easy way of tagging my files consistently.


I very much believe that we need a new database.  The flat single-table
freedb schema is not adequate and is the cause of many problems.  I have
attached schema.txt (really a .sql file).  It is still not fully
normalized, because I compromised on normalization to reduce the number
of tables.  You will see, for example, that each person (composer,
conductor, performer) has a single record, so their name and details
appear only once.  If someone, say, adds a 'year of death' to a
composer, that information will automatically appear in the data
for every CD containing that composer's music.  Searching by composer
is currently unusable for me because CDs are tagged with the
composer's name in many different formats.  With a normalized database,
the composer's name is stored only once and so must be consistent.
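
To make the "stored only once" point concrete, here is a minimal sketch using
Python's built-in sqlite3 module.  The table and column names are invented for
illustration (only a 'Persons' table is mentioned in this post) and are not
taken from schema.txt.

```python
import sqlite3

# Hypothetical, cut-down schema: names are illustrative, not from schema.txt.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Persons (
        person_id     INTEGER PRIMARY KEY,
        first_name    TEXT NOT NULL,
        last_name     TEXT NOT NULL,
        year_of_death INTEGER
    );
    CREATE TABLE CdTracks (
        track_id    INTEGER PRIMARY KEY,
        cd_title    TEXT NOT NULL,
        track_title TEXT NOT NULL,
        composer_id INTEGER NOT NULL REFERENCES Persons(person_id)
    );
""")
conn.execute("INSERT INTO Persons VALUES (1, 'Ludwig van', 'Beethoven', NULL)")
conn.executemany(
    "INSERT INTO CdTracks (cd_title, track_title, composer_id) VALUES (?, ?, ?)",
    [("Symphonies 5 & 7", "Symphony No. 5", 1),
     ("Piano Sonatas", "Sonata No. 14", 1)],
)

# One edit to the single composer row...
conn.execute("UPDATE Persons SET year_of_death = 1827 WHERE person_id = 1")

# ...is picked up by every CD track that references that row.
rows = conn.execute("""
    SELECT t.cd_title, p.last_name, p.year_of_death
    FROM CdTracks AS t
    JOIN Persons AS p ON p.person_id = t.composer_id
    ORDER BY t.cd_title
""").fetchall()
```

Both CDs come back with the same spelling and the same year of death, because
neither value is copied into the per-CD data.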

I have already populated the 'Persons' table with almost 3000
composers, performers, and conductors.  I would have uploaded it, but it
was too big.

I am currently writing a program that scans the freedb database.  It
applies various pattern-matching techniques to extract the information
(composer, performers, piece, movement number and description, key
signature, etc.).  This is not easy because everyone uses a different
format.  However, I believe it should be possible to extract most of
the information, leaving only a little to be manually corrected.
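
As an illustration of the kind of pattern matching involved, here is a minimal
sketch that handles just one title layout.  The layout, the field names and
the function name are assumptions made up for this example; a real scanner
would need many such patterns plus a fallback for manual correction.

```python
import re

# Hypothetical pattern for ONE layout only, for example:
#   "Beethoven: Symphony No. 5 in C minor, Op. 67 - I. Allegro con brio"
TITLE_RE = re.compile(
    r"""^(?P<composer>[^:]+):\s*              # composer, up to the first colon
        (?P<piece>.+?)                        # piece name (non-greedy)
        (?:\s+in\s+(?P<key>[A-G](?:[-\s](?:flat|sharp))?\s(?:major|minor)))?
        (?:,\s*(?P<opus>Op\.\s*\d+))?         # optional opus number
        \s+-\s+(?P<movement_no>[IVX]+)\.\s*   # movement number (Roman numeral)
        (?P<movement>.+)$                     # movement description
    """,
    re.VERBOSE,
)

def parse_title(title):
    """Return a dict of extracted fields, or None if the layout is not recognised."""
    match = TITLE_RE.match(title.strip())
    return match.groupdict() if match else None

parsed = parse_title(
    "Beethoven: Symphony No. 5 in C minor, Op. 67 - I. Allegro con brio"
)
```

Titles that do not fit the pattern return None, which is where the "little to
be manually corrected" would come from.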

I am undecided whether we need to populate the table of compositions
first and get the freedb processor to try to match to entries in the
compositions table, or to get the freedb processor to populate the
compositions table itself as it finds new compositions.  (Note:
multiple performances of the same piece should all reference a single
row in the compositions table.)
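
The second option (letting the processor add compositions as it finds them)
could look roughly like the sketch below.  The column names and the very crude
title normalisation are assumptions for illustration only.  Matching against a
pre-populated table would be the same lookup without the insert branch.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE compositions (
        composition_id INTEGER PRIMARY KEY,
        composer_id    INTEGER NOT NULL,
        title_key      TEXT NOT NULL,   -- normalised title used for matching
        title          TEXT NOT NULL,   -- title as first seen
        UNIQUE (composer_id, title_key)
    )
""")

def title_key(title):
    """Crude normalisation: lower-case, keep only letters and digits."""
    return "".join(ch for ch in title.lower() if ch.isalnum())

def get_or_create_composition(composer_id, title):
    """Return the id of the matching row, inserting one if none exists yet."""
    key = title_key(title)
    row = conn.execute(
        "SELECT composition_id FROM compositions"
        " WHERE composer_id = ? AND title_key = ?",
        (composer_id, key),
    ).fetchone()
    if row:
        return row[0]
    cur = conn.execute(
        "INSERT INTO compositions (composer_id, title_key, title)"
        " VALUES (?, ?, ?)",
        (composer_id, key, title),
    )
    return cur.lastrowid

# Two performances, tagged slightly differently, resolve to one row.
first = get_or_create_composition(1, "Symphony No. 5")
second = get_or_create_composition(1, "symphony no.5")
```

A normalisation this crude would certainly merge or split some pieces wrongly,
so the matching rule itself would need much more thought.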

Another problem is that I believe the database needs to allow for
continuous updating.  Freedb does not, yet amazingly it lets users
submit alternative data if a different category is used!  We may need a
system for providing alternatives.  Say, for example, that a user
submits a change in the spelling of a composer's name.  Other users
will then be asked, when fetching data for a CD with pieces by that
composer, to choose which of the two spellings they want to use.  If,
say, 10 users select one form and no one selects the other, then the
other is deleted.
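
The voting rule just described can be sketched in a few lines.  The class
name and the fixed threshold of 10 are assumptions for illustration; the post
only offers 10 as an example figure.

```python
VOTES_TO_SETTLE = 10  # the "say, 10 users" figure; an assumed threshold

class Alternatives:
    """Competing values for one field, e.g. two spellings of a composer's name."""

    def __init__(self, *values):
        self.votes = {value: 0 for value in values}

    def choose(self, value):
        """Record one user's choice, then drop alternatives nobody has picked."""
        if value not in self.votes:
            raise KeyError(value)
        self.votes[value] += 1
        if self.votes[value] >= VOTES_TO_SETTLE:
            unpicked = [v for v, n in self.votes.items() if v != value and n == 0]
            for other in unpicked:
                del self.votes[other]

    def candidates(self):
        """Values a user would currently be asked to choose between."""
        return list(self.votes)

spelling = Alternatives("Dvorak", "Dvořák")
for _ in range(10):
    spelling.choose("Dvořák")
```

After the tenth vote only one candidate remains.  If even one user had picked
the other spelling, both would stay on offer under this rule.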

Please note that the database does not specify what data is to go into
a tag.  The user is free to select what data goes into, say, the TRACK
tag.  For example, I saw a discussion about whether composer names
should be in the form FirstName LastName or LastName, FirstName.  Such
a discussion would no longer be relevant as the user could configure
the tagger with '%F %L' or with '%L, %F'.  This does mean that we have
to write taggers to work with the database schema.  People could
periodically export the data to freedb format files for those users who
use software that can only read freedb format files.
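
A tagger's name formatting could be as small as the sketch below.  Only the
%F and %L placeholders come from the example above; the function name is
hypothetical.

```python
import re

def format_name(template, first, last):
    """Expand %F (first name) and %L (last name) in a user-supplied template."""
    fields = {"F": first, "L": last}
    return re.sub(r"%([FL])", lambda m: fields[m.group(1)], template)

print(format_name("%F %L", "Johann Sebastian", "Bach"))   # Johann Sebastian Bach
print(format_name("%L, %F", "Johann Sebastian", "Bach"))  # Bach, Johann Sebastian
```

The database stores the two name parts separately, and each user's template
decides how they are joined when the tag is written.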

I would like to hear from anyone who is interested in such a database
or in helping out in any way (writing taggers, tools for maintaining
the database, etc.).

Nigel


+-------------------------------------------------------------------+
|Filename: schema.txt                                               |
|Download: http://forums.slimdevices.com/attachment.php?attachmentid=649|
+-------------------------------------------------------------------+

-- 
nwestbury
------------------------------------------------------------------------
nwestbury's Profile: http://forums.slimdevices.com/member.php?userid=3284
View this thread: http://forums.slimdevices.com/showthread.php?t=19858

_______________________________________________
ripping mailing list
[email protected]
http://lists.slimdevices.com/lists/listinfo/ripping
