2011/10/30 Jim DeLaHunt <from.nab...@jdlh.com>

> Frederic Da Vitoria wrote:
> >
> > I believe that as Calvin pointed out, simply applying this idea to the
> > data would make the data partially unusable. OTOH, very few users would
> > be willing to enter the level of data duplication which would be needed,
> > so that the data quality in classical music would become very poor
> > because too many useful AR duplications would not be entered.
> >
>
> Let me highlight the words "would become very poor". That seems to imply
> that the data quality -- of Works entities with Parts relationships, and
> Relationships on those entities, that's the topic here -- is good now. I
> believe that the number of Works entities with both sub-parts and
> Relationships is quite small. This is because the editing tools for
> multi-part Works and Relationships are very poor. I entered just the child
> Works for one opera, and it was a very long and tedious process.
>
> Is there a way to measure how good the data is? For instance, do we have a
> count of how many Works have both multiple levels using the Parts
> Relationship, and have Relationships for things like composer? I looked
> for a way to test this using MB Search, and
> http://musicbrainz.org/doc/Text_Search_Syntax, but I couldn't find a way
> to look for Relationships.
>
> I argue that the data quality, i.e. the number and importance of
> Relationships on Works entities which also have Parts Relationships, is
> poor in the MusicBrainz database right now. I base this on perceptions,
> not facts. Can anyone cite facts, search results, etc., to prove
> otherwise? If we have no proof that the data quality is currently good,
> then RFC-339 will either have no effect, or will make bad data bad in a
> different way.
>
> I believe the data quality here will remain poor until we have better
> editing tools, and we have to make design choices like RFC-339 before the
> editing tools can improve. RFC-339 won't improve the data greatly. It will
> just be a brick in the bridge leading to better data. We need many other
> bricks also.
>

You are right, I should have said this with better words. First, "poor" was
a bad word; I should have used "incomplete". What I meant is that ARs would
not be entered consistently in the work trees, so that searches would give
inconsistent results.

> Frederic Da Vitoria wrote:
> >
> > There is a discrepancy between making input user-friendly and keeping
> > the data retrieval efficient...
> >
>
> I agree. A "distinction" might be an even better word than "discrepancy",
> but I understand your meaning.
>

Well, there is probably a better word than "discrepancy", but what I meant
was definitely stronger than "distinction". The first word which came to my
French mind was "hiatus".

> Frederic Da Vitoria wrote:
> >
> > I believe that since data retrieval is after all the justification of
> > data entry, then the database should physically make retrieval as
> > efficient as possible, which means that ARs should be duplicated.
> >
>
> As a software engineer, I do not completely agree with this statement. The
> database should physically make retrieval efficient /enough/ to accomplish
> its purpose. But design is about tradeoffs, and efficiency is not the only
> factor we are trying to balance. Once the retrieval is efficient enough,
> then other factors -- like data consistency and clarity of meaning --
> should get greater weight.
>

Of course, but my whole argument came from my feeling (which I would be
completely unable to substantiate; it is only a feeling) that such recursive
searches would come at too heavy a price. I may be wrong, especially since
this would mainly concern classical works, which are only a small subset of
MB's data.

-- 
Frederic Da Vitoria (davitof)
Member of April - "promoting and defending free software" - http://www.april.org
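
To make the trade-off discussed above a little more concrete, here is a
minimal sketch of the alternative to duplicating ARs: the composer is stored
once on the top-level Work and resolved for a part by walking up the Parts
chain. This is not MusicBrainz code. The Work class, its fields, the
inherited_relationship function and the sample names are all hypothetical,
and a real database would do this with a query rather than in memory. The
returned step count only gives a rough idea of the extra lookups such a
recursive search would cost per part.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple


@dataclass(eq=False)
class Work:
    """Hypothetical in-memory stand-in for a Work with Parts relationships."""
    name: str
    parent: Optional["Work"] = field(default=None, repr=False)
    relationships: Dict[str, str] = field(default_factory=dict)
    parts: List["Work"] = field(default_factory=list, repr=False)

    def add_part(self, name: str) -> "Work":
        # Create a child Work and link it both ways (parent <-> part).
        child = Work(name=name, parent=self)
        self.parts.append(child)
        return child


def inherited_relationship(work: Work, rel_type: str) -> Tuple[Optional[str], int]:
    """Walk up the Parts chain until rel_type is found.

    Returns (value, steps), where steps is the number of parent hops made.
    If no ancestor carries the relationship, value is None.
    """
    steps = 0
    node: Optional[Work] = work
    while node is not None:
        if rel_type in node.relationships:
            return node.relationships[rel_type], steps
        node = node.parent
        steps += 1
    return None, steps


# Assumed sample data: the composer AR is entered only on the top-level Work.
opera = Work("Opera", relationships={"composer": "Composer X"})
act = opera.add_part("Act 1")
scene = act.add_part("Act 1, Scene 1")

print(inherited_relationship(scene, "composer"))  # ('Composer X', 2)
```

With duplicated ARs the lookup would always take zero hops, at the price of
entering (and keeping consistent) the same AR on every part; with inheritance
the AR is entered once and each lookup costs one hop per level of the tree.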

_______________________________________________
MusicBrainz-style mailing list
MusicBrainz-style@lists.musicbrainz.org
http://lists.musicbrainz.org/mailman/listinfo/musicbrainz-style