On Wed, Jan 14, 2009 at 10:46 PM, Toby A Inkster <m...@tobyinkster.co.uk> wrote:
> Brian Suda wrote:
>
>> But this isn't unique to microformats, other semantic technologies
>> would have this issue as well.
>
> FOAF (and RDF more generally) has a set of well-established conventions for
> merging data. Certain properties are taken to be what is called "inverse
> functional properties" (IFPs) - what that means in English is that if P is
> an IFP, and two people have a property P with the same value, then they're
> really the same person.
>

Wow.. I wasn't aware of this. Thanks for the tip.

> foaf:mbox is for example defined as an IFP - each mailbox marked up with
> foaf:mbox belongs to exactly one person. If two people share a foaf:mbox,
> then they are the same person, so their data can be merged. (I know what
> you're thinking... there are people who share a mailbox, so doesn't this
> break? In theory, no it doesn't break - the specification says that it's for
> "personal mailboxes" only, "ie. an Internet mailbox associated with exactly
> one owner". In practice, people occasionally ignore the spec, but for the
> most part it works well.) There are other IFPs too, such as foaf:jabberID,
> foaf:openid, etc.
>
> So, for hCard/vCard, what are candidates for IFPs? We've discussed "uid"
> before, and the general agreement is that that should be fairly safe.
> "Photo" looks like it might be a good candidate to begin with, and probably
> will do in practice, but in theory the vCard spec defines it far too loosely
> - two people could allowably have the same photo. "Key" is pretty much in
> the same bucket as "photo", but is probably less useful as few people use it
> anyway. So really, "uid" is just about it - shame not many people use that
> either.
>

Hmmm.. can't we use emails? If two hCards have the same email, aren't
they the same entity?

>> wouldn't you just keep a list of the pages you have already
>> crawled? So if you find a tagcloud on page /item1.html and it links to
>> /tags/tag1 then on page item2.htm you re-find the tag cloud which
>> links to /tags/tag1 you don't follow it again?
>
>
> I don't think that that's quite André's point. A lot of blogs have tag
> clouds - long lists of perhaps a hundred tags, in various sized fonts which
> act as jumping off points to other parts of the site. They are not tags in
> the rel=tag sense of the word in that they do not describe the content of
> the current page, but of the site as a whole.
> People should not be marking
> them with rel=tag, but nonetheless some people do. And it means that
> essentially every single page on their site has the same massive set of tags
> - rel=tag becomes useless on the whole site.

Exactly. I agree that this is not the purpose of rel-tags, but I only
brought it up because, out of a very small sample, quite a few examples
popped out.

The only way out of this mess that I can think of is to create a
microformat for tagclouds, like a root element with class="tagcloud"
(the actual name could be based on the most used term). That would give
parsers the mechanism to either exclude all rel-tags inside .tagcloud or
to grab the rel-tags inside of the .tagcloud and bail out...

This brings me to yet another point that I considered when I gave that
talk... if there was a semantic way of attaching a site-wide weight to a
rel-tag, that would be *awesome* for these cases. :)

But we've seen that embedding machine-data into microformats is a
dangerous path... ;)

Thanks for your feedback,
André Luís
_______________________________________________
microformats-new mailing list
microformats-new@microformats.org
http://microformats.org/mailman/listinfo/microformats-new
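
A minimal sketch of the .tagcloud idea from this message, assuming the
hypothetical class name "tagcloud" (no such microformat exists; the name
is only the placeholder suggested above). It uses Python's standard
html.parser to split rel-tag links into those that describe the page and
those that sit inside a tag cloud, so a parser could either ignore the
cloud or handle it separately. RelTagCollector and collect_rel_tags are
illustrative names, not part of any microformats library.

```python
from html.parser import HTMLParser

# Elements that never get a closing tag, so they must not go on the stack.
VOID_ELEMENTS = {"area", "base", "br", "col", "embed", "hr", "img",
                 "input", "link", "meta", "source", "track", "wbr"}


class RelTagCollector(HTMLParser):
    """Collect rel="tag" hrefs, separating those inside a tag cloud."""

    def __init__(self, cloud_class="tagcloud"):  # class name is an assumption
        super().__init__()
        self.cloud_class = cloud_class
        self.open_tags = []    # stack of (tag name, is_cloud_root)
        self.cloud_depth = 0   # > 0 while inside a tag cloud element
        self.page_tags = []    # rel-tags that describe the page itself
        self.cloud_tags = []   # rel-tags found inside a tag cloud

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        classes = (attrs.get("class") or "").split()
        is_cloud_root = self.cloud_class in classes
        if tag not in VOID_ELEMENTS:
            self.open_tags.append((tag, is_cloud_root))
            if is_cloud_root:
                self.cloud_depth += 1
        rels = (attrs.get("rel") or "").lower().split()
        if tag == "a" and "tag" in rels and attrs.get("href"):
            target = self.cloud_tags if self.cloud_depth else self.page_tags
            target.append(attrs["href"])

    def handle_endtag(self, tag):
        if tag in VOID_ELEMENTS:
            return
        # Pop back to the nearest matching open tag; this tolerates
        # unclosed elements in sloppy markup.
        for i in range(len(self.open_tags) - 1, -1, -1):
            if self.open_tags[i][0] == tag:
                for _, was_cloud_root in self.open_tags[i:]:
                    if was_cloud_root:
                        self.cloud_depth -= 1
                del self.open_tags[i:]
                break


def collect_rel_tags(html):
    """Return (page_tags, cloud_tags) for an HTML string."""
    parser = RelTagCollector()
    parser.feed(html)
    parser.close()
    return parser.page_tags, parser.cloud_tags


SAMPLE = """
<div class="post">
  <a rel="tag" href="/tags/microformats">microformats</a>
</div>
<ul class="sidebar tagcloud">
  <li><a rel="tag" href="/tags/css">css</a></li>
  <li><a rel="tag" href="/tags/html">html</a><br></li>
</ul>
<p><a rel="tag" href="/tags/hcard">hcard</a></p>
"""

page_tags, cloud_tags = collect_rel_tags(SAMPLE)
print("page:", page_tags)
print("cloud:", cloud_tags)
```

A crawler that only wants tags describing the current page would keep
page_tags and drop cloud_tags; one that wants the site-wide set would do
the opposite and stop after the first cloud it finds.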