[uf-discuss] Microformats search engine: virel
I just got several automated emails from http://www.virel.org/index.php that they found uF of mine on sites and indexed them. Does anybody know the people behind it? I am not sure if that is cool or creepy :) Chris ___ microformats-discuss mailing list microformats-discuss@microformats.org http://microformats.org/mailman/listinfo/microformats-discuss
Re: [uf-discuss] Microformats search engine: virel
Hi,

The site looks authentic and useful. I don't know about the emails. I have added my blog to the site. Let's see what happens next.

Ameer

On Mon, Jul 7, 2008 at 4:40 PM, Christian Heilmann [EMAIL PROTECTED] wrote:
> I just got several automated emails from http://www.virel.org/index.php that they found uF of mine on sites and indexed them. Does anybody know the people behind it? I am not sure if that is cool or creepy :)
>
> Chris

--
Paul Lynde - I sang in the choir for years, even though my family belonged to another church.
Re: [uf-discuss] Microformats search engine: virel
Hi,

look what happened now. I just got the same kind of email. Looks like they are sending emails to email addresses found in hCards. It's just like spam. I just dropped them a mail saying so.

Ameer

On Mon, Jul 7, 2008 at 8:48 PM, Ameer Dawood [EMAIL PROTECTED] wrote:
> Hi,
>
> The site looks authentic and useful. I don't know about the emails. I have added my blog to the site. Let's see what happens next.
>
> Ameer
>
> On Mon, Jul 7, 2008 at 4:40 PM, Christian Heilmann [EMAIL PROTECTED] wrote:
>> I just got several automated emails from http://www.virel.org/index.php that they found uF of mine on sites and indexed them. Does anybody know the people behind it? I am not sure if that is cool or creepy :)
>>
>> Chris
>
> --
> Paul Lynde - I sang in the choir for years, even though my family belonged to another church.

--
George Burns - Don't stay in bed, unless you can make money in bed.
[uf-discuss] Reviving hProduct
All,

It seems like the hProduct microformat hasn't seen a lot of revisions since its initial brainstorming in 2006 (feel free to correct me on this if there are current efforts taking place :-) ). I'm attempting to revise the schema for use in an upcoming August / September project release. I have taken the current brainstorming schema and added on some new items. I would like to open this up to discussion and move this format forward, and any assistance the community would be able to provide would be helpful.

Altered schema: http://jay.beweep.com/hproduct/hproduct-schema.txt
Unstyled HTML example: http://jay.beweep.com/hproduct/hproduct-example.html

The altered schema:

hProduct
* version. optional. text.
* name. required.
* image. optional. IMG element or rel='image'. could be further refined as image type ( thumb || full, photo || illo).
* description. optional. could be denoted as 'summary' or 'extended'.
* brand. text | hCard
* uri. optional. URI to product page, href could contain rel='product'.
* price. optional. could be further refined as specific type (sale || regular || msrp || clearance || savings). should follow currency format.
* p-v. optional. opens up possibilities for custom property-value pairs in more complex examples.
  o property. required. property types could include:
    - artist
    - author
    - released - hCal event for date of release
    - upc
    - isbn
    - sku
    - sn
    - vin
    - batch
    - size
    - color
    - uid - unique id, item number as provided by manufacturer or retailer
    - offer
    - others. possibly around product specs, features.
  o value. required. (label may be implied)
* availability. optional.
* shipping. optional. shipping messaging.
* reviews. text | hReview
* buy. optional. purchase URL.

Thanks,

Jay Myers
(e)[EMAIL PROTECTED]
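For readers who want to experiment with the proposal, the schema above can be modelled as a small data structure. This is only a sketch in Python: the class and function names are hypothetical, every field is held as plain text, and treating `brand` and `reviews` as optional is an assumption, since the proposal does not say whether they are required.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class PropertyValue:
    """One custom p-v pair; the proposal marks both parts as required."""
    prop: str   # e.g. "isbn", "sku", "color"
    value: str


@dataclass
class HProduct:
    name: str                             # the only required top-level field
    version: Optional[str] = None
    image: Optional[str] = None           # image URL
    description: Optional[str] = None
    brand: Optional[str] = None           # assumption: optional (not stated)
    uri: Optional[str] = None             # product page
    price: Optional[str] = None
    pv: List[PropertyValue] = field(default_factory=list)
    availability: Optional[str] = None
    shipping: Optional[str] = None
    reviews: Optional[str] = None         # assumption: optional (not stated)
    buy: Optional[str] = None             # purchase URL


def missing_required(product: HProduct) -> List[str]:
    """Return the names of required fields that are empty."""
    problems = []
    if not product.name.strip():
        problems.append("name")
    for i, pair in enumerate(product.pv):
        if not pair.prop.strip():
            problems.append(f"p-v[{i}].property")
        if not pair.value.strip():
            problems.append(f"p-v[{i}].value")
    return problems


book = HProduct(name="Example Book", pv=[PropertyValue("isbn", "")])
print(missing_required(book))
```

A real parser would read these values from class names in the HTML; the sketch only captures which fields the proposal marks as required.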
[uf-discuss] Re: Reviving hProduct
Hi Jay,

On 7 Jul 2008, at 17:11, Jay Myers wrote:
> All,
>
> It seems like the hProduct microformat hasn't seen a lot of revisions since its initial brainstorming in 2006 (feel free to correct me on this if there are current efforts taking place :-) ). I'm attempting to revise the schema for use in an upcoming August / September project release. I have taken the current brainstorming schema and added on some new items. I would like to open this up to discussion and move this format forward, and any assistance the community would be able to provide would be helpful.

It's great that you're keen to take a lead on further brainstorming. Please conduct work on new microformats on the [uf-new] mailing list, rather than discuss. Thanks!

> Altered schema: http://jay.beweep.com/hproduct/hproduct-schema.txt
> Unstyled HTML example: http://jay.beweep.com/hproduct/hproduct-example.html

The old product brainstorm is discarded, so please edit the wiki. Either start a fresh brainstorm section on the current page, or work with the existing text.

> The altered schema:

For reference, much of the schema you describe there has been rolled into hListing, which, whilst also technically a proposal, is more mature and has been implemented successfully by a number of people. That covers the price/merchant side of things. The documentation for listing, and iterating on it to reach draft, also needs doing; I apologise for not following through on my intent to take a lead on that. Lots of other µf things have come up that always seem more urgent.

I'd suggest that product-specific fields focus on the product _item_, e.g. where you have .hListing .item, or .hReview .item, you could insert an ‘hProduct’ there, enhancing the semantics, and achieving listing products with prices through the formats being used in combination.

Regards,

Ben

(This post has been cross-posted to µf-new. Please reply *only* to µf-new)
Re: [uf-discuss] Microformats search engine: virel
Ameer Dawood wrote:
> look what happened now. I just got the same kind of email. Looks like they are sending emails to email addresses found in hCards. It's just like spam. I just dropped them a mail saying so.

As a hardliner on this issue, my feeling is that any sentence that reads "that's {like/almost like/a kind of/close to/etc} spam" can be reduced to "that's spam" without loss of meaning or accuracy.

The issue of spam and microformats is a dead horse that's already taken a fair amount of punishment, and I think the words "out of scope" were used last time the question came up. Still, I wanted to add a couple of comments.

As far as consumers of microformats are concerned, I think that any system that generates automated mail to an address included in an hCard has crossed the line. Outside various rather improbable scenarios, there's no justification for doing this.

As far as users of microformats are concerned, the choice is (a) include your address and expect to get spam, (b) leave your address out, or (c) obscure your address. I currently favor options (b) and (c).

For (c), I actually recommend having a human-intelligible version (e.g. 'myaddress at example dot com') and then - if you like - having a run-on-document-ready JavaScript function to convert it to a mailto: link for human consumption. Crawlers - both benign and malign - typically don't execute JS, so they won't see the actual email address. I don't think that's a bad thing for reasons indicated above. Tools that actually run in a browser context, such as Operator, should get the right result (Operator does).

Angus
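The conversion step in option (c) can be sketched as follows. The message describes a JavaScript function that runs in the browser once the document is ready; this sketch shows only the string transformation such a function would perform, written in Python for illustration. The function names and the exact " at " / " dot " spelling convention are assumptions, not something the thread specifies.

```python
import re


def deobfuscate(text: str) -> str:
    """Turn 'myaddress at example dot com' into 'myaddress@example.com'."""
    # Replace the first spelled-out " at " with "@".
    address = re.sub(r"\s+at\s+", "@", text.strip(), count=1)
    # Replace every spelled-out " dot " with ".".
    address = re.sub(r"\s+dot\s+", ".", address)
    return address


def mailto_link(text: str) -> str:
    """Build the mailto: URI that a page script could attach to the link."""
    return "mailto:" + deobfuscate(text)


print(mailto_link("myaddress at example dot com"))
```

The page source carries only the spelled-out form, so a crawler that does not run scripts never sees the real address. A harvester could apply the same two substitutions, which is the weakness raised later in the thread.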
Re: [uf-discuss] Microformats search engine: virel
Hi,

Personally, I don't get the use of microformats if the data is obscured. I mean, how far would you go to retain your privacy with hCard? There has to be a point where it becomes useless. So, I suppose that puts me in favour of case A.

Paul

2008/7/7 Angus McIntyre [EMAIL PROTECTED]:
> Ameer Dawood wrote:
>> look what happened now. I just got the same kind of email. Looks like they are sending emails to email addresses found in hCards. It's just like spam. I just dropped them a mail saying so.
>
> As a hardliner on this issue, my feeling is that any sentence that reads "that's {like/almost like/a kind of/close to/etc} spam" can be reduced to "that's spam" without loss of meaning or accuracy.
>
> The issue of spam and microformats is a dead horse that's already taken a fair amount of punishment, and I think the words "out of scope" were used last time the question came up. Still, I wanted to add a couple of comments.
>
> As far as consumers of microformats are concerned, I think that any system that generates automated mail to an address included in an hCard has crossed the line. Outside various rather improbable scenarios, there's no justification for doing this.
>
> As far as users of microformats are concerned, the choice is (a) include your address and expect to get spam, (b) leave your address out, or (c) obscure your address. I currently favor options (b) and (c).
>
> For (c), I actually recommend having a human-intelligible version (e.g. 'myaddress at example dot com') and then - if you like - having a run-on-document-ready JavaScript function to convert it to a mailto: link for human consumption. Crawlers - both benign and malign - typically don't execute JS, so they won't see the actual email address. I don't think that's a bad thing for reasons indicated above. Tools that actually run in a browser context, such as Operator, should get the right result (Operator does).
>
> Angus
Re: [uf-discuss] Reviving hProduct
I'm glad to see someone else showing an interest in this. I'd love to see a product microformat revived as well for an application I'm working on. We might want to move this discussion to microformats-dev but I'm not sure...

Anyway, some comments on your proposed changes:

1) The rel=product. Does that really describe the relationship between this instance/description of the product and the product page? I'm not sure it does. I get what you're saying though, I think, which is that this is the canonical product page.

2) I think the change from an msrp attribute described on the wiki to a price attribute makes sense. I'm not sure about including 'savings' in there. It's not technically the price of the item. It's more of an adjustment.

3) What would you see as a value for the availability? I see in your example this is a text description of how the user can acquire it (e.g. in store pickup). When I initially read that attribute, it made me think more of an inventory quantity. Might there need to be an attribute for that as well?

4) As for the URL, I would think that might be better represented as a rel 'purchase' value indicating that the target URI is for purchasing this item.

Hayes
Re: [uf-discuss] Microformats search engine: virel
> As far as users of microformats are concerned, the choice is (a) include your address and expect to get spam, (b) leave your address out, or (c) obscure your address. I currently favor options (b) and (c).
>
> For (c), I actually recommend having a human-intelligible version (e.g. 'myaddress at example dot com') and then - if you like - having a run-on-document-ready JavaScript function to convert it to a mailto: link for human consumption. Crawlers - both benign and malign - typically don't execute JS, so they won't see the actual email address. I don't think that's a bad thing for reasons indicated above. Tools that actually run in a browser context, such as Operator, should get the right result (Operator does).
>
> Angus

That's got nothing to do with microformats, but if you really think that any obfuscation like 'bla dot domain' is not indexed by spammers then you are in for a treat. There is no way to protect emails online without hurting usability or accessibility. Don't waste your time with JavaScript (de)obfuscation, it is a glass shield or - even closer - a pacifier button.

What you put in microformats should be something you are happy to have out there to be found, indexed and converted. Obfuscated microformats that expect the reader technology to convert them before turning them, for example, into a vCard are just a nuisance for the end user.

This is about unearthing information we already publish and making it easier to access and re-use, which is the opposite of obfuscating.
Re: [uf-discuss] Microformats search engine: virel
Christian Heilmann wrote:
> That's got nothing to do with microformats ...

With due respect, I don't completely accept that. A case could be made that factors that influence people's adoption of microformats are legitimate topics for discussion. Uneasiness about the 'spammability' of addresses published in hCard is a deterrent to full adoption of that microformat for many users. While these considerations don't belong in the spec, they can usefully be mentioned in texts about the spec, such as 'getting started' guides.

> ... if you really think that any obfuscation like 'bla dot domain' is not indexed by spammers then you are in for a treat. There is no way to protect emails online without hurting usability or accessibility. Don't waste your time with JavaScript (de)obfuscation, it is a glass shield or - even closer - a pacifier button.

Again, I'm not in complete agreement with you. My experience - and I have actually tested this, although not as rigorously or extensively as I'd like - is that very few spammers seem to be doing much de-obfuscation, and even trivial obfuscations _currently_ offer a good degree of protection. However, I don't expect that state of affairs to last, so it's a moot point.

> What you put in microformats should be something you are happy to have out there to be found, indexed and converted. Obfuscated microformats that expect the reader technology to convert them before turning them, for example, into a vCard are just a nuisance for the end user.

In the JavaScript-based approach that I mentioned, the browser takes care of everything, with no extra work needed by the reader. However, I concede that that might not extend to screen readers (although choosing a sane, human-readable representation for the basic form can help here).

> ... This is about unearthing information we already publish and making it easier to access and re-use, which is the opposite of obfuscating.

OK, so there's an implicit challenge here. For users who are unwilling to expose their email address through hCard, what alternative mechanisms can microformats support? Many website owners use mail forms instead of publishing their email addresses. Is there a need for something like a simple 'rel=contactform' microformat to signal the availability and location of a mail contact form?

Angus
Re: [uf-discuss] Microformats search engine: virel
2008/7/7 Angus McIntyre [EMAIL PROTECTED]:
> Christian Heilmann wrote:
>> That's got nothing to do with microformats ...
>
> With due respect, I don't completely accept that. A case could be made that factors that influence people's adoption of microformats are legitimate topics for discussion. Uneasiness about the 'spammability' of addresses published in hCard is a deterrent to full adoption of that microformat for many users.

The argument is orthogonal to microformats because this is not unique to microformats. Any time you add more semantic information to your data it potentially increases the 'spammability' of it. This goes for RDFa, eRDF, RDF, POSH, microformats, RSS and anything else that might come along in the future.

>> ... This is about unearthing information we already publish and making it easier to access and re-use, which is the opposite of obfuscating.
>
> OK, so there's an implicit challenge here. For users who are unwilling to expose their email address through hCard, what alternative mechanisms can microformats support? Many website owners use mail forms instead of publishing their email addresses. Is there a need for something like a simple 'rel=contactform' microformat to signal the availability and location of a mail contact form?

You could simply use class=URL with a new rel-value. You can also mark up your chat profiles with their specific protocols: aim:, msn:, jabber:, etc. Other people only vend the data after someone has authenticated themselves, so the microformats are NOT available to the general public, but instead to a white-list of contacts.

-brian

--
brian suda
http://suda.co.uk
Re: [uf-discuss] Microformats search engine: virel
>> ... This is about unearthing information we already publish and making it easier to access and re-use, which is the opposite of obfuscating.
>
> OK, so there's an implicit challenge here. For users who are unwilling to expose their email address through hCard, what alternative mechanisms can microformats support? Many website owners use mail forms instead of publishing their email addresses. Is there a need for something like a simple 'rel=contactform' microformat to signal the availability and location of a mail contact form?
>
> Angus

Again: we are marking up content that is already published. If that person is taking steps to prevent the email or contact form from being available, there is nothing microformats (or well, hCard for email) can do for that person.

I love people that use email forms to make sure spammers can't get their mails, especially those that don't protect their forms against XSS and SQL injection and thus become a spam hub themselves :)

Microformats are nothing that needs to be sold. A person that is unhappy to disclose information on the web will certainly not get them anyway.
Re: [uf-discuss] Microformats search engine: virel
On Mon, Jul 7, 2008 at 7:52 PM, Brian Suda [EMAIL PROTECTED] wrote:
> Other people only vend the data after someone has authenticated themselves, so the microformats are NOT available to the general public, but instead to a white-list of contacts.

If you host some service that allows connections between users, you can use these connections and only reveal sensitive data to connected users... (if they require authorization by the invitee and you inform them of this)

--
André Luís
Re: [uf-discuss] Microformats search engine: virel
Brian Suda wrote:
> 2008/7/7 Angus McIntyre [EMAIL PROTECTED]:
>> Christian Heilmann wrote:
>>> That's got nothing to do with microformats ...
>>
>> With due respect, I don't completely accept that. A case could be made that factors that influence people's adoption of microformats are legitimate topics for discussion. Uneasiness about the 'spammability' of addresses published in hCard is a deterrent to full adoption of that microformat for many users.
>
> The argument is orthogonal to microformats because this is not unique to microformats. Any time you add more semantic information to your data it potentially increases the 'spammability' of it. This goes for RDFa, eRDF, RDF, POSH, microformats, RSS and anything else that might come along in the future.

Yup. And we need to get much better (across various of these projects) in making clear to users what's going on, including the bad things that might happen. If user understanding and consent is handled better, downstream sites will know what they can or can't do with the data.

Some examples:

1. tribe.net FOAF was repackaged on ex.plode.us; users freaked out:

   "What is ex.plode.us and have we been sold out?"
   topic posted Thu, February 28, 2008 - 6:02 PM
   http://brainstorm.tribe.net/thread/34fb1a79-351d-4251-8318-829623c1c9cb

   Result: tribe.net switched off their FOAF feeds. This could just as easily have been microformats.

2. Google Social Graph API (XFN and FOAF)

   The Google SGAPI makes it much easier to find out who the owner is of a YouTube account. This is currently relevant due to the Viacom/Google court case, in which Google have been asked to turn over all YouTube viewing logs, including both IP address and usernames. The judge took the view that the latter are essentially anonymous, despite the fact that the SGAPI makes it rather easy to associate YouTube URIs with FOAF and microformat data from elsewhere in the Web. Details here: http://danbri.org/words/2008/07/03/359

3. identi.ca, twitter-like microblog (opensource as laconi.ca)

   This microblogging platform encourages users to attach a Creative Commons license to their postings, which should give downstream aggregators a clearer sense of what can and can't be done with the data. We lack similar practice for FOAF and microformat content.

Where I'd like to see this go is via some survey of users, figuring out how rich an understanding of the situation we can expect of them (not much, I fear), and some attempt to make a CC-like simplification through which they can express their preferences about how their profile data is aggregated and re-used. Considering the Tribe case, it would be nice if users could've said "no commercial reuse (including banner ads) unless x% of profits go to http://charityofmychoice.example.com/". But we're a long way from that now.

If the only concrete effect on users is spam and confusion, we'll find ourselves back with data hidden in GIFs, I fear...

cheers,

Dan

--
http://danbri.org/
Re: [uf-discuss] Microformats search engine: virel
Dan Brickley wrote:
> ... we need to get much better (across various of these projects) in making clear to users what's going on, including the bad things that might happen.

Agreed.

> Some examples:

These are good examples, not least because they relate to 'good actors' (rather than 'bad actors', such as spammers, who can be expected to behave badly). Even well-intentioned (re-)use has implications and consequences.

> 3. identi.ca, twitter-like microblog (opensource as laconi.ca)
>
> This microblogging platform encourages users to attach a Creative Commons license to their postings, which should give downstream aggregators a clearer sense of what can and can't be done with the data. We lack similar practice for FOAF and microformat content.

I think this is an interesting point. It might be worth reflecting on some other mechanisms that are used for expressing directives as to how content can be used.

Of the mechanisms that I've come across, the most obvious are the CC-licenses that Dan mentioned.

Next up is the robots exclusion protocol [1], and the extensions to it now supported by Google and Yahoo! [2]. The X-Robots-Tag with its 'noarchive' and 'nosnippet' directives provides fairly granular control over what may be done with content.

Finally, there's the 'media:restriction' element used in mediaRSS [3]. In the standard, that's limited to specifying a country and 'deny' to indicate that a given piece of media isn't for distribution to that country. However, some video hosting services overload it to specify restrictions on how their content may or may not be aggregated (and by whom).

Possible directives governing use might include:

  individual only - for use by tools like Operator, but not to be crawled
  do not republish - allows automated processing, but not republishing
  non-commercial - only non-commercial republishing allowed
  no-spam - commercial republishing OK, but don't make unsolicited contact
  unrestricted - any legal use permissible

If this actually represents a continuum, then you can make it a principle that data can only be republished under the same or more restrictive terms: if A publishes data with 'non-commercial' republishing allowed, then B may only republish it as 'non-commercial', 'do-not-republish' or 'individual only'.

Angus

[1] http://www.robotstxt.org/
[2] http://googleblog.blogspot.com/2007/07/robots-exclusion-protocol-now-with-even.html
[3] https://www.google.com/webmasters/tools/video/en/video.html
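The "same or more restrictive" principle above can be sketched as a check against an ordered list. This is a minimal illustration in Python, assuming the five directives really do form a single continuum in the order given; the hyphenated spellings and the function names are hypothetical.

```python
from typing import List

# Directives from most to least restrictive, in the order listed in the
# message. Assumption: they form one linear continuum.
DIRECTIVES = [
    "individual-only",
    "do-not-republish",
    "non-commercial",
    "no-spam",
    "unrestricted",
]


def may_republish_as(original: str, proposed: str) -> bool:
    """True if `proposed` is the same as, or more restrictive than, `original`."""
    return DIRECTIVES.index(proposed) <= DIRECTIVES.index(original)


def allowed_terms(original: str) -> List[str]:
    """All terms under which data published as `original` may be republished."""
    return DIRECTIVES[: DIRECTIVES.index(original) + 1]


print(allowed_terms("non-commercial"))
print(may_republish_as("non-commercial", "unrestricted"))
```

For data published as 'non-commercial', the sketch permits 'individual-only', 'do-not-republish' and 'non-commercial' and rejects 'unrestricted', matching the A and B example in the message.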
Re: [uf-discuss] hoard.it
Sounds great! How does it deal with dates commonly found in genealogy, such as ABT 7 July 1950 or AFT 25 Dec 2000 or BEF Jan 1925? or even ABT 2000 ?

--Bob.

On 3 Jul 2008 at 23:03, Jim O'Donnell wrote:
> Hello,
>
> This might be of interest to members of this group, as it deals with extracting data from semantic HTML. Prior to this year's Mashed Museum event at the University of Leicester, Dan Zambonini put together a prototype which aggregates data by spidering online museum catalogues:
> http://hoardit.pbwiki.com/
>
> It's a pretty fantastic demo of how information can be extracted from well-structured HTML, even before you think of putting microformats etc. on top. In particular, it does a pretty good job of figuring out when an object was made:
> http://feeds.boxuk.com/museums/object_100yrs.php
>
> The date parser is based on some code Dan and I knocked together at Mashed Museum 2007, which looks at strings like 'late Victorian', 'early 20th Century', '4th January 1853' and so on, and converts them to machine-readable ISO dates. Our original idea, which we never got round to actually implementing, was that this would be useful as a web service - you give it a string, it gives you a machine-parsable representation of that string.
>
> The recent discussion here about dates has made me wonder if such a web service would be useful for microformats parsers. What do others think?
>
> Cheers
> Jim
>
> Jim O'Donnell
> [EMAIL PROTECTED]
> http://eatyourgreens.org.uk
> http://flickr.com/photos/eatyourgreens

--
Bob Jonkman [EMAIL PROTECTED]
http://sobac.com/sobac/
SOBAC Microcomputer Services    Voice: +1-519-669-0388
6 James Street, Elmira ON Canada N3B 1L5    Cel: +1-519-635-9413
Software --- Office Business Automation --- Consulting
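The thread does not say how the hoard.it parser treats qualifiers such as ABT, AFT and BEF. As a rough illustration only, the four genealogy examples above could be split into a qualifier plus a partial date like this. The names, the regular expression and the result structure are all assumptions, not a description of hoard.it's code.

```python
import re
from typing import NamedTuple, Optional

MONTHS = {
    "jan": 1, "feb": 2, "mar": 3, "apr": 4, "may": 5, "jun": 6,
    "jul": 7, "aug": 8, "sep": 9, "oct": 10, "nov": 11, "dec": 12,
}


class QualifiedDate(NamedTuple):
    qualifier: Optional[str]  # "ABT", "AFT", "BEF", or None if absent
    year: int
    month: Optional[int]      # None when only a year is given
    day: Optional[int]        # None when no day is given


# Optional qualifier, optional "[day] month", then a four-digit year.
PATTERN = re.compile(
    r"^(?:(ABT|AFT|BEF)\s+)?(?:(?:(\d{1,2})\s+)?([A-Za-z]+)\s+)?(\d{4})$"
)


def parse_qualified_date(text: str) -> QualifiedDate:
    match = PATTERN.match(text.strip())
    if match is None:
        raise ValueError(f"unrecognised date: {text!r}")
    qualifier, day, month_name, year = match.groups()
    month = None
    if month_name is not None:
        # Accept full names and three-letter abbreviations alike.
        month = MONTHS.get(month_name[:3].lower())
        if month is None:
            raise ValueError(f"unrecognised month: {month_name!r}")
    return QualifiedDate(qualifier, int(year), month, int(day) if day else None)


for example in ["ABT 7 July 1950", "AFT 25 Dec 2000", "BEF Jan 1925", "ABT 2000"]:
    print(example, "->", parse_qualified_date(example))
```

Keeping the qualifier and the missing parts explicit, rather than forcing every string into a full ISO date, avoids claiming more precision than the source text has.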
Re: [uf-discuss] hoard.it
Jim O'Donnell wrote:
> The recent discussion here about dates has made me wonder if such a web service would be useful for microformats parsers. What do others think?

It seems to me that this type of date extraction might present risks if used by uf parsers to extract date/time from published content (and lead to the "people showing up on the wrong date" error mentioned in earlier posts).

On the other hand, it might be great at the time content is authored, to convert ambiguous natural language dates into unambiguous microformats, as a way to reduce the pain of micro-formatting content (especially if it can detect dates in plain text rather than parsing something it knows is a date). Authors could confirm the generated microformats before publishing, in a way similar to how the Yahoo! Shortcuts WordPress plugin works [1].

Guillaume

[1] http://lebleu.org/blog/2008/02/09/trying-out-yahoo-shortcuts/