RE: [uf-discuss] Currency Quickpoll: Preliminary results
> It's not just about identifying which symbol represents the currency, but also which currency that symbol represents.

That's handled in my example.

> For a program to do so, it would have to be aware of every single alphanumeric character in Unicode. That does not just include [A-Za-z0-9]. It might be easier to do the reverse and know of every character that isn't a known currency symbol, but then even that list of symbols is missing some.

Is there not a regular expression that can provide every single alphanumeric character? Alternately, wouldn't it be preferred to have minimal markup if [A-Za-z0-9] can be assumed and more complex markup if it cannot as opposed to having all cases be the more complex markup?

> However, the format could make provisions for some form of quantity, even if it doesn't explicitly define what such quantities are.

I assume you are suggesting it would be optional, not required? OTOH, if there is another microformat planned for measure, is it advisable to design in overlap?

> One of the problems I have with hCard is that those abbreviated class names are difficult to comprehend and remember.

In programming I generally prefer long well described names, but I called the question in case there would be people not implementing it just because they wanted to avoid bloat. I am not suggesting that I know that this is an issue, just posing the question.

> Abbreviations can be good in many cases, but you have to be careful not to introduce too much confusion or ambiguity for authors.

However, I would disagree with abbreviations; the more ways it can be done, the more complexity in the spec and in the parser. Better to have just one way until desired functionality requires multiple ways.
-Mike

-Original Message-
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Lachlan Hunt
Sent: Friday, October 13, 2006 4:19 AM
To: Microformats Discuss
Subject: Re: [uf-discuss] Currency Quickpoll: Preliminary results
Re: [uf-discuss] Currency Quickpoll: Preliminary results
Mike Schinkel wrote:
> Lachlan Hunt wrote:
>> For a program to do so, it would have to be aware of every single alphanumeric character in Unicode. That does not just include [A-Za-z0-9]. It might be easier to do the reverse and know of every character that isn't a known currency symbol, but then even that list of symbols is missing some.
>
> Is there not a regular expression that can provide every single alphanumeric character?

No, that's my point. Have you seen how many characters there are in Unicode? It may be theoretically possible to write such a regular expression, but it would be very complex.

> Alternately, wouldn't it be preferred to have minimal markup if [A-Za-z0-9] can be assumed and more complex markup if it cannot as opposed to having all cases be the more complex markup?

[A-Za-z0-9] effectively only covers English. There are hundreds of languages and thousands of characters covered by Unicode.

>> However, the format could make provisions for some form of quantity, even if it doesn't explicitly define what such quantities are.
>
> I assume you are suggesting it would be optional, not required?

Yes.

> OTOH, if there is another microformat planned for measure, is it advisable to design in overlap?

I don't see it as overlapping, but rather leaving room for future expansion.

>> One of the problems I have with hCard is that those abbreviated class names are difficult to comprehend and remember... Abbreviations can be good in many cases, but you have to be careful not to introduce too much confusion or ambiguity for authors.
>
> However, I would disagree with abbreviations; the more ways it can be done, the more complexity in the spec and in the parser. Better to have just one way until desired functionality requires multiple ways.

What? I have no idea what you're talking about; I think you misunderstood what I was saying. By abbreviations, I was referring to abbreviated class names, like those used in hCard.
-- Lachlan Hunt http://lachy.id.au/ ___ microformats-discuss mailing list microformats-discuss@microformats.org http://microformats.org/mailman/listinfo/microformats-discuss
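The exchange above turns on whether a program must enumerate every alphanumeric character itself. As a minimal sketch, not something proposed in the thread, the Unicode character database already classifies characters, and Python exposes it through the standard `unicodedata` module. The helper names `is_currency_symbol` and `currency_symbols` are hypothetical. Note that this only tells a parser *that* a character is a currency symbol, not *which* currency it stands for, so it does not answer the first objection raised above.

```python
import unicodedata


def is_currency_symbol(ch):
    """True if ch is in Unicode general category Sc (Symbol, currency)."""
    return unicodedata.category(ch) == "Sc"


def currency_symbols(text):
    """Return the currency-symbol characters found in text, in order."""
    return [ch for ch in text if is_currency_symbol(ch)]


# The three characters listed in the thread, plus two common symbols.
for ch in ["\uFE69", "\uFF04", "\uFFE5", "$", "£"]:
    print(f"U+{ord(ch):04X}", unicodedata.name(ch), is_currency_symbol(ch))

# str.isalnum() is Unicode-aware, so it is not limited to [A-Za-z0-9].
print("é".isalnum(), "日".isalnum(), "$".isalnum())
```

A parser built this way would still need the author to say whether "$" means USD, AUD or something else, which is the part explicit markup addresses.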
Re: [uf-discuss] Currency Quickpoll: Preliminary results
In message [EMAIL PROTECTED], Mike Schinkel [EMAIL PROTECTED] writes

> Alternately, can't the symbols be extracted as not being alphanumeric characters?

Consider (for example):

The £ was worth 2.50 dollars

or:

£1 was worth 2.50 dollars

> The currency proposal at http://microformats.org/wiki/currency-brainstorming#Andy_Mabbett just seems really complex to me (but maybe it has to be.)

I'd say the latter, but if not, try paring away whatever you consider unnecessary, and see at what point you'd stop.

Note also my recent message about the reasons for needing to identify the 'symbol'.

-- 
Andy Mabbett
Say NO! to compulsory ID Cards: http://www.no2id.net/
Free Our Data: http://www.freeourdata.org.uk
Re: [uf-discuss] Currency Quickpoll: Preliminary results
In message [EMAIL PROTECTED], Lachlan Hunt [EMAIL PROTECTED] writes

> It's easy to get confused about what 'fn' means, since it could easily stand for family name, though it doesn't. (I'm not exactly sure what it stands for, though I assume it means formatted name even though it's not explicitly stated as such in the vCard RFC)

* Family Name
* First Name

(note that these first two are directly contradictory!)

* Full Name
* Formatted Name
* Familiar Name
* ...

-- 
Andy Mabbett
RE: [uf-discuss] Currency Quickpoll: Preliminary results
> [A-Za-z0-9] effectively only covers English. There are hundreds of languages and thousands of characters covered by Unicode.

I concur, but your statement does not make my suggestion invalid. I was suggesting (said a different way) a default that doesn't require the additional complexity of always having to define the currency symbol, letting it instead be assumed (i.e. if the symbol is not specified, then assume that any non-[A-Za-z0-9] characters comprise the currency symbol), and including the symbol markup only when it is required. Complexity of implementation will be the bane of adoption; I'm pushing to reduce complexity. This comes after 20 years of being someone who always advocated approaching perfection, which increased complexity. I'm learning some valuable lessons from others' Web 2.0 successes.

> I don't see it as overlapping, but rather leaving room for future expansion.

Okay.

> What? I have no idea what you're talking about, I think you misunderstood what I was saying. By abbreviations, I was referring to abbreviated class names, like those used in hCard.

I may have misunderstood. I thought you were saying it would be possible to support *both* a long name and an abbreviation. If I misunderstood, sorry for missing the point.

-Mike

-Original Message-
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Lachlan Hunt
Sent: Saturday, October 14, 2006 6:41 AM
To: Microformats Discuss
Subject: Re: [uf-discuss] Currency Quickpoll: Preliminary results
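The default Mike describes (use the explicitly marked-up symbol when there is one, otherwise treat whatever is not [A-Za-z0-9] as the symbol) can be sketched as below. This is an illustration under stated assumptions, not a parser anyone in the thread wrote: the function name `guess_symbol` is hypothetical, and also discarding whitespace and the separators "." and "," is an added assumption, since without it "$2.50" would yield "$." rather than "$".

```python
import re

# Assumption: ASCII letters, digits, whitespace, "." and "," are never part
# of a currency symbol. The thread itself only mentions [A-Za-z0-9].
_NOT_SYMBOL = re.compile(r"[A-Za-z0-9\s.,]")


def guess_symbol(text, explicit_symbol=None):
    """Return the explicit symbol if given, else every leftover character."""
    if explicit_symbol is not None:
        return explicit_symbol
    return _NOT_SYMBOL.sub("", text)


# Common cases, where the default works without extra markup.
print(repr(guess_symbol("$2.50")))
print(repr(guess_symbol("£1")))

# A non-English letter is mistaken for a symbol unless the author marks it up.
print(repr(guess_symbol("12,50 zł")))
print(repr(guess_symbol("12,50 zł", explicit_symbol="zł")))
```

The third call shows the limitation Lachlan raises: the "ł" in "zł" survives the ASCII filter on its own. The fourth shows Mike's escape hatch, where the author supplies the symbol only in the cases that need it.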
RE: [uf-discuss] Currency Quickpoll: Preliminary results
> I'm not a retailer, but if I was, I'm sure I wouldn't consider the prospect of a 20% failure rate very satisfactory...

I didn't imply that at all. Explicitly stated, I was saying that edge cases would account for 20 percent of uses or less, not that we'd have a 20% failure rate. Further, I said I believe adoption would be much more likely if the 80 percent case were much easier to implement.

It is easier to get people to learn and adopt complexity if they can get started without having to learn and use complexity. Once bought in to a concept, assuming the complexity slope is not too steep, people will incrementally accept complexity. But they won't accept complexity up front. It is a concept I learned from one of my college professors called transitionality. I blogged about it a few years ago:

http://www.mikeschinkel.com/blog/DevelopmentToolsNeedTransitionality.aspx

We can look back at OS/2 and Windows; in the early days one was optimized for quality and one was optimized for adoption. Look which one won.

> That said, why not make the symbol markup optional?

That, IMO, is an additional good idea.

-Mike

-Original Message-
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Andy Mabbett
Sent: Saturday, October 14, 2006 4:22 PM
To: Microformats Discuss
Subject: Re: [uf-discuss] Currency Quickpoll: Preliminary results

In message [EMAIL PROTECTED], Mike Schinkel [EMAIL PROTECTED] writes

>> £1 was worth 2.50 dollars
>
> Those are edge cases which require additional complexity. I'm advocating that edge cases, which are certainly in the 20 percentile or less, have the complexity, whereas the more common use-cases (certainly more than 80 percentile) should require less complex markup. Most of the time we see just $2.50 or just £1. My point is: why require all the overhead (which will likely cause this microformat not to get used very often) in order to support far less common use-cases with the same markup?
>
> From my experience of running a small internet retailer for 12 years

I'm not a retailer, but if I was, I'm sure I wouldn't consider the prospect of a 20% failure rate very satisfactory...

That said, why not make the symbol markup optional?

-- 
Andy Mabbett
RE: [uf-discuss] Currency Quickpoll: Preliminary results
Thanks for considering my input.

As for money vs. currency, for some intangible reason I prefer currency, maybe because a currency datatype always seemed more natural than a money datatype in programming, but I don't prefer it strongly enough to argue the point! :)

P.S. I added my vote to your poll, but only selected three of eight, thinking the rest shouldn't be included.

-Original Message-
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Guillaume Lebleu
Sent: Friday, October 13, 2006 1:28 AM
To: Microformats Discuss
Subject: Re: [uf-discuss] Currency Quickpoll: Preliminary results

Mike Schinkel wrote:
> This is a naïve question: Doesn't the ISO 4217 code *imply* a symbol? It appears so here: http://www.xe.com/symbols.htm Doesn't including this in the microformat create redundancy? Alternately, can't the symbols be extracted as not being alphanumeric characters?

I tend to agree with you and see this as a bit redundant, but I felt I would reproduce the suggestion for the sake of not ignoring anyone's suggestion in the vote.

> I wouldn't have guessed that meaning; I thought you were talking worldwide, not document scope. :) So how would you mark up http://tonto.eia.doe.gov/dnav/pet/pet_pri_spt_s1_d.htm ? Can you show the actual HTML to help me better understand? (not for the entire file, just a snippet.)

One solution is to use the include-pattern only; another solution is to use the th scope only (if the currency is present in the column header), or a combination of the two:

amounts in <span id="u1" class="currency">USD</span>.
...
<tr>
  <th scope="col">price<a href="#u1" class="include"></a></th>
  <td>24</td>
</tr>

> If so, wouldn't that argue for combination with units (ex. $34 per gallon, $2 per miles) being out of scope and begging the need for a microformat that allows unit designation, i.e. hUnits?

We came to the same conclusion. A separate measure effort was started: http://microformats.org/wiki/measure

> Anyway, I made a proposal here: http://microformats.org/wiki/currency-brainstorming#Mike_Schinkel with the idea of trying to minimize the burden placed on the author of the HTML, and only use lots of markup in the exceptional cases.

You have some good points there. That said, I think that currency should not be the root class, money should be, since semantically (to me) $35 is not a currency, it is money, and currency is part of money. But I see the benefits of briefness.

> My last thought on the subject, is why are we using full names for currency and amount instead of cur and amt to minimize bloat when hCard uses names like fn?

Good point too.

I will try to document the different options presented over the last days. It does not seem that we will get 100% agreement on all feature implementations, so I guess it will be up to the community to decide through a vote, or to limit the feature scope to what is 100% agreed, namely currency disambiguation.

Thank you,

Guillaume
Re: [uf-discuss] Currency Quickpoll: Preliminary results
Mike Schinkel wrote:
> Thanks for the clarification. Further questions (and forgive me if I missed any of this before I joined):
>
> Currency symbol identification
>
> This is a naïve question: Doesn't the ISO 4217 code *imply* a symbol? It appears so here: http://www.xe.com/symbols.htm Doesn't including this in the microformat create redundancy?

It's not just about identifying which symbol represents the currency, but also which currency that symbol represents.

> Alternately, can't the symbols be extracted as not being alphanumeric characters?

For a program to do so, it would have to be aware of every single alphanumeric character in Unicode. That does not just include [A-Za-z0-9]. It might be easier to do the reverse and know of every character that isn't a known currency symbol, but then even that list of symbols is missing some. e.g.

* U+FE69 ﹩ (Small Dollar Sign)
* U+FF04 ＄ (Fullwidth Dollar Sign)
* U+FFE5 ￥ (Fullwidth Yen Sign)
* etc.

It's much easier for the author to explicitly state which character(s) represent the symbol than to implement heuristics to guess.

> Broader Question: Isn't the idea behind Microformats to be as concise, cohesive, and single purposed as possible? If so, wouldn't that argue for combination with units (ex. $34 per gallon, $2 per miles) being out of scope and begging the need for a microformat that allows unit designation, i.e. hUnits?

Yes. Tackling the problem of identifying specific units under the currency format is far too complicated when you consider the sheer number of units there are, including SI units, Imperial units and US customary units, used for various quantities including number of units, length, mass, time, volume, area, energy, etc.

However, the format could make provisions for some form of quantity, even if it doesn't explicitly define what such quantities are. e.g. price per Litre:

<span class="money">
  <abbr class="currency unit" title="AUD">$</abbr>1.23
  <span class="quantity">L</span>
</span>

Or for each unit:

<span class="money">
  <abbr class="unit">$</abbr>4.95
  <span class="quantity">each</span>
</span>

That way, if and when a microformat for units of measurement is introduced, that could easily be expanded to the following. e.g.

<span class="quantity si-unit">L</span>

> My last thought on the subject, is why are we using full names for currency and amount instead of cur and amt to minimize bloat when hCard uses names like fn?

One of the problems I have with hCard is that those abbreviated class names are difficult to comprehend and remember. e.g. It's easy to get confused about what 'fn' means, since it could easily stand for family name, though it doesn't. (I'm not exactly sure what it stands for, though I assume it means formatted name even though it's not explicitly stated as such in the vCard RFC)

Abbreviations can be good in many cases, but you have to be careful not to introduce too much confusion or ambiguity for authors.

-- 
Lachlan Hunt
http://lachy.id.au/
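To make the proposed markup concrete, here is a rough sketch of how a consumer might read it, using Python's standard `html.parser`. It assumes the examples in this message are written as ordinary HTML with angle brackets and quoted attributes, and that the fragment is well-formed with no void elements. The class `MoneyParser`, the function `parse_money` and the result keys are hypothetical names for illustration. The class names `money`, `currency`, `unit` and `quantity` come from the message. Nothing here is part of any agreed microformat.

```python
from html.parser import HTMLParser


class MoneyParser(HTMLParser):
    """Collect currency code, unit text, amount and quantity from one 'money' element."""

    def __init__(self):
        super().__init__()
        self.stack = []  # class list of each currently open element
        self.result = {"currency": None, "unit": "", "amount": "", "quantity": ""}

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        classes = (attrs.get("class") or "").split()
        self.stack.append(classes)
        # The ISO 4217 code is carried in the title attribute of the abbr.
        if "currency" in classes and attrs.get("title"):
            self.result["currency"] = attrs["title"]

    def handle_endtag(self, tag):
        if self.stack:
            self.stack.pop()

    def handle_data(self, data):
        if not self.stack:
            return
        innermost = self.stack[-1]
        if "unit" in innermost:
            self.result["unit"] += data.strip()
        elif "quantity" in innermost:
            self.result["quantity"] += data.strip()
        elif "money" in innermost:
            # Text directly inside the money element is taken as the amount.
            self.result["amount"] += data.strip()


def parse_money(fragment):
    parser = MoneyParser()
    parser.feed(fragment)
    parser.close()
    return parser.result


PER_LITRE = (
    '<span class="money"><abbr class="currency unit" title="AUD">$</abbr>1.23 '
    '<span class="quantity">L</span></span>'
)
EACH = (
    '<span class="money"><abbr class="unit">$</abbr>4.95 '
    '<span class="quantity">each</span></span>'
)

print(parse_money(PER_LITRE))
print(parse_money(EACH))
```

In the second fragment no currency code is present, so the sketch leaves `currency` as `None`. Resolving a bare "$" to a specific currency is exactly the ambiguity the message says explicit markup is meant to remove.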
RE: [uf-discuss] Currency Quickpoll: Preliminary results
I want to vote on the poll but can you clarify what certain options mean exactly, maybe by hypothetical examples (quoted parts are what confuse me)?

* Currency symbol identification from other part of the text
* Global currency definition
* Amount identification from other part of the text

Thanks in advance.

-Mike

-Original Message-
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Guillaume Lebleu
Sent: Thursday, October 12, 2006 10:59 AM
To: Microformats Discuss
Subject: [uf-discuss] Currency Quickpoll: Preliminary results

I thought I'd share these results with you. Voters were asked to select up to 4 features in a list of 8. We only had a handful of votes so far, so please cast yours at: http://www.vizu.com/poll-vote.html?n=15067

Features deemed most important:

1. (100%) Currency used identification (ex. US dollars versus Canadian dollars)
2. (83.3%) Currency unit/denomination used identification (ex. dollar versus cent, pound versus shilling)
3. (50%)
   * Amount identification from other part of the text
   * Support for combination with units (ex. $34 per gallon, $2 per miles)
4. (33.3%) Global currency definition (ex. all amounts in table are in US dollars)
5. (16.7%) Currency symbol identification from other part of the text (ex. $ is the dollar sign)
6. (0%)
   * Dated money amounts (ex. Five 1929 US dollars)
   * Support for non-numerical representation (ex. 10 dollars 99 cents, five pounds 23 pence)

Guillaume
Re: [uf-discuss] Currency Quickpoll: Preliminary results
Mike Schinkel wrote:
> * Currency symbol identification from other part of the text

This means that in "$25 dollars", we would mark up $ as the currency symbol. See http://microformats.org/wiki/currency-brainstorming#Andy_Mabbett under the symbol bullet for an explanation of this.

> * Global currency definition

This means that a currency can be defined once in the document (just as you define a global variable once in a program) and then referred to when needed, instead of being defined locally every time. See for instance http://tonto.eia.doe.gov/dnav/pet/pet_pri_spt_s1_d.htm where there is a global legend "Products in Cents per Gallon", and the numbers have no currency symbol.

> * Amount identification from other part of the text

This means that in "$25 dollars" or "twenty five USD dollars", we would mark up 25 and twenty-five as the amount so that it can easily be extracted from the rest of the string. With a numerical value it may not be necessary; with a textual representation, it may be. So, depending on the scope of the proposal (do we want to support textual representations, another feature choice in the poll), this may or may not be an important related feature.

Hope this helps.

Guillaume
Re: [uf-discuss] Currency Quickpoll: Preliminary results
In my previous table example, you should read:

<tr>
  <th scope="col">price<a href="#u1" class="include"></a></th>
</tr>
<tr>
  <td>24</td>
</tr>

Guillaume
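As an illustration of how a consumer might resolve the "global currency" idea with the include-pattern, the sketch below pairs each table cell with the currency referenced from its column header. It uses Python's standard `html.parser`. It assumes well-formed markup, that the referenced element's id is `u1` (matching `href="#u1"`), and that the header's position in its row identifies the column. The class `IncludeResolver` and the function `amounts_with_currency` are hypothetical names. This is a reading of the example in this thread, not a specified algorithm.

```python
from html.parser import HTMLParser


class IncludeResolver(HTMLParser):
    """Record text by element id, per-column include references, and td text."""

    def __init__(self):
        super().__init__()
        self.open_ids = []     # id (or None) of each currently open element
        self.text_by_id = {}   # element id -> accumulated text content
        self.column_refs = []  # one entry per th: referenced id or None
        self.rows = []         # one list of td texts per tr
        self.in_th = False
        self.in_td = False

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        self.open_ids.append(attrs.get("id"))
        if attrs.get("id"):
            self.text_by_id[attrs["id"]] = ""
        if tag == "tr":
            self.rows.append([])
        elif tag == "th":
            self.in_th = True
            self.column_refs.append(None)
        elif tag == "td" and self.rows:
            self.in_td = True
            self.rows[-1].append("")
        elif tag == "a" and self.in_th:
            classes = (attrs.get("class") or "").split()
            href = attrs.get("href") or ""
            if "include" in classes and href.startswith("#"):
                self.column_refs[-1] = href[1:]

    def handle_endtag(self, tag):
        if self.open_ids:
            self.open_ids.pop()
        if tag == "th":
            self.in_th = False
        elif tag == "td":
            self.in_td = False

    def handle_data(self, data):
        for element_id in self.open_ids:
            if element_id:
                self.text_by_id[element_id] += data
        if self.in_td:
            self.rows[-1][-1] += data


def amounts_with_currency(html):
    """Return (amount text, currency or None) for every td, in document order."""
    parser = IncludeResolver()
    parser.feed(html)
    parser.close()
    pairs = []
    for row in parser.rows:
        for col, text in enumerate(row):
            ref = parser.column_refs[col] if col < len(parser.column_refs) else None
            currency = parser.text_by_id.get(ref, "").strip() or None
            pairs.append((text.strip(), currency))
    return pairs


DOC = """
<p>amounts in <span id="u1" class="currency">USD</span>.</p>
<table>
<tr><th scope="col">price<a href="#u1" class="include"></a></th></tr>
<tr><td>24</td></tr>
</table>
"""

print(amounts_with_currency(DOC))
```

References are resolved only after the whole document has been read, so the currency definition could equally appear after the table.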