RE: [uf-discuss] Currency Quickpoll: Preliminary results

2006-10-14 Thread Mike Schinkel
 It's not just about identifying which symbol represents the currency, but 
 also which currency that symbol represents.

That's handled in my example.

 For a program to do so, it would have to be aware of every single 
 alphanumeric character in Unicode.  That does not just include [A-Za-z0-9].  
 It might be easier to do the reverse and know of every character that isn't 
 a known currency symbol, but then even that list of symbols is missing some.

Is there not a regular expression that can provide every single alphanumeric 
character?  Alternately, wouldn't it be preferred to have minimal markup if 
[A-Za-z0-9] can be assumed and more complex markup if it cannot as opposed to 
having all cases be the more complex markup?

 However, the format could make provisions for some form of quantity, even if 
 it doesn't explicitly define what such quantities are.

I assume you are suggesting it would be optional, not required?

OTOH, if there is another microformat planned for measure, is it advisable to 
design in overlap?

 One of the problems I have with hCard is that those abbreviated class names 
 are difficult to comprehend and remember.  

In programming I generally prefer long well described names, but I called the 
question in case there would be people not implementing it just because they 
wanted to avoid bloat. I am not suggesting that I know that this is an issue, 
just posing the question.

 Abbreviations can be good in many cases, but you have to be careful not to 
 introduce too much confusion or ambiguity for authors.

However, I would disagree with abbreviations; the more ways it can be done, the 
more complexity in the spec and in the parser.  Better to have just one way 
until desired functionality requires multiple ways.

-Mike

-Original Message-
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Lachlan Hunt
Sent: Friday, October 13, 2006 4:19 AM
To: Microformats Discuss
Subject: Re: [uf-discuss] Currency Quickpoll: Preliminary results

Mike Schinkel wrote:
 Thanks for the clarification.
 
 Further questions (and forgive me if I missed any of this before I joined):
 
 Currency symbol identification
 
 This is a naïve question: Doesn't the ISO 4217 code *imply* a symbol?  
 It appears so here: http://www.xe.com/symbols.htm  Doesn't including 
 this in the microformat create redundancy?

It's not just about identifying which symbol represents the currency, but also 
which currency that symbol represents.

 Alternately, can't the symbols be extracted as not being alphanumeric 
 characters?

For a program to do so, it would have to be aware of every single alphanumeric 
character in Unicode.  That does not just include [A-Za-z0-9].  It might be 
easier to do the reverse and know of every character that isn't a known 
currency symbol, but then even that list of symbols is missing some.

e.g.
* U+FE69 ﹩ (Small Dollar Sign)
* U+FF04 ＄ (Fullwidth Dollar Sign)
* U+FFE5 ￥ (Fullwidth Yen Sign)
* etc.

It's much easier for the author to explicitly state which character(s) 
represent the symbol, than implementing heuristics to guess.

 Broader Question: 
 Isn't the idea behind Microformats to be as concise, cohesive, and 
 single purposed as possible?  If so, wouldn't that argue for 
 combination with units (ex. $34 per gallon, $2 per mile) being out 
 of scope and begging the need for a microformat that allows unit designation, 
 i.e. hUnits?

Yes.  Tackling the problem of identifying specific units under the currency 
format is far too complicated when you consider the sheer number of units there 
are, including SI units, Imperial units and US customary units, used for 
various quantities including number of units, length, mass, time, volume, area, 
energy, etc.

However, the format could make provisions for some form of quantity, even if it 
doesn't explicitly define what such quantities are.

e.g. price per Litre:

<span class="money">
   <abbr class="currency unit" title="AUD">$</abbr>1.23
   <span class="quantity">L</span>
</span>

Or for each unit:

<span class="money">
   <abbr class="unit">$</abbr>4.95
   <span class="quantity">each</span>
</span>

That way, if and when a microformat for units of measurement is introduced, 
that could easily be expanded to the following.  e.g.

   <span class="quantity si-unit">L</span>

 My last thought on the subject, is why are we using full names for 
 currency and amount instead of cur and amt to minimize bloat when 
 hCard uses names like fn?

One of the problems I have with hCard is that those abbreviated class names are 
difficult to comprehend and remember.  e.g. It's easy to get confused about 
what 'fn' means, since it could easily stand for family name, though it 
doesn't.  (I'm not exactly sure what it stands for, though I assume it means 
formatted name even though it's not explicitly stated as such in the vCard 
RFC)

Abbreviations can be good in many cases, but you have to be careful not to 
introduce too much confusion or ambiguity for authors.

--
Lachlan Hunt

Re: [uf-discuss] Currency Quickpoll: Preliminary results

2006-10-14 Thread Lachlan Hunt

Mike Schinkel wrote:

Lachlan Hunt wrote:

For a program to do so, it would have to be aware of every single
alphanumeric character in Unicode.  That does not just include
[A-Za-z0-9].  It might be easier to do the reverse and know of
every character that isn't a known currency symbol, but then even
that list of symbols is missing some.


Is there not a regular expression that can provide every single
alphanumeric character?


No, that's my point.  Have you seen how many characters there are in 
Unicode?  It may be theoretically possible to write such a regular 
expression, but it would be very complex.


Alternately, wouldn't it be preferred to have minimal markup if 
[A-Za-z0-9] can be assumed and more complex markup if it cannot 
as opposed to having all cases be the more complex markup?


[A-Za-z0-9] effectively only covers English.  There are hundreds of 
languages and thousands of characters covered by Unicode.



However, the format could make provisions for some form of
quantity, even if it doesn't explicitly define what such quantities
are.


I assume you are suggesting it would be optional, not required?


Yes.


OTOH, if there is another microformat planned for measure, is it
advisable to design in overlap?


I don't see it as overlapping, but rather leaving room for future expansion.

One of the problems I have with hCard is that those abbreviated class 
names are difficult to comprehend and remember...


Abbreviations can be good in many cases, but you have to be careful
not to introduce too much confusion or ambiguity for authors.


However, I would disagree with abbreviations; the more ways it can be
done, the more complexity in the spec and in the parser.  Better to
have just one way until desired functionality requires multiple ways.


What?  I have no idea what you're talking about, I think you 
misunderstood what I was saying.  By abbreviations, I was referring to 
abbreviated class names, like those used in hCard.


--
Lachlan Hunt
http://lachy.id.au/
___
microformats-discuss mailing list
microformats-discuss@microformats.org
http://microformats.org/mailman/listinfo/microformats-discuss


Re: [uf-discuss] Currency Quickpoll: Preliminary results

2006-10-14 Thread Andy Mabbett
In message [EMAIL PROTECTED], Mike Schinkel
[EMAIL PROTECTED] writes


Alternately, can't the symbols be extracted as not being alphanumeric
characters?

Consider (for example):

The £ was worth 2.50 dollars
or:

£1 was worth 2.50 dollars

The currency proposal at
http://microformats.org/wiki/currency-brainstorming#Andy_Mabbett just seems
really complex to me (but maybe it has to be.)

I'd say the latter, but if not, try paring away whatever you consider
unnecessary, and see at what point you'd stop.

Note also my recent message about the reasons for needing to identify
the 'symbol'.
-- 
Andy Mabbett
Say NO! to compulsory ID Cards:  http://www.no2id.net/

Free Our Data:  http://www.freeourdata.org.uk


Re: [uf-discuss] Currency Quickpoll: Preliminary results

2006-10-14 Thread Andy Mabbett
In message [EMAIL PROTECTED], Lachlan Hunt
[EMAIL PROTECTED] writes

It's easy to get confused about what 'fn' means, since it could easily
stand for family name, though it doesn't.  (I'm not exactly sure what
it stands for, though I assume it means formatted name even though
it's not explicitly stated as such in the vCard RFC)

Family Name
First Name (note that these first two are directly contradictory!)
Full Name
Formatted Name
Familiar Name
...
-- 
Andy Mabbett
Say NO! to compulsory ID Cards:  http://www.no2id.net/

Free Our Data:  http://www.freeourdata.org.uk


RE: [uf-discuss] Currency Quickpoll: Preliminary results

2006-10-14 Thread Mike Schinkel
 [A-Za-z0-9] effectively only covers English.  There are hundreds of
languages and thousands of characters covered by Unicode. 

I concur, but your statement does not make my suggestion invalid. I was
suggesting (put a different way) a default that doesn't require the
additional complexity of always having to define the currency symbol,
letting it instead be assumed (i.e. if the symbol is not specified then assume
that any non-[A-Za-z0-9] characters comprise the currency symbol), and if it is
required then include the symbol.
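
To make the proposed default concrete, here is a rough Python sketch. The function name `split_money` is hypothetical, and treating '.', ',' and whitespace as part of the amount rather than the symbol is an assumption made for illustration, not something agreed on the list.

```python
import re

def split_money(text, symbol=None):
    """Split a money string into (symbol, amount).

    If symbol is given explicitly it is used as-is. Otherwise every
    character outside [A-Za-z0-9], apart from '.', ',' and whitespace,
    is assumed to belong to the currency symbol.
    """
    if symbol is None:
        # Default case: the symbol is whatever is left after removing
        # ASCII alphanumerics and number punctuation.
        symbol = re.sub(r"[A-Za-z0-9.,\s]", "", text)
        amount = re.sub(r"[^A-Za-z0-9.,\s]", "", text)
    else:
        # Explicit case: the author told us which characters are the symbol.
        amount = text.replace(symbol, "")
    return symbol, amount.strip()

print(split_money("$2.50"))
print(split_money("£1"))
print(split_money("kr 100", symbol="kr"))
```

The default covers inputs such as $2.50 and £1; a symbol made of letters, such as kr, only works when it is stated explicitly.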

Complexity of implementation will be the bane of adoption; I'm pushing to
reduce complexity. This comes after 20 years of always advocating for
perfection, which increased complexity. I'm learning some valuable lessons
from others' Web 2.0 successes.

 I don't see it as overlapping, but rather leaving room for future
expansion.

Okay.

 What?  I have no idea what you're talking about, I think you
misunderstood what I was saying.  By abbreviations, I was referring to
abbreviated class names, like those used in hCard.

I may have misunderstood. I thought you were saying it would be possible to
support *both* a long name and an abbreviation. If I misunderstood, sorry
for my missing the point.

-Mike

-Original Message-
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] On Behalf Of Lachlan
Hunt
Sent: Saturday, October 14, 2006 6:41 AM
To: Microformats Discuss
Subject: Re: [uf-discuss] Currency Quickpoll: Preliminary results

Mike Schinkel wrote:
 Lachlan Hunt wrote:
 For a program to do so, it would have to be aware of every single 
 alphanumeric character in Unicode.  That does not just include 
 [A-Za-z0-9].  It might be easier to do the reverse and know of every 
 character that isn't a known currency symbol, but then even that list 
 of symbols is missing some.
 
 Is there not a regular expression that can provide every single 
 alphanumeric character?

No, that's my point.  Have you seen how many characters there are in
Unicode?  It may be theoretically possible to write such a regular
expression, but it would be very complex.

 Alternately, wouldn't it be preferred to have minimal markup if 
 [A-Za-z0-9] can be assumed and more complex markup if it cannot as 
 opposed to having all cases be the more complex markup?

[A-Za-z0-9] effectively only covers English.  There are hundreds of
languages and thousands of characters covered by Unicode.

 However, the format could make provisions for some form of quantity, 
 even if it doesn't explicitly define what such quantities are.
 
 I assume you are suggesting it would be optional, not required?

Yes.

 OTOH, if there is another microformat planned for measure, is it 
 advisable to design in overlap?

I don't see it as overlapping, but rather leaving room for future expansion.

 One of the problems I have with hCard is that those abbreviated class 
 names are difficult to comprehend and remember...

 Abbreviations can be good in many cases, but you have to be careful 
 not to introduce too much confusion or ambiguity for authors.
 
 However, I would disagree with abbreviations; the more ways it can be 
 done, the more complexity in the spec and in the parser.  Better to 
 have just one way until desired functionality requires multiple ways.

What?  I have no idea what you're talking about, I think you misunderstood
what I was saying.  By abbreviations, I was referring to abbreviated class
names, like those used in hCard.

--
Lachlan Hunt
http://lachy.id.au/


RE: [uf-discuss] Currency Quickpoll: Preliminary results

2006-10-14 Thread Mike Schinkel
 I'm not a retailer, but if I was, I'm sure I wouldn't consider the
prospect of a 20% failure rate very satisfactory...

I didn't imply that at all.  Explicitly stated, I was saying that edge cases
would be in the 20 percentile, not that we'd have a 20% failure rate.

Further, I said I believe that it would be much more likely to see adoption
if the 80 percentile case were much easier to implement.  It is easier to
get people to learn and adopt complexity if they can get started by not
having to learn and use complexity. Once bought in to a concept, assuming
the complexity slope is not too steep, people will incrementally accept
complexity. But they won't accept complexity up front. It is a concept I
learned from one of my college professors called transitionality.  I
blogged about it a few years ago:

http://www.mikeschinkel.com/blog/DevelopmentToolsNeedTransitionality.aspx

We can look back at OS/2 and Windows; in the early days one was
optimized for quality and one was optimized for adoption. Look which one
won.

 That said, why not make the symbol markup optional? 

That, IMO, is an additional good idea.

-Mike


-Original Message-
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] On Behalf Of Andy
Mabbett
Sent: Saturday, October 14, 2006 4:22 PM
To: Microformats Discuss
Subject: Re: [uf-discuss] Currency Quickpoll: Preliminary results

In message [EMAIL PROTECTED], Mike Schinkel
[EMAIL PROTECTED] writes

£1 was worth 2.50 dollars

Those are edge cases which require additional complexity. I'm 
advocating that edge cases, which are certainly in the 20 percentile or 
less have the complexity whereas the more common use-cases (certainly 
more than 80 percentile) should require less complex markup.

Most of the time we see just $2.50 or just £1. My point is: why require 
all the overhead (which will likely cause this microformat not to get 
used very often) in order to support far less common use-cases with the 
same markup?

From my experience of running a small internet retailer for 12 years

I'm not a retailer, but if I was, I'm sure I wouldn't consider the prospect
of a 20% failure rate very satisfactory...

That said, why not make the symbol markup optional?

--
Andy Mabbett
Say NO! to compulsory ID Cards:  http://www.no2id.net/

Free Our Data:  http://www.freeourdata.org.uk


RE: [uf-discuss] Currency Quickpoll: Preliminary results

2006-10-13 Thread Mike Schinkel
Thanks for considering my input.   As for money vs. currency for some
intangible reason I prefer currency, maybe because currency datatype always
seemed more natural than money data type in programming, but I don't prefer
it strongly enough to argue the point! :)

P.S. I added my vote to your poll, but only selected three of eight thinking
the rest shouldn't be included.

-Original Message-
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] On Behalf Of
Guillaume Lebleu
Sent: Friday, October 13, 2006 1:28 AM
To: Microformats Discuss
Subject: Re: [uf-discuss] Currency Quickpoll: Preliminary results


Mike Schinkel wrote:

This is a naïve question: Doesn't the ISO 4217 code *imply* a symbol?  
It appears so here: http://www.xe.com/symbols.htm  Doesn't including 
this in the microformat create redundancy?

Alternately, can't the symbols be extracted as not being alphanumeric 
characters?

I tend to agree with you and see this as a bit redundant, but I felt I should
reproduce the suggestion so as not to ignore anyone's input in the vote.


I wouldn't have guessed that meaning; I thought you were talking worldwide,
not document scope. :)  So how would you mark up
http://tonto.eia.doe.gov/dnav/pet/pet_pri_spt_s1_d.htm ?  Can you show the
actual HTML to help me better understand?  (not for the entire file, just a
snippet.)

One solution is to use the include-pattern only; another solution is to 
use the th scope only (if the currency is present in the column header), 
or a combination of the two:

amounts in <span id="u1" class="currency">USD</span>.

...
<tr>
<th scope="col">price<a href="#u1" class="include"></a></th>
<td>24</td>
</tr>

If so, wouldn't that argue for combination with
units (ex. $34 per gallon, $2 per mile) being out of scope and begging the
need for a microformat that allows unit designation, i.e. hUnits?
  

We came to the same conclusion. A separate measure effort was started: 
http://microformats.org/wiki/measure

Anyway, I made a proposal here:
http://microformats.org/wiki/currency-brainstorming#Mike_Schinkel with the
idea of trying to minimize the burden placed on the author of the HTML, and
only use lots of markup in the exceptional cases.

  

You have some good points there. That said, I think that currency should 
not be the root class, money should be, since semantically (to me) $35 
is not a currency, it is money, and currency is part of money. But I see 
the benefits of briefness.

My last thought on the subject, is why are we using full names for currency
and amount instead of cur and amt to minimize bloat when hCard uses
names like fn?
  

Good point too.

I will try to document the different options presented over the last 
days. It does not seem that we will get a 100% on all feature 
implementations, so I guess it will be up to the community to decide 
through a vote, or limit the feature scope to what is 100% agreed, 
namely currency disambiguation.

Thank you,

Guillaume

  



Re: [uf-discuss] Currency Quickpoll: Preliminary results

2006-10-13 Thread Lachlan Hunt

Mike Schinkel wrote:

Thanks for the clarification.

Further questions (and forgive me if I missed any of this before I joined):


Currency symbol identification


This is a naïve question: Doesn't the ISO 4217 code *imply* a symbol?  It
appears so here: http://www.xe.com/symbols.htm  Doesn't including this in
the microformat create redundancy?


It's not just about identifying which symbol represents the currency, 
but also which currency that symbol represents.



Alternately, can't the symbols be extracted as not being alphanumeric
characters?


For a program to do so, it would have to be aware of every single 
alphanumeric character in Unicode.  That does not just include 
[A-Za-z0-9].  It might be easier to do the reverse and know of every 
character that isn't a known currency symbol, but then even that list of 
symbols is missing some.


e.g.
* U+FE69 ﹩ (Small Dollar Sign)
* U+FF04 ＄ (Fullwidth Dollar Sign)
* U+FFE5 ￥ (Fullwidth Yen Sign)
* etc.

It's much easier for the author to explicitly state which character(s) 
represent the symbol, than implementing heuristics to guess.
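
For what it's worth, the characters listed above can be checked against the Unicode general category instead of a hand-written list. A minimal sketch, assuming Python's standard `unicodedata` module; the helper name `currency_symbols` is hypothetical, and the result depends on the Unicode version bundled with the interpreter, so recently added symbols can still be missed:

```python
import unicodedata

def currency_symbols(text):
    """Return the characters in text whose general category is Sc (Symbol, currency)."""
    return [ch for ch in text if unicodedata.category(ch) == "Sc"]

# The three characters from the list above, plus a few ordinary ones.
sample = "\uFE69 \uFF04 \uFFE5 $ \u00A3 A 9"
for ch in currency_symbols(sample):
    print(f"U+{ord(ch):04X} {unicodedata.name(ch)}")
```

This only says that a character is some currency symbol. It does not say which currency the symbol stands for, so it does not replace explicit markup.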


Broader Question: 
Isn't the idea behind Microformats to be as concise, cohesive, and single
purposed as possible?  If so, wouldn't that argue for combination with
units (ex. $34 per gallon, $2 per mile) being out of scope and begging the
need for a microformat that allows unit designation, i.e. hUnits?


Yes.  Tackling the problem of identifying specific units under the 
currency format is far too complicated when you consider the sheer 
number of units there are, including SI units, Imperial units and US 
customary units, used for various quantities including number of units, 
length, mass, time, volume, area, energy, etc.


However, the format could make provisions for some form of quantity, 
even if it doesn't explicitly define what such quantities are.


e.g. price per Litre:

<span class="money">
  <abbr class="currency unit" title="AUD">$</abbr>1.23
  <span class="quantity">L</span>
</span>

Or for each unit:

<span class="money">
  <abbr class="unit">$</abbr>4.95
  <span class="quantity">each</span>
</span>

That way, if and when a microformat for units of measurement is 
introduced, that could easily be expanded to the following.  e.g.


  <span class="quantity si-unit">L</span>
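
As a rough illustration of how a consumer might read markup like this, here is a sketch using Python's standard `html.parser`. The class names (money, currency, unit, quantity) come from the examples above; the `MoneyParser` and `parse_money` names, and the reading that bare text inside the money element is the amount, are assumptions for illustration and not part of any agreed format.

```python
from html.parser import HTMLParser

class MoneyParser(HTMLParser):
    """Collect currency code, unit, amount and quantity from one 'money' element."""

    def __init__(self):
        super().__init__()
        self.result = {"currency": None, "unit": None, "amount": "", "quantity": None}
        self._stack = []  # class lists of the currently open elements

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        classes = (attrs.get("class") or "").split()
        self._stack.append(classes)
        if "currency" in classes:
            # The ISO 4217 code is carried in the title attribute.
            self.result["currency"] = attrs.get("title")

    def handle_endtag(self, tag):
        if self._stack:
            self._stack.pop()

    def handle_data(self, data):
        text = data.strip()
        if not text or not self._stack:
            return
        classes = self._stack[-1]
        if "unit" in classes:
            self.result["unit"] = text
        elif "quantity" in classes:
            self.result["quantity"] = text
        elif "money" in classes:
            # Text directly inside the money element is taken to be the amount.
            self.result["amount"] += text

def parse_money(html):
    parser = MoneyParser()
    parser.feed(html)
    parser.close()
    return parser.result

print(parse_money(
    '<span class="money">'
    '<abbr class="currency unit" title="AUD">$</abbr>1.23 '
    '<span class="quantity">L</span>'
    '</span>'
))
```

Because quantity is just another class inside money, a parser that does not know about it can ignore it, which fits the idea of leaving room for a later units format.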


My last thought on the subject, is why are we using full names for currency
and amount instead of cur and amt to minimize bloat when hCard uses
names like fn?


One of the problems I have with hCard is that those abbreviated class 
names are difficult to comprehend and remember.  e.g. It's easy to get 
confused about what 'fn' means, since it could easily stand for family 
name, though it doesn't.  (I'm not exactly sure what it stands for, 
though I assume it means formatted name even though it's not 
explicitly stated as such in the vCard RFC)


Abbreviations can be good in many cases, but you have to be careful not 
to introduce too much confusion or ambiguity for authors.


--
Lachlan Hunt
http://lachy.id.au/


RE: [uf-discuss] Currency Quickpoll: Preliminary results

2006-10-12 Thread Mike Schinkel
I want to vote on the poll but can you clarify what certain options mean
exactly, maybe by hypothetical examples (quoted parts are what confuse me)? 

* Currency symbol identification from other part of the text  
* Global currency definition 
* Amount identification from other part of the text

Thanks in advance.

-Mike


-Original Message-
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] On Behalf Of
Guillaume Lebleu
Sent: Thursday, October 12, 2006 10:59 AM
To: Microformats Discuss
Subject: [uf-discuss] Currency Quickpoll: Preliminary results

I thought I'd share these results with you. Voters were asked to select up
to 4 features in a list of 8.

We only had a handful of votes so far, so please cast yours at: 
http://www.vizu.com/poll-vote.html?n=15067

Features deemed most important:

   1. (100%) Currency used identification (ex. US dollars versus
  Canadian dollars)
   2. (83.3%) Currency unit/denomination used identification (ex. dollar
  versus cent, pound versus shilling)
   3. (50%)
  * Amount identification from other part of the text
  * Support for combination with units (ex. $34 per gallon, $2
per mile)
   4. (33.3%) Global currency definition (ex. all amounts in table are
  in US dollars)
   5. (16.7%) Currency symbol identification from other part of the text
  (ex. $ is the dollar sign)
   6. (0%)
  * Dated money amounts (ex. Five 1929 US dollars)
  * Support for non-numerical representation (ex. 10 dollars 99
cents, five pounds 23 pence)


Guillaume




Re: [uf-discuss] Currency Quickpoll: Preliminary results

2006-10-12 Thread Guillaume Lebleu

Mike Schinkel wrote:

* Currency symbol identification from other part of the text
  
This means that in "$25 dollars", we would mark up "$" as the currency 
symbol. See 
http://microformats.org/wiki/currency-brainstorming#Andy_Mabbett under 
the "symbol" bullet for an explanation of this.

* Global currency definition
  
This means that a currency can be defined once in the document (just 
as you define a global variable once in a program) and then referred to 
when needed, instead of being defined locally every time.  See for instance: 
http://tonto.eia.doe.gov/dnav/pet/pet_pri_spt_s1_d.htm where there is a 
global legend "Products in Cents per Gallon", and then the numbers have 
no currency symbol.

* Amount identification from other part of the text
  
This means that in "$25 dollars" or "twenty five USD dollars", we would 
mark up "25" and "twenty five" as the amount so that it can easily be 
extracted from the rest of the string. With a numerical value it may not be 
necessary; with a textual representation, it may be. So, 
depending on the scope of the proposal (do we want to support textual 
representation, another feature choice in the poll?), this may or may not be 
an important related feature.


Hope this helps.

Guillaume





Re: [uf-discuss] Currency Quickpoll: Preliminary results

2006-10-12 Thread Guillaume Lebleu

In my previous table example, you should read:

<tr>
  <th scope="col">price<a href="#u1" class="include"></a></th>
</tr>
<tr>
<td>24</td>
</tr>
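
For anyone wondering how a parser might follow the include link back to the currency, here is a rough Python sketch using the standard `html.parser`. It assumes the legend element carries id "u1" (so that href="#u1" resolves to it) and has no child elements; the `IncludeResolver` and `resolve_includes` names are hypothetical.

```python
from html.parser import HTMLParser

class IncludeResolver(HTMLParser):
    """Map id -> text, and record the targets of <a class="include"> links."""

    def __init__(self):
        super().__init__()
        self.text_by_id = {}
        self.include_targets = []
        self._open_id = None  # id of the element whose text is being collected

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if "id" in attrs:
            self._open_id = attrs["id"]
            self.text_by_id[self._open_id] = ""
        classes = (attrs.get("class") or "").split()
        if tag == "a" and "include" in classes:
            # href="#u1" points at the element whose id is "u1".
            self.include_targets.append((attrs.get("href") or "").lstrip("#"))

    def handle_data(self, data):
        if self._open_id is not None:
            self.text_by_id[self._open_id] += data

    def handle_endtag(self, tag):
        # Simplification: the referenced element is assumed to have no children.
        self._open_id = None

def resolve_includes(html):
    """Return the text of each element referenced by an include link."""
    parser = IncludeResolver()
    parser.feed(html)
    parser.close()
    return [parser.text_by_id.get(target) for target in parser.include_targets]

doc = """
<p>amounts in <span id="u1" class="currency">USD</span>.</p>
<table>
<tr><th scope="col">price<a href="#u1" class="include"></a></th></tr>
<tr><td>24</td></tr>
</table>
"""
print(resolve_includes(doc))
```

Associating the resolved currency with the cells in that column would still need the table's scope handling, which this sketch leaves out.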

Guillaume