Re: [whatwg] Deprecating small , b ?

2008-11-29 Thread Calogero Alex Baldacchino

Tab Atkins Jr. wrote:

On Wed, Nov 26, 2008 at 4:48 PM, Calogero Alex Baldacchino
[EMAIL PROTECTED] wrote:
  


[cut]


We don't have to touch parsing at all to accomplish essentially this.  The
issue you're worried about is getting crazy semantics applied to
individual letters.  Semantic parsers (which honestly the average
browser is *not*) can easily just ignore the semantic value of <b> or
<small> or <i> when they don't wrap a full word, assuming that the use
is either stylistic or too complex/subtle to easily capture.

  


Well, that is a 'semantic' solution, equivalent to leaving everything to the 
implementation; a 'parsing' solution would solve the 'problem' at its 
root, but I acknowledge the question is too marginal to be seriously 
taken into account.
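
The 'semantic' approach discussed above (ignore the semantics of <b>, <small> or <i> when the element does not wrap a whole word) can be sketched roughly as follows. This is only an illustration under stated assumptions: Python's standard html.parser stands in for a real HTML parser, "word character" is approximated with str.isalnum(), and the names SpanCollector, wraps_whole_words and semantic_spans are hypothetical, not taken from any specification or browser.

```python
from html.parser import HTMLParser

# Elements whose semantics a cautious agent might ignore when they
# cover only part of a word (the set discussed in this thread).
INLINE = {"b", "i", "small"}


class SpanCollector(HTMLParser):
    """Collects plain text plus the character ranges covered by INLINE elements."""

    def __init__(self):
        super().__init__()
        self.text = ""        # concatenated character data
        self.open_tags = []   # stack of (tag, start offset)
        self.spans = []       # finished (tag, start, end) ranges

    def handle_starttag(self, tag, attrs):
        if tag in INLINE:
            self.open_tags.append((tag, len(self.text)))

    def handle_endtag(self, tag):
        # Close the innermost open element with this name, if any.
        for i in range(len(self.open_tags) - 1, -1, -1):
            if self.open_tags[i][0] == tag:
                _, start = self.open_tags.pop(i)
                self.spans.append((tag, start, len(self.text)))
                break

    def handle_data(self, data):
        self.text += data


def wraps_whole_words(text, start, end):
    """True if text[start:end] contains a word character and cuts no word."""
    inner = text[start:end]
    if not any(c.isalnum() for c in inner):
        return False
    cut_left = start > 0 and text[start - 1].isalnum() and inner[0].isalnum()
    cut_right = end < len(text) and text[end].isalnum() and inner[-1].isalnum()
    return not (cut_left or cut_right)


def semantic_spans(markup):
    """Returns (plain text, [(tag, covered text)]) keeping whole-word spans only."""
    collector = SpanCollector()
    collector.feed(markup)
    collector.close()
    text = collector.text
    kept = [(tag, text[s:e]) for tag, s, e in collector.spans
            if wraps_whole_words(text, s, e)]
    return text, kept
```

With this sketch a drop-cap such as <b>W</b>hen yields no semantic span at all, while the full text is still returned, so nothing is lost for a reader that falls back to plain text.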


   


Agree to disagree, I guess.  I don't find "We hope you'll find <b>Product
A</b> to be the best laundry detergent you've ever used!" to be denoting
emphasis or importance, really.
  

I think 'Product A' is the core of the message, the thing some people are
trying to sell you, the name you *must* remember when you want to buy a
laundry detergent, so that those people become rich. The bold presentation
aims to capture your attention and keep your eyes on it a bit longer; in a
TV/radio spot the name of the product would be spoken with some isolation,
with at least a bit of emphasis, for the same reasons. It denotes importance,
meaning you need to pay special attention to it in order to understand
*what the author wants you to understand*. I think that the same semantics
can be expressed by <strong>, since the importance of a piece of text is not
(only) in its meaning, or in the overall meaning of the message, or in
one's way of taking it as important or not, but (also, or mainly) in the
author's intention to mark it as different from the rest of the content, as
a reading key, to drive your attention and your thoughts as well (ok, that's
like saying that truth is a chimera, but that can be a crude truth :-P ).



If I was contrasting Product A with another item, I could perhaps
agree.  But we're not, so I don't.  ^_^  However, we're obviously
splitting hairs here.
  


But you're implicitly contrasting 'Product A' with a bunch of generic 
items, all the items of the same category your potential buyer might happen 
to know (I think this needs some clarification with an example: I guess 
you were referring to comparative advertising, which has not been legal 
in Italy for several years, so what you wrote - with some makeup - has 
always been one of the most common kinds of advertisement here). Anyway, I 
agree we're splitting hairs, but I have a reason to push these 
concepts, and I hope I'll be able to make it clear.


  

Well, a foreign-language word, especially if correctly pronounced (by someone
else), can be more or less hard to 'catch', so a bit of emphasis in its
pronunciation might help the listener to correctly distinguish the sounds.



That's stretching quite a bit more than I think is appropriate.  Just
because I use a foreign phrase, does not mean that I'm emphasizing it.
 If I, in audible speech, would put a bit of inflection on the phrase,
that still doesn't mean I'm emphasizing it in anything like the way I
emphasize "I'm <em>not</em> going to the dance with you!".
  


Isn't a bit of inflection also a bit of emphasis in pronunciation? Perhaps 
I'm misusing the English term; it sounded correct to me in a wider 
sense, but I'll leave that concept, or modify it...



In other words, at most I might slightly stylistically offset the
phrase from my surrounding spoken words, but I wouldn't be
*emphasizing* it.  So the <i> semantics are correct here.  ^_^

  

After all, most of the time bold and italicized text tries to reflect our way
of pronouncing sentences, with more or less isolation, more or less emphasis,
quicker or slower, thereby changing their meaning and telling the listener
that some part requires greater or lesser attention, is somehow 'special',
with somewhat different grades of 'speciality'. From this point of view, I
think either <b>/<i> can be semantically the very same thing as
<strong>/<em>, or their semantics should be redefined so as to indicate a
different (and lower) grade of 'speciality' on the same speciality scale, but
not a different kind of 'speciality' (i.e., <b>-text stands out for some -
opaque - reason which has nothing in common with <strong>-text).



You're overreaching your definition of importance and emphasis.  I
don't think it's valuable to denote *everything* that is in some way
special as important or emphatic - you lose a sense of scale.  If you
wish to define the words as such, then sure, <b> and <i> are lesser
grades of importance and emphasis by definition.  By more conventional
definitions, though, they're not, and their stated semantics are fine.

  


Ok, let's define 'special' in a more correct manner. What should a 
slight offset be? What does 'outstanding for some reason' mean, in a less 
ambiguous definition? How should the offset 

Re: [whatwg] Deprecating small , b ?

2008-11-26 Thread Calogero Alex Baldacchino

Tab Atkins Jr. wrote:
On Tue, Nov 25, 2008 at 3:08 PM, Calogero Alex Baldacchino 
[EMAIL PROTECTED] wrote:


Tab Atkins Jr. wrote:



On Tue, Nov 25, 2008 at 10:24 AM, Calogero Alex Baldacchino
[EMAIL PROTECTED] wrote:








Do you mean that if you had markup like <p><b>W</b>hen I was 
young...</p>, it would be read out as "I was young..."?  If so, that's 
clearly a bug in the reader, and has nothing to do with semantics or the 
lack of it.  There is *no* legitimate interpretation of that markup that 
would lead one to discard the first word.
 


I agree that reading software unable to understand some text with 
unexpected typographic variants should read it as normal text; however, 
I can guess how the above could result in an unexpected situation when 
looking for non-typographic semantics.
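
The fallback both sides agree on here (read the content as normal text when the markup cannot be interpreted) amounts to dropping the tags and keeping every character. A minimal sketch, assuming Python's standard html.parser as a stand-in for a reader's parser; PlainTextReader and speakable_text are hypothetical names used only for illustration:

```python
from html.parser import HTMLParser


class PlainTextReader(HTMLParser):
    """Keeps character data only; every tag is ignored, so no word is dropped."""

    def __init__(self):
        super().__init__()
        self.parts = []

    def handle_data(self, data):
        self.parts.append(data)


def speakable_text(markup):
    """Returns the text a reader would speak if it ignored all inline markup."""
    reader = PlainTextReader()
    reader.feed(markup)
    reader.close()
    return "".join(reader.parts)
```

Applied to the drop-cap example from this thread, the first word survives intact instead of being skipped.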





Basically, there is a subset of authors who are morons, and they'll 
screw up anything we do.  Most of us aren't like that, but trying to 
design around that subset is a game you can't win.  Their pages will be 
FUBAR no matter what we do, until browsers' rendering engines are 
literally hooked up to a sentient semantic parser.


Arghh!! Such software would be too smart and would dominate the world... 
It could think: "morons are a bother; human beings generate morons; 
no more human beings means no more bother for me".


 


Ah, the default style could be slightly or very different from the
small one, i.e. the text could be surrounded by parenthesis or
hyphens, despite of the font size (and the new elements could be
designed such to accept just non-empty strings consisting of more
than one non-spacing character).


We could, but is there any reason to have it do that?  Making the text 
small is a good visual representation of the small print or aside 
semantics.




The concept was (or could be - let me modify it): the more alternative 
visual representations we provide for the 'aside' semantic element, the 
more likely a moron designer will stay away from it, since he could be 
confused about the style he's creating. Likewise, a rule that there is no 
default style for the element could prevent authoring tools from just 
changing the name of a button used to style some text. But I know, it 
would all fail, because the most popular browsers would choose very similar 
renderings (or would just fall back to rendering small fonts).


Anyway, I wouldn't underestimate the latter characteristic (ok, that 
wasn't clear), that is, establishing that the use of the element is legal if 
it surrounds a piece of text made up of one or more whole words (or at 
least one readable character) and if it's bounded by spacing or 
punctuation characters (that is, the 'semantic element' cannot be part 
of a word). Of course, the misuse concern would just move from the 
messed-up word to a messed-up sentence, but at least, in this case, an 
assistive reader would be less likely to be fouled up and, without any need 
for luck, it could speak out something funny, yet understandable. Of 
course the same could be done by redefining the <b> and <small> parsing 
rules, but that would result in a break with a bunch of (possible) legacy 
uses, and if we had to break somehow with the past, why not 
look for some more meaningful names? - Just saying, not hoping to 
persuade you :-P




Here it is me who doesn't understand. I think that any reason to offset
some text from the surrounding text can be reduced to the different
grade of 'importance' the author gives it, in the same sense as
Smylers used in his mails (that is, not the importance of the
content, but the relevance it gets as a focus of attention - he gave the
example of the English 'small print' idiom, and in another mail
clarified that "It's less important in the sense that it isn't the
point of what the author wants users to have conveyed to them; it's
less important to the message. (Of course, to users any caveats in
the small print may be very important indeed!)"). From this point of
view, unless we aimed to use <b> as an intermediate grade of
relevance between 'normal text' and <em>/<strong> (but aren't these
enough to attract a reader's attention?), redefining its semantics
might be redundant and of little use. (In my crazy mind, this
applies to headings too, since a 'good' heading focuses
attention on the core subject of the section that follows, so it has to
be marked as an important slice of text). Furthermore, I meant
that <strong> and <em> would have been a better choice than <b> in
Smylers' examples because their *original semantics* is very close
to that of 'more relevant text' / 'text needing greater
attention', while the *original semantics* of <b> is very different and
needs to be redefined for this purpose (but we still have
possible alternatives to this).



Re: [whatwg] Deprecating small , b

2008-11-25 Thread Smylers
Pentasis writes:

 [Asbjørn Ulsberg writes:]
 
  However, as you write and as HTML5 defines it, there is nothing
  wrong with small per se, and I agree that as an element indicating
  smallprint, it works just fine.
 
  Since my initial reply might have been a bit too colored by the HTML4  
  definition of the element and its current usage on the web, I hereby  
  withdraw my comment and conclude that I mostly agree with you. :-)

Yay, consensus!  Thanks, Asbjørn.

 But isn't this just the reason why it should be dis-used?  The HTML4
 spec defined it as a styling tag, and that is how it is *mostly* used
 and understood by the majority of the users/authors.

That may be true (though authors who want smaller text just because they
think the default looks too large could also use <font size=2> or CSS),
but authors who want to diminish the emphasis of certain content to
users are likely to pick <small> because there isn't much else
available.

Just because an element is currently widely used for a purpose we deem
inappropriate doesn't mean that its appropriate uses aren't important.
Tables are widely used for layout; <br>-s are widely misused.  Both of
those clearly have other valid uses, so are still in HTML.

 Just because HTML5 redefines the element does not mean that the
 element will suddenly be semantic. Even if people start using it
 purely semantically from now on (and what is the chance of that?), the
 existing websites still carry small-tags that are not compliant with
 the new definition.

Yes.  But the suggested alternative was to deprecate <small> entirely
and invent a new element to convey the semantic of 'small print'.  That
would of course make _all_ current uses of <small> non-conforming.
Presentational <small>-s are going to be non-conforming either way;
allowing semantic <small>-s to conform doesn't change that.

 By redefining it the (existing) web breaks; albeit purely in the
 semantic area. 

That's intentional.  If anybody checks legacy content against the new
standard they will discover that what they did is no longer recommended.
However, browsers will 100% support it and continue to render it as it
always has been, so the 'breakage' is in no way visible; if the author
chooses not to care about it then no harm is done.

Smylers

PS: Pentasis, please could you send mails that do at least one of
attributing who you're quoting or including In-Reply-To: headers so that
they continue the existing thread rather than starting a new one.
Without either it's rather tedious to have to look up who said the text
you quote.  Thanks.


Re: [whatwg] Deprecating small , b ?

2008-11-25 Thread Calogero Alex Baldacchino

Smylers wrote:

Asbjørn Ulsberg writes:

  

On Mon, 17 Nov 2008 15:26:22 +0100, Smylers [EMAIL PROTECTED]
wrote:



In printed material users are typically given no out-of-band
information about the semantics of the typesetting.  However,
smaller things are less noticeable, and it's generally accepted that
the author of the document wishes the reader to pay less attention
to them than more prominent things.

That works fine with <small>.
  

No, it doesn't, and you explain why yourself here:



User-agents which can't literally render smaller fonts can choose
alternative mechanisms for denoting lower importance to users.
  


I don't see how that explains why small is an inappropriate tag to use
for things which an author wishes to be less noticeable.
[...]
  


Of course that's possible, but, as you noticed too, only by redefining 
the <small> semantics, and it is not the best choice per se. That's both 
because the original semantics of the <small> tag was targeted at 
styling and nothing else (the HTML 4 document type definitions declared 
it as a member of the fontstyle entity, while, for instance, <strong> 
and <em> were part of the phrase entity), and because the term 'small', 
at first glance, suggests the idea of a typographical function, 
regardless of any other related concept which might be specific to 
English (or any other) culture, but might not be as immediate 
for non-English developers all around the world. As a consequence, since 
any average developer could just rely on the old semantics, being 
intuitively familiar with it, the semantic redefinition runs into a 
first counter-indication: think of a word written with alternating 
<b> and <small> letters, or just of a paragraph's first letter highlighted 
by a <b>; obviously, applying the new semantics here would be 
non-trivial (e.g. assistive software for blind users would be confused by 
this and give unpredictable results). Although the previous use case 
would be a misuse of the <b> and <small> markup, it would still be 
possible, meaning not prohibited, and so creating a new element with 
proper semantics could be a better choice.


But, you're right, we have to deal with backward compatibility, and 
redefining the <small> and <b> semantics can be a good compromise, since 
a new element would face some heavy concerns, mainly related to 
rendering and to the state-of-the-art implementations in non-visual user 
agents (and the like).


However, I think that a solution, at least a partial one, can be found for 
the rendering concern (and I'd push for this being done anyway, since there 
are several new elements defined for HTML 5). Most user agents are 
capable of interpreting a DTD to some extent, so it could be worth the 
effort to define an HTML 5 specific DTD in addition to the parsing 
rules - which aim to overcome all the problems arising from previous 
DTD-only HTML specifications - so that a browser that is not fully HTML 5 
compliant can somehow interpret any new elements. The HTML 5 doctype 
declaration could accept a DTD just for backward compatibility purposes, 
and any fully compliant user agent would just ignore that DTD. More 
specifically, such a DTD could define default values for some attributes, 
such as the style attribute (to have any new element properly rendered - 
some assistive technologies are capable of interpreting style sheets too), 
and, anyway, there should be a way, in SGML, to create an alias for an 
element (i.e., a new element - let's call it <incidental> - could be 
aliased to <small> for better compatibility).


Let's come to the non-typographical interpretation today's user agents may 
be capable of, as in your example about lynx. This can be a very good 
reason to deem <small> a very good choice. But are we sure that *every* 
existing user agent can do that? If the answer is yes, we can stop here: 
<small> is a perfect choice. Better: <small> is all we need, so let's 
stop bothering each other about this matter. But if the answer is no, we 
have to face a number of user agents needing an update to understand the 
new semantics of the <small> tag, and so, if the new semantics can be 
assumed to be *surely* reliable only with new/updated user agents (that is, 
with those fully compatible with the HTML 5 specification), that's somewhat 
like starting from scratch, and consequently there is room for a 
new, more appropriate element.



However, you would appreciate that the author had wished for some
particular words to stand out from the surrounding text.
  

That's a job for the style sheet, whether it's provided by the author
or by the user agent.



The style-sheet can only pick out particular words if those words have
been marked-up as special in the document, so it doesn't solve the
problem of how to mark them up.

Further, this isn't using b because the house style is to have all
text in a bold weight (that can be done by style-sheets, and if the
style-sheet is missing all the content is still there); it's using b
to 

Re: [whatwg] Deprecating small , b ?

2008-11-25 Thread Tab Atkins Jr.
On Tue, Nov 25, 2008 at 10:24 AM, Calogero Alex Baldacchino 
[EMAIL PROTECTED] wrote:

 Smylers wrote:

 Asbjørn Ulsberg writes:



 On Mon, 17 Nov 2008 15:26:22 +0100, Smylers [EMAIL PROTECTED]
 wrote:



 In printed material users are typically given no out-of-band
 information about the semantics of the typesetting.  However,
 smaller things are less noticeable, and it's generally accepted that
 the author of the document wishes the reader to pay less attention
 to them than more prominent things.

 That works fine with <small>.


 No, it doesn't, and you explain why yourself here:



 User-agents which can't literally render smaller fonts can choose
 alternative mechanisms for denoting lower importance to users.



 I don't see how that explains why small is an inappropriate tag to use
 for things which an author wishes to be less noticeable.
 [...]



 Of course that's possible, but, as you noticed too, only by redefining the
 small semantics, and is not a best choice per se. That's both because the
 original semantics for the small tag was targeted to styling and nothing
 else (the html 4 document type definitions declared it as a member of the
 fontstyle entity, while, for instance, strong and em were parts of the
 phrase entity), and because the term 'small', at first glance, suggests the
 idea of a typographical function, regardless any other related concept which
 might be specific for the English (or whatever else) culture, but might not
 be as well immediate for non-English developers all around the world. As a
 consequence, since any average developer could just rely on the old
 semantics, being he intuitively confident with it, the semantics
 redefinition could find a first counter-indication: let's think on a word
 written with alternate b and small letters, or just to a paragraph first
 letter evidenced by a b, obviously the application of the new semantics
 here would be untrivial (i.e. an assistive software for blind users would be
 fouled by this and give unpredictable results). Despite the previous use
 case would be a misuse of the b and small markup, yet it would be
 possible, meaning not prohibited, and so creating a new element with a
 proper semantic could be a better choice.


No matter *what* we do, if there *is* a default style for an element, it
will be misused by people.  This is a fact of life.  Defining a new element
which is identical to small in every way except that it hasn't been
misused *yet* is thus a mug's game, because it *will* be misused in the same
way as small, and then we just have two identical elements for no reason.

Yes, bad markup will foul up semantic agents.  But people will *always*
write bad markup.  At least with the semantic redefinition we get to declare
lots of usages that *are* appropriate to be conforming without any effort on
the author's part.

And really, the type of people who would write a word with alternating
letters wrapped in b and small tags are hardly the kind to even *care*
about semantics.

But, you're right, we have to deal with backward compatibility, and
 redefining the small and b semantics can be a good compromise, since a
 new element would face some heavy concerns, mainly related to rendering and
 to the state of the art implementations in non-visual user agents (and the
 alike).

 However, I think that a solution, at least partial, can be found for the
 rendering concern (and I'd push for this being done anyway, since there are
 several new elements defined for HTML 5). Most user agents are capable to
 interpret a dtd to some extent, so it could be worth the effort to define an
 html 5 specific dtd in addition to the parsing rules - which aim to
 overcome all problems arising by previous dtd-only html specifications - so
 that a non html5-fully-compliant browser can somehow interpret any new
 elements. HTML 5 Doctype declaration could accept a dtd just for backward
 compatibility purpose, and any fully compliant user agent would just ignore
 such dtd. More specifically, such a dtd could define default values for some
 attributes, such as the style attribute (to have any new element properly
 rendered - some assistive technologies are capable to interpret style sheets
 too), and, anyway, there should be a way, in SGML, to create an alias for an
 element (i.e., a new element - let's call it incidental - could be aliased
 to small for better compatibility).


HTML5 is no longer an SGML language.


 Let's come to the non-typographical interpretation a today u.a. may be
 capable of, as in your example about lynx. This can be a very good reason to
 deem small a very good choice. But, are we sure that *every* existing user
 agent can do that? If the answer is yes, we can stop here: small is a
 perfect choice. Better: small is all we need, so let's stop bothering each
 other about this matter. But if the answer is no, we have to face a number
 of user agents needing an update to understand the new semantics for the
 small tag, and 

Re: [whatwg] Deprecating small , b ?

2008-11-25 Thread Calogero Alex Baldacchino

Tab Atkins Jr. wrote:



On Tue, Nov 25, 2008 at 10:24 AM, Calogero Alex Baldacchino 
[EMAIL PROTECTED] wrote:



Of course that's possible, but, as you noticed too, only by
redefining the small semantics, and is not a best choice per se.
That's both because the original semantics for the small tag was
targeted to styling and nothing else (the html 4 document type
definitions declared it as a member of the fontstyle entity,
while, for instance, strong and em were parts of the phrase
entity), and because the term 'small', at first glance, suggests
the idea of a typographical function, regardless any other related
concept which might be specific for the English (or whatever else)
culture, but might not be as well immediate for non-English
developers all around the world. As a consequence, since any
average developer could just rely on the old semantics, being he
intuitively confident with it, the semantics redefinition could
find a first counter-indication: let's think on a word written
with alternate b and small letters, or just to a paragraph
first letter evidenced by a b, obviously the application of the
new semantics here would be untrivial (i.e. an assistive software
for blind users would be fouled by this and give unpredictable
results). Despite the previous use case would be a misuse of the
b and small markup, yet it would be possible, meaning not
prohibited, and so creating a new element with a proper semantic
could be a better choice. 



No matter *what* we do, if there *is* a default style for an element, 
it will be misused by people.  This is a fact of life.  Defining a new 
element which is identical to small in every way except that it 
hasn't been misused *yet* is thus a mug's game, because it *will* be 
misused in the same way as small, and then we just have two 
identical elements for no reason.


I'll start with an example. Some time ago I played around with Opera 
Voice. It seemed to be capable of interpreting visual style sheets, and 
specifically font styles, so that bold or italic text (specified as such in 
the style sheet, not the markup) was spoken differently from 'normal' 
text, but a paragraph's first letter differing from the rest of the word 
(which is a not uncommon typographical choice), as far as I remember, caused 
the whole word to be skipped. This suggests to me that if we really want 
'cross-presentation' semantics, we have to keep as far as we can from 
anything having *mainly* typographical semantics (as <small> and <b> 
have had from their birth). Every language is somehow prone to side effects 
caused by misuse (e.g. it is possible to cause a big mess in software 
written in a language that allows passing a pointer to a function - there 
are tons of examples of language design issues - yet that can be a 
desirable capability), but appropriate choices for both semantics and 
syntax may help to reduce the likelihood of misuse.


I think that very likely both <b> and <small> will carry on their old 
semantics, and so be more prone to misuse with respect to their new ones, 
since very likely a lot of developers are, and will remain, more familiar 
with their original semantics, which is also suggested by their names 
('b' standing for 'bold' and 'small'... for something small on the 
screen or on paper). Instead, a new element would require the developer 
to make some effort at least to learn about its existence, so he would 
read that the element's primary use is to indicate a different importance 
of a piece of text, so that a non-visual user agent can present it in an 
appropriate manner, and a visual or print user agent can render it in 
different ways. Ah, the default style could be slightly or very 
different from that of <small>, e.g. the text could be surrounded by 
parentheses or hyphens, regardless of the font size (and the new element 
could be designed so as to accept only non-empty strings consisting of 
more than one non-spacing character).
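
The alternative default rendering suggested above (parentheses instead of a smaller font) is easy to picture for a text-only user agent. The following is a rough sketch, not how lynx or any real browser behaves: it assumes Python's standard html.parser, applies the idea to <small> itself rather than to a new element, and TextOnlyRenderer and render_text_only are hypothetical names.

```python
from html.parser import HTMLParser


class TextOnlyRenderer(HTMLParser):
    """Renders <small> content inside parentheses; all other tags are dropped."""

    def __init__(self):
        super().__init__()
        self.out = []

    def handle_starttag(self, tag, attrs):
        if tag == "small":
            self.out.append("(")

    def handle_endtag(self, tag):
        if tag == "small":
            self.out.append(")")

    def handle_data(self, data):
        self.out.append(data)


def render_text_only(markup):
    """Returns a plain-text rendering where side comments are parenthesised."""
    renderer = TextOnlyRenderer()
    renderer.feed(markup)
    renderer.close()
    return "".join(renderer.out)
```

The same rendering on a per-letter misuse of <small> would produce visibly odd output, which is the kind of deterrent effect the paragraph above is hoping for.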




Yes, bad markup will foul up semantic agents.  But people will 
*always* write bad markup.  At least with the semantic redefinition we 
get to declare lots of usages that *are* appropriate to be conforming 
without any effort on the author's part.


And really, the type of people who would write a word with alternating 
letters wrapped in b and small tags are hardly the kind to even 
*care* about semantics.


Let me reverse this approach: what should an assistive user agent do 
with markup such as <b>M</b><small>E</small><b>S</b><small>S</small>? I 
think that dealing with that word as normal text would be a more graceful 
degradation than discarding it, and if we clearly state that <b> and 
<small> have only typographical semantics, while different elements are 
provided to differentiate the grade of emphasis of a phrase, an 
assistive user agent could support a better behaviour, while any author 
disregarding semantics would not cause any trouble (the 

Re: [whatwg] Deprecating small , b ?

2008-11-25 Thread Tab Atkins Jr.
On Tue, Nov 25, 2008 at 3:08 PM, Calogero Alex Baldacchino 
[EMAIL PROTECTED] wrote:

 Tab Atkins Jr. wrote:



 On Tue, Nov 25, 2008 at 10:24 AM, Calogero Alex Baldacchino 
 [EMAIL PROTECTED] wrote:


Of course that's possible, but, as you noticed too, only by
redefining the small semantics, and is not a best choice per se.
That's both because the original semantics for the small tag was
targeted to styling and nothing else (the html 4 document type
definitions declared it as a member of the fontstyle entity,
while, for instance, strong and em were parts of the phrase
entity), and because the term 'small', at first glance, suggests
the idea of a typographical function, regardless any other related
concept which might be specific for the English (or whatever else)
culture, but might not be as well immediate for non-English
developers all around the world. As a consequence, since any
average developer could just rely on the old semantics, being he
intuitively confident with it, the semantics redefinition could
find a first counter-indication: let's think on a word written
with alternate b and small letters, or just to a paragraph
first letter evidenced by a b, obviously the application of the
new semantics here would be untrivial (i.e. an assistive software
for blind users would be fouled by this and give unpredictable
results). Despite the previous use case would be a misuse of the
b and small markup, yet it would be possible, meaning not
prohibited, and so creating a new element with a proper semantic
could be a better choice.

 No matter *what* we do, if there *is* a default style for an element, it
 will be misused by people.  This is a fact of life.  Defining a new element
 which is identical to small in every way except that it hasn't been
 misused *yet* is thus a mug's game, because it *will* be misused in the same
 way as small, and then we just have two identical elements for no reason.


 I'll start with an example. A few time ago I played around with Opera
 Voice. It seemed to be capable to interpret visual style sheets and
 specifically font styles, so that bold or italics text (so constraint in the
 style sheet, not the markup) were spoken differently from 'normal' text, but
 a paragraph first letter differing from the rest of the word (which is a
 non-rare typographical choice), as far as I remember, caused the whole word
 to be skipped.


Do you mean that if you had markup like <p><b>W</b>hen I was young...</p>,
it would be read out as "I was young..."?  If so, that's clearly a bug in
the reader, and has nothing to do with semantics or the lack of it.  There
is *no* legitimate interpretation of that markup that would lead one to
discard the first word.


 This suggests me that if we really want a 'cross-presentation' semantics,
 we have to keep as far as we can from anything having a *main* typographical
 semantics (as small and b have from their birth). Every language is
 somehow prone to side-effects caused by misuse (i.e. it is possible to cause
 a big mess in a software written in a language allowing to pass a pointer to
 a function - there are tons of examples for language design issues - yet
 such could be a desirable capability), but appropriate choices for both
 semantics and syntax may help to reduce the likelihood of a misuse.

 I think that very likely both b and small will carry on their old
 semantics, so being more prone to misuse with respect to their new one,
 since very likely a lot of developers are, and will rest, more confident
 with their original semantics, which is also suggested by their names ('b'
 standing for 'bold' and 'small'... for something small on the screen or on
 paper). Instead, a new element would require the developer to take some
 effort at least to learn about its existence, so he would read that such
 element primary use is to indicate a different importance of a piece of
 text, so that a non visual user agent can present it in an appropriate
 manner, and a visual or print user agent can render it in different ways.


Well, the new semantics are purposely very close to the old 'semantics'.
Bold text *is* text purposely offset from the surrounding prose.  Some
legacy uses of <b> are more correctly done with other existing elements,
like <strong> or <h1>, but at least it's *close*.

And again, the type of author who *is* marking up random things with <b> for
purely stylistic reasons isn't the sort who is going around reading
standards documents, or likely even caring in the slightest.  If they *did*
discover a new element that has the correct semantics (like <standout> or
something), they'll either ignore it (if it's basically identical to <b>) or
use it nonsemantically as well (if it offers some exciting new default
styling).

Basically, there is a subset of authors who are morons, and they'll screw up
anything we do.  Most of us aren't like that, but 

Re: [whatwg] Deprecating small , b ?

2008-11-24 Thread Asbjørn Ulsberg

On Mon, 17 Nov 2008 15:26:22 +0100, Smylers [EMAIL PROTECTED] wrote:


In printed material users are typically given no out-of-band information
about the semantics of the typesetting.  However, smaller things are
less noticeable, and it's generally accepted that the author of the
document wishes the reader to pay less attention to them than more
prominent things.

That works fine with <small>.


No, it doesn't, and you explain why yourself here:


User-agents which can't literally render smaller fonts can choose
alternative mechanisms for denoting lower importance to users.


If the point isn't to literally render smaller fonts, you shouldn't
indicate that you want the fonts rendered smaller either. What you want is
to semantically indicate that the text wrapped inside the element is of
less significance than the surrounding text, e.g. a negative 'strong' or
'em'. Just as 'b' isn't equal to 'strong', 'small' isn't equal to what
we're trying to express here.

What we need is a new element that can capture this semantic.


Denoting particular text as being of lesser importance is quite
different from choosing the overall base font size (or indeed typeface)
for the page, or the colour of links or headings -- that's merely
expressing a preference for how graphical user-agents should render
particular semantics, but the semantics themselves are conveyed to _all_
user-agents (<a>, <h3>, etc).


Which is why we need to capture this as semantic and not as presentational
sugar.


Indeed you can't.  And nor can you if you were reading printed text with
some words in bold.


Why does printed text set the standard for what we are able to express
with a markup language? Does e.g. PDF in any way direct what should be
possible with HTML?


However, you would appreciate that the author had wished for some
particular words to stand out from the surrounding text.


That's a job for the style sheet, whether it's provided by the author or
by the user agent. Using the same element would in most circumstances
yield the same presentation. Isn't that what you want?


However, you can only notice this if the words have been distinguished
in some way.  With <b>, all user-agents can choose to convey to users
that those words are special.


They are only special for sighted users, browsing the page with a rather
advanced user agent. They are not special to blind users or to users of
text-based user agents like Lynx. If you want to express semantics, then
use a semantic element.

Expressing semantics through presentation only is done in print because of
the limitations in the printing system. If the print was for a blind
person, printed with braille, one could imagine (had it been supported)
that letters with a higher weight could be physically warmer than others,
or with a more jagged edge so they could stand out.

Such effects would have been impossible if the document was only tagged
with presentational markup. The same applies to other mediums than print
-- you need to know the underlying reason of why something is presented
the way it is to transfer that presentation to another environment. And
for that you need the semantics.

--
Asbjørn Ulsberg   -=|=-[EMAIL PROTECTED]
«He's a loathsome offensive brute, yet I can't look away»


Re: [whatwg] Deprecating small , b ?

2008-11-24 Thread Smylers
Asbjørn Ulsberg writes:

 On Mon, 17 Nov 2008 15:26:22 +0100, Smylers [EMAIL PROTECTED]
 wrote:
 
  In printed material users are typically given no out-of-band
  information about the semantics of the typesetting.  However,
  smaller things are less noticeable, and it's generally accepted that
  the author of the document wishes the reader to pay less attention
  to them than more prominent things.
 
  That works fine with <small>.
 
 No, it doesn't, and you explain why yourself here:
 
  User-agents which can't literally render smaller fonts can choose
  alternative mechanisms for denoting lower importance to users.

I don't see how that explains why <small> is an inappropriate tag to use
for things which an author wishes to be less noticeable.

 If the point isn't to literally render smaller fonts, you shouldn't
 indicate that you want the fonts rendered smaller either.

Indeed.  <font size=-1> would be bad to use for this.

 What you want is to semantically indicate that the text wrapped inside
 the element is of less significance than the surrounding text, e.g. a
 negative 'strong' or 'em'.

Yes.  And I reckon that <small> works for that.  English has the idiom
of 'small print', roughly meaning text written by the legal department
rather than the marketing department.  But 'small print' doesn't
literally have to be typeset with a smaller font; it's a figure of
speech.

 'small' isn't equal to what we're trying to express here.  What we
 need is a new element that can capture this semantic.

If we were starting from scratch then indeed <small> may not be the best
name to choose for this element.  But, unfortunately, we aren't.

<small> has existed for some time, and people are already using it.  If
one currently wants to denote lesser importance <small> is the best
element to pick.  Further, existing browsers know what to do with
<small>; if we introduce a new element then content that uses it will
have a sub-optimal rendering in current browsers, whereas <small>
already does something appropriate.

So I still think <small> works for denoting that something is of smaller
importance.

  Indeed you can't.  And nor can you if you were reading printed text
  with some words in bold.
 
 Why does printed text set the standard for what we are able to express
 with a markup language?

It doesn't set the standard.  But it's useful in some comparisons.  And
most of the time humans cope perfectly well with inferring typographic
conventions without having them spelt out.

  However, you would appreciate that the author had wished for some
  particular words to stand out from the surrounding text.
 
 That's a job for the style sheet, whether it's provided by the author
 or by the user agent.

The style-sheet can only pick out particular words if those words have
been marked-up as special in the document, so it doesn't solve the
problem of how to mark them up.

Further, this isn't using <b> because the house style is to have all
text in a bold weight (that can be done by style-sheets, and if the
style-sheet is missing all the content is still there); it's using <b>
to convey _some_ semantics: namely that those particular words are in
some way special.

So if the mark-up is <span class=brand_name> or similar and the
distinguishing presentation added with CSS then users without
style-sheets are completely unaware that the author identified those
words as being special.  Whereas with <b>, everybody gets to know.

  However, you can only notice this if the words have been
  distinguished in some way.  With <b>, all user-agents can choose to
  convey to users that those words are special.
 
 They are only special for sighted users, browsing the page with a
 rather advanced user agent. They are not special to blind users or to
 users of text-based user agents like Lynx.

Not true.  Any user-agent can choose to convey that words marked in <b>
are somehow different from the surrounding words.  Lynx does this.

 If you want to express semantics, then use a semantic element.

That's begging the question.  If we define <b> to be semantic, then it
is!

 Expressing semantics through presentation only is done in print
 because of the limitations in the printing system.

Well, yes.

 If the print was for a blind person, printed with braille, one could
 imagine (had it been supported) that letters with a higher weight
 could be physically warmer than others, or with a more jagged edge so
 they could stand out.

Yup -- and an HTML-to-braille converter could choose to do that with
words marked in <b>, whereas it couldn't with <span class=BrandName>.

 Such effects would have been impossible if the document was only
 tagged with presentational markup.

To some extent, yes: not knowing whether a letter is where it is on the
page because it's a start of a paragraph or a heading, or just because
the previous line is full, hampers doing that.  And similarly for
typefaces.

 The same applies to other mediums than print -- you need to know the
 underlying reason of why 

Re: [whatwg] Deprecating small, b ?

2008-11-24 Thread Lachlan Hunt

Nils Dagsson Moskopp wrote:

The <small> element represents small print [...]

The <b> element represents a span of text to be stylistically offset from
the normal prose without conveying any extra importance [...]

Both definitions seem rather presentational (contrasting, for example,
the new semantic definition for the <i> element) and could also be
realized by use of <span> elements.

To me these look like the last remnants of the <font> element. So why
are these elements retained?


This issue has been discussed in depth in the past; most significantly 
on public-html around May 2007.


http://lists.w3.org/Archives/Public/www-html/2007May/thread.html#msg3
(I think most of the relevant discussion is in the "Cleaning House" thread)

I have added an entry to the FAQ detailing the rationale for including 
these elements, and have previously written an article about the issue too.


http://wiki.whatwg.org/wiki/FAQ#Why_are_some_presentational_elements_like_.3Cb.3E.2C_.3Ci.3E_and_.3Csmall.3E_still_included.3F
http://lachy.id.au/log/2007/05/b-and-i
http://lists.whatwg.org/pipermail/whatwg-whatwg.org/2007-January/009060.html

--
Lachlan Hunt - Opera Software
http://lachy.id.au/
http://www.opera.com/


Re: [whatwg] Deprecating small , b ?

2008-11-24 Thread Felix Miata
On 2008/11/24 16:19 (GMT) Smylers composed:

 So I still think <small> works for denoting that something is of smaller
 importance.

I do too, but I don't believe less importance can be the only inference. One
could simply want smaller text, without expecting that inference. E.g., just
because fine-print legalese is called what it is doesn't
necessarily make it unimportant or less important. I'm for keeping <small> in
the spec.
-- 
Love is not easily angered. Love does not demand
its own way.   1 Corinthians 13:5 NIV

 Team OS/2 ** Reg. Linux User #211409

Felix Miata  ***  http://fm.no-ip.com/


Re: [whatwg] Deprecating small , b ?

2008-11-24 Thread Smylers
Felix Miata writes:

 On 2008/11/24 16:19 (GMT) Smylers composed:
 
  So I still think <small> works for denoting that something is of
  smaller importance.
 
 I do too, but I don't believe less importance can be the only
 inference. One could simply want smaller text, without expecting that
 inference.

If you just want something to be smaller stylistically and there's
nothing special about that portion of the text then I think using
<small> for it would be as bad as using <h1> just to make text bigger;
CSS is a better choice.

 E.g., just because fine-print legalese is called what it is doesn't
 necessarily make it unimportant or less important.

It's less important in the sense that it isn't the point of what the
author wants users to have conveyed to them; it's less important to the
message.  (Of course, to users any caveats in the small print may be
very important indeed!)

Smylers


Re: [whatwg] Deprecating small , b ?

2008-11-24 Thread Asbjørn Ulsberg

On Mon, 24 Nov 2008 17:19:44 +0100, Smylers [EMAIL PROTECTED] wrote:


I don't see how that explains why <small> is an inappropriate tag to use
for things which an author wishes to be less noticeable.


I was thinking mostly about the tag's current usage on the web, which is a crazy mix
between the HTML4 and HTML5 definitions of the element. HTML4 defines it as purely
presentational, HTML5 as mostly semantic. In that context, I believe <small> is
inappropriate.

However, as you write and as HTML5 defines it, there is nothing wrong with <small> per
se, and I agree that as an element indicating small print, it works just fine.

Since my initial reply might have been a bit too colored by the HTML4 
definition of the element and its current usage on the web, I hereby withdraw 
my comment and conclude that I mostly agree with you. :-)

--
Asbjørn Ulsberg -=|=-  [EMAIL PROTECTED]
«He's a loathsome offensive brute, yet I can't look away»


Re: [whatwg] Deprecating small , b

2008-11-24 Thread Pentasis


I was thinking mostly about the tag's current usage on the web, which is a
crazy mix between the HTML4 and HTML5 definitions of the element. HTML4
defines it as purely presentational, HTML5 as mostly semantic. In that
context, I believe <small> is inappropriate.


However, as you write and as HTML5 defines it, there is nothing wrong with
<small> per se, and I agree that as an element indicating small print, it
works just fine.


Since my initial reply might have been a bit too colored by the HTML4 
definition of the element and its current usage on the web, I hereby 
withdraw my comment and conclude that I mostly agree with you. :-)




But isn't this just the reason why it should be dis-used?
The HTML4 spec defined it as a styling tag, and that is how it is *mostly* 
used and understood by the majority of the users/authors.
Just because HTML5 redefines the element does not mean that the element will 
suddenly be semantic. Even if people start using it purely semantically from 
now on (and what is the chance of that?), the existing websites still carry 
<small> tags that are not compliant with the new definition. By redefining it
the (existing) web breaks, albeit purely in the semantic area.





Re: [whatwg] Deprecating small , b

2008-11-24 Thread Jonas Sicking

Pentasis wrote:


I was thinking mostly about the tag's current usage on the web, which
is a crazy mix between the HTML4 and HTML5 definitions of the element.
HTML4 defines it as purely presentational, HTML5 as mostly semantic. In
that context, I believe <small> is inappropriate.


However, as you write and as HTML5 defines it, there is nothing wrong
with <small> per se, and I agree that as an element indicating
small print, it works just fine.


Since my initial reply might have been a bit too colored by the HTML4 
definition of the element and its current usage on the web, I hereby 
withdraw my comment and conclude that I mostly agree with you. :-)




But isn't this just the reason why it should be dis-used?
The HTML4 spec defined it as a styling tag, and that is how it is 
*mostly* used and understood by the majority of the users/authors.
Just because HTML5 redefines the element does not mean that the element 
will suddenly be semantic. Even if people start using it purely 
semantically from now on (and what is the chance of that?), the existing 
websites still carry <small> tags that are not compliant with the new
definition. By redefining it the (existing) web breaks, albeit purely
in the semantic area.


Note that the semantic meaning that HTML5 gives it is very weak. All it
says is that the text inside the <b> is different from the text outside
it. All the existing uses on the web that I've seen are correct
according to this semantic definition.


Do you have any counterexamples of this, where the <b> contained
something that was exactly semantically the same as the surrounding content?


/ Jonas


Re: [whatwg] Deprecating small , b

2008-11-24 Thread Nils Dagsson Moskopp
On Monday, 24.11.2008, at 15:10 -0800, Jonas Sicking wrote:
 Note that the semantic meaning that HTML5 gives it is very weak. All it
 says is that the text inside the <b> is different from the text outside
 it. All the existing uses on the web that I've seen are correct
 according to this semantic definition.
"Weak" in this case means: not of much semantic use, probably only of
presentational use.

So can't we just mark all presentational elements as obsolete in a
clear, consistent way, instead of trying to redefine them? Maybe put
them into a presentational annex of the spec that defines rendering
of obsolete elements?

The thing I am concerned with is that if they are included like
normal (read: semantic) elements, authors will probably use them for
new pages.


Cheers,

Nils



Re: [whatwg] Deprecating small , b ?

2008-11-17 Thread Smylers
Pentasis writes:

 2) When using <small> on different text-nodes throughout the document,
 one would expect all these text-nodes to be semantically the same. But
 they are not (unless all of them are copyright notices).

In printed material users are typically given no out-of-band information
about the semantics of the typesetting.  However, smaller things are
less noticeable, and it's generally accepted that the author of the
document wishes the reader to pay less attention to them than more
prominent things.

That works fine with <small>.  User-agents which can't literally render
smaller fonts can choose alternative mechanisms for denoting lower
importance to users.

There's no chance of doing this with <span class=legalese> or similar,
since user-agents are unaware of the semantic they should be conveying.

 3) <small> is a styling element, it has zero semantic meaning, so it does
 not belong inside HTML.

Denoting particular text as being of lesser importance is quite
different from choosing the overall base font size (or indeed typeface)
for the page, or the colour of links or headings -- that's merely
expressing a preference for how graphical user-agents should render
particular semantics, but the semantics themselves are conveyed to _all_
user-agents (<a>, <h3>, etc).

 4) <b>Siemens</b> also does not tell me anything about the semantics.
 Is it used as a name, a brand, a foreign word, etc.? I cannot get that
 information from looking at the <b> element.

Indeed you can't.  And nor can you if you were reading printed text with
some words in bold.  However, you would appreciate that the author had
wished for some particular words to stand out from the surrounding text.
Perhaps you then notice it's being done for all brand names?  Or that
the emboldened words spell out a secret message?

However, you can only notice this if the words have been distinguished
in some way.  With <b>, all user-agents can choose to convey to users
that those words are special.

Smylers


Re: [whatwg] Deprecating small , b ?

2008-11-17 Thread Pentasis

2) When using <small> on different text-nodes throughout the document,
one would expect all these text-nodes to be semantically the same. But
they are not (unless all of them are copyright notices).


In printed material users are typically given no out-of-band information
about the semantics of the typesetting.  However, smaller things are
less noticeable, and it's generally accepted that the author of the
document wishes the reader to pay less attention to them than more
prominent things.

That works fine with <small>.  User-agents which can't literally render
smaller fonts can choose alternative mechanisms for denoting lower
importance to users.

There's no chance of doing this with <span class=legalese> or similar,
since user-agents are unaware of the semantic they should be conveying.


3) <small> is a styling element, it has zero semantic meaning, so it does
not belong inside HTML.


Denoting particular text as being of lesser importance is quite
different from choosing the overall base font size (or indeed typeface)
for the page, or the colour of links or headings -- that's merely
expressing a preference for how graphical user-agents should render
particular semantics, but the semantics themselves are conveyed to _all_
user-agents (<a>, <h3>, etc).


4) <b>Siemens</b> also does not tell me anything about the semantics.
Is it used as a name, a brand, a foreign word, etc.? I cannot get that
information from looking at the <b> element.


Indeed you can't.  And nor can you if you were reading printed text with
some words in bold.  However, you would appreciate that the author had
wished for some particular words to stand out from the surrounding text.
Perhaps you then notice it's being done for all brand names?  Or that
the emboldened words spell out a secret message?

However, you can only notice this if the words have been distinguished
in some way.  With <b>, all user-agents can choose to convey to users
that those words are special.

Smylers



You cannot make a 100% comparison between printed and web-published styling
and semantics. Apart from the obvious visual difference, we are talking
here about the ability to convey semantics other than just visually: for
example to aid machine-readability and, far more importantly, assistive
technologies.
If markup in web publishing were meant to be just for visual feedback, we
would only need one block and one inline element, as we can do anything with
just classes and CSS in that respect. In that case you would be right, as
indeed a book, newspaper or magazine can be read just fine without using
markup elements. And so can webpages. But this is not the main reason behind
the semantic web.


Bert 





Re: [whatwg] Deprecating small , b ?

2008-11-17 Thread Smylers
Pentasis writes:

  In printed material users are typically given no out-of-band
  information about the semantics of the typesetting.  However,
  smaller things are less noticeable, and it's generally accepted that
  the author of the document wishes the reader to pay less attention
  to them than more prominent things.
  
  That works fine with <small>.  User-agents which can't literally
  render smaller fonts can choose alternative mechanisms for denoting
  lower importance to users.
  
  There's no chance of doing this with <span class=legalese> or
  similar, since user-agents are unaware of the semantic they should
  be conveying.
  
   3) <small> is a styling element, it has zero semantic meaning, so
   it does not belong inside HTML.
  
  Denoting particular text as being of lesser importance is quite
  different from choosing the overall base font size (or indeed
  typeface) for the page, or the colour of links or headings -- that's
  merely expressing a preference for how graphical user-agents should
  render particular semantics, but the semantics themselves are
  conveyed to _all_ user-agents (<a>, <h3>, etc).
  
   4) <b>Siemens</b> also does not tell me anything about the
   semantics.  Is it used as a name, a brand, a foreign word, etc.? I
   cannot get that information from looking at the <b> element.
  
  Indeed you can't.  And nor can you if you were reading printed text
  with some words in bold.  However, you would appreciate that the
  author had wished for some particular words to stand out from the
  surrounding text.  Perhaps you then notice it's being done for all
  brand names?  Or that the emboldened words spell out a secret
  message?
  
  However, you can only notice this if the words have been
  distinguished in some way.  With <b>, all user-agents can choose to
  convey to users that those words are special.
 
 You cannot make a 100% comparison between printed and web-published
 styling and semantics. Apart from the obvious visual difference, we
 are talking about the ability here to convey semantics other than just
 visual.

Indeed.

 For example to aid machine-readability but far more importantly,
 Assistive Technologies.  If markup in web-publishing was meant to be
 just for visual feedback, we would only need 1 block and one inline
 element as we can do anything with just classes and CSS in that
 respect.

But that would be using a styling technology (and an optional one at
that) for conveying meaning.

Anybody without the CSS -- or with a non-graphical user-agent, which
can't render the CSS rules to the user -- will be missing out.  Such
users wouldn't be able to distinguish <span class=legalese> or even
<span class=secret_message> from the surrounding text.

Whereas if <small> or <b> are used, all user-agents can do _something_
with them.

So I completely agree with what you say.

Smylers


Re: [whatwg] Deprecating small, b ?

2008-11-14 Thread David Muschiol
On Fri, Nov 14, 2008 at 06:09, Nils Dagsson Moskopp
[EMAIL PROTECTED] wrote:
 The <small> element represents small print [...]

 The <b> element represents a span of text to be stylistically offset from
 the normal prose without conveying any extra importance [...]

 Both definitions seem rather presentational (contrasting, for example,
 the new semantic definition for the <i> element) and could also be
 realized by use of <span> elements.

Why use <span class=smallprint>Copyright (c) 2008 …</span> instead
of just <small>Copyright (c) 2008 …</small>?  The latter possibility
is way more semantic.

And why use <span class=brand>Siemens</span> instead of just <b>Siemens</b>?

To me, the <small> and <b> elements – especially the former – make perfect sense.

-david


Re: [whatwg] Deprecating small, b ?

2008-11-14 Thread Pentasis

The <small> element represents small print [...]

The <b> element represents a span of text to be stylistically offset from
the normal prose without conveying any extra importance [...]

Both definitions seem rather presentational (contrasting, for example,
the new semantic definition for the <i> element) and could also be
realized by use of <span> elements.


Why use <span class=smallprint>Copyright (c) 2008 …</span> instead
of just <small>Copyright (c) 2008 …</small>?  The latter possibility
is way more semantic.

And why use <span class=brand>Siemens</span> instead of just
<b>Siemens</b>?


To me, the <small> and <b> elements – especially the former – make perfect
sense.


-david


I agree with the original poster on this.

1) Just because it makes sense to a human (it doesn't to me), does not mean 
it makes sense to a machine.
2) When using <small> on different text-nodes throughout the document, one
would expect all these text-nodes to be semantically the same. But they are
not (unless all of them are copyright notices).
3) <small> is a styling element; it has zero semantic meaning, so it does
not belong inside HTML.
4) <b>Siemens</b> also does not tell me anything about the semantics. Is it
used as a name, a brand, a foreign word, etc.? I cannot get that information
from looking at the <b> element.


Bert 





Re: [whatwg] Deprecating small, b ?

2008-11-14 Thread Oldřich Vetešník
On Fri, 14 Nov 2008 14:40:20 +0100, Pentasis [EMAIL PROTECTED]
wrote:



I agree with the original poster on this.

1) Just because it makes sense to a human (it doesn't to me), does not  
mean it makes sense to a machine.
2) When using <small> on different text-nodes throughout the document,
one would expect all these text-nodes to be semantically the same. But
they are not (unless all of them are copyright notices).
3) <small> is a styling element; it has zero semantic meaning, so it
does not belong inside HTML.
4) <b>Siemens</b> also does not tell me anything about the semantics. Is
it used as a name, a brand, a foreign word, etc.? I cannot get that
information from looking at the <b> element.


Bert


I second that, even though it might have a zero value.

Ollie


Re: [whatwg] Deprecating small, b ?

2008-11-14 Thread Tab Atkins Jr.
On Fri, Nov 14, 2008 at 7:40 AM, Pentasis [EMAIL PROTECTED] wrote:

 The <small> element represents small print [...]

 The <b> element represents a span of text to be stylistically offset from
 the normal prose without conveying any extra importance [...]

 Both definitions seem rather presentational (contrasting, for example,
 the new semantic definition for the <i> element) and could also be
 realized by use of <span> elements.


 Why use <span class=smallprint>Copyright (c) 2008 …</span> instead
 of just <small>Copyright (c) 2008 …</small>?  The latter possibility
 is way more semantic.

 And why use <span class=brand>Siemens</span> instead of just
 <b>Siemens</b>?

 To me, the <small> and <b> elements – especially the former – make perfect
 sense.

 -david


 I agree with the original poster on this.

 1) Just because it makes sense to a human (it doesn't to me), does not mean
 it makes sense to a machine.
 2) When using <small> on different text-nodes throughout the document, one
 would expect all these text-nodes to be semantically the same. But they are
 not (unless all of them are copyright notices).


Why would you expect this?  Or rather, why would you expect this level of
semantic specificity?  <small> means something fairly broad that multiple
types of specific semantics can fall under.


 3) <small> is a styling element, it has zero semantic meaning, so it does
 not belong inside HTML.


It *had* zero semantic meaning.  Actually, though, this wasn't quite true.
The semantics that have been attached to <small> (and <i>, and <b>) are an
approximation of the common semantics that users of the elements conferred
on the contents.  Text within <small> was, quite often, used for small
print.  Matching up theory with practice is a good thing here.

<i> and <b>, once you subtract the semantics stolen by <em> and <strong>,
are used pretty much specifically as the spec states.

 4) <b>Siemens</b> also does not tell me anything about the semantics. Is it
 used as a name, a brand, a foreign word, etc.? I cannot get that information
 from looking at the <b> element.


Of course not.  You're not intended to.  What you *do* get, though, is that
this is a word which is *intentionally* stylistically offset from the rest
of the text.  This conveys semantic meaning to a human - it means that the
word is special or being used in a particular context.  <b> and <i> don't
communicate *much*, but they communicate *something*.  One could, of course,
also use a <span> to mark up and style the text, thus communicating the same
intent to a person reading the styled text, but to a machine the <span>
means literally nothing, while <b> and <i> have the possibility to
communicate *something*.

In addition, the fact that these elements traditionally have a particular
preferred rendering means something.  A dumb terminal which doesn't
understand CSS won't give any indication to the user that a <span> exists at
all, while <b> and <i> have a chance of providing fallback rendering that
still accomplishes what they were designed to do.  A decent chunk of HTML5
concerns itself with providing fallbacks and graceful degradation (or
progressive enhancement, whichever way you want to look at it).  Having some
*nearly* semantic-free elements which have a meaningful fallback can be
useful.
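
A rough sketch of the fallback idea, assuming a made-up text-only renderer that knows nothing about CSS: it can pick some convention for b, i and small, but it has nothing to go on for a span. The marker characters below are arbitrary choices for illustration, not what Lynx or any other real user agent does.

```python
from html.parser import HTMLParser

# Hypothetical fallback conventions for a terminal without CSS support.
FALLBACK = {"b": ("*", "*"), "i": ("/", "/"), "small": ("(", ")")}


class PlainTextRenderer(HTMLParser):
    """Renders known elements with text markers; unknown ones vanish."""

    def __init__(self):
        super().__init__()
        self.out = []

    def handle_starttag(self, tag, attrs):
        if tag in FALLBACK:
            self.out.append(FALLBACK[tag][0])

    def handle_endtag(self, tag):
        if tag in FALLBACK:
            self.out.append(FALLBACK[tag][1])

    def handle_data(self, data):
        self.out.append(data)


def render_plain(markup):
    renderer = PlainTextRenderer()
    renderer.feed(markup)
    renderer.close()
    return "".join(renderer.out)


print(render_plain("Try <b>Siemens</b> today."))
# prints: Try *Siemens* today.
print(render_plain('Try <span class="brand">Siemens</span> today.'))
# prints: Try Siemens today.
```

The class name on the span is only meaningful to whoever wrote the style sheet, so without CSS the distinction is simply lost.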

Of course, it may certainly be more useful to you if you provide a class on
the <i> as well.

~TJ


Re: [whatwg] Deprecating small, b ?

2008-11-14 Thread Pentasis
Of course not.  You're not intended to.  What you *do* get, though, is that 
this is a word which is *intentionally* stylistically offset from the rest 
of the text.  This conveys semantic meaning to a human - it means that the 
word is special or being used in a particular context.  <b> and <i> don't
communicate *much*, but they communicate *something*.  One could, of course,
also use a <span> to mark up and style the text, thus communicating the same
intent to a person reading the styled text, but to a machine the <span>
means literally nothing, while <b> and <i> have the possibility to
communicate *something*.


In addition, the fact that these elements traditionally have a particular
preferred rendering means something.  A dumb terminal which doesn't
understand CSS won't give any indication to the user that a <span> exists at
all, while <b> and <i> have a chance of providing fallback rendering that
still accomplishes what they were designed to do.  A decent chunk of HTML5
concerns itself with providing fallbacks and graceful degradation (or 
progressive enhancement, whichever way you want to look at it).  Having some 
*nearly* semantic-free elements which have a meaningful fallback can be 
useful.


Of course, it may certainly be more useful to you if you provide a class on 
the i as well.


~TJ



First: Computers are binary instruments. Conveying *something* is not very
logical from a computer's point of view. It is not useful to *me* to
provide a class on the <i> or any other element; it is useful to the
computer. Humans may indeed come to some sort of conclusion based on
style or strangely used semantics, but computers cannot; they (still) need a
more literal means of semantics.


Second: Suppose I want to collect all copyright notices from 1000 websites
(don't ask me why, I just want to). How am I to do this when they are marked
up in <small> elements? I will definitely end up with a lot of text that has
nothing to do with copyrights (and probably miss a lot of copyright notices, as
they are marked up differently), whereas if they were marked up in (for example)
<span class="copyright"> I could retrieve them all based on the class name.


Bert 





Re: [whatwg] Deprecating small, b ?

2008-11-14 Thread Tab Atkins Jr.
On Fri, Nov 14, 2008 at 9:38 AM, Pentasis [EMAIL PROTECTED] wrote:

 Of course not.  You're not intended to.  What you *do* get, though, is that
 this is a word which is *intentionally* stylistically offset from the rest
 of the text.  This conveys semantic meaning to a human - it means that the
 word is special or being used in a particular context.  <b> and <i> don't
 communicate *much*, but they communicate *something*.  One could, of course,
 also use a <span> to mark up and style the text, thus communicating the same
 intent to a person reading the styled text, but to a machine the <span>
 means literally nothing, while <b> and <i> have the possibility to
 communicate *something*.

 In addition, the fact that these elements traditionally have a particular
 preferred rendering means something.  A dumb terminal which doesn't
 understand CSS won't give any indication to the user that a <span> exists at
 all, while <b> and <i> have a chance of providing fallback rendering that
 still accomplishes what they were designed to do.  A decent chunk of HTML5
 concerns itself with providing fallbacks and graceful degradation (or
 progressive enhancement, whichever way you want to look at it).  Having some
 *nearly* semantic-free elements which have a meaningful fallback can be
 useful.

 Of course, it may certainly be more useful to you if you provide a class on
 the <i> as well.

 ~TJ



 First: Computers are binary instruments. Conveying *something* is not very
 logical from a computer's point of view. It is not useful to *me* to
 provide a class on the <i> or any other element; it is useful to the
 computer. Humans may indeed come to some sort of conclusion based on
 style or strangely used semantics, but computers cannot; they (still) need a
 more literal means of semantics.


If we wish to communicate that level of semantics, yes.  It may not be
useful to us.  If you *really* need some metadata/semantics, @class probably
can't convey it with enough granularity.  Check out the big discussion from
a few months ago about ccRel and RDFa.


 Second: Suppose I want to collect all copyright notices from 1000 websites
 (don't ask me why, I just want to). How am I to do this when they are marked
 up in <small> elements? I will definitely end up with a lot of text that has
 nothing to do with copyrights (and probably miss a lot of copyright notices, as
 they are marked up differently), whereas if they were marked up in (for example)
 <span class="copyright"> I could retrieve them all based on the class name.


That would be a wonderful perfect world.  I'd like the copyright date as
well, so I can retrieve only things copyrighted in the last ten years.
Assuming that metadata will exist is a fool's errand.  The fact is that if
you are searching for copyright notices, the most efficient way is likely to
just search for the string "copyright" and the (c) symbol.  That'll net you
copyright notices with a high accuracy, and some training on real data can
yield further rules to improve the data-mining accuracy.
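
As a rough illustration of the string-search approach described above, here is
a minimal Python sketch. It is an assumption-laden example, not a tested
crawler: the pattern, the function name find_copyright_lines, and the sample
text are all hypothetical, and a real data-mining pass would need rules refined
against real pages.

```python
import re

# Hypothetical pattern: the word "copyright", the ASCII "(c)" form,
# or the copyright sign itself (U+00A9). Matching is case-insensitive.
COPYRIGHT_RE = re.compile(r"copyright|\(c\)|\u00a9", re.IGNORECASE)


def find_copyright_lines(text):
    """Return the stripped lines of `text` that contain a copyright marker."""
    return [
        line.strip()
        for line in text.splitlines()
        if COPYRIGHT_RE.search(line)
    ]


# Made-up page text, already stripped of markup.
sample = """Welcome to our site.
Copyright 2008 Example Corp.
(c) 2007 Someone Else
Contact us for details."""

matches = find_copyright_lines(sample)
for line in matches:
    print(line)
```

Note that this works on plain text and ignores markup entirely, which is the
point: it does not depend on any author having chosen a particular element or
class name.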

While we're hoping for copyright notices to be marked up as
<span class="copyright">, though, why not wish for <small class="copyright">?  If
you're going to be providing metadata, it works the same.  Is it that you
believe people won't provide a special class for copyrights if the <small>
tag already gives them the preferred display?  Do you believe that everyone
will automatically use class="copyright" to mark up their copyright
notices?  What if they use class="copyright-notice"?  Or class="license"?
Or any of a million other distinct possibilities that would destroy any
naive attempt to datamine based on a particular class name?
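
To make the class-name problem concrete, here is a minimal Python sketch using
the standard library's html.parser. Everything in it is hypothetical (the
ClassTextCollector class, the collect helper, and the three sample pages); it
only shows how a naive collector keyed on one class name misses notices that
authors labelled differently, regardless of whether they used <small> or
<span>.

```python
from html.parser import HTMLParser

# Elements that never have an end tag; skipped so depth tracking stays balanced.
VOID_TAGS = {"br", "hr", "img", "input", "link", "meta"}


class ClassTextCollector(HTMLParser):
    """Collect the text inside any element whose class list includes `wanted`."""

    def __init__(self, wanted):
        super().__init__()
        self.wanted = wanted
        self.depth = 0      # > 0 while inside a matching element
        self.found = []     # one string per matching element

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        classes = (dict(attrs).get("class") or "").split()
        if self.depth:
            self.depth += 1             # nested element inside a match
        elif self.wanted in classes:
            self.depth = 1
            self.found.append("")

    def handle_endtag(self, tag):
        if tag in VOID_TAGS:
            return
        if self.depth:
            self.depth -= 1

    def handle_data(self, data):
        if self.depth:
            self.found[-1] += data


def collect(html, wanted="copyright"):
    """Return the text of every element in `html` carrying class `wanted`."""
    parser = ClassTextCollector(wanted)
    parser.feed(html)
    parser.close()
    return parser.found


# Made-up pages: same kind of notice, three different class names.
pages = [
    '<p><small class="copyright">Copyright 2008 Site A</small></p>',
    '<p><span class="copyright-notice">Copyright 2008 Site B</span></p>',
    '<p><small class="license">Copyright 2008 Site C</small></p>',
]

results = [collect(page) for page in pages]
print(results)
```

Only the first page yields a notice; the other two are silently skipped even
though all three contain the word "Copyright".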

~TJ


Re: [whatwg] Deprecating small, b ?

2008-11-14 Thread Pentasis
If we wish to communicate that level of semantics, yes.  It may not be 
useful to us.  If you *really* need some metadata/semantics, @class probably 
can't convey it with enough granularity.  Check out the big discussion from 
a few months ago about ccRel and RDFa.
 

Not yet maybe, but we could at least try to keep options open for the future.


Second: Suppose I want to collect all copyright notices from 1000 websites
(don't ask me why, I just want to). How am I to do this when they are marked
up in <small> elements? I will definitely end up with a lot of text that has
nothing to do with copyrights (and probably miss a lot of copyright notices, as
they are marked up differently), whereas if they were marked up in (for example)
<span class="copyright"> I could retrieve them all based on the class name.

That would be a wonderful perfect world.  I'd like the copyright date as 
well, so I can retrieve only things copyrighted in the last ten years.  
Assuming that metadata will exist is a fool's errand.  The fact is that if 
you are searching for copyright notices, the most efficient way is likely to 
just search for the string "copyright" and the (c) symbol.  That'll net you
copyright notices with a high accuracy, and some training on real data can 
yield further rules to improve the data-mining accuracy.

You say it yourself: only in a perfect world where all websites in the world
were written in the same language would your solution work. Unfortunately,
I would miss out on all the Chinese copyright stuff.
But another example (based on Siemens): wouldn't it be nice if I could tell
Google I am looking for a person named Siemens so it would ignore the
brand name?


While we're hoping for copyright notices to be marked up as
<span class="copyright">, though, why not wish for <small class="copyright">?  If
you're going to be providing metadata, it works the same.  Is it that you
believe people won't provide a special class for copyrights if the <small>
tag already gives them the preferred display?  Do you believe that everyone
will automatically use class="copyright" to mark up their copyright notices?
What if they use class="copyright-notice"?  Or class="license"?  Or any of
a million other distinct possibilities that would destroy any naive attempt
to datamine based on a particular class name?


Well, that would have to be defined in the standard, wouldn't it? Again, I'm not
saying it should be defined NOW, but at least leave the door open.
I have no problem with using <small> over <span>; neither one is correct, as far
as I can see, in this context. Using "copyright" instead of "license" or
"copyright-notice" would have to be defined somewhere, either in the standard
or in an externally maintained document that is accepted as best practice
or standards-related.

PS: I find it very difficult to respond to rich-text/HTML messages as they
seriously mess up the indentation. Sorry, therefore, if this message is unclear,
as the original message and reply are mixed up.

Re: [whatwg] Deprecating small, b ?

2008-11-14 Thread Tab Atkins Jr.
On Fri, Nov 14, 2008 at 10:44 AM, Pentasis [EMAIL PROTECTED] wrote:

  If we wish to communicate that level of semantics, yes.  It may not be
 useful to us.  If you *really* need some metadata/semantics, @class probably
 can't convey it with enough granularity.  Check out the big discussion from
 a few months ago about ccRel and RDFa.


 Not yet maybe, but we could at least try to keep options open for the
 future.


Of course, but I don't think having <small> in the language closes any
options.


 Second: Suppose I want to collect all copyright notices from 1000
 websites (don't ask me why, I just want to). How am I to do this when they
 are marked up in <small> elements? I will definitely end up with a lot of text
 that has nothing to do with copyrights (and probably miss a lot of copyright
 notices, as they are marked up differently), whereas if they were marked up in
 (for example) <span class="copyright"> I could retrieve them all based on the
 class name.

 That would be a wonderful perfect world.  I'd like the copyright date as
 well, so I can retrieve only things copyrighted in the last ten years.
 Assuming that metadata will exist is a fool's errand.  The fact is that if
 you are searching for copyright notices, the most efficient way is likely to
 just search for the string "copyright" and the (c) symbol.  That'll net you
 copyright notices with a high accuracy, and some training on real data can
 yield further rules to improve the data-mining accuracy.

 You say it yourself: only in a perfect world where all websites in the
 world were written in the same language would your solution work.
 Unfortunately, I would miss out on all the Chinese copyright stuff.


Of course.  But would you expect Chinese speakers to use class="copyright"
on their pages anyway?


 But another example (based on Siemens): wouldn't it be nice if I could
 tell Google I am looking for a person named Siemens so it would ignore the
 brand name?


Certainly.  But at this point you're expecting authors to mark up their
pages with metadata every time they mention someone's name.  The use of <b>
doesn't prevent this, but your use case certainly requires quite a lot more.


 While we're hoping for copyright notices to be marked up as
 <span class="copyright">, though, why not wish for <small class="copyright">?  If
 you're going to be providing metadata, it works the same.  Is it that you
 believe people won't provide a special class for copyrights if the <small>
 tag already gives them the preferred display?  Do you believe that everyone
 will automatically use class="copyright" to mark up their copyright
 notices?  What if they use class="copyright-notice"?  Or class="license"?
 Or any of a million other distinct possibilities that would destroy any
 naive attempt to datamine based on a particular class name?

 Well, that would have to be defined in the standard, wouldn't it? Again, I'm not
 saying it should be defined NOW, but at least leave the door open.
 I have no problem with using <small> over <span>; neither one is correct, as
 far as I can see, in this context. Using "copyright" instead of "license" or
 "copyright-notice" would have to be defined somewhere, either in the
 standard or in an externally maintained document that is accepted as best
 practice or standards-related.


Okay, then we have no issue with <small>.  There has been some discussion,
btw, about standardizing a set of normative class names.  You should be able
to turn something up about it.

PS: I find it very difficult to respond to rich-text/HTML messages as they
 seriously mess up the indentation. Sorry, therefore, if this message is unclear,
 as the original message and reply are mixed up.


No problem; it was clear enough.  The only richtext I use is quote levels,
and with the conversation context nearby anyway, it's not difficult to
puzzle out when it occasionally messes up.

~TJ