Re: [whatwg] Deprecating small , b ?
Tab Atkins Jr. wrote: On Wed, Nov 26, 2008 at 4:48 PM, Calogero Alex Baldacchino [EMAIL PROTECTED] wrote: [cut]
We don't have to touch parsing at all to accomplish essentially this. The issue you're worried about is getting crazy semantics applied to individual letters. Semantic parsers (which honestly the average browser is *not*) can easily just ignore the semantic value of b or small or i when they don't wrap a full word, assuming that the use is either stylistic or too complex/subtle to easily capture.
Well, that is a 'semantic' solution, equivalent to leaving everything to the implementation; a 'parsing' solution would solve the 'problem' at its root, but I acknowledge the question is too marginal to be taken seriously into account.
Agree to disagree, I guess. I don't find "We hope you'll find <b>Product A</b> to be the best laundry detergent you've ever used!" to be denoting emphasis or importance, really.
I think 'Product A' is the core of the message, the thing some people are trying to sell you, the name you *must* remember when you want to buy a laundry detergent, so that those people become rich. The bold presentation aims to capture your attention and keep your eyes on it a bit longer; in a TV/radio spot the name of the product would be spoken with some isolation, with at least a bit of emphasis, for the same reasons. It denotes importance, meaning you need to pay special attention to it in order to understand *what the author wants you to understand*. I think that the same semantics can be expressed by strong, since the importance of a piece of text lies not (only) in its meaning, or in the overall meaning of the message, or in whether one takes it as important or not, but (also, or mainly) in the author's intention to mark it as different from the rest of the content, as a reading key, to drive your attention and your thoughts as well (ok, that's like saying that truth is a chimera, but such can be a crude truth :-P ).
If I was contrasting Product A with another item, I could perhaps agree. But we're not, so I don't. ^_^ However, we're obviously splitting hairs here.
But you're implicitly contrasting 'Product A' with a bunch of generic items, all the items of the same category your potential buyer might happen to know (I think this needs some clarification with an example: I guess you were referring to comparative advertising, which has not been legal in Italy for several years, so what you wrote - with some make-up - has always been one of the most common kinds of advertisement here). Anyway, I agree we're splitting hairs, but there's some reason for me to push those concepts, and I hope I'll be able to make it clear.
Well, a foreign-language word, especially if correctly pronounced (by someone else), can be more or less hard to 'catch', so a bit of emphasis in its pronunciation might help the listener to correctly distinguish sounds.
That's stretching quite a bit more than I think is appropriate. Just because I use a foreign phrase, does not mean that I'm emphasizing it. If I, in audible speech, would put a bit of inflection on the phrase, that still doesn't mean I'm emphasizing it in anything like the way I emphasize "I'm <em>not</em> going to the dance with you!".
Isn't a bit of inflection also a bit of emphasis in pronunciation? Perhaps I'm misusing the English term; it sounded correct to me in a wider sense, but I'll leave that concept, or modify it...
In other words, at most I might slightly stylistically offset the phrase from my surrounding spoken words, but I wouldn't be *emphasizing* them. So the i semantics are correct here. ^_^
After all, most of the time bold and italicized text tries to reflect our way of pronouncing sentences, with more or less isolation, more or less emphasis, quicker or slower, thereby changing their meaning and telling the listener that some part requires greater or lesser attention, that it is somehow 'special', with different grades of 'speciality'.
From this point of view, I think either b/i can be semantically the very same thing as strong/em, or their semantics should be redefined so as to indicate a different (and lower) grade of 'speciality' on the same speciality scale, but not a different kind of 'speciality' (i.e., b-text stands out for some - opaque - reason which has nothing in common with strong-text).
You're overreaching your definition of importance and emphasis. I don't think it's valuable to denote *everything* that is in some way special as important or emphatic - you lose a sense of scale. If you wish to define the words as such, then sure, b and i are lesser grades of importance and emphasis by definition. By more conventional definitions, though, they're not, and their stated semantics are fine.
Ok, let's define 'special' in a more correct manner. What should be a slight offset? What does 'outstanding for some reason' mean, in a less ambiguous definition? How should the offset
Re: [whatwg] Deprecating small , b ?
Tab Atkins Jr. wrote: On Tue, Nov 25, 2008 at 3:08 PM, Calogero Alex Baldacchino [EMAIL PROTECTED] wrote: Tab Atkins Jr. wrote: On Tue, Nov 25, 2008 at 10:24 AM, Calogero Alex Baldacchino [EMAIL PROTECTED] wrote:
Do you mean that if you had markup like <p><b>W</b>hen I was young...</p>, it would be read out as "I was young..."? If so, that's clearly a bug in the reader, and has nothing to do with semantics or the lack of it. There is *no* legitimate interpretation of that markup that would lead one to discard the first word.
I agree that reading software unable to understand some text with unexpected typographic variants should read it as normal text; however, I can see how the above could lead to an unexpected result when looking for non-typographic semantics.
Basically, there is a subset of authors who are morons, and they'll screw up anything we do. Most of us aren't like that, but trying to design around that subset is a game you can't win. Their pages will be FUBAR no matter what we do, until browsers' rendering engines are literally hooked up to a sentient semantic parser.
Arghh!! Such software would be too smart and would dominate the world... It could think: morons are a bother; human beings generate morons; no more human beings means no more bother for me.
Ah, the default style could be slightly or very different from the small one, e.g. the text could be surrounded by parentheses or hyphens, regardless of the font size (and the new elements could be designed so as to accept just non-empty strings consisting of more than one non-spacing character).
We could, but is there any reason to have it do that? Making the text small is a good visual representation of the small print or aside semantics.
The concept was (or could be - let me modify it): the more alternative visual representations we provide for the aside semantic element, the more likely a moron designer will stay away from it, since he could be confused about the style he's creating. Likewise, the rule "there is no default style for the element" could prevent authoring tools from just changing the name of a button used to style some text. But I know, it would all fail because the most popular browsers would choose very similar renderings (or would just fall back to rendering small fonts). Anyway, I wouldn't underestimate the latter characteristic (ok, that wasn't clear), that is, establishing that the use of the element is legal if it surrounds a piece of text made up of one or more whole words (or at least one readable character) and if it's bounded by spacing or punctuation characters (that is, the 'semantic element' cannot be a part of a word). Of course, the misuse concern would just move from the messed-up word to a messed-up sentence, but at least, in this case, an assistive reader would be less likely to be fouled up and, without any need for luck, it could speak out something funny, yet understandable. Of course the same could be done by redefining the b and small parsing rules, but that would break with a bunch of (possible) legacy uses, and if we had to break somehow with the past, why not look for some more significant names? - Just to say, not hoping to persuade you :-P
Here it is me not understanding.
I think that any reason to offset some text from the surrounding one can be reduced to the different grade of 'importance' the author gives it, in the same meaning as Smylers used in his mails (that is, not the importance of the content, but the relevance it gets as attention focus - he made the example of the English small print idiom, and in another mail clarified that It's less important in the sense that it isn't the point of what the author wants users to have conveyed to them; it's less important to the message. (Of course, to users any caveats in the small print may be very important indeed!)). From this point of view, unless we aimed to avail of b as an intermediate grade of relevance between 'normal text' and 'em/strong' (but, aren't these enough to attract a reader's attention?), redefining its semantic might be redundant with lesser utility. (In my crazy mind, this applies to the headings too, since a 'good' heading focuses attention on the core subject of its following section, so have to be evidenced as an important slice of text). Furthermore, I meant that strong and em would have been a better choice than b in Smylers' examples because their *original semantics* is very close together with that of a more relevant text/a text needing greater attention, while b *original semantics* is very different and needs to be redefined for this purpose (but we have still got possible alternatives to this).
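To make the whole-word rule proposed earlier in this message more concrete (an element is acceptable only if it wraps at least one readable character and is bounded by spacing or punctuation), here is a minimal sketch in Python of how a checker might flag b, i and small elements that split a word. It is only an illustration under stated assumptions: the names `WordBoundaryChecker` and `find_split_words` are hypothetical, `str.isalnum()` stands in for "part of a word", any other tag is treated as a word boundary, and none of this is taken from a specification or from an existing user agent.

```python
from html.parser import HTMLParser

INLINE = {"b", "i", "small"}  # elements the hypothetical rule applies to


class WordBoundaryChecker(HTMLParser):
    """Flags b/i/small elements that are empty or that start or end inside a word."""

    def __init__(self):
        super().__init__()
        self.prev_char = " "   # character immediately before the current position
        self.open = []         # stack of [tag, char_before, text_inside]
        self.pending = []      # closed elements waiting for the next character
        self.violations = []   # (tag, text) pairs that break the rule

    def _settle(self, next_char):
        # Decide the fate of closed elements now that the following character is known.
        for tag, before, text in self.pending:
            breaks_rule = (
                not text.strip()
                or (before.isalnum() and text[0].isalnum())
                or (text[-1].isalnum() and next_char.isalnum())
            )
            if breaks_rule:
                self.violations.append((tag, text))
        self.pending = []

    def handle_starttag(self, tag, attrs):
        if tag in INLINE:
            self.open.append([tag, self.prev_char, ""])
        else:
            # Assumption: any other tag (p, div, br, ...) acts as a word boundary.
            self._settle(" ")
            self.prev_char = " "

    def handle_endtag(self, tag):
        if tag in INLINE:
            if self.open and self.open[-1][0] == tag:
                self.pending.append(tuple(self.open.pop()))
        else:
            self._settle(" ")
            self.prev_char = " "

    def handle_data(self, data):
        if not data:
            return
        self._settle(data[0])
        for entry in self.open:
            entry[2] += data
        self.prev_char = data[-1]

    def close(self):
        super().close()
        self._settle(" ")  # end of input counts as a boundary


def find_split_words(markup):
    """Return the (tag, text) pairs that violate the whole-word rule."""
    checker = WordBoundaryChecker()
    checker.feed(markup)
    checker.close()
    return checker.violations
```

Under these assumptions a drop-cap such as `<b>W</b>hen` would be reported, while `<b>Product A</b>` surrounded by spaces would not. Whether a conformance checker should go further than reporting is exactly the open question in this thread.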
Re: [whatwg] Deprecating small , b
Pentasis writes: [Asbjørn Ulsberg writes:] However, as you write and as HTML5 defines it, there is nothing wrong with small per se, and I agree that as an element indicating smallprint, it works just fine. Since my initial reply might have been a bit too colored by the HTML4 definition of the element and its current usage on the web, I hereby withdraw my comment and conclude that I mostly agree with you. :-)
Yay, consensus! Thanks, Asbjørn.
But isn't this just the reason why it should be dis-used? The HTML4 spec defined it as a styling tag, and that is how it is *mostly* used and understood by the majority of the users/authors.
That may be true (though authors who want smaller text just because they think the default looks too large could also use <font size=2> or CSS), but authors who wanted to diminish the emphasis of certain content to users are likely to pick small because there isn't much else available. Just because an element is currently widely used for a purpose we deem inappropriate doesn't mean that its appropriate uses aren't important. Tables are widely used for layout; br-s are widely misused. Both of those clearly have other valid uses, so are still in HTML.
Just because HTML5 redefines the element does not mean that the element will suddenly be semantic. Even if people start using it purely semantically from now on (and what is the chance of that?), the existing websites still carry small-tags that are not compliant with the new definition.
Yes. But the suggested alternative was to deprecate small entirely and invent a new element to convey the semantic of 'small print'. That would of course make _all_ current uses of small non-conforming. Presentational small-s are going to be non-conforming either way; allowing semantic small-s to conform doesn't change that.
By redefining it the (existing) web breaks, albeit purely in the semantic area.
That's intentional.
If anybody checks legacy content against the new standard they will discover that what they did is no longer recommended. However, browsers will 100% support it and continue to render it as it always has been, so the 'breakage' is in no way visible; if the author chooses not to care about it then no harm is done.
Smylers
PS: Pentasis, please could you send mails that do at least one of attributing who you're quoting or including In-Reply-To: headers, so that they continue the existing thread rather than starting a new one. Without either it's rather tedious to have to look up who said the text you quote. Thanks.
Re: [whatwg] Deprecating small , b ?
Smylers wrote: Asbjørn Ulsberg writes: On Mon, 17 Nov 2008 15:26:22 +0100, Smylers [EMAIL PROTECTED] wrote: In printed material users are typically given no out-of-band information about the semantics of the typesetting. However, smaller things are less noticeable, and it's generally accepted that the author of the document wishes the reader to pay less attention to them than more prominent things. That works fine with small . No, it doesn't, and you explain why yourself here: User-agents which can't literally render smaller fonts can choose alternative mechanisms for denoting lower importance to users. I don't see how that explains why small is an inappropriate tag to use for things which an author wishes to be less noticeable. [...] Of course that's possible, but, as you noticed too, only by redefining the small semantics, and is not a best choice per se. That's both because the original semantics for the small tag was targeted to styling and nothing else (the html 4 document type definitions declared it as a member of the fontstyle entity, while, for instance, strong and em were parts of the phrase entity), and because the term 'small', at first glance, suggests the idea of a typographical function, regardless any other related concept which might be specific for the English (or whatever else) culture, but might not be as well immediate for non-English developers all around the world. As a consequence, since any average developer could just rely on the old semantics, being he intuitively confident with it, the semantics redefinition could find a first counter-indication: let's think on a word written with alternate b and small letters, or just to a paragraph first letter evidenced by a b, obviously the application of the new semantics here would be untrivial (i.e. an assistive software for blind users would be fouled by this and give unpredictable results). 
Although the previous use case would be a misuse of the b and small markup, it would still be possible, meaning not prohibited, and so creating a new element with proper semantics could be a better choice. But, you're right, we have to deal with backward compatibility, and redefining the small and b semantics can be a good compromise, since a new element would face some heavy concerns, mainly related to rendering and to the state-of-the-art implementations in non-visual user agents (and the like). However, I think that a solution, at least a partial one, can be found for the rendering concern (and I'd push for this being done anyway, since there are several new elements defined for HTML 5). Most user agents are capable of interpreting a dtd to some extent, so it could be worth the effort to define an html 5 specific dtd in addition to the parsing rules - which aim to overcome all the problems arising from previous dtd-only html specifications - so that a browser that is not fully html5-compliant can somehow interpret any new elements. The HTML 5 Doctype declaration could accept a dtd just for backward compatibility purposes, and any fully compliant user agent would just ignore such a dtd. More specifically, such a dtd could define default values for some attributes, such as the style attribute (to have any new element properly rendered - some assistive technologies are capable of interpreting style sheets too), and, anyway, there should be a way, in SGML, to create an alias for an element (e.g., a new element - let's call it incidental - could be aliased to small for better compatibility).
Let's come to the non-typographical interpretation a present-day u.a. may be capable of, as in your example about lynx. This can be a very good reason to deem small a very good choice. But are we sure that *every* existing user agent can do that? If the answer is yes, we can stop here: small is a perfect choice. Better: small is all we need, so let's stop bothering each other about this matter.
But if the answer is no, we have to face a number of user agents needing an update to understand the new semantics for the small tag, and so, if the new semantics can be assumed as *surely* reliable only with new/updated u.a.'s (that is, with those ones fully compatible with html 5 specifications), that's somehow like to be starting from scratch, and consequently there is space for a new, more appropriate element. However, you would appreciate that the author had wished for some particular words to stand out from the surrounding text. That's a job for the style sheet, whether it's provided by the author or by the user agent. The style-sheet can only pick out particular words if those words have been marked-up as special in the document, so it doesn't solve the problem of how to mark them up. Further, this isn't using b because the house style is to have all text in a bold weight (that can be done by style-sheets, and if the style-sheet is missing all the content is still there); it's using b to
Re: [whatwg] Deprecating small , b ?
On Tue, Nov 25, 2008 at 10:24 AM, Calogero Alex Baldacchino [EMAIL PROTECTED] wrote: Smylers wrote: Asbjørn Ulsberg writes: On Mon, 17 Nov 2008 15:26:22 +0100, Smylers [EMAIL PROTECTED] wrote: In printed material users are typically given no out-of-band information about the semantics of the typesetting. However, smaller things are less noticeable, and it's generally accepted that the author of the document wishes the reader to pay less attention to them than more prominent things. That works fine with small . No, it doesn't, and you explain why yourself here: User-agents which can't literally render smaller fonts can choose alternative mechanisms for denoting lower importance to users. I don't see how that explains why small is an inappropriate tag to use for things which an author wishes to be less noticeable. [...] Of course that's possible, but, as you noticed too, only by redefining the small semantics, and is not a best choice per se. That's both because the original semantics for the small tag was targeted to styling and nothing else (the html 4 document type definitions declared it as a member of the fontstyle entity, while, for instance, strong and em were parts of the phrase entity), and because the term 'small', at first glance, suggests the idea of a typographical function, regardless any other related concept which might be specific for the English (or whatever else) culture, but might not be as well immediate for non-English developers all around the world. As a consequence, since any average developer could just rely on the old semantics, being he intuitively confident with it, the semantics redefinition could find a first counter-indication: let's think on a word written with alternate b and small letters, or just to a paragraph first letter evidenced by a b, obviously the application of the new semantics here would be untrivial (i.e. an assistive software for blind users would be fouled by this and give unpredictable results). 
Despite the previous use case would be a misuse of the b and small markup, yet it would be possible, meaning not prohibited, and so creating a new element with a proper semantic could be a better choice. No matter *what* we do, if there *is* a default style for an element, it will be misused by people. This is a fact of life. Defining a new element which is identical to small in every way except that it hasn't been misused *yet* is thus a mug's game, because it *will* be misused in the same way as small, and then we just have two identical elements for no reason. Yes, bad markup will foul up semantic agents. But people will *always* write bad markup. At least with the semantic redefinition we get to declare lots of usages that *are* appropriate to be conforming without any effort on the author's part. And really, the type of people who would write a word with alternating letters wrapped in b and small tags are hardly the kind to even *care* about semantics. But, you're right, we have to deal with backward compatibility, and redefining the small and b semantics can be a good compromise, since a new element would face some heavy concerns, mainly related to rendering and to the state of the art implementations in non-visual user agents (and the alike). However, I think that a solution, at least partial, can be found for the rendering concern (and I'd push for this being done anyway, since there are several new elements defined for HTML 5). Most user agents are capable to interpret a dtd to some extent, so it could be worth the effort to define an html 5 specific dtd in addition to the parsing roules - which aim to overcome all problems arising by previous dtd-only html specifications - so that a non html5-fully-compliant browser can somehow interpret any new elements. HTML 5 Doctype declaration could accept a dtd just for backward compatibility purpose, and any fully compliant user agent would just ignore such dtd. 
More specifically, such a dtd could define default values for some attributes, such as the style attribute (to have any new element properly rendered - some assistive technologies are capable of interpreting style sheets too), and, anyway, there should be a way, in SGML, to create an alias for an element (e.g., a new element - let's call it incidental - could be aliased to small for better compatibility).
Html5 is no longer an SGML language.
Let's come to the non-typographical interpretation a present-day u.a. may be capable of, as in your example about lynx. This can be a very good reason to deem small a very good choice. But are we sure that *every* existing user agent can do that? If the answer is yes, we can stop here: small is a perfect choice. Better: small is all we need, so let's stop bothering each other about this matter. But if the answer is no, we have to face a number of user agents needing an update to understand the new semantics for the small tag, and
Re: [whatwg] Deprecating small , b ?
Tab Atkins Jr. ha scritto: On Tue, Nov 25, 2008 at 10:24 AM, Calogero Alex Baldacchino [EMAIL PROTECTED] mailto:[EMAIL PROTECTED] wrote: Of course that's possible, but, as you noticed too, only by redefining the small semantics, and is not a best choice per se. That's both because the original semantics for the small tag was targeted to styling and nothing else (the html 4 document type definitions declared it as a member of the fontstyle entity, while, for instance, strong and em were parts of the phrase entity), and because the term 'small', at first glance, suggests the idea of a typographical function, regardless any other related concept which might be specific for the English (or whatever else) culture, but might not be as well immediate for non-English developers all around the world. As a consequence, since any average developer could just rely on the old semantics, being he intuitively confident with it, the semantics redefinition could find a first counter-indication: let's think on a word written with alternate b and small letters, or just to a paragraph first letter evidenced by a b, obviously the application of the new semantics here would be untrivial (i.e. an assistive software for blind users would be fouled by this and give unpredictable results). Despite the previous use case would be a misuse of the b and small markup, yet it would be possible, meaning not prohibited, and so creating a new element with a proper semantic could be a better choice. No matter *what* we do, if there *is* a default style for an element, it will be misused by people. This is a fact of life. Defining a new element which is identical to small in every way except that it hasn't been misused *yet* is thus a mug's game, because it *will* be misused in the same way as small, and then we just have two identical elements for no reason. I'll start with an example. A few time ago I played around with Opera Voice. 
It seemed to be capable of interpreting visual style sheets, and specifically font styles, so that bold or italic text (specified as such in the style sheet, not the markup) was spoken differently from 'normal' text, but a paragraph's first letter styled differently from the rest of the word (which is a not uncommon typographical choice), as far as I remember, caused the whole word to be skipped. This suggests to me that if we really want 'cross-presentation' semantics, we have to keep as far as we can from anything having a *main* typographical semantics (as small and b have had from their birth). Every language is somehow prone to side-effects caused by misuse (e.g. it is possible to cause a big mess in software written in a language that allows passing a pointer to a function - there are tons of examples of language design issues - yet that can be a desirable capability), but appropriate choices for both semantics and syntax may help to reduce the likelihood of misuse. I think that very likely both b and small will carry on their old semantics, and so be more prone to misuse with respect to their new one, since very likely a lot of developers are, and will remain, more confident with their original semantics, which is also suggested by their names ('b' standing for 'bold' and 'small'... for something small on the screen or on paper). Instead, a new element would require the developer to make some effort at least to learn about its existence, so he would read that the element's primary use is to indicate a different importance of a piece of text, so that a non-visual user agent can present it in an appropriate manner, and a visual or print user agent can render it in different ways.
Ah, the default style could be slightly or very different from the small one, e.g. the text could be surrounded by parentheses or hyphens, regardless of the font size (and the new elements could be designed so as to accept just non-empty strings consisting of more than one non-spacing character).
Yes, bad markup will foul up semantic agents. But people will *always* write bad markup. At least with the semantic redefinition we get to declare lots of usages that *are* appropriate to be conforming without any effort on the author's part. And really, the type of people who would write a word with alternating letters wrapped in b and small tags are hardly the kind to even *care* about semantics.
Let me reverse this approach: what should an assistive user agent do with something like <b>M</b><small>E</small><b>S</b><small>S</small>? I think that dealing with that word as normal text would be a more graceful degradation than discarding it, and if we clearly state that b and small have only typographical semantics, while different elements are provided to differentiate the grade of emphasis of a phrase, an assistive user agent could support a better behaviour, while any author disregarding semantics would not cause any trouble (the
Re: [whatwg] Deprecating small , b ?
On Tue, Nov 25, 2008 at 3:08 PM, Calogero Alex Baldacchino [EMAIL PROTECTED] wrote: Tab Atkins Jr. ha scritto: On Tue, Nov 25, 2008 at 10:24 AM, Calogero Alex Baldacchino [EMAIL PROTECTED] mailto:[EMAIL PROTECTED] wrote: Of course that's possible, but, as you noticed too, only by redefining the small semantics, and is not a best choice per se. That's both because the original semantics for the small tag was targeted to styling and nothing else (the html 4 document type definitions declared it as a member of the fontstyle entity, while, for instance, strong and em were parts of the phrase entity), and because the term 'small', at first glance, suggests the idea of a typographical function, regardless any other related concept which might be specific for the English (or whatever else) culture, but might not be as well immediate for non-English developers all around the world. As a consequence, since any average developer could just rely on the old semantics, being he intuitively confident with it, the semantics redefinition could find a first counter-indication: let's think on a word written with alternate b and small letters, or just to a paragraph first letter evidenced by a b, obviously the application of the new semantics here would be untrivial (i.e. an assistive software for blind users would be fouled by this and give unpredictable results). Despite the previous use case would be a misuse of the b and small markup, yet it would be possible, meaning not prohibited, and so creating a new element with a proper semantic could be a better choice. No matter *what* we do, if there *is* a default style for an element, it will be misused by people. This is a fact of life. Defining a new element which is identical to small in every way except that it hasn't been misused *yet* is thus a mug's game, because it *will* be misused in the same way as small, and then we just have two identical elements for no reason. I'll start with an example. 
Some time ago I played around with Opera Voice. It seemed to be capable of interpreting visual style sheets, and specifically font styles, so that bold or italic text (specified as such in the style sheet, not the markup) was spoken differently from 'normal' text, but a paragraph's first letter styled differently from the rest of the word (which is a not uncommon typographical choice), as far as I remember, caused the whole word to be skipped.
Do you mean that if you had markup like <p><b>W</b>hen I was young...</p>, it would be read out as "I was young..."? If so, that's clearly a bug in the reader, and has nothing to do with semantics or the lack of it. There is *no* legitimate interpretation of that markup that would lead one to discard the first word.
This suggests to me that if we really want 'cross-presentation' semantics, we have to keep as far as we can from anything having a *main* typographical semantics (as small and b have had from their birth). Every language is somehow prone to side-effects caused by misuse (e.g. it is possible to cause a big mess in software written in a language that allows passing a pointer to a function - there are tons of examples of language design issues - yet that can be a desirable capability), but appropriate choices for both semantics and syntax may help to reduce the likelihood of misuse. I think that very likely both b and small will carry on their old semantics, and so be more prone to misuse with respect to their new one, since very likely a lot of developers are, and will remain, more confident with their original semantics, which is also suggested by their names ('b' standing for 'bold' and 'small'... for something small on the screen or on paper).
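The drop-cap case discussed above can be illustrated with a small Python sketch of the behaviour both sides seem to accept as the sensible fallback: when a reader cannot make sense of a typographic tag, it keeps the text and ignores the tag. The class `SpokenText` and the function `text_to_read` are hypothetical names chosen for this illustration; the sketch says nothing about how Opera Voice or any other real product works.

```python
from html.parser import HTMLParser


class SpokenText(HTMLParser):
    """Collects character data only, so no tag can ever remove text."""

    def __init__(self):
        super().__init__()
        self.parts = []

    def handle_data(self, data):
        # Tags such as b or small are ignored; their contents are kept as-is.
        self.parts.append(data)


def text_to_read(markup):
    """Return the text a reader would fall back to when it ignores all tags."""
    reader = SpokenText()
    reader.feed(markup)
    reader.close()
    return "".join(reader.parts)


print(text_to_read("<p><b>W</b>hen I was young...</p>"))
print(text_to_read("<b>M</b><small>E</small><b>S</b><small>S</small>"))
```

With this fallback the first word of the paragraph survives intact, and a word written with alternating b and small letters degrades to an ordinary word instead of being dropped.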
Instead, a new element would require the developer to take some effort at least to learn about its existence, so he would read that such element primary use is to indicate a different importance of a piece of text, so that a non visual user agent can present it in an appropriate manner, and a visual or print user agent can render it in different ways. Well, the new semantics are purposely very close to the old 'semantics'. Bold text *is* text purposely offset from the surrounding prose. Some legacy uses of b are more correctly done with other existing elements, like strong or h1, but at least it's *close*. And again, the type of author who *is* marking up random things with b for purely stylistic reasons isn't the sort who is going around reading standards documents, or likely even caring in the slightest. If they *did* discover a new element that has the correct semantics (like standout or something), they'll either ignore it (if it's basically identical to b) or use it nonsemantically as well (if it offers some exciting new default styling). Basically, there is a subset of authors who are morons, and they'll screw up anything we do. Most of us aren't like that, but
Re: [whatwg] Deprecating small , b ?
On Mon, 17 Nov 2008 15:26:22 +0100, Smylers [EMAIL PROTECTED] wrote: In printed material users are typically given no out-of-band information about the semantics of the typesetting. However, smaller things are less noticeable, and it's generally accepted that the author of the document wishes the reader to pay less attention to them than more prominent things. That works fine with small. No, it doesn't, and you explain why yourself here: User-agents which can't literally render smaller fonts can choose alternative mechanisms for denoting lower importance to users. If the point isn't to literally render smaller fonts, you shouldn't indicate that you want the fonts rendered smaller either. What you want is to semantically indicate that the text wrapped inside the element is of less significance than the surrounding text, e.g. a negative 'strong' or 'em'. Just as 'b' isn't equal to 'strong', 'small' isn't equal to what we're trying to express here. What we need is a new element that can capture this semantic. Denoting particular text as being of lessor importance is quite different from choosing the overall base font size (or indeed typeface) for the page, or the colour of links or headings -- that's merely expressing a preference for how graphical user-agents should render particular semantics, but the semantics themselves are conveyed to _all_ user-agents (a, h3, etc). Which is why we need to capture this as semantic and not as presentational sugar. Indeed you can't. And nor can you if you were reading printed text with some words in bold. Why does printed text set the standard for what we are able to express with a markup language? Does e.g. PDF in any way direct what should be possible with HTML? However, you would appreciate that the author had wished for some particular words to stand out from the surrounding text. That's a job for the style sheet, whether it's provided by the author or by the user agent. 
Using the same element would in most circumstances yield the same presentation. Isn't that what you want?

> However, you can only notice this if the words have been distinguished in some way. With b, all user-agents can choose to convey to users that those words are special.

They are only special for sighted users, browsing the page with a rather advanced user agent. They are not special to blind users or to users of text-based user agents like Lynx. If you want to express semantics, then use a semantic element. Expressing semantics through presentation only is done in print because of the limitations in the printing system. If the print was for a blind person, printed with braille, one could imagine (had it been supported) that letters with a higher weight could be physically warmer than others, or with a more jagged edge so they could stand out. Such effects would have been impossible if the document was only tagged with presentational markup. The same applies to other mediums than print -- you need to know the underlying reason of why something is presented the way it is to transfer that presentation to another environment. And for that you need the semantics.

--
Asbjørn Ulsberg -=|=- [EMAIL PROTECTED]
«He's a loathsome offensive brute, yet I can't look away»
Re: [whatwg] Deprecating small , b ?
Asbjørn Ulsberg writes:

> On Mon, 17 Nov 2008 15:26:22 +0100, Smylers [EMAIL PROTECTED] wrote:
>
>> In printed material users are typically given no out-of-band information about the semantics of the typesetting. However, smaller things are less noticeable, and it's generally accepted that the author of the document wishes the reader to pay less attention to them than more prominent things. That works fine with small.
>
> No, it doesn't, and you explain why yourself here:
>
>> User-agents which can't literally render smaller fonts can choose alternative mechanisms for denoting lower importance to users.

I don't see how that explains why small is an inappropriate tag to use for things which an author wishes to be less noticeable.

> If the point isn't to literally render smaller fonts, you shouldn't indicate that you want the fonts rendered smaller either.

Indeed. <font size=-1> would be bad to use for this.

> What you want is to semantically indicate that the text wrapped inside the element is of less significance than the surrounding text, e.g. a negative 'strong' or 'em'.

Yes. And I reckon that small works for that. English has the idiom of 'small print', roughly meaning text written by the legal department rather than the marketing department. But 'small print' doesn't literally have to be typeset with a smaller font; it's a figure of speech.

> 'small' isn't equal to what we're trying to express here. What we need is a new element that can capture this semantic.

If we were starting from scratch then indeed small may not be the best name to choose for this element. But, unfortunately, we aren't. small has existed for some time, and people are already using it. If one currently wants to denote lesser importance, small is the best element to pick. Further, existing browsers know what to do with small; if we introduce a new element then content that uses it will have a sub-optimal rendering in current browsers, whereas small already does something appropriate.
So I still think small works for denoting that something is of smaller importance.

>> Indeed you can't. And nor can you if you were reading printed text with some words in bold.
>
> Why does printed text set the standard for what we are able to express with a markup language?

It doesn't set the standard. But it's useful in some comparisons. And most of the time humans cope perfectly well with inferring typographic conventions without having them spelt out.

>> However, you would appreciate that the author had wished for some particular words to stand out from the surrounding text.
>
> That's a job for the style sheet, whether it's provided by the author or by the user agent.

The style-sheet can only pick out particular words if those words have been marked-up as special in the document, so it doesn't solve the problem of how to mark them up. Further, this isn't using b because the house style is to have all text in a bold weight (that can be done by style-sheets, and if the style-sheet is missing all the content is still there); it's using b to convey _some_ semantics: namely that those particular words are in some way special. So if the mark-up is <span class=brand_name> or similar and the distinguishing presentation added with CSS then users without style-sheets are completely unaware that the author identified those words as being special. Whereas with b, everybody gets to know.

>> However, you can only notice this if the words have been distinguished in some way. With b, all user-agents can choose to convey to users that those words are special.
>
> They are only special for sighted users, browsing the page with a rather advanced user agent. They are not special to blind users or to users of text-based user agents like Lynx.

Not true. Any user-agent can choose to convey that words marked in b are somehow different from the surrounding words. Lynx does this.

> If you want to express semantics, then use a semantic element.

That's begging the question.
If we define b to be semantic, then it is!

> Expressing semantics through presentation only is done in print because of the limitations in the printing system.

Well, yes.

> If the print was for a blind person, printed with braille, one could imagine (had it been supported) that letters with a higher weight could be physically warmer than others, or with a more jagged edge so they could stand out.

Yup -- and an HTML-to-braille converter could choose to do that with words marked in b, whereas it couldn't with <span class=BrandName>.

> Such effects would have been impossible if the document was only tagged with presentational markup.

To some extent, yes: not knowing whether a letter is where it is on the page because it's a start of a paragraph or a heading, or just because the previous line is full, hampers doing that. And similarly for typefaces.

> The same applies to other mediums than print -- you need to know the underlying reason of why
Re: [whatwg] Deprecating small, b ?
Nils Dagsson Moskopp wrote:

> The small element represents small print [...] The b element represents a span of text to be stylistically offset from the normal prose without conveying any extra importance [...] Both definitions seem rather presentational (contrasting, for example, the new semantic definition for the i element) and could also be realized by use of span elements. To me these look like the last remnants of the font element. So why are these elements retained?

This issue has been discussed in depth in the past; most significantly on public-html around May 2007.

http://lists.w3.org/Archives/Public/www-html/2007May/thread.html#msg3
(I think most of the relevant discussion is in the Cleaning House thread)

I have added an entry to the FAQ detailing the rationale for including these elements, and have previously written an article about the issue too.

http://wiki.whatwg.org/wiki/FAQ#Why_are_some_presentational_elements_like_.3Cb.3E.2C_.3Ci.3E_and_.3Csmall.3E_still_included.3F
http://lachy.id.au/log/2007/05/b-and-i
http://lists.whatwg.org/pipermail/whatwg-whatwg.org/2007-January/009060.html

--
Lachlan Hunt - Opera Software
http://lachy.id.au/
http://www.opera.com/
Re: [whatwg] Deprecating small , b ?
On 2008/11/24 16:19 (GMT) Smylers composed:

> So I still think small works for denoting that something is of smaller importance.

I do too, but I don't believe less importance can be the only inference. One could simply want smaller text, without expecting that inference. E.g., just because fine print legalese is called what it is doesn't necessarily make it unimportant or less important. I'm for keeping small in the spec.

--
Love is not easily angered. Love does not demand its own way. 1 Corinthians 13:5 NIV
Team OS/2 ** Reg. Linux User #211409
Felix Miata *** http://fm.no-ip.com/
Re: [whatwg] Deprecating small , b ?
Felix Miata writes:

> On 2008/11/24 16:19 (GMT) Smylers composed:
>
>> So I still think small works for denoting that something is of smaller importance.
>
> I do too, but I don't believe less importance can be the only inference. One could simply want smaller text, without expecting that inference.

If you just want something to be smaller stylistically and there's nothing special about that portion of the text then I think using small for it would be as bad as using h1 just to make text bigger; CSS is a better choice.

> E.g., just because fine print legalese is called what it is doesn't necessarily make it unimportant or less important.

It's less important in the sense that it isn't the point of what the author wants users to have conveyed to them; it's less important to the message. (Of course, to users any caveats in the small print may be very important indeed!)

Smylers
Re: [whatwg] Deprecating small , b ?
On Mon, 24 Nov 2008 17:19:44 +0100, Smylers [EMAIL PROTECTED] wrote: I don't see how that explains why small is an inappropriate tag to use for things which an author wishes to be less noticeable. I was thinking mostly about the tag's current usage on the web, which is a crazy mix between the HTML4 and HTML5 definition of the element. HTML4 defines it purely presentational, HTML5 mostly semantical. In that context, I believe small is inappropriate. However, as you write and as HTML5 defines it, there is nothing wrong with small per se, and I agree that as an element indicating smallprint, it works just fine. Since my initial reply might have been a bit too colored by the HTML4 definition of the element and its current usage on the web, I hereby withdraw my comment and conclude that I mostly agree with you. :-) -- Asbjørn Ulsberg -=|=- [EMAIL PROTECTED] «He's a loathsome offensive brute, yet I can't look away»
Re: [whatwg] Deprecating small , b
> I was thinking mostly about the tag's current usage on the web, which is a crazy mix between the HTML4 and HTML5 definition of the element. HTML4 defines it purely presentational, HTML5 mostly semantical. In that context, I believe small is inappropriate. However, as you write and as HTML5 defines it, there is nothing wrong with small per se, and I agree that as an element indicating smallprint, it works just fine. Since my initial reply might have been a bit too colored by the HTML4 definition of the element and its current usage on the web, I hereby withdraw my comment and conclude that I mostly agree with you. :-)

But isn't this just the reason why it should be dis-used? The HTML4 spec defined it as a styling tag, and that is how it is *mostly* used and understood by the majority of the users/authors. Just because HTML5 redefines the element does not mean that the element will suddenly be semantic. Even if people start using it purely semantically from now on (and what is the chance of that?), the existing websites still carry small-tags that are not compliant with the new definition. By redefining it the (existing) web breaks, albeit purely in the semantic area.
Re: [whatwg] Deprecating small , b
Pentasis wrote:

>> I was thinking mostly about the tag's current usage on the web, which is a crazy mix between the HTML4 and HTML5 definition of the element. HTML4 defines it purely presentational, HTML5 mostly semantical. In that context, I believe small is inappropriate. However, as you write and as HTML5 defines it, there is nothing wrong with small per se, and I agree that as an element indicating smallprint, it works just fine. Since my initial reply might have been a bit too colored by the HTML4 definition of the element and its current usage on the web, I hereby withdraw my comment and conclude that I mostly agree with you. :-)
>
> But isn't this just the reason why it should be dis-used? The HTML4 spec defined it as a styling tag, and that is how it is *mostly* used and understood by the majority of the users/authors. Just because HTML5 redefines the element does not mean that the element will suddenly be semantic. Even if people start using it purely semantically from now on (and what is the chance of that?), the existing websites still carry small-tags that are not compliant with the new definition. By redefining it the (existing) web breaks, albeit purely in the semantic area.

Note that the semantic meaning that HTML5 gives it is very weak. All it says is that the text inside the b is different from the text outside it. All the existing uses on the web that I've seen are correct according to this semantic definition. Do you have any counter examples of this, where the b was containing something that was exactly semantically the same as the surrounding content?

/ Jonas
Re: [whatwg] Deprecating small , b
On Monday, 24.11.2008, at 15:10 -0800, Jonas Sicking wrote:

> Note that the semantic meaning that HTML5 gives it is very weak. All it says is that the text inside the b is different from the text outside it. All the existing uses on the web that I've seen are correct according to this semantic definition.

Weak in this case means: not of much semantic value, probably only of presentational use. So can't we just mark all presentational elements as obsolete in a clear, consistent way, instead of trying to redefine them? Maybe put them into a presentational annex of the spec that defines rendering of obsolete elements? The thing I am concerned with is that if they are included like normal (read: semantic) elements, authors will probably use them for new pages.

Cheers, Nils
Re: [whatwg] Deprecating small , b ?
Pentasis writes:

> 2) When using small on different text-nodes throughout the document, one would expect all these text-nodes to be semantically the same. But they are not (unless all of them are copyright notices).

In printed material users are typically given no out-of-band information about the semantics of the typesetting. However, smaller things are less noticeable, and it's generally accepted that the author of the document wishes the reader to pay less attention to them than more prominent things. That works fine with small. User-agents which can't literally render smaller fonts can choose alternative mechanisms for denoting lower importance to users. There's no chance of doing this with <span class=legalese> or similar, since user-agents are unaware of the semantic they should be conveying.

> 3) small is a styling element, it has zero semantic meaning, so it does not belong inside HTML.

Denoting particular text as being of lesser importance is quite different from choosing the overall base font size (or indeed typeface) for the page, or the colour of links or headings -- that's merely expressing a preference for how graphical user-agents should render particular semantics, but the semantics themselves are conveyed to _all_ user-agents (a, h3, etc).

> 4) <b>Siemens</b> also does not tell me anything about the semantics. Is it used as a name, a brand, a foreign word, etc.? I cannot get that information from looking at the b element.

Indeed you can't. And nor can you if you were reading printed text with some words in bold. However, you would appreciate that the author had wished for some particular words to stand out from the surrounding text. Perhaps you then notice it's being done for all brand names? Or that the emboldened words spell out a secret message? However, you can only notice this if the words have been distinguished in some way. With b, all user-agents can choose to convey to users that those words are special.

Smylers
Re: [whatwg] Deprecating small , b ?
You cannot make a 100% comparison between printed and web-published styling and semantics. Apart from the obvious visual difference, we are talking here about the ability to convey semantics other than just visual ones, for example to aid machine-readability but, far more importantly, assistive technologies. If markup in web-publishing were meant to be just for visual feedback, we would only need one block and one inline element, as we can do anything with just classes and CSS in that respect. In that case you would be right, as indeed a book, newspaper or magazine can be read just fine without using markup-elements. And so can webpages. But this is not the main reason behind the semantic web.

Bert
Re: [whatwg] Deprecating small , b ?
Pentasis writes:

>> In printed material users are typically given no out-of-band information about the semantics of the typesetting. However, smaller things are less noticeable, and it's generally accepted that the author of the document wishes the reader to pay less attention to them than more prominent things. That works fine with small. User-agents which can't literally render smaller fonts can choose alternative mechanisms for denoting lower importance to users. There's no chance of doing this with <span class=legalese> or similar, since user-agents are unaware of the semantic they should be conveying.
>>
>>> 3) small is a styling element, it has zero semantic meaning, so it does not belong inside HTML.
>>
>> Denoting particular text as being of lesser importance is quite different from choosing the overall base font size (or indeed typeface) for the page, or the colour of links or headings -- that's merely expressing a preference for how graphical user-agents should render particular semantics, but the semantics themselves are conveyed to _all_ user-agents (a, h3, etc).
>>
>>> 4) <b>Siemens</b> also does not tell me anything about the semantics. Is it used as a name, a brand, a foreign word, etc.? I cannot get that information from looking at the b element.
>>
>> Indeed you can't. And nor can you if you were reading printed text with some words in bold. However, you would appreciate that the author had wished for some particular words to stand out from the surrounding text. Perhaps you then notice it's being done for all brand names? Or that the emboldened words spell out a secret message? However, you can only notice this if the words have been distinguished in some way. With b, all user-agents can choose to convey to users that those words are special.
>
> You cannot make a 100% comparison between printed and web-published styling and semantics. Apart from the obvious visual difference, we are talking about the ability here to convey semantics other than just visual.

Indeed.
> For example to aid machine-readability but far more importantly, Assistive Technologies. If markup in web-publishing was meant to be just for visual feedback, we would only need 1 block and one inline element as we can do anything with just classes and CSS in that respect.

But that would be using a styling technology (and an optional one at that) for conveying meaning. Anybody without the CSS -- or with a non-graphical user-agent, which can't render the CSS rules to the user -- will be missing out. Such users wouldn't be able to distinguish <span class=legalese> or even <span class=secret_message> from the surrounding text. Whereas if small or b are used, all user-agents can do _something_ with them. So I completely agree with what you say.

Smylers
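
Editorial illustration (not part of the original message): the fallback argument above can be sketched with a toy text-only renderer. This is a minimal sketch under stated assumptions: the marker choices (asterisks for b, parentheses for small) and the names FALLBACK_MARKERS, PlainTextRenderer and render_plain are hypothetical and do not describe how Lynx or any real user agent behaves. It only shows that an element name gives a CSS-less agent something to act on, while a class on a span does not.

```python
from html.parser import HTMLParser

# Hypothetical fallback markers for a text-only user agent.
# These are assumptions for illustration, not taken from any real browser.
FALLBACK_MARKERS = {"b": ("*", "*"), "small": ("(", ")")}


class PlainTextRenderer(HTMLParser):
    """Render markup as plain text, marking only elements it knows about."""

    def __init__(self):
        super().__init__()
        self.parts = []

    def handle_starttag(self, tag, attrs):
        # Unknown elements (including span, whatever its class) add nothing.
        self.parts.append(FALLBACK_MARKERS.get(tag, ("", ""))[0])

    def handle_endtag(self, tag):
        self.parts.append(FALLBACK_MARKERS.get(tag, ("", ""))[1])

    def handle_data(self, data):
        self.parts.append(data)


def render_plain(markup):
    renderer = PlainTextRenderer()
    renderer.feed(markup)
    renderer.close()
    return "".join(renderer.parts)


print(render_plain("Try <b>Product A</b> today."))
print(render_plain('Try <span class="brand_name">Product A</span> today.'))
```

In this sketch the b version keeps a visible distinction in the output, while the span version comes out as undifferentiated text, which is the situation described for users without style sheets.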
Re: [whatwg] Deprecating small, b ?
On Fri, Nov 14, 2008 at 06:09, Nils Dagsson Moskopp [EMAIL PROTECTED] wrote:

> The small element represents small print [...] The b element represents a span of text to be stylistically offset from the normal prose without conveying any extra importance [...] Both definitions seem rather presentational (contrasting, for example, the new semantic definition for the i element) and could also be realized by use of span elements.

Why use <span class=smallprint>Copyright (c) 2008 …</span> instead of just <small>Copyright (c) 2008 …</small>? The latter possibility is way more semantic. And why use <span class=brand>Siemens</span> instead of just <b>Siemens</b>? To me, the small and b elements – especially the former – make perfect sense.

-david
Re: [whatwg] Deprecating small, b ?
>> The small element represents small print [...] The b element represents a span of text to be stylistically offset from the normal prose without conveying any extra importance [...] Both definitions seem rather presentational (contrasting, for example, the new semantic definition for the i element) and could also be realized by use of span elements.
>
> Why use <span class=smallprint>Copyright (c) 2008 …</span> instead of just <small>Copyright (c) 2008 …</small>? The latter possibility is way more semantic. And why use <span class=brand>Siemens</span> instead of just <b>Siemens</b>? To me, the small and b elements – especially the former – make perfect sense.
>
> -david

I agree with the original poster on this.

1) Just because it makes sense to a human (it doesn't to me), does not mean it makes sense to a machine.
2) When using small on different text-nodes throughout the document, one would expect all these text-nodes to be semantically the same. But they are not (unless all of them are copyright notices).
3) small is a styling element, it has zero semantic meaning, so it does not belong inside HTML.
4) <b>Siemens</b> also does not tell me anything about the semantics. Is it used as a name, a brand, a foreign word, etc.? I cannot get that information from looking at the b element.

Bert
Re: [whatwg] Deprecating small, b ?
On Fri, 14 Nov 2008 14:40:20 +0100 Pentasis [EMAIL PROTECTED] wrote:

> I agree with the original poster on this.
>
> 1) Just because it makes sense to a human (it doesn't to me), does not mean it makes sense to a machine.
> 2) When using small on different text-nodes throughout the document, one would expect all these text-nodes to be semantically the same. But they are not (unless all of them are copyright notices).
> 3) small is a styling element, it has zero semantic meaning, so it does not belong inside HTML.
> 4) <b>Siemens</b> also does not tell me anything about the semantics. Is it used as a name, a brand, a foreign word, etc.? I cannot get that information from looking at the b element.
>
> Bert

I second that, even though it might have a zero value.

Ollie
Re: [whatwg] Deprecating small, b ?
On Fri, Nov 14, 2008 at 7:40 AM, Pentasis [EMAIL PROTECTED] wrote:

>>> The small element represents small print [...] The b element represents a span of text to be stylistically offset from the normal prose without conveying any extra importance [...] Both definitions seem rather presentational (contrasting, for example, the new semantic definition for the i element) and could also be realized by use of span elements.
>>
>> Why use <span class=smallprint>Copyright (c) 2008 …</span> instead of just <small>Copyright (c) 2008 …</small>? The latter possibility is way more semantic. And why use <span class=brand>Siemens</span> instead of just <b>Siemens</b>? To me, the small and b elements – especially the former – make perfect sense.
>>
>> -david
>
> I agree with the original poster on this.
>
> 1) Just because it makes sense to a human (it doesn't to me), does not mean it makes sense to a machine.
> 2) When using small on different text-nodes throughout the document, one would expect all these text-nodes to be semantically the same. But they are not (unless all of them are copyright notices).

Why would you expect this? Or rather, why would you expect this level of semantic specificity? small means something fairly broad that multiple types of specific semantics can fall under.

> 3) small is a styling element, it has zero semantic meaning, so it does not belong inside HTML.

It *had* zero semantic meaning. Actually, though, this wasn't quite true. The semantics that have been attached to small (and i, and b) are an approximation of the common semantics that users of the elements conferred on the contents. Text within small was, quite often, used for small print. Matching up theory with practice is a good thing here. i and b, once you subtract the semantics stolen by em and strong, are used pretty much specifically as the spec states.

> 4) <b>Siemens</b> also does not tell me anything about the semantics. Is it used as a name, a brand, a foreign word, etc.? I cannot get that information from looking at the b element.
Of course not. You're not intended to. What you *do* get, though, is that this is a word which is *intentionally* stylistically offset from the rest of the text. This conveys semantic meaning to a human - it means that the word is special or being used in a particular context. b and i don't communicate *much*, but they communicate *something*. One could, of course, also use a span to mark up and style the text, thus communicating the same intent to a person reading the styled text, but to a machine the span means literally nothing, while b and i have the possibility to communicate *something*. In addition, the fact that these elements traditionally have a particular preferred rendering means something. A dumb terminal which doesn't understand CSS won't give any indication to the user that a span exists at all, while b and i have a chance of providing fallback rendering that still accomplishes what they were designed to do. A decent chunk of html5 concerns itself with providing fallbacks and graceful degradation (or progressive enhancement, whichever way you want to look at it). Having some *nearly* semantic-free elements which have a meaningful fallback can be useful. Of course, it may certainly be more useful to you if you provide a class on the i as well. ~TJ
Re: [whatwg] Deprecating small, b ?
> Of course not. You're not intended to. What you *do* get, though, is that this is a word which is *intentionally* stylistically offset from the rest of the text. This conveys semantic meaning to a human - it means that the word is special or being used in a particular context. b and i don't communicate *much*, but they communicate *something*. One could, of course, also use a span to mark up and style the text, thus communicating the same intent to a person reading the styled text, but to a machine the span means literally nothing, while b and i have the possibility to communicate *something*. In addition, the fact that these elements traditionally have a particular preferred rendering means something. A dumb terminal which doesn't understand CSS won't give any indication to the user that a span exists at all, while b and i have a chance of providing fallback rendering that still accomplishes what they were designed to do. A decent chunk of html5 concerns itself with providing fallbacks and graceful degradation (or progressive enhancement, whichever way you want to look at it). Having some *nearly* semantic-free elements which have a meaningful fallback can be useful. Of course, it may certainly be more useful to you if you provide a class on the i as well.
>
> ~TJ

First: Computers are binary instruments. Conveying *something* is not very logical seen from a computer's point of view. It is not useful to *me* to provide a class to the i or any other element; it is useful to the computer. Humans may indeed come to some sort of conclusion based on style or strangely used semantics, but computers cannot; they (still) need a more literal means of semantics.

Second: Suppose I want to collect all copyright notices from 1000 websites (don't ask me why, I just want to), how am I to do this when they are marked up in smalls?
I will definitely end up with a lot of text that has nothing to do with copyrights (and probably miss a lot of copyright notices, as they are marked up differently). Whereas if they were marked up in (for example) <span class=copyright>, I could retrieve it all based on the class-name.

Bert
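
Editorial illustration (not part of the original message): retrieval by class name, as described above, might look roughly like the following. This is a sketch only. It assumes every author uses exactly the class name copyright, which is the premise of the argument rather than an observed fact, and the names ClassTextCollector, collect_by_class and VOID_TAGS are hypothetical helpers introduced here.

```python
from html.parser import HTMLParser

# Elements without an end tag; listed so they do not upset the nesting count.
VOID_TAGS = {"br", "hr", "img", "input", "link", "meta", "wbr"}


class ClassTextCollector(HTMLParser):
    """Collect the text of every element whose class list contains a name."""

    def __init__(self, class_name):
        super().__init__()
        self.class_name = class_name
        self.depth = 0      # > 0 while inside a matching element
        self.current = []   # text pieces of the element being collected
        self.found = []

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        classes = (dict(attrs).get("class") or "").split()
        if self.depth or self.class_name in classes:
            self.depth += 1

    def handle_endtag(self, tag):
        if tag in VOID_TAGS or not self.depth:
            return
        self.depth -= 1
        if self.depth == 0:
            self.found.append("".join(self.current).strip())
            self.current = []

    def handle_data(self, data):
        if self.depth:
            self.current.append(data)


def collect_by_class(markup, class_name):
    collector = ClassTextCollector(class_name)
    collector.feed(markup)
    collector.close()
    return collector.found


page = (
    '<p>Hello</p>'
    '<span class="copyright">Copyright (c) 2008 Example</span>'
    '<small>Terms apply.</small>'
)
print(collect_by_class(page, "copyright"))
```

The unclassed small element is skipped here, which matches the complaint that a bare small carries no hint of what kind of small print it holds.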
Re: [whatwg] Deprecating small, b ?
On Fri, Nov 14, 2008 at 9:38 AM, Pentasis [EMAIL PROTECTED] wrote:

>> Of course not. You're not intended to. What you *do* get, though, is that this is a word which is *intentionally* stylistically offset from the rest of the text. This conveys semantic meaning to a human - it means that the word is special or being used in a particular context. b and i don't communicate *much*, but they communicate *something*. One could, of course, also use a span to mark up and style the text, thus communicating the same intent to a person reading the styled text, but to a machine the span means literally nothing, while b and i have the possibility to communicate *something*. In addition, the fact that these elements traditionally have a particular preferred rendering means something. A dumb terminal which doesn't understand CSS won't give any indication to the user that a span exists at all, while b and i have a chance of providing fallback rendering that still accomplishes what they were designed to do. A decent chunk of html5 concerns itself with providing fallbacks and graceful degradation (or progressive enhancement, whichever way you want to look at it). Having some *nearly* semantic-free elements which have a meaningful fallback can be useful. Of course, it may certainly be more useful to you if you provide a class on the i as well.
>>
>> ~TJ
>
> First: Computers are binary instruments. Conveying *something* is not very logical seen from a computer's point of view. It is not useful to *me* to provide a class to the i or any other element; it is useful to the computer. Humans may indeed come to some sort of conclusion based on style or strangely used semantics, but computers cannot; they (still) need a more literal means of semantics.

If we wish to communicate that level of semantics, yes. It may not be useful to us. If you *really* need some metadata/semantics, @class probably can't convey it with enough granularity.
Check out the big discussion from a few months ago about ccRel and RDFa.

> Second: Suppose I want to collect all copyright notices from 1000 websites (don't ask me why, I just want to), how am I to do this when they are marked up in smalls? I will definitely end up with a lot of text that has nothing to do with copyrights (and probably miss a lot of copyright notices, as they are marked up differently). Whereas if they were marked up in (for example) <span class=copyright>, I could retrieve it all based on the class-name.

That would be a wonderful perfect world. I'd like the copyright date as well, so I can retrieve only things copyrighted in the last ten years. Assuming that metadata will exist is a fool's errand. The fact is that if you are searching for copyright notices, the most efficient way is likely to just search for the string "copyright" and the (c) symbol. That'll net you copyright notices with a high accuracy, and some training on real data can yield further rules to improve the data-mining accuracy.

While we're hoping for copyright notices to be marked up as <span class=copyright>, though, why not wish for <small class=copyright>? If you're going to be providing metadata, it works the same. Is it that you believe people won't provide a special class for copyrights if the small tag already gives them the preferred display? Do you believe that everyone will automatically use class=copyright to mark up their copyright notices? What if they use class=copyright-notice? Or class=license? Or any of a million other distinct possibilities that would destroy any naive attempt to datamine based on a particular class name?

~TJ
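
Editorial illustration (not part of the original message): the string-search heuristic suggested above can be sketched as follows. The pattern COPYRIGHT_HINT and the helper names TextChunks and find_copyright_text are hypothetical, and the pattern is a deliberately naive assumption that would need tuning on real pages. The point is only that the search ignores whichever element or class name the author happened to pick.

```python
import re
from html.parser import HTMLParser

# Naive assumption: a notice mentions "copyright", "(c)" or the © sign.
COPYRIGHT_HINT = re.compile(r"copyright|\(c\)|©", re.IGNORECASE)


class TextChunks(HTMLParser):
    """Gather the non-empty runs of text between tags."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        text = data.strip()
        if text:
            self.chunks.append(text)


def find_copyright_text(markup):
    parser = TextChunks()
    parser.feed(markup)
    parser.close()
    return [chunk for chunk in parser.chunks if COPYRIGHT_HINT.search(chunk)]


page = (
    '<span class="license">Copyright (c) 2008 Example</span>'
    '<small>© 2007 Other</small>'
    '<div class="copyright-notice">All rights reserved.</div>'
    '<p>Unrelated text.</p>'
)
print(find_copyright_text(page))
```

The heuristic finds the first two notices regardless of markup, but it misses the third, which carries a copyright-related class yet none of the searched strings. Neither approach is complete on its own.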
Re: [whatwg] Deprecating small, b ?
> If we wish to communicate that level of semantics, yes. It may not be useful to us. If you *really* need some metadata/semantics, @class probably can't convey it with enough granularity. Check out the big discussion from a few months ago about ccRel and RDFa.

Not yet maybe, but we could at least try to keep options open for the future.

> > Second: Suppose I want to collect all copyright notices from 1000 websites (don't ask me why, I just want to). How am I to do this when they are marked up in <small> elements? I will definitely end up with a lot of text that has nothing to do with copyrights (and probably miss a lot of copyright notices, as they are marked up differently), whereas if they were marked up in (for example) <span class=copyright> I could retrieve it all based on the class name.
>
> That would be a wonderful perfect world. I'd like the copyright date as well, so I can retrieve only things copyrighted in the last ten years.
>
> Assuming that metadata will exist is a fool's errand. The fact is that if you are searching for copyright notices, the most efficient way is likely to just search for the string "copyright" and the (c) symbol. That'll net you copyright notices with high accuracy, and some training on real data can yield further rules to improve the data-mining accuracy.

You say it yourself: your solution would work only in a perfect world where all websites were written in the same language. Unfortunately, I would miss out on all the Chinese copyright notices.

But another example (based on Siemens): wouldn't it be nice if I could tell Google I am looking for a person named Siemens, so it would ignore the brand name?

> While we're hoping for copyright notices to be marked up as <span class=copyright>, though, why not wish for <small class=copyright>? If you're going to be providing metadata, it works the same. Is it that you believe people won't provide a special class for copyrights if the <small> tag already gives them the preferred display?
>
> Do you believe that everyone will automatically use class=copyright to mark up their copyright notices? What if they use class=copyright-notice? Or class=license? Or any of a million other distinct possibilities that would destroy any naive attempt to datamine based on a particular class name?

Well, that would have to be defined in the standard, wouldn't it? I'm not saying, again, that it should be defined NOW, but at least leave the door open. I have no problem with using <small> over <span>; neither one is correct as far as I can see, in this context. Using copyright instead of license or copyright-notice would have to be defined somewhere, either in the standard or in an externally maintained document that is accepted as best practice or is standards-related.

PS: I find it very difficult to respond to rich-text/HTML messages, as they seriously mess up the indentation. Sorry, therefore, if this message is unclear because the original message and the reply are mixed up.
Re: [whatwg] Deprecating small, b ?
On Fri, Nov 14, 2008 at 10:44 AM, Pentasis [EMAIL PROTECTED] wrote:

> > If we wish to communicate that level of semantics, yes. It may not be useful to us. If you *really* need some metadata/semantics, @class probably can't convey it with enough granularity. Check out the big discussion from a few months ago about ccRel and RDFa.
>
> Not yet maybe, but we could at least try to keep options open for the future.

Of course, but I don't think having <small> in the language closes any options.

> > > Second: Suppose I want to collect all copyright notices from 1000 websites (don't ask me why, I just want to). How am I to do this when they are marked up in <small> elements? I will definitely end up with a lot of text that has nothing to do with copyrights (and probably miss a lot of copyright notices, as they are marked up differently), whereas if they were marked up in (for example) <span class=copyright> I could retrieve it all based on the class name.
> >
> > That would be a wonderful perfect world. I'd like the copyright date as well, so I can retrieve only things copyrighted in the last ten years.
> >
> > Assuming that metadata will exist is a fool's errand. The fact is that if you are searching for copyright notices, the most efficient way is likely to just search for the string "copyright" and the (c) symbol. That'll net you copyright notices with high accuracy, and some training on real data can yield further rules to improve the data-mining accuracy.
>
> You say it yourself: your solution would work only in a perfect world where all websites were written in the same language. Unfortunately, I would miss out on all the Chinese copyright notices.

Of course. But would you expect Chinese speakers to use class=copyright on their pages anyway?

> But another example (based on Siemens): wouldn't it be nice if I could tell Google I am looking for a person named Siemens, so it would ignore the brand name?

Certainly. But at this point you're expecting authors to mark up their pages with metadata every time they mention someone's name. The use of <b> doesn't prevent this, but your use case certainly requires quite a lot more.

> > While we're hoping for copyright notices to be marked up as <span class=copyright>, though, why not wish for <small class=copyright>? If you're going to be providing metadata, it works the same. Is it that you believe people won't provide a special class for copyrights if the <small> tag already gives them the preferred display?
> >
> > Do you believe that everyone will automatically use class=copyright to mark up their copyright notices? What if they use class=copyright-notice? Or class=license? Or any of a million other distinct possibilities that would destroy any naive attempt to datamine based on a particular class name?
>
> Well, that would have to be defined in the standard, wouldn't it? I'm not saying, again, that it should be defined NOW, but at least leave the door open. I have no problem with using <small> over <span>; neither one is correct as far as I can see, in this context. Using copyright instead of license or copyright-notice would have to be defined somewhere, either in the standard or in an externally maintained document that is accepted as best practice or is standards-related.

Okay, then we have no issue with <small>. There has been some discussion, btw, about standardizing a set of normative class names. You should be able to turn something up about it.

> PS: I find it very difficult to respond to rich-text/HTML messages, as they seriously mess up the indentation. Sorry, therefore, if this message is unclear because the original message and the reply are mixed up.

No problem; it was clear enough. The only rich text I use is quote levels, and with the conversation context nearby anyway, it's not difficult to puzzle out when it occasionally messes up.

~TJ