2011-06-14 10:32, Ian Hickson wrote:

On Thu, 10 Mar 2011, Jukka K. Korpela wrote:
[...]
A sentence in the text may continue with list items, displayed e.g. as a
bulleted list. So the list breaks the paragraph as a block of text but
not logically - the list items are part of the sentence just as they
would be if they were just mentioned in the text, for example using 1)
numbers in the text, 2) letters in the text, or 3) no special notation.

Indeed, but "block of text" is pretty much what a paragraph is -- it isn't
a logical construct.

The word "paragraph" is ambiguous, as the current wording says indirectly but clearly: it states "The p element represents a paragraph", but the word "paragraph" links to the following:

"The term paragraph as defined in this section is distinct from (though related to) the p element defined later. The paragraph concept defined here is used to describe how to interpret documents.

A paragraph is typically a run of phrasing content that forms a block of text with one or more sentences that discuss a particular topic, as in typography, but can also be used for more general thematic grouping. For instance, an address is also a paragraph, as is a part of a form, a byline, or a stanza in a poem."

So it says that p is a paragraph, linking to an explanation that says paragraph is different from p. The explanation mentions "the term paragraph as defined in this section", but it does not give a definition - a sentence that begins with "A paragraph is typically" is a prelude to a definition, not a definition.

I gather that "The p element represents a paragraph" more or less means "the p element denotes a block of text". Can you make this more explicit, please? This is very confusing even to people who regularly read specifications for breakfast. In the current wording, there are _two_ dimensions of vagueness in "paragraph": whether it is the classical concept of text that discusses one topic or the layout concept roughly corresponding to the old HTML block concept; and whether it is about explicitly marked-up elements (<p>...</p>) or more generally about constructs whose "paragraphness" might be inferred by some rules.

It would probably be best to dispense with the word "paragraph", as many people can't avoid thinking that a paragraph is something logical, not the layout concept of a block of text. But unfortunately, in HTML heritage, the p element is not _purely_ a block of text. In addition to the name and old descriptions, it is associated with the logical concept of paragraph, since p elements have default top and bottom margins. So they differ from div elements. A div element containing only text can be characterized as a block of text, and so can a p element, but there's a difference.

Maybe something like the following might express this:

The p element represents a block of text so that consecutive p elements are regarded as topically distinct. The name p comes from "paragraph", and the p element typically corresponds to a paragraph in prose, i.e. a subdivision of text that deals with one point or gives the words of one speaker in a discussion. However, the p element is also used for other thematic grouping, for example for a byline, for a mailing address, for a label and associated field in a form, or for a stanza in a poem.

Visual browsers are expected to render p elements by default with empty lines before and after, caused by default top and bottom margins.

a) for styling purposes (you need a container element so that you can
specify, without clumsily using classes on both the P and the UL, e.g.
that vertical spacing be reduced or zero)

<div> handles this case:

    <div>This sentence
     <ol>
      <li>contains
      <li>a list
     </ol>
    ...and is made of four paragraphs but can be styled as one since the
    &lt;div> element is used instead of &lt;p>.</div>

But if this follows, for example, a table, then extra measures would be needed to create vertical spacing. Using the p element would make the spacing the default. Similarly for spacing after this construct. So it would be more robust to use <p>...</p> markup here. Or you would need to assign style properties to the div element, effectively making it formatted the same way as p elements normally are, in your document.
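A minimal sketch of that last option, for illustration only. The class name "para" is hypothetical, and the 1em value is an assumption that approximates the usual default margins of p elements; actual browser defaults may differ:

    <style>
      /* Hypothetical class: give the div the vertical spacing
         that p elements normally get by default (assumed 1em). */
      div.para { margin: 1em 0; }
      /* Optional: remove the list's own margins so the sentence
         and its list are spaced as one unit. */
      div.para > ul { margin: 0; }
    </style>
    <table><tr><td>A table preceding the text</td></tr></table>
    <div class="para">This sentence
     <ul>
      <li>contains</li>
      <li>a list</li>
     </ul>
    ...and continues after it.</div>

Without the first rule, the div would sit directly against the table, which is the robustness problem described above.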

I don't think anonymous blocks of text are a good idea. There was a reason why they were frowned upon in HTML 4.01. After years of favoring <p>...</p> as a container, as opposed to the original idea that allowed <p> as an empty element indicating a paragraph break, it seems odd to give so many examples with "loose" text.

So I hope an example like the above but with <p>...</p> markup can be added, to answer the common question (which is often formulated in terms of a "list header", but it's really about something that starts as a paragraph and then moves to listing things down as a bulleted list).

Maybe an explanation like this might be added (perhaps even after the definition of p, as it really clarifies the concept):

Within a p element, only phrasing content (previously called "text-level" content) is allowed. This implies that it cannot contain a list element or a table element, for example. A part of a document that discusses one topic is normally marked up as one p element, but if it contains lists, for example, it needs to be marked up as one or more p elements intermixed with (not containing) one or more lists. The part may be marked up as a div element to group the elements for the purposes of styling and scripting, for example:

<div class="p">
<p>This is text, which may be just a list header (introduction to
the list) or a longer presentation.</p>
<ul>
  <li>an item</li>
  <li>another item</li>
</ul>
<p>Here we may have text that logically continues the discussion
of the topic.</p>
</div>

* * *

I know this suggestion is long and raw, but I hope its basic content is something we can agree on. And I have no big problem with using div markup here, even though it somewhat goes against the spirit of modern HTML.

--
Yucca, http://www.cs.tut.fi/~jkorpela/
