Re: [Lynx-dev] lynx un-renders

2018-05-27 Thread russellbell
Quoth Thorsten Glaser, 'can you please use hexadecimal
numbers' 
 Sorry!  lynx uses hex sometimes, dec others.  129 = 0x81

Quoth Thorsten Glaser, 'I've only ever seen those used by
windows codepage 1252 users'.
It shows up rarely.  I can't make sense of why.  There are
definitely mistakes in other pages on the same sites.  I don't know
what codepage the authors use.  Since 'high octet preset' is an
instruction to formatters that, as far as I can tell, lynx doesn't
handle, I silence it.  I mention it in this forum in case others like
the idea.
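
A minimal sketch of what "silencing" could look like, assuming the text
is already decoded into an array of code points.  The name drop_hop and
the in-place filter are hypothetical illustrations, not lynx's actual
code or configuration:

```c
#include <stddef.h>
#include <stdint.h>

/* U+0081, which the Unicode name aliases call HIGH OCTET PRESET. */
#define HIGH_OCTET_PRESET 0x81u

/*
 * Hypothetical helper (not part of lynx): remove every U+0081 from a
 * buffer of code points, compacting it in place.
 * Returns the new number of code points in the buffer.
 */
size_t drop_hop(uint32_t *cps, size_t n)
{
    size_t out = 0;

    for (size_t i = 0; i < n; i++) {
        if (cps[i] != HIGH_OCTET_PRESET)
            cps[out++] = cps[i];   /* keep everything else unchanged */
    }
    return out;
}
```

Called on the four code points 0x41, 0x81, 0x42, 0x81 it keeps 0x41 and
0x42 and returns 2.
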
From a Unicode doc:

'# PADDING CHARACTER and HIGH OCTET PRESET represent
# architectural concepts initially proposed for early
# drafts of ISO/IEC 10646-1. They were never actually
# approved or standardized: hence their designation
# here as the "figment" type. Formal name aliases
# (and corresponding abbreviations) for these code
# points are included here because these names leaked
# out from the draft documents and were published in
# at least one RFC whose names for code points was
# implemented in Perl regex expressions.'

russell bell

PS
dec   hex   Description
129   81    high octet preset
699   2bb   'commaturnedmod' or 'Modifier Turned Comma'
7996  1f3c  'Greek Capital Letter Iota With Psili And Oxia'
1013  3f5   'lunate epsilon'
8634  21ba  'Counter-Clockwise Arrow'
8764  223c  'sim'
8943  22ef  'Midline Horizontal Ellipsis'
9398  24b6  'Circled Latin Capital Letter A'
9679  25cf  circlefilled or blackcircle
9764  2624  'Caduceus' or "Kerykeion"
8203  200b  zero-width space
7879  1ec7  'Latin Small Letter E With Circumflex And Dot Below'


___
Lynx-dev mailing list
Lynx-dev@nongnu.org
https://lists.nongnu.org/mailman/listinfo/lynx-dev


Re: [Lynx-dev] lynx un-renders

2018-05-27 Thread Thorsten Glaser
Hi russellbell,

can you please use hexadecimal numbers in threads like this one,
so it is easier for others to find the mentioned character in the
Unicode database? Thanks in advance!

>I think it appears by accident every time I see it.

I’ve only ever seen those used by windows codepage 1252 users…
I usually render the C1 control characters thus:

/* 0x80 */  0x20AC, 0x278A, 0x201A, 0x0192, 0x201E, 0x2026, 0x2020, 0x2021,
/* 0x88 */  0x02C6, 0x2030, 0x0160, 0x2039, 0x0152, 0x278B, 0x017D, 0x278C,
/* 0x90 */  0x278D, 0x2018, 0x2019, 0x201C, 0x201D, 0x2022, 0x2013, 0x2014,
/* 0x98 */  0x02DC, 0x2122, 0x0161, 0x203A, 0x0153, 0x278E, 0x017E, 0x0178,

I’m using 0x278A‥0x278E for those undefined even in cp1252 so
that there’s at least some indication of what went wrong. C1
control characters generally should not be interpreted, or even
sent out as part of text files, so this is the safe way to do
it, and rendering it as if it were miscalculated cp1252 makes
most of its uses legible.
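
The table above can be wrapped in a lookup along these lines.  This is
only a sketch: the array values are copied from the message, while the
names c1_as_cp1252 and render_c1 are assumptions made for illustration
and are not taken from lynx or any other program:

```c
#include <stdint.h>

/* Values as quoted in the message; index = code point - 0x80.
 * U+278A..U+278E stand in for the five positions cp1252 leaves
 * undefined (0x81, 0x8D, 0x8F, 0x90, 0x9D). */
static const uint16_t c1_as_cp1252[32] = {
/* 0x80 */  0x20AC, 0x278A, 0x201A, 0x0192, 0x201E, 0x2026, 0x2020, 0x2021,
/* 0x88 */  0x02C6, 0x2030, 0x0160, 0x2039, 0x0152, 0x278B, 0x017D, 0x278C,
/* 0x90 */  0x278D, 0x2018, 0x2019, 0x201C, 0x201D, 0x2022, 0x2013, 0x2014,
/* 0x98 */  0x02DC, 0x2122, 0x0161, 0x203A, 0x0153, 0x278E, 0x017E, 0x0178,
};

/*
 * Hypothetical helper: return the code point to display for cp.
 * C1 controls (U+0080..U+009F) are replaced via the table; every
 * other code point is passed through unchanged.
 */
uint32_t render_c1(uint32_t cp)
{
    if (cp >= 0x80 && cp <= 0x9F)
        return c1_as_cp1252[cp - 0x80];
    return cp;
}
```

With this, render_c1(0x80) gives 0x20AC (the euro sign) and
render_c1(0x81) gives 0x278A, the first of the marker glyphs.
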

Note that this applies equally for undefined uses of all three
of: bare \x80 in the HTML, bare \xC2\x80 (U+0080) in the HTML,
and use of entities like &#128; and &#x80; in the HTML. These
all should be rendered the same as &euro; for… acceptance
that the word “web designer” is a curse word denoting idiots.
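
To illustrate why these spellings can be treated alike: each one comes
down to the same code point once decoded.  A rough sketch, assuming two
hypothetical helpers (decode_utf8_2 and parse_numeric_ref are names
invented here, not lynx functions), and using strtol loosely, so it is
more permissive than a real HTML parser would be:

```c
#include <stdlib.h>
#include <string.h>

/*
 * Decode a two-byte UTF-8 sequence (110xxxxx 10xxxxxx).
 * Returns the code point, or -1 if the bytes are not a valid,
 * non-overlong two-byte sequence.
 */
long decode_utf8_2(unsigned char b0, unsigned char b1)
{
    long cp;

    if ((b0 & 0xE0) != 0xC0 || (b1 & 0xC0) != 0x80)
        return -1;
    cp = ((long)(b0 & 0x1F) << 6) | (long)(b1 & 0x3F);
    return cp < 0x80 ? -1 : cp;    /* reject overlong forms */
}

/*
 * Parse a numeric character reference, decimal or hexadecimal,
 * given as a complete string including the trailing ';'.
 * Returns the code point, or -1 if the string is malformed.
 */
long parse_numeric_ref(const char *s)
{
    int base = 10;
    char *end;
    long cp;

    if (strncmp(s, "&#", 2) != 0)
        return -1;
    s += 2;
    if (*s == 'x' || *s == 'X') {
        base = 16;
        s++;
    }
    cp = strtol(s, &end, base);
    if (end == s || *end != ';' || cp < 0)
        return -1;
    return cp;
}
```

Both decode_utf8_2(0xC2, 0x80) and a decimal or hexadecimal reference
for 128 yield 0x80, the same value as the bare byte, so a single
rendering rule for U+0080..U+009F covers all three cases.
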

bye,
//mirabilos
-- 
[...] if maybe ext3fs wasn't a better pick, or jfs, or maybe reiserfs, oh but
what about xfs, and if only i had waited until reiser4 was ready... in the be-
ginning, there was ffs, and in the middle, there was ffs, and at the end, there
was still ffs, and the sys admins knew it was good. :)  -- Ted Unangst über *fs



[Lynx-dev] lynx un-renders

2018-05-27 Thread russellbell

'Hop over protocol' is an instruction to the renderer for
formatting.  I notice that def7_uni.tbl comments out its rendering as
'HO':  good call.  I make lynx render it as nothing.  I think it appears by
accident every time I see it.

russell bell

