Here we see that only lynx always renders correctly. Firefox and w3m render correctly only for ASCII.
$ cat r.htm
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN"
"http://www.w3.org/TR/html4/strict.dtd">
<html>
<head>
<meta http-equiv="Content-Type" content=
"text/html; charset=utf-8">
<title>test</title>
</head>
<body>
<p>國 國,國國;A A,AA;A A,AA</p>
</body>
</html>
$ lynx -dump r.htm
國 國,國國;A A,AA;A A,AA
$ w3m -dump r.htm
國國,國國;AA,AA;A A,AA
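The comparison above can also be checked mechanically. The following is only a minimal sketch: it assumes that "correct" rendering means each run of whitespace inside the paragraph ends up as exactly one space, which is what the lynx output suggests. The helper name `collapse_whitespace` is made up for this illustration, and the strings are copied from the transcript above.

```python
import re


def collapse_whitespace(text: str) -> str:
    """Collapse each run of spaces, tabs and line breaks into one space.

    Hypothetical helper for illustration; it models the assumed
    "correct" behaviour, not the actual code of any browser.
    """
    return re.sub(r"[ \t\r\n\f]+", " ", text).strip()


# Paragraph text and browser dumps, copied from the transcript.
paragraph = "國 國,國國;A A,AA;A A,AA"
lynx_dump = "國 國,國國;A A,AA;A A,AA"
w3m_dump = "國國,國國;AA,AA;A A,AA"

expected = collapse_whitespace(paragraph)

print(lynx_dump == expected)  # True: lynx keeps every space
print(w3m_dump == expected)   # False: w3m dropped two of the spaces
```

Under that assumption, a line break between two characters, as in `collapse_whitespace("國\n國")`, would also be expected to come out as a single space.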