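Hi,

Node's console.log() uses v8::String::WriteUtf8() internally. Unfortunately, that function only supports characters in the BMP (Basic Multilingual Plane), so characters outside it are not written correctly.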
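For reference, here is a minimal sketch of the arithmetic behind the numbers in this thread. The helper name toSurrogatePair is made up for illustration and is not part of V8 or Node. It assumes the standard UTF-16 rule for code points above 0xFFFF, and it also shows the 16-bit truncation that String.fromCharCode() applies to 0x2910e.

```javascript
// Hypothetical helper (not a V8 or Node API): split a code point above
// 0xFFFF into its UTF-16 high and low surrogate code units.
function toSurrogatePair(codePoint) {
  const offset = codePoint - 0x10000;        // 20-bit value
  const high = 0xd800 + (offset >> 10);      // top 10 bits
  const low = 0xdc00 + (offset & 0x3ff);     // bottom 10 bits
  return [high, low];
}

const [high, low] = toSurrogatePair(0x2910e);

// Building the string from the pair keeps both code units.
const fromPair = String.fromCharCode(high, low);

// Passing the full code point truncates it to 16 bits (0x2910e -> 0x910e).
const truncated = String.fromCharCode(0x2910e);

console.log(high.toString(16), low.toString(16));      // d864 dd0e
console.log(fromPair.length);                          // 2
console.log(truncated.length);                         // 1
console.log(truncated.charCodeAt(0).toString(16));     // 910e
```

This only covers the string side. Whether the pair is then written out as a single 4-byte UTF-8 sequence depends on the host's output path, which is the part at issue here.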
http://code.google.com/p/v8/issues/detail?id=761

On Thu, 11 Aug 2011 12:57:05 -0500, Marcel Laverdet <[email protected]> wrote:
> console.log() and document.write() are not parts of v8. These are host
> functions and have different implementations in Chrome and in NodeJS. Chrome
> seems to have a very robust implementation of both, which is aware of
> surrogate pairs and the target encoding. NodeJS on the other hand fails to
> respect surrogate pairs.
>
> Your examples don't show too much other than the fact that
> String.fromCharCode() will not generate surrogate pairs and therefore can
> only generate characters with a 16 bit codepoint. You'll see the same
> results in NodeJS.
>
> > '??' === String.fromCharCode(0xd864, 0xdd0e)
> true
>
> On Thu, Aug 11, 2011 at 12:05 PM, ~flow <[email protected]> wrote:
>
> > so i went and put the same javascript into an HTML page to be displayed by
> > chrome and into a standalone js snippet to be run using nodejs:
> >
> > var f = function( text ) {
> >   document.write( '<h1>', text, '</h1>' );
> >   document.write( '<div>', text.length, '</div>' );
> >   document.write( '<div>0x', text.charCodeAt(0).toString( 16 ), '</div>' );
> >   document.write( '<div>0x', text.charCodeAt(1).toString( 16 ), '</div>' );
> >   console.log( '<h1>', text, '</h1>' );
> >   console.log( '<div>', text.length, '</div>' );
> >   console.log( '<div>0x', text.charCodeAt(0).toString( 16 ), '</div>' );
> >   console.log( '<div>0x', text.charCodeAt(1).toString( 16 ), '</div>' );
> > };
> >
> > f( '??' );
> > f( String.fromCharCode( 0x2910e ) );
> > f( String.fromCharCode( 0xd864, 0xdd0e ) );
> >
> > in function f(), those document.write() calls are only present in the HTML
> > document, not the standalone.
> >
> > i want to show here that something more fundamental must be different
> > between javascript running inside google chrome and javascript running
> > inside nodejs. because, you see, the output i get inside chrome looks like
> > this:
> >
> > ??
> > 2
> > 0xd864
> > 0xdd0e
> > ?
> > 1
> > 0x910e
> > 0xNaN
> > ??
> > 2
> > 0xd864
> > 0xdd0e
> >
> > the second character is silently truncated (notice how the chr code is
> > reported as 0x910e where it should be 0x2910e) which is sad, but both
> > using a string literal and a numerical surrogate pair works---both in the
> > HTML page and in chrome's console output! conversely, in nodejs, this is
> > what i get:
> >
> > <h1> ? </h1>
> > <div> 1 </div>
> > <div>0x fffd </div>
> > <div>0x NaN </div>
> > <h1> ? </h1>
> > <div> 1 </div>
> > <div>0x 910e </div>
> > <div>0x NaN </div>
> > <h1> ?????</h1>
> > <div> 2 </div>
> > <div>0x d864 </div>
> > <div>0x dd0e </div>
> >
> > the silver lining here is that v8 inside nodejs does preserve the surrogate
> > pair, even though it fails to output it correctly. however, the
> > console.log() method gets it completely wrong. may i add that the analog in
> > python 3.1 does work---since i use a 'narrow' python build, it also reports
> > a string '??' as being two characters long, and manages to print it out
> > correctly, which seems to tell me that my ubuntu gnome terminal knows how to
> > handle surrogate pairs.
> >
> > i could perfectly live with those surrogate pairs---they're a nuisance but
> > i know how to deal with them from years of experience with python. the
> > really sad thing here is that nodejs's v8 seems to fall short on something
> > that v8 can be demonstrated to do correctly when running inside chrome.
> >
> > that said, let me add that i sometimes worry about the unneeded complexity
> > that goes into implementations. why can't people just use a 32bit wide
> > character datatypes? instead they make users jump to all kinds of gratuitous
> > hoops.
> >
> > --
> > v8-users mailing list
> > [email protected]
> > http://groups.google.com/group/v8-users

--
{
  name: "Koichi Kobayashi",
  mail: "[email protected]",
  blog: "http://d.hatena.ne.jp/koichik/",
  twitter: "@koichik"
}
