2009/3/9 Stephan Beal <[email protected]>:
>
> On Mon, Mar 9, 2009 at 8:41 PM, Erik Corry <[email protected]> wrote:
>> 2009/3/9 Stephan Beal <[email protected]>:
>> Here's the text from v8.h:
>>
>>  * Allocates a new string from either utf-8 encoded or ascii data.
>>  * The second parameter 'length' gives the buffer length.
>>  * If the data is utf-8 encoded, the caller must
>>  * be careful to supply the length parameter.
>>  * If it is not given, the function calls
>>  * 'strlen' to determine the buffer length, it might be
>>  * wrong if 'data' contains a null character.
>>  */
>
> Aha, okay i wasn't clear on the automatic assumption to utf8. Fair enough.
>
>> So it will assume that it is UTF-8 if it is not ASCII. Not all binary
>> sequences are valid UTF-8 so you can't use this for binary data.
>> Internally, V8 does not use UTF-8 so this data will be converted to
>> UC16.
>
> Doh, and here all along i assumed utf8 was what WAS used, as the API
> has Utf8Value but no Utf16Value.
>
>> /** Allocates a new string from utf16 data.*/
>> static Local<String> New(const uint16_t* data, int length = -1);
>>
>> This one takes 16 bit characters and can represent binary data with no
>> corruption, but the length is in characters, so you can't use it for
>> an odd number of bytes.
>
> What's the byte order?

Native.

>
>>> In my case i'm working on an i/o library which of course treats the
>>> data as opaque (void*). If i understand you correctly, if it happens
> ...
>> Giving binary data to the above New method will result in undefined
>> behaviour.
>
> Fair enough.
>
>> The external strings must have their data either in ASCII or in UC16.
>> There's no Latin1 and undefined stuff will result if you try. In the
>> case of an external string the actual string data is not on the V8
>> heap. It is assumed to be immutable too of course since all JS
>> strings are immutable.
>
> That wouldn't solve my case, which is effectively latin1. i'll need to
> think about that (but don't mind living with the limitation of ascii
> read/write).
>
>>> That's an idea. Didn't think of that. It'd mean (in my case) buffering
>>> arbitrarily large read buffers, and since v8 doesn't guarantee GC will
>>> ever be called, i don't want to risk it causing an arbitrarily-sized
>>> leak.
>>
>> If the data is on the V8 heap then it won't be collected without a GC
>> either. :)
>
> But even if i registered it for gc via a weak pointer callback, it's
> not guaranteed to be freed, so i'm forced to add external gc to it in
> *any* case and have the client call the cleanup routine when their
> context dies (this is currently handled via a sentry object in the
> client app which cleans up when it goes out of scope).

If there is a global GC then the weak pointer callbacks will be
called. If there is no global GC then strings on the heap may not be
collected either. Did you notice that recent V8s have had logic added
to do a GC on context creation if there was a context that died since
the last context creation? (The slightly convoluted logic is there to
avoid a wave of expensive global GCs when a series of contexts are
destroyed at around the same time.)

>
> --
> ----- stephan beal
> http://wanderinghorse.net/home/stephan/
>

-- 
Erik Corry, Software Engineer
Google Denmark ApS. CVR nr. 28 86 69 84
c/o Philip & Partners, 7 Vognmagergade, P.O. Box 2227, DK-1018
Copenhagen K, Denmark.

--~--~---------~--~----~------------~-------~--~----~
v8-users mailing list
[email protected]
http://groups.google.com/group/v8-users
-~----------~----~----~----~------~----~------~--~---
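
For reference, a minimal sketch of the byte-packing point discussed above: raw bytes copied into 16-bit units in native byte order, with an odd trailing byte padded by a zero. The helper names PackBytes and UnpackBytes are hypothetical and not part of V8. The sketch only prepares the uint16_t buffer that the quoted New(const uint16_t*, int) overload would take, and it assumes the caller tracks the original byte count separately, since the length handed to V8 is in 16-bit characters.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical helper (not a V8 API): copy raw bytes into 16-bit units
// using the machine's native byte order. If byte_len is odd, the last
// unit is padded with a zero byte, so the caller must keep byte_len.
std::vector<uint16_t> PackBytes(const void* data, std::size_t byte_len) {
  std::vector<uint16_t> units((byte_len + 1) / 2, 0);
  if (byte_len > 0) {
    std::memcpy(units.data(), data, byte_len);
  }
  return units;
}

// Hypothetical inverse: recover exactly byte_len bytes from the units,
// dropping the padding byte if the original length was odd.
std::vector<unsigned char> UnpackBytes(const std::vector<uint16_t>& units,
                                       std::size_t byte_len) {
  assert(byte_len <= units.size() * sizeof(uint16_t));
  std::vector<unsigned char> bytes(byte_len);
  if (byte_len > 0) {
    std::memcpy(bytes.data(), units.data(), byte_len);
  }
  return bytes;
}
```

Because the copy uses native byte order, a buffer packed on one machine only round-trips on a machine with the same endianness. Data that crosses machines would need an explicit byte order instead.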
