Re: [v8-users] Re: should internalizing a string maintain external data when possible?

2023-04-17 Thread Bruce MacNaughton
I think I've had a bit of a misconception; I was jumbling together 
"external" and our Persistent extension. Externalizing the internalized 
string wouldn't help because the real issue (for us) is to have the 
internalized string built on our subclass with the Persistent property. And 
that doesn't seem possible.


-- 
-- 
v8-users mailing list
v8-users@googlegroups.com
http://groups.google.com/group/v8-users
--- 
You received this message because you are subscribed to the Google Groups 
"v8-users" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to v8-users+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/v8-users/9e08c09e-dfdf-4d1b-816b-9b82e14f2c12n%40googlegroups.com.


Re: [v8-users] Re: should internalizing a string maintain external data when possible?

2023-04-17 Thread Bruce MacNaughton
The goal (what we've used them for now) is to have a unique copy of a 
string so we can track the source of a string. We attach 
Persistent properties to the string so that a string from one 
source can be differentiated from a string from another source. 
Specifically, we track strings that are user-input.

I was a little fast and loose with the comment about "add some external 
data to x". More precisely, I create a subclass that extends 
v8::String::ExternalStringResource and the subclass has a 
Persistent member.

Is there another way to differentiate between two strings of identical 
form/content/representation?

On Monday, April 17, 2023 at 8:40:23 AM UTC-7 pth...@google.com wrote:

> The purpose of internalized strings is to have a canonical, unique 
> representation of a string. So yes, if a string being internalized 
> duplicates an already internalized value, we "throw away"/discard that 
> duplicate and point it to the unique string instead (i.e. the string 
> becomes a ThinString).
> If you want to take ownership of the buffer of an internalized string, you 
> can do so by externalizing the internalized string (as opposed to 
> internalizing an external string).
>
> Maybe you can tell us what your intent is? Because you mentioned "add some 
> external data to x" I am not sure if external strings are really what you 
> are looking for. The purpose of external strings is only to allow embedders 
> to have ownership of the string buffer, not to attach arbitrary data to 
> strings.
>



[v8-users] Re: should internalizing a string maintain external data when possible?

2023-04-17 Thread Bruce MacNaughton
"(either one or two byte) because defaulting to throw away the external 
data" should be "(either one or two byte) BEFORE defaulting to throw away 
the external data"




[v8-users] should internalizing a string maintain external data when possible?

2023-04-17 Thread Bruce MacNaughton
This is my first relatively deep dive into v8 strings; if I misinterpreted 
what's going on, please correct me.

When a string is used as a key:

const x = 'xyzzy';
// add some external data to x
const obj = { [x]: 1234 };

the external data is lost because StringTable::LookupKey() finds an entry 
(data->FindEntry()) and returns a Handle to it. That String is 
*not* internalized, so String::MakeThin() is called on the original string 
(as 'this') with the Handle returned by FindEntry(). MakeThin() finds 
this->IsExternalString() true, so it calls MigrateExternalString(). There, 
'internalized' is *not* an External string, so FinalizeExternalString() is 
called, discarding the external data.

whew.
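
For anyone following the call chain above, here is a toy model of it in plain C++. It is only a sketch under stated assumptions: ToyString, ToyStringTable, and ExternalResource are made-up names, not V8 types, and the real LookupKey()/MakeThin()/MigrateExternalString()/FinalizeExternalString() paths do considerably more. It only illustrates why a second string with the same characters ends up forwarding to the first and loses its resource.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <unordered_map>
#include <utility>

// Hypothetical embedder data attached to a string (not a V8 type).
struct ExternalResource {
  std::string origin;  // e.g. "user-input"
};

// Toy string: "external" if it owns a resource, "thin" if it forwards
// to a canonical string.
struct ToyString {
  explicit ToyString(std::string c,
                     std::unique_ptr<ExternalResource> r = nullptr)
      : chars(std::move(c)), resource(std::move(r)) {}

  bool IsExternal() const { return resource != nullptr; }
  bool IsThin() const { return actual != nullptr; }

  std::string chars;
  std::unique_ptr<ExternalResource> resource;
  ToyString* actual = nullptr;
};

class ToyStringTable {
 public:
  // Returns the canonical string for s->chars. If an entry with the same
  // characters already exists, s is turned into a thin string and its
  // resource is dropped.
  ToyString* Internalize(ToyString* s) {
    assert(s != nullptr);
    auto it = table_.find(s->chars);
    if (it == table_.end()) {
      table_.emplace(s->chars, s);
      return s;
    }
    ToyString* canonical = it->second;
    if (canonical != s) {
      s->resource.reset();    // stands in for FinalizeExternalString()
      s->actual = canonical;  // stands in for MakeThin()
    }
    return canonical;
  }

 private:
  std::unordered_map<std::string, ToyString*> table_;
};
```

With a table that already holds an untagged "xyzzy", calling Internalize() on a tagged "xyzzy" returns the untagged one, and the tagged string then reports IsThin() and no longer IsExternal().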

It seems that, to propagate the external data, LookupString() should 
reconstruct result as an External string when string->IsExternal() and 
!result->IsExternal().

It may be that the way it works is intended, but it's not clear because 
MigrateExternalString() in string.cc checks to see if 'internalized' is an 
external string (either one or two byte) because defaulting to throw away 
the external data.

But maybe there are considerations with the shared heap that I have yet to 
understand.

Basic question: is this a bug or intentional design?




[v8-users] v8::Isolate::GetCurrent() - when can it return null

2023-04-04 Thread Bruce MacNaughton
I've written a node extension that works with private data. I see from the 
function signature that v8::Isolate::GetCurrent() can return NULL, but I 
don't understand how that's possible, given that my extension is called by 
JavaScript and has to be running in an isolate.

The v8::Isolate::Current() function has a DCHECK_NOT_NULL, which I infer 
means that it shouldn't happen. But I don't know when debug checks would be 
enabled, so that could be a bad inference.

Does it return NULL in specific circumstances, like startup, thread 
termination, etc.? Can it return 
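
A rough way to picture the question, as a sketch only: if "current isolate" is per-thread state that is set while an isolate is entered, then a getter can see null on any thread (or at any point) where nothing has been entered. The names below (ToyIsolate, ToyIsolateScope, GetCurrentOrNull) are hypothetical and this is not V8's implementation; whether a given V8 version can actually hand back null to an addon is the open question in this message.

```cpp
#include <cassert>

// Toy stand-in for an isolate; not a V8 type.
struct ToyIsolate {};

// Per-thread "current isolate" pointer. A thread that never constructs a
// ToyIsolateScope keeps the initial nullptr.
thread_local ToyIsolate* g_current = nullptr;

ToyIsolate* GetCurrentOrNull() { return g_current; }

// RAII guard standing in for entering and exiting an isolate on this thread.
class ToyIsolateScope {
 public:
  explicit ToyIsolateScope(ToyIsolate* isolate) : previous_(g_current) {
    assert(isolate != nullptr);
    g_current = isolate;
  }
  ~ToyIsolateScope() { g_current = previous_; }

  ToyIsolateScope(const ToyIsolateScope&) = delete;
  ToyIsolateScope& operator=(const ToyIsolateScope&) = delete;

 private:
  ToyIsolate* previous_;  // restored on exit so scopes can nest
};
```

In this model, code called from inside a scope always sees a non-null pointer, which matches the intuition that a function invoked from JavaScript is running in an isolate; only code running outside any scope sees null.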



Re: [v8-users] internal fields

2017-11-04 Thread Bruce MacNaughton

>
> Does it use Wrap, and/or are the classes subclassed from ObjectWrap? Wrap 
> uses internal field 0 to store the class so it can be later unwrapped from 
> the V8 object.
> https://github.com/nodejs/node/blob/master/src/node_object_wrap.h#L75 
> (near that is also SetWeak reference)


Yes, I think you just answered my question. So
 object->Wrap(info.This());

in the Object::New() method stores the instance in internal field 0, right?
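
To make the wrap/unwrap idea concrete, here is a small self-contained model. It is a sketch, not node's ObjectWrap: ToyObject, ToyObjectWrap, and Counter are made-up names, and a std::vector stands in for the internal field slots that SetInternalFieldCount() reserves.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy stand-in for a JS object created from a template with a given
// internal field count; not a V8 type.
struct ToyObject {
  explicit ToyObject(std::size_t internal_field_count)
      : internal_fields(internal_field_count, nullptr) {}
  std::vector<void*> internal_fields;
};

// Simplified model of the pattern: slot 0 holds the C++ instance.
class ToyObjectWrap {
 public:
  virtual ~ToyObjectWrap() = default;

  // Stores this instance in slot 0 of the object.
  void Wrap(ToyObject& object) {
    assert(!object.internal_fields.empty());  // needs a count of at least 1
    object.internal_fields[0] = this;
  }

  // Reads slot 0 back and casts it to the concrete wrapper type.
  template <typename T>
  static T* Unwrap(const ToyObject& object) {
    assert(!object.internal_fields.empty());
    auto* base = static_cast<ToyObjectWrap*>(object.internal_fields[0]);
    return static_cast<T*>(base);
  }
};

// Hypothetical wrapped class used only for the example.
class Counter : public ToyObjectWrap {
 public:
  int value = 0;
};
```

In this model the addon code never touches the slot directly: Wrap() writes it once and Unwrap<Counter>() reads it back, which would explain a code base that sets a field count but never calls the field setters or getters itself.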




Re: [v8-users] internal fields

2017-11-04 Thread Bruce MacNaughton
Thanks for pointing me in the right direction.

On Saturday, November 4, 2017 at 9:52:23 AM UTC-7, J Decker wrote:
>
>
>
> On Sat, Nov 4, 2017 at 9:29 AM, Bruce MacNaughton <bmacna...@gmail.com 
> > wrote:
>
>> I am new to Nan, V8, and C++ (so if I haven't put a big enough target on 
>> my back I don't know what else I can add). I've written a lot of JavaScript 
>> and, in the past, C, assembler, and kernel mode code, so hopefully the 
>> bulls-eye is a little smaller now.
>>
>> I'm working with an existing code base and am trying to understand why 
>> things were done the way they were. It uses Nan to create an addon for 
>> nodejs. I'm hoping someone here can help me understand some pieces that 
>> escape me.
>>
>  
> Nan is really a nodejs thing, and not V8... so this is sort of the wrong 
> place for these questions...
>  
>
>>
>> 1. The code sets internal field count for each class - sometimes to 1 and 
>> sometimes to 2 - but never invokes "SetInternalField()" or 
>> "GetInternalField()". Is there some reason, possibly historical, that 
>> "SetInternalFieldCount()" needed to be called to set a value? The way I 
>> have interpreted what I've read is that my code needs to set and get the 
>> value explicitly, so setting a count but never storing anything there makes 
>> no sense to me.
>>
>>  // Prepare constructor template
>>  v8::Local<v8::FunctionTemplate> ctor = Nan::New<v8::FunctionTemplate>(New);
>>  ctor->InstanceTemplate()->SetInternalFieldCount(2);
>>  ctor->SetClassName(Nan::New("MyClass").ToLocalChecked());
>>
>>
> Does it use Wrap, and/or are the classes subclassed from ObjectWrap? Wrap 
> uses internal field 0 to store the class so it can be later unwrapped from 
> the V8 object.
> https://github.com/nodejs/node/blob/master/src/node_object_wrap.h#L75 
> (near that is also SetWeak reference)
>  
>
>> 2. Given that I'm storing something in internal fields, my understanding 
>> is that I need to free any resources (memory, etc.) that are used by the 
>> internal field if the object is GC'd. Doing that in the destructor seems to 
>> be the right way to handle that. Is that all there is to it?
>>
> The destructor is really too late: by the point the destructor is called, 
> the Object holding it would have also disappeared. If the destructor is 
> getting called, it's probably because of an ObjectWrapped thing 
> disappearing, which internally stores the object in the class as a 
> Persistent<> that is SetWeak()'d. SetWeak takes a callback which is called 
> when the object is GC'd.
>  
>
>> 3. What difference does it make to v8 if the internal field is an aligned 
>> pointer or not? Is the ability to set/get aligned pointers a consistency 
>> check so assumptions can be made? Does the interface check the alignment? 
>> (Not critical for me, I don't think, but I'd like to understand.)
>>
>>
> I doubt it matters... basically internal fields seem to be user-data 
> fields that store the value so your user code can later retrieve it.  
> Internally I wouldn't expect V8 to ever actually do anything with those 
> fields. Since they are usually pointers that are stored, aligned buffers 
> will be more optimal.
>  
>



[v8-users] internal fields

2017-11-04 Thread Bruce MacNaughton
I am new to Nan, V8, and C++ (so if I haven't put a big enough target on my 
back I don't know what else I can add). I've written a lot of JavaScript 
and, in the past, C, assembler, and kernel mode code, so hopefully the 
bulls-eye is a little smaller now.

I'm working with an existing code base and am trying to understand why 
things were done the way they were. It uses Nan to create an addon for 
nodejs. I'm hoping someone here can help me understand some pieces that 
escape me.

1. The code sets internal field count for each class - sometimes to 1 and 
sometimes to 2 - but never invokes "SetInternalField()" or 
"GetInternalField()". Is there some reason, possibly historical, that 
"SetInternalFieldCount()" needed to be called to set a value? The way I 
have interpreted what I've read is that my code needs to set and get the 
value explicitly, so setting a count but never storing anything there makes 
no sense to me.

 // Prepare constructor template
 v8::Local<v8::FunctionTemplate> ctor = Nan::New<v8::FunctionTemplate>(New);
 ctor->InstanceTemplate()->SetInternalFieldCount(2);
 ctor->SetClassName(Nan::New("MyClass").ToLocalChecked());

2. Given that I'm storing something in internal fields, my understanding is 
that I need to free any resources (memory, etc.) that are used by the 
internal field if the object is GC'd. Doing that in the destructor seems to 
be the right way to handle that. Is that all there is to it?

3. What difference does it make to v8 if the internal field is an aligned 
pointer or not? Is the ability to set/get aligned pointers a consistency 
check so assumptions can be made? Does the interface check the alignment? 
(Not critical for me, I don't think, but I'd like to understand.)
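
On question 3, one way to think about why alignment could matter is pointer tagging. The sketch below is a toy model with made-up names (ToySlot, IsStorableAsAlignedPointer, kTagMask). It assumes a scheme where the low bit of a stored word marks a heap reference, so a raw pointer is only safe to store if that bit is clear. V8's documentation describes the aligned-pointer setters as taking a 2-byte-aligned pointer, but the actual encoding is an internal detail and may differ between versions.

```cpp
#include <cassert>
#include <cstdint>

// Assumed tagging scheme for this toy model: low bit set means "heap
// reference", low bit clear means "plain data the collector can skip".
constexpr std::uintptr_t kTagMask = 1;

// True if the pointer's low bit is clear, i.e. it is at least 2-byte aligned.
bool IsStorableAsAlignedPointer(const void* p) {
  return (reinterpret_cast<std::uintptr_t>(p) & kTagMask) == 0;
}

// Toy internal field slot holding one machine word.
struct ToySlot {
  std::uintptr_t raw = 0;

  // Stores the pointer only if it passes the alignment check.
  void SetAlignedPointer(void* p) {
    assert(IsStorableAsAlignedPointer(p));
    raw = reinterpret_cast<std::uintptr_t>(p);
  }

  void* GetAlignedPointer() const { return reinterpret_cast<void*>(raw); }
};
```

Under that assumption the check is a safety requirement and not a performance hint: a pointer to a heap-allocated C++ object passes it, while an odd address (for example one byte into a buffer) does not.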




