Hi Andrew,

> On 4 May 2023, at 15:14, Bertrand Marquis <bertrand.marq...@arm.com> wrote:
> 
> Hi Andrew,
> 
>> On 14 Apr 2023, at 10:58, Jens Wiklander <jens.wiklan...@linaro.org> wrote:
>> 
>> Hi,
>> 
>> On Thu, Apr 13, 2023 at 3:27 PM Andrew Cooper <andrew.coop...@citrix.com> 
>> wrote:
>>> 
>>> On 13/04/2023 1:26 pm, Julien Grall wrote:
>>>>> +static int ffa_domain_init(struct domain *d)
>>>>> +{
>>>>> +    struct ffa_ctx *ctx;
>>>>> +
>>>>> +    if ( !ffa_version )
>>>>> +        return -ENODEV;
>>>>> +
>>>>> +    ctx = xzalloc(struct ffa_ctx);
>>>>> +    if ( !ctx )
>>>>> +        return -ENOMEM;
>>>>> +
>>>>> +    d->arch.tee = ctx;
>>>>> +
>>>>> +    return 0;
>>>>> +}
>>>>> +
>>>>> +/* This function is supposed to undo what ffa_domain_init() has done */
>>>> 
>>>> I think there is a problem in the TEE framework. The callback
>>>> .relinquish_resources() will not be called if domain_create() failed.
>>>> So this will result to a memory leak.
>>>> 
>>>> We also can't call .relinquish_resources() on early domain creation
>>>> failure because relinquishing resources can take time and therefore
>>>> needs to be preemptible.
>>>> 
>>>> So I think we need to introduce a new callback domain_free() that will
>>>> be called arch_domain_destroy(). Is this something you can look at?
>>> 
>>> 
>>> Cleanup of an early domain creation failure, however you do it, is at
>>> most "the same amount of time again".  It cannot (absent of development
>>> errors) take the same indefinite time periods of time that a full
>>> domain_destroy() can.
>>> 
>>> The error path in domain_create() explicitly does call domain_teardown()
>>> so we can (eventually) purge these duplicate cleanup paths.  There are
>>> far too many easy errors to be made which occur from having split
>>> cleanup, and we have had to issue XSAs in the past to address some of
>>> them.  (Hence the effort to try and specifically change things, and
>>> remove the ability to introduce the errors in the first place.)
>>> 
>>> 
>>> Right now, it is specifically awkward to do this nicely because
>>> domain_teardown() doesn't call into a suitable arch hook.
>>> 
>>> IMO the best option here is extend domain_teardown() with an
>>> arch_domain_teardown() state/hook, and wire in the TEE cleanup path into
>>> this too.
>>> 
>>> Anything else is explicitly adding to technical debt that I (or someone
>>> else) is going to have to revert further down the line.
>>> 
>>> If you want, I am happy to prototype the arch_domain_teardown() bit of
>>> the fix, but I will have to defer wiring in the TEE part to someone
>>> capable of testing it.
>> 
>> You're more than welcome to prototype the fix, I can test it and add
>> it to the next version of the patch set when we're happy with the
>> result.
> 
> 
> Could you tell us if you are still happy to work on the prototype for
> arch_domain_teardown and when you would be able to give a first prototype ?

Could you answer to this question ?

Cheers
Bertrand

Reply via email to