Re: [Xen-devel] [PATCH v2 08/13] optee: add support for RPC SHM buffers
On 09/12/2018 02:51 PM, Volodymyr Babchuk wrote:
Hi,

Hi,

On 12.09.18 13:59, Julien Grall wrote:
Hi Volodymyr,

On 09/11/2018 08:30 PM, Volodymyr Babchuk wrote:
On 11.09.18 14:53, Julien Grall wrote:
On 10/09/18 18:44, Volodymyr Babchuk wrote:
On 10.09.18 16:01, Julien Grall wrote:
On 03/09/18 17:54, Volodymyr Babchuk wrote:

OP-TEE usually uses the same idea with command buffers (see previous commit) to issue RPC requests. The problem is that initially it has no buffer where it can write a request. So the first RPC request it makes is special: it requests NW to allocate a shared buffer for other RPC requests. Usually this buffer is allocated only once for every OP-TEE thread and it remains allocated all the time until shutdown. The mediator needs to pin this buffer(s) to make sure that the domain can't transfer it to someone else. Also, it should be mapped into the Xen address space, because the mediator needs to check responses from guests.

Can you explain why you always need to keep the shared buffer mapped in Xen? Why not use access_guest_memory_by_ipa every time you want to get information from the guest?

Sorry, I just didn't know about this mechanism. But for performance reasons, I'd like to keep these buffers always mapped. You see, RPC returns are very frequent (for every IRQ, actually). So I think it will be costly to map/unmap this buffer every time.

This is a bit misleading... This copy will *only* happen for an IRQ during an RPC. What are the chances of that? Fairly limited. If this is happening too often, then the map/unmap here will be your least concern.

Now, this copy will happen for every IRQ when the CPU is in S-EL1/S-EL0 mode. Chances are quite high, I must say. Look: OP-TEE (or a TA) is doing something, like encrypting some buffer, for example. An IRQ fires, OP-TEE immediately executes an RPC return (right from the interrupt handler), so NW can handle the interrupt. Then NW returns control back to OP-TEE, if it wants to.

I understand this... But the map/unmap should be negligible over the rest of the context.

I thought that map/unmap is quite a costly operation, but I can be wrong there.

At the moment, map/unmap is nearly a nop on Arm64 because all the RAM is mapped (I would avoid assuming that, though :)). The only cost is going through the p2m to translate the IPA to a PA. For Arm32, each CPU has its own page tables and the map/unmap (and TLB flush) will be done locally. I would still expect the impact to be minimal. Note that today map_domain_page on Arm32 is quite simplistic. It would be possible to optimize it to lower the impact of map/unmap.

[...]

It feels quite suspicious to free the memory in Xen before calling OP-TEE. I think this needs to be done afterwards.

No, it is OP-TEE that asked to free the buffer. This function is called when NW returns from the RPC. So at this moment NW has freed the buffer.

But you forward that call to OP-TEE after. So what would OP-TEE do with that?

Happily resume interrupted work. Here is how RPC works:

1. NW client issues an STD call (or yielding call in terms of SMCCC).
2. OP-TEE starts its work, but it needs to be interrupted for some reason: an IRQ arrived, it wants to block on a mutex, or it asks NW to do some work (like allocating memory or loading a TA). This is called an "RPC return".
3. OP-TEE suspends the thread and returns from the SMC call with code OPTEE_SMC_RPC_VAL(SOME_CMD) in a0, and some optional parameters in other registers.
4. NW sees that this is an RPC, and not a completed STD call, so it does SOME_CMD and issues another SMC with code OPTEE_SMC_CALL_RETURN_FROM_RPC in a0.
5. OP-TEE wakes up the suspended thread and continues execution.
6. Pts 2-5 are repeated until OP-TEE finishes the work.
7. It returns from the last SMC call with code OPTEE_SMC_RETURN_SUCCESS/OPTEE_SMC_RETURN_some_error in a0.
8. The optee driver sees that the call from pt. 1 is finished at last and returns control back to the client.

Thank you for the explanation. As I mentioned in another thread, it would be good to have some kind of high-level explanation of all those interactions in the tree. If it already exists, then a pointer in the code.

High level is covered at [1], and low level is covered in the already mentioned header files.

Could you add those pointers at the top of the OP-TEE file?

But I don't know of any explanation at the level of detail I gave you above.

That's fine. Can you add that in the commit message?

Cheers,

--
Julien Grall
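For readers following the thread, the eight-step RPC flow described above can be sketched as a small, self-contained C simulation. This is only an illustration under stated assumptions: the DEMO_* constants, struct demo_tee and both functions are hypothetical stand-ins, not the real OP-TEE SMC ABI or the Xen mediator code, and the "secure world" here is a toy function rather than a real SMC.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical, simplified codes; the real ones are defined by OP-TEE. */
enum {
    DEMO_RETURN_SUCCESS = 0, /* stands in for OPTEE_SMC_RETURN_SUCCESS */
    DEMO_RETURN_RPC     = 1  /* stands in for OPTEE_SMC_RPC_VAL(SOME_CMD) */
};

enum {
    DEMO_CALL_STD             = 100, /* initial STD (yielding) call */
    DEMO_CALL_RETURN_FROM_RPC = 101  /* OPTEE_SMC_CALL_RETURN_FROM_RPC stand-in */
};

/* Toy "secure world": needs pending_rpcs round trips before it finishes. */
struct demo_tee {
    unsigned int pending_rpcs;
    unsigned int resumed; /* how many times a suspended thread was woken */
};

/* Plays the role of the SMC instruction: a0 in, return code out. */
static uint32_t demo_smc(struct demo_tee *tee, uint32_t a0)
{
    if ( a0 == DEMO_CALL_RETURN_FROM_RPC )
        tee->resumed++;          /* step 5: wake up the suspended thread */

    if ( tee->pending_rpcs > 0 )
    {
        tee->pending_rpcs--;     /* steps 2-3: suspend and do an RPC return */
        return DEMO_RETURN_RPC;
    }

    return DEMO_RETURN_SUCCESS;  /* step 7: the call is complete */
}

/* Normal-world side (steps 1, 4, 6, 8). Returns the number of RPCs served. */
static unsigned int demo_std_call(struct demo_tee *tee)
{
    unsigned int served = 0;
    uint32_t ret = demo_smc(tee, DEMO_CALL_STD);         /* step 1 */

    while ( ret == DEMO_RETURN_RPC )                     /* step 6 */
    {
        served++;  /* step 4: do SOME_CMD (handle IRQ, allocate buffer, ...) */
        ret = demo_smc(tee, DEMO_CALL_RETURN_FROM_RPC);
    }

    assert(ret == DEMO_RETURN_SUCCESS);                  /* step 8 */
    return served;
}
```

Calling demo_std_call() on a struct demo_tee with pending_rpcs set to 3 makes the loop serve three RPC returns before the toy secure world reports success, which mirrors pts 2-5 being repeated until the work is done.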
Re: [Xen-devel] [PATCH v2 08/13] optee: add support for RPC SHM buffers
Hi, On 12.09.18 13:59, Julien Grall wrote: Hi Volodymyr, On 09/11/2018 08:30 PM, Volodymyr Babchuk wrote: On 11.09.18 14:53, Julien Grall wrote: On 10/09/18 18:44, Volodymyr Babchuk wrote: On 10.09.18 16:01, Julien Grall wrote: On 03/09/18 17:54, Volodymyr Babchuk wrote: OP-TEE usually uses the same idea with command buffers (see previous commit) to issue RPC requests. Problem is that initially it has no buffer, where it can write request. So the first RPC request it makes is special: it requests NW to allocate shared buffer for other RPC requests. Usually this buffer is allocated only once for every OP-TEE thread and it remains allocated all the time until shutdown. Mediator needs to pin this buffer(s) to make sure that domain can't transfer it to someone else. Also it should be mapped into XEN address space, because mediator needs to check responses from guests. Can you explain why you always need to keep the shared buffer mapped in Xen? Why not using access_guest_memory_by_ipa every time you want to get information from the guest? Sorry, I just didn't know about this mechanism. But for performance reasons, I'd like to keep this buffers always mapped. You see, RPC returns are very frequent (for every IRQ, actually). So I think, it will be costly to map/unmap this buffer every time. This is a bit misleading... This copy will *only* happen for IRQ during an RPC. What are the chances for that? Fairly limited. If this is happening too often, then the map/unmap here will be your least concern. Now, this copy will happen for every IRQ when CPU is in S-EL1/S-EL0 mode. Chances are quite high, I must say. Look: OP-TEE or (TA) is doing something, like encrypting some buffer, for example. IRQ fires, OP-TEE immediately executes RPC return (right from interrupt handler), so NW can handle interrupt. Then NW returns control back to OP-TEE, if it wants to. I understand this... But the map/unmap should be negligible over the rest of the context. 
I thought that map/unmap is quite costly operation, but I can be wrong there. This is how long job in OP-TEE can be preempted by linux kernel, for example. Timer IRQ ensures that control will be returned to linux, scheduler schedules some other task and OP-TEE patiently waits until its caller is scheduled back, so it can resume the work. However, I would like to see any performance comparison here to weight with the memory impact in Xen (Arm32 have limited amount of VA available). With current configuration, this is maximum 16 pages per guest. As for performance comparison... This is doable, but will take some time. Let me write it differently, I will always chose the safe side until this is strictly necessary or performance has been proven. I might be convinced for just 16 pages, although it feels like a premature optimization... Okay, then I'll stick with memory copy helpers for now. It feels quite suspicious to free the memory in Xen before calling OP-TEE. I think this need to be done afterwards. No, it is OP-TEE asked to free buffer. This function is called, when NW returns from the RPC. So at this moment NW freed the buffer. But you forward that call to OP-TEE after. So what would OP-TEE do with that? Happily resume interrupted work. There is how RPC works: 1. NW client issues STD call (or yielding call in terms of SMCCC) 2. OP-TEE starts its work, but it is needed to be interrupted for some reason: IRQ arrived, it wants to block on a mutex, it asks NW to do some work (like allocating memory or loading TA). This is called "RPC return". 3. OP-TEE suspends thread and does return from SMC call with code OPTEE_SMC_RPC_VAL(SOME_CMD) in a0, and some optional parameters in other registers 4. NW sees that this is a RPC, and not completed STD call, so it does SOME_CMD and issues another SMC with code OPTEE_SMC_CALL_RETURN_FROM_RPC in a0 5. OP-TEE wakes up suspended thread and continues execution 6. pts 2-5 are repeated until OP-TEE finishes the work 7. 
It returns from last SMC call with code OPTEE_SMC_RETURN_SUCCESS/ OPTEE_SMC_RETURN_some_error in a0. 8. optee driver sees that call from pt.1 is finished at least and returns control back to client Thank you for the explanation. As I mentioned in another thread, it would be good to have some kind of highly level explanation in the tree and all those interaction. If it is already existing, then pointer in the code. High level is covered at [1], and low level is covered in already mentioned header files. But I don't know about any explanation at detail level I gave you above. Looking at that code, I just noticed there potential race condition here. Nothing prevent a guest to call twice with the same optee_thread_id. OP-TEE has internal check against this. I am not sure how OP-TEE internal check would help here. The user may know that thread-id 1 exist and will call it from 2 vCPUs concurrently. So handle_rpc will fin
Re: [Xen-devel] [PATCH v2 08/13] optee: add support for RPC SHM buffers
Hi Volodymyr,

On 09/11/2018 08:30 PM, Volodymyr Babchuk wrote:
On 11.09.18 14:53, Julien Grall wrote:
On 10/09/18 18:44, Volodymyr Babchuk wrote:
On 10.09.18 16:01, Julien Grall wrote:
On 03/09/18 17:54, Volodymyr Babchuk wrote:

OP-TEE usually uses the same idea with command buffers (see previous commit) to issue RPC requests. The problem is that initially it has no buffer where it can write a request. So the first RPC request it makes is special: it requests NW to allocate a shared buffer for other RPC requests. Usually this buffer is allocated only once for every OP-TEE thread and it remains allocated all the time until shutdown. The mediator needs to pin this buffer(s) to make sure that the domain can't transfer it to someone else. Also, it should be mapped into the Xen address space, because the mediator needs to check responses from guests.

Can you explain why you always need to keep the shared buffer mapped in Xen? Why not use access_guest_memory_by_ipa every time you want to get information from the guest?

Sorry, I just didn't know about this mechanism. But for performance reasons, I'd like to keep these buffers always mapped. You see, RPC returns are very frequent (for every IRQ, actually). So I think it will be costly to map/unmap this buffer every time.

This is a bit misleading... This copy will *only* happen for an IRQ during an RPC. What are the chances of that? Fairly limited. If this is happening too often, then the map/unmap here will be your least concern.

Now, this copy will happen for every IRQ when the CPU is in S-EL1/S-EL0 mode. Chances are quite high, I must say. Look: OP-TEE (or a TA) is doing something, like encrypting some buffer, for example. An IRQ fires, OP-TEE immediately executes an RPC return (right from the interrupt handler), so NW can handle the interrupt. Then NW returns control back to OP-TEE, if it wants to.

I understand this... But the map/unmap should be negligible over the rest of the context.

This is how a long job in OP-TEE can be preempted by the Linux kernel, for example. A timer IRQ ensures that control will be returned to Linux, the scheduler schedules some other task, and OP-TEE patiently waits until its caller is scheduled back, so it can resume the work.

However, I would like to see a performance comparison here to weigh against the memory impact in Xen (Arm32 has a limited amount of VA available).

With the current configuration, this is a maximum of 16 pages per guest. As for a performance comparison... This is doable, but will take some time.

Let me write it differently: I will always choose the safe side until this is strictly necessary or the performance benefit has been proven. I might be convinced for just 16 pages, although it feels like a premature optimization...

It feels quite suspicious to free the memory in Xen before calling OP-TEE. I think this needs to be done afterwards.

No, it is OP-TEE that asked to free the buffer. This function is called when NW returns from the RPC. So at this moment NW has freed the buffer.

But you forward that call to OP-TEE after. So what would OP-TEE do with that?

Happily resume interrupted work. Here is how RPC works:

1. NW client issues an STD call (or yielding call in terms of SMCCC).
2. OP-TEE starts its work, but it needs to be interrupted for some reason: an IRQ arrived, it wants to block on a mutex, or it asks NW to do some work (like allocating memory or loading a TA). This is called an "RPC return".
3. OP-TEE suspends the thread and returns from the SMC call with code OPTEE_SMC_RPC_VAL(SOME_CMD) in a0, and some optional parameters in other registers.
4. NW sees that this is an RPC, and not a completed STD call, so it does SOME_CMD and issues another SMC with code OPTEE_SMC_CALL_RETURN_FROM_RPC in a0.
5. OP-TEE wakes up the suspended thread and continues execution.
6. Pts 2-5 are repeated until OP-TEE finishes the work.
7. It returns from the last SMC call with code OPTEE_SMC_RETURN_SUCCESS/OPTEE_SMC_RETURN_some_error in a0.
8. The optee driver sees that the call from pt. 1 is finished at last and returns control back to the client.

Thank you for the explanation. As I mentioned in another thread, it would be good to have some kind of high-level explanation of all those interactions in the tree. If it already exists, then a pointer in the code.

Looking at that code, I just noticed there is a potential race condition here. Nothing prevents a guest from calling twice with the same optee_thread_id.

OP-TEE has an internal check against this.

I am not sure how the OP-TEE internal check would help here. The user may know that thread-id 1 exists and will call it from 2 vCPUs concurrently. So handle_rpc will find a context associated with it and use it for execute_std_call. If OP-TEE returns an error (or is done with it), you will end up freeing the same context twice. Did I miss anything?

So it would be possible for two vCPUs to call the same command concurrently and free it.

Maybe you noticed that the mediator uses a shadow buffer to read the cookie id.
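The concern about two vCPUs resuming the same optee_thread_id concurrently could be addressed by letting only one caller at a time claim a call context. A minimal sketch, assuming a hypothetical demo_call_ctx type and C11 atomics in place of Xen's own primitives; none of these names come from the patch, and the real mediator may solve this differently:

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the mediator's per-call context. */
struct demo_call_ctx {
    int thread_id;      /* OP-TEE thread id the guest passes back */
    atomic_bool in_use; /* true while some vCPU is resuming this call */
};

/*
 * Look up the context for thread_id and claim it atomically.
 * Returns NULL if there is no such context or another caller holds it.
 */
static struct demo_call_ctx *demo_claim_ctx(struct demo_call_ctx *ctxs,
                                            size_t count, int thread_id)
{
    for ( size_t i = 0; i < count; i++ )
    {
        if ( ctxs[i].thread_id != thread_id )
            continue;

        /* atomic_exchange returns the old value: true means already taken. */
        if ( atomic_exchange(&ctxs[i].in_use, true) )
            return NULL;

        return &ctxs[i];
    }

    return NULL;
}

/* Give the context back once the SMC to OP-TEE has returned. */
static void demo_release_ctx(struct demo_call_ctx *ctx)
{
    assert(atomic_load(&ctx->in_use));
    atomic_store(&ctx->in_use, false);
}
```

With this shape, a second vCPU that passes the same thread id while the first is still inside the call gets NULL and can be sent an error, so only the vCPU that claimed the context ever reaches the code that frees it.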
Re: [Xen-devel] [PATCH v2 08/13] optee: add support for RPC SHM buffers
Hi Julien, On 11.09.18 14:53, Julien Grall wrote: On 10/09/18 18:44, Volodymyr Babchuk wrote: Hi Julien, On 10.09.18 16:01, Julien Grall wrote: Hi Volodymyr, On 03/09/18 17:54, Volodymyr Babchuk wrote: OP-TEE usually uses the same idea with command buffers (see previous commit) to issue RPC requests. Problem is that initially it has no buffer, where it can write request. So the first RPC request it makes is special: it requests NW to allocate shared buffer for other RPC requests. Usually this buffer is allocated only once for every OP-TEE thread and it remains allocated all the time until shutdown. Mediator needs to pin this buffer(s) to make sure that domain can't transfer it to someone else. Also it should be mapped into XEN address space, because mediator needs to check responses from guests. Can you explain why you always need to keep the shared buffer mapped in Xen? Why not using access_guest_memory_by_ipa every time you want to get information from the guest? Sorry, I just didn't know about this mechanism. But for performance reasons, I'd like to keep this buffers always mapped. You see, RPC returns are very frequent (for every IRQ, actually). So I think, it will be costly to map/unmap this buffer every time. This is a bit misleading... This copy will *only* happen for IRQ during an RPC. What are the chances for that? Fairly limited. If this is happening too often, then the map/unmap here will be your least concern. Now, this copy will happen for every IRQ when CPU is in S-EL1/S-EL0 mode. Chances are quite high, I must say. Look: OP-TEE or (TA) is doing something, like encrypting some buffer, for example. IRQ fires, OP-TEE immediately executes RPC return (right from interrupt handler), so NW can handle interrupt. Then NW returns control back to OP-TEE, if it wants to. This is how long job in OP-TEE can be preempted by linux kernel, for example. 
Timer IRQ ensures that control will be returned to linux, scheduler schedules some other task and OP-TEE patiently waits until its caller is scheduled back, so it can resume the work. However, I would like to see any performance comparison here to weight with the memory impact in Xen (Arm32 have limited amount of VA available). With current configuration, this is maximum 16 pages per guest. As for performance comparison... This is doable, but will take some time. [...] +static void free_shm_rpc(struct domain_ctx *ctx, uint64_t cookie) +{ + struct shm_rpc *shm_rpc; + bool found = false; + + spin_lock(&ctx->lock); + + list_for_each_entry( shm_rpc, &ctx->shm_rpc_list, list ) + { + if ( shm_rpc->cookie == cookie ) What does guarantee you the cookie will be uniq? Normal World guarantees. This is the part of the protocol. By NW, do you mean the guest? You should know by now we should not trust what the guest is doing. If you think it is still fine, then I would like some writing to explain what is the impact of a guest putting twice the same cookie ID. Ah, I see your point. Yes, I'll add check to ensure that cookie is not reused. Thank you for pointing to this. It feels quite suspicious to free the memory in Xen before calling OP-TEE. I think this need to be done afterwards. No, it is OP-TEE asked to free buffer. This function is called, when NW returns from the RPC. So at this moment NW freed the buffer. But you forward that call to OP-TEE after. So what would OP-TEE do with that? Happily resume interrupted work. There is how RPC works: 1. NW client issues STD call (or yielding call in terms of SMCCC) 2. OP-TEE starts its work, but it is needed to be interrupted for some reason: IRQ arrived, it wants to block on a mutex, it asks NW to do some work (like allocating memory or loading TA). This is called "RPC return". 3. OP-TEE suspends thread and does return from SMC call with code OPTEE_SMC_RPC_VAL(SOME_CMD) in a0, and some optional parameters in other registers 4. 
NW sees that this is a RPC, and not completed STD call, so it does SOME_CMD and issues another SMC with code OPTEE_SMC_CALL_RETURN_FROM_RPC in a0 5. OP-TEE wakes up suspended thread and continues execution 6. pts 2-5 are repeated until OP-TEE finishes the work 7. It returns from last SMC call with code OPTEE_SMC_RETURN_SUCCESS/ OPTEE_SMC_RETURN_some_error in a0. 8. optee driver sees that call from pt.1 is finished at least and returns control back to client Looking at that code, I just noticed there potential race condition here. Nothing prevent a guest to call twice with the same optee_thread_id. OP-TEE has internal check against this. So it would be possible for two vCPU to call concurrently the same command and free it. Maybe you noticed that mediator uses shadow buffer to read cookie id. So it will free the buffer mentioned by OP-TEE. Basically what happened: 1. OP-TEE asks "free buffer with cookie X" in RPC return 2. guests says "I freed that buffer" in SMC call 3. mediator frees b
Re: [Xen-devel] [PATCH v2 08/13] optee: add support for RPC SHM buffers
On 10/09/18 18:44, Volodymyr Babchuk wrote: Hi Julien, On 10.09.18 16:01, Julien Grall wrote: Hi Volodymyr, On 03/09/18 17:54, Volodymyr Babchuk wrote: OP-TEE usually uses the same idea with command buffers (see previous commit) to issue RPC requests. Problem is that initially it has no buffer, where it can write request. So the first RPC request it makes is special: it requests NW to allocate shared buffer for other RPC requests. Usually this buffer is allocated only once for every OP-TEE thread and it remains allocated all the time until shutdown. Mediator needs to pin this buffer(s) to make sure that domain can't transfer it to someone else. Also it should be mapped into XEN address space, because mediator needs to check responses from guests. Can you explain why you always need to keep the shared buffer mapped in Xen? Why not using access_guest_memory_by_ipa every time you want to get information from the guest? Sorry, I just didn't know about this mechanism. But for performance reasons, I'd like to keep this buffers always mapped. You see, RPC returns are very frequent (for every IRQ, actually). So I think, it will be costly to map/unmap this buffer every time. This is a bit misleading... This copy will *only* happen for IRQ during an RPC. What are the chances for that? Fairly limited. If this is happening too often, then the map/unmap here will be your least concern. However, I would like to see any performance comparison here to weight with the memory impact in Xen (Arm32 have limited amount of VA available). Signed-off-by: Volodymyr Babchuk --- xen/arch/arm/tee/optee.c | 121 ++- 1 file changed, 119 insertions(+), 2 deletions(-) diff --git a/xen/arch/arm/tee/optee.c b/xen/arch/arm/tee/optee.c index 1008eba..6d6b51d 100644 --- a/xen/arch/arm/tee/optee.c +++ b/xen/arch/arm/tee/optee.c @@ -21,6 +21,7 @@ #include #define MAX_STD_CALLS 16 +#define MAX_RPC_SHMS 16 /* * Call context. OP-TEE can issue multiple RPC returns during one call. 
@@ -35,11 +36,22 @@ struct std_call_ctx { int rpc_op; }; +/* Pre-allocated SHM buffer for RPC commands */ +struct shm_rpc { + struct list_head list; + struct optee_msg_arg *guest_arg; + struct page *guest_page; + mfn_t guest_mfn; + uint64_t cookie; +}; + struct domain_ctx { struct list_head list; struct list_head call_ctx_list; + struct list_head shm_rpc_list; struct domain *domain; atomic_t call_ctx_count; + atomic_t shm_rpc_count; spinlock_t lock; }; @@ -145,8 +157,10 @@ static int optee_enable(struct domain *d) ctx->domain = d; INIT_LIST_HEAD(&ctx->call_ctx_list); + INIT_LIST_HEAD(&ctx->shm_rpc_list); atomic_set(&ctx->call_ctx_count, 0); + atomic_set(&ctx->shm_rpc_count, 0); spin_lock_init(&ctx->lock); spin_lock(&domain_ctx_list_lock); @@ -256,11 +270,81 @@ static struct std_call_ctx *find_call_ctx(struct domain_ctx *ctx, int thread_id) return NULL; } +static struct shm_rpc *allocate_and_map_shm_rpc(struct domain_ctx *ctx, paddr_t gaddr, I would prefer if you pass a gfn instead of the address here. + uint64_t cookie) NIT: Indentation +{ + struct shm_rpc *shm_rpc; + int count; + + count = atomic_add_unless(&ctx->shm_rpc_count, 1, MAX_RPC_SHMS); + if ( count == MAX_RPC_SHMS ) + return NULL; + + shm_rpc = xzalloc(struct shm_rpc); + if ( !shm_rpc ) + goto err; + + shm_rpc->guest_mfn = lookup_and_pin_guest_ram_addr(gaddr, NULL); + + if ( mfn_eq(shm_rpc->guest_mfn, INVALID_MFN) ) + goto err; + + shm_rpc->guest_arg = map_domain_page_global(shm_rpc->guest_mfn); + if ( !shm_rpc->guest_arg ) + { + gprintk(XENLOG_INFO, "Could not map domain page\n"); You don't unpin the guest page if Xen can't map the page. 
+ goto err; + } + shm_rpc->cookie = cookie; + + spin_lock(&ctx->lock); + list_add_tail(&shm_rpc->list, &ctx->shm_rpc_list); + spin_unlock(&ctx->lock); + + return shm_rpc; + +err: + atomic_dec(&ctx->shm_rpc_count); + xfree(shm_rpc); + return NULL; +} + +static void free_shm_rpc(struct domain_ctx *ctx, uint64_t cookie) +{ + struct shm_rpc *shm_rpc; + bool found = false; + + spin_lock(&ctx->lock); + + list_for_each_entry( shm_rpc, &ctx->shm_rpc_list, list ) + { + if ( shm_rpc->cookie == cookie ) What does guarantee you the cookie will be uniq? Normal World guarantees. This is the part of the protocol. By NW, do you mean the guest? You should know by now we should not trust what the guest is doing. If you think it is still fine, then I would like some writing to explain what is the impact of a guest putting twice the same cookie ID. [...] It feels quite suspicious to free the memory in Xen before calling OP-TEE. I think this need to be done afterwards. No, it is OP-TEE asked to free buffer. This func
Re: [Xen-devel] [PATCH v2 08/13] optee: add support for RPC SHM buffers
Hi Julien, On 10.09.18 16:01, Julien Grall wrote: Hi Volodymyr, On 03/09/18 17:54, Volodymyr Babchuk wrote: OP-TEE usually uses the same idea with command buffers (see previous commit) to issue RPC requests. Problem is that initially it has no buffer, where it can write request. So the first RPC request it makes is special: it requests NW to allocate shared buffer for other RPC requests. Usually this buffer is allocated only once for every OP-TEE thread and it remains allocated all the time until shutdown. Mediator needs to pin this buffer(s) to make sure that domain can't transfer it to someone else. Also it should be mapped into XEN address space, because mediator needs to check responses from guests. Can you explain why you always need to keep the shared buffer mapped in Xen? Why not using access_guest_memory_by_ipa every time you want to get information from the guest? Sorry, I just didn't know about this mechanism. But for performance reasons, I'd like to keep this buffers always mapped. You see, RPC returns are very frequent (for every IRQ, actually). So I think, it will be costly to map/unmap this buffer every time. Signed-off-by: Volodymyr Babchuk --- xen/arch/arm/tee/optee.c | 121 ++- 1 file changed, 119 insertions(+), 2 deletions(-) diff --git a/xen/arch/arm/tee/optee.c b/xen/arch/arm/tee/optee.c index 1008eba..6d6b51d 100644 --- a/xen/arch/arm/tee/optee.c +++ b/xen/arch/arm/tee/optee.c @@ -21,6 +21,7 @@ #include #define MAX_STD_CALLS 16 +#define MAX_RPC_SHMS 16 /* * Call context. OP-TEE can issue multiple RPC returns during one call. 
@@ -35,11 +36,22 @@ struct std_call_ctx { int rpc_op; }; +/* Pre-allocated SHM buffer for RPC commands */ +struct shm_rpc { + struct list_head list; + struct optee_msg_arg *guest_arg; + struct page *guest_page; + mfn_t guest_mfn; + uint64_t cookie; +}; + struct domain_ctx { struct list_head list; struct list_head call_ctx_list; + struct list_head shm_rpc_list; struct domain *domain; atomic_t call_ctx_count; + atomic_t shm_rpc_count; spinlock_t lock; }; @@ -145,8 +157,10 @@ static int optee_enable(struct domain *d) ctx->domain = d; INIT_LIST_HEAD(&ctx->call_ctx_list); + INIT_LIST_HEAD(&ctx->shm_rpc_list); atomic_set(&ctx->call_ctx_count, 0); + atomic_set(&ctx->shm_rpc_count, 0); spin_lock_init(&ctx->lock); spin_lock(&domain_ctx_list_lock); @@ -256,11 +270,81 @@ static struct std_call_ctx *find_call_ctx(struct domain_ctx *ctx, int thread_id) return NULL; } +static struct shm_rpc *allocate_and_map_shm_rpc(struct domain_ctx *ctx, paddr_t gaddr, I would prefer if you pass a gfn instead of the address here. + uint64_t cookie) NIT: Indentation +{ + struct shm_rpc *shm_rpc; + int count; + + count = atomic_add_unless(&ctx->shm_rpc_count, 1, MAX_RPC_SHMS); + if ( count == MAX_RPC_SHMS ) + return NULL; + + shm_rpc = xzalloc(struct shm_rpc); + if ( !shm_rpc ) + goto err; + + shm_rpc->guest_mfn = lookup_and_pin_guest_ram_addr(gaddr, NULL); + + if ( mfn_eq(shm_rpc->guest_mfn, INVALID_MFN) ) + goto err; + + shm_rpc->guest_arg = map_domain_page_global(shm_rpc->guest_mfn); + if ( !shm_rpc->guest_arg ) + { + gprintk(XENLOG_INFO, "Could not map domain page\n"); You don't unpin the guest page if Xen can't map the page. 
+ goto err; + } + shm_rpc->cookie = cookie; + + spin_lock(&ctx->lock); + list_add_tail(&shm_rpc->list, &ctx->shm_rpc_list); + spin_unlock(&ctx->lock); + + return shm_rpc; + +err: + atomic_dec(&ctx->shm_rpc_count); + xfree(shm_rpc); + return NULL; +} + +static void free_shm_rpc(struct domain_ctx *ctx, uint64_t cookie) +{ + struct shm_rpc *shm_rpc; + bool found = false; + + spin_lock(&ctx->lock); + + list_for_each_entry( shm_rpc, &ctx->shm_rpc_list, list ) + { + if ( shm_rpc->cookie == cookie ) What does guarantee you the cookie will be uniq? Normal World guarantees. This is the part of the protocol. + { + found = true; + list_del(&shm_rpc->list); + break; + } + } + spin_unlock(&ctx->lock); At this point you have a shm_rpc in hand to free. But what does guarantee you no-one will use it? This is valid point. I'll revisit this part of the code, thank you. Looks like I need some refcount there. + + if ( !found ) { + return; + } No need for the {} in a one-liner. + + if ( shm_rpc->guest_arg ) { Coding style: if ( ... ) { + unpin_guest_ram_addr(shm_rpc->guest_mfn); + unmap_domain_page_global(shm_rpc->guest_arg); + } + + xfree(shm_rpc); +} + static void optee_domain_destroy(struct domain *d) { struct arm_smccc_res resp; struct domain_ctx *ctx; struct std_call_ctx *call, *call_tmp; + struct shm_r
Re: [Xen-devel] [PATCH v2 08/13] optee: add support for RPC SHM buffers
Hi Volodymyr,

On 03/09/18 17:54, Volodymyr Babchuk wrote:

OP-TEE usually uses the same idea with command buffers (see previous commit) to issue RPC requests. The problem is that initially it has no buffer where it can write a request. So the first RPC request it makes is special: it requests NW to allocate a shared buffer for other RPC requests. Usually this buffer is allocated only once for every OP-TEE thread and it remains allocated all the time until shutdown. The mediator needs to pin this buffer(s) to make sure that the domain can't transfer it to someone else. Also, it should be mapped into the Xen address space, because the mediator needs to check responses from guests.

Can you explain why you always need to keep the shared buffer mapped in Xen? Why not use access_guest_memory_by_ipa every time you want to get information from the guest?

Signed-off-by: Volodymyr Babchuk
---
 xen/arch/arm/tee/optee.c | 121 ++-
 1 file changed, 119 insertions(+), 2 deletions(-)

diff --git a/xen/arch/arm/tee/optee.c b/xen/arch/arm/tee/optee.c
index 1008eba..6d6b51d 100644
--- a/xen/arch/arm/tee/optee.c
+++ b/xen/arch/arm/tee/optee.c
@@ -21,6 +21,7 @@
 #include

 #define MAX_STD_CALLS   16
+#define MAX_RPC_SHMS    16

 /*
  * Call context. OP-TEE can issue multiple RPC returns during one call.
@@ -35,11 +36,22 @@ struct std_call_ctx {
     int rpc_op;
 };

+/* Pre-allocated SHM buffer for RPC commands */
+struct shm_rpc {
+    struct list_head list;
+    struct optee_msg_arg *guest_arg;
+    struct page *guest_page;
+    mfn_t guest_mfn;
+    uint64_t cookie;
+};
+
 struct domain_ctx {
     struct list_head list;
     struct list_head call_ctx_list;
+    struct list_head shm_rpc_list;
     struct domain *domain;
     atomic_t call_ctx_count;
+    atomic_t shm_rpc_count;
     spinlock_t lock;
 };

@@ -145,8 +157,10 @@ static int optee_enable(struct domain *d)
     ctx->domain = d;
     INIT_LIST_HEAD(&ctx->call_ctx_list);
+    INIT_LIST_HEAD(&ctx->shm_rpc_list);

     atomic_set(&ctx->call_ctx_count, 0);
+    atomic_set(&ctx->shm_rpc_count, 0);
     spin_lock_init(&ctx->lock);

     spin_lock(&domain_ctx_list_lock);
@@ -256,11 +270,81 @@ static struct std_call_ctx *find_call_ctx(struct domain_ctx *ctx, int thread_id)
     return NULL;
 }

+static struct shm_rpc *allocate_and_map_shm_rpc(struct domain_ctx *ctx, paddr_t gaddr,

I would prefer if you pass a gfn instead of the address here.

+        uint64_t cookie)

NIT: Indentation

+{
+    struct shm_rpc *shm_rpc;
+    int count;
+
+    count = atomic_add_unless(&ctx->shm_rpc_count, 1, MAX_RPC_SHMS);
+    if ( count == MAX_RPC_SHMS )
+        return NULL;
+
+    shm_rpc = xzalloc(struct shm_rpc);
+    if ( !shm_rpc )
+        goto err;
+
+    shm_rpc->guest_mfn = lookup_and_pin_guest_ram_addr(gaddr, NULL);
+
+    if ( mfn_eq(shm_rpc->guest_mfn, INVALID_MFN) )
+        goto err;
+
+    shm_rpc->guest_arg = map_domain_page_global(shm_rpc->guest_mfn);
+    if ( !shm_rpc->guest_arg )
+    {
+        gprintk(XENLOG_INFO, "Could not map domain page\n");

You don't unpin the guest page if Xen can't map the page.

+        goto err;
+    }
+    shm_rpc->cookie = cookie;
+
+    spin_lock(&ctx->lock);
+    list_add_tail(&shm_rpc->list, &ctx->shm_rpc_list);
+    spin_unlock(&ctx->lock);
+
+    return shm_rpc;
+
+err:
+    atomic_dec(&ctx->shm_rpc_count);
+    xfree(shm_rpc);
+    return NULL;
+}
+
+static void free_shm_rpc(struct domain_ctx *ctx, uint64_t cookie)
+{
+    struct shm_rpc *shm_rpc;
+    bool found = false;
+
+    spin_lock(&ctx->lock);
+
+    list_for_each_entry( shm_rpc, &ctx->shm_rpc_list, list )
+    {
+        if ( shm_rpc->cookie == cookie )

What guarantees that the cookie will be unique?

+        {
+            found = true;
+            list_del(&shm_rpc->list);
+            break;
+        }
+    }
+    spin_unlock(&ctx->lock);

At this point you have a shm_rpc in hand to free. But what guarantees that no one will use it?

+
+    if ( !found ) {
+        return;
+    }

No need for the {} in a one-liner.

+
+    if ( shm_rpc->guest_arg ) {

Coding style:

if ( ... )
{

+        unpin_guest_ram_addr(shm_rpc->guest_mfn);
+        unmap_domain_page_global(shm_rpc->guest_arg);
+    }
+
+    xfree(shm_rpc);
+}
+
 static void optee_domain_destroy(struct domain *d)
 {
     struct arm_smccc_res resp;
     struct domain_ctx *ctx;
     struct std_call_ctx *call, *call_tmp;
+    struct shm_rpc *shm_rpc, *shm_rpc_tmp;
     bool found = false;

     /* At this time all domain VCPUs should be stopped */
@@ -290,7 +374,11 @@ static void optee_domain_destroy(struct domain *d)
     list_for_each_entry_safe( call, call_tmp, &ctx->call_ctx_list, list )
         free_std_call_ctx(ctx, call);

+    list_for_each_entry_safe( shm_rpc, shm_rpc_tmp, &ctx->shm_rpc_list, list )
+        free_shm_rpc(ctx, shm_rpc->cookie);
+
     ASSERT
[Xen-devel] [PATCH v2 08/13] optee: add support for RPC SHM buffers
OP-TEE usually uses the same idea with command buffers (see
previous commit) to issue RPC requests. Problem is that initially
it has no buffer, where it can write request. So the first RPC
request it makes is special: it requests NW to allocate shared
buffer for other RPC requests. Usually this buffer is allocated
only once for every OP-TEE thread and it remains allocated all
the time until shutdown.

Mediator needs to pin this buffer(s) to make sure that domain can't
transfer it to someone else. Also it should be mapped into XEN address
space, because mediator needs to check responses from guests.

Signed-off-by: Volodymyr Babchuk
---
 xen/arch/arm/tee/optee.c | 121 ++-
 1 file changed, 119 insertions(+), 2 deletions(-)

diff --git a/xen/arch/arm/tee/optee.c b/xen/arch/arm/tee/optee.c
index 1008eba..6d6b51d 100644
--- a/xen/arch/arm/tee/optee.c
+++ b/xen/arch/arm/tee/optee.c
@@ -21,6 +21,7 @@
 #include
 #define MAX_STD_CALLS 16
+#define MAX_RPC_SHMS 16
 /*
  * Call context. OP-TEE can issue multiple RPC returns during one call.
@@ -35,11 +36,22 @@ struct std_call_ctx {
     int rpc_op;
 };
+/* Pre-allocated SHM buffer for RPC commands */
+struct shm_rpc {
+    struct list_head list;
+    struct optee_msg_arg *guest_arg;
+    struct page *guest_page;
+    mfn_t guest_mfn;
+    uint64_t cookie;
+};
+
 struct domain_ctx {
     struct list_head list;
     struct list_head call_ctx_list;
+    struct list_head shm_rpc_list;
     struct domain *domain;
     atomic_t call_ctx_count;
+    atomic_t shm_rpc_count;
     spinlock_t lock;
 };
@@ -145,8 +157,10 @@ static int optee_enable(struct domain *d)
     ctx->domain = d;
     INIT_LIST_HEAD(&ctx->call_ctx_list);
+    INIT_LIST_HEAD(&ctx->shm_rpc_list);
     atomic_set(&ctx->call_ctx_count, 0);
+    atomic_set(&ctx->shm_rpc_count, 0);
     spin_lock_init(&ctx->lock);
     spin_lock(&domain_ctx_list_lock);
@@ -256,11 +270,81 @@ static struct std_call_ctx *find_call_ctx(struct domain_ctx *ctx, int thread_id)
     return NULL;
 }
+static struct shm_rpc *allocate_and_map_shm_rpc(struct domain_ctx *ctx, paddr_t gaddr,
+uint64_t cookie)
+{
+    struct shm_rpc *shm_rpc;
+    int count;
+
+    count = atomic_add_unless(&ctx->shm_rpc_count, 1, MAX_RPC_SHMS);
+    if ( count == MAX_RPC_SHMS )
+        return NULL;
+
+    shm_rpc = xzalloc(struct shm_rpc);
+    if ( !shm_rpc )
+        goto err;
+
+    shm_rpc->guest_mfn = lookup_and_pin_guest_ram_addr(gaddr, NULL);
+
+    if ( mfn_eq(shm_rpc->guest_mfn, INVALID_MFN) )
+        goto err;
+
+    shm_rpc->guest_arg = map_domain_page_global(shm_rpc->guest_mfn);
+    if ( !shm_rpc->guest_arg )
+    {
+        gprintk(XENLOG_INFO, "Could not map domain page\n");
+        goto err;
+    }
+    shm_rpc->cookie = cookie;
+
+    spin_lock(&ctx->lock);
+    list_add_tail(&shm_rpc->list, &ctx->shm_rpc_list);
+    spin_unlock(&ctx->lock);
+
+    return shm_rpc;
+
+err:
+    atomic_dec(&ctx->shm_rpc_count);
+    xfree(shm_rpc);
+    return NULL;
+}
+
+static void free_shm_rpc(struct domain_ctx *ctx, uint64_t cookie)
+{
+    struct shm_rpc *shm_rpc;
+    bool found = false;
+
+    spin_lock(&ctx->lock);
+
+    list_for_each_entry( shm_rpc, &ctx->shm_rpc_list, list )
+    {
+        if ( shm_rpc->cookie == cookie )
+        {
+            found = true;
+            list_del(&shm_rpc->list);
+            break;
+        }
+    }
+    spin_unlock(&ctx->lock);
+
+    if ( !found ) {
+        return;
+    }
+
+    if ( shm_rpc->guest_arg ) {
+        unpin_guest_ram_addr(shm_rpc->guest_mfn);
+        unmap_domain_page_global(shm_rpc->guest_arg);
+    }
+
+    xfree(shm_rpc);
+}
+
 static void optee_domain_destroy(struct domain *d)
 {
     struct arm_smccc_res resp;
     struct domain_ctx *ctx;
     struct std_call_ctx *call, *call_tmp;
+    struct shm_rpc *shm_rpc, *shm_rpc_tmp;
     bool found = false;
     /* At this time all domain VCPUs should be stopped */
@@ -290,7 +374,11 @@ static void optee_domain_destroy(struct domain *d)
     list_for_each_entry_safe( call, call_tmp, &ctx->call_ctx_list, list )
         free_std_call_ctx(ctx, call);
+    list_for_each_entry_safe( shm_rpc, shm_rpc_tmp, &ctx->shm_rpc_list, list )
+        free_shm_rpc(ctx, shm_rpc->cookie);
+
     ASSERT(!atomic_read(&ctx->call_ctx_count));
+    ASSERT(!atomic_read(&ctx->shm_rpc_count));
     xfree(ctx);
 }
@@ -452,6 +540,32 @@ out:
     return ret;
 }
+static void handle_rpc_func_alloc(struct domain_ctx *ctx,
+                                  struct cpu_user_regs *regs)
+{
+    paddr_t ptr = get_user_reg(regs, 1) << 32 | get_user_reg(regs, 2);
+
+    if ( ptr & (OPTEE_MSG_NONCONTIG_PAGE_SIZE - 1) )
+        gprintk(XENLOG_WARNING, "Domain returned invalid RPC command buffer\n");
+
+    if ( ptr ) {
+        uint64_t cookie = get_user_reg(regs, 4) << 32 | get_user_reg(regs, 5);
+        struct shm_rpc *shm_rpc;
+
+        shm_rpc = allocate_and_map_shm