The current MesaTEE access control is basic. Every file has one owner and every task has a group of participants. A task can read files owned by its participants and the result is readable by all participants.
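
To make the current rules concrete, here is a minimal sketch in Python. The names (`File`, `Task`, `task_can_read`, `user_can_read_result`) are made up for illustration and are not MesaTEE's actual API; the sketch only encodes the two rules described above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class File:
    name: str
    owner: str  # exactly one owner per file


@dataclass(frozen=True)
class Task:
    name: str
    participants: frozenset  # user ids taking part in the task


def task_can_read(task: Task, f: File) -> bool:
    """A task may read a file only if the file's owner is a participant."""
    return f.owner in task.participants


def user_can_read_result(task: Task, user: str) -> bool:
    """A task's result is readable by every participant, and nobody else."""
    return user in task.participants


# Hypothetical users A, B, C and a PSI task shared by A and B.
data_a = File("data_a", owner="A")
data_c = File("data_c", owner="C")
psi = Task("psi", participants=frozenset({"A", "B"}))

print(task_can_read(psi, data_a))      # owner A is a participant
print(task_can_read(psi, data_c))      # owner C is not a participant
print(user_can_read_result(psi, "B"))  # B is a participant
```

Note that `File.owner` is a single user, which is exactly why a result produced jointly by A and B has no natural place in this model.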
This model is good enough for simple tasks, but it cannot support chained jobs, where the output of one task feeds into later ones. Say users A and B want to collaborate but don't yet know exactly how. For example, they want to build a joint ML model by sharing data, but steps like feature engineering can take many trials. So they may want to apply some preprocessing (e.g., PSI) first and then experiment on top of it. There is no need to run PSI multiple times, so it is better to cache the result. The problem then becomes: who is the owner of the result? Our current model doesn't support this kind of scenario.

So I propose we implement a dedicated and sufficiently powerful access control module for MesaTEE. We can follow existing practice, such as the PERM (policy, effect, request, matchers) model proposed by the casbin project (see https://casbin.org/docs/en/syntax-for-models). I already have a prototype working in MesaPy and plan to integrate it into MesaTEE as a dedicated service in the coming week. Other components can get access control (AC) decisions by sending requests to this service through our standard RPC mechanism.

The question is whether, in the long term, we should maintain this module in Rust or Python.

Python makes it easier to implement attribute-based access control (ABAC), since it supports reflection. People can also write policies directly in Python code, which is probably more convenient for most users. Another plus is that we can use this case to demonstrate what MesaPy is capable of. The downside is that performance may become an issue, since Python is poor at concurrency.

Implementing PERM and ABAC in Rust is feasible but would take significantly more time.

Any thoughts?

Pei
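
To make the PERM idea concrete for the cached-PSI case, here is a rough sketch in plain Python. It is not the MesaPy prototype and does not use casbin's API. All names (`Resource`, `Request`, `Policy`, `enforce`) are hypothetical, and the deny-overrides effect rule is just one possible choice. The matcher is an ordinary Python function over object attributes, which is the ABAC-style convenience mentioned above.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass(frozen=True)
class Resource:
    name: str
    owners: frozenset  # attribute read by the matcher; may hold several users


@dataclass(frozen=True)
class Request:
    sub: str       # who is asking
    obj: Resource  # what is being accessed
    act: str       # e.g. "read"


@dataclass(frozen=True)
class Policy:
    act: str
    effect: str  # "allow" or "deny"
    matcher: Callable[[Request], bool]


def enforce(policies: List[Policy], req: Request) -> bool:
    """Assumed effect rule: at least one allow matches and no deny matches."""
    matched = [p.effect for p in policies
               if p.act == req.act and p.matcher(req)]
    return "allow" in matched and "deny" not in matched


# One policy: any owner of a resource may read it.
policies = [Policy("read", "allow", lambda r: r.sub in r.obj.owners)]

# The cached PSI result is jointly owned by A and B.
psi_result = Resource("psi_result", owners=frozenset({"A", "B"}))

print(enforce(policies, Request("A", psi_result, "read")))   # A is an owner
print(enforce(policies, Request("C", psi_result, "read")))   # C is not
print(enforce(policies, Request("A", psi_result, "write")))  # no write policy
```

In a design like this the "who owns the result" question goes away, because ownership becomes an attribute the policy inspects rather than a single fixed field.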
