Re: Multi-tenancy and caching issues

Francesco Chicchiriccò Tue, 09 Jan 2024 02:28:23 -0800

Thanks Romain, I share your considerations and concerns below, and also agree 
that EMF routing is the way to go.


I probably need to redirect my current exploration so that what we currently 
have in Syncope evolves towards proper EMF routing.

Do you have any sample I could follow about that?

Regards.

On 09/01/24 10:51, Romain Manni-Bucau wrote:

Hi Francesco,

As long as you have an EMF router you don't hit pitfall 4; it only happens
if your routing is done at the datasource level. Datasource-level routing
also brings way more side effects, and you start to lose the ability to
tune per tenant (a common pattern is to tune the cache per tenant
"size"/usage; with datasource routing everything would be shared, not
isolated, so there is no real way to handle anything there).

Note: having routed caches can make it work somehow, but it will need a lot
of reimplementation of the cache, whereas it comes for free when using a
routed EMF. It can be faked with PartitionedDataCache by overriding the key
name (appending the tenant), but in terms of supervision I fear it will be
way harder, and I'm not sure it would be very consumable for people (you end
up making the leak risk higher for users by design and you don't get any
benefit from it: you don't reduce the overhead, you don't reduce the pool
size, etc., which are at another level).

In terms of spring-data integration there is also no coupling: just declare
@Bean EMF routedEmf() and you'll get it working transparently, as long as a
tx (the cache scope of Spring) is for a single tenant.
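To illustrate the routed EMF idea, here is a minimal sketch of the dispatch only, using a JDK dynamic proxy. Everything named here is an assumption for illustration: Factory is a stand-in for the real EntityManagerFactory interface (so the snippet compiles without JPA on the classpath), and TenantContext and TenantRouter are hypothetical helpers, not OpenJPA or Spring classes. A real routedEmf() bean would need more care (lifecycle, close(), unwrap(), and so on).

```java
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Proxy;
import java.util.Map;

// Stand-in for EntityManagerFactory so the sketch needs no JPA dependency.
interface Factory {
    String createManager();
}

// Hypothetical holder for the tenant bound to the current thread.
final class TenantContext {
    private static final ThreadLocal<String> CURRENT = new ThreadLocal<>();

    private TenantContext() {
    }

    static void set(String tenant) {
        CURRENT.set(tenant);
    }

    static String get() {
        return CURRENT.get();
    }

    static void clear() {
        CURRENT.remove();
    }
}

// Builds a proxy that forwards every call to the current tenant's delegate.
final class TenantRouter {
    private TenantRouter() {
    }

    static <T> T route(Class<T> api, Map<String, T> delegates) {
        Object proxy = Proxy.newProxyInstance(
                api.getClassLoader(),
                new Class<?>[] {api},
                (p, method, args) -> {
                    String tenant = TenantContext.get();
                    T target = tenant == null ? null : delegates.get(tenant);
                    if (target == null) {
                        throw new IllegalStateException("No delegate for tenant: " + tenant);
                    }
                    try {
                        return method.invoke(target, args);
                    } catch (InvocationTargetException e) {
                        // Rethrow the delegate's own exception, not the reflection wrapper.
                        throw e.getCause();
                    }
                });
        return api.cast(proxy);
    }
}
```

Since each tenant keeps its own delegate, whatever state the delegate owns (for a real EMF, its data cache) stays isolated per tenant; the router itself holds no cached entities.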

Hope I'm not missing something "key" ;).

Romain Manni-Bucau
@rmannibucau <https://twitter.com/rmannibucau> |  Blog
<https://rmannibucau.metawerx.net/> | Old Blog
<http://rmannibucau.wordpress.com> | Github <https://github.com/rmannibucau> |
LinkedIn <https://www.linkedin.com/in/rmannibucau> | Book
<https://www.packtpub.com/application-development/java-ee-8-high-performance>


On Tue, 9 Jan 2024 at 10:32, Francesco Chicchiriccò <ilgro...@apache.org>
wrote:

Hi Romain,
see my replies embedded below.

Regards.

On 08/01/24 17:43, Romain Manni-Bucau wrote:

Hi Francesco,

Normally if you have one EMF per tenant there is no leak between them,
since the cache instance is stored in the EMF - I used that approach in
TomEE.

As I am saying below, this is what we have already in Syncope.

My company is also supporting customers heavily using this particular
feature: it works, I have no issues with that.
Someone is also building a SaaS solution on top of that, so runtime tenant
addition and removal is also fine.

I am exploring this different approach because it would allow us to
introduce Spring Data JPA, which could have some benefits - see
https://issues.apache.org/jira/browse/SYNCOPE-1799

You can check it in
org.apache.openjpa.datacache.DataCacheManagerImpl#initialize of each emf,
which should be different.

Thanks for the pointer.

So overall if there is a leak it is likely that it leaks across
transactions or at some spring cache level.

I think that things are more subtle: consider the following use case.

We have MyEntity with String @Id.

Suppose we have two tenants: A and B.

1. Tenant A will make a REST call which creates a MyEntity instance with
key "key1" under the db for A.

2. Tenant A will make another REST call which looks for the newly created
MyEntity instance via:

entityManager.find(MyEntity.class, "key1");

3. Tenant B makes the same call as (1) with the same key "key1": all is
fine, a new row is created under the db for B.

4. Tenant B makes the same call as (2) with the same key "key1": if not
already evicted, entityManager will return the MyEntity instance for Tenant
A from the cache.

I need to avoid the pitfalls from (4).
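Steps (1) to (4) can be reproduced with a toy model. This is only a sketch under stated assumptions: CacheLeakDemo is a hypothetical class, plain maps stand in for the per-tenant databases and for a shared second-level cache, and the cache is filled on lookup only, which is a simplification and not how OpenJPA populates its data cache. The tenantAwareKeys flag mimics the "append the tenant to the key" workaround.

```java
import java.util.HashMap;
import java.util.Map;

// Toy model: one "database" map per tenant, with a single cache in front.
final class CacheLeakDemo {
    private final Map<String, Map<String, String>> databases = new HashMap<>();
    private final Map<String, String> cache = new HashMap<>();
    private final boolean tenantAwareKeys;

    CacheLeakDemo(boolean tenantAwareKeys) {
        this.tenantAwareKeys = tenantAwareKeys;
    }

    // With tenant-aware keys the tenant is part of the cache key.
    private String cacheKey(String tenant, String id) {
        return tenantAwareKeys ? tenant + ":" + id : id;
    }

    // Steps (1) and (3): write a row into the tenant's own database.
    void create(String tenant, String id, String value) {
        databases.computeIfAbsent(tenant, t -> new HashMap<>()).put(id, value);
    }

    // Steps (2) and (4): look up by id, consulting the cache first.
    String find(String tenant, String id) {
        return cache.computeIfAbsent(
                cacheKey(tenant, id),
                k -> databases.getOrDefault(tenant, Map.of()).get(id));
    }
}
```

With new CacheLeakDemo(false), running the four steps makes the lookup for tenant B return the value stored by tenant A; with new CacheLeakDemo(true), each tenant gets its own row back.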

Side note: the datasource routing pattern is useless if you have an
entity manager routing pattern and only use JPA to do database work; the
two are more likely to conflict than to help.

The idea is not to have an entity manager routing pattern, rather to have
a cache routing pattern on the single entity manager factory; or just to
configure some predefined partitions.

If you still want to plug in the datacache (query cache) configuration,
the jpa properties can take a custom fully qualified class name too.

On Mon, 8 Jan 2024 at 17:14, Francesco Chicchiriccò <ilgro...@apache.org>
wrote:

Hi there,
at Syncope we have been implementing multi-tenancy by relying on something
like:

* 1 data source per tenant
* 1 entity manager factory per tenant
* 1 transaction manager per tenant
* etc

So far so good.

Now I am experimenting with a different approach similar to [1], e.g.

* 1 low-level data source per tenant
* 1 data source extending Spring's AbstractRoutingDataSource using the
value of a ThreadLocal variable as lookup key
* 1 single entity manager factory configured with the routing data source

* 1 single transaction manager
* etc
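The ThreadLocal lookup key in the list above could look like the sketch below. It is a simplified stand-in, not Spring code: in a real setup the lookup would live in a subclass of AbstractRoutingDataSource, overriding determineCurrentLookupKey(). CurrentTenant and RoutingLookup are hypothetical names, and strings stand in for the actual low-level data sources.

```java
import java.util.Map;
import java.util.function.Supplier;

// Hypothetical per-thread tenant binding.
final class CurrentTenant {
    private static final ThreadLocal<String> TENANT = new ThreadLocal<>();

    private CurrentTenant() {
    }

    static String get() {
        return TENANT.get();
    }

    // Binds the tenant for the duration of the action, then restores the
    // previous binding so pooled threads do not keep a stale tenant.
    static <T> T callAs(String tenant, Supplier<T> action) {
        String previous = TENANT.get();
        TENANT.set(tenant);
        try {
            return action.get();
        } finally {
            if (previous == null) {
                TENANT.remove();
            } else {
                TENANT.set(previous);
            }
        }
    }
}

// Stand-in for the routing data source: picks a target by current tenant.
final class RoutingLookup {
    private final Map<String, String> targets;

    RoutingLookup(Map<String, String> targets) {
        this.targets = Map.copyOf(targets);
    }

    String resolve() {
        String tenant = CurrentTenant.get();
        String target = tenant == null ? null : targets.get(tenant);
        if (target == null) {
            throw new IllegalStateException("No data source for tenant: " + tenant);
        }
        return target;
    }
}
```

Restoring the previous value in the finally block is a design choice for nested calls and thread pools; a plain set/remove pair would also work when calls never nest.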

It mostly works but I am having caching issues with concurrent operations
working on different tenants, so I was wondering: how can I extend the
various OpenJPA (query, data, L1, L2, every one) caches to hold
different actual instances per tenant and to use the appropriate one
depending on the same ThreadLocal value I have already used above for
data sources?

Thanks in advance.
Regards.

[1] https://github.com/Cepr0/sb-multitenant-db-demo



--
Francesco Chicchiriccò

Tirasa - Open Source Excellence
http://www.tirasa.net/

Member at The Apache Software Foundation
Syncope, Cocoon, Olingo, CXF, OpenJPA, PonyMail
http://home.apache.org/~ilgrosso/
