Added a correction:

On Feb 28, 2014, at 9:14 PM, Mircea Markus <mmar...@redhat.com> wrote:
>
>>>>> On 24 Feb 2014, at 17:39, Mircea Markus <mmar...@redhat.com> wrote:
>>>>>
>>>>>> On Feb 17, 2014, at 10:13 PM, Emmanuel Bernard <emman...@hibernate.org>
>>>>>> wrote:
>>>>>>
>>>>>> By the way, Mircea, Sanne and I had quite a long discussion about this
>>>>>> one and the idea of one cache per entity. It turns out that the right
>>>>>> (as in easy) solution does involve a higher level programming model like
>>>>>> OGM provides. You can simulate it yourself using the Infinispan APIs but
>>>>>> it is just cumbersome.
>>>>>
>>>>> Curious to hear the whole story :-)
>>>>> We cannot mandate all the users to use OGM though, one of the reasons
>>>>> being OGM is not platform independent (hotrod).
>>>>
>>>> Then solve all the issues I have raised with a magic wand and come back to
>>>> me when you have done it, I'm interested.
>>>
>>> People are going to use Infinispan with one cache per entity, because it
>>> makes sense:
>>> - different config (repl/dist | persistent/non-persistent) for different
>>> data types
>>> - have map/reduce tasks run only on the Person entries, not on Dog as well,
>>> when you want to select (Person) where age > 18
>>> I don't see a reason to forbid this, on the contrary. The way I see it, the
>>> relation is (OGM, ISPN) <=> (Hibernate, JDBC). Indeed OGM would be a
>>> better abstraction and should be recommended as such for the Java clients,
>>> but ultimately we're a general purpose storage engine that is available to
>>> different platforms as well.
>>>
>>
>> I do disagree on your assessment.
>> I did write a whole essay on why I think your view is problematic - I was
>> getting tired of repeating myself ;P
>> https://github.com/infinispan/infinispan/wiki/A-continuum-of-data-structure-and-query-complexity
>
> Thanks for writing this up, it is a good taxonomy of data storage schemes and
> querying.
>
>>
>> To anecdotally answer your specific example, yes different configs for
>> different entities is an interesting benefit but it has to outweigh the
>> drawbacks.
>
> Using a single cache for all the types is NOT practical at all :-) Just to
> expand on my idea, people prefer using different caches for many reasons:
> - security: the Account cache has different security requirements than the
> News cache
> - data consistency: News is a non-transactional cache, Account requires
> pessimistic XA transactions
> - expiry: expire last year's news from the system. Not the same for Accounts
> - availability: I want the Accounts cache to be backed up to another site. I
> don't want that for the News cache
> - logical data grouping: mixing Accounts with News doesn't make sense. I
> might want to know which account appeared in the news, though.
>
>> If you have to do a map reduce for tasks so simple as age > 18, I think your
>> system had better be prepared to run gazillions of M/R jobs.
>
> I want to run a simple M/R job in the evening to determine who turns 18
> tomorrow, to congratulate them. Once a day, not gazillions of times, and I
> don't need to index the age field just for that. Also when it comes to
> Map/Reduce, the drawback of holding all the data in a single cache is
> twofold:
> - performance: you iterate over data that is not related to your query.
> - programming model: the Map/Reduce implementation has a dependency on both
> Dog and Person. If I add Cats to the cache, I'll need to update the M/R code
> to be aware of that as well. Same if I rename/remove Dog. Not nice.
>
>> I think that Dogs and any domestic animal is fundamentally related to humans
>> - Person in your case. So queries involving both will be required - a cross
>> cache M/R is not doable today AFAIK and even if it was, it’s still M/R and
>> all its drawbacks.
>> To me, the Cache API and Hot Rod are well suited for what I call self
>> contained object graph (i.e.
>> where Dog would be an embedded object of Person
>> and not a separate Entity). In that situation, there is a single cache.
>
> I see where you come from but I don't think requiring people to use a single
> cache for all the entities is an option. Besides a natural logical
> separation, different data has different storage requirements: security,
> access patterns, consistency, durability, availability etc. For most of the
> non-trivial use cases, using a single cache just won't do.
>
>> One cache per entity does make sense for APIs that do support what I call
>> connected entities. Hibernate OGM specifically.
>
> OGM does a great job covering this, but it is very specific: Java only and
> OOP - our C/S mode, hotrod specifically, is language independent and not OOP.
> Also I would like to comment on the following statements:
> "I believe a cache API and Hot Rod are well suited to address up to the self
> contained object graph use case with a couple of relations maintained
> manually by the application but that cannot be queried. For the connected
> entities use case, only a high level paradigm is suited like JPA."
>
> I don't think storing object graphs should be under scrutiny here: Infinispan
> C/S mode (and that's where most of the client focus is BTW) has a schema
> (protobuf) that does not support object graphs. I also think expecting people
> to use multiple caches for multiple data types is a solid assumption to start
> from. And here's me speculating: these data types have logical relations
> between them so people will ask for querying. In order to query multiple
> data types, you can either merge them together (your suggestion) or support
> some sort of new cross-cache indexing/querying/API. x-cache querying is more
> flexible and less restrictive than merging data, but from what I understand
> from you it has certain implementation challenges.
> There's no pressure to take a decision now around supporting queries
> spanning multiple caches - just something to keep an eye on when dealing
> with use cases/users. ATM merging data is the only solution available,
> let's wait and see if people ask for more.
>
>> But please read the wiki page first before commenting. I did spend a lot of
>> time on it
>> https://github.com/infinispan/infinispan/wiki/A-continuum-of-data-structure-and-query-complexity
>
> I do read your comments and I really appreciate your feedback. We come from
> slightly different worlds and look at things from different angles, but
> discussions like this raise many good points.
>
>>
>> Emmanuel
>> _______________________________________________
>> infinispan-dev mailing list
>> infinispan-dev@lists.jboss.org
>> https://lists.jboss.org/mailman/listinfo/infinispan-dev
>
> Cheers,
> --
> Mircea Markus
> Infinispan lead (www.infinispan.org)

Cheers,
--
Mircea Markus
Infinispan lead (www.infinispan.org)
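
The Map/Reduce point debated in the thread (a once-a-day "who turns 18 tomorrow" job over a mixed cache versus a Person-only cache) can be illustrated with a minimal sketch. It rests on loud assumptions: plain Python dicts stand in for caches, no Infinispan or Hot Rod API is used, and the `Person`, `Dog`, `turns_18_on`, `scan_mixed` and `scan_persons` names are hypothetical and introduced only for illustration.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Person:
    name: str
    birthday: date


@dataclass
class Dog:
    name: str


def turns_18_on(person, day):
    """True if `person` has their 18th birthday on `day`."""
    b = person.birthday
    return (b.year + 18, b.month, b.day) == (day.year, day.month, day.day)


def scan_mixed(cache, day):
    """Single cache for all types: every entry is visited and type-checked."""
    visited, names = 0, []
    for value in cache.values():
        visited += 1
        # The job has to be written knowing that non-Person values are present.
        if isinstance(value, Person) and turns_18_on(value, day):
            names.append(value.name)
    return visited, names


def scan_persons(cache, day):
    """One cache per type: only Person entries are visited."""
    visited, names = 0, []
    for person in cache.values():
        visited += 1
        if turns_18_on(person, day):
            names.append(person.name)
    return visited, names


# Hypothetical sample data: dicts stand in for the caches.
tomorrow = date(2014, 3, 1)
persons = {
    "p1": Person("Ann", date(1996, 3, 1)),
    "p2": Person("Bob", date(1990, 7, 4)),
}
dogs = {"d1": Dog("Rex"), "d2": Dog("Fido"), "d3": Dog("Max")}
mixed = {**persons, **dogs}

print(scan_mixed(mixed, tomorrow))      # visits all 5 entries
print(scan_persons(persons, tomorrow))  # visits only the 2 Person entries
```

Both scans find the same person; the difference is how many entries each one touches and whether the job needs to know that other types share the cache. The sketch says nothing about cross-cache queries, which is the other side of the argument.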