Re: [infinispan-dev] Design change in Infinispan Query

Mircea Markus Wed, 26 Feb 2014 06:21:26 -0800

On Feb 26, 2014, at 2:13 PM, Dan Berindei <dan.berin...@gmail.com> wrote:


> 
> 
> 
> On Wed, Feb 26, 2014 at 3:12 PM, Mircea Markus <mmar...@redhat.com> wrote:
> 
> On Feb 25, 2014, at 5:08 PM, Sanne Grinovero <sa...@infinispan.org> wrote:
> 
> > There also is the opposite problem to be considered, as Emmanuel
> > suggested on 11/04/2012:
> > you can't forbid the user to store the same object (same type and same
> > id) in two different caches, where each Cache might be using different
> > indexing options.
> >
> > If the "search service" is a global concept, and you run a query which
> > matches object X, we'll return it to the user but he won't be able to
> > figure out from which cache it's being sourced: is that ok?
> 
> Can't the user figure that out based on the way the query is built?
> I mean the problem is similar with databases: if address is both a table 
> and a column in the USER table, then it's the query (select) that determines 
> where the address is returned from.
> 
> You mean the user should specify the cache name(s) when building the query?

yes
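
A minimal sketch of what scoping by cache name could look like, assuming a single global index whose hits already record the cache they were indexed from. IndexedHit and ScopedSearch are hypothetical names made up for illustration; neither is part of the current Infinispan Query API.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical types for illustration only; not Infinispan APIs.
final class IndexedHit {
    final String cacheName; // cache the entry was indexed from
    final Object key;       // key of the entry inside that cache
    final Object value;     // the matched object

    IndexedHit(String cacheName, Object key, Object value) {
        this.cacheName = cacheName;
        this.key = key;
        this.value = value;
    }
}

final class ScopedSearch {
    // Keeps only the hits that come from the requested caches,
    // preserving the ranking order of the global result list.
    static List<IndexedHit> scope(List<IndexedHit> rankedHits, String... cacheNames) {
        Set<String> wanted = new HashSet<>(Arrays.asList(cacheNames));
        List<IndexedHit> result = new ArrayList<>();
        for (IndexedHit hit : rankedHits) {
            if (wanted.contains(hit.cacheName)) {
                result.add(hit);
            }
        }
        return result;
    }
}
```

Because every hit carries its cache name and key, the same object stored in two caches shows up as two distinguishable hits, and the caller can tell which cache each one was sourced from.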

> 
> With a database you have to go a bit out of your way to select from more than 
> one table at a time, normally you have just one primary table that you select 
> from and the others are just to help you filter and transform that table. You 
> also have to add some information about the source table yourself if you need 
> it, otherwise the DB won't tell you what table the results are coming from:
> 
> SELECT 'table1' AS source, id FROM table1
> UNION ALL
> SELECT 'table2' AS source, id FROM table2
> 
> Adrian tells me our current query API doesn't allow us to do projections with 
> synthetic columns. On the other hand, we need to extend the current API to 
> give us the entry key anyway, so it would be easy to extend it to give us the 
> name of the cache as well.
> 
> 
> >
> > Ultimately this implies a query might return the same object X in
> > multiple positions in the result list of the query; for example it
> > might be the top result according to some criteria but also be the 5th
> > result because of how it was indexed in a different cache: maybe
> > someone will find good use for this "capability" but I see it
> > primarily as a source of confusion.
> 
> Curious whether the source of the data can be specified within the 
> query.
> 
> Right, the user should be able to scope a search to a single cache, or maybe 
> to multiple caches, even if there is only one global index.
> 
> But I think the same object can already be inserted twice in the same cache, 
> only with a different key, so returning duplicates from a query is something 
> the user already has to cope with.
> 
> 
> > Finally, if we move the search service as a global component, there
> > might be an impact in how we explain security: an ACL filter applied
> > on one cache - or the index metadata produced by that cache - might
> > not be applied in the same way by an entity being matched through a
> > second cache.
> > Not least a user's permission to access one cache (or not) will affect
> > his results in a rather complex way.
> 
> I'll let Tristan comment more on this, but is this really different from an 
> SQL database where you grant access on individual tables and run a query 
> involving multiple of them?
> 
> The difference would be that in a DB each table will have its own index(es), 
> so they only have to check the permissions once and not for every row. 
> 
> OTOH, if we plan to support key-level permissions, that would require 
> checking the permissions on each search result anyway, so this wouldn't cost 
> us anything.
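
A minimal sketch of the per-result check described above, assuming each search hit records its cache name and key. SearchHit, PermissionFilter and the canRead predicate are hypothetical and only illustrate the idea; they are not Infinispan security APIs.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.BiPredicate;

// Hypothetical types for illustration only; not Infinispan APIs.
final class SearchHit {
    final String cacheName; // cache the hit was sourced from
    final Object key;       // key of the entry inside that cache

    SearchHit(String cacheName, Object key) {
        this.cacheName = cacheName;
        this.key = key;
    }
}

final class PermissionFilter {
    // Checks every hit individually against the caller-supplied rule
    // (cache name, key) -> allowed, and drops the hits that fail it.
    static List<SearchHit> visibleTo(List<SearchHit> hits,
                                     BiPredicate<String, Object> canRead) {
        List<SearchHit> visible = new ArrayList<>();
        for (SearchHit hit : hits) {
            if (canRead.test(hit.cacheName, hit.key)) {
                visible.add(hit);
            }
        }
        return visible;
    }
}
```

A cache-level ACL is then just a predicate that ignores the key, while a key-level ACL inspects both arguments, so the same loop would cover both cases.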
>  
> 
> >
> > I'm wondering if we need to prevent such situations.
> >
> > Sanne
> >
> > On 25 February 2014 16:24, Mircea Markus <mmar...@redhat.com> wrote:
> >>
> >> On Feb 25, 2014, at 3:46 PM, Adrian Nistor <anis...@gmail.com> wrote:
> >>
> >>> They can do what they please. Either put multiple types in one basket or 
> >>> put them in separate caches (one type per cache). But allowing / 
> >>> recommending is one thing, mandating it is a different story.
> >>>
> >>> There's no reason to forbid _any_ of these scenarios / mandate one over 
> >>> the other! There was previously in this thread some suggestion of 
> >>> mandating the one type per cache usage. -1 for it
> >>
> >> Agreed. I actually don't see how we can prevent people who declare 
> >> Cache<Object,Object> from putting whatever they want in it. It also makes 
> >> total sense for smaller caches, as it is easy to set up, etc.
> >> The debate in this email, the way I understood it, was: are people using 
> >> (or should they use) multiple caches for storing data? If yes, we should 
> >> consider query functionality that spans multiple caches.
> >>
> >>>
> >>>
> >>>
> >>> On Tue, Feb 25, 2014 at 5:08 PM, Mircea Markus <mmar...@redhat.com> wrote:
> >>>
> >>> On Feb 25, 2014, at 9:28 AM, Emmanuel Bernard <emman...@hibernate.org> 
> >>> wrote:
> >>>
> >>>>> On 24 Feb 2014, at 17:39, Mircea Markus <mmar...@redhat.com> wrote:
> >>>>>
> >>>>>
> >>>>>> On Feb 17, 2014, at 10:13 PM, Emmanuel Bernard 
> >>>>>> <emman...@hibernate.org> wrote:
> >>>>>>
> >>>>>> By the way, Mircea, Sanne and I had quite a long discussion about this 
> >>>>>> one and the idea of one cache per entity. It turns out that the right 
> >>>>>> (as in easy) solution does involve a higher level programming model 
> >>>>>> like OGM provides. You can simulate it yourself using the Infinispan 
> >>>>>> APIs but it is just cumbersome.
> >>>>>
> >>>>> Curious to hear the whole story :-)
> >>>>> We cannot mandate that all users use OGM though, one of the reasons 
> >>>>> being that OGM is not platform independent (hotrod).
> >>>>
> >>>> Then solve all the issues I have raised with a magic wand and come back 
> >>>> to me when you have done it, I'm interested.
> >>>
> >>> People are going to use infinispan with one cache per entity, because it 
> >>> makes sense:
> >>> - different config (repl/dist | persistent/non-persistent) for different 
> >>> data types
> >>> - have map/reduce tasks running only on the Person entries, not on Dog as 
> >>> well, when you want to select (Person) where age > 18
> >>> I don't see a reason to forbid this, on the contrary. The way I see it, 
> >>> the relation is (OGM, ISPN) <=> (Hibernate, JDBC). Indeed OGM would 
> >>> be a better abstraction and should be recommended as such for the Java 
> >>> clients, but ultimately we're a general purpose storage engine that is 
> >>> available to different platforms as well.
> >>>
> >>>
> 
> _______________________________________________
> infinispan-dev mailing list
> infinispan-dev@lists.jboss.org
> https://lists.jboss.org/mailman/listinfo/infinispan-dev

Cheers,
-- 
Mircea Markus
Infinispan lead (www.infinispan.org)





