Re: DDL implementation details

Sergey Kozlov Mon, 16 Jan 2017 13:51:08 -0800

For the approach schema==cache it is not clear how it works for some cases
(especialy if SQL and regular cache operations are mixed):


1. I want to store two tables in same cache (same schema) and both tables
have identical structure like id Integer, name String.
How they will be processed for regular cache operations
put/putall/get/getall etc where we get cache instance by its name (on SQL
level we use the table name) ?

2. What's about ALTER TABLE statement? Will the changes for one table lock
all tables in the cache?

3. In the future if we will introduce table-based features like SQL roles
(as next step of SQL implementation) "many tables one cache" rule looks
like more difficult for its implementation.

p.s. may be we need to think about new entity called "table" (based on type
descriptor) inside cache to better support of SQL DDL


On Mon, Jan 16, 2017 at 11:06 PM, Sergi Vladykin <sergi.vlady...@gmail.com>
wrote:

> Dima,
>
> I agree that cache==table is definitely a wrong choice, but as far as I see
> Vova suggests having cache==tablespace instead of cache==schema. I tend to
> agree with this decoupling of physical and logical grouping, but the
> concern is that it will require much more work to do.
>
> Sergi
>
> 2017-01-16 21:35 GMT+03:00 Dmitriy Setrakyan <dsetrak...@apache.org>:
>
> > Vova,
> >
> > Currently I see only 2 ways we can proceed here:
> >
> >    1. cache == table
> >    2. cache == schema
> >
> > I agree that "cache==table" may be more flexible, but I don't think it
> will
> > work in Ignite.
> > We may end up with 1,000s of caches, which will carry significant
> overhead
> > on memory and cluster overall. I think that we have no choice but to take
> > "cache==schema" approach.
> >
> > D.
> >
> > On Mon, Jan 16, 2017 at 1:00 AM, Vladimir Ozerov <voze...@gridgain.com>
> > wrote:
> >
> > > Sergi, Dima,
> > >
> > > In the scope of Ignite 1.x it is perfectly fine to have "schema =
> cache".
> > > Nobody suffers from it because nobody use Ignite as database. But in
> > > future, thanks to page memory, we are going to target real database use
> > > cases. Users will have multiple tables in Ignite. Plus views, triggers,
> > > constraints, etc.. All these features are very useful and easy to
> > implement
> > > provided that we already have table and index implementations. And in
> > > databases all related objects are *logically *grouped in a "schema".
> This
> > > is convenient for users: less boilerplate in SQL, better manageability
> > > (remember that database users will definitely need some console and/or
> UI
> > > tools to manage Ignite as a database).
> > >
> > > What you offer is to group database objects *physically *rather than
> > > logically. It will lead to:
> > > - Boilerplate in queries
> > > - Inconvenient database management. All the things database users are
> > used
> > > to - import/export tools, UIs, "USING" keyword, etc, will look weird in
> > > Ignite as there will be no way to group arbitrary objects logically.
> > >
> > > With this approach almost every user will have to use two schemes
> instead
> > > of one - one for operational data (PARTITIONED) and one for reference
> > data
> > > (REPLICATED). No conventional database works this way.
> > >
> > > Vladimir.
> > >
> > > On Fri, Jan 13, 2017 at 9:18 PM, Dmitriy Setrakyan <
> > dsetrak...@apache.org>
> > > wrote:
> > >
> > > > Vova,
> > > >
> > > > I will join Sergi here. It seems like "schema = cache" will take care
> > of
> > > > all different configuration properties required for different groups
> of
> > > > caches. In addition, it cleanly maps into current Ignite
> architecture.
> > We
> > > > will need to have a very strong reason to move away from it.
> > > >
> > > > D.
> > > >
> > > > On Fri, Jan 13, 2017 at 2:39 AM, Vladimir Ozerov <
> voze...@gridgain.com
> > >
> > > > wrote:
> > > >
> > > > > Correct, it worked, because Ignite has never had real database use
> > case
> > > > in
> > > > > mind. Unfortunately, if our global plans go as expected, it will
> not
> > > work
> > > > > for Ignite 2.x+.
> > > > >
> > > > > On Fri, Jan 13, 2017 at 11:53 AM, Sergi Vladykin <
> > > > sergi.vlady...@gmail.com
> > > > > >
> > > > > wrote:
> > > > >
> > > > > > Lets move on with SQL schema == Ignite cache. It worked always
> like
> > > > > this, I
> > > > > > see no reasons to change this.
> > > > > >
> > > > > > Sergi
> > > > > >
> > > > > > 2017-01-13 11:20 GMT+03:00 Vladimir Ozerov <voze...@gridgain.com
> >:
> > > > > >
> > > > > > > "Tablespace" (Oracle, PostgreSQL) is what maps better than
> > "schema"
> > > > to
> > > > > > our
> > > > > > > cache. But not ideally still.
> > > > > > >
> > > > > > > On Fri, Jan 13, 2017 at 11:10 AM, Vladimir Ozerov <
> > > > > voze...@gridgain.com>
> > > > > > > wrote:
> > > > > > >
> > > > > > > > Alex,
> > > > > > > >
> > > > > > > > Currently Ignite is not used as database. It is used as
> search
> > > > > engine -
> > > > > > > > several types, several tables, several joins. This is why
> > having
> > > > > > "SCHEMA
> > > > > > > ==
> > > > > > > > cache" was never a problem. Users have never build complex
> SQL
> > > > > > > applications
> > > > > > > > on top of Ignite. But we are going towards database. And my
> > > > question
> > > > > > > stands
> > > > > > > > still - suppose it is Y2019, how is user going to migrate his
> > > > > database
> > > > > > > > containing 20-30-50-100 tables in a single schema in Oracle
> to
> > > > > Ignite?
> > > > > > > >
> > > > > > > > Single cache for all tables? Doens't work - not flexible.
> Users
> > > > will
> > > > > > > > definitely require different cache modes, different
> co-location
> > > > > rules,
> > > > > > > > different number of backups, etc..
> > > > > > > > Schema per table? Doesn't work either - unmanageable and not
> > > > > convenient
> > > > > > > > for users even for relatively small databases.
> > > > > > > >
> > > > > > > > From user perspective schema is logical grouping of database
> > > > objects,
> > > > > > > > nothing more.
> > > > > > > >
> > > > > > > > For Ignite schema could be a logical group of resources
> (nodes,
> > > > > memory
> > > > > > > > pools, caches, etc.). And multiple tables over multiple
> caches
> > > > should
> > > > > > > > reside in it. To the contrast, table definition governs how
> > data
> > > is
> > > > > > > stored.
> > > > > > > > This is similar to, for example, MySQL approach, where you
> > define
> > > > how
> > > > > > you
> > > > > > > > store data on per-table level, and on schema level you define
> > > only
> > > > > > minor
> > > > > > > > things like collation.
> > > > > > > >
> > > > > > > > Vladimir.
> > > > > > > >
> > > > > > > >
> > > > > > > > On Fri, Jan 13, 2017 at 10:33 AM, Alexander Paschenko <
> > > > > > > > alexander.a.pasche...@gmail.com> wrote:
> > > > > > > >
> > > > > > > >> Vova,
> > > > > > > >>
> > > > > > > >> 2017-01-13 4:56 GMT+08:00 Vladimir Ozerov <
> > voze...@gridgain.com
> > > >:
> > > > > > > >> > I am not quite sure I understand the idea of "SCHEMA ==
> > > cache".
> > > > > > > Consider
> > > > > > > >> > some small database with, say, ~30 tables. And user wants
> to
> > > > > migrate
> > > > > > > to
> > > > > > > >> > Ignite. How is he supposed to do so? 30 schemas leading to
> > > > rewrite
> > > > > > of
> > > > > > > >> all
> > > > > > > >> > his SQL scripts? Or 30 key-value pairs in a single cache
> > > leading
> > > > > to
> > > > > > > >> lack of
> > > > > > > >> > flexibility and performance problems?
> > > > > > > >>
> > > > > > > >> But currently schema *is* semantically equal to cache while
> > > table
> > > > is
> > > > > > > >> equal to type descriptor (i.e. type of stored entities),
> > nothing
> > > > new
> > > > > > > >> here.
> > > > > > > >>
> > > > > > > >> Say, in single cache we may have entities of types Person
> and
> > > > > > > >> Organization, those map to two tables with same names, and
> can
> > > be
> > > > > > > >> accessed within the same cache (i.e. schema).
> > > > > > > >>
> > > > > > > >> If we want to limit the user with having single type
> > descriptor
> > > > per
> > > > > > > >> cache (i.e. cache has only one type of stored entities -
> BTW,
> > > > where
> > > > > we
> > > > > > > >> are with this 2.0-wise?), then this notion could change. But
> > > > > currently
> > > > > > > >> what has been suggested already fits quite good with what we
> > do
> > > > have
> > > > > > > >> at the moment regarding semantic of SQL objects.
> > > > > > > >>
> > > > > > > >> - Alex
> > > > > > > >>
> > > > > > > >> > Another example is how to deal with referene tables? Lots
> > > > database
> > > > > > has
> > > > > > > >> > small reference tables which is best to fit REPLICATED
> > cache,
> > > > > while
> > > > > > > >> others
> > > > > > > >> > are usually bound to PARTITIONED mode. "SCHEMA == cache"
> > will
> > > > > force
> > > > > > > >> users
> > > > > > > >> > to split them into separate schemes leading to poor user
> > > > > experience.
> > > > > > > >> >
> > > > > > > >> > I understand that we may have some implementation details
> > > around
> > > > > it
> > > > > > at
> > > > > > > >> the
> > > > > > > >> > moment. But from user perspective "SCHEMA == cache"
> doesn't
> > > make
> > > > > > > sense.
> > > > > > > >> As
> > > > > > > >> > we are going towards AI 2.0 we'd better to rethink this
> > > > approach.
> > > > > > > >> >
> > > > > > > >> > On Thu, Jan 12, 2017 at 11:46 PM, Denis Magda <
> > > > dma...@apache.org>
> > > > > > > >> wrote:
> > > > > > > >> >
> > > > > > > >> >>
> > > > > > > >> >> > On Jan 12, 2017, at 12:35 PM, Dmitriy Setrakyan <
> > > > > > > >> dsetrak...@apache.org>
> > > > > > > >> >> wrote:
> > > > > > > >> >> >
> > > > > > > >> >> > On Thu, Jan 12, 2017 at 9:47 AM, Sergi Vladykin <
> > > > > > > >> >> sergi.vlady...@gmail.com>
> > > > > > > >> >> > wrote:
> > > > > > > >> >> >
> > > > > > > >> >> >> The xml config was only for example. We can put in
> this
> > > > > > > >> configuration
> > > > > > > >> >> >> string cache config parameters directly like this:
> > > > > > > >> >> >>
> > > > > > > >> >> >> CREATE SCHEMA "MyCacheName" WITH
> > > > > > > >> >> >> "cacheMode=REPLICATED;atomicityMode=ATOMIC"
> > > > > > > >> >> >>
> > > > > > > >> >> >
> > > > > > > >> >> > This approach makes sense, if it can be easily
> supported
> > > with
> > > > > H2.
> > > > > > > >> >>
> > > > > > > >> >> What’s for affinity keys? Can we make an exception for
> them
> > > by
> > > > > > > >> defining in
> > > > > > > >> >> this part of the statement
> > > > > > > >> >>
> > > > > > > >> >> CREATE TABLE employee (
> > > > > > > >> >>    id BIGINT PRIMARY KEY,
> > > > > > > >> >>    dept_id BIGINT AFFINITY KEY,
> > > > > > > >> >>    name VARCHAR(128),
> > > > > > > >> >> );
> > > > > > > >> >>
> > > > > > > >> >> or that l
> > > > > > > >> >>
> > > > > > > >> >> CREATE TABLE employee (
> > > > > > > >> >>    id BIGINT PRIMARY KEY,
> > > > > > > >> >>    dept_id BIGINT,
> > > > > > > >> >>    name VARCHAR(128),
> > > > > > > >> >>    CONSTRAINT affKey AFFINITY KEY(dept_id)
> > > > > > > >> >> );
> > > > > > > >> >>
> > > > > > > >> >> ?
> > > > > > > >> >>
> > > > > > > >> >> —
> > > > > > > >> >> Denis
> > > > > > > >> >>
> > > > > > > >> >>
> > > > > > > >>
> > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
>



-- 
Sergey Kozlov
GridGain Systems
www.gridgain.com

Re: DDL implementation details

Reply via email to