[openstack-dev] [Zun]Use 'uuid' instead of 'id' as object ident in data model

2017-02-19 Thread Wenzhi Yu
Hi team,

I need your advice on this patch[1], which aims to implement the etcd DB data
model and API for the 'ResourceClass' object.

As you may know, in the MySQL implementation, MySQL generates an 'id' field,
which is a unique, auto-incrementing integer. The 'id' is also used as the
primary key or a foreign key in MySQL[2].

However, in the etcd implementation, etcd will NOT generate this 'id' itself,
so I intend to use the 'uuid' attribute of the object instead of 'id', and to
modify the DB API methods to use 'uuid' as the object ident instead of 'id',
like[3]. Personally I feel using 'uuid' is more reasonable because 'id' is a
field specific to a DB like MySQL, and it does not seem to have any actual
meaning in the data model, right?

An alternative Hongbin suggested is to generate a unique 'id' ourselves, as
MySQL does, and insert that 'id' into the etcd data model. But he said he's OK
with replacing 'id' with 'uuid' if it does not break anything.
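
To make the two options concrete, here is a minimal sketch. It assumes a plain
dict as a stand-in for the etcd keyspace; the key layout
'/resource_classes/<ident>' and the helper names are hypothetical and are not
taken from the patch.

```python
import itertools
import json
import uuid

# Stand-in for the etcd keyspace: a flat mapping of string keys to JSON values.
store = {}


def create_by_uuid(values):
    """Option 1: key the object by its own 'uuid' attribute."""
    values = dict(values)
    values.setdefault("uuid", str(uuid.uuid4()))
    store["/resource_classes/" + values["uuid"]] = json.dumps(values)
    return values


# Option 2: emulate MySQL's auto-increment by generating the 'id' ourselves.
# A process-local counter is only safe with a single writer; a real deployment
# would need a counter shared atomically by all writers (an assumption here).
_next_id = itertools.count(1)


def create_by_generated_id(values):
    values = dict(values)
    values["id"] = next(_next_id)
    store["/resource_classes/%d" % values["id"]] = json.dumps(values)
    return values


a = create_by_uuid({"name": "VCPU"})
b = create_by_generated_id({"name": "MEMORY_MB"})
```

Option 1 needs no coordination between writers, which is part of why it looks
simpler for etcd.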

What's your opinion on this issue? Thanks in advance!

[1]https://review.openstack.org/#/c/434909/
[2]https://github.com/openstack/zun/blob/c0cebba170b8e3ea5e62e335536cf974bbbf08ec/zun/db/sqlalchemy/models.py#L200
[3]https://github.com/openstack/zun/blob/c0cebba170b8e3ea5e62e335536cf974bbbf08ec/zun/db/etcd/api.py#L209
 


Best Regards,
Wenzhi Yu (yuywz)



__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [Zun]Use 'uuid' instead of 'id' as object ident in data model

2017-02-20 Thread Qiming Teng
On Mon, Feb 20, 2017 at 02:14:20PM +0800, Wenzhi Yu wrote:
> Hi team,
> 
> I need your advice on this patch[1], which aims to implement etcd DB data
> model and API for 'ResourceClass' object.
>
> As you may know, in mysql implementation, mysql will generate a 'id' field,
> which is an unique and auto increase integer. The 'id' is also used as
> 'primary key' or 'foreign key' in mysql[2].

Can someone remind me of the benefits we get from Integer over UUID as
primary key? UUID, as its name implies, is meant to be an identifier for
a resource. Why are we generating integer key values?

- Qiming
 




Re: [openstack-dev] [Zun]Use 'uuid' instead of 'id' as object ident in data model

2017-02-21 Thread gordon chung


On 21/02/17 01:28 AM, Qiming Teng wrote:
>> in mysql[2].
> Can someone remind me the benefits we get from Integer over UUID as
> primary key? UUID, as its name implies, is meant to be an identifier for
> a resource. Why are we generating integer key values?

this ^. use UUID please. you can google why auto increment is probably
not a good idea.

from a selfish pov, as gnocchi captures data on all resources in 
openstack, we store everything as a uuid anyways. even if your id 
doesn't clash in zun, it has a higher chance of clashing when you 
consider all the other resources from other services.
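
a tiny sketch of the clash, assuming two hypothetical services that each keep
their own auto-increment counter (the id lists are made up for illustration):

```python
import uuid

# Hypothetical per-service tables, each with its own auto-increment counter.
zun_ids = [1, 2, 3]
other_service_ids = [1, 2, 3]

# Integer ids are unique only within one table, so they overlap across services.
clashes = set(zun_ids) & set(other_service_ids)

# UUIDs are generated independently yet are unique for all practical purposes.
zun_uuids = {str(uuid.uuid4()) for _ in range(3)}
other_uuids = {str(uuid.uuid4()) for _ in range(3)}
uuid_clashes = zun_uuids & other_uuids

print(sorted(clashes))
print(len(uuid_clashes))
```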

cheers,
-- 
gord



Re: [openstack-dev] [Zun]Use 'uuid' instead of 'id' as object ident in data model

2017-02-21 Thread Hongbin Lu
Gordon & Qiming,

Thanks for your input. The only reason Zun was using 'id' is that the data
model was copied from other projects and those projects use 'id', but I
can't think of a reason why they used 'id' in the first place. Aggregating
the feedback so far, I think it makes sense for Zun to switch to 'uuid',
since we introduced etcd as an alternative datastore and etcd doesn't
support an auto-increment primary key, unless someone points out a valid
reason to keep using 'id'...

Best regards,
Hongbin




Re: [openstack-dev] [Zun]Use 'uuid' instead of 'id' as object ident in data model

2017-02-21 Thread Pradeep Singh
I also agree with using 'uuid' as the primary key, as that makes more sense
than 'id'.



Re: [openstack-dev] [Zun]Use 'uuid' instead of 'id' as object ident in data model

2017-04-05 Thread Monty Taylor

On 02/21/2017 07:28 AM, gordon chung wrote:
> On 21/02/17 01:28 AM, Qiming Teng wrote:
>>> in mysql[2].
>>
>> Can someone remind me the benefits we get from Integer over UUID as
>> primary key? UUID, as its name implies, is meant to be an identifier for
>> a resource. Why are we generating integer key values?
>
> this ^. use UUID please. you can google why auto increment is a probably
> not a good idea.
>
> from a selfish pov, as gnocchi captures data on all resources in
> openstack, we store everything as a uuid anyways. even if your id
> doesn't clash in zun, it has a higher chance of clashing when you
> consider all the other resources from other services.
>
> cheers,



sorry - I just caught this.

Please do NOT use uuid as a primary key in MySQL:

* A UUID has 36 characters, which makes it bulky.
* InnoDB stores data in PRIMARY KEY order, and all the secondary keys
  also contain the PRIMARY KEY. So having a UUID as the PRIMARY KEY makes
  the indexes bigger, and they may no longer fit in memory.
* Inserts are random and the data is scattered.

In cases where data has a large natural key (like a uuid), it is
considered a best practice to use an auto-increment integer as the
primary key and to put a second column in the table to store the uuid,
potentially with a unique index applied to it for consistency.

That way the external identifier for things like gnocchi can still be
the UUID, but the internal id for the database can be an efficient
auto-increment primary key.
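
A minimal sketch of that layout, using Python's built-in sqlite3 module so it
runs without a MySQL server. SQLite stands in for MySQL here, so the DDL
syntax and storage behaviour differ from InnoDB; the table and column names
are hypothetical.

```python
import sqlite3
import uuid

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE resource_class (
        id   INTEGER PRIMARY KEY AUTOINCREMENT,  -- internal surrogate key
        uuid TEXT NOT NULL UNIQUE,               -- external identifier
        name TEXT NOT NULL
    )
    """
)


def create(name):
    """Insert a row; the database assigns 'id', the caller never supplies it."""
    ident = str(uuid.uuid4())
    cur = conn.execute(
        "INSERT INTO resource_class (uuid, name) VALUES (?, ?)", (ident, name)
    )
    return cur.lastrowid, ident


first_id, first_uuid = create("VCPU")
second_id, second_uuid = create("MEMORY_MB")

# External callers look rows up by uuid; the unique index serves this query.
row = conn.execute(
    "SELECT id, name FROM resource_class WHERE uuid = ?", (second_uuid,)
).fetchone()
```

The UNIQUE constraint on the uuid column is what keeps the external
identifier consistent even though it is not the primary key.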





Re: [openstack-dev] [Zun]Use 'uuid' instead of 'id' as object ident in data model

2017-04-05 Thread Akihiro Motoki
I noticed this thread through Monty's reply. Sorry for my late reply :(

I think we need to think about 'id' separately for API modeling and DB
modeling.

From the API perspective, one of the important things is that 'id' is not
predictable and rarely conflicts. From this perspective, UUID works.

From the DB perspective, the context is different. Efficiency is another
important point, and auto-increment gives us good efficiency.

In most OpenStack projects, we use the 'id' in the database as the 'id' in
the API layer. I am okay with using an incremental integer as 'id' in the DB,
but I don't think it is a good idea to use a predictable 'id' in the API
layer.

I don't know how 'id' in the API and DB layers are related in the Zun
implementation, but I believe this is one of the important points.
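
As a rough illustration of keeping the two layers separate, here is a sketch
that assumes a database row carrying both an integer 'id' and a 'uuid'. The
row contents and the to_api() helper are hypothetical, not Zun code.

```python
import uuid

# Hypothetical database row: integer surrogate key plus a uuid column.
db_row = {"id": 42, "uuid": str(uuid.uuid4()), "name": "VCPU"}


def to_api(row):
    """Build the API representation: expose the uuid as 'id', hide the integer."""
    return {"id": row["uuid"], "name": row["name"]}


api_obj = to_api(db_row)
```

With a mapping like this, the predictable integer never leaves the DB layer.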

Akihiro





Re: [openstack-dev] [Zun]Use 'uuid' instead of 'id' as object ident in data model

2017-04-05 Thread gordon chung


On 05/04/17 09:00 AM, Monty Taylor wrote:
>
> Please do NOT use uuid as a primary key in MySQL:
>
> * UUID has 36 characters which makes it bulky.

you can store it as a binary if space is a concern.

> * InnoDB stores data in the PRIMARY KEY order and all the secondary keys
> also contain PRIMARY KEY. So having UUID as PRIMARY KEY makes the index
> bigger which can not be fit into the memory
> * Inserts are random and the data is scattered.

can store an ordered uuid (uuid1) for performance but arguably not much 
diff from just autoincrement

>
> In cases where data has a large natural key (like a uuid) It is
> considered a best practice to use an auto-increment integer as the
> primary key and to put a second column in the table to store the uuid,
> potentially with a unique index applied to it for consistency.
>
> That way the external identifier for things like gnocchi can still be
> the UUID, but the internal id for the database can be an efficient
> auto-increment primary key.

very good points. i guess ultimately you should probably just test at the 
scale you hope for
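
for the binary storage point, a quick sketch of the sizes involved using
python's uuid module (the column types in the comments are just the obvious
candidates, not something taken from zun):

```python
import uuid

u = uuid.uuid4()
as_text = str(u)      # canonical form with hyphens, e.g. for a CHAR(36) column
as_hex = u.hex        # 32 hex digits without hyphens
as_bytes = u.bytes    # raw 128-bit value, e.g. for a BINARY(16) column

print(len(as_text), len(as_hex), len(as_bytes))

# The binary form round-trips to the same UUID.
restored = uuid.UUID(bytes=as_bytes)
```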

cheers,

-- 
gord


Re: [openstack-dev] [Zun]Use 'uuid' instead of 'id' as object ident in data model

2017-04-05 Thread Monty Taylor

On 04/05/2017 09:39 AM, Akihiro Motoki wrote:
> I noticed this thread by Monty's reply. Sorry for my late :(
>
> I think we need to think 'id' separately for API modeling and DB modeling.
>
> In the API perspective, one of the important things is that 'id' is
> not predictable and it rarely conflict. From this perspective, UUID works.
>
> In the DB perspective, the context will be different.
> Efficiency is another important point.
> auto-incremental way brings us a good efficiency.
>
> In most OpenStack projects, we use 'id' in a database as 'id' in an API layer.
> I am okay with using incremental integer as 'id' in DB, but I don't think
> it is not a good idea to use predictable 'id' in the API layer.
>
> I don't know how 'id' in API and DB layer are related in Zun implementation
> but I believe this is one of the important point.


Yes! Very well said. UUID is an excellent choice for the API - auto-inc is 
an excellent choice for the database.








Re: [openstack-dev] [Zun]Use 'uuid' instead of 'id' as object ident in data model

2017-04-06 Thread Mike Bayer



On 04/05/2017 11:00 AM, Monty Taylor wrote:
> On 04/05/2017 09:39 AM, Akihiro Motoki wrote:
>> I noticed this thread by Monty's reply. Sorry for my late :(
>>
>> I think we need to think 'id' separately for API modeling and DB
>> modeling.
>>
>> In the API perspective, one of the important things is that 'id' is
>> not predictable and it rarely conflict. From this perspective, UUID works.
>>
>> In the DB perspective, the context will be different.
>> Efficiency is another important point.
>> auto-incremental way brings us a good efficiency.
>>
>> In most OpenStack projects, we use 'id' in a database as 'id' in an
>> API layer.
>> I am okay with using incremental integer as 'id' in DB, but I don't think
>> it is not a good idea to use predictable 'id' in the API layer.
>>
>> I don't know how 'id' in API and DB layer are related in Zun
>> implementation but I believe this is one of the important point.
>
> Yes! Very well said. UUID is the excellent choice for API - auto-inc is
> the excellent choice for the database.


+1

the primary key datatype also determines the datatype of any columns 
constrained by a foreign key to it, and those columns usually get indexed too.
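
A back-of-envelope sketch of how the key width multiplies. It is a rough
estimate only: it ignores row headers, page fill and charset width (CHAR(36)
is counted as 36 bytes, i.e. a single-byte charset), and the function and
parameter names are made up for illustration.

```python
# Bytes needed to store one key value of each candidate type.
KEY_BYTES = {"INT": 4, "BINARY(16)": 16, "CHAR(36)": 36}


def key_bytes_per_row(key_type, secondary_indexes, fk_references):
    """Estimate the bytes spent on copies of one primary-key value.

    Counted copies: one in the clustered index, one per secondary index entry
    of the same table, and two per referencing row (the foreign-key column
    itself plus the index that is usually created on it).
    """
    size = KEY_BYTES[key_type]
    copies = 1 + secondary_indexes + 2 * fk_references
    return size * copies


for key_type in KEY_BYTES:
    print(key_type, key_bytes_per_row(key_type, secondary_indexes=2, fk_references=1))
```

Under these assumptions the same row costs 20 bytes of key copies with INT
and 180 bytes with CHAR(36).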














Re: [openstack-dev] [Zun]Use 'uuid' instead of 'id' as object ident in data model

2017-04-06 Thread Mike Bayer



On 04/05/2017 11:02 AM, gordon chung wrote:
> On 05/04/17 09:00 AM, Monty Taylor wrote:
>> Please do NOT use uuid as a primary key in MySQL:
>>
>> * UUID has 36 characters which makes it bulky.
>
> you can store it as a binary if space is a concern.

this is highly inconvenient from a datadump / MySQL commandline 
perspective.

>> * InnoDB stores data in the PRIMARY KEY order and all the secondary keys
>> also contain PRIMARY KEY. So having UUID as PRIMARY KEY makes the index
>> bigger which can not be fit into the memory
>> * Inserts are random and the data is scattered.
>
> can store a ordered uuid (uuid1) for performance but arguably not much
> diff from just autoincrement
>
>> In cases where data has a large natural key (like a uuid) It is
>> considered a best practice to use an auto-increment integer as the
>> primary key and to put a second column in the table to store the uuid,
>> potentially with a unique index applied to it for consistency.
>>
>> That way the external identifier for things like gnocchi can still be
>> the UUID, but the internal id for the database can be an efficient
>> auto-increment primary key.
>
> very good points. i guess ultimately should probably just test to the
> scale you hope for


there's no advantage to the UUID being the physical primary key of the 
table.  If you don't care about the surrogate integer, just ignore it; 
it gets created for you.   The only argument I can see is that you 
really want to generate rows in Python that refer to the UUID of another 
row and you want that UUID to go straight into a foreign-key constrained 
column, in which case I'd urge you to instead use idiomatic SQLAlchemy 
ORM patterns for data manipulation (e.g. relationships).


The surrogate integer thing is the use case that all database engines 
are very well tested for and while it is not "pure" from Codd's point of 
view, it is definitely the most pragmatic approach from many different 
perspectives.
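
A minimal sketch of the surrogate-integer pattern with a foreign key. It uses
plain SQL through Python's built-in sqlite3 module rather than the SQLAlchemy
ORM relationships recommended above, only so that it stays self-contained; an
ORM relationship would hide the uuid-to-id lookup shown here. The 'container'
and 'volume' tables are hypothetical.

```python
import sqlite3
import uuid

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript(
    """
    CREATE TABLE container (
        id   INTEGER PRIMARY KEY AUTOINCREMENT,
        uuid TEXT NOT NULL UNIQUE
    );
    CREATE TABLE volume (
        id           INTEGER PRIMARY KEY AUTOINCREMENT,
        uuid         TEXT NOT NULL UNIQUE,
        container_id INTEGER NOT NULL REFERENCES container (id)
    );
    CREATE INDEX ix_volume_container_id ON volume (container_id);
    """
)

container_uuid = str(uuid.uuid4())
conn.execute("INSERT INTO container (uuid) VALUES (?)", (container_uuid,))

# The caller only knows the container's uuid; the subquery resolves it to the
# integer surrogate key, so the foreign-key column stays a compact integer.
conn.execute(
    """
    INSERT INTO volume (uuid, container_id)
    SELECT ?, id FROM container WHERE uuid = ?
    """,
    (str(uuid.uuid4()), container_uuid),
)

rows = conn.execute(
    """
    SELECT v.container_id, c.uuid
    FROM volume AS v JOIN container AS c ON c.id = v.container_id
    """
).fetchall()
```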





> cheers,


