Re: [openstack-dev] [Glance] Replacing Glance DB code to Oslo DB code.

2013-08-20 Thread John Bresnahan
Mark, good thoughts (as usual)

On 08/19/2013 09:15 PM, Mark Washenberger wrote:
> The goal isn't really to replace sqlalchemy completely.

Perhaps my problem is that I am not exactly sure what the goals are.
Cleanup (BL mixed into the DB driver seems wrong)? HA or performance (are
people hitting limits that are traced to SQL)? Flexibility/research
(pluggable modules for experimentation)? I think it would help scope
the effort (and temper my concern about the work/reward ratio) if they
were enumerated in a clear place.

> I'm hoping I can
> create a space where multiple drivers can operate efficiently without
> introducing bugs (i.e. pull all that business logic out of the driver!)
> I'll be very interested to see if people can, after such a refactoring,
> try out some more storage approaches, such as dropping the sqlalchemy
> orm in favor of its generic engine support or direct sql execution, as
> well as NoSQL what-have-you. We don't have to make all of the drivers
> live in the project, so it really can be a good place for interested
> parties to experiment.


___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [Glance] Replacing Glance DB code to Oslo DB code.

2013-08-20 Thread Mark Washenberger
On Tue, Aug 20, 2013 at 3:20 AM, Flavio Percoco  wrote:

> On 20/08/13 00:15 -0700, Mark Washenberger wrote:
>
>> 2) I highly caution folks who think a No-SQL store is a good storage
>> solution for any of the data currently used by Nova, Glance (registry),
>> Cinder (registry), Ceilometer, and Quantum. All of the data stored and
>> manipulated in those projects is HIGHLY relational data, and not
>> objects/documents. Switching to use a KVS for highly relational data is
>> a terrible decision. You will just end up implementing joins in your
>> code...
>>
>> +1
>>
>> FWIW, I'm a huge fan of NoSQL technologies but I couldn't agree more
>> here.
>>
>> I have to say I'm kind of baffled by this sentiment (expressed here and
>> elsewhere in the thread.) I'm not a NoSQL expert, but I hang out with a
>> few and I'm pretty confident Glance at least is not that relational. We
>> do two types of joins in glance. The first, like image properties, is
>> basically just an implementation detail of the sql driver. It's not core
>> to the application. Any NoSQL implementation will simply completely
>> denormalize those properties into the image record. (And honestly, so
>> might an optimized SQL implementation. . .)
>>
>> The second type of join, image_members, is basically just a hack to solve
>> the problem created because the glance api offers several simultaneous
>> implicit "views" of images. Specifically, when you list images in glance,
>> you are seeing a union of three views: public images, images you own, and
>> images shared with you. IMO it's actually a more scalable and sensible
>> solution to make these views more explicit and independent in the API and
>> code, taking a lesson from filesystems which have to scale to a lot of
>> metadata (notice how visibility is generally an attribute of a directory,
>> not of regular files in your typical Unix FS?). And to solve this problem
>> in SQL now we still have to do a server-side union, which is a bit sad.
>> But even before we can refactor the API (v3 anyone?) I don't see it as
>> unworkably slow for a NoSQL driver to track these kinds of views.
>>
>
> You make really good points here but I don't fully agree.
>

Thanks for your measured response. I wrote my previous response a bit late
at night for me and I hope I wasn't rude :-/

>
> I don't think the issue is actually translating Glance's models to
> NoSQL or NoSQL db's performance, I'm pretty sure we could benefit in some
> areas but not all of them. To me, and that's what my comment was referring
> to, this is more related to  what kind of data we're actually
> treating, the guarantees we should provide and how they are
> implemented now.
>
> There are a couple of things that would worry me about hypothetical
> support for NoSQL, but one that I'd consider very critical is
> migrations. Some could question whether we'd really need them when
> talking about NoSQL databases, but we do. Using a schemaless database
> wouldn't mean we don't have a schema. Migrations are not trivial for
> some NoSQL databases; plus, this would mean drivers would most probably
> have to have their own implementation.


I definitely think different drivers will need their own migrations. When
I've been playing around with this refactoring, I created a "Migrator"
interface and made it part of the driver interface to instantiate an
appropriate migrator object. But I was definitely concerned about a number
of things here. First off, is it just too confusing to have multiple
migrations? The migration sequences will definitely need to be different
per driver. How do we support cross-driver migrations?
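
To make the idea concrete, here is a minimal sketch of what such an interface could look like. It is not the code from the refactoring: `Driver`, `Migrator`, `DictDriver`, and the two step functions are hypothetical names, and a plain dict stands in for a real backend.

```python
from abc import ABC, abstractmethod


class Migrator(ABC):
    """Hypothetical per-driver migration interface (not Glance's actual API)."""

    @abstractmethod
    def current_version(self) -> int:
        """Return the number of migration steps already applied."""

    @abstractmethod
    def upgrade(self, target: int) -> None:
        """Apply pending steps, in order, until `target` is reached."""


class Driver(ABC):
    """Hypothetical storage driver; it knows how to build its own migrator."""

    @abstractmethod
    def get_migrator(self) -> Migrator:
        """Instantiate the migrator appropriate for this backend."""


def _add_images_bucket(store):
    # Illustrative step 1: make room for image records.
    store.setdefault("images", {})


def _add_members_bucket(store):
    # Illustrative step 2: make room for membership records.
    store.setdefault("members", {})


class DictMigrator(Migrator):
    """Migrator for the toy dict backend; the version lives in the store."""

    def __init__(self, store, steps):
        self._store = store
        self._steps = list(steps)

    def current_version(self):
        return self._store.get("_version", 0)

    def upgrade(self, target):
        if not 0 <= target <= len(self._steps):
            raise ValueError("unknown migration version: %r" % target)
        for step in self._steps[self.current_version():target]:
            step(self._store)
            self._store["_version"] = self.current_version() + 1


class DictDriver(Driver):
    """Toy driver with its own, driver-specific migration sequence."""

    STEPS = (_add_images_bucket, _add_members_bucket)

    def __init__(self):
        self.store = {}

    def get_migrator(self):
        return DictMigrator(self.store, self.STEPS)
```

Each driver carries its own sequence of steps, which is exactly why the cross-driver question stays open: nothing in this shape says how version 2 of one driver relates to version 2 of another.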


>
>> The bigger concern to me is that Glance seems a bit trigger-happy with
>> indexes. But I'm confident we're in a similar boat there: performance in
>> NoSQL won't be that terrible for the most important use cases, and a later
>> refactoring can put us on a more sustainable track in the long run.
>>
>
> I'm not worried about this, though.
>

Okay, that is reassuring.


>
>>>> All I'm saying is that we should be careful not to swap one set of
>>>> problems for another.
>>>
>>> My 2 cents: I am in agreement with Jay.  I am leery of NoSQL being a
>>> direct sub in and I fear that this effort can be adding a large workload
>>> for little benefit.
>>>
>> The goal isn't really to replace sqlalchemy completely. I'm hoping I can
>> create a space where multiple drivers can operate efficiently without
>> introducing bugs (i.e. pull all that business logic out of the driver!)
>> I'll be very interested to see if people can, after such a refactoring,
>> try out some more storage approaches, such as dropping the sqlalchemy
>> orm in favor of its generic engine support or direct sql execution, as
>> well as NoSQL what-have-you. We don't have
>> to make all of the drivers live

Re: [openstack-dev] [Glance] Replacing Glance DB code to Oslo DB code.

2013-08-20 Thread Johannes Erdfelt
On Tue, Aug 20, 2013, Flavio Percoco  wrote:
> There are a couple of things that would worry me about hypothetical
> support for NoSQL, but one that I'd consider very critical is
> migrations. Some could question whether we'd really need them when
> talking about NoSQL databases, but we do. Using a schemaless database
> wouldn't mean we don't have a schema. Migrations are not trivial for
> some NoSQL databases; plus, this would mean drivers would most probably
> have to have their own implementation.

Migrations aren't always about the schema. Take migrations 015 and 017
in glance for instance. They migrate data by fixing the URI and making
sure it's quoted correctly. The schema doesn't change, but the data
does.

This shares many of the same practical problems that schema migrations
have and would apply to NoSQL databases.
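
A rough sketch of that kind of data-only migration, assuming records are plain dicts with a `location` field. The helper names are made up for illustration, and this is not the code of migrations 015 or 017; it only shows that the same fix-up is needed whatever the backend is.

```python
from urllib.parse import quote, unquote, urlsplit, urlunsplit


def requote_location(uri):
    """Return `uri` with its path percent-encoded exactly once."""
    parts = urlsplit(uri)
    # Unquote first so an already-quoted path is not quoted twice.
    path = quote(unquote(parts.path), safe="/")
    return urlunsplit(
        (parts.scheme, parts.netloc, path, parts.query, parts.fragment)
    )


def migrate_locations(records):
    """Fix the location of every record in place; return how many changed.

    `records` is any iterable of dicts, so the same loop could walk SQL
    rows or documents from a NoSQL store. No schema is touched.
    """
    changed = 0
    for record in records:
        fixed = requote_location(record["location"])
        if fixed != record["location"]:
            record["location"] = fixed
            changed += 1
    return changed
```

Running `migrate_locations` a second time changes nothing, which is a property you would want from any such migration regardless of the store.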

JE




Re: [openstack-dev] [Glance] Replacing Glance DB code to Oslo DB code.

2013-08-20 Thread Flavio Percoco

On 20/08/13 00:15 -0700, Mark Washenberger wrote:


>>> 2) I highly caution folks who think a No-SQL store is a good storage
>>> solution for any of the data currently used by Nova, Glance (registry),
>>> Cinder (registry), Ceilometer, and Quantum. All of the data stored and
>>> manipulated in those projects is HIGHLY relational data, and not
>>> objects/documents. Switching to use a KVS for highly relational data is
>>> a terrible decision. You will just end up implementing joins in your
>>> code...
>>
>> +1
>>
>> FWIW, I'm a huge fan of NoSQL technologies but I couldn't agree more
>> here.
>
> I have to say I'm kind of baffled by this sentiment (expressed here and
> elsewhere in the thread.) I'm not a NoSQL expert, but I hang out with a few and
> I'm pretty confident Glance at least is not that relational. We do two types of
> joins in glance. The first, like image properties, is basically just an
> implementation detail of the sql driver. It's not core to the application. Any
> NoSQL implementation will simply completely denormalize those properties into
> the image record. (And honestly, so might an optimized SQL implementation. . .)
>
> The second type of join, image_members, is basically just a hack to solve the
> problem created because the glance api offers several simultaneous implicit
> "views" of images. Specifically, when you list images in glance, you are seeing
> a union of three views: public images, images you own, and images shared with
> you. IMO it's actually a more scalable and sensible solution to make these views
> more explicit and independent in the API and code, taking a lesson from
> filesystems which have to scale to a lot of metadata (notice how visibility is
> generally an attribute of a directory, not of regular files in your typical
> Unix FS?). And to solve this problem in SQL now we still have to do a
> server-side union, which is a bit sad. But even before we can refactor the API
> (v3 anyone?) I don't see it as unworkably slow for a NoSQL driver to track
> these kinds of views.


You make really good points here but I don't fully agree.

I don't think the issue is actually translating Glance's models to
NoSQL, or NoSQL databases' performance; I'm pretty sure we could benefit
in some areas but not all of them. To me, and that's what my comment was
referring to, this is more related to what kind of data we're actually
handling, the guarantees we should provide, and how they are
implemented now.

There are a couple of things that would worry me about hypothetical
support for NoSQL, but one that I'd consider very critical is
migrations. Some could question whether we'd really need them when
talking about NoSQL databases, but we do. Using a schemaless database
wouldn't mean we don't have a schema. Migrations are not trivial for
some NoSQL databases; plus, this would mean drivers would most probably
have to have their own implementation.


> The bigger concern to me is that Glance seems a bit trigger-happy with indexes.
> But I'm confident we're in a similar boat there: performance in NoSQL won't be
> that terrible for the most important use cases, and a later refactoring can put
> us on a more sustainable track in the long run.


I'm not worried about this, though. 




>>> All I'm saying is that we should be careful not to swap one set of
>>> problems for another.
>>
>> My 2 cents: I am in agreement with Jay.  I am leery of NoSQL being a
>> direct sub in and I fear that this effort can be adding a large workload
>> for little benefit.
>
> The goal isn't really to replace sqlalchemy completely. I'm hoping I can create
> a space where multiple drivers can operate efficiently without introducing bugs
> (i.e. pull all that business logic out of the driver!) I'll be very interested
> to see if people can, after such a refactoring, try out some more storage
> approaches, such as dropping the sqlalchemy orm in favor of its generic engine
> support or direct sql execution, as well as NoSQL what-have-you. We don't have
> to make all of the drivers live in the project, so it really can be a good
> place for interested parties to experiment.


And this is exactly what I'm concerned about. There's a lot of
business logic implemented at the driver level right now which makes
it really difficult (impossible?) to even think about using a NoSQL
database. However, I'm not even sure that taking BL to a higher level
would be the "go-time" for new NoSQL drivers. 


As mentioned already, this might end up in app-level implementations
that shouldn't be there.

Again, I'm not arguing about NoSQL capabilities in this matter (I'm a huge
fan of NoSQL technologies); what I'd argue is whether they are the
best tool for this task. This is something that should be evaluated on
a per-module basis, which I obviously don't have complete knowledge
of.

Cheers,
FF

--
@flaper87
Flavio Percoco


Re: [openstack-dev] [Glance] Replacing Glance DB code to Oslo DB code.

2013-08-20 Thread Mark Washenberger
So much great stuff to respond to in this snip and response!


On Mon, Aug 19, 2013 at 2:17 AM, Flavio Percoco  wrote:

> On 19/08/13 02:51 -0400, Jay Pipes wrote:
>
>> On 08/19/2013 12:56 AM, Joshua Harlow wrote:
>>
>>> Another good article from an ex-coworker that keeps on making more and
>>> more sense the more projects I get into...
>>>
>>> http://seldo.com/weblog/2011/08/11/orm_is_an_antipattern
>>>
>>> Your mileage/opinion though may vary :)
>>>
>>
This article looks great--but I think it depends on taking the incredibly
limited / incorrect view of an ORM that has been popularized and that we
currently employ in many OpenStack projects.

In particular, the critical issue is, are you actually using a mapper? Do
you *know* what the mapper pattern is? The key part of the mapper is that
you've got two components, say A and B, that you desperately want to keep
decoupled. Now, A and B need to interact, but they both are very likely to
need to change. And maybe what's worse, they really suck to change together
at the same time. (A great example of A and B is "db schema" and "business
logic".) Since they need to interact, which one is going to depend on the
other? A mapper M solves this problem by depending on both A and B,
allowing the two key modules to continue to evolve independently.

This approach would be amazing for our CD efforts, because it lets you move
one step at a time. In one deploy, you update the schema and mapper, but
keep the application code the same. In the next change, you just change the
application code. And so forth, allowing a solution to the problem of
temporary schema/code/functionality incompatibilities (well for part of the
problem anyway) that happens during a large-scale deployment.

But of course sqlalchemy's declarative models hamstring any such effort
while simultaneously teaching developers entirely the wrong lesson about
mappers! What I mean is that if I'm using sqlalchemy model objects and want
to change a table definition, I have to change a model object and thus much
of my application code. Decoupling went flying out the window. . . sad
times.
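
A minimal sketch of the mapper shape I have in mind, with made-up names throughout (`Image`, `ImageMapper`, the `visibility` column). It assumes, purely for illustration, a schema that stores visibility as a string while the business logic wants a boolean.

```python
from dataclasses import dataclass


@dataclass
class Image:
    """B: the object business logic works with. It knows nothing about rows."""
    image_id: str
    name: str
    is_public: bool


class ImageMapper:
    """M: the only code that depends on both the row layout (A) and Image (B)."""

    def to_domain(self, row):
        return Image(
            image_id=row["id"],
            name=row["name"],
            is_public=row["visibility"] == "public",
        )

    def to_row(self, image):
        return {
            "id": image.image_id,
            "name": image.name,
            "visibility": "public" if image.is_public else "private",
        }


def can_list_publicly(image):
    """Business logic written purely against B."""
    return image.is_public
```

If the schema later changes how visibility is stored, only `ImageMapper` has to change in that deploy; `Image` and `can_list_publicly` stay as they are. With declarative models the row layout and the business object are the same class, so that separation is not available.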



>
>> 2) I highly caution folks who think a No-SQL store is a good storage
>> solution for any of the data currently used by Nova, Glance (registry),
>> Cinder (registry), Ceilometer, and Quantum. All of the data stored and
>> manipulated in those projects is HIGHLY relational data, and not
>> objects/documents. Switching to use a KVS for highly relational data is a
>> terrible decision. You will just end up implementing joins in your code...
>>
>>
> +1
>
> FWIW, I'm a huge fan of NoSQL technologies but I couldn't agree more
> here.
>
>
I have to say I'm kind of baffled by this sentiment (expressed here and
elsewhere in the thread.) I'm not a NoSQL expert, but I hang out with a few
and I'm pretty confident Glance at least is not that relational. We do two
types of joins in glance. The first, like image properties, is basically
just an implementation detail of the sql driver. It's not core to the
application. Any NoSQL implementation will simply completely denormalize
those properties into the image record. (And honestly, so might an
optimized SQL implementation. . .)

The second type of join, image_members, is basically just a hack to solve
the problem created because the glance api offers several simultaneous
implicit "views" of images. Specifically, when you list images in glance,
you are seeing a union of three views: public images, images you own, and
images shared with you. IMO it's actually a more scalable and sensible
solution to make these views more explicit and independent in the API and
code, taking a lesson from filesystems which have to scale to a lot of
metadata (notice how visibility is generally an attribute of a directory,
not of regular files in your typical Unix FS?). And to solve this problem
in SQL now we still have to do a server-side union, which is a bit sad. But
even before we can refactor the API (v3 anyone?) I don't see it as
unworkably slow for a NoSQL driver to track these kinds of views.
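
Here is a small sketch of both points, assuming a document-style record where properties and members are embedded in the image itself. The field names and the sample data are invented for illustration; none of this is Glance's real data model.

```python
# Denormalized records: properties and members live inside the image,
# so neither of the two SQL joins is needed to read one image.
IMAGES = [
    {"id": "a", "owner": "alice", "is_public": True,
     "properties": {"arch": "x86_64"}, "members": []},
    {"id": "b", "owner": "bob", "is_public": False,
     "properties": {}, "members": ["alice"]},
    {"id": "c", "owner": "bob", "is_public": False,
     "properties": {}, "members": []},
    {"id": "d", "owner": "alice", "is_public": False,
     "properties": {}, "members": []},
]


def public_images(images):
    return [i for i in images if i["is_public"]]


def owned_images(images, tenant):
    return [i for i in images if i["owner"] == tenant]


def shared_images(images, tenant):
    return [i for i in images if tenant in i["members"]]


def list_images(images, tenant):
    """What the API does today: an implicit union of the three views."""
    seen = {}
    views = (
        public_images(images),
        owned_images(images, tenant),
        shared_images(images, tenant),
    )
    for view in views:
        for image in view:
            # Keep the first occurrence so an image in two views appears once.
            seen.setdefault(image["id"], image)
    return list(seen.values())
```

Each of the three view functions is a simple single-key lookup that a NoSQL driver could serve on its own; it is only the union in `list_images` that forces the driver to merge and de-duplicate.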

The bigger concern to me is that Glance seems a bit trigger-happy with
indexes. But I'm confident we're in a similar boat there: performance in
NoSQL won't be that terrible for the most important use cases, and a later
refactoring can put us on a more sustainable track in the long run.

And then, so I'm not just picking on flaper87. .

jbresnah sez:

>> All I'm saying is that we should be careful not to swap one set of
>> problems for another.

> My 2 cents: I am in agreement with Jay.  I am leery of NoSQL being a
> direct sub in and I fear that this effort can be adding a large workload
> for little benefit.

The goal isn't really to replace sqlalchemy completely. I'm hoping I can
create a space where multiple drivers can operate efficiently without
introducing bugs (i.e. pull all that business logic out of the driver!)
I'll be very interested to see i

Re: [openstack-dev] [Glance] Replacing Glance DB code to Oslo DB code.

2013-08-19 Thread Ben Nemec
 

On 08/19/13 20:34, Joshua Harlow wrote: 

> Just a related question, 
> 
> Oslo 'incubator' db code I think depends on eventlet. This means any code 
> that uses the oslo.db code could/would(?) be dependent on eventlet. 
> 
> Will there be some refactoring there to not require it (useful for projects 
> that are trying to move away from eventlet). 
> 
> https://github.com/openstack/oslo-incubator/blob/master/openstack/common/db/sqlalchemy/session.py#L248
>  [1]

Glancing through that file, it looks like the greenthread import is only
used for playing nice with other greenthreads. It should be pretty easy
to make it conditional so we don't require it, but will use it if it's
available.
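
Something along these lines, as a sketch only. `cooperative_yield` is a name I made up for illustration and is not a function in session.py; the only real API used is `eventlet.greenthread.sleep`.

```python
try:
    # Optional dependency: only used to play nice with other greenthreads.
    from eventlet import greenthread
except ImportError:
    greenthread = None


def cooperative_yield():
    """Yield to other greenthreads if eventlet is installed.

    Returns True when a yield actually happened, False when eventlet is
    not available and the call was a no-op.
    """
    if greenthread is not None:
        greenthread.sleep(0)
        return True
    return False
```

Projects that do not ship eventlet would simply take the no-op branch.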

 -Ben 

Links:
--
[1]
https://github.com/openstack/oslo-incubator/blob/master/openstack/common/db/sqlalchemy/session.py#L248


Re: [openstack-dev] [Glance] Replacing Glance DB code to Oslo DB code.

2013-08-19 Thread Joshua Harlow
Just a related question,

Oslo 'incubator' db code I think depends on eventlet. This means any code that 
uses the oslo.db code could/would(?) be dependent on eventlet.

Will there be some refactoring there to not require it (useful for projects 
that are trying to move away from eventlet).

https://github.com/openstack/oslo-incubator/blob/master/openstack/common/db/sqlalchemy/session.py#L248

From: Boris Pavlovic <bo...@pavlovic.me>
Reply-To: OpenStack Development Mailing List <openstack-dev@lists.openstack.org>
Date: Monday, August 19, 2013 2:12 PM
To: OpenStack Development Mailing List <openstack-dev@lists.openstack.org>
Subject: Re: [openstack-dev] [Glance] Replacing Glance DB code to Oslo DB code.

Mark,

But for a variety of reasons, I do not consider the general thrust of "use oslo
db code" to be approved. Instead, let's continue to consider features from oslo
db on a case by case basis, and see what the right resolution is in each case.

Absolutely agree with this point (e.g. we removed shadow tables from our
roadmap after some discussion in other threads).
So we are planning to make all changes using our common approach called "baby
steps" (not one giant patch set).

Btw, I answered your question about the changed conf parameter in the review
(I mean sql_connection to database.connection).


Best regards,
Boris Pavlovic
---
Mirantis Inc.



On Mon, Aug 19, 2013 at 9:33 PM, Mark Washenberger <mark.washenber...@markwash.net> wrote:
Thanks for refocusing the discussion on your original questions!

Also thanks for this additional summary. I consider the patches you have up for 
review in glance to have a general direction-level green light at this point 
(though I've got a question on the specifics in the ultimate review).

But for a variety of reasons, I do not consider the general thrust of "use oslo
db code" to be approved. Instead, let's continue to consider features from oslo
db on a case by case basis, and see what the right resolution is in each case.

Thanks for your patience and forbearance, hopefully getting in the patches you 
have submitted now will help unblock progress for your team.

On Mon, Aug 19, 2013 at 3:49 AM, Boris Pavlovic <bo...@pavlovic.me> wrote:
Mark,

Main part of oslo is:
1) common migration testing
2) common sqla.models
3) common hacks around sqla and sqla-migrate
4) common work around engines and sessions


All these points are implemented in Glance almost in the same way as in Oslo.
Also we are able to use only part of this code in Glance, and add some other 
things that are glance related over this code.

Our current 2 patches on review do next things:
1) Copy paste oslo.db code into glance
2) Use sqla session/engine/exception wrappers
3) Remove Glance code that covers session/engine/exception

So I really don't see any bad thing in this code:
1) If you would like to implement other backends => this change won't block it
2) If you would like to make some other sqla utilities or glance related things
=> this change won't block it
3) If there are bugs => fix it in oslo and sync => this change won't block it

 So I really don't see any reason to block work around migration to oslo.db 
code in Glance.


Best regards,
Boris Pavlovic
---
Mirantis Inc.




On Fri, Aug 16, 2013 at 10:41 PM, Mark Washenberger <mark.washenber...@markwash.net> wrote:
I would prefer to pick and choose which parts of oslo common db code to reuse 
in glance. Most parts there look great and very useful. However, some parts 
seem like they would conflict with several goals we have.

1) To improve code sanity, we need to break away from the idea of having one 
giant db api interface
2) We need to improve our position with respect to new, non SQL drivers
- mostly, we need to focus first on removing business logic (especially 
authz) from database driver code
- we also need to break away from the strict functional interface, because 
it limits our ability to express query filters and tends to lump all filter 
handling for a given function into a single code block (which ends up being 
defect-rich and confusing as hell to reimplement)
3) It is unfortunate, but I must admit that Glance's code in general is pretty 
heavily coupled to the database code and in particular the schema. Basically 
the only tool we have to manage that problem until we can fix it is to try to 
be as careful as possible about how we change the db code and schema. By 
importing another project, we lose some of that control. Also, even with the 
copy-paste model for oslo incubator, code in oslo does have some of its own 
reasons to change, so we could potentially end up in a conflict where glance db 
migrations (which are operationally costly) have to happen for reasons that 
don't really matter to glance.

So rather than framing this as "glance 

Re: [openstack-dev] [Glance] Replacing Glance DB code to Oslo DB code.

2013-08-19 Thread Boris Pavlovic
Mark,

> But for a variety of reasons, I do not consider the general thrust of "use
> oslo db code" to be approved. Instead, let's continue to consider features
> from oslo db on a case by case basis, and see what the right resolution is
> in each case.

Absolutely agree with this point (e.g. we removed shadow tables from our
roadmap after some discussion in other threads).
So we are planning to make all changes using our common approach called
"baby steps" (not one giant patch set).

Btw, I answered your question about the changed conf parameter in the review
(I mean sql_connection to database.connection).
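
To illustrate what the rename means for a deployer, here is a small sketch using the standard library `configparser` rather than the real option-handling code. It assumes the old option lived in `[DEFAULT]` as `sql_connection` and the new one is `connection` in a `[database]` section; `get_db_connection` is a hypothetical helper, not something in the patches.

```python
import configparser


def get_db_connection(parser):
    """Prefer [database] connection, fall back to the old sql_connection."""
    if parser.has_option("database", "connection"):
        return parser.get("database", "connection")
    # Legacy name, assumed here to sit in the [DEFAULT] section.
    return parser.defaults().get("sql_connection")


def load(text):
    # interpolation=None because connection URLs may contain '%'.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(text)
    return parser


OLD_STYLE = "[DEFAULT]\nsql_connection = sqlite:///old.db\n"
NEW_STYLE = (
    "[DEFAULT]\nsql_connection = sqlite:///old.db\n"
    "[database]\nconnection = mysql://glance@db/glance\n"
)
```

With this shape an existing config file keeps working, and the new option wins as soon as it is set.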


Best regards,
Boris Pavlovic
---
Mirantis Inc.



On Mon, Aug 19, 2013 at 9:33 PM, Mark Washenberger <
mark.washenber...@markwash.net> wrote:

> Thanks for refocusing the discussion on your original questions!
>
> Also thanks for this additional summary. I consider the patches you have
> up for review in glance to have a general direction-level green light at
> this point (though I've got a question on the specifics in the ultimate
> review).
>
> But for a variety of reasons, I do not consider the general thrust of "use
> oslo db code" to be approved. Instead, let's continue to consider features
> from oslo db on a case by case basis, and see what the right resolution is
> in each case.
>
> Thanks for your patience and forbearance, hopefully getting in the patches
> you have submitted now will help unblock progress for your team.
>
> On Mon, Aug 19, 2013 at 3:49 AM, Boris Pavlovic  wrote:
>
>> Mark,
>>
>> Main part of oslo is:
>> 1) common migration testing
>> 2) common sqla.models
>> 3) common hacks around sqla and sqla-migrate
>> 4) common work around engines and sessions
>>
>>
>> All these points are implemented in Glance almost in the same way as in
>> Oslo.
>> Also we are able to use only part of this code in Glance, and add some
>> other things that are glance related over this code.
>>
>> Our current 2 patches on review do next things:
>> 1) Copy paste oslo.db code into glance
>> 2) Use sqla session/engine/exception wrappers
>> 3) Remove Glance code that covers session/engine/exception
>>
>> So I really don't see any bad thing in this code:
>> 1) If you would like to implement other backends => this change won't
>> block it
>> 2) If you would like to make some other sqla utilities or glance related
>> things => this change won't block it
>> 3) If there are bugs => fix it in oslo and sync => this change won't
>> block it
>>
>>  So I really don't see any reason to block work around migration to
>> oslo.db code in Glance.
>>
>>
>> Best regards,
>> Boris Pavlovic
>> ---
>> Mirantis Inc.
>>
>>
>>
>>
>> On Fri, Aug 16, 2013 at 10:41 PM, Mark Washenberger <
>> mark.washenber...@markwash.net> wrote:
>>
>>> I would prefer to pick and choose which parts of oslo common db code to
>>> reuse in glance. Most parts there look great and very useful. However, some
>>> parts seem like they would conflict with several goals we have.
>>>
>>> 1) To improve code sanity, we need to break away from the idea of having
>>> one giant db api interface
>>> 2) We need to improve our position with respect to new, non SQL drivers
>>> - mostly, we need to focus first on removing business logic
>>> (especially authz) from database driver code
>>> - we also need to break away from the strict functional interface,
>>> because it limits our ability to express query filters and tends to lump
>>> all filter handling for a given function into a single code block (which
>>> ends up being defect-rich and confusing as hell to reimplement)
>>> 3) It is unfortunate, but I must admit that Glance's code in general is
>>> pretty heavily coupled to the database code and in particular the schema.
>>> Basically the only tool we have to manage that problem until we can fix it
>>> is to try to be as careful as possible about how we change the db code and
>>> schema. By importing another project, we lose some of that control. Also,
>>> even with the copy-paste model for oslo incubator, code in oslo does have
>>> some of its own reasons to change, so we could potentially end up in a
>>> conflict where glance db migrations (which are operationally costly) have
>>> to happen for reasons that don't really matter to glance.
>>>
>>> So rather than framing this as "glance needs to use oslo common db
>>> code", I would appreciate framing it as "glance database code should have
>>> features X, Y, and Z, some of which it can get by using oslo code." Indeed,
>>> I believe in IRC we discussed the idea of writing up a wiki listing these
>>> feature improvements, which would allow a finer granularity for evaluation.
>>> I really prefer that format because it feels more like planning and less
>>> like debate :-)
>>>
>>>  I have a few responses inline below.
>>>
>>> On Fri, Aug 16, 2013 at 6:31 AM, Victor Sergeyev wrote:
>>>
>>>> Hello All.
>>>>
>>>> Glance cores (Mark Washenberger, Flavio Percoco, Iccha Sethi) have some
>>>> questions about Oslo DB code, and why i

Re: [openstack-dev] [Glance] Replacing Glance DB code to Oslo DB code.

2013-08-19 Thread John Bresnahan
> All I'm saying is that we should be careful not to swap one set of
> problems for another. 

My 2 cents: I am in agreement with Jay.  I am leery of NoSQL being a
direct sub in and I fear that this effort can be adding a large workload
for little benefit.

A somewhat related post:
http://www.joelonsoftware.com/articles/fog69.html



Re: [openstack-dev] [Glance] Replacing Glance DB code to Oslo DB code.

2013-08-19 Thread Mark Washenberger
Thanks for refocusing the discussion on your original questions!

Also thanks for this additional summary. I consider the patches you have up
for review in glance to have a general direction-level green light at this
point (though I've got a question on the specifics in the ultimate review).

But for a variety of reasons, I do not consider the general thrust of "use
oslo db code" to be approved. Instead, let's continue to consider features
from oslo db on a case by case basis, and see what the right resolution is
in each case.

Thanks for your patience and forbearance, hopefully getting in the patches
you have submitted now will help unblock progress for your team.

On Mon, Aug 19, 2013 at 3:49 AM, Boris Pavlovic  wrote:

> Mark,
>
> Main part of oslo is:
> 1) common migration testing
> 2) common sqla.models
> 3) common hacks around sqla and sqla-migrate
> 4) common work around engines and sessions
>
>
> All these points are implemented in Glance almost in the same way as in
> Oslo.
> Also we are able to use only part of this code in Glance, and add some
> other things that are glance related over this code.
>
> Our current 2 patches on review do next things:
> 1) Copy paste oslo.db code into glance
> 2) Use sqla session/engine/exception wrappers
> 3) Remove Glance code that covers session/engine/exception
>
> So I really don't see any bad thing in this code:
> 1) If you would like to implement other backends => this change won't
> block it
> 2) If you would like to make some other sqla utilities or glance related
> things => this change won't block it
> 3) If there are bugs => fix it in oslo and sync => this change won't block
> it
>
>  So I really don't see any reason to block work around migration to
> oslo.db code in Glance.
>
>
> Best regards,
> Boris Pavlovic
> ---
> Mirantis Inc.
>
>
>
>
> On Fri, Aug 16, 2013 at 10:41 PM, Mark Washenberger <
> mark.washenber...@markwash.net> wrote:
>
>> I would prefer to pick and choose which parts of oslo common db code to
>> reuse in glance. Most parts there look great and very useful. However, some
>> parts seem like they would conflict with several goals we have.
>>
>> 1) To improve code sanity, we need to break away from the idea of having
>> one giant db api interface
>> 2) We need to improve our position with respect to new, non SQL drivers
>> - mostly, we need to focus first on removing business logic
>> (especially authz) from database driver code
>> - we also need to break away from the strict functional interface,
>> because it limits our ability to express query filters and tends to lump
>> all filter handling for a given function into a single code block (which
>> ends up being defect-rich and confusing as hell to reimplement)
>> 3) It is unfortunate, but I must admit that Glance's code in general is
>> pretty heavily coupled to the database code and in particular the schema.
>> Basically the only tool we have to manage that problem until we can fix it
>> is to try to be as careful as possible about how we change the db code and
>> schema. By importing another project, we lose some of that control. Also,
>> even with the copy-paste model for oslo incubator, code in oslo does have
>> some of its own reasons to change, so we could potentially end up in a
>> conflict where glance db migrations (which are operationally costly) have
>> to happen for reasons that don't really matter to glance.
>>
>> So rather than framing this as "glance needs to use oslo common db code",
>> I would appreciate framing it as "glance database code should have features
>> X, Y, and Z, some of which it can get by using oslo code." Indeed, I
>> believe in IRC we discussed the idea of writing up a wiki listing these
>> feature improvements, which would allow a finer granularity for evaluation.
>> I really prefer that format because it feels more like planning and less
>> like debate :-)
>>
>>  I have a few responses inline below.
>>
>> On Fri, Aug 16, 2013 at 6:31 AM, Victor Sergeyev 
>> wrote:
>>
>>> Hello All.
>>>
>>> Glance cores (Mark Washenberger, Flavio Percoco, Iccha Sethi) have some
>>> questions about Oslo DB code, and why is it so important to use it instead
>>> of custom implementation and so on. As there were a lot of questions it was
>>> really hard to answer all these questions in IRC. So we decided that
>>> the mailing list is a better place for such things.
>>>
>>> List of main questions:
>>>
>>> 1. What includes oslo DB code?
>>> 2. Why is it safe to replace custom implementation by Oslo DB code?
>>> 3. Why oslo DB code is better than custom implementation?
>>> 4. Why oslo DB code won’t slow up project development progress?
>>> 5. What we are going actually to do in Glance?
>>> 6. What is the current status?
>>>
>>> Answers:
>>>
>>> 1. What includes oslo DB code?
>>>
>>> Currently Oslo code improves different aspects around DB:
>>> -- Work with SQLAlchemy models, engine and session
>>> -- Lot of tools for work with SQLAlchemy
>>> -- Work with unique keys

Re: [openstack-dev] [Glance] Replacing Glance DB code to Oslo DB code.

2013-08-19 Thread Joshua Harlow
+1 for trying things differently :)

On 8/19/13 12:14 AM, "Jay Pipes"  wrote:

>On 08/18/2013 10:33 PM, Joe Gordon wrote:
>> An alternative I think would be better would be to scrap the use of
>> the SQLAlchemy ORM; keep using the DB engine abstraction support.
>>
>> +1, I am hoping this will provide noticeable performance benefits while
>> being agnostic of what DB back-end is being used.  With the way we use
>> SQLAlchemy being 25x slower than MySQL we have lots of room for
>> improvement (see http://paste.openstack.org/show/44143/ from
>> https://bugs.launchpad.net/nova/+bug/1212418).
>
>@require_admin_context
>def compute_node_get_all(context):
>    return model_query(context, models.ComputeNode).\
>        options(joinedload('service')).\
>        options(joinedload('stats')).\
>        all()
>
>Well, yeah... I suppose if you are attempting to create 115K objects in
>memory in Python (Need to collate each ComputeNode model object and each
>of its relation objects for Service and Stats) you are going to run into
>some performance problems. :)
>
>Would be interesting to see what the performance difference would be if
>you instead had dicts instead of model objects and did something like
>this instead (code not tested, just off top of head...):
>
># Assume a method to_dict() that takes a Model
># and returns a dict with appropriate empty dicts for
># relationship fields.
>
>qr = session.query(ComputeNode).join(Service).join(Stats)
>
>results = {}
>
>for record in qr:
>    node_id = record.ComputeNode.id
>    service_id = record.Service.id
>    stat_id = record.ComputeNodeStat.id
>    if node_id not in results.keys():
>        results[node_id] = to_dict(record.ComputeNode)
>    if service_id not in results[node_id]['services'].keys():
>        results[node_id]['services'][service_id] = to_dict(record.Service)
>    if stat_id not in results[node_id]['stats'].keys():
>        results[node_id]['stats'][stat_id] = to_dict(record.ComputeNodeStat)
>
>return results
>
>Whether it would be any faster than SQLAlchemy's joinedload...
>
>Besides that, though, probably is a good idea to look at even the
>existence of DB calls that potentially do that kind of massive query
>returning as A Bad Thing...
>
>Best,
>-jay
>


___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [Glance] Replacing Glance DB code to Oslo DB code.

2013-08-19 Thread Boris Pavlovic
Mark,

Main part of oslo is:
1) common migration testing
2) common sqla.models
3) common hacks around sqla and sqla-migrate
4) common work around engines and sessions


All these points are implemented in Glance almost in the same way as in
Oslo.
Also we are able to use only part of this code in Glance, and add some
other things that are glance related over this code.

Our current 2 patches on review do next things:
1) Copy paste oslo.db code into glance
2) Use sqla session/engine/exception wrappers
3) Remove Glance code that covers session/engine/exception

So I really don't see any bad thing in this code:
1) If you would like to implement other backends => this change won't block
it
2) If you would like to make some other sqla utilities or glance related
things => this change won't block it
3) If there are bugs => fix it in oslo and sync => this change won't block
it

 So I really don't see any reason to block work around migration to oslo.db
code in Glance.


Best regards,
Boris Pavlovic
---
Mirantis Inc.




On Fri, Aug 16, 2013 at 10:41 PM, Mark Washenberger <
mark.washenber...@markwash.net> wrote:

> I would prefer to pick and choose which parts of oslo common db code to
> reuse in glance. Most parts there look great and very useful. However, some
> parts seem like they would conflict with several goals we have.
>
> 1) To improve code sanity, we need to break away from the idea of having
> one giant db api interface
> 2) We need to improve our position with respect to new, non SQL drivers
> - mostly, we need to focus first on removing business logic
> (especially authz) from database driver code
> - we also need to break away from the strict functional interface,
> because it limits our ability to express query filters and tends to lump
> all filter handling for a given function into a single code block (which
> ends up being defect-rich and confusing as hell to reimplement)
> 3) It is unfortunate, but I must admit that Glance's code in general is
> pretty heavily coupled to the database code and in particular the schema.
> Basically the only tool we have to manage that problem until we can fix it
> is to try to be as careful as possible about how we change the db code and
> schema. By importing another project, we lose some of that control. Also,
> even with the copy-paste model for oslo incubator, code in oslo does have
> some of its own reasons to change, so we could potentially end up in a
> conflict where glance db migrations (which are operationally costly) have
> to happen for reasons that don't really matter to glance.
>
> So rather than framing this as "glance needs to use oslo common db code",
> I would appreciate framing it as "glance database code should have features
> X, Y, and Z, some of which it can get by using oslo code." Indeed, I
> believe in IRC we discussed the idea of writing up a wiki listing these
> feature improvements, which would allow a finer granularity for evaluation.
> I really prefer that format because it feels more like planning and less
> like debate :-)
>
>  I have a few responses inline below.
>
> On Fri, Aug 16, 2013 at 6:31 AM, Victor Sergeyev 
> wrote:
>
>> Hello All.
>>
>> Glance cores (Mark Washenberger, Flavio Percoco, Iccha Sethi) have some
>> questions about Oslo DB code, and why is it so important to use it instead
>> of custom implementation and so on. As there were a lot of questions it was
>> really hard to answer all these questions in IRC. So we decided that
>> the mailing list is a better place for such things.
>>
>> List of main questions:
>>
>> 1. What includes oslo DB code?
>> 2. Why is it safe to replace custom implementation by Oslo DB code?
>> 3. Why oslo DB code is better than custom implementation?
>> 4. Why oslo DB code won’t slow up project development progress?
>> 5. What we are going actually to do in Glance?
>> 6. What is the current status?
>>
>> Answers:
>>
>> 1. What includes oslo DB code?
>>
>> Currently Oslo code improves different aspects around DB:
>> -- Work with SQLAlchemy models, engine and session
>> -- Lot of tools for work with SQLAlchemy
>> -- Work with unique keys
>> -- Base test case for work with database
>> -- Test migrations against different backends
>> -- Sync DB Models with actual schemas in DB (add test that they are
>> equivalent)
>>
>>
>> 2. Why is it safe to replace custom implementation by Oslo DB code?
>>
>> Oslo module, as base openstack module, takes care about code quality.
>> Usually, common code is more readable (most of flake8 checks enabled in Oslo)
>> and has better test coverage.  Also it was tested in different use-cases
>> (in production also) in other projects so bugs in Oslo code were already
>> fixed. So we can be sure that we use high-quality code.
>>
>
> Alas, while testing and static style analysis are important, they are not
> the only relevant aspects of code quality. Architectural choices are also
> relevant. The best reusable code places few requirements on the code that
> reuses

Re: [openstack-dev] [Glance] Replacing Glance DB code to Oslo DB code.

2013-08-19 Thread Flavio Percoco

On 19/08/13 02:51 -0400, Jay Pipes wrote:
> On 08/19/2013 12:56 AM, Joshua Harlow wrote:
>> Another good article from an ex-coworker that keeps on making more and
>> more sense the more projects I get into...
>>
>> http://seldo.com/weblog/2011/08/11/orm_is_an_antipattern
>>
>> Your mileage/opinion though may vary :)
>
> 2) I highly caution folks who think a No-SQL store is a good storage
> solution for any of the data currently used by Nova, Glance
> (registry), Cinder (registry), Ceilometer, and Quantum. All of the
> data stored and manipulated in those projects is HIGHLY relational
> data, and not objects/documents. Switching to use a KVS for highly
> relational data is a terrible decision. You will just end up
> implementing joins in your code...

+1

FWIW, I'm a huge fan of NoSQL technologies but I couldn't agree more
here.

FF

--
@flaper87
Flavio Percoco



Re: [openstack-dev] [Glance] Replacing Glance DB code to Oslo DB code.

2013-08-19 Thread Flavio Percoco

On 18/08/13 18:47 -0400, Jay Pipes wrote:

On 08/18/2013 06:28 PM, Joe Gordon wrote:


On Aug 18, 2013 3:58 PM, "Jay Pipes" <jaypi...@gmail.com> wrote:
>
> On 08/18/2013 03:53 AM, Joshua Harlow wrote:
>>
>> I always just liked SQL as the database abstraction layer ;)
>>
>> On a more serious note I think Nova's new object model might be a way
>> to go but in all honesty there won't be a one size fits all solution. I
>> just don't think sqlalchemy is that solution personally (maybe if we
>> just use sqlalchemy core it will be better and eject just the orm layer).
>
>
> What is specifically wrong with SQLAlchemy's ORM layer? What would
> you replace it with? Why would using SQLAlchemy's "core" be better?
>
> I've seen little evidence that SQLAlchemy's ORM layer is the cause
> for database performance problems. Rather, I've found that the database
> schemas in use -- and in some cases, the *way* that the SQLAlchemy ORM
> is called (for example, doing correlated subqueries instead of straight
> joins) -- are primary causes for database performance issues.

From what I have seen the issue is both the queries and the ORM layer.
See https://bugs.launchpad.net/nova/+bug/1212418  for details.


Good point.

For the record, I'm not a fan of lazy/eager loading of relations in 
the models themselves, but instead always being explicit about the 
exact data you wish to query for.


It's similar in nature to the SQL best practice of never doing SELECT
* FROM  and instead always being explicit about the columns
you wish to retrieve...




+1

I've seen a couple of cases where this is not being taken under
consideration. I'd like to see some of the lazy loaded relations being
explicitly loaded. 


I think a good rule for this is:

"If you know you'll need it, then load it. If you don't know it, then
you're *probably* doing something wrong."
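
As a rough illustration of the "if you know you'll need it, then load it" rule, here is a minimal sketch using the standard-library sqlite3 module rather than SQLAlchemy. The images and image_members tables, their columns, and all function names are hypothetical and are not Glance's actual schema or API.

```python
import sqlite3


def make_demo_db():
    """Build a throwaway in-memory database (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE images (id INTEGER PRIMARY KEY, name TEXT, location TEXT)"
    )
    conn.execute("CREATE TABLE image_members (image_id INTEGER, tenant TEXT)")
    conn.executemany(
        "INSERT INTO images VALUES (?, ?, ?)",
        [(1, "cirros", "file:///a"), (2, "fedora", "file:///b")],
    )
    conn.executemany(
        "INSERT INTO image_members VALUES (?, ?)",
        [(1, "alice"), (1, "bob")],
    )
    return conn


def list_image_names(conn):
    """The caller only needs id and name, so only those columns are selected."""
    rows = conn.execute("SELECT id, name FROM images ORDER BY id")
    return [{"id": r[0], "name": r[1]} for r in rows]


def get_image_with_members(conn, image_id):
    """The caller knows it needs the members, so one explicit join loads them."""
    rows = conn.execute(
        "SELECT i.id, i.name, m.tenant "
        "FROM images i LEFT JOIN image_members m ON m.image_id = i.id "
        "WHERE i.id = ? ORDER BY m.tenant",
        (image_id,),
    ).fetchall()
    if not rows:
        return None
    return {
        "id": rows[0][0],
        "name": rows[0][1],
        # A LEFT JOIN yields a NULL tenant for an image without members.
        "members": [r[2] for r in rows if r[2] is not None],
    }
```

Each function states in its query exactly what it loads, so nothing is fetched later as a side effect of attribute access.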

Cheers,
FF

--
@flaper87
Flavio Percoco



Re: [openstack-dev] [Glance] Replacing Glance DB code to Oslo DB code.

2013-08-19 Thread Julien Danjou
On Mon, Aug 19 2013, Jay Pipes wrote:

> 2) I highly caution folks who think a No-SQL store is a good storage
> solution for any of the data currently used by Nova, Glance (registry),
> Cinder (registry), Ceilometer, and Quantum. All of the data stored and
> manipulated in those projects is HIGHLY relational data, and not
> objects/documents. Switching to use a KVS for highly relational data is a
> terrible decision. You will just end up implementing joins in your code...

+1000.

-- 
Julien Danjou
# Free Software hacker # freelance consultant
# http://julien.danjou.info




Re: [openstack-dev] [Glance] Replacing Glance DB code to Oslo DB code.

2013-08-19 Thread Jay Pipes
OK, cool. I'm in agreement with your explained storage/logic separation 
below.


Cheers,
-jay

On 08/19/2013 03:12 AM, Robert Collins wrote:

On 19 August 2013 18:35, Jay Pipes  wrote:


http://www.codinghorror.com/blog/2006/06/object-relational-mapping-is-the-vietnam-of-computer-science.html

There is no proper use of an ORM.



I'm not a super-fan of ORMs, Robert. I'm not sure why you're insisting on
taking me down this road...


Sorry, not sure how we ended up here ;)


All I'm saying is that we should be careful not to swap one set of problems
for another. I say this because I've seen the Nova data-access code develop
from its very earliest days, up to this point. I've seen the horrors of
trying to mask an object approach on top of a non-relational data store,
witnessed numerous attempts to rewrite the way that connection pooling and
session handling is done, and in general just noticed the tension between
the two engineering factions that want to keep things agnostic towards
backend storage and at the same time make the backend storage perform and
scale adequately.


Ah! Ok, completely agree: playing flip-flop on problem sets would be a
poor outcome.


I'm not sure why you are being so aggressive about this topic. I certainly
am not being aggressive about my responses -- just cautioning that the
existing codebase has seen its fair share of refactoring, some of which has
been a failure and had to be reverted. I would hate to jump into a frenzy to
radically change the way that the data access code works in Nova without a
good discussion.


I didn't intend to be aggressive - sorry - super sorry in fact. I've
been burnt by months of effort turning around problem codebases where
the ORM was a significant cause of the problems.



But then I guarantee somebody is gonna spend a bunch of time writing an
object-oriented API to the model objects because the ORM is very useful
for
the data modification part of the DB interaction.



!cite - seriously...



? I give an example below... a cautionary tale if you will, about one
possible consequence of "getting rid of the ORM".


I think what I really meant here is 'you say months, but if we're
writing an object-orientated API surely we'd just use one of the
mapping techniques available in SQLAlchemy..'


This strawman is one way that it might be written. Given that a
growing set of our projects have non-SQL backends, this doesn't look
like the obvious way to phrase it to me.



I'm using the SQLAlchemy Core API above, with none of the SQLAlchemy ORM
code... which is (I thought), what you were proposing we do? How is that a
strawman argument? :(


So what is in my head is that we have two layers:
business logic
storage logic

And the thing I don't like about the ORM approach is that our business
logic objects are storage logic objects - even though we don't use
http://martinfowler.com/eaaCatalog/domainModel.html we can easily
trigger late evaluation when traversing collections. In particular
because we have large numbers of developers who are likely going to
not be holding the entire problem domain in their head; the churn that
results on code and design tends to throw things out again and again
over time. And we have IMO too much business logic in the
db/sqlalchemy/api.py files scattered around.

So, what I'd like to see is something where the storage layer and
logic layer are more thoroughly decoupled: only return plain ol Python
objects from the DB layer; but within that layer I wouldn't object to
an ORM being used; secondly I'd like to make sure we don't end up
making business decisions in the storage layer, because that makes it
harder when porting to a different storage layer - such as the nova
conductor is.

So the business logic layer for adding a fixed IP would be something like:
i = business.Instance.find(blah=blah)
ip = business.FixedIP(blah=blah)
i.fixed_ips.append(ip)
storage.save(i)

i and ip would be plain ol python objects
storage.save would have the same semantics as an RPC call - it could
do a transaction itself, but there's no holding transactions between
calls to save.

This is very close to:


instead of this:

i = Instance(blah=blah)
ip = FixedIp(blah=blah)
i.fixed_ips.append(ip)
session.add(i)
session.commit()


But there is no ORM exposed to the developers working with the storage
API - it's contained.


And so you've thrown the baby out with the bathwater and made more work
for
everyone.



Perhaps; or perhaps we've avoided a raft of death-by-thousand-cuts
bugs across the project.



Could just as easily introduce the same bugs by radically redesigning the
data access code without first considering all sides of the problem domain.


Totally!

Again, sorry for the tone before, I can only claim a) been burnt in
the past and b) a week or so of reduced sleep thanks to baby :(.

-Rob





Re: [openstack-dev] [Glance] Replacing Glance DB code to Oslo DB code.

2013-08-19 Thread Jay Pipes

On 08/18/2013 10:33 PM, Joe Gordon wrote:

An alternative I think would be better would be to scrap the use of
the SQLAlchemy ORM; keep using the DB engine abstraction support.

+1, I am hoping this will provide noticeable performance benefits while
being agnostic of what DB back-end is being used.  With the way we use
SQLAlchemy being 25x slower than MySQL we have lots of room for
improvement (see http://paste.openstack.org/show/44143/ from
https://bugs.launchpad.net/nova/+bug/1212418).


@require_admin_context
def compute_node_get_all(context):
    return model_query(context, models.ComputeNode).\
        options(joinedload('service')).\
        options(joinedload('stats')).\
        all()

Well, yeah... I suppose if you are attempting to create 115K objects in 
memory in Python (Need to collate each ComputeNode model object and each 
of its relation objects for Service and Stats) you are going to run into 
some performance problems. :)


Would be interesting to see what the performance difference would be if 
you instead had dicts instead of model objects and did something like 
this instead (code not tested, just off top of head...):


# Assume a method to_dict() that takes a Model
# and returns a dict with appropriate empty dicts for
# relationship fields.

qr = session.query(ComputeNode).join(Service).join(Stats)

results = {}

for record in qr:
    node_id = record.ComputeNode.id
    service_id = record.Service.id
    stat_id = record.ComputeNodeStat.id
    if node_id not in results.keys():
        results[node_id] = to_dict(record.ComputeNode)
    if service_id not in results[node_id]['services'].keys():
        results[node_id]['services'][service_id] = to_dict(record.Service)
    if stat_id not in results[node_id]['stats'].keys():
        results[node_id]['stats'][stat_id] = to_dict(record.ComputeNodeStat)

return results

Whether it would be any faster than SQLAlchemy's joinedload...

Besides that, though, probably is a good idea to look at even the 
existence of DB calls that potentially do that kind of massive query 
returning as A Bad Thing...
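
For anyone who wants to try the dict-collation idea without a Nova checkout, a self-contained sketch follows. It uses the standard-library sqlite3 module instead of SQLAlchemy. The three tables, their columns, and the function names are assumptions made up for illustration and are not Nova's real schema. It says nothing about relative performance; it only shows how flat join rows fold into one nested dict per node.

```python
import sqlite3

# Hypothetical schema: one service per node, many stats per node.
JOIN_SQL = """
SELECT n.id, n.host, s.id, s.binary_name, st.id, st.stat_key, st.stat_value
FROM compute_nodes n
JOIN services s ON s.id = n.service_id
JOIN compute_node_stats st ON st.compute_id = n.id
ORDER BY n.id, st.id
"""


def make_demo_db():
    """Create a small in-memory database to run the join against."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE services (id INTEGER PRIMARY KEY, binary_name TEXT)")
    conn.execute(
        "CREATE TABLE compute_nodes "
        "(id INTEGER PRIMARY KEY, host TEXT, service_id INTEGER)"
    )
    conn.execute(
        "CREATE TABLE compute_node_stats "
        "(id INTEGER PRIMARY KEY, compute_id INTEGER, stat_key TEXT, stat_value TEXT)"
    )
    conn.execute("INSERT INTO services VALUES (1, 'nova-compute')")
    conn.executemany(
        "INSERT INTO compute_nodes VALUES (?, ?, ?)",
        [(1, "host-a", 1), (2, "host-b", 1)],
    )
    conn.executemany(
        "INSERT INTO compute_node_stats VALUES (?, ?, ?, ?)",
        [(1, 1, "vcpus", "4"), (2, 1, "ram", "8192"), (3, 2, "vcpus", "8")],
    )
    return conn


def collate(rows):
    """Fold flat (node, service, stat) join rows into one nested dict per node."""
    results = {}
    for node_id, host, service_id, binary_name, stat_id, key, value in rows:
        node = results.setdefault(
            node_id, {"id": node_id, "host": host, "services": {}, "stats": {}}
        )
        # setdefault keeps the first copy, so repeated join rows add nothing.
        node["services"].setdefault(service_id, {"binary": binary_name})
        node["stats"].setdefault(stat_id, {"key": key, "value": value})
    return results
```

Calling collate(make_demo_db().execute(JOIN_SQL)) gives a dict keyed by node id, with no model objects created along the way.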


Best,
-jay



Re: [openstack-dev] [Glance] Replacing Glance DB code to Oslo DB code.

2013-08-19 Thread Robert Collins
On 19 August 2013 18:35, Jay Pipes  wrote:

>> http://www.codinghorror.com/blog/2006/06/object-relational-mapping-is-the-vietnam-of-computer-science.html
>>
>> There is no proper use of an ORM.
>
>
> I'm not a super-fan of ORMs, Robert. I'm not sure why you're insisting on
> taking me down this road...

Sorry, not sure how we ended up here ;)

> All I'm saying is that we should be careful not to swap one set of problems
> for another. I say this because I've seen the Nova data-access code develop
> from its very earliest days, up to this point. I've seen the horrors of
> trying to mask an object approach on top of a non-relational data store,
> witnessed numerous attempts to rewrite the way that connection pooling and
> session handling is done, and in general just noticed the tension between
> the two engineering factions that want to keep things agnostic towards
> backend storage and at the same time make the backend storage perform and
> scale adequately.

Ah! Ok, completely agree: playing flip-flop on problem sets would be a
poor outcome.

> I'm not sure why you are being so aggressive about this topic. I certainly
> am not being aggressive about my responses -- just cautioning that the
> existing codebase has seen its fair share of refactoring, some of which has
> been a failure and had to be reverted. I would hate to jump into a frenzy to
> radically change the way that the data access code works in Nova without a
> good discussion.

I didn't intend to be aggressive - sorry - super sorry in fact. I've
been burnt by months of effort turning around problem codebases where
the ORM was a significant cause of the problems.


>>> But then I guarantee somebody is gonna spend a bunch of time writing an
>>> object-oriented API to the model objects because the ORM is very useful
>>> for
>>> the data modification part of the DB interaction.
>>
>>
>> !cite - seriously...
>
>
> ? I give an example below... a cautionary tale if you will, about one
> possible consequence of "getting rid of the ORM".

I think what I really meant here is 'you say months, but if we're
writing an object-orientated API surely we'd just use one of the
mapping techniques available in SQLAlchemy..'

>> This strawman is one way that it might be written. Given that a
>> growing set of our projects have non-SQL backends, this doesn't look
>> like the obvious way to phrase it to me.
>
>
> I'm using the SQLAlchemy Core API above, with none of the SQLAlchemy ORM
> code... which is (I thought), what you were proposing we do? How is that a
> strawman argument? :(

So what is in my head is that we have two layers:
business logic
storage logic

And the thing I don't like about the ORM approach is that our business
logic objects are storage logic objects - even though we don't use
http://martinfowler.com/eaaCatalog/domainModel.html we can easily
trigger late evaluation when traversing collections. In particular
because we have large numbers of developers who are likely going to
not be holding the entire problem domain in their head; the churn that
results on code and design tends to throw things out again and again
over time. And we have IMO too much business logic in the
db/sqlalchemy/api.py files scattered around.

So, what I'd like to see is something where the storage layer and
logic layer are more thoroughly decoupled: only return plain ol Python
objects from the DB layer; but within that layer I wouldn't object to
an ORM being used; secondly I'd like to make sure we don't end up
making business decisions in the storage layer, because that makes it
harder when porting to a different storage layer - such as the nova
conductor is.

So the business logic layer for adding a fixed IP would be something like:
i = business.Instance.find(blah=blah)
ip = business.FixedIP(blah=blah)
i.fixed_ips.append(ip)
storage.save(i)

i and ip would be plain ol python objects
storage.save would have the same semantics as an RPC call - it could
do a transaction itself, but there's no holding transactions between
calls to save.

This is very close to:

>>> instead of this:
>>>
>>> i = Instance(blah=blah)
>>> ip = FixedIp(blah=blah)
>>> i.fixed_ips.append(ip)
>>> session.add(i)
>>> session.commit()

But there is no ORM exposed to the developers working with the storage
API - it's contained.
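
A minimal sketch of that separation, under stated assumptions: Instance and FixedIP are plain dataclasses, and InMemoryStorage is a hypothetical stand-in for the storage layer (a real one could use an ORM internally). None of these names come from Nova. For brevity the lookup lives on the storage object instead of on business.Instance as in the snippet above.

```python
import copy
from dataclasses import dataclass, field


@dataclass
class FixedIP:
    """Plain business object; knows nothing about how it is stored."""
    address: str


@dataclass
class Instance:
    """Plain business object with an ordinary list of fixed IPs."""
    uuid: str
    fixed_ips: list = field(default_factory=list)


class InMemoryStorage:
    """Hypothetical storage layer: each call is self-contained, like an RPC."""

    def __init__(self):
        self._instances = {}

    def find(self, uuid):
        # Hand back a detached copy, so changes are not persisted implicitly.
        return copy.deepcopy(self._instances[uuid])

    def save(self, instance):
        # One save is one unit of work; no transaction is held between calls.
        self._instances[instance.uuid] = copy.deepcopy(instance)
```

Usage mirrors the fixed IP example: i = storage.find("abc"), then i.fixed_ips.append(FixedIP("10.0.0.2")), then storage.save(i). Until save is called, the stored copy is unchanged, which is the point of returning plain objects.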

>>> And so you've thrown the baby out with the bathwater and made more work
>>> for
>>> everyone.
>>
>>
>> Perhaps; or perhaps we've avoided a raft of death-by-thousand-cuts
>> bugs across the project.
>
>
> Could just as easily introduce the same bugs by radically redesigning the
> data access code without first considering all sides of the problem domain.

Totally!

Again, sorry for the tone before, I can only claim a) been burnt in
the past and b) a week or so of reduced sleep thanks to baby :(.

-Rob

-- 
Robert Collins 
Distinguished Technologist
HP Converged Cloud


Re: [openstack-dev] [Glance] Replacing Glance DB code to Oslo DB code.

2013-08-18 Thread Jay Pipes

On 08/19/2013 12:56 AM, Joshua Harlow wrote:

Another good article from an ex-coworker that keeps on making more and
more sense the more projects I get into...

http://seldo.com/weblog/2011/08/11/orm_is_an_antipattern

Your mileage/opinion though may vary :)


I don't disagree with most of that article. All good points.

However, I will say a couple things:

1) We can still use the SQLAlchemy ORM module -- Query and Session 
object specifically, along with using the SQLAlchemy Model base class 
with no relation() loading at all in the Model classes -- and get good 
performance. We just wouldn't use the ActiveRecord pattern.


2) I highly caution folks who think a No-SQL store is a good storage 
solution for any of the data currently used by Nova, Glance (registry), 
Cinder (registry), Ceilometer, and Quantum. All of the data stored and 
manipulated in those projects is HIGHLY relational data, and not 
objects/documents. Switching to use a KVS for highly relational data is 
a terrible decision. You will just end up implementing joins in your code...
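
To make "implementing joins in your code" concrete, here is a small sketch in which two dicts stand in for key-value buckets. The record layout and the function name are made up for illustration. A relational store would do this with a single JOIN; with a KVS the application ends up writing the hash join itself.

```python
def images_with_members(images, members):
    """Hand-written hash join of image records with their member records."""
    # Build side: index the member records by the image they point at.
    tenants_by_image = {}
    for member in members.values():
        tenants_by_image.setdefault(member["image_id"], []).append(member["tenant"])
    # Probe side: attach the matching tenants to every image.
    return {
        image_id: {
            "name": image["name"],
            "members": sorted(tenants_by_image.get(image_id, [])),
        }
        for image_id, image in images.items()
    }
```

Every additional relation (tags, properties, locations) needs another loop like this one, and keeping the buckets consistent with each other is also left to the application.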


Best,
-jay




Re: [openstack-dev] [Glance] Replacing Glance DB code to Oslo DB code.

2013-08-18 Thread Jay Pipes

On 08/18/2013 11:07 PM, Robert Collins wrote:

On 19 August 2013 14:22, Jay Pipes  wrote:


I'm completely with Joshua here - the ORM layer is more often than not
a source of bugs and performance issues.


If used improperly, yep.


http://www.codinghorror.com/blog/2006/06/object-relational-mapping-is-the-vietnam-of-computer-science.html

There is no proper use of an ORM.


I'm not a super-fan of ORMs, Robert. I'm not sure why you're insisting 
on taking me down this road...



We don't use the SQLAlchemy ORM for cross-SQL-DB support - that's a
lower layer. It's the model objects themselves that we use the ORM
for, and we could use SQLAlchemy's lower layers but not the ORM.


Hmmm, not quite... see below.



An alternative I think would be better would be to scrap the use of
the SQLAlchemy ORM; keep using the DB engine abstraction support.


Just keep in mind that the Session and Query objects and their related APIs
are in the SQLAlchemy ORM, not the SQLAlchemy Core.


Ok, so either it's not a bright line, or we'd need to have an
alternative thing - not just a reimplementation either, cause that's
pointless.


All I'm saying is that we should be careful not to swap one set of 
problems for another. I say this because I've seen the Nova data-access 
code develop from its very earliest days, up to this point. I've seen 
the horrors of trying to mask an object approach on top of a 
non-relational data store, witnessed numerous attempts to rewrite the 
way that connection pooling and session handling is done, and in general 
just noticed the tension between the two engineering factions that want 
to keep things agnostic towards backend storage and at the same time 
make the backend storage perform and scale adequately.


I'm not sure why you are being so aggressive about this topic. I 
certainly am not being aggressive about my responses -- just cautioning 
that the existing codebase has seen its fair share of refactoring, some 
of which has been a failure and had to be reverted. I would hate to jump 
into a frenzy to radically change the way that the data access code 
works in Nova without a good discussion.



But sure, ok.

But then I guarantee somebody is gonna spend a bunch of time writing an
object-oriented API to the model objects because the ORM is very useful for
the data modification part of the DB interaction.


!cite - seriously...


? I give an example below... a cautionary tale if you will, about one 
possible consequence of "getting rid of the ORM".



Because people will complain about having to do this:

conn = engine.connect()
# instances is the sqlalchemy Table object for instances
inst_ins = instances.insert().values(blah=blah)
ip_ins = fixed_ips.insert().values(blah=blah)
conn.execute(ip_ins)
conn.execute(inst_ins)
conn.close()


This strawman is one way that it might be written. Given that a
growing set of our projects have non-SQL backends, this doesn't look
like the obvious way to phrase it to me.


I'm using the SQLAlchemy Core API above, with none of the SQLAlchemy ORM 
code... which is (I thought), what you were proposing we do? How is that 
a strawman argument? :(



instead of this:

i = Instance(blah=blah)
ip = FixedIp(blah=blah)
i.fixed_ips.append(ip)
session.add(i)
session.commit()

And so you've thrown the baby out with the bathwater and made more work for
everyone.


Perhaps; or perhaps we've avoided a raft of death-by-thousand-cuts
bugs across the project.


Could just as easily introduce the same bugs by radically redesigning 
the data access code without first considering all sides of the problem 
domain.


-jay




Re: [openstack-dev] [Glance] Replacing Glance DB code to Oslo DB code.

2013-08-18 Thread Joshua Harlow
Another good article from an ex-coworker that keeps on making more and more 
sense the more projects I get into...

http://seldo.com/weblog/2011/08/11/orm_is_an_antipattern

Your mileage/opinion though may vary :)

Sent from my really tiny device...

On Aug 18, 2013, at 8:12 PM, "Robert Collins" 
mailto:robe...@robertcollins.net>> wrote:

On 19 August 2013 14:22, Jay Pipes <jaypi...@gmail.com> wrote:

I'm completely with Joshua here - the ORM layer is more often than not
a source of bugs and performance issues.


If used improperly, yep.

http://www.codinghorror.com/blog/2006/06/object-relational-mapping-is-the-vietnam-of-computer-science.html

There is no proper use of an ORM.


We don't use the SQLAlchemy ORM for cross-SQL-DB support - thats a
lower layer. It's the model objects themselves that we use the ORM
for, and we could use SQLAlchemy's lower layers but not the ORM.


Hmmm, not quite... see below.



An alternative I think would be better would be to scrap the use of
the SQLAlchemy ORM; keep using the DB engine abstraction support.


Just keep in mind that the Session and Query objects and their related APIs
are in the SQLAlchemy ORM, not the SQLAlchemy Core.

Ok, so either it's not a bright line, or we'd need to have an
alternative thing - not just a reimplementation either, cause that's
pointless.

But sure, ok.

But then I guarantee somebody is gonna spend a bunch of time writing an
object-oriented API to the model objects because the ORM is very useful for
the data modification part of the DB interaction.

!cite - seriously...

Because people will complain about having to do this:

conn = engine.connect()
# instances is the sqlalchemy Table object for instances
inst_ins = instances.insert().values(blah=blah)
ip_ins = fixed_ips.insert().values(blah=blah)
conn.execute(ip_ins)
conn.execute(inst_ins)
conn.close()

This strawman is one way that it might be written. Given that a
growing set of our projects have non-SQL backends, this doesn't look
like the obvious way to phrase it to me.

instead of this:

i = Instance(blah=blah)
ip = FixedIp(blah=blah)
i.fixed_ips.append(ip)
session.add(i)
session.commit()

And so you've thrown the baby out with the bathwater and made more work for
everyone.

Perhaps; or perhaps we've avoided a raft of death-by-thousand-cuts
bugs across the project.

-Rob

--
Robert Collins <rbtcoll...@hp.com>
Distinguished Technologist
HP Converged Cloud

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [Glance] Replacing Glance DB code to Oslo DB code.

2013-08-18 Thread Joshua Harlow
It will be interesting to see how it works out in Nova; correct me if I am 
wrong, but Nova has even more onion layers than other OpenStack projects.

For ex:

Nova compute <->unified object model <->rpc<->conductor<->sqlalchemy ORM 
model<->SQL<->your db

Is nova moving away from the ORM model or is the above somewhat right?

My opinion is still out on how this will all work out there and what its impact 
will be in the end. I just hope the onion layers are worth it in the end.

Sent from my really tiny device...

On Aug 18, 2013, at 7:37 PM, "Joe Gordon" <joe.gord...@gmail.com> wrote:

On Sun, Aug 18, 2013 at 10:22 PM, Jay Pipes <jaypi...@gmail.com> wrote:
On 08/18/2013 09:56 PM, Robert Collins wrote:
On 19 August 2013 10:43, Jay Pipes <jaypi...@gmail.com> wrote:
On 08/18/2013 06:08 PM, Joshua Harlow wrote:

In my opinion (and just an opinion that I know everyone doesn't share) ORM
layers are bulky, restrictive and overly complicate and confuse the reader
of the code (code is read more often than written) and require another layer
of understanding (a layer is useful if it adds good value, I am not sure
sqlalchemy ORM layer does add said value).


The usefulness of SQLAlchemy in this case is its ability to abstract away
the different database backends used in both development and production
environments (SQLite, MySQL, and PostgreSQL typically, though I'm sure folks
are running on other backends). The usefulness of the ORM over raw SQL is,
of course, the ability for the ORM to provide a singular interface for the
different SQL dialects that those underlying backends support.

Thats not the ORM layer. The SQL dialect layer is a layer below the
ORM : the ORM is the layer that provides sql <-> model translation,
including the descriptors that make assignment and dereferencing
trigger SQL.

OK, fair enough.


I'm completely with Joshua here - the ORM layer is more often than not
a source of bugs and performance issues.

If used improperly, yep.


If everyone was using PostgreSQL or everyone was using MySQL, there'd be
less of a point to using an ORM like SQLAlchemy's. Instead, you'd use a
simple db abstraction class like what's in Swift (which only uses SQLite).
But, one of OpenStack's design principles is to be as agnostic as possible
about underlying deployment things like database or MQ infrastructure, and
one of the ramifications of that is abstraction layers...

We don't use the SQLAlchemy ORM for cross-SQL-DB support - thats a
lower layer. It's the model objects themselves that we use the ORM
for, and we could use SQLAlchemy's lower layers but not the ORM.

Hmmm, not quite... see below.


My point to Mark W was not that I preferred a procedural approach to an
object-oriented one. My point was that I would hope that the direction was
not to swap out the procedural abstraction DB API for an object-oriented
one; instead, we should scrap the entire abstraction DB API entirely...and
just use SQLAlchemy.

An alternative I think would be better would be to scrap the use of
the SQLAlchemy ORM; keep using the DB engine abstraction support.

+1, I am hoping this will provide noticeable performance benefits while being 
agnostic of what DB back-end is being used.  With the way we use SQLAlchemy 
being 25x slower than MySQL we have lots of room for improvement (see 
http://paste.openstack.org/show/44143/ from 
https://bugs.launchpad.net/nova/+bug/1212418).



Just keep in mind that the Session and Query objects and their related APIs are 
in the SQLAlchemy ORM, not the SQLAlchemy Core.

But sure, ok.

But then I guarantee somebody is gonna spend a bunch of time writing an 
object-oriented API to the model objects because the ORM is very useful for the 
data modification part of the DB interaction.

Because people will complain about having to do this:

conn = engine.connect()
# instances is the sqlalchemy Table object for instances
inst_ins = instances.insert().values(blah=blah)
ip_ins = fixed_ips.insert().values(blah=blah)
conn.execute(ip_ins)
conn.execute(inst_ins)
conn.close()

instead of this:

i = Instance(blah=blah)
ip = FixedIp(blah=blah)
i.fixed_ips.append(ip)
session.add(i)
session.commit()

And so you've thrown the baby out with the bathwater and made more work for 
everyone.

Nova is already moving in the direction of using 
https://blueprints.launchpad.net/nova/+spec/unified-object-model 
https://wiki.openstack.org/wiki/ObjectProposal which is currently built on top 
of the procedural nova.db.api



-jay



___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [Glance] Replacing Glance DB code to Oslo DB code.

2013-08-18 Thread Robert Collins
On 19 August 2013 14:22, Jay Pipes  wrote:

>> I'm completely with Joshua here - the ORM layer is more often than not
>> a source of bugs and performance issues.
>
>
> If used improperly, yep.

http://www.codinghorror.com/blog/2006/06/object-relational-mapping-is-the-vietnam-of-computer-science.html

There is no proper use of an ORM.


>> We don't use the SQLAlchemy ORM for cross-SQL-DB support - thats a
>> lower layer. It's the model objects themselves that we use the ORM
>> for, and we could use SQLAlchemy's lower layers but not the ORM.
>
>
> Hmmm, not quite... see below.
>
>

>> An alternative I think would be better would be to scrap the use of
>> the SQLAlchemy ORM; keep using the DB engine abstraction support.
>
>
> Just keep in mind that the Session and Query objects and their related APIs
> are in the SQLAlchemy ORM, not the SQLAlchemy Core.

Ok, so either it's not a bright line, or we'd need to have an
alternative thing - not just a reimplementation either, cause that's
pointless.

> But sure, ok.
>
> But then I guarantee somebody is gonna spend a bunch of time writing an
> object-oriented API to the model objects because the ORM is very useful for
> the data modification part of the DB interaction.

!cite - seriously...

> Because people will complain about having to do this:
>
> conn = engine.connect()
> # instances is the sqlalchemy Table object for instances
> inst_ins = instances.insert().values(blah=blah)
> ip_ins = fixed_ips.insert().values(blah=blah)
> conn.execute(ip_ins)
> conn.execute(inst_ins)
> conn.close()

This strawman is one way that it might be written. Given that a
growing set of our projects have non-SQL backends, this doesn't look
like the obvious way to phrase it to me.

> instead of this:
>
> i = Instance(blah=blah)
> ip = FixedIp(blah=blah)
> i.fixed_ips.append(ip)
> session.add(i)
> session.commit()
>
> And so you've thrown the baby out with the bathwater and made more work for
> everyone.

Perhaps; or perhaps we've avoided a raft of death-by-thousand-cuts
bugs across the project.

-Rob

-- 
Robert Collins 
Distinguished Technologist
HP Converged Cloud

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [Glance] Replacing Glance DB code to Oslo DB code.

2013-08-18 Thread Joe Gordon
On Sun, Aug 18, 2013 at 10:22 PM, Jay Pipes  wrote:

> On 08/18/2013 09:56 PM, Robert Collins wrote:
>
>> On 19 August 2013 10:43, Jay Pipes  wrote:
>>
>>> On 08/18/2013 06:08 PM, Joshua Harlow wrote:
>>>

>>>> In my opinion (and just an opinion that I know everyone doesn't share)
>>>> ORM layers are bulky, restrictive and overly complicate and confuse the
>>>> reader of the code (code is read more often than written) and require
>>>> another layer of understanding (a layer is useful if it adds good value,
>>>> I am not sure sqlalchemy ORM layer does add said value).

>>>
>>>
>>> The usefulness of SQLAlchemy in this case is its ability to abstract away
>>> the different database backends used in both development and production
>>> environments (SQLite, MySQL, and PostgreSQL typically, though I'm sure
>>> folks
>>> are running on other backends). The usefulness of the ORM over raw SQL
>>> is,
>>> of course, the ability for the ORM to provide a singular interface for
>>> the
>>> different SQL dialects that those underlying backends support.
>>>
>>
>> Thats not the ORM layer. The SQL dialect layer is a layer below the
>> ORM : the ORM is the layer that provides sql <-> model translation,
>> including the descriptors that make assignment and dereferencing
>> trigger SQL.
>>
>
> OK, fair enough.
>
>
>> I'm completely with Joshua here - the ORM layer is more often than not
>> a source of bugs and performance issues.
>>
>
> If used improperly, yep.
>
>
>>> If everyone was using PostgreSQL or everyone was using MySQL, there'd be
>>> less of a point to using an ORM like SQLAlchemy's. Instead, you'd use a
>>> simple db abstraction class like what's in Swift (which only uses
>>> SQLite).
>>> But, one of OpenStack's design principles is to be as agnostic as
>>> possible
>>> about underlying deployment things like database or MQ infrastructure,
>>> and
>>> one of the ramifications of that is abstraction layers...
>>>
>>
>> We don't use the SQLAlchemy ORM for cross-SQL-DB support - thats a
>> lower layer. It's the model objects themselves that we use the ORM
>> for, and we could use SQLAlchemy's lower layers but not the ORM.
>>
>
> Hmmm, not quite... see below.
>
>
>>> My point to Mark W was not that I preferred a procedural approach to an
>>> object-oriented one. My point was that I would hope that the direction
>>> was
>>> not to swap out the procedural abstraction DB API for an object-oriented
>>> one; instead, we should scrap the entire abstraction DB API
>>> entirely...and
>>> just use SQLAlchemy.
>>>
>>
>> An alternative I think would be better would be to scrap the use of
>> the SQLAlchemy ORM; keep using the DB engine abstraction support.
>>
>
+1, I am hoping this will provide noticeable performance benefits while
being agnostic of what DB back-end is being used.  With the way we use
SQLAlchemy being 25x slower than MySQL we have lots of room for
improvement (see http://paste.openstack.org/show/44143/ from
https://bugs.launchpad.net/nova/+bug/1212418).



>
> Just keep in mind that the Session and Query objects and their related
> APIs are in the SQLAlchemy ORM, not the SQLAlchemy Core.
>
> But sure, ok.
>
> But then I guarantee somebody is gonna spend a bunch of time writing an
> object-oriented API to the model objects because the ORM is very useful for
> the data modification part of the DB interaction.
>
> Because people will complain about having to do this:
>
> conn = engine.connect()
> # instances is the sqlalchemy Table object for instances
> inst_ins = instances.insert().values(blah=blah)
> ip_ins = fixed_ips.insert().values(blah=blah)
> conn.execute(ip_ins)
> conn.execute(inst_ins)
> conn.close()
>
> instead of this:
>
> i = Instance(blah=blah)
> ip = FixedIp(blah=blah)
> i.fixed_ips.append(ip)
> session.add(i)
> session.commit()
>
> And so you've thrown the baby out with the bathwater and made more work
> for everyone.


Nova is already moving in the direction of using
https://blueprints.launchpad.net/nova/+spec/unified-object-model
https://wiki.openstack.org/wiki/ObjectProposal which is currently built on
top of the procedural nova.db.api


>
> -jay
>
>
>
___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [Glance] Replacing Glance DB code to Oslo DB code.

2013-08-18 Thread Jay Pipes

On 08/18/2013 09:56 PM, Robert Collins wrote:

On 19 August 2013 10:43, Jay Pipes  wrote:

On 08/18/2013 06:08 PM, Joshua Harlow wrote:


In my opinion (and just an opinion that I know everyone doesn't share) ORM
layers are bulky, restrictive and overly complicate and confuse the reader
of the code (code is read more often than written) and require another layer
of understanding (a layer is useful if it adds good value, I am not sure
sqlalchemy ORM layer does add said value).



The usefulness of SQLAlchemy in this case is its ability to abstract away
the different database backends used in both development and production
environments (SQLite, MySQL, and PostgreSQL typically, though I'm sure folks
are running on other backends). The usefulness of the ORM over raw SQL is,
of course, the ability for the ORM to provide a singular interface for the
different SQL dialects that those underlying backends support.


Thats not the ORM layer. The SQL dialect layer is a layer below the
ORM : the ORM is the layer that provides sql <-> model translation,
including the descriptors that make assignment and dereferencing
trigger SQL.


OK, fair enough.


I'm completely with Joshua here - the ORM layer is more often than not
a source of bugs and performance issues.


If used improperly, yep.


If everyone was using PostgreSQL or everyone was using MySQL, there'd be
less of a point to using an ORM like SQLAlchemy's. Instead, you'd use a
simple db abstraction class like what's in Swift (which only uses SQLite).
But, one of OpenStack's design principles is to be as agnostic as possible
about underlying deployment things like database or MQ infrastructure, and
one of the ramifications of that is abstraction layers...


We don't use the SQLAlchemy ORM for cross-SQL-DB support - thats a
lower layer. It's the model objects themselves that we use the ORM
for, and we could use SQLAlchemy's lower layers but not the ORM.


Hmmm, not quite... see below.


My point to Mark W was not that I preferred a procedural approach to an
object-oriented one. My point was that I would hope that the direction was
not to swap out the procedural abstraction DB API for an object-oriented
one; instead, we should scrap the entire abstraction DB API entirely...and
just use SQLAlchemy.


An alternative I think would be better would be to scrap the use of
the SQLAlchemy ORM; keep using the DB engine abstraction support.


Just keep in mind that the Session and Query objects and their related 
APIs are in the SQLAlchemy ORM, not the SQLAlchemy Core.


But sure, ok.

But then I guarantee somebody is gonna spend a bunch of time writing an 
object-oriented API to the model objects because the ORM is very useful 
for the data modification part of the DB interaction.


Because people will complain about having to do this:

conn = engine.connect()
# instances is the sqlalchemy Table object for instances
inst_ins = instances.insert().values(blah=blah)
ip_ins = fixed_ips.insert().values(blah=blah)
conn.execute(ip_ins)
conn.execute(inst_ins)
conn.close()

instead of this:

i = Instance(blah=blah)
ip = FixedIp(blah=blah)
i.fixed_ips.append(ip)
session.add(i)
session.commit()

And so you've thrown the baby out with the bathwater and made more work 
for everyone.
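
To make the ORM side of that comparison concrete, here is a minimal runnable sketch. It assumes SQLAlchemy 1.4 or later and an in-memory SQLite database. The Instance and FixedIp classes below are hypothetical, stripped-down models with invented columns, not the real Nova models.

```python
from sqlalchemy import Column, ForeignKey, Integer, String, create_engine
from sqlalchemy.orm import Session, declarative_base, relationship

Base = declarative_base()


class Instance(Base):
    """Hypothetical, simplified model; not the real Nova Instance."""
    __tablename__ = "instances"
    id = Column(Integer, primary_key=True)
    hostname = Column(String(255))
    fixed_ips = relationship("FixedIp", back_populates="instance")


class FixedIp(Base):
    """Hypothetical, simplified model; not the real Nova FixedIp."""
    __tablename__ = "fixed_ips"
    id = Column(Integer, primary_key=True)
    address = Column(String(39))
    instance_id = Column(Integer, ForeignKey("instances.id"))
    instance = relationship("Instance", back_populates="fixed_ips")


engine = create_engine("sqlite://")  # in-memory database
Base.metadata.create_all(engine)

with Session(engine) as session:
    i = Instance(hostname="vm-1")
    ip = FixedIp(address="10.0.0.5")
    i.fixed_ips.append(ip)   # the relationship links the two objects
    session.add(i)           # ip is saved too, via the relationship cascade
    session.commit()         # the ORM orders the INSERTs and fills in the FK

    stored = [(row.address, row.instance_id) for row in session.query(FixedIp)]

print(stored)
```

The foreign key value is never assigned explicitly here; the unit of work fills it in at flush time, which is the convenience the argument above is about.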


-jay


___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [Glance] Replacing Glance DB code to Oslo DB code.

2013-08-18 Thread Jay Pipes

On 08/18/2013 07:44 PM, Joshua Harlow wrote:

Using an ORM how does the ORM know what attributes u might access (forgive me 
if this is a documented sqlalchemy pattern/solution). Doesn't it have to give u 
back the full model since the ORM layer can't predict what u might do with the 
model object?


Depends on how you use SQLAlchemy. If you load a model object and then 
use that model object to get related information, you have little 
control over the fields in the relation that are loaded; you only really 
have control over when the relation is loaded into a model.


However, for the most part, code in Nova/Glance/Cinder that uses 
SQLAlchemy does not use this pattern of finding relations. Instead, code 
generally uses the SQLAlchemy Query interface to explicitly -- more or 
less -- ask for the columns in the underlying models that need to be 
returned or processed.
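
A minimal sketch of that "ask for the columns explicitly" style, assuming SQLAlchemy 1.4 or later and an in-memory SQLite database. The Instance model and its user_data column are hypothetical, chosen only to show a wide column being left out of the query.

```python
from sqlalchemy import Column, Integer, String, Text, create_engine
from sqlalchemy.orm import Session, declarative_base

Base = declarative_base()


class Instance(Base):
    """Hypothetical, simplified model with one deliberately large column."""
    __tablename__ = "instances"
    id = Column(Integer, primary_key=True)
    hostname = Column(String(255))
    user_data = Column(Text)


engine = create_engine("sqlite://")  # in-memory database
Base.metadata.create_all(engine)

with Session(engine) as session:
    session.add(Instance(hostname="vm-1", user_data="x" * 10000))
    session.commit()

    # Name the columns instead of querying the whole Instance entity.
    query = (
        session.query(Instance.id, Instance.hostname)
        .filter(Instance.hostname == "vm-1")
    )
    emitted_sql = str(query)                       # the SELECT to be issued
    rows = [tuple(row) for row in query.all()]     # plain rows, not models

print(emitted_sql)
print(rows)
```

The result rows are tuples of the requested columns rather than model objects, so there is no attribute left over that could trigger a later load.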


Best,
-jay



___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [Glance] Replacing Glance DB code to Oslo DB code.

2013-08-18 Thread Robert Collins
On 19 August 2013 10:43, Jay Pipes  wrote:
> On 08/18/2013 06:08 PM, Joshua Harlow wrote:
>>
>> In my opinion (and just an opinion that I know everyone doesn't share) ORM
>> layers are bulky, restrictive and overly complicate and confuse the reader
>> of the code (code is read more often than written) and require another layer
>> of understanding (a layer is useful if it adds good value, I am not sure
>> sqlalchemy ORM layer does add said value).
>
>
> The usefulness of SQLAlchemy in this case is its ability to abstract away
> the different database backends used in both development and production
> environments (SQLite, MySQL, and PostgreSQL typically, though I'm sure folks
> are running on other backends). The usefulness of the ORM over raw SQL is,
> of course, the ability for the ORM to provide a singular interface for the
> different SQL dialects that those underlying backends support.

Thats not the ORM layer. The SQL dialect layer is a layer below the
ORM : the ORM is the layer that provides sql <-> model translation,
including the descriptors that make assignment and dereferencing
trigger SQL.
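
As an illustration of what "descriptors that make assignment and dereferencing trigger SQL" means, here is a toy sketch using only the standard library. It is not how SQLAlchemy is implemented: the ColumnAttr class is hypothetical, and a real ORM caches loaded values and batches writes into a flush instead of issuing a statement per attribute access.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE instances (id INTEGER PRIMARY KEY, hostname TEXT)")
conn.execute("INSERT INTO instances (id, hostname) VALUES (1, 'vm-1')")

statements = []  # every SQL string the descriptor issues, in order


class ColumnAttr:
    """Toy descriptor: attribute reads and writes become SQL statements."""

    def __set_name__(self, owner, name):
        self.name = name  # column name equals the attribute name

    def __get__(self, obj, objtype=None):
        if obj is None:
            return self
        sql = f"SELECT {self.name} FROM instances WHERE id = ?"
        statements.append(sql)
        return conn.execute(sql, (obj.id,)).fetchone()[0]

    def __set__(self, obj, value):
        sql = f"UPDATE instances SET {self.name} = ? WHERE id = ?"
        statements.append(sql)
        conn.execute(sql, (value, obj.id))


class Instance:
    hostname = ColumnAttr()

    def __init__(self, id):
        self.id = id


inst = Instance(1)
inst.hostname = "vm-renamed"   # plain assignment, but it runs an UPDATE
current = inst.hostname        # plain attribute read, but it runs a SELECT

print(current)
print(statements)
```

Nothing at the call site shows that a database round trip happens, which is the readability concern raised earlier in the thread.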

I'm completely with Joshua here - the ORM layer is more often than not
a source of bugs and performance issues.

> If everyone was using PostgreSQL or everyone was using MySQL, there'd be
> less of a point to using an ORM like SQLAlchemy's. Instead, you'd use a
> simple db abstraction class like what's in Swift (which only uses SQLite).
> But, one of OpenStack's design principles is to be as agnostic as possible
> about underlying deployment things like database or MQ infrastructure, and
> one of the ramifications of that is abstraction layers...

We don't use the SQLAlchemy ORM for cross-SQL-DB support - thats a
lower layer. It's the model objects themselves that we use the ORM
for, and we could use SQLAlchemy's lower layers but not the ORM.
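
A small sketch of the "lower layer" in question, assuming SQLAlchemy 1.4 or later. One Core expression is compiled for three dialects without any ORM class, Session, or database driver involved. The instances table and the ".example.org" suffix are hypothetical.

```python
from sqlalchemy import Column, Integer, MetaData, String, Table, select
from sqlalchemy.dialects import mysql, postgresql, sqlite

metadata = MetaData()

# Hypothetical table, used only to have something to select from.
instances = Table(
    "instances", metadata,
    Column("id", Integer, primary_key=True),
    Column("hostname", String(255)),
)

# One backend-neutral expression: string concatenation on a column.
stmt = select((instances.c.hostname + ".example.org").label("fqdn"))

# Compile the same statement once per dialect; no connection is needed.
rendered = {
    name: str(stmt.compile(dialect=module.dialect()))
    for name, module in (
        ("mysql", mysql),
        ("postgresql", postgresql),
        ("sqlite", sqlite),
    )
}

for name, sql in rendered.items():
    print(name, "->", sql)
```

The MySQL rendering uses the concat() function while the other two use the || operator, and that translation happens entirely in SQLAlchemy Core.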


> My point to Mark W was not that I preferred a procedural approach to an
> object-oriented one. My point was that I would hope that the direction was
> not to swap out the procedural abstraction DB API for an object-oriented
> one; instead, we should scrap the entire abstraction DB API entirely...and
> just use SQLAlchemy.

An alternative I think would be better would be to scrap the use of
the SQLAlchemy ORM; keep using the DB engine abstraction support.

-Rob

-- 
Robert Collins 
Distinguished Technologist
HP Converged Cloud

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [Glance] Replacing Glance DB code to Oslo DB code.

2013-08-18 Thread Joshua Harlow
Using an ORM how does the ORM know what attributes u might access (forgive me 
if this is a documented sqlalchemy pattern/solution). Doesn't it have to give u 
back the full model since the ORM layer can't predict what u might do with the 
model object?

Sent from my really tiny device...

On Aug 18, 2013, at 3:50 PM, "Jay Pipes"  wrote:

> On 08/18/2013 06:28 PM, Joe Gordon wrote:
>> 
>> On Aug 18, 2013 3:58 PM, "Jay Pipes" wrote:
>> >
>> > On 08/18/2013 03:53 AM, Joshua Harlow wrote:
>> >>
>> >> I always just liked SQL as the database abstraction layer ;)
>> >>
>> >> On a more serious note I think novas new object model might be a way
>> to go but in all honesty there won't be a one size fits all solution. I
>> just don't think sqlalchemy is that solution personally (maybe if we
>> just use sqlalchemy core it will be better and eject just the orm layer).
>> >
>> >
>> > What is specifically wrong with SQLAlchemy's ORM layer? What would
>> you replace it with? Why would using SQLAlchemy's "core" be better?
>> >
>> > I've seen little evidence that SQLAlchemy's ORM layer is the cause
>> for database performance problems. Rather, I've found that the database
>> schemas in use -- and in some cases, the *way* that the SQLAlchemy ORM
>> is called (for example, doing correlated subqueries instead of straight
>> joins) -- are primary causes for database performance issues.
>> 
>> From what I have seen the issue is both the queries and the ORM layer.
>> See https://bugs.launchpad.net/nova/+bug/1212418  for details.
> 
> Good point.
> 
> For the record, I'm not a fan of lazy/eager loading of relations in the 
> models themselves, but instead always being explicit about the exact data you 
> wish to query for.
> 
> It's similar in nature to the SQL best practice of never doing SELECT * FROM 
> <table> and instead always being explicit about the columns you wish to 
> retrieve...
> 
> Best,
> -jay
> 
> 

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [Glance] Replacing Glance DB code to Oslo DB code.

2013-08-18 Thread Joshua Harlow
It would be neat to see what would happen if the "raw" models were just used 
directly. Of course, this would have to be done carefully, since I could see it 
spreading db logic all over.

+1 for turning off deferred loads; I think they encourage and actually hide 
bugs when lazy loads occur on demand. I have seen this become an issue for 
taskflow usage, where the results of tasks could not be saved to persistent 
storage because deferred loading affected a feature in another module 
(the 'key in model' check behaves differently than it does with dicts).
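
One way to surface on-demand lazy loads instead of letting them happen silently is sketched below. It assumes SQLAlchemy 1.4 or later (the lazy="raise" relationship option did not exist in the SQLAlchemy releases current when this thread was written) and hypothetical, simplified Instance and FixedIp models.

```python
from sqlalchemy import Column, ForeignKey, Integer, String, create_engine
from sqlalchemy.exc import InvalidRequestError
from sqlalchemy.orm import Session, declarative_base, relationship

Base = declarative_base()


class Instance(Base):
    """Hypothetical, simplified model."""
    __tablename__ = "instances"
    id = Column(Integer, primary_key=True)
    hostname = Column(String(255))
    # lazy="raise" turns an implicit lazy load into an error.
    fixed_ips = relationship("FixedIp", lazy="raise")


class FixedIp(Base):
    """Hypothetical, simplified model."""
    __tablename__ = "fixed_ips"
    id = Column(Integer, primary_key=True)
    address = Column(String(39))
    instance_id = Column(Integer, ForeignKey("instances.id"))


engine = create_engine("sqlite://")  # in-memory database
Base.metadata.create_all(engine)

with Session(engine) as session:
    session.add(Instance(id=1, hostname="vm-1"))
    session.flush()  # make sure the parent row exists first
    session.add(FixedIp(address="10.0.0.5", instance_id=1))
    session.commit()

with Session(engine) as session:
    inst = session.query(Instance).filter(Instance.id == 1).one()
    try:
        inst.fixed_ips            # would normally emit a hidden SELECT
        lazy_load_blocked = False
    except InvalidRequestError:
        lazy_load_blocked = True

    # The related data has to be requested explicitly instead.
    addresses = [
        row.address
        for row in session.query(FixedIp.address)
        .filter(FixedIp.instance_id == inst.id)
    ]

print(lazy_load_blocked, addresses)
```

With this setting, code that relies on a deferred load fails at the point of access rather than behaving differently depending on what happened to be loaded.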

Sent from my really tiny device...

On Aug 18, 2013, at 3:46 PM, "Jay Pipes"  wrote:

> On 08/18/2013 06:08 PM, Joshua Harlow wrote:
>> In my opinion (and just an opinion that I know everyone doesn't share) ORM 
>> layers are bulky, restrictive and overly complicate and confuse the reader 
>> of the code (code is read more often than written) and require another layer 
>> of understanding (a layer is useful if it adds good value, I am not sure 
>> sqlalchemy ORM layer does add said value).
> 
> The usefulness of SQLAlchemy in this case is its ability to abstract away the 
> different database backends used in both development and production 
> environments (SQLite, MySQL, and PostgreSQL typically, though I'm sure folks 
> are running on other backends). The usefulness of the ORM over raw SQL is, of 
> course, the ability for the ORM to provide a singular interface for the 
> different SQL dialects that those underlying backends support.
> 
> If everyone was using PostgreSQL or everyone was using MySQL, there'd be less 
> of a point to using an ORM like SQLAlchemy's. Instead, you'd use a simple db 
> abstraction class like what's in Swift (which only uses SQLite). But, one of 
> OpenStack's design principles is to be as agnostic as possible about 
> underlying deployment things like database or MQ infrastructure, and one of 
> the ramifications of that is abstraction layers...
> 
>> What are the benefits in your mind??
> 
> There are not many benefits to ORMs (IMO) other than the abstraction of 
> underlying storage systems. I think it's unfortunate that -- due to the very 
> early use/support of Redis as the backend storage system for Nova -- that the 
> uber-abstraction DB API (as Mark W points out... a procedural API) exists in 
> Nova, Cinder, and other projects. Other than Keystone and Ceilometer, which 
> need to support non-RDBMS storage backends like LDAP or MongoDB, I would have 
> thought the SQLAlchemy abstraction layer and model would be perfectly fine as 
> *the* DB API in Nova, Glance, Cinder, etc.
> 
> But alas, there's another API on top of the already-abstracted SQLAlchemy DB 
> API...
> 
>> But this may just be me ranting since I don't like layers that seem to add 
>> little benefit, especially when most of the openstack projects put a big 
>> procedural  API.py (or more say in novas case) over the ORM layer anyway.
> 
> Yeah, I don't like the additional API on top of the SQLAlchemy API either -- 
> procedural or not. For Keystone and Ceilometer, I can see the point of it. 
> For the other projects, I don't.
> 
> My point to Mark W was not that I preferred a procedural approach to an 
> object-oriented one. My point was that I would hope that the direction was 
> not to swap out the procedural abstraction DB API for an object-oriented one; 
> instead, we should scrap the entire abstraction DB API entirely...and just 
> use SQLAlchemy.
> 
> Best,
> -jay
> 

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [Glance] Replacing Glance DB code to Oslo DB code.

2013-08-18 Thread Jay Pipes

On 08/18/2013 06:28 PM, Joe Gordon wrote:


On Aug 18, 2013 3:58 PM, "Jay Pipes" <jaypi...@gmail.com> wrote:
 >
 > On 08/18/2013 03:53 AM, Joshua Harlow wrote:
 >>
 >> I always just liked SQL as the database abstraction layer ;)
 >>
 >> On a more serious note I think novas new object model might be a way
to go but in all honesty there won't be a one size fits all solution. I
just don't think sqlalchemy is that solution personally (maybe if we
just use sqlalchemy core it will be better and eject just the orm layer).
 >
 >
 > What is specifically wrong with SQLAlchemy's ORM layer? What would
you replace it with? Why would using SQLAlchemy's "core" be better?
 >
 > I've seen little evidence that SQLAlchemy's ORM layer is the cause
for database performance problems. Rather, I've found that the database
schemas in use -- and in some cases, the *way* that the SQLAlchemy ORM
is called (for example, doing correlated subqueries instead of straight
joins) -- are primary causes for database performance issues.

 From what I have seen the issue is both the queries and the ORM layer.
See https://bugs.launchpad.net/nova/+bug/1212418  for details.


Good point.

For the record, I'm not a fan of lazy/eager loading of relations in the 
models themselves, but instead always being explicit about the exact 
data you wish to query for.


It's similar in nature to the SQL best practice of never doing SELECT * 
FROM <table> and instead always being explicit about the columns you 
wish to retrieve...
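
The SQL-level practice can be shown with nothing but the standard library. This is a sketch against a hypothetical instances table whose user_data column is deliberately large.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE instances ("
    "id INTEGER PRIMARY KEY, hostname TEXT, user_data TEXT)"
)
conn.execute(
    "INSERT INTO instances (hostname, user_data) VALUES (?, ?)",
    ("vm-1", "x" * 10000),
)

# SELECT * drags every column along, including the large one.
everything = conn.execute("SELECT * FROM instances").fetchone()

# Naming the columns returns only what the caller actually needs.
wanted = conn.execute("SELECT id, hostname FROM instances").fetchone()

print(len(everything), wanted)
```

The explicit form also keeps working unchanged if a column is later added to the table.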


Best,
-jay


___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [Glance] Replacing Glance DB code to Oslo DB code.

2013-08-18 Thread Jay Pipes

On 08/18/2013 06:08 PM, Joshua Harlow wrote:

In my opinion (and just an opinion that I know everyone doesn't share) ORM 
layers are bulky, restrictive and overly complicate and confuse the reader of 
the code (code is read more often than written) and require another layer of 
understanding (a layer is useful if it adds good value, I am not sure 
sqlalchemy ORM layer does add said value).


The usefulness of SQLAlchemy in this case is its ability to abstract 
away the different database backends used in both development and 
production environments (SQLite, MySQL, and PostgreSQL typically, though 
I'm sure folks are running on other backends). The usefulness of the ORM 
over raw SQL is, of course, the ability for the ORM to provide a 
singular interface for the different SQL dialects that those underlying 
backends support.


If everyone was using PostgreSQL or everyone was using MySQL, there'd be 
less of a point to using an ORM like SQLAlchemy's. Instead, you'd use a 
simple db abstraction class like what's in Swift (which only uses 
SQLite). But, one of OpenStack's design principles is to be as agnostic 
as possible about underlying deployment things like database or MQ 
infrastructure, and one of the ramifications of that is abstraction 
layers...



What are the benefits in your mind??


There are not many benefits to ORMs (IMO) other than the abstraction of 
underlying storage systems. I think it's unfortunate that -- due to the 
very early use/support of Redis as the backend storage system for Nova 
-- that the uber-abstraction DB API (as Mark W points out... a 
procedural API) exists in Nova, Cinder, and other projects. Other than 
Keystone and Ceilometer, which need to support non-RDBMS storage 
backends like LDAP or MongoDB, I would have thought the SQLAlchemy 
abstraction layer and model would be perfectly fine as *the* DB API in 
Nova, Glance, Cinder, etc.


But alas, there's another API on top of the already-abstracted 
SQLAlchemy DB API...



But this may just be me ranting since I don't like layers that seem to add 
little benefit, especially when most of the openstack projects put a big 
procedural  API.py (or more say in novas case) over the ORM layer anyway.


Yeah, I don't like the additional API on top of the SQLAlchemy API 
either -- procedural or not. For Keystone and Ceilometer, I can see the 
point of it. For the other projects, I don't.


My point to Mark W was not that I preferred a procedural approach to an 
object-oriented one. My point was that I would hope that the direction 
was not to swap out the procedural abstraction DB API for an 
object-oriented one; instead, we should scrap the abstraction DB API
entirely...and just use SQLAlchemy.


Best,
-jay

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [Glance] Replacing Glance DB code to Oslo DB code.

2013-08-18 Thread Joe Gordon
On Aug 18, 2013 3:58 PM, "Jay Pipes"  wrote:
>
> On 08/18/2013 03:53 AM, Joshua Harlow wrote:
>>
>> I always just liked SQL as the database abstraction layer ;)
>>
>> On a more serious note I think novas new object model might be a way to
>> go but in all honesty there won't be a one size fits all solution. I just
>> don't think sqlalchemy is that solution personally (maybe if we just use
>> sqlalchemy core it will be better and eject just the orm layer).
>
>
> What is specifically wrong with SQLAlchemy's ORM layer? What would you
> replace it with? Why would using SQLAlchemy's "core" be better?
>
> I've seen little evidence that SQLAlchemy's ORM layer is the cause for
> database performance problems. Rather, I've found that the database schemas
> in use -- and in some cases, the *way* that the SQLAlchemy ORM is called
> (for example, doing correlated subqueries instead of straight joins) -- are
> primary causes for database performance issues.

From what I have seen, the issue is both the queries and the ORM layer. See
https://bugs.launchpad.net/nova/+bug/1212418 for details.

>
> Note, I'm not speaking about database scalability issues but rather pure
> query performance...
>
> Best,
> -jay
>
>
>> On Aug 16, 2013, at 12:07 PM, "Jay Pipes"  wrote:
>>
>>> On 08/16/2013 02:41 PM, Mark Washenberger wrote:

>>>> I think the issue here for glance is whether or not oslo common code
>>>> makes it easier or harder to make other planned improvements. In
>>>> particular, using openstack.common.db.api will make it harder to
>>>> refactor away from a giant procedural interface for the database
>>>> driver.
>>>
>>>
>>> And towards what? A giant object-oriented interface for the database
>>> driver?
>>>
>>> -jay
>>>


Re: [openstack-dev] [Glance] Replacing Glance DB code to Oslo DB code.

2013-08-18 Thread Joshua Harlow
In my opinion (and it's just an opinion, one I know not everyone shares), ORM
layers are bulky and restrictive. They overcomplicate the code and confuse its
reader (code is read more often than written), and they require another layer
of understanding (a layer is useful if it adds good value; I am not sure the
sqlalchemy ORM layer adds said value).

What are the benefits in your mind??

But this may just be me ranting since I don't like layers that seem to add
little benefit, especially when most of the openstack projects put a big
procedural API.py (or more, say, in Nova's case) over the ORM layer anyway.

Sent from my really tiny device...

On Aug 18, 2013, at 12:58 PM, "Jay Pipes"  wrote:

> On 08/18/2013 03:53 AM, Joshua Harlow wrote:
>> I always just liked SQL as the database abstraction layer ;)
>> 
>> On a more serious note I think novas new object model might be a way to go 
>> but in all honesty there won't be a one size fits all solution. I just don't 
>> think sqlalchemy is that solution personally (maybe if we just use 
>> sqlalchemy core it will be better and eject just the orm layer).
> 
> What is specifically wrong with SQLAlchemy's ORM layer? What would you 
> replace it with? Why would using SQLAlchemy's "core" be better?
> 
> I've seen little evidence that SQLAlchemy's ORM layer is the cause for 
> database performance problems. Rather, I've found that the database schemas 
> in use -- and in some cases, the *way* that the SQLAlchemy ORM is called (for 
> example, doing correlated subqueries instead of straight joins) -- are 
> primary causes for database performance issues.
> 
> Note, I'm not speaking about database scalability issues but rather pure 
> query performance...
> 
> Best,
> -jay
> 
>> On Aug 16, 2013, at 12:07 PM, "Jay Pipes"  wrote:
>> 
>>> On 08/16/2013 02:41 PM, Mark Washenberger wrote:
>>>> I think the issue here for glance is whether or not oslo common code
>>>> makes it easier or harder to make other planned improvements. In
>>>> particular, using openstack.common.db.api will make it harder to
>>>> refactor away from a giant procedural interface for the database driver.
>>> 
>>> And towards what? A giant object-oriented interface for the database driver?
>>> 
>>> -jay
>>> 


Re: [openstack-dev] [Glance] Replacing Glance DB code to Oslo DB code.

2013-08-18 Thread Jay Pipes

On 08/18/2013 03:53 AM, Joshua Harlow wrote:

> I always just liked SQL as the database abstraction layer ;)
>
> On a more serious note I think novas new object model might be a way to go but
> in all honesty there won't be a one size fits all solution. I just don't think
> sqlalchemy is that solution personally (maybe if we just use sqlalchemy core it
> will be better and eject just the orm layer).


What is specifically wrong with SQLAlchemy's ORM layer? What would you
replace it with? Why would using SQLAlchemy's "core" be better?


I've seen little evidence that SQLAlchemy's ORM layer is the cause for 
database performance problems. Rather, I've found that the database 
schemas in use -- and in some cases, the *way* that the SQLAlchemy ORM 
is called (for example, doing correlated subqueries instead of straight 
joins) -- are primary causes for database performance issues.
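
To illustrate the correlated-subquery versus straight-join distinction, here
is a minimal sketch using Python's built-in sqlite3 module. The two-table
schema and its data are hypothetical and are not Nova's or Glance's real
schema. It only shows that the same question can be asked both ways; which
form is faster depends on the backend's query planner and on the schema.

```python
import sqlite3

# Hypothetical schema, for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE images (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE image_properties (
        id INTEGER PRIMARY KEY,
        image_id INTEGER REFERENCES images(id),
        name TEXT,
        value TEXT
    );
    INSERT INTO images VALUES (1, 'cirros'), (2, 'fedora'), (3, 'ubuntu');
    INSERT INTO image_properties (image_id, name, value) VALUES
        (1, 'arch', 'x86_64'), (2, 'arch', 'arm'), (3, 'arch', 'x86_64');
""")

# Correlated subquery: the inner SELECT refers to the current outer row (i.id).
CORRELATED = """
    SELECT i.name FROM images AS i
    WHERE EXISTS (
        SELECT 1 FROM image_properties AS p
        WHERE p.image_id = i.id AND p.name = 'arch' AND p.value = 'x86_64'
    )
    ORDER BY i.name
"""

# Straight join: the same question expressed as a single join.
JOINED = """
    SELECT i.name FROM images AS i
    JOIN image_properties AS p ON p.image_id = i.id
    WHERE p.name = 'arch' AND p.value = 'x86_64'
    ORDER BY i.name
"""

correlated_rows = conn.execute(CORRELATED).fetchall()
joined_rows = conn.execute(JOINED).fetchall()
```

Both queries return the same rows here because each image has at most one
matching property; with several matching properties per image the join would
need a DISTINCT to stay equivalent.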


Note, I'm not speaking about database scalability issues but rather pure 
query performance...


Best,
-jay


> On Aug 16, 2013, at 12:07 PM, "Jay Pipes"  wrote:
>
>> On 08/16/2013 02:41 PM, Mark Washenberger wrote:
>>> I think the issue here for glance is whether or not oslo common code
>>> makes it easier or harder to make other planned improvements. In
>>> particular, using openstack.common.db.api will make it harder to
>>> refactor away from a giant procedural interface for the database driver.
>>
>> And towards what? A giant object-oriented interface for the database driver?
>>
>> -jay



Re: [openstack-dev] [Glance] Replacing Glance DB code to Oslo DB code.

2013-08-18 Thread Joshua Harlow
I always just liked SQL as the database abstraction layer ;) 

On a more serious note, I think Nova's new object model might be a way to go,
but in all honesty there won't be a one-size-fits-all solution. I just don't
think sqlalchemy is that solution personally (maybe it will be better if we use
just sqlalchemy core and eject the orm layer).

Sent from my really tiny device...

On Aug 16, 2013, at 12:07 PM, "Jay Pipes"  wrote:

> On 08/16/2013 02:41 PM, Mark Washenberger wrote:
>> I think the issue here for glance is whether or not oslo common code
>> makes it easier or harder to make other planned improvements. In
>> particular, using openstack.common.db.api will make it harder to
>> refactor away from a giant procedural interface for the database driver.
> 
> And towards what? A giant object-oriented interface for the database driver?
> 
> -jay
> 


Re: [openstack-dev] [Glance] Replacing Glance DB code to Oslo DB code.

2013-08-16 Thread Jay Pipes

On 08/16/2013 02:41 PM, Mark Washenberger wrote:

> I think the issue here for glance is whether or not oslo common code
> makes it easier or harder to make other planned improvements. In
> particular, using openstack.common.db.api will make it harder to
> refactor away from a giant procedural interface for the database driver.


And towards what? A giant object-oriented interface for the database driver?

-jay



Re: [openstack-dev] [Glance] Replacing Glance DB code to Oslo DB code.

2013-08-16 Thread Mark Washenberger
I would prefer to pick and choose which parts of oslo common db code to
reuse in glance. Most parts there look great and very useful. However, some
parts seem like they would conflict with several goals we have.

1) To improve code sanity, we need to break away from the idea of having
one giant db api interface
2) We need to improve our position with respect to new, non-SQL drivers
- mostly, we need to focus first on removing business logic (especially
authz) from database driver code
- we also need to break away from the strict functional interface,
because it limits our ability to express query filters and tends to lump
all filter handling for a given function into a single code block (which
ends up being defect-rich and confusing as hell to reimplement)
3) It is unfortunate, but I must admit that Glance's code in general is
pretty heavily coupled to the database code and in particular the schema.
Basically the only tool we have to manage that problem until we can fix it
is to try to be as careful as possible about how we change the db code and
schema. By importing another project, we lose some of that control. Also,
even with the copy-paste model for oslo incubator, code in oslo does have
some of its own reasons to change, so we could potentially end up in a
conflict where glance db migrations (which are operationally costly) have
to happen for reasons that don't really matter to glance.
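
To make point 2 a bit more concrete, here is a minimal sketch of what moving
filter handling and authz out of the driver could look like. This is not
Glance code: every name in it (Equals, AtLeast, visible_to, find_images) is
hypothetical, and the "driver" is just an in-memory list. The idea is that
each filter is a small object of its own, and authorization is one more
filter composed in by the caller rather than logic buried in the driver.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, Iterable, List

Image = Dict[str, Any]
Filter = Callable[[Image], bool]


@dataclass(frozen=True)
class Equals:
    """Match images whose `field` equals `value`."""
    field: str
    value: Any

    def __call__(self, image: Image) -> bool:
        return image.get(self.field) == self.value


@dataclass(frozen=True)
class AtLeast:
    """Match images whose `field` is present and >= `minimum`."""
    field: str
    minimum: Any

    def __call__(self, image: Image) -> bool:
        value = image.get(self.field)
        return value is not None and value >= self.minimum


def visible_to(tenant: str) -> Filter:
    """Authorization expressed outside the driver, as just another filter."""
    return lambda image: image["is_public"] or image["owner"] == tenant


def find_images(images: Iterable[Image], filters: Iterable[Filter]) -> List[Image]:
    """Stand-in driver: it applies filters and knows no business rules."""
    filters = list(filters)
    return [img for img in images if all(f(img) for f in filters)]


IMAGES = [
    {"name": "cirros", "owner": "alice", "is_public": True, "size": 13},
    {"name": "fedora", "owner": "bob", "is_public": False, "size": 250},
    {"name": "ubuntu", "owner": "alice", "is_public": False, "size": 300},
]

result = find_images(IMAGES, [visible_to("alice"), AtLeast("size", 100)])
names = [img["name"] for img in result]
bobs = [img["name"] for img in find_images(IMAGES, [Equals("owner", "bob")])]
```

A real SQL driver would translate such filter objects into query clauses
instead of evaluating them in Python, but the split of responsibilities would
be the same.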

So rather than framing this as "glance needs to use oslo common db code", I
would appreciate framing it as "glance database code should have features
X, Y, and Z, some of which it can get by using oslo code." Indeed, I
believe in IRC we discussed the idea of writing up a wiki listing these
feature improvements, which would allow a finer granularity for evaluation.
I really prefer that format because it feels more like planning and less
like debate :-)

I have a few responses inline below.

On Fri, Aug 16, 2013 at 6:31 AM, Victor Sergeyev wrote:

> Hello All.
>
> Glance cores (Mark Washenberger, Flavio Percoco, Iccha Sethi) have some
> questions about Oslo DB code: why it is so important to use it instead
> of a custom implementation, and so on. As there were a lot of questions, it
> was really hard to answer all of them in IRC, so we decided that the
> mailing list is a better place for such things.
>
> List of main questions:
>
> 1. What does Oslo DB code include?
> 2. Why is it safe to replace the custom implementation with Oslo DB code?
> 3. Why is Oslo DB code better than a custom implementation?
> 4. Why won't Oslo DB code slow down project development?
> 5. What are we actually going to do in Glance?
> 6. What is the current status?
>
> Answers:
>
> 1. What does Oslo DB code include?
>
> Currently Oslo code improves different aspects around DB:
> -- Work with SQLAlchemy models, engine and session
> -- A lot of tools for working with SQLAlchemy
> -- Work with unique keys
> -- Base test case for work with database
> -- Test migrations against different backends
> -- Sync DB models with actual schemas in DB (add a test that they are
> equivalent)
>
>
> 2. Why is it safe to replace the custom implementation with Oslo DB code?
>
> Oslo, as a base OpenStack module, takes care of code quality.
> Usually, common code is more readable (most flake8 checks are enabled in
> Oslo) and has better test coverage. It has also been tested in different
> use cases (including production) in other projects, so bugs in Oslo code
> have already been fixed. So we can be sure that we use high-quality code.
>

Alas, while testing and static style analysis are important, they are not
the only relevant aspects of code quality. Architectural choices are also
relevant. The best reusable code places few requirements on the code that
reuses it architecturally--in some cases it may make sense to refactor oslo
db code so that glance can reuse the correct parts.


>
>
> 3. Why is Oslo DB code better than a custom implementation?
>
> There are some arguments in favor of Oslo database code:
>
> -- common code collects useful features from different projects
> Various utilities for working with the database, a common test class, a
> module for database migrations, and other features are already in Oslo db
> code. A patch for automatically retrying a db.api query if the db
> connection is lost is on review at the moment. If we use Oslo db code, we
> don't have to care how to port these (and future) features to Glance -
> they will come to all projects automatically when they come to Oslo.
>
> -- unified project work with database
> As was already said, it can help developers work with the database in the
> same way in different projects. It's useful if a developer works with the
> db in a few projects - he uses the same base things and gets no surprises
> from them.
>

I'm not very motivated by this argument. I rarely find novelty that
challenging to understand when working with a project, personally. Usually
I'm much more stumped when code is heavily coupled to other modules or too
many responsibilities are lum

Re: [openstack-dev] [Glance] Replacing Glance DB code to Oslo DB code.

2013-08-16 Thread Flavio Percoco

On 16/08/13 11:42 -0400, Monty Taylor wrote:



> On 08/16/2013 09:31 AM, Victor Sergeyev wrote:
>> Hello All.
>>
>> Glance cores (Mark Washenberger, Flavio Percoco, Iccha Sethi) have some
>> questions about Oslo DB code: why it is so important to use it instead
>> of a custom implementation, and so on. As there were a lot of questions,
>> it was really hard to answer all of them in IRC, so we decided that the
>> mailing list is a better place for such things.
>
> There is another main point, which is that at the last summit we talked
> about various legit database things that need to be done to support CD
> and rolling deploys. The list is not small, and it's a task that's
> important. Needing to implement it in all of the projects separately is
> kind of an issue, whereas if the projects are all using the database the
> same way, then the database team can engineer the same mechanisms for
> doing rolling schema changes, and then operators can have a consistent
> expectation when they're running a cloud.




Just to be clear, AFAIK, the concerns were around how / when to migrate
Glance and not about why we should share database code.



>> List of main questions:
>>
>> 1. What does Oslo DB code include?
>> 2. Why is it safe to replace the custom implementation with Oslo DB code?
>> 3. Why is Oslo DB code better than a custom implementation?
>> 4. Why won't Oslo DB code slow down project development?
>> 5. What are we actually going to do in Glance?
>> 6. What is the current status?
>>
>> Answers:
>>
>> 1. What does Oslo DB code include?
>>
>> Currently Oslo code improves different aspects around DB:
>> -- Work with SQLAlchemy models, engine and session
>> -- A lot of tools for working with SQLAlchemy
>> -- Work with unique keys
>> -- Base test case for work with database
>> -- Test migrations against different backends
>> -- Sync DB models with actual schemas in DB (add a test that they are
>> equivalent)
>>
>>
>> 2. Why is it safe to replace the custom implementation with Oslo DB code?
>>
>> Oslo, as a base OpenStack module, takes care of code quality.
>> Usually, common code is more readable (most flake8 checks are enabled in
>> Oslo) and has better test coverage. It has also been tested in different
>> use cases (including production) in other projects, so bugs in Oslo code
>> have already been fixed. So we can be sure that we use high-quality code.




This is the point I was most worried about - and I still am. The
migration to Oslo's db code started a bit late in Glance and no code
has been merged yet. As for Glance, there still seems to be a lot of
work ahead on this matter.


That being said, thanks a lot for the email and for explaining all
those details.
FF

--
@flaper87
Flavio Percoco



Re: [openstack-dev] [Glance] Replacing Glance DB code to Oslo DB code.

2013-08-16 Thread Eric Windisch
On Fri, Aug 16, 2013 at 9:31 AM, Victor Sergeyev  wrote:
> Hello All.
>
> Glance cores (Mark Washenberger, Flavio Percoco, Iccha Sethi) have some
> questions about Oslo DB code: why it is so important to use it instead
> of a custom implementation, and so on. As there were a lot of questions, it
> was really hard to answer all of them in IRC, so we decided that the
> mailing list is a better place for such things.
>
> List of main questions:
>
> 1. What does Oslo DB code include?
> 2. Why is it safe to replace the custom implementation with Oslo DB code?

Just to head off these two really quick. The database code in Oslo as
initially submitted was actually based largely on that in Glance,
merging in some of the improvements made in Nova. There might have
been some divergence since then, but migrating over shouldn't be
terribly difficult. While it isn't necessary for Glance to switch
over, it would be somewhat ironic if it didn't.

The database code in Oslo primarily keeps base models and various
things we can easily share, reuse, and improve across projects. I
suppose a big part of this is the session management, which has been
moved out of api.py and into its own module, session.py. This
session management code is probably the main thing you'll have to
decide on: whether it is worth bringing in, or whether Glance really
has such unique requirements that it needs to bother with maintaining
this code on its own.

-- 
Regards,
Eric Windisch



Re: [openstack-dev] [Glance] Replacing Glance DB code to Oslo DB code.

2013-08-16 Thread Monty Taylor


On 08/16/2013 09:31 AM, Victor Sergeyev wrote:
> Hello All.
> 
> Glance cores (Mark Washenberger, Flavio Percoco, Iccha Sethi) have some
> questions about Oslo DB code: why it is so important to use it instead
> of a custom implementation, and so on. As there were a lot of questions,
> it was really hard to answer all of them in IRC, so we decided that the
> mailing list is a better place for such things.

There is another main point, which is that at the last summit we talked
about various legit database things that need to be done to support CD
and rolling deploys. The list is not small, and it's a task that's
important. Needing to implement it in all of the projects separately is
kind of an issue, whereas if the projects are all using the database the
same way, then the database team can engineer the same mechanisms for
doing rolling schema changes, and then operators can have a consistent
expectation when they're running a cloud.

> List of main questions:
> 
> 1. What does Oslo DB code include?
> 2. Why is it safe to replace the custom implementation with Oslo DB code?
> 3. Why is Oslo DB code better than a custom implementation?
> 4. Why won't Oslo DB code slow down project development?
> 5. What are we actually going to do in Glance?
> 6. What is the current status?
> 
> Answers:
> 
> 1. What does Oslo DB code include?
> 
> Currently Oslo code improves different aspects around DB:
> -- Work with SQLAlchemy models, engine and session
> -- A lot of tools for working with SQLAlchemy
> -- Work with unique keys
> -- Base test case for work with database
> -- Test migrations against different backends
> -- Sync DB models with actual schemas in DB (add a test that they are
> equivalent)
> 
> 
> 2. Why is it safe to replace the custom implementation with Oslo DB code?
> 
> Oslo, as a base OpenStack module, takes care of code quality.
> Usually, common code is more readable (most flake8 checks are enabled in
> Oslo) and has better test coverage. It has also been tested in different
> use cases (including production) in other projects, so bugs in Oslo code
> have already been fixed. So we can be sure that we use high-quality code.
> 
> 
> 3. Why is Oslo DB code better than a custom implementation?
> 
> There are some arguments in favor of Oslo database code:
> 
> -- common code collects useful features from different projects
> Various utilities for working with the database, a common test class, a
> module for database migrations, and other features are already in Oslo db
> code. A patch for automatically retrying a db.api query if the db
> connection is lost is on review at the moment. If we use Oslo db code, we
> don't have to care how to port these (and future) features to Glance -
> they will come to all projects automatically when they come to Oslo.
> 
> -- unified project work with database
> As was already said, it can help developers work with the database in the
> same way in different projects. It's useful if a developer works with the
> db in a few projects - he uses the same base things and gets no surprises
> from them.
> 
> -- it will reduce the time for running tests.
> Maybe it's a minor feature, but it can also be important. We can remove
> some tests for base `DB` classes (such as session, engines, etc.) and
> replace work with the DB by mock calls.
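
For readers unfamiliar with the automatic-retry idea mentioned in the quoted
text, here is a minimal sketch of the general technique. It is not the Oslo
patch under review; the exception class, decorator name, and parameters are
all hypothetical. A wrapper re-runs a DB API call when the connection is
reported lost, up to a fixed number of retries.

```python
import functools
import time


class DBConnectionError(Exception):
    """Hypothetical stand-in for a driver's 'connection lost' error."""


def retry_on_disconnect(max_retries=3, delay=0.0):
    """Re-run the wrapped call when it raises DBConnectionError."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            attempt = 0
            while True:
                try:
                    return func(*args, **kwargs)
                except DBConnectionError:
                    attempt += 1
                    if attempt > max_retries:
                        raise  # retries exhausted: surface the original error
                    time.sleep(delay)
        return wrapper
    return decorator


calls = {"count": 0}


@retry_on_disconnect(max_retries=3)
def image_get(image_id):
    """Fails twice, then succeeds, to simulate a flaky connection."""
    calls["count"] += 1
    if calls["count"] < 3:
        raise DBConnectionError("connection lost")
    return {"id": image_id}


image = image_get("abc")
```

Retrying blindly is only safe for calls that can be repeated without side
effects, which is one reason such a feature benefits from being written and
reviewed once in shared code.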
> 
> 
> 4. Why won't Oslo DB code slow down project development?
> 
> Oslo code for working with the database is already in such projects as
> Nova, Neutron, Ceilometer and Ironic. AFAIK, development speed in these
> projects has not decreased (please correct me if I'm wrong). Work with
> the database level is already improved and tested in the Oslo project, so
> we can concentrate on project features. All features that have already
> come to Oslo code will be available in Glance, but if you want to add some
> specific feature to the project *just now*, you will be able to do it in
> project code.
> 
> 
> 5. What are we actually going to do in Glance?
> 
> -- Improve test coverage of DB API layer
> We are going to increase test coverage of the glance/db/sqlalchemy/api
> module and fix bugs, if found.
> 
> -- Run DB API tests on all backends
> -- Use Oslo migrations base test case to test migrations against
> different backends
> There are a lot of differences between SQL backends, for example in how
> they handle type casting.
> Currently in SQLite we are able to store anything in a column (of any
> type). MySQL will try to convert the value to the required type, and
> PostgreSQL will raise an IntegrityError.
> If we improve this, we will be sure that all Glance DB migrations run
> correctly on all backends.
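
The SQLite half of that casting claim is easy to see with Python's built-in
sqlite3 module. The sketch below uses a hypothetical one-column table and
demonstrates only SQLite's behaviour; the MySQL and PostgreSQL behaviour
described in the quoted text is not exercised here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table; `size` is declared INTEGER.
conn.execute("CREATE TABLE images (id INTEGER PRIMARY KEY, size INTEGER)")

# SQLite accepts a non-numeric string in an INTEGER column and keeps it as text.
conn.execute("INSERT INTO images (size) VALUES (?)", ("not-a-number",))
# A numeric-looking string is converted to an integer by the column's affinity.
conn.execute("INSERT INTO images (size) VALUES (?)", ("42",))

rows = conn.execute(
    "SELECT size, typeof(size) FROM images ORDER BY id"
).fetchall()
```

A test suite that runs only on SQLite would never notice the first insert,
which is the argument for running the same tests against every supported
backend.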
> 
> -- Use Oslo code for SA models, engine and session
> -- Use Oslo SA utils
> Using common code for work with the database was already discussed and
> approved for all projects. So we are going to use the common code for
> work with the database instead of the Glance implementation.
> 
> -- Fix work with session and transactions
> Our work items in Glance:
> - don't pass session instances to public DB methods
> - use explicit transactions