As an old-timer but a new cloudstack user, it strikes me as a bit odd
that changes to the database are allowed within a minor version change.
This seems to cause a lot more problems than it solves.
It could delay the release of someone's pet enhancement or bug fix but
the idea of not being able to upgrade from 4.5.3 to 4.6.2 is frightening.
The prospect of having upgrade scripts for 4.5.2 to 4.6.0, 4.6.1
and 4.6.2, as well as a separate upgrade from 4.5.3 to 4.6.2 and
similar scripts for 4.5.4, 4.5.5, etc. to 4.6.2, 4.6.3, 4.6.4 and so on,
is unpleasant.
This would have to continue until someone says that 4.5.x is dead and no
upgrade scripts to new 4.6.x releases will be available.
In projects that I have run, a change to the database required a new
major release, so a single conversion will take one from 4.5.x to 4.6.x.
The nice thing about release numbers is that one never runs out!
Ron
On 29/12/2015 10:08 AM, Daan Hoogland wrote:
CCYY == YYYY
On Tue, Dec 29, 2015 at 3:06 PM, Rafael Weingärtner <
rafaelweingart...@gmail.com> wrote:
I also liked the date-format, what did you mean with CCYY?
Where I think we might have a problem is with commits/PRs that end
up creating files with the same name. We would then have to agree upon a way
to resolve those conflicts, such as appending an extra character to indicate
the sequence to be followed, or adding more data such as HH and mm to the
naming convention (YYYY-MM-DD-HH-mm).
I liked the way Wido suggested, we could just remove the “-” from
“YYYY-MM-DD-HH-mm” and use the value as an integer (YYYYMMDDHHmm).
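To make the integer idea concrete, here is a minimal sketch of ordering routines by a YYYYMMDDHHmm value and detecting two PRs that picked the same timestamp. The class name and the file-name pattern V<timestamp>_<description> are assumptions made for illustration, not something agreed in this thread. Note that a 12-digit value such as 201512291006 no longer fits in a 32-bit int (20151229 does), so the sketch uses a long.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Orders upgrade routines named by a YYYYMMDDHHmm timestamp (hypothetical naming scheme). */
final class TimestampedRoutines {

    // Assumed file-name pattern: V<YYYYMMDDHHmm>_<description>.sql or .java
    private static final Pattern NAME = Pattern.compile("V(\\d{12})_.+\\.(sql|java)");

    /** Extracts the timestamp as a long; 12 digits overflow a 32-bit int. */
    static long versionOf(String fileName) {
        Matcher m = NAME.matcher(fileName);
        if (!m.matches()) {
            throw new IllegalArgumentException("Unexpected routine name: " + fileName);
        }
        return Long.parseLong(m.group(1));
    }

    /** Returns the names sorted by timestamp and rejects two routines sharing a timestamp. */
    static List<String> inExecutionOrder(List<String> fileNames) {
        List<String> sorted = new ArrayList<>(fileNames);
        sorted.sort((a, b) -> Long.compare(versionOf(a), versionOf(b)));
        for (int i = 1; i < sorted.size(); i++) {
            if (versionOf(sorted.get(i - 1)) == versionOf(sorted.get(i))) {
                throw new IllegalStateException(
                        "Timestamp collision: " + sorted.get(i - 1) + " and " + sorted.get(i));
            }
        }
        return sorted;
    }
}
```

Failing on a collision, rather than silently picking an order, would leave the decision to the reviewer merging the second PR.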
It seems that we are reaching a consensus. I would love to hear back from
other devs though, especially committers.
BTW: do I have permission to create a page on the wiki so I can add
everything we discuss and agree upon here? This way, we could add that page
to the guidelines for devs creating PRs and committers reviewing and
merging them.
On Tue, Dec 29, 2015 at 12:00 PM, Wido den Hollander <w...@widodh.nl>
wrote:
On 29-12-15 14:46, Daan Hoogland wrote:
Wido, Rafael,
I like the date-format but then of course CCYY-MM-DD. I can still think
of
ways to screw up that (or the plain int;)
20151229 is a valid integer which you can simply use to compare with.
100, 101, 102 or 20151229, 20160103, 20160104, I don't care that much.
My point is that the database version should be separated from the code
base.
Wido
On Tue, Dec 29, 2015 at 1:40 PM, Rafael Weingärtner <
rafaelweingart...@gmail.com> wrote:
Wido, that is true, you are right; the naming on upgrade routines can
use a
numeric value independent of the number of the version. The numeric
value
can be a simple integer that is incremented each routine that is added
or a
time stamp when the routine was added. The point is that we would have
to
link a version to a number. That would enable us to use flywaydb.
To use that approach I think we might need to break compatibility as you
pointed out earlier, but I believe that the benefits of an improved way to
manage upgrade routines will compensate for the break in compatibility.
On Tue, Dec 29, 2015 at 10:25 AM, Wido den Hollander <w...@widodh.nl>
wrote:
On 29-12-15 13:21, Rafael Weingärtner wrote:
I got your point Daan.
Well, what if we linked a version of ACS with a time stamp in the format
DD.MM.YYYY?
In that case you could also say.
ACS 4.6.0 == db ver X
You don't have to say ver >= X, you can also say ver = X.
We could then use the time stamp in the same format to name upgrade
routines. This way the idea of running all of the routines in between
versions during upgrades could be applied.
Same goes for giving all database changes a simple numeric int which
keeps incrementing each time a change is applied ;)
Wido
On Tue, Dec 29, 2015 at 10:03 AM, Daan Hoogland <
daan.hoogl...@gmail.com
wrote:
Rafael,
On Tue, Dec 29, 2015 at 12:22 PM, Rafael Weingärtner <
rafaelweingart...@gmail.com> wrote:
Thanks, Daan and Wido for your contributions, I will discuss them
as
follows.
Daan, about the idea of per commit upgrades. Do you mean that we
separate
each change in the database that is introduced by PRs/Commits in a
different file (routine upgrade) per ACS version?
So we would have V_480_A.sql (for a PR), V_480_B.sql (for another PR), and
so forth.
If that is the case, we can achieve that using a simple naming convention
as I suggested. Each developer, when she/he needs to change or add
something in the database, creates a separate upgrade routine and gives it
an execution order to be followed by Flywaydb. I think that could help RMs
to track and isolate problems, right?
Yes, with one little caveat. We do not know in what version a
feature/PR
will end up at the time of implementing, so a name containing the
version
would not be ideal.
Hi Wido, now I understand your example.
I understand your worry about upgrade paths, and that is the
point I
want
to discuss and solve. In your example, if we release a 4.6.0 and
later
a
4.5.3. You said that there would be no upgrade path from 4.5.3 to
4.6.0.
Well, today that is what happens. However, if we change the
technology
we
use to upgrade the database (using a tool such as Flywaydb) and if
we
define a standard to create upgrade routines that would not be a
problem.
As I have written in my first email, to go from one version to another we
should be able to run all of the upgrade routines in between them
(including the upgrade routine of the goal version). Therefore, if we
release a version 4.6.0, and then 4.5.3, and someone upgrades to 4.5.3 from
any other version and then wants to upgrade to 4.6.0, that would not be a
problem; it would be a matter of running only the upgrade routine of the
4.6.0 version. We do not need to explicitly create upgrade paths. They
should be implicit in our upgrade conventions.
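A minimal sketch of how such implicit paths could work, assuming the tool keeps a record of which routines have already been applied (migration tools such as Flywaydb keep a history table for this purpose; the class, method and file names below are hypothetical). The routines to run are simply the available ones that are not yet recorded, in their defined order.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

/** Computes which upgrade routines still have to run (illustrative names only). */
final class PendingRoutines {

    /**
     * Returns the routines from orderedAvailable that are not in applied,
     * preserving the execution order of orderedAvailable.
     */
    static List<String> pending(List<String> orderedAvailable, Set<String> applied) {
        List<String> result = new ArrayList<>();
        for (String routine : orderedAvailable) {
            if (!applied.contains(routine)) {
                result.add(routine);
            }
        }
        return result;
    }
}
```

For an installation on 4.5.3 that has applied V_452_A.sql and V_453_A.sql, calling pending with the routines shipped in 4.6.0 would return only V_460_A.sql, with no explicit 4.5.3-to-4.6.0 path defined anywhere.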
About creating versions of the code that rely on some version of the
database: I do not like that much because of the compatibility issues that
might arise. For instance, let’s say version X of ACS depends on version
>= Y of the database. If I upgrade the database to version Y + 1 or Y + 2,
the same ACS version has to keep running nice and shiny. My worry is that
this may bring some complications, such as when removing columns that cease
to be used or improving a data structure.
I normally see the database version and the code base tied in a 1-to-1
mapping. Maybe I am having trouble identifying the benefits of that
change.
Thanks for your time ;)
On Tue, Dec 29, 2015 at 8:15 AM, Wido den Hollander <
w...@widodh.nl
wrote:
On 28-12-15 21:34, Rafael Weingärtner wrote:
Hi Wido, Rohit,
I have just read the feature suggestion.
Wido, I am not trying to complicate things, quite the opposite; I just
illustrated a simple thing that can happen and is happening, and I pointed
out how it can be easily solved.
About .Z releases, more frequent releases and so on, I do not want to mix
topics. Let’s keep this thread strictly about database upgrades.
I do not want to start the release discussion, but what I meant
is
that
we try to find a technical solution to something which might be
solved
easier by just changing the way we release.
4.6.0 is released and afterwards 4.5.3 is released. How does
somebody
upgrade from 4.5.3 to 4.6.0? He can't, since the 4.6.0 code
doesn't
support that path.
So my idea is to split the database version from the code
version.
The code requires database version >= X and during boot it simply
checks
that.
The database migration tool can indeed do the DB migration, it
doesn't
have to be the mgmt server who does the upgrade.
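A minimal sketch of that boot-time check, assuming the database version is a plain integer read from the database at startup. The class name, the constant and the value 102 are hypothetical and only illustrate the ">= X" rule.

```java
/** Boot-time guard: the code declares the minimum schema version it needs (hypothetical names). */
final class SchemaVersionGuard {

    /** Assumed minimum database version required by this code base. */
    static final int REQUIRED_DB_VERSION = 102;

    /** Returns true when the database is new enough for the code. */
    static boolean isCompatible(int actualDbVersion, int requiredDbVersion) {
        return actualDbVersion >= requiredDbVersion;
    }

    /** Fails fast during startup; a separate migration tool is expected to upgrade the schema. */
    static void checkAtBoot(int actualDbVersion) {
        if (!isCompatible(actualDbVersion, REQUIRED_DB_VERSION)) {
            throw new IllegalStateException("Database version " + actualDbVersion
                    + " is older than required version " + REQUIRED_DB_VERSION
                    + "; run the migration tool first");
        }
    }
}
```

With this split the management server never migrates anything itself; it only refuses to start against a schema that is too old.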
Now, about the FS. I agree with Rohit that we should have only one way of
managing database upgrades and creation. I just do not like the idea of
creating a tool that works as a wrapper around frameworks/tools such as
flywaydb. I think that those frameworks already work pretty well as they
are, and I would rather maintain configurations than some wrapper code.
I personally like the way ACS works during upgrades (I just do
not
like
the
code itself and how things are structured), as a system
administrator I
like to change the version in the
“/etc/apt/sources.list.d/cloudstack.list”
and use the "apt-get" "update" and "install" from the command
line. I
do
not see the need to add another tool that is just a wrapper to
the
mix.
If I update the ACS code to 4.7.0, why would I leave the database schema at
an older version? And if we want to version DB schemas and application code
separately while somehow maintaining compatibility between them, that would
bring a whole other level of complexity to the code; I think we should
avoid that.
Flywaydb can be easily integrated with everything we have now; we could
have a maven profile for developers and integrate it in the ACS bootstrap
using its API as a Spring bean. Therefore, we could remove the current
“DatabaseUpgradeChecker”, “DbUpgrade” and other classes that aim to do
that. We could even have flywaydb create the schema the first time ACS
boots and retire the “cloudstack-setup-database” script, or at least make
it less complicated, using it just to configure the database URL and users.
The point is that to use Flywaydb we would have to agree upon a convention
for creating the routines (Java and SQL) that execute upgrades. Moreover,
using a tool such as Flywaydb we do not need to worry about upgrade paths.
As I wrote in the email I used to start this thread, the upgrade has to be
straightforward: to go to a version we have to run all of the upgrade
routines between the current version and the desired one. Our job is to
create upgrade routines that work and name them properly; the job of the
tool is to check the current version, the desired one, and the upgrades
that it needs to run, and to execute everything properly.
Yes, indeed. I just wanted to start the discussion if we
shouldn't
version the database differently from the code.
Additionally, I do not see the need to break compatibility as Rohit
suggested in the FS; in my opinion, everything we have up to today can be
migrated to the new structure I proposed. If we use a tool such as
Flywaydb, I even volunteered for that. The only things we have to discuss
and agree upon are the naming conventions for upgrade routines, where to
put them and the configurations for flywaydb.
Thanks for your contribution and time.
On Mon, Dec 28, 2015 at 2:10 PM, Rohit Yadav <
rohit.ya...@shapeblue.com>
wrote:
Hi Rafael and Wido,
Thanks for starting a conversation in this regard, I could not
pursue
the
Chimp tool due to other $dayjob work though it’s good to see
some
discussion has started again. Hope we’ll solve this in 2016.
In my opinion, we will need to first separate the database init/migration
tooling from the mgmt server (right now the mgmt server does db migrations
when it starts and there is a code/db version mismatch) and secondly make
sure that we’re using the same code/tool to deploy the database (right now,
users use the cloudstack-setup-database python tool while developers use
the maven/java DatabaseCreator activated by the -Ddeploydb flag).
After we’ve addressed these two issues we can look into how we can support
a minor releases workflow (or decide to do something else, like not
supporting .Z releases as Wido mentioned), and see if we can or want to use
any existing migration tool or write a wrapper tool “chimp” that uses
existing tools (some of those are mentioned in the Chimp FS, like flywaydb).
To allow users to go back and forth between db schemas/versions, we’ll
also need some new DB migration conventions/versioning/rules/static-checking,
and guidance on how developers need to write such paths (forward and
reverse).
The best approach I figured at the time was to decide that we’ll use the
previous db upgrade path mechanism until a certain CloudStack version (say
4.8.0) and after that we’ll use the new approach or tooling to
upgrade/downgrade DB schemas (thereby retiring the old DB upgrade path
mess).
Rohit Yadav, Software Architect, ShapeBlue
On 28-Dec-2015, at 9:10 PM, Wido den Hollander <w...@widodh.nl
wrote:
On 28-12-15 16:21, Rafael Weingärtner wrote:
Thanks for your contribution Wido,
I have not seen Rohit’s email; I will take a look at it.
Ok, he has a FS here:
https://cwiki.apache.org/confluence/display/CLOUDSTACK/CloudStack+Chimp
About database schema changes happening only in X.Y, I also agree with you
(that is a convention we all could agree on and, as with the coding and
release procedures, we could have a wiki page for that). However, I think
we still might have scripts in X.Y.Z versions to add data to a table such
as “guest_os_hypervisor”.
Yes, that is true. A bugfix could be an addition to the database, but
we have to prevent it as much as possible.
The issue with managing such scripts is that, if we are on a version such
as 4.7.0 and a new script emerges in version 4.5.3, we would have to decide
whether or not to run it. I would rather not run them, since if they add
something to the code base, those changes should also be applied to master,
and as a consequence they will be available in a future update.
I understand, but this is where our release cycle becomes the problem.
It is because we release X.Y.Z releases that we run into these kinds of
problems. If we as a project simply do not release the .Z releases we would
be fine as well ;)
You can try to complicate things with technical solutions, or we can
release every two / three weeks so that we don't run into these kinds of
situations :)
We might even cut the database version loose from the code version.
The database version is simply 100, 101, 102, 103, 104, 105, and a code
version requires a certain version of the database.
Wido
On Mon, Dec 28, 2015 at 12:50 PM, Wido den Hollander <
w...@widodh.nl>
wrote:
On 28-12-15 14:16, Rafael Weingärtner wrote:
Hi all devs,
First of all, sorry for the long text, but I hope we can start a
discussion here and improve that part of ACS.
A while ago I came across the code that Apache CloudStack (ACS) uses to
upgrade from one version to a newer one, and it did not seem to be a good
way to execute our upgrades. Therefore, I decided to spend some time
searching for alternatives.
I think we all saw that happen once or more :)
I have read some material about versioning of scripts used
to
upgrade
a
database (DB) of a system and went through some frameworks
that
could
help
us.
In the software engineering literature, it is firmly stated that we have
to version DB scripts as we do with the source code of the application,
using the baseline approach. Gladly, we were not that bad on this point; we
already versioned our routines for DB upgrade (.sql and .java). Therefore,
it seemed that we just had not used a practical approach to help us during
DB upgrades.
From my readings and looking at the ACS source code I
raised
the
following
requirement:
• We should be able to write more than one routine to upgrade to a
version; those routines can be written in Java and SQL. We might have more
than one routine to be executed for each version and we should be able to
define an order of execution. Additionally, to go to an upper version, we
have to run all of the routines from smaller versions first, until we
reach the desired version.
We could also add another requirement that is the downgrade
from a
version,
which we currently do not support. With that comes my first
question
for
discussion:
• Do we want/need a method to downgrade from a version to a
previous
one?
I personally do not care. Usually people should create a backup PRIOR to
an upgrade. If that fails they can restore the backup.
I found an explanation for not supporting downgrades, and I
liked
it:
http://flywaydb.org/documentation/faq.html#downgrade
So, what I devised for us:
First the bureaucracy part - our migrations occur basically
in
three
(3)
steps, first we have a "prepare script", then a cleanup
script
and
finally
the migration per se that is written in Java, at least,
that
is
what
we
can
expect when reading the interface
“com.cloud.upgrade.dao.DbUpgrade”.
Additionally, our scripts have the following naming convention:
schema-<currentVersion>to<desiredVersion>, which IMHO may cause some
confusion because at first sight we may think that from the same version we
could have different paths to an upper version, which in practice is not
happening. Instead of <currentVersion>to<version> we could simply use
V_<numberOfVersion>_<sequential>.<fileExtension>, given that we have to
execute all of the V_<version> scripts up to and including the version we
want to upgrade to.
To clarify what I am saying, I will use an example. Let’s
say
we
have
just
installed ACS and ran the cloudstack-setup-database. That
command
will
create a database schema in version 4.0.0. To upgrade that
schema
to
version 4.3.0 (it is just an example, it could be any other
version),
ACS
will use the following mapping:
_upgradeMap.put("4.0.0", new DbUpgrade[] {new Upgrade40to41(),
    new Upgrade410to420(), new Upgrade420to421(), new Upgrade421to430()});
After loading the mapping, ACS will execute the scripts
defined
in
each
one
of the Upgrade path classes and the migration code per se.
Now, let’s say we change the “.sql” script names to the pattern I
mentioned. We would then have the following scripts; these are the scripts
I found that upgrade to versions in the interval 4.0.0 – 4.3.0
(including 4.3.0, since that is the goal version):
- schema-40to410, can be named to: V_410_A.sql
- schema-40to410-cleanup, can be named to: V_410_B.sql
- schema-410to420, can be named to: V_420_A.sql
- schema-410to420-cleanup, can be named to: V_420_B.sql
- schema-420to421, can be named to: V_421_A.sql
- schema-421to430, can be named to: V_430_A.sql
- schema-421to430-cleanup, can be named to: V_430_B.sql
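As a rough sketch of how the renamed routines could be selected and ordered, the following parses names of the form V_<version>_<letter> and returns those newer than the current version, up to and including the target. The class and method names are hypothetical, and the sketch assumes the numeric part compares correctly as an integer (which holds for 410, 420, 421 and 430, but would need care for something like 4.10.0).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Selects and orders routines named V_<version>_<letter>.<ext> (illustrative sketch). */
final class VersionedRoutines {

    private static final Pattern NAME = Pattern.compile("V_(\\d+)_([A-Za-z])\\.(sql|java)");

    private static Matcher match(String fileName) {
        Matcher m = NAME.matcher(fileName);
        if (!m.matches()) {
            throw new IllegalArgumentException("Unexpected routine name: " + fileName);
        }
        return m;
    }

    /** Numeric version part, e.g. 410 for V_410_A.sql. */
    static int versionOf(String fileName) {
        return Integer.parseInt(match(fileName).group(1));
    }

    /** Sequence letter within a version, upper-cased so that 'b' and 'B' sort alike. */
    static char sequenceOf(String fileName) {
        return Character.toUpperCase(match(fileName).group(2).charAt(0));
    }

    /** Routines with currentVersion < version <= targetVersion, by version and then letter. */
    static List<String> toRun(List<String> all, int currentVersion, int targetVersion) {
        List<String> selected = new ArrayList<>();
        for (String name : all) {
            int version = versionOf(name);
            if (version > currentVersion && version <= targetVersion) {
                selected.add(name);
            }
        }
        selected.sort((a, b) -> {
            int byVersion = Integer.compare(versionOf(a), versionOf(b));
            return byVersion != 0 ? byVersion : Character.compare(sequenceOf(a), sequenceOf(b));
        });
        return selected;
    }
}
```

For the list above, toRun(all, 400, 430) would return all seven scripts in the order shown, while toRun(all, 410, 421) would return only the 420 and 421 scripts.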
Additionally, all of the java code would have to follow the same
convention. For instance, we have “com.cloud.upgrade.dao.Upgrade40to41”,
which has some java code to migrate from 4.0.0 to 4.1.0. The idea is to
extract that migration code to a Java class named V_410_C.java, given that
it has to execute the SQL scripts before the java code.
In order to go from a smaller version (4.0.0) to an upper
one
(4.3.0), we
have to run all of the migration routines from intermediate
versions.
That
is what we are already doing, but we do all of that
manually.
Bottom line, I think we could simply use the convention
V_<numberOfVersion>_<sequential>.<fileExtension> to name upgrade routines.
That would make it easier for us to use a framework to help with that
process. Additionally, I believe that we should always assume that to go
from a smaller version to a higher one, we should run all of the scripts
that exist between them. What do you guys think of that?
That seems good to me. But we still have to prevent that we
perform
database changes in a X.Y.Z release since that is branched
off
to a
different branch.
Imho database changes should only happen in X.Y releases.
After the bureaucracy, we can discuss tools. If we use that convention to
name migration (upgrade) routines, we can start thinking about tools to
support our migration process. I found two (2) promising ones: Liquibase
and Flywaydb (both seem to be under the Apache license, but the first one
has an enterprise version?!). After reading the documentation and some
usage examples I found flywaydb easier and simpler to use.
What other tools do you know of that could help us manage the database
upgrade without our needing to code the upgrade path?
After that, I think we should decide if we should create
another
project/component to take care of migrations, or we can
just
add
the
dependency of the tool to a project such as
“cloud-framework-db”
and
start
using it.
The “cloud-framework-db” project seems to have a focus on
other
things
such
as managing transactions and generating SQLs from
annotations
(?!?
That
should be a topic for another discussion). Therefore, I
would
rather
create
a new project that has the specific goal of managing ACS DB
upgrades.
I
would also move all of the routines (SQL and Java) to this
new
project.
This project would be a module of the CloudStack project
and
it
would
execute the upgrade routines at the startup of ACS.
I believe that going from a homemade solution to one that
is
more
consolidated and used by other communities would be the way
to
go.
I can volunteer myself to create a PR with the
aforementioned
changes
and
using flywaydb to manage our upgrades. However, I prefer to
have a
good
discussion with other devs first, before starting coding.
Do you have suggestions or points that should be raised
before
we
start
working on that?
Rohit suggested Chimp earlier this year:
http://mail-archives.apache.org/mod_mbox/cloudstack-dev/201508.mbox/%3c677bd09f-fc75-4888-8dc8-2b7af7439...@shapeblue.com%3E
The thread is called: "[DISCUSS] Let's fix CloudStack
Upgrades
and
DB
migrations with CloudStack Chimp"
Maybe there is something good in there.
--
Rafael Weingärtner
Regards.
--
Rafael Weingärtner
--
Daan
--
Rafael Weingärtner
--
Rafael Weingärtner
--
Ron Wheeler
President
Artifact Software Inc
email: rwhee...@artifact-software.com
skype: ronaldmwheeler
phone: 866-970-2435, ext 102