[ 
https://issues.apache.org/jira/browse/OOZIE-2245?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Kanter updated OOZIE-2245:
---------------------------------
    Attachment: OOZIE-2245.patch

The patch adds a new {{SchemaCheckService}} and {{SchemaCheckXCommand}}.  It 
uses the java {{Connection}} to get metadata on the database that Oozie is 
configured to use.  Because it's not using OpenJPA stuff, we have to make a new 
connection to the database, which means it isn't going to work for Derby.  It 
checks the following:
- The expected tables
- The expected columns, their types, and default values
- The expected indexes and primary keys

To make things easier to maintain in the future when we next change the 
database schema, a lot of the expected results are generated dynamically using 
reflection to inspect the Bean classes.  

There's a few places where different databases behave differently or want 
different things (e.g. MySQL and Oracle want uppercase table names but Postgres 
uses lowercase), so there's some checks for those.  Otherwise I tried to keep 
it as generic as possible.  

There's configs for:
- The interval the checker runs at.  I set it to run once a week.  It's also 
hardcoded to run 1 hour after Oozie startup
- To ignore extra columns, tables, indexes.  I imagine some users, who 
hopefully know what they're doing, may want to add extra indexes or other 
things, so turning this on will make the checker not count those as "bad"

The checker also exposes an instrumentation metrics to report the status of the 
database schema and the last time it was run.  The specific problems it finds 
are logged as ERRORs.  There's also a bunch of DEBUG logging for things that 
are correct.

I've tested this against MySQL, PostgreSQL, and Oracle and it works correctly.  
Though I'm not a database expert, and my Oracle install was kind of hacky, so 
any input or additional testing from [~rohini], [~puru], [~chitnis], 
[~egashira] would be helpful there.  Oracle was also a bit different in that 
instead of a catalog (i.e. the database name which the Connection determined 
from the URL) it used a schema that I think is hardcoded to "OOZIE", but I'm a 
little iffy on that.
I also don't have access to an MS SQLServer, so I disabled that for now.  
[~bowenzhangusa], could you try this against a SQLServer and make any tweaks 
that it needs?  The core code should be correct, but you may have to make minor 
if statements like I did for the others.  We can either do that as a followup 
JIRA or here.

> Service to periodically check database schema
> ---------------------------------------------
>
>                 Key: OOZIE-2245
>                 URL: https://issues.apache.org/jira/browse/OOZIE-2245
>             Project: Oozie
>          Issue Type: New Feature
>          Components: core
>            Reporter: Robert Kanter
>            Assignee: Robert Kanter
>         Attachments: OOZIE-2245.patch
>
>
> We've seen a number of issues related to the database schema being incorrect 
> (more than you would think).  It seems some users go and muck around in the 
> Oozie database, adding/removing columns and indexes, changing the default 
> value of columns, etc.  The issues caused by this can be very difficult to 
> track down because their cause is not obvious and we generally assume the 
> database schema is correct.  For example, we saw an issue where Oozie was 
> taking a long time to create Coordinator actions, and it turned out that the 
> cause was that some indexes were missing, which made the Purge queries slow, 
> which slowed down the whole database whenever the PurgeService ran.  Another 
> example was that the pause time was automatically being set whenever a 
> Coordinator job was submitted, because the default value for the column was 
> incorrect.
> We should create a Service which periodically runs and checks that the schema 
> is correct.  It can output details about what's wrong to the log.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Reply via email to