RE: ask for help on best practices when changing table name from uppercase to lowercase

2021-01-11 Thread Ke Zhu - k...@us.ibm.com
On Tue, 2021-01-05 at 13:09 -0500, Mike Bayer wrote:
>    On Tue, Jan 5, 2021, at 9:50 AM, kz...@us.ibm.com wrote:
> Thanks...                   
>                      
>                     
> This Message Is From an External Sender
>    This message came from outside your organization.
> 
> 
> 
> On Tue, Jan 5, 2021, at 9:50 AM, kz...@us.ibm.com wrote:
> > Thanks mike!
> > 
> > I understand that it should work when querying over SQLAlchemy API.
> > the missing part of my use case I forgot to provide is, it only
> > uses Alembic/SQLAlchemy to manage schema changes, while the actual
> > query may comes from other query engine like a dashboard, a desktop
> > DB client or a notebook where data scientists knows SQL only, the
> > name cases would force user to remember such convention from DB
> > design (unpleasant part when switching RDBMS).
> 
> if these other folks are writing raw SQL then they just use
> identifiers without quotes, with any casing they want in their SQL
> string, and the names are case insensitive.  so there should be no
> need for them to remember any convention since case insensitive means
> there is no convention in the first place.   they can write all
> uppercase names in their PostgreSQL query if they'd like:
> 
> psql (12.4) 
> Type "help" for help.
> 
> test=> create table foo(colx integer, ColY integer);
> CREATE TABLE
> test=> SELECT COLX, COLY FROM FOO;
> colx | coly
> --+--
> (0 rows)
> 
> 
> if they are writing SQLAlchemy Python code, then they would use all
> lower case names at all times.
> 
> on the other hand, if the names were created *with quotes*, now the
> names are case sensitive, and users *must* use quotes, uppercase is
> not enough:
> 
> test=> create table "FOO" ("COLX" integer, "COLY" integer); 
> CREATE TABLE
> test=> select colx from FOO;
> ERROR:  relation "foo" does not exist
> 
> test=> select colx from "FOO";
> ERROR:  column "colx" does not exist
> 
> 
> test=> select "COLX" from "FOO";
> COLX
> --
> (0 rows)
> 
> 
> 
> 
> 
> > 
> > A simple question is, if it already created tables via Alembic
> > migration with uppercase table names, 
> 
> "uppercase table names" is not specific enough, were they quoted as
> UPPERCASE?   which database?

Postgres.

I created the model like this:

employee_key = Column('EMPLOYEE_KEY', Integer, primary_key=True)

So based on the doc:
https://docs.sqlalchemy.org/en/14/core/metadata.html?highlight=column#sqlalchemy.schema.Column

A test shows it requires quoted column name in query via SQL directly.
> 
> 
> > 
> > 
> > 
> > 
> > is there any API in Alembic simply support the statement like
> > `ALTER TABLE XXX_YYY_ZZZ RENAME TO xxx_yyy_zzz`? Since I noticed
> > that the revision autogenerate is not sensitive to table name.
> > should I manually create a revision for postgres only?
> 
> if you have a case sensitive name in Alembic and want to migrate to
> case insensitive, on PostrgreSQL you likely could emit
> op.rename_table("XXX_YYY_ZZZ", "xxx_yyy_zzz") and it will just work,
> the first identifier will be quoted and the second one will not.  if
> you want to make it fully explicit, you can do this:
> 
> from sqlalchemy.sql import quoted_name
> op.rename_table(quoted_name("XXX_YYY_ZZZ", quote=True),
> quoted_name("xxx_yyy_zzz", quote=False))

Thank you! This is what I need!

I guess it can rename column like this:

op.alter_column('xxx_yyy_zzz', 'EMPLOYEE_KEY',
new_column_name='employee_key')
> 
> 
> 
> > 
> > On Monday, January 4, 2021 at 6:57:33 PM UTC-5  
> > mik...@zzzcomputing.com wrote:
> > > 
> > > This is the casing convention of the database and SQLAlchemy does
> > > not consider the name to be in "uppercase" or "lowercase" (or
> > > even "mixed case") but rather "case insensitive", which really
> > > means the name is referred towards without any quoting.    When
> > > no quoting is applied, there is essentially no "casing" at all to
> > > the identifier that is stored by the database.
> > > 
> > > When you refer to the table name in SQLAlchemy, such as in a
> > > Table() object, or in Alembic as in op.create_table(), refer to
> > > it as all lower case at all times in the Python code- this
> > > indicates to SQLAlchemy that this is a case insensitive name and
> > > should not be quoted (as long as it has no special characters,
> > > spaces, etc).  SQLAlchemy will automatically render the name
> > > without quotes in DDL, DML and DQL, and render it as UPPERCASE
> > > when it queries DB2's catalogs, and as lower case when it queries
> > > PostgreSQL's catalogs.
> > > 
> > > 
> > > On Mon, Jan 4, 2021, at 4:12 PM, Ke Zhu - kz...@us.ibm.com wrote:
> > > > The use case is migrating data from a Db2 instance (where
> > > > schema is not
> > > > managed under Alembic migration) to a postgresql instance
> > > > (where I plan
> > > > to manage schema via Alembic).
> > > > 
> > > > In

ask for help on best practices when changing table name from uppercase to lowercase

2021-01-04 Thread Ke Zhu - k...@us.ibm.com
The use case is migrating data from a Db2 instance (where schema is not
managed under Alembic migration) to a postgresql instance (where I plan
to manage schema via Alembic).

In Db2, by default, it stores unquoted mixed case identifiers in
Uppercase. While the postgres I got it stores unquoted mixed case
indetifiers in lowercase. So If I plan to keep such convention in
postgres (store unquoted mixed case identifiers like table name in
lowercase), what's the suggested practice to migrate an existing table
like "XXX_YYY_ZZZ" to "xxx_yyy_zzz"?

-- 
You received this message because you are subscribed to the Google Groups 
"sqlalchemy-alembic" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to sqlalchemy-alembic+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/sqlalchemy-alembic/5d4bf2d1f07e6958f94ee08d941db96c96fd91f0.camel%40us.ibm.com.


Re: KeyError: 'sqlnotapplicable' when running alembic revision --autogenerate

2020-09-04 Thread Ke Zhu - k...@us.ibm.com
Mike,

Thanks for replying!

This is the CREATE TABLE statement when running command `alembic upgrade +1 
—sql`:

CREATE TABLE "GITHUB_ACTIVE_USERS" (
"EMAIL" VARCHAR(255),
"GITHUB_LOGIN" VARCHAR(255),
"DT" VARCHAR(10),
"DT_MONTH" VARCHAR(7)
);

I’ve noticed that no constraints are created at all.

Will try to file a bug report in ibm_db_sa code repo.


On Sep 4, 2020, at 9:22 AM, Mike Bayer 
mailto:mike...@zzzcomputing.com>> wrote:

this warning is the culprit:

SAWarning: index key 'sqlnotapplicable' was not located in columns for table 
‘github_active_users'

I believe you should report this to the ibm_db_sa folks. 
https://github.com/ibmdb/python-ibmdbsa/issues<https://github.com/ibmdb/python-ibmdbsa/issues>

you would need to share the "CREATE TABLE" statements which produce this 
problem.



On Thu, Sep 3, 2020, at 11:54 PM, Ke Zhu - 
k...@us.ibm.com<mailto:k...@us.ibm.com> wrote:
I added new model, then run

$ pipenv run alembic revision --autogenerate -m "Add new table 
TRAVIS_ACTIVE_USERS" --depends-on=

Then I got this:

INFO  [alembic.runtime.migration] Context impl Db2Impl.
INFO  [alembic.runtime.migration] Will assume non-transactional DDL.
/root/.local/share/virtualenvs/butane-c47Aapyx/lib/python3.7/site-packages/sqlalchemy/engine/reflection.py:943:
 SAWarning: index key 'sqlnotapplicable' was not located in columns for table 
‘github_active_users'
  "columns for table '%s'" % (flavor, c, table_name)
Traceback (most recent call last):
  File "/root/.local/share/virtualenvs/butane-c47Aapyx/bin/alembic", line 8, in 

sys.exit(main())
  File 
"/root/.local/share/virtualenvs/butane-c47Aapyx/lib/python3.7/site-packages/alembic/config.py",
 line 577, in main
CommandLine(prog=prog).main(argv=argv)
  File 
"/root/.local/share/virtualenvs/butane-c47Aapyx/lib/python3.7/site-packages/alembic/config.py",
 line 571, in main
self.run_cmd(cfg, options)
  File 
"/root/.local/share/virtualenvs/butane-c47Aapyx/lib/python3.7/site-packages/alembic/config.py",
 line 551, in run_cmd
**dict((k, getattr(options, k, None)) for k in kwarg)
  File 
"/root/.local/share/virtualenvs/butane-c47Aapyx/lib/python3.7/site-packages/alembic/command.py",
 line 214, in revision
script_directory.run_env()
  File 
"/root/.local/share/virtualenvs/butane-c47Aapyx/lib/python3.7/site-packages/alembic/script/base.py",
 line 489, in run_env
util.load_python_file(self.dir, "env.py")
  File 
"/root/.local/share/virtualenvs/butane-c47Aapyx/lib/python3.7/site-packages/alembic/util/pyfiles.py",
 line 98, in load_python_file
module = load_module_py(module_id, path)
  File 
"/root/.local/share/virtualenvs/butane-c47Aapyx/lib/python3.7/site-packages/alembic/util/compat.py",
 line 184, in load_module_py
spec.loader.exec_module(module)
  File "", line 728, in exec_module
  File "", line 219, in _call_with_frames_removed
  File "butane/warehouse/env.py", line 91, in 
run_migrations_online()
  File "butane/warehouse/env.py", line 85, in run_migrations_online
context.run_migrations()
  File "", line 8, in run_migrations
  File 
"/root/.local/share/virtualenvs/butane-c47Aapyx/lib/python3.7/site-packages/alembic/runtime/environment.py",
 line 846, in run_migrations
self.get_context().run_migrations(**kw)
  File 
"/root/.local/share/virtualenvs/butane-c47Aapyx/lib/python3.7/site-packages/alembic/runtime/migration.py",
 line 509, in run_migrations
for step in self._migrations_fn(heads, self):
  File 
"/root/.local/share/virtualenvs/butane-c47Aapyx/lib/python3.7/site-packages/alembic/command.py",
 line 190, in retrieve_migrations
revision_context.run_autogenerate(rev, context)
  File 
"/root/.local/share/virtualenvs/butane-c47Aapyx/lib/python3.7/site-packages/alembic/autogenerate/api.py",
 line 442, in run_autogenerate
self._run_environment(rev, migration_context, True)
  File 
"/root/.local/share/virtualenvs/butane-c47Aapyx/lib/python3.7/site-packages/alembic/autogenerate/api.py",
 line 482, in _run_environment
autogen_context, migration_script
  File 
"/root/.local/share/virtualenvs/butane-c47Aapyx/lib/python3.7/site-packages/alembic/autogenerate/compare.py",
 line 25, in _populate_migration_script
_produce_net_changes(autogen_context, upgrade_ops)
  File 
"/root/.local/share/virtualenvs/butane-c47Aapyx/lib/python3.7/site-packages/alembic/autogenerate/compare.py",
 line 51, in _produce_net_changes
autogen_context, upgrade_ops, schemas
  File 
"/root/.local/share/virtualenvs/butane-c47Aapyx/lib/python3.7/site-packages/alembic/util/langhelpers.py",
 line 303, in go
fn(*arg, **kw)
  File 
"/root/.local/share/

KeyError: 'sqlnotapplicable' when running alembic revision --autogenerate

2020-09-03 Thread Ke Zhu - k...@us.ibm.com
I added new model, then run

$ pipenv run alembic revision --autogenerate -m "Add new table 
TRAVIS_ACTIVE_USERS" --depends-on=

Then I got this:

INFO  [alembic.runtime.migration] Context impl Db2Impl.
INFO  [alembic.runtime.migration] Will assume non-transactional DDL.
/root/.local/share/virtualenvs/butane-c47Aapyx/lib/python3.7/site-packages/sqlalchemy/engine/reflection.py:943:
 SAWarning: index key 'sqlnotapplicable' was not located in columns for table 
‘github_active_users'
  "columns for table '%s'" % (flavor, c, table_name)
Traceback (most recent call last):
  File "/root/.local/share/virtualenvs/butane-c47Aapyx/bin/alembic", line 8, in 

sys.exit(main())
  File 
"/root/.local/share/virtualenvs/butane-c47Aapyx/lib/python3.7/site-packages/alembic/config.py",
 line 577, in main
CommandLine(prog=prog).main(argv=argv)
  File 
"/root/.local/share/virtualenvs/butane-c47Aapyx/lib/python3.7/site-packages/alembic/config.py",
 line 571, in main
self.run_cmd(cfg, options)
  File 
"/root/.local/share/virtualenvs/butane-c47Aapyx/lib/python3.7/site-packages/alembic/config.py",
 line 551, in run_cmd
**dict((k, getattr(options, k, None)) for k in kwarg)
  File 
"/root/.local/share/virtualenvs/butane-c47Aapyx/lib/python3.7/site-packages/alembic/command.py",
 line 214, in revision
script_directory.run_env()
  File 
"/root/.local/share/virtualenvs/butane-c47Aapyx/lib/python3.7/site-packages/alembic/script/base.py",
 line 489, in run_env
util.load_python_file(self.dir, "env.py")
  File 
"/root/.local/share/virtualenvs/butane-c47Aapyx/lib/python3.7/site-packages/alembic/util/pyfiles.py",
 line 98, in load_python_file
module = load_module_py(module_id, path)
  File 
"/root/.local/share/virtualenvs/butane-c47Aapyx/lib/python3.7/site-packages/alembic/util/compat.py",
 line 184, in load_module_py
spec.loader.exec_module(module)
  File "", line 728, in exec_module
  File "", line 219, in _call_with_frames_removed
  File "butane/warehouse/env.py", line 91, in 
run_migrations_online()
  File "butane/warehouse/env.py", line 85, in run_migrations_online
context.run_migrations()
  File "", line 8, in run_migrations
  File 
"/root/.local/share/virtualenvs/butane-c47Aapyx/lib/python3.7/site-packages/alembic/runtime/environment.py",
 line 846, in run_migrations
self.get_context().run_migrations(**kw)
  File 
"/root/.local/share/virtualenvs/butane-c47Aapyx/lib/python3.7/site-packages/alembic/runtime/migration.py",
 line 509, in run_migrations
for step in self._migrations_fn(heads, self):
  File 
"/root/.local/share/virtualenvs/butane-c47Aapyx/lib/python3.7/site-packages/alembic/command.py",
 line 190, in retrieve_migrations
revision_context.run_autogenerate(rev, context)
  File 
"/root/.local/share/virtualenvs/butane-c47Aapyx/lib/python3.7/site-packages/alembic/autogenerate/api.py",
 line 442, in run_autogenerate
self._run_environment(rev, migration_context, True)
  File 
"/root/.local/share/virtualenvs/butane-c47Aapyx/lib/python3.7/site-packages/alembic/autogenerate/api.py",
 line 482, in _run_environment
autogen_context, migration_script
  File 
"/root/.local/share/virtualenvs/butane-c47Aapyx/lib/python3.7/site-packages/alembic/autogenerate/compare.py",
 line 25, in _populate_migration_script
_produce_net_changes(autogen_context, upgrade_ops)
  File 
"/root/.local/share/virtualenvs/butane-c47Aapyx/lib/python3.7/site-packages/alembic/autogenerate/compare.py",
 line 51, in _produce_net_changes
autogen_context, upgrade_ops, schemas
  File 
"/root/.local/share/virtualenvs/butane-c47Aapyx/lib/python3.7/site-packages/alembic/util/langhelpers.py",
 line 303, in go
fn(*arg, **kw)
  File 
"/root/.local/share/virtualenvs/butane-c47Aapyx/lib/python3.7/site-packages/alembic/autogenerate/compare.py",
 line 83, in _autogen_for_tables
autogen_context,
  File 
"/root/.local/share/virtualenvs/butane-c47Aapyx/lib/python3.7/site-packages/alembic/autogenerate/compare.py",
 line 170, in _compare_tables
autogen_context, modify_table_ops, s, tname, t, None
  File 
"/root/.local/share/virtualenvs/butane-c47Aapyx/lib/python3.7/site-packages/alembic/util/langhelpers.py",
 line 303, in go
fn(*arg, **kw)
  File 
"/root/.local/share/virtualenvs/butane-c47Aapyx/lib/python3.7/site-packages/alembic/autogenerate/compare.py",
 line 493, in _compare_indexes_and_uniques
conn_indexes = set(_make_index(ix, conn_table) for ix in conn_indexes)
  File 
"/root/.local/share/virtualenvs/butane-c47Aapyx/lib/python3.7/site-packages/alembic/autogenerate/compare.py",
 line 493, in 
conn_indexes = set(_make_index(ix, conn_table) for ix in conn_indexes)
  File 
"/root/.local/share/virtualenvs/butane-c47Aapyx/lib/python3.7/site-packages/alembic/autogenerate/compare.py",
 line 235, in _make_index
*[conn_table.c[cname] for cname in params["column_names"]],
  File 
"/root/.local/share/virtualenvs/butane-c47Aapyx/lib/python3.7/site-packages/alembic/autogenerate/compare.py",
 line 235, in 
 

Re: Re: Re: Re: Re: Integrate PyHive and Alembic

2020-03-02 Thread Ke Zhu - k...@us.ibm.com
Mike,

Thanks for merging my change on Alembic!

On Thu, 2020-02-27 at 15:38 -0500, Mike Bayer wrote:
> 
> 
> On Thu, Feb 27, 2020, at 2:49 PM, Ke Zhu - k...@us.ibm.com wrote:
> > On Wed, 2020-02-26 at 11:07 -0500, Mike Bayer wrote:
> > > oh, that issue is talking about rowcount.  Alembic does not need
> > > rowcount to function correctly.I see that Alembic is doing
> > > this now, however there is a dialect-level flag called
> > > "supports_sane_rowcount", if this were False, Alembic should be
> > > checking this and skipping that particular check.
> > 
> > I see. it would be great if Alembic will respect the
> > "support_sane_rowcount" specified in a SQLAlchemy dialect when
> > updating/deleting revision. If this is a suggested design, I can
> > patch it via a PR.
> 
> yes.
> 
> 
> > According to specific dialect, it just need to use this option
> > instead of returning "-1". I can document this in the issue 
> > https://github.com/dropbox/PyHive/issues/315
> 
> the pyhive dialect advertises the attribute correctly right?

That's the fix I proposed in that issue.

> 
> > > is the "rowcount" check the only issue here?   this part can be
> > > changed on the Alembic side.
> > 
> > Besides the above issue, I want to get input on 
> > https://github.com/dropbox/PyHive/issues/314 as well. How could I
> > use SQLAlchemy API to create table like what this sql does:
> > CREATE TABLE hello_acid (key int, value int)
> > STORED AS ORC TBLPROPERTIES ('transactional'='true');
> > I've no idea how to make `STORED AS and TBLPROPERTIES` part for a
> > Hive table. Now I just use SQLAlchemy compiler to append them for
> > CreateTable.
> 
> There's a few paths to make this work depending on where and how far
> you want to go, or at least how this works.
> 
> 
> Assuming these are *optional* arguments for the tables that you
> create with the pyhive dialect, the official way is that the dialect
> can be made to support these (I thought we had a generic "suffixes"
> section right now, but we don't).   The way they work can be seen in
> the Postgresql dialect.  start with the keywords you want, here I
> might call it pyhive_stored_as="ORC" 
> pyhive_tblproperties="('transactional'='true').  then they'd be set
> up as "stored_as" and "tblproperties" in the dialect's
> construct_arguments like Postgresql does here: 
> 
> https://github.com/sqlalchemy/sqlalchemy/blob/master/lib/sqlalchemy/dialects/postgresql/base.py#L2407
> 
> then you consume these in post_create_table:
> 
> https://github.com/sqlalchemy/sqlalchemy/blob/master/lib/sqlalchemy/dialects/postgresql/base.py#L2037
> 
> 
> if you need Alembic's alembic_version to have these keywords, im not
> sure we have a keyword for that right now however you can use
> metadata events to intercept when alembic_version is built up, or you
> can intercept the compiler as you're doing now.
> 
> if alternatively all pyhive tables need these keywords
> unconditionally, then you'd just implement post_create_table in the
> compiler to add them in.
> 

Thanks for the pointers! It's very useful. I will try these paths since
not all Hive tables need such keywords so far. It depends on how data
is ingested in Hive/hadoop.

> 
> > 
> > 
> > 
> > --
> > You received this message because you are subscribed to the Google
> > Groups "sqlalchemy-alembic" group.
> > To unsubscribe from this group and stop receiving emails from it,
> > send an email to sqlalchemy-alembic+unsubscr...@googlegroups.com.
> > To view this discussion on the web visit 
> > https://groups.google.com/d/msgid/sqlalchemy-alembic/ab42854af49a98c83fd3693c92dbd9ad5c59b9df.camel%40us.ibm.com
> > .
> 
> 

-- 
You received this message because you are subscribed to the Google Groups 
"sqlalchemy-alembic" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to sqlalchemy-alembic+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/sqlalchemy-alembic/c1186fc5514dda850a9ae08e61624b765cc8d686.camel%40us.ibm.com.


Re: Re: Re: Re: Integrate PyHive and Alembic

2020-02-27 Thread Ke Zhu - k...@us.ibm.com
On Wed, 2020-02-26 at 11:07 -0500, Mike Bayer wrote:
oh, that issue is talking about rowcount.  Alembic does not need rowcount to 
function correctly.I see that Alembic is doing this now, however there is a 
dialect-level flag called "supports_sane_rowcount", if this were False, Alembic 
should be checking this and skipping that particular check.

I see. it would be great if Alembic will respect the "support_sane_rowcount" 
specified in a SQLAlchemy dialect when updating/deleting revision. If this is a 
suggested design, I can patch it via a PR.

According to specific dialect, it just need to use this option instead of 
returning "-1". I can document this in the issue 
https://github.com/dropbox/PyHive/issues/315

is the "rowcount" check the only issue here?   this part can be changed on the 
Alembic side.

Besides the above issue, I want to get input on 
https://github.com/dropbox/PyHive/issues/314 as well. How could I use 
SQLAlchemy API to create table like what this sql does:

CREATE TABLE hello_acid (key int, value int)
STORED AS ORC TBLPROPERTIES ('transactional'='true');

I've no idea how to make `STORED AS and TBLPROPERTIES` part for a Hive table. 
Now I just use SQLAlchemy compiler to append them for CreateTable.


-- 
You received this message because you are subscribed to the Google Groups 
"sqlalchemy-alembic" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to sqlalchemy-alembic+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/sqlalchemy-alembic/ab42854af49a98c83fd3693c92dbd9ad5c59b9df.camel%40us.ibm.com.


Re: Re: Re: Integrate PyHive and Alembic

2020-02-25 Thread Ke Zhu - k...@us.ibm.com
According to Apache Hive, Update/Delete can only be performed on tables that 
supports ACID. see 
https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DML#LanguageManualDML-Update

> this is a database that can change schema structures but not 
> insert/update/delete rows? is there some kind of SQL layer that has to be in 
> use that isn't there when you create structures?

Yes! Basically it separate data store and data model so that you can update 
table schema as metadata while the actual data can be loaded/streamed into 
external data store w/o SQL interface.

> that wouldn't be very easy but also it would be way better to store the 
> version info in the target DB itself.

totally understand this. I see this design as advantage as well just like other 
schema management tool like sqitch. I'm just looking for possiblilities since 
I've seen it uses the same connection to execute schema changes and versioning 
change in a migration context.

> I don't know anything about Hive/Presto, but they *are* databases so I'd 
> assume you can put data in them

Yes to Hive, when using PyHive, it just needs to fix 
https://github.com/dropbox/PyHive/issues/315

While, the answer will be no to PrestoSQL which is just a SQL query engine that 
delegates data model and data store to query targets 
(MySQL/Postgres/Kafka/Elasticsearch etc) via connectors.

On Mon, 2020-02-24 at 18:28 -0500, Mike Bayer wrote:


On Mon, Feb 24, 2020, at 3:44 PM, Ke Zhu - k...@us.ibm.com wrote:
Mike,

Thanks for the pointers. I've figured out the programming part and discovers 
more things when integrating PyHive:

1. It requires the table alembic_version to be transactional so that it can do 
Update/Delete when upgrade/downgrade. which is challege for Hive3 which has 
limited ACID support. Since I could not figure out a way to program a 
transactional table via SQLAlechemy API (see 
https://github.com/dropbox/PyHive/issues/314<https://urldefense.proofpoint.com/v2/url?u=https-3A__github.com_dropbox_PyHive_issues_314&d=DwMFaQ&c=jf_iaSHvJObTbx-siA1ZOg&r=e0JaCS5rMieNhsG1YWcybg&m=vatsSAFNcrPb4GQHM2wFMnkeSpn14krYrz0SPGRZ-pI&s=OhjmAXLNxrY73OkGOm2Q87GwKeYeEKmfcxxnhM-GnVQ&e=>),
 it ended up a decoration to patch sqlalchemy.schema.CreateTable by appending 
table properties.

it's not a hard requirement that there's a DB transaction in use, you could run 
alembic in an "autocommit" mode and that shouldn't cause any problem.  you just 
won't be able to roll back if something fails.   im not completely sure what 
you mean by "the table to be transactional so that it can do update/delete" but 
from Alembic's point of view it just needs to run INSERT/UPDATE/DELETE but 
there doesn't have to be any ACID guarantees.


2. PyHive doesn't fully support Update/Delete for Hive/Presto yet. it's easy to 
patch PyHive but the key problem is it doesn't support transactional DDL like 
rollback in Hive3.

Although I've managed to get `alembic upgrade` and `alembic downgrade` to work 
on Hive 3, it's still not a fully transactional experience (e.g., changed 
schema sccessfully but failed to update table alembic_version).

so..this is a database that can change schema structures but not 
insert/update/delete rows?  im not following.  is there some kind of SQL layer 
that has to be in use that isn't there when you create structures?



I wonder if there's any design direction in Alembic to allowing storing the 
version table `alembic_version` in another db when dealing with non-RDBMS SQL 
engine (Hive/Presto). e.g., supporting a postgres db to store table alembic 
while delivering the actual changes to Hive. I had a PoC to using multi-db 
template to manage the table `alembic_version` in a RDBMS while denying any 
operation on table `alembic_version` in Hive/Presto. it works now but does it 
sound right?

that wouldn't be very easy but also it would be way better to store the version 
info in the target DB itself.   I don't know anything about Hive/Presto, but 
they *are* databases so I'd assume you can put data in them.



On Sat, 2020-01-25 at 18:19 -0500, Mike Bayer wrote:


On Fri, Jan 24, 2020, at 1:56 PM, Ke Zhu wrote:
Just discovered this post when trying to do exact same thing (besides planning 
to support one more dialect).

> Anywhere in your  hive dialect, simply put the above code that you have 
> (using the correct imports of course).

Does it mean it must introduce dependency to alembic (since it uses 
alembic.ddl.impl.DefaultImpl) in a package (.e.g, pyHive) that supports 
sqlalchemy interfaces?

well you have to put it in a try/except ImportError block so that if alembic 
isn't installed, it silently passes.   there's a github issue to add support 
for real entrypoints but it hasn't been that critical.


If not, is there any guidance to sup

Re: Re: Integrate PyHive and Alembic

2020-02-24 Thread Ke Zhu - k...@us.ibm.com
Mike,
Thanks for the pointers. I've figured out the programming part and
discovers more things when integrating PyHive:
1. It requires the table alembic_version to be transactional so that it
can do Update/Delete when upgrade/downgrade. which is challege for
Hive3 which has limited ACID support. Since I could not figure out a
way to program a transactional table via SQLAlechemy API (see 
https://github.com/dropbox/PyHive/issues/314), it ended up a decoration
to patch sqlalchemy.schema.CreateTable by appending table properties.2.
PyHive doesn't fully support Update/Delete for Hive/Presto yet. it's
easy to patch PyHive but the key problem is it doesn't support
transactional DDL like rollback in Hive3.
Although I've managed to get `alembic upgrade` and  `alembic downgrade`
to work on Hive 3, it's still not a fully transactional experience
(e.g., changed schema sccessfully but failed to update table
alembic_version).
I wonder if there's any design direction in Alembic to allowing storing
the version table `alembic_version` in another db when dealing with
non-RDBMS SQL engine (Hive/Presto). e.g., supporting a postgres db to
store table alembic while delivering the actual changes to Hive. I had
a PoC to using multi-db template to manage the table `alembic_version`
in a RDBMS while denying any operation on table `alembic_version` in
Hive/Presto. it works now but does it sound right?
On Sat, 2020-01-25 at 18:19 -0500, Mike Bayer wrote:
> On Fri, Jan 24, 2020, at 1:56 PM, Ke Zhu wrote:
> > Just discovered this post when trying to do exact same thing
> > (besides planning to support one more dialect).
> > 
> > > Anywhere in your  hive dialect, simply put the above code that
> > you have (using the correct imports of course). 
> > 
> > Does it mean it must introduce dependency to alembic (since it uses
> > alembic.ddl.impl.DefaultImpl) in a package (.e.g, pyHive) that
> > supports sqlalchemy interfaces?
> 
> well you have to put it in a try/except ImportError block so that if
> alembic isn't installed, it silently passes.   there's a github issue
> to add support for real entrypoints but it hasn't been that critical.
> 
> > If not, is there any guidance to support this at alembic level in a
> > plug-gable way? E.g., declare a HiveImpl class in `env.py` of a
> > project uses alembic?
> 
> you could put one in your env.py also but if you are the person
> working on the dialect you can have this built in, see the example in
> sqlalchemy-redshift: 
> https://github.com/sqlalchemy-redshift/sqlalchemy-redshift/blob/master/sqlalchemy_redshift/dialect.py#L27
> 
> 
> 
> > PS: I raised this question in stackoverflow but raised this group
> > is a better place to get help.
> > 
> > 
> > On Friday, February 10, 2017 at 9:45:38 AM UTC-5, mike bayer wrote:
> > > On 02/10/2017 07:41 AM, Alexander Peletz wrote: 
> > > > Hello, 
> > > > 
> > > > I would like to use Alembic to manage my Hive Metastore. I
> > > have 
> > > > installed, PyHive, SqlAlchemy, and Alembic. I am able to create
> > > a 
> > > > functional engine object using the 'hive' dialect in
> > > sqlalchemy, however 
> > > > I cannot get Alembic to recognize this dialect. The problem
> > > appears to 
> > > > be a lack of a HiveImpl class in the Alembic package. I
> > > attempted to 
> > > > resolve this by creating an alembic/ddl/hive.py module and
> > > pasting the 
> > > > following code into that module: 
> > > > 
> > > > 
> > > > from .impl import DefaultImpl 
> > > > 
> > > > class HiveImpl(DefaultImpl): 
> > > > __dialect__ = 'hive' 
> > >  
> > >  
> > >  
> > > you don't actually have to create a "hive.py" file.  Anywhere in
> > > your 
> > > hive dialect, simply put the above code that you have (using the
> > > correct 
> > > imports of course).  The DefaultImpl uses a metaclass that will
> > > allow 
> > > the "hive" name to be available to alembic as a result of this
> > > class 
> > > being created. 
> > >  
> > >  
> > > > 
> > > > 
> > > > 
> > > > 
> > > > I simply want to be able to execute raw SQL against a Hive
> > > instance (no 
> > > > ORM implementation needed) and I was hoping to use Alembic to
> > > manage the 
> > > > minimum upgrade/downgrade functionality. Are there any simple
> > > edits I 
> > > > can make to the Alembic source code to allow me to achieve this
> > > goal? 
> > > > 
> > > > 
> > > > 
> > > > Thanks, 
> > > > 
> > > > Alexander 
> > > > 
> > > > -- 
> > 
> > --
> >  You received this message because you are subscribed to the Google
> > Groups "sqlalchemy-alembic" group.
> >  To unsubscribe from this group and stop receiving emails from it,
> > send an email to sqlalchemy-alembic+unsubscr...@googlegroups.com.
> >  To view this discussion on the web visit 
> > https://groups.google.com/d/msgid/sqlalchemy-alembic/59b308d9-7a9f-4038-bb52-f578c2c9cb69%40googlegroups.com
> > .
> 
> 
> 
> 

-- 
You received this message because you are subscribed to the Google Groups 
"sqlalchemy-alembic" group.
To unsubscribe from this group an