According to Apache Hive, Update/Delete can only be performed on tables that 
supports ACID. see 
https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DML#LanguageManualDML-Update

> this is a database that can change schema structures but not 
> insert/update/delete rows? is there some kind of SQL layer that has to be in 
> use that isn't there when you create structures?

Yes! Basically it separate data store and data model so that you can update 
table schema as metadata while the actual data can be loaded/streamed into 
external data store w/o SQL interface.

> that wouldn't be very easy but also it would be way better to store the 
> version info in the target DB itself.

totally understand this. I see this design as advantage as well just like other 
schema management tool like sqitch. I'm just looking for possiblilities since 
I've seen it uses the same connection to execute schema changes and versioning 
change in a migration context.

> I don't know anything about Hive/Presto, but they *are* databases so I'd 
> assume you can put data in them

Yes to Hive, when using PyHive, it just needs to fix 
https://github.com/dropbox/PyHive/issues/315

While, the answer will be no to PrestoSQL which is just a SQL query engine that 
delegates data model and data store to query targets 
(MySQL/Postgres/Kafka/Elasticsearch etc) via connectors.

On Mon, 2020-02-24 at 18:28 -0500, Mike Bayer wrote:


On Mon, Feb 24, 2020, at 3:44 PM, Ke Zhu - k...@us.ibm.com wrote:
Mike,

Thanks for the pointers. I've figured out the programming part and discovers 
more things when integrating PyHive:

1. It requires the table alembic_version to be transactional so that it can do 
Update/Delete when upgrade/downgrade. which is challege for Hive3 which has 
limited ACID support. Since I could not figure out a way to program a 
transactional table via SQLAlechemy API (see 
https://github.com/dropbox/PyHive/issues/314<https://urldefense.proofpoint.com/v2/url?u=https-3A__github.com_dropbox_PyHive_issues_314&d=DwMFaQ&c=jf_iaSHvJObTbx-siA1ZOg&r=e0JaCS5rMieNhsG1YWcybg&m=vatsSAFNcrPb4GQHM2wFMnkeSpn14krYrz0SPGRZ-pI&s=OhjmAXLNxrY73OkGOm2Q87GwKeYeEKmfcxxnhM-GnVQ&e=>),
 it ended up a decoration to patch sqlalchemy.schema.CreateTable by appending 
table properties.

it's not a hard requirement that there's a DB transaction in use, you could run 
alembic in an "autocommit" mode and that shouldn't cause any problem.  you just 
won't be able to roll back if something fails.   im not completely sure what 
you mean by "the table to be transactional so that it can do update/delete" but 
from Alembic's point of view it just needs to run INSERT/UPDATE/DELETE but 
there doesn't have to be any ACID guarantees.


2. PyHive doesn't fully support Update/Delete for Hive/Presto yet. it's easy to 
patch PyHive but the key problem is it doesn't support transactional DDL like 
rollback in Hive3.

Although I've managed to get `alembic upgrade` and `alembic downgrade` to work 
on Hive 3, it's still not a fully transactional experience (e.g., changed 
schema sccessfully but failed to update table alembic_version).

so..this is a database that can change schema structures but not 
insert/update/delete rows?  im not following.  is there some kind of SQL layer 
that has to be in use that isn't there when you create structures?



I wonder if there's any design direction in Alembic to allowing storing the 
version table `alembic_version` in another db when dealing with non-RDBMS SQL 
engine (Hive/Presto). e.g., supporting a postgres db to store table alembic 
while delivering the actual changes to Hive. I had a PoC to using multi-db 
template to manage the table `alembic_version` in a RDBMS while denying any 
operation on table `alembic_version` in Hive/Presto. it works now but does it 
sound right?

that wouldn't be very easy but also it would be way better to store the version 
info in the target DB itself.   I don't know anything about Hive/Presto, but 
they *are* databases so I'd assume you can put data in them.



On Sat, 2020-01-25 at 18:19 -0500, Mike Bayer wrote:


On Fri, Jan 24, 2020, at 1:56 PM, Ke Zhu wrote:
Just discovered this post when trying to do exact same thing (besides planning 
to support one more dialect).

> Anywhere in your  hive dialect, simply put the above code that you have 
> (using the correct imports of course).

Does it mean it must introduce dependency to alembic (since it uses 
alembic.ddl.impl.DefaultImpl) in a package (.e.g, pyHive) that supports 
sqlalchemy interfaces?

well you have to put it in a try/except ImportError block so that if alembic 
isn't installed, it silently passes.   there's a github issue to add support 
for real entrypoints but it hasn't been that critical.


If not, is there any guidance to support this at alembic level in a plug-gable 
way? E.g., declare a HiveImpl class in `env.py` of a project uses alembic?

you could put one in your env.py also but if you are the person working on the 
dialect you can have this built in, see the example in sqlalchemy-redshift: 
https://github.com/sqlalchemy-redshift/sqlalchemy-redshift/blob/master/sqlalchemy_redshift/dialect.py#L27<https://urldefense.proofpoint.com/v2/url?u=https-3A__github.com_sqlalchemy-2Dredshift_sqlalchemy-2Dredshift_blob_master_sqlalchemy-5Fredshift_dialect.py-23L27&d=DwMFaQ&c=jf_iaSHvJObTbx-siA1ZOg&r=e0JaCS5rMieNhsG1YWcybg&m=froqtQlu2FU5RhgkzLXlily86IEGWTyf6iZZV2219n8&s=0MY0coM-F1OHHH0XURBJmXbbzbboExEXzrx_GekfzqA&e=>




PS: I raised this 
question<https://urldefense.proofpoint.com/v2/url?u=https-3A__stackoverflow.com_questions_59887588_how-2Dto-2Dadd-2Dnew-2Ddialect-2Dto-2Dalembic-2Dbesides-2Dbuilt-2Din-2Ddialects&d=DwMFaQ&c=jf_iaSHvJObTbx-siA1ZOg&r=e0JaCS5rMieNhsG1YWcybg&m=froqtQlu2FU5RhgkzLXlily86IEGWTyf6iZZV2219n8&s=5iBXdz3bhfClgt2qA9bV1Q-KzEuG4P2n0KsHFx6w4VI&e=>
 in stackoverflow but raised this group is a better place to get help.


On Friday, February 10, 2017 at 9:45:38 AM UTC-5, mike bayer wrote:


On 02/10/2017 07:41 AM, Alexander Peletz wrote:
> Hello,
>
> I would like to use Alembic to manage my Hive Metastore. I have
> installed, PyHive, SqlAlchemy, and Alembic. I am able to create a
> functional engine object using the 'hive' dialect in sqlalchemy, however
> I cannot get Alembic to recognize this dialect. The problem appears to
> be a lack of a HiveImpl class in the Alembic package. I attempted to
> resolve this by creating an alembic/ddl/hive.py module and pasting the
> following code into that module:
>
>
> from .impl import DefaultImpl
>
> class HiveImpl(DefaultImpl):
>     __dialect__ = 'hive'



you don't actually have to create a "hive.py" file.  Anywhere in your
hive dialect, simply put the above code that you have (using the correct
imports of course).  The DefaultImpl uses a metaclass that will allow
the "hive" name to be available to alembic as a result of this class
being created.


>
>
>
>
> I simply want to be able to execute raw SQL against a Hive instance (no
> ORM implementation needed) and I was hoping to use Alembic to manage the
> minimum upgrade/downgrade functionality. Are there any simple edits I
> can make to the Alembic source code to allow me to achieve this goal?
>
>
>
> Thanks,
>
> Alexander
>
> --


--
You received this message because you are subscribed to the Google Groups 
"sqlalchemy-alembic" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to 
sqlalchemy-alembic+unsubscr...@googlegroups.com<mailto:sqlalchemy-alembic+unsubscr...@googlegroups.com>.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/sqlalchemy-alembic/59b308d9-7a9f-4038-bb52-f578c2c9cb69%40googlegroups.com<https://urldefense.proofpoint.com/v2/url?u=https-3A__groups.google.com_d_msgid_sqlalchemy-2Dalembic_59b308d9-2D7a9f-2D4038-2Dbb52-2Df578c2c9cb69-2540googlegroups.com-3Futm-5Fmedium-3Demail-26utm-5Fsource-3Dfooter&d=DwMFaQ&c=jf_iaSHvJObTbx-siA1ZOg&r=e0JaCS5rMieNhsG1YWcybg&m=froqtQlu2FU5RhgkzLXlily86IEGWTyf6iZZV2219n8&s=ezHG2RhuQTiK2c1uRvomzUwSwnBDGsv_ZcSlPqsArFQ&e=>.




--
You received this message because you are subscribed to the Google Groups 
"sqlalchemy-alembic" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to 
sqlalchemy-alembic+unsubscr...@googlegroups.com<mailto:sqlalchemy-alembic+unsubscr...@googlegroups.com>.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/sqlalchemy-alembic/d0623681577b433554974720744cb5dd835dd126.camel%40us.ibm.com<https://urldefense.proofpoint.com/v2/url?u=https-3A__groups.google.com_d_msgid_sqlalchemy-2Dalembic_d0623681577b433554974720744cb5dd835dd126.camel-2540us.ibm.com-3Futm-5Fmedium-3Demail-26utm-5Fsource-3Dfooter&d=DwMFaQ&c=jf_iaSHvJObTbx-siA1ZOg&r=e0JaCS5rMieNhsG1YWcybg&m=vatsSAFNcrPb4GQHM2wFMnkeSpn14krYrz0SPGRZ-pI&s=ykXVWBrag1g-IiNP13I7r2F6Av3hW_YtR78sPwzKSR8&e=>.


Attachments:

  *   smime.p7s

-- 
You received this message because you are subscribed to the Google Groups 
"sqlalchemy-alembic" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to sqlalchemy-alembic+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/sqlalchemy-alembic/8414ea00ba7a587499b71457649aa7717661d82c.camel%40us.ibm.com.

Reply via email to