On Tue, Feb 25, 2020, at 11:06 AM, Ke Zhu - k...@us.ibm.com wrote: > According to Apache Hive, Update/Delete can only be performed on tables that > supports ACID. see > https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DML#LanguageManualDML-Update > > > this is a database that can change schema structures but not > > insert/update/delete rows? is there some kind of SQL layer that has to be > > in use that isn't there when you create structures? > > Yes! Basically it separate data store and data model so that you can update > table schema as metadata while the actual data can be loaded/streamed into > external data store w/o SQL interface. > > > that wouldn't be very easy but also it would be way better to store the > > version info in the target DB itself. > > totally understand this. I see this design as advantage as well just like > other schema management tool like sqitch. I'm just looking for possiblilities > since I've seen it uses the same connection to execute schema changes and > versioning change in a migration context. > > > I don't know anything about Hive/Presto, but they *are* databases so I'd > > assume you can put data in them > > Yes to Hive, when using PyHive, it just needs to fix > https://github.com/dropbox/PyHive/issues/315
oh, that issue is talking about rowcount. Alembic does not need rowcount to function correctly. I see that Alembic is doing this now, however there is a dialect-level flag called "supports_sane_rowcount", if this were False, Alembic should be checking this and skipping that particular check. is the "rowcount" check the only issue here? this part can be changed on the Alembic side. > > While, the answer will be no to PrestoSQL which is just a SQL query engine > that delegates data model and data store to query targets > (MySQL/Postgres/Kafka/Elasticsearch etc) via connectors. > > On Mon, 2020-02-24 at 18:28 -0500, Mike Bayer wrote: >> >> >> On Mon, Feb 24, 2020, at 3:44 PM, Ke Zhu - k...@us.ibm.com wrote: >>> Mike, >>> >>> Thanks for the pointers. I've figured out the programming part and >>> discovers more things when integrating PyHive: >>> >>> 1. It requires the table alembic_version to be transactional so that it can >>> do Update/Delete when upgrade/downgrade. which is challege for Hive3 which >>> has limited ACID support. Since I could not figure out a way to program a >>> transactional table via SQLAlechemy API (see >>> https://github.com/dropbox/PyHive/issues/314), it ended up a decoration to >>> patch sqlalchemy.schema.CreateTable by appending table properties. >> >> it's not a hard requirement that there's a DB transaction in use, you could >> run alembic in an "autocommit" mode and that shouldn't cause any problem. >> you just won't be able to roll back if something fails. im not completely >> sure what you mean by "the table to be transactional so that it can do >> update/delete" but from Alembic's point of view it just needs to run >> INSERT/UPDATE/DELETE but there doesn't have to be any ACID guarantees. >> >> >>> 2. PyHive doesn't fully support Update/Delete for Hive/Presto yet. it's >>> easy to patch PyHive but the key problem is it doesn't support >>> transactional DDL like rollback in Hive3. >>> >>> Although I've managed to get `alembic upgrade` and `alembic downgrade` to >>> work on Hive 3, it's still not a fully transactional experience (e.g., >>> changed schema sccessfully but failed to update table alembic_version). >> >> so..this is a database that can change schema structures but not >> insert/update/delete rows? im not following. is there some kind of SQL layer >> that has to be in use that isn't there when you create structures? >> >> >>> >>> I wonder if there's any design direction in Alembic to allowing storing the >>> version table `alembic_version` in another db when dealing with non-RDBMS >>> SQL engine (Hive/Presto). e.g., supporting a postgres db to store table >>> alembic while delivering the actual changes to Hive. I had a PoC to using >>> multi-db template to manage the table `alembic_version` in a RDBMS while >>> denying any operation on table `alembic_version` in Hive/Presto. it works >>> now but does it sound right? >> >> that wouldn't be very easy but also it would be way better to store the >> version info in the target DB itself. I don't know anything about >> Hive/Presto, but they *are* databases so I'd assume you can put data in them. >> >> >>> >>> On Sat, 2020-01-25 at 18:19 -0500, Mike Bayer wrote: >>>> >>>> >>>> On Fri, Jan 24, 2020, at 1:56 PM, Ke Zhu wrote: >>>>> Just discovered this post when trying to do exact same thing (besides >>>>> planning to support one more dialect). >>>>> >>>>> > Anywhere in your hive dialect, simply put the above code that you have >>>>> > (using the correct imports of course). >>>>> >>>>> Does it mean it must introduce dependency to alembic (since it uses >>>>> alembic.ddl.impl.DefaultImpl) in a package (.e.g, pyHive) that supports >>>>> sqlalchemy interfaces? >>>> >>>> well you have to put it in a try/except ImportError block so that if >>>> alembic isn't installed, it silently passes. there's a github issue to add >>>> support for real entrypoints but it hasn't been that critical. >>>> >>>>> >>>>> If not, is there any guidance to support this at alembic level in a >>>>> plug-gable way? E.g., declare a HiveImpl class in `env.py` of a project >>>>> uses alembic? >>>> >>>> you could put one in your env.py also but if you are the person working on >>>> the dialect you can have this built in, see the example in >>>> sqlalchemy-redshift: >>>> https://github.com/sqlalchemy-redshift/sqlalchemy-redshift/blob/master/sqlalchemy_redshift/dialect.py#L27 >>>> >>>> >>>> >>>>> >>>>> PS: I raised this question >>>>> <https://stackoverflow.com/questions/59887588/how-to-add-new-dialect-to-alembic-besides-built-in-dialects> >>>>> in stackoverflow but raised this group is a better place to get help. >>>>> >>>>> >>>>> On Friday, February 10, 2017 at 9:45:38 AM UTC-5, mike bayer wrote: >>>>>> >>>>>> >>>>>> On 02/10/2017 07:41 AM, Alexander Peletz wrote: >>>>>> > Hello, >>>>>> > >>>>>> > I would like to use Alembic to manage my Hive Metastore. I have >>>>>> > installed, PyHive, SqlAlchemy, and Alembic. I am able to create a >>>>>> > functional engine object using the 'hive' dialect in sqlalchemy, >>>>>> > however >>>>>> > I cannot get Alembic to recognize this dialect. The problem appears to >>>>>> > be a lack of a HiveImpl class in the Alembic package. I attempted to >>>>>> > resolve this by creating an alembic/ddl/hive.py module and pasting the >>>>>> > following code into that module: >>>>>> > >>>>>> > >>>>>> > from .impl import DefaultImpl >>>>>> > >>>>>> > class HiveImpl(DefaultImpl): >>>>>> > __dialect__ = 'hive' >>>>>> >>>>>> >>>>>> >>>>>> you don't actually have to create a "hive.py" file. Anywhere in your >>>>>> hive dialect, simply put the above code that you have (using the correct >>>>>> imports of course). The DefaultImpl uses a metaclass that will allow >>>>>> the "hive" name to be available to alembic as a result of this class >>>>>> being created. >>>>>> >>>>>> >>>>>> > >>>>>> > >>>>>> > >>>>>> > >>>>>> > I simply want to be able to execute raw SQL against a Hive instance >>>>>> > (no >>>>>> > ORM implementation needed) and I was hoping to use Alembic to manage >>>>>> > the >>>>>> > minimum upgrade/downgrade functionality. Are there any simple edits I >>>>>> > can make to the Alembic source code to allow me to achieve this goal? >>>>>> > >>>>>> > >>>>>> > >>>>>> > Thanks, >>>>>> > >>>>>> > Alexander >>>>>> > >>>>>> > -- >>>>> >>>>> -- >>>>> You received this message because you are subscribed to the Google Groups >>>>> "sqlalchemy-alembic" group. >>>>> To unsubscribe from this group and stop receiving emails from it, send an >>>>> email to sqlalchemy-alembic+unsubscr...@googlegroups.com. >>>>> To view this discussion on the web visit >>>>> https://groups.google.com/d/msgid/sqlalchemy-alembic/59b308d9-7a9f-4038-bb52-f578c2c9cb69%40googlegroups.com >>>>> >>>>> <https://groups.google.com/d/msgid/sqlalchemy-alembic/59b308d9-7a9f-4038-bb52-f578c2c9cb69%40googlegroups.com?utm_medium=email&utm_source=footer>. >>>> >>>> >>> >>> -- >>> You received this message because you are subscribed to the Google Groups >>> "sqlalchemy-alembic" group. >>> To unsubscribe from this group and stop receiving emails from it, send an >>> email to sqlalchemy-alembic+unsubscr...@googlegroups.com. >>> To view this discussion on the web visit >>> https://groups.google.com/d/msgid/sqlalchemy-alembic/d0623681577b433554974720744cb5dd835dd126.camel%40us.ibm.com >>> >>> <https://groups.google.com/d/msgid/sqlalchemy-alembic/d0623681577b433554974720744cb5dd835dd126.camel%40us.ibm.com?utm_medium=email&utm_source=footer>. >>> >>> >>> *Attachments:* >>> * smime.p7s >> > > -- > You received this message because you are subscribed to the Google Groups > "sqlalchemy-alembic" group. > To unsubscribe from this group and stop receiving emails from it, send an > email to sqlalchemy-alembic+unsubscr...@googlegroups.com. > To view this discussion on the web visit > https://groups.google.com/d/msgid/sqlalchemy-alembic/8414ea00ba7a587499b71457649aa7717661d82c.camel%40us.ibm.com > > <https://groups.google.com/d/msgid/sqlalchemy-alembic/8414ea00ba7a587499b71457649aa7717661d82c.camel%40us.ibm.com?utm_medium=email&utm_source=footer>. -- You received this message because you are subscribed to the Google Groups "sqlalchemy-alembic" group. To unsubscribe from this group and stop receiving emails from it, send an email to sqlalchemy-alembic+unsubscr...@googlegroups.com. To view this discussion on the web visit https://groups.google.com/d/msgid/sqlalchemy-alembic/584f1f4e-7fa6-45aa-9e32-e2a476e18b47%40www.fastmail.com.