Hey Arash -- We wrote this for a use case similar to yours (as I understand it). It's an opinionated operator (it assumes the data is loaded from AWS S3), but it has a pseudo-"upsert" method (INSERT ... ON DUPLICATE KEY UPDATE) for loading data, so you might be able to adapt it to your needs.
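In case it helps, here is roughly what that upsert pattern looks like. This is a minimal sketch under my own assumptions, not the code from our operator: the helper names (`build_upsert_sql`, `chunked`, `bulk_upsert`) are hypothetical, `conn` is assumed to be any DB-API connection whose driver uses `%s` placeholders (mysqlclient and PyMySQL do), and the table is assumed to have a PRIMARY KEY or UNIQUE index on the lookup columns.

```python
from typing import Iterable, Iterator, List, Sequence


def build_upsert_sql(table: str, columns: Sequence[str], key_columns: Sequence[str]) -> str:
    """Build a MySQL INSERT ... ON DUPLICATE KEY UPDATE statement.

    Assumes key_columns are covered by a PRIMARY KEY or UNIQUE index, and that
    table/column names come from trusted code (they are not escaped here).
    """
    update_columns = [c for c in columns if c not in key_columns]
    if not update_columns:
        raise ValueError("need at least one non-key column to update")
    col_list = ", ".join(f"`{c}`" for c in columns)
    placeholders = ", ".join(["%s"] * len(columns))
    # VALUES(col) refers to the value the INSERT would have written.
    # MySQL 8.0.20+ deprecates it in favor of a row alias, but still accepts it.
    updates = ", ".join(f"`{c}` = VALUES(`{c}`)" for c in update_columns)
    return (
        f"INSERT INTO `{table}` ({col_list}) VALUES ({placeholders}) "
        f"ON DUPLICATE KEY UPDATE {updates}"
    )


def chunked(rows: Iterable, size: int) -> Iterator[List]:
    """Yield lists of at most `size` rows so a large load is sent in batches."""
    batch: List = []
    for row in rows:
        batch.append(row)
        if len(batch) == size:
            yield batch
            batch = []
    if batch:
        yield batch


def bulk_upsert(conn, table, columns, key_columns, rows, batch_size=1000) -> int:
    """Upsert rows in batches through a DB-API connection; return rows sent."""
    sql = build_upsert_sql(table, columns, key_columns)
    total = 0
    cursor = conn.cursor()
    try:
        for batch in chunked(rows, batch_size):
            cursor.executemany(sql, batch)
            total += len(batch)
        conn.commit()  # single commit at the end: all batches or none
    finally:
        cursor.close()
    return total
```

Inside an Airflow task you could probably get `conn` from `MySqlHook`'s `get_conn()` and call `bulk_upsert(conn, "users", ["id", "name", "email"], ["id"], rows)`. The batch size of 1000 is an arbitrary starting point, so tune it for your row width and server settings.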
https://github.com/airflow-plugins/mysql_plugin/blob/master/operators/s3_to_mysql_operator.py#L9

-Ben

On Tue, Jun 5, 2018 at 8:55 PM Arash Soheili <tonyar...@gmail.com> wrote:

> I have looked through those and didn't find what I needed, although there
> is the MySQL operator, and I have used that to implement an insert or
> update.
>
> I was looking for something like this:
>
> https://wiki.pentaho.com/plugins/servlet/mobile?contentId=8292089#content/view/8292089
>
> A way to bulk insert or update based on a lookup key. What would be the
> most optimized way to do this in Airflow?
>
> On Tue, Jun 5, 2018, 9:47 PM Taylor Edmiston <tedmis...@gmail.com> wrote:
>
> > Hey Arash -
> >
> > There are some common operators built-in
> > <https://github.com/apache/incubator-airflow/tree/master/airflow/operators>
> > to Airflow and some in contrib
> > <https://github.com/apache/incubator-airflow/tree/master/airflow/contrib/operators>
> > as well.
> >
> > We also maintain a community sourced GitHub org of Airflow plugins
> > (mostly hooks and operators) at https://github.com/airflow-plugins.
> >
> > Are there specific sources/destinations you're looking for to match what
> > you use in Pentaho?
> >
> > Best,
> > Taylor
> >
> > *Taylor Edmiston*
> > Blog <https://blog.tedmiston.com/> | CV
> > <https://stackoverflow.com/cv/taylor> | LinkedIn
> > <https://www.linkedin.com/in/tedmiston/> | AngelList
> > <https://angel.co/taylor> | Stack Overflow
> > <https://stackoverflow.com/users/149428/taylor-edmiston>
> >
> > On Tue, Jun 5, 2018 at 8:57 PM, Arash Soheili <tonyar...@gmail.com> wrote:
> >
> > > Hi,
> > >
> > > I'm new to Airflow and helping to set up our organization to transition
> > > away from using Pentaho Data Integration for our ETL. Although there
> > > are a lot of things I don't like about Pentaho, they do have some nice
> > > standard modules like batch database insert/update, which are common
> > > ETL tasks.
> > >
> > > As I'm new to Airflow, I haven't seen any standard Operators for this
> > > kind of task, which I would think would be a common use case in Airflow
> > > or any ETL. Am I missing this information, or is each Airflow user
> > > expected to implement their own standard operators for this kind of
> > > operation? I would think this should at some point become part of the
> > > Airflow codebase.
> > >
> > > Arash

--
Ben Gregory
Data Engineer, Astronomer <https://www.astronomer.io/>