Hi,

One more thing - I am talking about Spark in cluster mode without Hadoop.
Regards,
Upkar

Sent from my iPhone

> On 30-Jun-2017, at 07:55, upkar.ko...@gmail.com wrote:
>
> Hi,
>
> This is my line of thinking - Spark offers a variety of transformations that
> would support most of the use cases for replacing an ETL tool such as
> Informatica. The E and T parts of ETL are perfectly covered. Loading, though,
> may generally require more functionality. Spinning up an Informatica cluster,
> which also has a master-slave architecture, would cost $$. I know Pentaho and
> other such tools exist to support this use case. But can we do the same with
> a Spark cluster?
>
> Regards,
> Upkar
>
> Sent from my iPhone
>
>> On 29-Jun-2017, at 22:06, Gourav Sengupta <gourav.sengu...@gmail.com> wrote:
>>
>> SPARK + JDBC.
>>
>> But why?
>>
>> Regards,
>> Gourav Sengupta
>>
>>> On Thu, Jun 29, 2017 at 3:44 PM, upkar_kohli <upkar.ko...@gmail.com> wrote:
>>> Hi,
>>>
>>> Has anyone tried mixing Spark with some of the other Python JDBC/ODBC
>>> packages to create an end-to-end ETL framework? The framework would enable
>>> update, delete, and other DML operations, along with stored procedure /
>>> function calls, across a variety of databases. Any setup that would be
>>> easy to use.
>>>
>>> I only know of a few ODBC Python packages that are production-ready and
>>> widely used, such as pyodbc or SQLAlchemy.
>>>
>>> JayDeBeApi, which can interface with JDBC, is still in beta.
>>>
>>> Would it be a bad use case if this were attempted with foreachPartition
>>> through Spark? If not, what could be a good stack for such an
>>> implementation using Python?
>>>
>>> Regards,
>>> Upkar
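
For what the foreachPartition approach could look like, here is a minimal
sketch, assuming a standalone (non-Hadoop) Spark cluster, pyodbc installed on
every worker, and a hypothetical DSN "TargetDB" with an illustrative "orders"
table - all names and connection details are placeholders, not a tested
implementation:

    # Sketch only: one DB connection per partition, batched DML via pyodbc.
    from pyspark.sql import SparkSession

    spark = (SparkSession.builder
             .master("spark://master-host:7077")  # standalone cluster, no Hadoop
             .appName("etl-load-sketch")
             .getOrCreate())

    df = spark.read.parquet("/data/staged_orders")  # hypothetical staged input

    def load_partition(rows):
        """Runs on the worker: open a connection, flush updates in batches."""
        import pyodbc  # imported here because connections are not picklable

        conn = pyodbc.connect("DSN=TargetDB;UID=etl_user;PWD=secret")  # placeholder
        cursor = conn.cursor()
        sql = "UPDATE orders SET status = ? WHERE order_id = ?"
        batch = []
        for row in rows:
            batch.append((row.status, row.order_id))
            if len(batch) >= 1000:  # flush in chunks to bound memory
                cursor.executemany(sql, batch)
                batch.clear()
        if batch:
            cursor.executemany(sql, batch)
        conn.commit()  # one commit per partition; not one global transaction
        cursor.close()
        conn.close()

    df.foreachPartition(load_partition)

One caveat: a retried task re-runs its partition, so foreachPartition gives
at-least-once semantics and the DML should be idempotent (as the UPDATE above
is) or deduplicated in the target. Stored procedures can be invoked the same
way through the standard ODBC call escape, e.g.
cursor.execute("{CALL my_proc(?, ?)}", params).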