So you want to use Spark as the query engine accessing DB2 tables via JDBC?
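
If so, here is a rough sketch of what that could look like in PySpark. It is only an illustration under assumptions: the host, port, database, table name SCHEMA.BOOKS, join column BOOK_ID and the helper names are all hypothetical and do not come from your job. The driver class is the one shipped with the IBM JDBC driver jar, which would have to be on the Spark classpath. The option keys are the standard ones of Spark's JDBC data source.

```python
def db2_jdbc_options(host, port, database, table, user, password,
                     partition_column=None, lower_bound=None,
                     upper_bound=None, num_partitions=None):
    """Build the option map for Spark's JDBC data source against DB2.

    All argument values are placeholders supplied by the caller.
    """
    options = {
        "url": f"jdbc:db2://{host}:{port}/{database}",
        "driver": "com.ibm.db2.jcc.DB2Driver",  # IBM JDBC driver class
        "dbtable": table,
        "user": user,
        "password": password,
    }
    if partition_column is not None:
        # Spark needs all four settings together to split the read
        # into parallel range queries on a numeric column.
        if None in (lower_bound, upper_bound, num_partitions):
            raise ValueError("partitioned reads need lower_bound, "
                             "upper_bound and num_partitions")
        options.update({
            "partitionColumn": partition_column,
            "lowerBound": str(lower_bound),
            "upperBound": str(upper_bound),
            "numPartitions": str(num_partitions),
        })
    return options


def load_table(spark, options):
    """Read one DB2 table into a DataFrame using an existing SparkSession."""
    return spark.read.format("jdbc").options(**options).load()


def enrich_orders(spark, orders_path, book_options):
    """Join the order file with one lookup table on a hypothetical BOOK_ID."""
    orders = spark.read.csv(orders_path, header=True)
    books = load_table(spark, book_options)
    # A left join keeps orders that have no matching row in the lookup table.
    return orders.join(books, on="BOOK_ID", how="left")
```

With a SparkSession in hand you would call enrich_orders(spark, "orders.csv", db2_jdbc_options(...)) once per lookup table, or extend it to chain the joins. Note that every load still runs a query against DB2, so the shared z/OS system continues to carry the read load unless the tables are extracted once and cached or staged elsewhere.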

Dr Mich Talebzadeh



LinkedIn
https://www.linkedin.com/profile/view?id=AAEAAAAWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw



http://talebzadehmich.wordpress.com


*Disclaimer:* Use it at your own risk. Any and all responsibility for any
loss, damage or destruction of data or any other property which may arise
from relying on this email's technical content is explicitly disclaimed.
The author will in no case be liable for any monetary damages arising from
such loss, damage or destruction.



On 6 July 2016 at 20:39, Andreas Bauer <dabuks...@gmail.com> wrote:

> The SQL statements are embedded in a PL/1 program using DB2 running on
> z/OS. Quite powerful, but expensive and, foremost, shared with other jobs
> in the company. The whole job takes approx. 20 minutes.
>
> So I was thinking to use Spark and let the Spark job run on 10 or 20
> virtual instances, which I can spawn easily, on-demand and almost for free
> using a cloud infrastructure.
>
>
>
>
> On 6 July 2016 at 21:29:53 CEST, Jean Georges Perrin <j...@jgp.net> wrote:
>
> What are you doing it on right now?
>
> > On Jul 6, 2016, at 3:25 PM, dabuki wrote:
> >
> > I was thinking about replacing a legacy batch job with Spark, but I'm
> > not sure if Spark is suited for this use case. Before I start the proof
> > of concept, I wanted to ask for opinions.
> >
> > The legacy job works as follows: A file (100k - 1 million entries) is
> > iterated. Every row contains a (book) order with an id, and for each row
> > approx. 15 processing steps have to be performed that involve access to
> > multiple database tables. In total approx. 25 tables (each containing
> > 10k-700k entries) have to be scanned using the book's id, and the
> > retrieved data is joined together.
> >
> > As I'm new to Spark, I'm not sure if I can leverage Spark's processing
> > model for this use case.
