What are you doing it on right now?

> On Jul 6, 2016, at 3:25 PM, dabuki <dabuks...@gmail.com> wrote:
>
> I was thinking about replacing a legacy batch job with Spark, but I'm not
> sure Spark is suited to this use case. Before I start the proof of
> concept, I wanted to ask for opinions.
>
> The legacy job works as follows: a file (100k - 1 million entries) is
> iterated. Every row contains a (book) order with an id, and for each row
> approx. 15 processing steps have to be performed that involve access to
> multiple database tables. In total, approx. 25 tables (each containing
> 10k-700k entries) have to be scanned using the book's id, and the
> retrieved data is joined together.
>
> As I'm new to Spark, I'm not sure whether I can leverage Spark's
> processing model for this use case.
>
> --
> View this message in context:
> http://apache-spark-user-list.1001560.n3.nabble.com/Is-Spark-suited-for-replacing-a-batch-job-using-many-database-tables-tp27300.html
> Sent from the Apache Spark User List mailing list archive at Nabble.com.
>
> ---------------------------------------------------------------------
> To unsubscribe e-mail: user-unsubscr...@spark.apache.org
>
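For what it's worth, the job described above maps fairly naturally onto Spark SQL: the input file becomes a DataFrame, each database table is loaded via the JDBC source, and the per-row processing steps become joins and transformations. A minimal sketch follows — the file path, JDBC URL, and table/column names (`orders.csv`, `books`, `inventory`, `book_id`) are all hypothetical placeholders, not from the original post:

```scala
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder().appName("legacy-batch-poc").getOrCreate()

// The input file with 100k - 1 million order rows.
val orders = spark.read
  .option("header", "true")
  .csv("/data/orders.csv")

// Helper to load one of the ~25 database tables via the JDBC source.
def table(name: String) = spark.read
  .format("jdbc")
  .option("url", "jdbc:postgresql://dbhost/bookstore") // hypothetical URL
  .option("dbtable", name)
  .load()

// Each of the ~15 processing steps becomes a join/transformation on the
// book id; Spark distributes the joins instead of looping row by row.
val enriched = orders
  .join(table("books"), Seq("book_id"), "left")
  .join(table("inventory"), Seq("book_id"), "left")
  // ... further joins/steps for the remaining tables ...

enriched.write.parquet("/data/enriched_orders")
```

Whether this performs well depends on table sizes: the smaller tables (10k entries) could be broadcast-joined, while the larger ones (700k) may warrant partitioned JDBC reads.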