Re: migration from Teradata to Spark SQL

2016-05-04 Thread Lohith Samaga M
Hi, Can you look at Apache Drill as a SQL engine on Hive? Lohith

Re: migration from Teradata to Spark SQL

2016-05-04 Thread Tapan Upadhyay
Thank you everyone for guidance. *Jörn*, our motivation is to move the bulk of ad-hoc queries to Hadoop so that we have enough bandwidth on our DB for important batch queries. For implementing a lambda architecture, is it possible to get real-time updates from Teradata for any insert/update/delete, e.g. from the DB logs?

Re: migration from Teradata to Spark SQL

2016-05-04 Thread Alonso Isidoro Roman
I agree with Deepak, and I would try to save the data in both Parquet and Avro formats if you can; measure the performance and choose the best. It will probably be Parquet, but you have to verify that for yourself. Alonso Isidoro Roman.

Re: migration from Teradata to Spark SQL

2016-05-04 Thread Jörn Franke
Look at the lambda architecture. What is the motivation for your migration?
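The lambda architecture Jörn mentions serves reads by merging a precomputed batch view with a speed layer of recent changes. A toy, pure-Python illustration of that merge (all names and figures here are hypothetical, not from the thread):

```python
# Batch view: totals rebuilt periodically, e.g. from a nightly Sqoop load.
batch_view = {"cust_1": 100, "cust_2": 250}

# Speed layer: real-time deltas (inserts/updates) arriving since that load.
speed_layer = [("cust_1", +20), ("cust_3", +5)]

def query(key):
    """Serve a read by merging the batch view with the real-time deltas."""
    total = batch_view.get(key, 0)
    for k, delta in speed_layer:
        if k == key:
            total += delta
    return total

print(query("cust_1"))  # 120: batch value plus today's delta
print(query("cust_3"))  # 5: seen only by the speed layer so far
```

In a real deployment the speed layer would be fed by a change-data-capture stream (the kind of DB-log feed Tapan asks about), and both views would be rebuilt and compacted on a schedule.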

Re: migration from Teradata to Spark SQL

2016-05-04 Thread Mich Talebzadeh
Hi, How are you going to sync your data following the migration? Spark SQL is a tool for querying data; it is not a database per se, unlike Hive. I am doing the same thing, migrating Sybase IQ to Hive. Sqoop can do the initial ELT (read ELT, not ETL). In other words, use Sqoop to get
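A hypothetical Sqoop job for the initial bulk load Mich describes (the "EL" of ELT): pull a Teradata table into a Parquet-backed Hive table. The host, database, table, and user below are placeholders, not details from the thread.

```shell
sqoop import \
  --connect jdbc:teradata://td-host/DATABASE=sales \
  --driver com.teradata.jdbc.TeraDriver \
  --table DAILY_TRANSACTIONS \
  --username etl_user -P \
  --hive-import \
  --hive-table sales.daily_transactions \
  --as-parquetfile \
  --num-mappers 8
```

`--num-mappers` controls the parallelism of the extract; tune it against what the Teradata side can tolerate, since the whole point of the migration is to reduce load on the source DB.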

Re: migration from Teradata to Spark SQL

2016-05-03 Thread Deepak Sharma
Hi Tapan, I would suggest an architecture where you have separate storage and data serving layers. Spark is still best for batch processing of data. What I am suggesting is that you store your data as-is in a raw HDFS layer, run your ELT in Spark on this raw data, and
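Deepak's raw-layer-plus-serving-layer split can be sketched as follows. The HDFS paths and column names are hypothetical, and the Spark calls are shown as comments because they need a live cluster:

```python
# Hypothetical layer locations for the layout Deepak describes.
RAW = "hdfs:///data/raw/transactions"          # as-loaded copy (e.g. from Sqoop)
SERVING = "hdfs:///data/serving/transactions"  # cleaned, query-optimised layer

# With a live SparkSession `spark`, the ELT step would read the raw layer,
# transform it with Spark SQL, and write a partitioned Parquet serving layer:
#   raw = spark.read.parquet(RAW)
#   raw.createOrReplaceTempView("raw_tx")
#   cleaned = spark.sql("""
#       SELECT customer_id,
#              CAST(amount AS DECIMAL(18, 2)) AS amount,
#              tx_date
#       FROM raw_tx
#       WHERE amount IS NOT NULL
#   """)
#   cleaned.write.mode("overwrite").partitionBy("tx_date").parquet(SERVING)

print(RAW)
print(SERVING)
```

Keeping the raw layer untouched means the serving layer can be rebuilt at any time if the transformation logic changes.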

migration from Teradata to Spark SQL

2016-05-03 Thread Tapan Upadhyay
Hi, We are planning to move our ad-hoc queries from Teradata to Spark. We have a huge volume of queries during the day. What is the best way to go about it: 1) read data directly from the Teradata DB using Spark JDBC, or 2) import data using Sqoop via EOD jobs into Hive tables stored as Parquet and then run
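The two options in the question can be sketched in PySpark. This is a non-authoritative sketch: the Teradata host, database, table, and user are made-up placeholders, and the Spark calls are shown as comments since they require a live SparkSession and cluster:

```python
# Option 1: query Teradata directly from Spark over JDBC.
# The driver class is the real Teradata JDBC driver; everything else
# in this dict is a hypothetical placeholder.
jdbc_options = {
    "url": "jdbc:teradata://td-host/DATABASE=sales",
    "dbtable": "DAILY_TRANSACTIONS",
    "user": "etl_user",
    "password": "***",
    "driver": "com.teradata.jdbc.TeraDriver",
}
# With a live SparkSession `spark`:
#   df = spark.read.format("jdbc").options(**jdbc_options).load()
#   df.createOrReplaceTempView("daily_transactions")
#   spark.sql("SELECT ... FROM daily_transactions").show()

# Option 2: after a nightly Sqoop import into Parquet-backed Hive tables,
# Spark reads local Parquet instead of hitting Teradata on every query:
#   df = spark.table("sales.daily_transactions")

print(jdbc_options["driver"])
```

Option 1 keeps data fresh but pushes every ad-hoc query back onto Teradata, which defeats the stated goal of freeing DB bandwidth; option 2 trades end-of-day staleness for fully offloaded query capacity, which is why most replies in the thread lean toward it.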