Hi
Can you look at Apache Drill as a SQL engine on Hive?

Lohith



---- Tapan Upadhyay wrote ----

Thank you, everyone, for the guidance.

Jörn, our motivation is to move the bulk of our ad-hoc queries to Hadoop so that 
we have enough bandwidth on our DB for the important batch jobs and queries.

For implementing a lambda architecture, is it possible to get real-time updates 
from Teradata for every insert/update/delete? Through the DB logs?

Deepak, should we query the data from Cassandra using Spark? How would the 
performance differ if we instead stored our data in Hive tables (Parquet) and 
queried them using Spark? If there is not much performance gain, why add one 
more layer of processing?
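
For comparison, a minimal sketch of the two read paths (Spark 1.6-era API; the 
keyspace, table, column, and path names are made up, and it assumes the 
spark-cassandra-connector package is on the classpath):

    import org.apache.spark.{SparkConf, SparkContext}
    import org.apache.spark.sql.SQLContext

    val sc = new SparkContext(new SparkConf().setAppName("read-path-comparison"))
    val sqlContext = new SQLContext(sc)

    // Path 1: read directly from Cassandra via the connector.
    val fromCassandra = sqlContext.read
      .format("org.apache.spark.sql.cassandra")
      .options(Map("keyspace" -> "trades", "table" -> "transactions")) // hypothetical names
      .load()

    // Path 2: read the same data from the Parquet files behind the Hive table.
    val fromParquet = sqlContext.read.parquet("/warehouse/trades/transactions") // hypothetical path

    // Run the same aggregation on both and compare wall-clock times.
    fromCassandra.groupBy("account_id").count().show()
    fromParquet.groupBy("account_id").count().show()

For scan-heavy ad-hoc queries like this, the Parquet path usually wins thanks to 
columnar pruning; Cassandra is designed for point lookups by partition key.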

Mich, we plan to sync the data using hourly/EOD Sqoop jobs; we have still not 
decided how frequently we would need to do that. It will be based on user 
requirements. If users need real-time data, we will have to think of an 
alternative. How are you doing the same for Sybase? How do you sync in real time?
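
For reference, an hourly incremental import could look roughly like this (a 
sketch; the host, credentials, table, and column names are made up, and it 
assumes the Teradata JDBC driver jar is available to Sqoop):

    sqoop import \
      --connect jdbc:teradata://tdhost/DATABASE=sales \
      --driver com.teradata.jdbc.TeraDriver \
      --username etl --password-file /user/etl/.td_password \
      --table TRANSACTIONS \
      --incremental lastmodified \
      --check-column LAST_UPD_TS \
      --last-value "2016-05-04 00:00:00" \
      --target-dir /staging/transactions \
      --as-parquetfile -m 8

Note that --incremental lastmodified picks up inserts and updates via a 
timestamp column but cannot see deletes; those would need a periodic full 
reconciliation or a CDC feed.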

Thank you!!


Regards,
Tapan Upadhyay
+1 973 652 8757

On Wed, May 4, 2016 at 4:33 AM, Alonso Isidoro Roman 
<alons...@gmail.com> wrote:
I agree with Deepak, and I would try to save the data in both Parquet and Avro 
formats if you can. Measure the performance of each and choose the best; it will 
probably be Parquet, but you have to verify it for yourself.
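
A minimal way to run that measurement (a sketch; it assumes an existing 
SQLContext named sqlContext and a DataFrame named df, the Databricks spark-avro 
package on the classpath, and made-up paths and column names):

    // Write the same DataFrame in both formats.
    df.write.parquet("/staging/txn_parquet")
    df.write.format("com.databricks.spark.avro").save("/staging/txn_avro")

    // Crude timer for comparing the two read paths.
    def time[T](label: String)(body: => T): T = {
      val start = System.nanoTime()
      val result = body
      println(s"$label: ${(System.nanoTime() - start) / 1e9} s")
      result
    }

    time("parquet") {
      sqlContext.read.parquet("/staging/txn_parquet")
        .groupBy("account_id").count().collect()
    }
    time("avro") {
      sqlContext.read.format("com.databricks.spark.avro").load("/staging/txn_avro")
        .groupBy("account_id").count().collect()
    }

Columnar Parquet should win on aggregations that touch few columns; row-based 
Avro can be competitive for full-row scans.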

Alonso Isidoro Roman.

My favorite quotes (today):
"If debugging is the process of removing software bugs, then programming must 
be the process of putting ..."
  - Edsger Dijkstra

"If you pay peanuts you get monkeys"


2016-05-04 9:22 GMT+02:00 Jörn Franke 
<jornfra...@gmail.com>:
Look at lambda architecture.

What is the motivation for your migration?

On 04 May 2016, at 03:29, Tapan Upadhyay 
<tap...@gmail.com> wrote:

Hi,

We are planning to move our ad-hoc queries from Teradata to Spark. We have a 
huge volume of queries during the day. What is the best way to go about it?

1) Read the data directly from the Teradata DB using the Spark JDBC data source.

2) Import the data into Hive tables stored as Parquet via EOD Sqoop jobs, then 
run the queries on those Hive tables using Spark SQL or the Spark HiveContext 
(a sketch of both options follows below).
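
For illustration, roughly what the two options look like in code (a sketch; the 
host, database, table, and credential details are made up, and it assumes the 
Teradata JDBC driver jar is on the Spark classpath):

    import org.apache.spark.{SparkConf, SparkContext}
    import org.apache.spark.sql.hive.HiveContext

    val sc = new SparkContext(new SparkConf().setAppName("adhoc-queries"))
    val hiveContext = new HiveContext(sc)

    // Option 1: query Teradata directly over JDBC.
    val direct = hiveContext.read.format("jdbc")
      .option("url", "jdbc:teradata://tdhost/DATABASE=sales") // hypothetical
      .option("driver", "com.teradata.jdbc.TeraDriver")
      .option("dbtable", "TRANSACTIONS")
      .option("user", "etl")
      .option("password", sys.env("TD_PASSWORD")) // assumption: password in an env var
      .load()
    direct.registerTempTable("transactions_td")
    hiveContext.sql(
      "SELECT account_id, COUNT(*) FROM transactions_td GROUP BY account_id").show()

    // Option 2: query the Sqoop-imported Parquet table registered in Hive.
    hiveContext.sql(
      "SELECT account_id, COUNT(*) FROM sales.transactions GROUP BY account_id").show()

Option 1 pushes the scan load back onto Teradata for every query, which works 
against the goal of offloading it; option 2 keeps the ad-hoc load on the cluster.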

Are there any other ways in which we could do this better or more efficiently?

Please guide.

Regards,
Tapan


