Hi,
I'm wondering whether there is an efficient way to continuously append
new data to a registered Spark SQL table.
This is what I want: to build an ad-hoc query service over a
JSON-formatted system log. The log is, of course, continuously generated.
I will use Spark Streaming to consume the system log as my input, and I want to
find a way to efficiently append the new data to an existing Spark SQL table.
Furthermore, I want the whole table cached in memory/Tachyon.
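For reference, this is roughly what I'm attempting (a sketch against the Spark 1.x API; the host, port, and table name "syslog" are just placeholders):

```scala
import org.apache.spark.streaming.{Seconds, StreamingContext}
import org.apache.spark.sql.SQLContext

val ssc = new StreamingContext(sc, Seconds(10))
val sqlContext = new SQLContext(sc)

// Receive raw JSON log lines from the log source (placeholder endpoint).
val lines = ssc.socketTextStream("loghost", 9999)

lines.foreachRDD { rdd =>
  // Infer a schema from the JSON lines of this micro-batch.
  val batch = sqlContext.jsonRDD(rdd)
  // Append the micro-batch to the registered table.
  batch.insertInto("syslog")
}
ssc.start()
```

This compiles in spirit, but as far as I can tell insertInto only works for certain table types, which is exactly my problem.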
It looks like Spark SQL supports the "INSERT" statement, but only for
Parquet files. In addition, inserting a single row at a time is inefficient.
I do know that somebody has built a system like the one I want (an ad-hoc
query service over a growing system log), so there must be an efficient way.
Does anyone know how?