Hi Roberto, I have a question regarding HiveContext: when you create a HiveContext, where do you define the Hive connection properties? Suppose Hive is not on the local machine and I need to connect to it remotely; how will the HiveContext know the database info such as the URL, username, and password?

String username = "";
String password = "";
String url = "jdbc:hive2://quickstart.cloudera:10000/default";
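For what it's worth, a HiveContext does not take a JDBC URL, username, or password at all: it talks to the Hive metastore directly over Thrift and picks up the connection details from a hive-site.xml on Spark's classpath. The jdbc:hive2:// URL above is for HiveServer2 clients such as Beeline or plain JDBC, not for HiveContext. A minimal sketch, assuming the remote metastore listens on the conventional Thrift port 9083 (the hostname and port here are assumptions, not taken from your cluster):

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.hive.HiveContext

val sc = new SparkContext(new SparkConf().setAppName("remote-hive"))
val hiveContext = new HiveContext(sc)

// Point the HiveContext at a remote Hive metastore. Normally this
// property lives in conf/hive-site.xml on the classpath, but it can
// also be set programmatically like this.
hiveContext.setConf("hive.metastore.uris", "thrift://quickstart.cloudera:9083")

hiveContext.sql("SHOW TABLES").show()
```

Authentication, when enabled, is handled by the cluster's security configuration (e.g. Kerberos) rather than by a username/password pair passed to the HiveContext.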
On Friday, July 17, 2015 2:29 AM, Roberto Coluccio <roberto.coluc...@gmail.com> wrote:

Hello community,

I'm currently using Spark 1.3.1 with Hive support to output processed data to an external Hive table backed on S3. I'm specifying the delimiter manually, but I'd like to know whether there is any "clean" way to write in CSV format:

val sparkConf = new SparkConf()
val sc = new SparkContext(sparkConf)
val hiveContext = new org.apache.spark.sql.hive.HiveContext(sc)
import hiveContext.implicits._

hiveContext.sql(
  "CREATE EXTERNAL TABLE IF NOT EXISTS table_name(field1 STRING, field2 STRING) " +
  "ROW FORMAT DELIMITED FIELDS TERMINATED BY ',' LOCATION '" + path_on_s3 + "'")
hiveContext.sql(<an INSERT OVERWRITE query to write into the above table>)

I also need the header of the table to be printed in each written file. I tried with:

hiveContext.sql("set hive.cli.print.header=true")

But it didn't work. Any hint?

Thank you.

Best regards,
Roberto
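One option for cleaner CSV output in Spark 1.3.x, headers included, is the spark-csv package (com.databricks:spark-csv). Note that hive.cli.print.header only affects console display in the Hive CLI, not the files a query writes, which would explain why it had no effect here. The following is a sketch under assumptions, not a tested recipe for your setup: it assumes the package is on the classpath, that the processed data is available as a DataFrame named df, and that the S3 path shown is a placeholder:

```scala
import org.apache.spark.sql.SaveMode

// Write df as CSV via the spark-csv data source, using the
// Spark 1.3.x DataFrame save API. The "header" option asks
// spark-csv to emit a header row in the output files.
df.save(
  "com.databricks.spark.csv",
  SaveMode.Overwrite,
  Map("path" -> "s3n://my-bucket/my-prefix", "header" -> "true"))
```

As far as I recall, spark-csv writes the header at the top of each part-file, which should match your per-file header requirement; it's worth verifying on a small output before relying on it.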