Thanks, I will try.
Peter
--
Google
Sent with Airmail

On January 19, 2016 at 12:44:46, Jeff Zhang (zjf...@gmail.com) wrote:

> Please make sure you export the environment variable HADOOP_CONF_DIR, pointing at the directory which contains core-site.xml.
>
> On Mon, Jan 18, 2016 at 8:23 PM, Peter Zhang <zhangju...@gmail.com> wrote:
>
>> Hi all,
>>
>> http://spark.apache.org/docs/latest/sparkr.html#sparkr-dataframes
>>
>> From Hive tables:
>> You can also create SparkR DataFrames from Hive tables. To do this we will need to create a HiveContext which can access tables in the Hive MetaStore. Note that Spark should have been built with Hive support; more details on the difference between SQLContext and HiveContext can be found in the SQL programming guide.
>>
>> # sc is an existing SparkContext.
>> hiveContext <- sparkRHive.init(sc)
>> sql(hiveContext, "CREATE TABLE IF NOT EXISTS src (key INT, value STRING)")
>> sql(hiveContext, "LOAD DATA LOCAL INPATH 'examples/src/main/resources/kv1.txt' INTO TABLE src")
>>
>> # Queries can be expressed in HiveQL.
>> results <- sql(hiveContext, "FROM src SELECT key, value")
>>
>> # results is now a DataFrame
>> head(results)
>> ##   key   value
>> ## 1 238 val_238
>> ## 2  86  val_86
>> ## 3 311 val_311
>>
>> I ran the commands above in RStudio. When I ran
>>
>>   sql(hiveContext, "CREATE TABLE IF NOT EXISTS src (key INT, value STRING)")
>>
>> I got this exception:
>>
>>   16/01/19 12:11:51 INFO FileUtils: Creating directory if it doesn't exist: file:/user/hive/warehouse/src
>>   16/01/19 12:11:51 ERROR DDLTask: org.apache.hadoop.hive.ql.metadata.HiveException: MetaException(message:file:/user/hive/warehouse/src is not a directory or unable to create one)
>>
>> How can I make it use HDFS instead of the local file system (file:)? Which parameter should I set?
>>
>> Thanks a lot.
>>
>> Peter Zhang
>> --
>> Google
>> Sent with Airmail
>
> --
> Best Regards
> Jeff Zhang
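[Editor's sketch of the suggested fix, not part of the original thread. The path below is an assumption; adjust it for your install. Exporting HADOOP_CONF_DIR before launching R lets Spark read core-site.xml, whose fs.defaultFS setting makes the Hive warehouse path resolve on HDFS rather than the local filesystem:]

```shell
# Assumed location of the cluster's Hadoop client configuration;
# it must contain core-site.xml (with fs.defaultFS=hdfs://...).
export HADOOP_CONF_DIR=/etc/hadoop/conf

# Confirm the variable is visible to the shell that will launch R/RStudio.
echo "$HADOOP_CONF_DIR"
```

If RStudio is started from a desktop launcher that does not inherit shell variables, calling Sys.setenv(HADOOP_CONF_DIR = "/etc/hadoop/conf") inside R before sparkRHive.init() should have the same effect. Alternatively, setting hive.metastore.warehouse.dir to an explicit hdfs:// path in hive-site.xml controls the warehouse location directly.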