I forgot something.
Before querying data from Hive, we should set:
set hive.mapred.supports.subdirectories=true;
set mapreduce.input.fileinputformat.input.dir.recursive=true;
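For example, these can be run at the start of the Hive CLI session in step 5 below, before the step 7 queries (a minimal sketch; the set statements only last for the current session):

#execute in the hive cli:
set hive.mapred.supports.subdirectories=true;
set mapreduce.input.fileinputformat.input.dir.recursive=true;
select * from hive_carbon;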


------------------ Original ------------------
From:  "261810726";<261810...@qq.com>;
Date:  Thu, Mar 23, 2017 09:58 PM
To:  "chenliang613"<chenliang...@apache.org>; 
"dev"<dev@carbondata.incubator.apache.org>; 
Cc:  "Mention"<ment...@noreply.github.com>; 
Subject:  Re:  [apache/incubator-carbondata] [CARBONDATA-727][WIP] add hive integration for carbon (#672)



Hi, Liang:
    I created a new profile "integration/hive" and the CI is OK now, but I still have some problems altering the Hive metastore schema.
    My steps are as follows:
    
1. Build CarbonData


mvn -DskipTests -Pspark-2.1 -Dspark.version=2.1.0 clean package -Phadoop-2.7.2 -Phive-1.2.1
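A quick sanity check that the build produced the two jars copied in step 2 (paths taken from step 2; a sketch, assuming the repo is checked out under ~/cenyuhai):

#verify the jars exist
ls ~/cenyuhai/incubator-carbondata/assembly/target/scala-2.11/carbondata_2.11-*-shade-hadoop2.7.2.jar
ls ~/cenyuhai/incubator-carbondata/integration/hive/target/carbondata-hive-*.jar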



2. Copy the jars


mkdir ~/spark-2.1/carbon_lib
cp ~/cenyuhai/incubator-carbondata/assembly/target/scala-2.11/carbondata_2.11-1.1.0-incubating-SNAPSHOT-shade-hadoop2.7.2.jar ~/spark-2.1/carbon_lib/
cp ~/cenyuhai/incubator-carbondata/integration/hive/target/carbondata-hive-1.1.0-incubating-SNAPSHOT.jar ~/spark-2.1/carbon_lib/



3. Create sample.csv and put it into HDFS


id,name,scale,country,salary
1,yuhai,1.77,china,33000.0
2,runlin,1.70,china,32000.0
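A sketch of the upload step (the target path matches the LOAD DATA command in step 4):

#put the file into hdfs
hdfs dfs -put sample.csv hdfs://mycluster/user/hadoop/sample.csv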



4. Create the table in Spark


spark-shell --jars "/data/hadoop/spark-2.1/carbon_lib/carbondata_2.11-1.1.0-incubating-SNAPSHOT-shade-hadoop2.7.2.jar,/data/hadoop/spark-2.1/carbon_lib/carbondata-hive-1.1.0-incubating-SNAPSHOT.jar"


#execute these commands:
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.CarbonSession._
val rootPath = "hdfs:///user/hadoop/carbon"
val storeLocation = s"$rootPath/store"
val warehouse = s"$rootPath/warehouse"
val metastoredb = s"$rootPath/metastore_db"


val carbon = SparkSession.builder()
  .enableHiveSupport()
  .config("spark.sql.warehouse.dir", warehouse)
  .config(org.apache.carbondata.core.constants.CarbonCommonConstants.STORE_LOCATION, storeLocation)
  .getOrCreateCarbonSession(storeLocation, metastoredb)


carbon.sql("create table hive_carbon(id int, name string, scale decimal, country string, salary double) STORED BY 'carbondata'")
carbon.sql("LOAD DATA INPATH 'hdfs://mycluster/user/hadoop/sample.csv' INTO TABLE hive_carbon")



5. Alter the table schema in Hive


cp ~/spark-2.1/carbon_lib/carbondata*.jar hive/auxlibs/
cp spark-catalyst*.jar hive/auxlibs/
export HIVE_AUX_JARS_PATH=hive/auxlibs/


#start hive cli
$HIVE_HOME/bin/hive


#execute commands:
alter table hive_carbon set FILEFORMAT
INPUTFORMAT "org.apache.carbondata.hive.MapredCarbonInputFormat"
OUTPUTFORMAT "org.apache.carbondata.hive.MapredCarbonOutputFormat"
SERDE "org.apache.carbondata.hive.CarbonHiveSerDe";


alter table hive_carbon set LOCATION 'hdfs://mycluster-tj/user/hadoop/carbon/store/default/hive_carbon';
alter table hive_carbon change col id INT;
alter table hive_carbon add columns(name string, scale decimal, country string, salary double);





6. Check the table schema


#execute this command:
show create table hive_carbon;





7. Query the table in Hive


#execute these commands:
select * from hive_carbon;
select * from hive_carbon order by id;


8. The table is still available in Spark.
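A minimal check, assuming the CarbonSession "carbon" from step 4 is still open in spark-shell:

#execute these commands:
carbon.sql("select * from hive_carbon").show()
carbon.sql("select * from hive_carbon order by id").show()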


------------------ Original ------------------
From:  "Liang Chen";<notificati...@github.com>;
Date:  Thu, Mar 23, 2017 00:09 AM
To:  "apache/incubator-carbondata"<incubator-carbond...@noreply.github.com>; 
Cc:  "Sea"<261810...@qq.com>; "Mention"<ment...@noreply.github.com>; 
Subject:  Re: [apache/incubator-carbondata] [CARBONDATA-727][WIP] add hive integration for carbon (#672)




@cenyuhai  Thank you for contributing this feature.
 Suggest creating a new profile for the "integration/hive" module, decoupling all Hive-related code from the current modules, so that CI can run normally first.
 