Hi Joaquin, Suresh was faster than me ...
Also, you should check this : https://cwiki.apache.org/confluence/display/Hive/Tutorial#Tutorial-Dynamic-PartitionInsert On Tue, Dec 13, 2016 at 3:19 PM, Suresh Kumar Sethuramaswamy < rock...@gmail.com> wrote: > Hi Joaquin > > In hive , when u run 'select * from employee' it is going to return > the partitioned columns also at the end, whereas you don't want that to be > inserted into ur ORC table , so ur insert query should look like > > INSERT INTO TABLE employee_orc PARTITION (country='USA', > office='HQ-TX') select eid,salary from employee where country='USA' and > office='HQ-TX'; > > > Remember partition in hive is a physical folder name > > Regards > Suresh > > > > On Tue, Dec 13, 2016 at 6:37 AM Joaquin Alzola <joaquin.alz...@lebara.com> > wrote: > >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> Hi List >> >> >> >> >> >> I change Spark to 2.0.2 and Hive 2.0.1. >> >> >> I have the bellow tables but the INSERT INTO TABLE employee_orc PARTITION >> (country='USA', office='HQ-TX') select * from employee where country='USA' >> and office='HQ-TX'; >> >> >> Is giving me à Cannot insert into table `default`.`employee_orc` because >> the number of columns are different: need 4 columns, but query has 6 >> columns.; >> >> >> >> >> >> When doing select it is adding the Partition as columns …. >> >> >> >> >> >> >> CREATE TABLE IF NOT EXISTS employee ( eid int, name String, >> >> >> salary String, destination String) >> >> >> COMMENT 'Employee details' >> >> >> PARTITIONED BY(country string, office string) >> >> >> ROW FORMAT DELIMITED >> >> >> FIELDS TERMINATED BY '\t' >> >> >> LINES TERMINATED BY '\n' >> >> >> STORED AS TEXTFILE; >> >> >> >> >> >> CREATE TABLE IF NOT EXISTS employee_orc ( eid int, name String, >> >> >> salary String, destination String) >> >> >> COMMENT 'Employee details' >> >> >> PARTITIONED BY(country string, office string) >> >> >> STORED AS ORC tblproperties ("orc.compress"="ZLIB"); >> >> >> >> >> >> 0: jdbc:hive2://localhost:10000> LOAD DATA LOCAL INPATH >> '/mnt/sample.txt.gz' INTO TABLE employee PARTITION (country='USA', >> office='HQ-TX'); >> >> >> +---------+--+ >> >> >> | Result | >> >> >> +---------+--+ >> >> >> +---------+--+ >> >> >> No rows selected (0.685 seconds) >> >> >> 0: jdbc:hive2://localhost:10000> select * from employee; >> >> >> +-------+--------------+---------+--------------------+----- >> -----+---------+--+ >> >> >> | eid | name | salary | destination | country | >> office | >> >> >> +-------+--------------+---------+--------------------+----- >> -----+---------+--+ >> >> >> | 1201 | Gopal | 45000 | Technical manager | USA | >> HQ-TX | >> >> >> | 1202 | Manisha | 45000 | Proof reader | USA | >> HQ-TX | >> >> >> | 1203 | Masthanvali | 40000 | Technical writer | USA | >> HQ-TX | >> >> >> | 1204 | Kiran | 40000 | Hr Admin | USA | >> HQ-TX | >> >> >> | 1205 | Kranthi | 30000 | Op Admin | USA | >> HQ-TX | >> >> >> +-------+--------------+---------+--------------------+----- >> -----+---------+--+ >> >> >> 5 rows selected (0.358 seconds) >> >> >> 0: jdbc:hive2://localhost:10000> INSERT INTO TABLE employee_orc PARTITION >> (country='USA', office='HQ-TX') select * from employee where country='USA' >> and office='HQ-TX'; >> >> >> Error: org.apache.spark.sql.AnalysisException: Cannot insert into table >> `default`.`employee_orc` because the number of columns are different: need >> 4 columns, but query has 6 columns.; (state=,code=0) >> >> >> >> >> >> >> >> >> 0: jdbc:hive2://localhost:10000> describe employee_orc; >> >> >> +--------------------------+------------+----------+--+ >> >> >> | col_name | data_type | comment | >> >> >> +--------------------------+------------+----------+--+ >> >> >> | eid | int | NULL | >> >> >> | name | string | NULL | >> >> >> | salary | string | NULL | >> >> >> | destination | string | NULL | >> >> >> | country | string | NULL | >> >> >> | office | string | NULL | >> >> >> | # Partition Information | | | >> >> >> | # col_name | data_type | comment | >> >> >> | country | string | NULL | >> >> >> | office | string | NULL | >> >> >> +--------------------------+------------+----------+--+ >> >> >> >> >> >> 0: jdbc:hive2://localhost:10000> describe employee; >> >> >> +--------------------------+------------+----------+--+ >> >> >> | col_name | data_type | comment | >> >> >> +--------------------------+------------+----------+--+ >> >> >> | eid | int | NULL | >> >> >> | name | string | NULL | >> >> >> | salary | string | NULL | >> >> >> | destination | string | NULL | >> >> >> | country | string | NULL | >> >> >> | office | string | NULL | >> >> >> | # Partition Information | | | >> >> >> | # col_name | data_type | comment | >> >> >> | country | string | NULL | >> >> >> | office | string | NULL | >> >> >> +--------------------------+------------+----------+--+ >> >> >> 10 rows selected (0.045 seconds) >> >> >> >> >> This email is confidential and may be subject to privilege. If you are >> not the intended recipient, please do not copy or disclose its content but >> contact the sender immediately upon receipt. >> >> >> >> >> >>