Re: PARTITION error because different columns size

Furcy Pin Tue, 13 Dec 2016 06:24:25 -0800

Hi Joaquin,

Suresh was faster than me ...


Also, you should check this :
https://cwiki.apache.org/confluence/display/Hive/Tutorial#Tutorial-Dynamic-PartitionInsert

On Tue, Dec 13, 2016 at 3:19 PM, Suresh Kumar Sethuramaswamy <
rock...@gmail.com> wrote:

> Hi Joaquin
>
>     In hive , when u run 'select * from employee' it is going to return
> the partitioned columns also at the end,  whereas you don't want that to be
> inserted into ur ORC table , so ur insert query should look like
>
>       INSERT INTO TABLE employee_orc PARTITION (country='USA',
> office='HQ-TX') select eid,salary from employee where country='USA' and
> office='HQ-TX';
>
>
>  Remember partition in hive is a physical folder name
>
> Regards
> Suresh
>
>
>
> On Tue, Dec 13, 2016 at 6:37 AM Joaquin Alzola <joaquin.alz...@lebara.com>
> wrote:
>
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>
>> Hi List
>>
>>
>>
>>
>>
>> I change Spark to 2.0.2 and Hive 2.0.1.
>>
>>
>> I have the bellow tables but the INSERT INTO TABLE employee_orc PARTITION
>> (country='USA', office='HQ-TX') select * from employee where country='USA'
>> and office='HQ-TX';
>>
>>
>> Is giving me à Cannot insert into table `default`.`employee_orc` because
>> the number of columns are different: need 4 columns, but query has 6
>> columns.;
>>
>>
>>
>>
>>
>> When doing select it is adding the Partition as columns ….
>>
>>
>>
>>
>>
>>
>> CREATE TABLE IF NOT EXISTS employee ( eid int, name String,
>>
>>
>> salary String, destination String)
>>
>>
>> COMMENT 'Employee details'
>>
>>
>> PARTITIONED BY(country string, office string)
>>
>>
>> ROW FORMAT DELIMITED
>>
>>
>> FIELDS TERMINATED BY '\t'
>>
>>
>> LINES TERMINATED BY '\n'
>>
>>
>> STORED AS TEXTFILE;
>>
>>
>>
>>
>>
>> CREATE TABLE IF NOT EXISTS employee_orc ( eid int, name String,
>>
>>
>> salary String, destination String)
>>
>>
>> COMMENT 'Employee details'
>>
>>
>> PARTITIONED BY(country string, office string)
>>
>>
>> STORED AS ORC tblproperties ("orc.compress"="ZLIB");
>>
>>
>>
>>
>>
>> 0: jdbc:hive2://localhost:10000> LOAD DATA LOCAL INPATH
>> '/mnt/sample.txt.gz' INTO TABLE employee PARTITION (country='USA',
>> office='HQ-TX');
>>
>>
>> +---------+--+
>>
>>
>> | Result  |
>>
>>
>> +---------+--+
>>
>>
>> +---------+--+
>>
>>
>> No rows selected (0.685 seconds)
>>
>>
>> 0: jdbc:hive2://localhost:10000> select * from employee;
>>
>>
>> +-------+--------------+---------+--------------------+-----
>> -----+---------+--+
>>
>>
>> |  eid  |     name     | salary  |    destination     | country  |
>> office  |
>>
>>
>> +-------+--------------+---------+--------------------+-----
>> -----+---------+--+
>>
>>
>> | 1201  | Gopal        | 45000   | Technical manager  | USA      |
>> HQ-TX   |
>>
>>
>> | 1202  | Manisha      | 45000   | Proof reader       | USA      |
>> HQ-TX   |
>>
>>
>> | 1203  | Masthanvali  | 40000   | Technical writer   | USA      |
>> HQ-TX   |
>>
>>
>> | 1204  | Kiran        | 40000   | Hr Admin           | USA      |
>> HQ-TX   |
>>
>>
>> | 1205  | Kranthi      | 30000   | Op Admin           | USA      |
>> HQ-TX   |
>>
>>
>> +-------+--------------+---------+--------------------+-----
>> -----+---------+--+
>>
>>
>> 5 rows selected (0.358 seconds)
>>
>>
>> 0: jdbc:hive2://localhost:10000> INSERT INTO TABLE employee_orc PARTITION
>> (country='USA', office='HQ-TX') select * from employee where country='USA'
>> and office='HQ-TX';
>>
>>
>> Error: org.apache.spark.sql.AnalysisException: Cannot insert into table
>> `default`.`employee_orc` because the number of columns are different: need
>> 4 columns, but query has 6 columns.; (state=,code=0)
>>
>>
>>
>>
>>
>>
>>
>>
>> 0: jdbc:hive2://localhost:10000> describe employee_orc;
>>
>>
>> +--------------------------+------------+----------+--+
>>
>>
>> |         col_name         | data_type  | comment  |
>>
>>
>> +--------------------------+------------+----------+--+
>>
>>
>> | eid                      | int        | NULL     |
>>
>>
>> | name                     | string     | NULL     |
>>
>>
>> | salary                   | string     | NULL     |
>>
>>
>> | destination              | string     | NULL     |
>>
>>
>> | country                  | string     | NULL     |
>>
>>
>> | office                   | string     | NULL     |
>>
>>
>> | # Partition Information  |            |          |
>>
>>
>> | # col_name               | data_type  | comment  |
>>
>>
>> | country                  | string     | NULL     |
>>
>>
>> | office                   | string     | NULL     |
>>
>>
>> +--------------------------+------------+----------+--+
>>
>>
>>
>>
>>
>> 0: jdbc:hive2://localhost:10000>  describe employee;
>>
>>
>> +--------------------------+------------+----------+--+
>>
>>
>> |         col_name         | data_type  | comment  |
>>
>>
>> +--------------------------+------------+----------+--+
>>
>>
>> | eid                      | int        | NULL     |
>>
>>
>> | name                     | string     | NULL     |
>>
>>
>> | salary                   | string     | NULL     |
>>
>>
>> | destination              | string     | NULL     |
>>
>>
>> | country                  | string     | NULL     |
>>
>>
>> | office                   | string     | NULL     |
>>
>>
>> | # Partition Information  |            |          |
>>
>>
>> | # col_name               | data_type  | comment  |
>>
>>
>> | country                  | string     | NULL     |
>>
>>
>> | office                   | string     | NULL     |
>>
>>
>> +--------------------------+------------+----------+--+
>>
>>
>> 10 rows selected (0.045 seconds)
>>
>>
>>
>>
>> This email is confidential and may be subject to privilege. If you are
>> not the intended recipient, please do not copy or disclose its content but
>> contact the sender immediately upon receipt.
>>
>>
>>
>>
>>
>>

Re: PARTITION error because different columns size

Reply via email to