Re: Multitable insert does not work with ORCFiles

2014-10-17 Thread Prasanth Jayachandran
Here is the JIRA for tracking: https://issues.apache.org/jira/browse/HIVE-8498

- Prasanth

-- 
CONFIDENTIALITY NOTICE
NOTICE: This message is intended for the use of the individual or entity to 
which it is addressed and may contain information that is confidential, 
privileged and exempt from disclosure under applicable law. If the reader 
of this message is not the intended recipient, you are hereby notified that 
any printing, copying, dissemination, distribution, disclosure or 
forwarding of this communication is strictly prohibited. If you have 
received this communication in error, please contact the sender immediately 
and delete it from your system. Thank You.


Re: Multitable insert does not work with ORCFiles

2014-10-17 Thread Dmitry Tolpeko
Thank you, Prasanth. If you file a JIRA, please post its number here for
tracking.

Dmitry



Re: Multitable insert does not work with ORCFiles

2014-10-17 Thread Prasanth Jayachandran
Hi Dmitry,

Yes, I can confirm that this is an issue, but it seems to be with
vectorized execution and not with ORC. If I disable vectorization, the query
seems to work fine even for ORC tables. I will dig through the JIRAs to see
if this is a known issue; otherwise I will file a bug. Thanks for reporting.

- Prasanth
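
For anyone hitting this before a fix lands, the workaround described above
can be sketched roughly as follows. This assumes that "disable vectorization"
means turning off the standard hive.vectorized.execution.enabled session
property; the message does not say exactly which setting was changed, so
treat this as an illustration rather than the exact steps that were run.

```sql
-- Assumption: vectorization is controlled by this session-level property.
set hive.vectorized.execution.enabled=false;

-- Same multi-table insert as in the original report.
from orc1 a
insert overwrite table orc_rn1 select a.* where a.rn < 100
insert overwrite table orc_rn2 select a.* where a.rn >= 100 and a.rn < 1000
insert overwrite table orc_rn3 select a.* where a.rn >= 1000;

-- Optionally turn vectorization back on for the rest of the session.
set hive.vectorized.execution.enabled=true;
```

The setting only affects the current session, so other users and jobs keep
whatever value is configured cluster-wide.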



Multitable insert does not work with ORCFiles

2014-10-16 Thread Dmitry Tolpeko
Hi,

With ORC files, data is inserted into the first table only. Hive 0.13 test:

create table orc1
  stored as orc
  tblproperties ("orc.compress"="ZLIB")
  as
select rn
from
(
  select cast(1 as int) as rn from dual
  union all
  select cast(100 as int) as rn from dual
  union all
  select cast(1 as int) as rn from dual
) t;


create table orc_rn1 (rn int);
create table orc_rn2 (rn int);
create table orc_rn3 (rn int);

from orc1 a
insert overwrite table orc_rn1 select a.* where a.rn < 100
insert overwrite table orc_rn2 select a.* where a.rn >= 100 and a.rn < 1000
insert overwrite table orc_rn3 select a.* where a.rn >= 1000;

select * from orc_rn1
union all
select * from orc_rn2
union all
select * from orc_rn3;

Result (only one row):
1

If I change orc1 to SequenceFile, everything works fine (the last query
returns 3 rows). Can someone please check this? Is it a known issue?
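
To narrow down which targets were actually populated, a per-table row count
may be easier to read than the combined result. The query below is only a
sketch; the column aliases tbl and cnt are made up for illustration. Given
the three source rows (1, 100, 1) and the predicates above, a correct run
should report 2 rows in orc_rn1, 1 row in orc_rn2, and 0 rows in orc_rn3.

```sql
-- Count rows per target table; the union is wrapped in a subquery,
-- in the same way as the create-table statement above.
select t.tbl, t.cnt
from
(
  select 'orc_rn1' as tbl, count(*) as cnt from orc_rn1
  union all
  select 'orc_rn2' as tbl, count(*) as cnt from orc_rn2
  union all
  select 'orc_rn3' as tbl, count(*) as cnt from orc_rn3
) t;
```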

Thanks,

Dmitry Tolpeko