Re: query orc file by hive

2015-11-13 Thread patcharee

Hi,

It works with a non-partitioned ORC table, but does not work with a 
(2-column) partitioned ORC table.


Thanks,
Patcharee


On 09. nov. 2015 10:55, Elliot West wrote:

Hi,

You can create a table and point the location property to the folder 
containing your ORC file:


CREATE EXTERNAL TABLE orc_table (
  -- column definitions matching the ORC schema, e.g. col1 STRING, col2 INT
)
STORED AS ORC
LOCATION '/hdfs/folder/containing/orc/file';


https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DDL#LanguageManualDDL-CreateTable

Thanks - Elliot.
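Once the table is defined, the ORC data can be queried like any other table. A minimal sketch, assuming hypothetical columns id and name:

```sql
-- the column names here are illustrative; use the actual ORC schema
SELECT id, name
FROM orc_table
LIMIT 10;
```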

On 9 November 2015 at 09:44, patcharee wrote:


Hi,

How can I query an ORC file (*.orc) with Hive? This ORC file is
created by other apps, like Spark or MR.

Thanks,
Patcharee






Re: query orc file by hive

2015-11-13 Thread Dave Maughan
Hi,

To expand on Elliot's answer for a partitioned table, e.g.:

CREATE EXTERNAL TABLE orc_table (
  -- non-partition column definitions matching the ORC schema
)
PARTITIONED BY (col1 type, col2 type)
STORED AS ORC
LOCATION '/hdfs/folder/containing/orc/files';

ALTER TABLE orc_table ADD PARTITION (col1 = 'val1', col2 = 'val2') LOCATION
'/hdfs/folder/containing/orc/files/col1=val1/col2=val2';

Thanks - Dave
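If the partition directories already follow the col1=val1/col2=val2 naming convention shown above, Hive can also discover them in one step instead of adding each partition by hand:

```sql
-- scans the table's location and registers any partitions
-- found in the col1=val1/col2=val2 directory layout
MSCK REPAIR TABLE orc_table;
```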


On Fri, 13 Nov 2015 at 11:59 patcharee wrote:

> Hi,
>
> It works with a non-partitioned ORC table, but does not work with a
> (2-column) partitioned ORC table.
>
> Thanks,
> Patcharee


RE: export/import in hive failing with nested directory exception!

2015-11-13 Thread Mich Talebzadeh
This potentially breaks the ACID properties of Hive.

 

My own take is that export/import functionality was added before ACID 
properties and transactional tables were added to Hive and as such did not 
cater for this type of work.

 

There are two options IMO:

 

1. Update the import/export documentation to highlight the limitations with 
regard to transactional tables.

2. Provide an alternative mechanism for migrating transactional tables.

 

HTH,

 

Mich Talebzadeh

 


From: sreebalineni . [mailto:sreebalin...@gmail.com] 
Sent: 13 November 2015 05:39
To: user@hive.apache.org
Subject: Re: export/import in hive failing with nested directory exception!

 

Hello Gopal,

Are there any plans for fixing this? Any idea?

 

On Fri, Nov 13, 2015 at 6:31 AM, Gopal Vijayaraghavan wrote:

Hi,

>Thanks Gopal. Indeed table t is defined as ORC and transactional.
>
>Any reason why this should not work for transactional tables?

The committed transactions list is actually missing from the exported
metadata.

So the EXPORT as it exists today is a dirty-read snapshot, which is not a
good thing when the data is continuously being streamed in.

I don't think IMPORT likes that (and why should it?).

Try with a fully compacted table.

Cheers,
Gopal
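For reference, the export/import pair under discussion looks like the sketch below (the paths and the t_copy name are illustrative); per the above, it is worth fully compacting the table first so the exported snapshot is not a dirty read:

```sql
-- copy table data and metadata to an HDFS directory
EXPORT TABLE t TO '/tmp/t_export';

-- re-create the table elsewhere from the exported copy
IMPORT TABLE t_copy FROM '/tmp/t_export';
```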



 



Re: Cross join/cartesian product explanation

2015-11-13 Thread Rory Sawyer
Hi Gopal,

Thanks for the detailed response.

It’s really a very simple query that I’m trying to run:
select
a.a_id,
b.b_id,
count(*) as c
from
table_a a, 
table_b b
where
bloom_contains(a_id, b_id_bloom)
group by
a.a_id,
b.b_id;

Where “bloom_contains” is a custom UDF. The only changes I made were renaming 
the tables and columns. The sizes of the tables I’m running against are small — 
roughly 50–100 MB — but this query would need to be expanded to run on a table 
that is >100 GB (table_b would likely max out around 100 MB).

Any suggestions on how to approach this would be greatly appreciated.

Best,
Rory


Best practices for monitoring hive

2015-11-13 Thread Ashok Kumar
Hi,
I would like to know best practices for monitoring the health and performance of 
Hive and the Hive server, troubleshooting, catching errors, etc.
To be clear, we do not use any bespoke monitoring tool and are keen on developing 
our own in-house tools, to be integrated into the general monitoring tools 
picked up by operations.
Greetings and thanks

does hive support non equality join?

2015-11-13 Thread glen
According to the cwiki, the answer is no, but it appears to be supported based 
on some tests.
By the way, is there any better documentation for Hive?
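For reference, one way to express a non-equality condition is a cross join with the predicate in the WHERE clause; the table and column names below are hypothetical:

```sql
-- matches each event to the time window containing it
SELECT a.id, b.id
FROM events a
CROSS JOIN windows b
WHERE a.ts >= b.start_ts
  AND a.ts < b.end_ts;
```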




Re: [VOTE] Hive 2.0 release plan

2015-11-13 Thread Gopal Vijayaraghavan
(+user@)

+1.

Cheers,
Gopal

On 11/13/15, 5:54 PM, "Lefty Leverenz" wrote:

>The Hive bylaws require this to be submitted on the user@hive mailing list
>(even though users don't get to vote). See Release Plan in Actions.
>
>
>-- Lefty
...
>> > On Fri, Nov 13, 2015 at 1:38 PM, Sergey Shelukhin <
>> ser...@hortonworks.com>
>> > wrote:
>> >
>> >> Hi.
>> >> With no strong objections on DISCUSS thread, some issues raised and
>> >> addressed, and a reminder from Carl about the bylaws for the release
>> >> process, I propose we release the first version of Hive 2 (2.0), and
>> >> nominate myself as release manager.
>> >> The goal is to have the first release of Hive with aggressive set of
>>new
>> >> features, some of which are ready to use and some are at experimental
>> >> stage and will be developed in future Hive 2 releases, in line with
>>the
>> >> Hive-1-Hive-2 split discussion.
>> >> If the vote passes, the timeline to create a branch should be around
>>the
>> >> end of next week (to minimize merging in the wake of the release),
>>and
>> the
>> >> timeline to release would be around the end of November, depending on
>> the
>> >> issues found during the RC cutting process, as usual.
>> >>
>> >> Please vote:
>> >> +1 proceed with the release plan
>> >> +-0 don't care
>> >> -1 don't proceed with the release plan, for such and such reasons
>> >>
>> >> The vote will run for 3 days.
>> >>
>> >>
>>




Re: hive transaction strange behaviour

2015-11-13 Thread Elliot West
It is the compaction process that creates the base files. Check your
configuration to ensure that compaction is enabled; the compactor should
run periodically. You can also request a compaction using the appropriate
ALTER TABLE HQL DDL command.

Elliot.
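As a sketch: automatic compaction requires hive.compactor.initiator.on=true and a non-zero hive.compactor.worker.threads on the metastore, and a compaction for one (hypothetical) table and partition can be requested manually with:

```sql
-- queue a major compaction: delta files are merged into a new base file
ALTER TABLE my_txn_table PARTITION (dt = '2015-11-13') COMPACT 'major';
```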

On Friday, 13 November 2015, Sanjeev Verma wrote:

> I have enabled Hive transactions and am able to see the delta files created
> for some of the partitions, but I do not see any base file created yet. It
> seems strange to me to see so many delta files without any base file.
> Could somebody let me know when the base file is created?
>
> Thanks
>


Re: query orc file by hive

2015-11-13 Thread patcharee

Hi,

It works after I added the partitions with ALTER TABLE. Thanks!

My partitioned ORC directory is created by Spark, therefore Hive 
is not aware of the partitions automatically.


Best,
Patcharee

On 13. nov. 2015 13:08, Elliot West wrote:

Have you added the partitions to the meta store?

ALTER TABLE ... ADD PARTITION ...

If using Spark, I believe it has good support to do this automatically 
with the HiveContext, although I have not used it myself.


Elliot.
