Organising Hive Scripts

2015-09-11 Thread Charles Mean
Hello, I am working with a huge Hive script whose organisation I would like to improve so that it is easier to maintain in the future. Looking into this issue, I did not find any kind of include or similar mechanism to split my script into smaller parts. So, is there some sort of pattern that is

mapjoin with left join

2015-09-11 Thread Steve Howard
We would like to utilize mapjoin for the following SQL construct: select s.* from small s left join large l on s.id = l.id where l.id is null; We can easily fit small into RAM, but large is over 1TB according to optimizer stats. Unless we set hive.auto.convert.join.noconditionaltask.size =

Fwd: Hive Server Load Data InPath Fails

2015-09-11 Thread Vineet Mishra
Hi All, I am making a Hive Thrift connection to Hive Server, and when I run a load data inpath command against one of my tables I get a bizarre exception. Other queries seem to be working fine; this is the only query that fails. 15/09/11 11:40:32 ERROR utils.HCatalogUtil:

Re: Error: java.lang.IllegalArgumentE:Column has wrong number of index entries found - when trying to insert from JSON external table to ORC table

2015-09-11 Thread Daniel Haviv
Hi Prasanth, Can you elaborate on what the hive.merge.orcfile.stripe.level parameter affects? Thank you for your help. Daniel Sent from my iPhone > On 8 Sep 2015, at 17:48, Prasanth Jayachandran > wrote: > > hive.merge.orcfile.stripe.level

Re: Organising Hive Scripts

2015-09-11 Thread Charles Mean
Great Dmitry, It will certainly help me a lot. I will give it a try, thank you very much for your help. On Fri, Sep 11, 2015 at 4:34 PM, Dmitry Tolpeko wrote: > Charles, > > Not sure what you can do in Hive CLI right now, but consider a new Hive > HPL/SQL component that

Hive Macros roadmap

2015-09-11 Thread Elliot West
Hi, I noticed the Hive Macro feature some time ago. To me at least this seemed like an excellent addition to HQL, allowing the user to encapsulate complex column logic as an independent, reusable HQL macro while avoiding the complexities of Java UDFs. However, few people seem to be aware of them

Re: confluence access

2015-09-11 Thread Lefty Leverenz
Done. Welcome to the Hive wiki team, Wojciech! -- Lefty On Fri, Sep 11, 2015 at 6:30 AM, Wojciech Indyk wrote: > Hello! > Please grant me a write-access to the confluence (user woj_in), due to > >

Re: Hive Macros roadmap

2015-09-11 Thread Edward Capriolo
Macros are in and tested. No one will remove them. The unit tests ensure they keep working. On Fri, Sep 11, 2015 at 3:38 PM, Elliot West wrote: > Hi, > > I noticed some time ago the Hive Macro feature. To me at least this seemed > like an excellent addition to HQL, allowing

Re: mapjoin with left join

2015-09-11 Thread Sergey Shelukhin
As far as I know it’s not currently supported. The large table will be streamed in multiple tasks with the small table in memory, so there’s not one place that knows for sure there was no row in the large table for a particular small table row in any of the locations. It could have no match in
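
The reasoning in this reply can be illustrated with a toy simulation. This is only a sketch of the argument, not Hive's implementation: the ids and the two-way split of the large table are made-up values, and `map_side_anti_join` is a hypothetical helper standing in for one map task.

```python
# Hypothetical toy data: ids in the small table, and the large table
# split across two map tasks (each task sees only its own split).
small_ids = [1, 2, 3, 4]
large_splits = [[1, 2], [3, 5]]


def map_side_anti_join(small, split):
    """What one map task can decide alone: small rows with no match in ITS split."""
    seen = set(split)
    return [s for s in small if s not in seen]


# Each task holds the whole small table in memory and streams its split.
per_task = [map_side_anti_join(small_ids, split) for split in large_splits]

# Naively emitting every locally unmatched row is wrong: a row that is
# missing from one split may still be present in another.
naive = sorted({s for out in per_task for s in out})

# A row is truly unmatched only if EVERY task found no match, which needs
# a step that combines the per-task results.
correct = sorted(set.intersection(*(set(out) for out in per_task)))

print("per task:", per_task)
print("naive   :", naive)
print("correct :", correct)
```

In this toy setup only id 4 is absent from the whole large table, yet each task on its own reports additional ids as unmatched, which is why some combining step after the map tasks is needed.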

Re: Hive Server Load Data InPath Fails

2015-09-11 Thread Takahiko Saito
What version of Hive is being used? In the past, I can see we had the following JIRA: https://issues.apache.org/jira/browse/HIVE-4256: JDBC2 HiveConnection does not use the specified database. On Fri, Sep 11, 2015 at 12:44 AM, Vineet Mishra wrote: > > Hi All, > > I am

How to use the Hive Lineage Service

2015-09-11 Thread sumit ghosh
Hi, I am trying to use the lineage service built into Hive. I need the tables used, the columns at the source, and how they are related to the target. Hive has this lineage service: hive --service lineage `cat myQuery` However, it always errors out, failing to parse the Hive query. What am I doing

Re: Organising Hive Scripts

2015-09-11 Thread Dmitry Tolpeko
Charles, Not sure what you can do in Hive CLI right now, but consider a new Hive HPL/SQL component that will be included in new Hive versions and that you can currently compile and run separately; see https://github.com/apache/hive/tree/master/hplsql or www.hplsql.org It supports include files,

confluence access

2015-09-11 Thread Wojciech Indyk
Hello! Please grant me write access to Confluence (user woj_in), due to https://issues.apache.org/jira/browse/HIVE-11329?focusedCommentId=14740243&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14740243 -- Kindly regards/ Pozdrawiam, Wojciech Indyk

Checking the number of Readers

2015-09-11 Thread James Pirz
I am using Hive 1.2.0 on Hadoop 2.6 (on a cluster with 10 machines) and I am trying to understand the performance of a full-table scan. I am running the following query: SELECT * FROM LINEITEM WHERE L_LINENUMBER < 0; and I am measuring its performance in different scenarios: using "MR vs. Tez"