Re: Adding a virtual column for a custom input format

2020-05-06 Thread Gopal V
Hi,

> I'm hoping someone can help me shed some light on how Hive deals with virtual columns.

The virtual column implementation is not extensible in Hive; it is a fixed set of enums.
https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/metadata/VirtualColumn.java#L64

Re: Query rerun with global limitation

2015-01-26 Thread Gopal V
On 1/26/15, 9:18 AM, Philippe Kernévez wrote:
> This degradation is due to this bug (requests are replayed with a full scan): https://issues.apache.org/jira/browse/HIVE-9382

I doubt that is the issue you are hitting, if you're moving from 0.13 to 0.14. You are possibly hitting HIVE-9401. To

Re: Getting Tez working against cdh 5.3

2015-01-23 Thread Gopal V
an administrator or overwriting any of the system installed JARs. HTH. Cheers, Gopal On Tue, Jan 20, 2015 at 6:39 PM, Gopal V gop...@apache.org wrote: On 1/20/15, 12:34 PM, Edward Capriolo wrote: Actually more likely something like this: https://issues.apache.org/jira/browse/TEZ-1621 I

Re: Spark performance for small queries

2015-01-22 Thread Gopal V
On 1/22/15, 3:03 AM, Saumitra Shahapure (Vizury) wrote: We were comparing performance of some of our production hive queries between Hive and Spark. We compared Hive(0.13)+hadoop (1.2.1) against both Spark 0.9 and 1.1. We could see that the performance gains have been good in Spark. Is there

Re: Getting Tez working against cdh 5.3

2015-01-20 Thread Gopal V
On 1/20/15, 12:34 PM, Edward Capriolo wrote:
> Actually more likely something like this: https://issues.apache.org/jira/browse/TEZ-1621

I have a working Hive-13 + Tez install on CDH-5.2.0-1.cdh5.2.0.p0.36. Most of the work needed to get that to work was to build all of Hive+Tez against the

Re: Tez session after closing CLI

2014-12-08 Thread Gopal V
On 12/8/14, 10:09 PM, Fabio wrote: Hi everyone, when running Hive on Tez, a Tez session is alive within the Hive CLI until I leave the CLI. So if I run on the terminal something like hive -f query.sql, once the query is completed the Tez session is closed. Is there a way to run a query in this

Re: Insert into dynamic partitions performance

2014-12-06 Thread Gopal V
On 12/6/14, 6:27 AM, Daniel Haviv wrote:
> Hi, I'm executing an insert statement that goes over 1TB of data. The map phase goes well but the reduce stage only used one reducer which becomes a great bottleneck.

Are you inserting into a bucketed or sorted table? If the destination table is

Re: Insert into dynamic partitions performance

2014-12-06 Thread Gopal V
speed. Cheers, Gopal On 7 Dec 2014, at 06:06, Gopal V gop...@apache.org wrote: On 12/6/14, 6:27 AM, Daniel Haviv wrote: Hi, I'm executing an insert statement that goes over 1TB of data. The map phase goes well but the reduce stage only used one reducer which becomes a great bottleneck

Re: Enabling Tez sessions on HiveServer2

2014-12-04 Thread Gopal V
On 12/3/14, 3:34 PM, Pala M Muthaia wrote: I didn't know doAs needs to be turned off. But I don't think that is something to give up - users create tables, manage data, query etc, and we need the queries/jobs to run as the user who submitted them for various purposes including authorization,

Re: basic, dumb getting started question (single-node)

2014-11-12 Thread Gopal V
On 11/12/14, 1:27 PM, Nicholas Murphy wrote: Hadoop version 2.5.1, Hive version 0.13.1, Oracle JDK (1.6, I believe), Debian 7.7. I notice the default conf/ directory has a bunch of template files, but only that. Can someone point me to a resource, or to an example of what configuration I

Re: row_number() over(Partition by) Throw Error with Null Input.

2014-11-09 Thread Gopal V
On 11/9/14, 10:16 PM, karthik Srivasthava wrote:
> select row_number() over (PARTITION BY country,state,department,branch_name) from Employee_details;
> select count(*) over (PARTITION BY country,state,department,branch_name) from Employee_details;

You haven't posted the entire back trace, so I'm

Re: Tez Vertex failure

2014-10-03 Thread Gopal V
On 10/3/14, 5:20 PM, Echo Li wrote:
> thanks for reply! the query is: *select count(customerid) from tableName where ymd=20140930 ;*

That is simple enough that it should work anyway. There is a strong possibility that the rest of that RuntimeException gives a clue to the problem - if anything

Re: Hive Index and ORC

2014-09-09 Thread Gopal V
On 9/6/14, 9:36 AM, Alain Petrus wrote:
> I am wondering whether it is possible to use Hive index and ORC format? Does it make sense?

ORC maintains its own indexes within the file - one index record every 10,000 rows (orc.row.index.stride / orc.create.index). You can take advantage of it
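
As a hedged illustration of the two table properties named above, a table definition might set them explicitly. The table and column names are hypothetical, and the values shown simply restate what the reply describes (indexes on, one entry per 10,000 rows); this is a sketch, not a tuning recommendation.

```sql
-- Hypothetical table; only the two TBLPROPERTIES keys come from the reply above.
CREATE TABLE events_orc (
  event_id BIGINT,
  ymd      INT
)
STORED AS ORC
TBLPROPERTIES (
  "orc.create.index"     = "true",   -- write ORC's built-in row index
  "orc.row.index.stride" = "10000"   -- number of rows covered by each index entry
);
```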

Re: New to TEZ

2014-08-13 Thread Gopal V
On 8/11/14, 1:48 PM, karthik Srivasthava wrote:
> Hi, Below was my log.. I couldn't find where the error is. Can you please point out what caused my error...

The correct items to post back would be hive.tez.container.size, hive.tez.java.opts and the value of io.sort.mb. It does look like you
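
One way to collect the three values asked for above, sketched under the assumption that you are in the same Hive CLI or Beeline session that ran the failing query: issuing `set` with a key and no value prints the current setting rather than changing it.

```sql
-- Print the current values; nothing is modified.
set hive.tez.container.size;
set hive.tez.java.opts;
set io.sort.mb;
```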

Re: Tuning Triangle Joins on Hive

2014-08-06 Thread Gopal V
On 7/31/14, 12:28 PM, Firas Abuzaid wrote: We're running various triangle join queries on Hive 0.9.0, and we're wondering if we can get any better performance. Here's the query we're running: SELECT count(*) FROM table r1 JOIN table r2 ON (r1.dst = r2.src) JOIN table r3 ON (r2.dst = r3.src AND