Re: Tez jobs on YARN failing sporadically..

2016-07-06 Thread Gautam
here: https://issues.apache.org/jira/browse/YARN-543 In particular, the symptom is that the NM fails to spawn the task container due to init issues. This affected MR and Tez jobs alike, sometimes even crashing the AM initialization itself. *Restarting the affected NMs fixed the issue.* -Gautam. O

Re: Tez jobs on YARN failing sporadically..

2016-06-28 Thread Gautam
*Software Versions* - Hive : 1.1.0 - Tez : 0.7.1 - Hadoop : 2.6.0 On Tue, Jun 28, 2016 at 5:58 PM, Gautam wrote: > Hello, > > We have Tez being used for one of our main ETL workflows and have been > using it for a couple of months now. We recently started seeing the following > er

Tez jobs on YARN failing sporadically..

2016-06-28 Thread Gautam
not succeed due to OWN_TASK_FAILURE, failedTasks:1 killedTasks:115 Vertex vertex_1466828114374_53316_1_00 Map 1 killed/failed due to:OWN_TASK_FAILURE Thanks, -Gautam.

Hive Partition Restatement ..

2016-06-22 Thread Gautam
directories as new versions of the partition and point the table partition location to this new directory. Any currently running query would continue reading from the previous version directory, since that was not moved from its original location. thanks, -Gautam.
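The repointing step described above can be sketched as follows. This is a minimal illustration, not the poster's actual tooling: the table name, partition column, base path and the `v<N>` directory naming are all hypothetical, and the helper only builds the `ALTER TABLE ... PARTITION (...) SET LOCATION` statement rather than running it.

```python
def restatement_statement(table, partition_spec, base_dir, version):
    """Build the statement that repoints a partition at a new version directory.

    The directory layout <base>/<col>=<value>/v<version> is an assumed
    convention for this sketch, not something taken from the thread.
    """
    spec = ", ".join(f"{col}='{val}'" for col, val in partition_spec.items())
    subdir = "/".join(f"{col}={val}" for col, val in partition_spec.items())
    location = f"{base_dir.rstrip('/')}/{subdir}/v{version}"
    return f"ALTER TABLE {table} PARTITION ({spec}) SET LOCATION '{location}';"


# Hypothetical usage: restate partition dt=2016-06-22 as version 2.
print(restatement_statement("events", {"dt": "2016-06-22"},
                            "hdfs:///warehouse/events/", 2))
```

Because the previous version directory is left in place, queries that already resolved the old location keep reading from it; only queries planned after the statement runs see the new directory.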

ORC file sort order ..

2016-04-08 Thread Gautam
g the hive --orcfiledump --rowindex which prints that column's min/max values in the index. But that still does not say whether the data within the stripes is sorted. Cheers, -Gautam.
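A toy model of why min/max index entries are not enough to prove sortedness. This does not use any ORC reader API; the row groups are plain Python lists made up for illustration, and the function names are hypothetical.

```python
def min_max(row_groups):
    """Per-group (min, max) pairs, standing in for row index statistics."""
    return [(min(group), max(group)) for group in row_groups]


def ranges_look_ordered(stats):
    """True if every group's min is >= the previous group's max."""
    return all(prev[1] <= cur[0] for prev, cur in zip(stats, stats[1:]))


def is_sorted(values):
    """True if the values are in non-decreasing order."""
    return all(a <= b for a, b in zip(values, values[1:]))


# Made-up data: group ranges do not overlap, yet rows inside each group are shuffled.
row_groups = [[3, 1, 2], [6, 4, 5]]
stats = min_max(row_groups)
flat = [value for group in row_groups for value in group]

print(stats, ranges_look_ordered(stats), is_sorted(flat))
```

Non-overlapping, increasing ranges are consistent with sorted data but do not confirm it. Overlapping ranges, on the other hand, would show that the column is not globally sorted.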

Re: Hive Metastore Bottleneck

2016-03-30 Thread Gautam
lution could be picking better JVM tuning params. .. there could be other reasons but these should give you a start. -Gautam. On Wed, Mar 30, 2016 at 3:33 PM, Udit Mehta wrote: > But don't the clients always pick the first URI for multiple instances > mentioned in "*hive.metastore.uris"

Re: Hive Metastore Bottleneck

2016-03-30 Thread Gautam
kedin.com/profile/view?id=AAEWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw > <https://www.linkedin.com/profile/view?id=AAEWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw>* > > > > http://talebzadehmich.wordpress.com > > > > On 30 March 2016 at 22:53, Gautam wrote: > >> Can you

Re: Hive Metastore Bottleneck

2016-03-30 Thread Gautam
Can you elaborate on where you see the bottleneck? A general overview of your access path would be useful. For instance, are you accessing the Hive metastore via HiveServer2, from WebHCat using the embedded CLI, or something else? Have you tried putting multiple metastores behind a load balancer? It's

Re: Tez reducer parallelism ..

2016-03-15 Thread Gautam
> The windowing is not simultaneous unless they are all over the same window > - the following query has 3 different windows applied over the same rows > sequentially. Ok. Just wanted to confirm. Maybe I could restructure my query to get more parallelism .. > They are all over the same rows so th

Tez reducer parallelism ..

2016-03-15 Thread Gautam
lized Reducer phases. But what I see is that these get serialized into M1 -> R1 -> R2 -> R3 .. instead of M1 -> [ R1, R2, R3 ] Is this something that Tez tries to do at all or an optimization that I can use to my benefit ? Cheers, -Gautam.

Re: Tez job submissions failing when cluster is under provisioned..

2016-03-11 Thread Gautam
This one seems related https://issues.apache.org/jira/browse/YARN-4538 Yet to ascertain if it actually fixes this issue. On Thu, Mar 10, 2016 at 11:43 PM, Gopal Vijayaraghavan wrote: > > > This seems to be something YARN fair-scheduler reporting it this way.. > >although Tez doesn't seem to ha

Tez job submissions failing when cluster is under provisioned..

2016-03-10 Thread Gautam
defaulting to some sane default and adding tasks to the queue. -Gautam.

Re: Record too large for Tez in-memory buffer...

2016-02-11 Thread Gautam
/ benchmarking further. thanks! -Gautam. On Wed, Feb 10, 2016 at 8:12 PM, Gopal Vijayaraghavan wrote: > > > Good to know there's a fix .. Is there a jira that talks about this > >issue? Coz I couldn't find one. > > https://github.com/apache/tez/commit/714461f47e6408ec331acd0

Re: Record too large for Tez in-memory buffer...

2016-02-10 Thread Gautam
Thanks Gopal! I'll look at the options provided. On Wed, Feb 10, 2016 at 7:46 PM, Gautam wrote: > Here's the json version. > > On Wed, Feb 10, 2016 at 7:44 PM, Gautam wrote: > >> Whoops.. meant to send the tez explain earlier. Here's the Tez query >> plan. G

Re: Record too large for Tez in-memory buffer...

2016-02-10 Thread Gautam
Here's the json version. On Wed, Feb 10, 2016 at 7:44 PM, Gautam wrote: > Whoops.. meant to send the tez explain earlier. Here's the Tez query plan. > Good to know there's a fix .. Is there a jira that talks about this issue? Coz > I couldn't find one. Maybe I

Re: Record too large for Tez in-memory buffer...

2016-02-10 Thread Gautam
Whoops.. meant to send the tez explain earlier. Here's the Tez query plan. Good to know there's a fix .. Is there a jira that talks about this issue? Coz I couldn't find one. Maybe I can alter the query a bit to filter these out. Cheers, -Gautam. On Wed, Feb 10, 2016 at

Record too large for Tez in-memory buffer...

2016-02-10 Thread Gautam
at org.apache.hadoop.hive.ql.exec.ReduceSinkOperator.collect(ReduceSinkOperator.java:534) at org.apache.hadoop.hive.ql.exec.ReduceSinkOperator.processOp(ReduceSinkOperator.java:380) ... 24 more -Gautam. OK STAGE DEPENDENCIES: Stage-1 is a root stage Stage-2 depends on stages: Stage-1 Stage-3 depends on stag

Re: nested join issue

2015-06-13 Thread Gautam
To clarify, HIVE-8435 introduced the regression. Turning that feature off fixes the issue. So we still need to fix that optimization to not produce this incorrect result. On Fri, Jun 12, 2015 at 11:31 PM, Gautam wrote: > Found that turning off hive.optimize.remove.identity.project ( ref: >

Re: nested join issue

2015-06-12 Thread Gautam
Found that turning off hive.optimize.remove.identity.project ( ref: HIVE-8435 <https://issues.apache.org/jira/browse/HIVE-8435> ) fixes the issue. This gives us a workaround, but we don't yet know how much performance degradation it causes. Thanks! -Gautam. On Fri, Jun 12, 2015 at 6:02 PM, Gautam

Re: nested join issue

2015-06-12 Thread Gautam
Done. https://issues.apache.org/jira/browse/HIVE-10996 On Fri, Jun 12, 2015 at 1:47 PM, Gopal Vijayaraghavan wrote: > Hi > > > Thanks for investigating.. Trying to locate the patch that fixes this > >between 1.1 and 2.0.0-SNAPSHOT. Any leads on what Jira this fix was part > >of? Or what part of

Re: nested join issue

2015-06-11 Thread Gautam
Thanks for investigating.. Trying to locate the patch that fixes this between 1.1 and 2.0.0-SNAPSHOT. Any leads on what Jira this fix was part of? Or what part of the code the patch is likely to be on? -Gautam. On Thu, Jun 11, 2015 at 8:35 PM, Gopal Vijayaraghavan wrote: > Hi, >

Re: Parsing Hive Query to get table names and column names

2014-11-06 Thread Ritesh Gautam
le, please use a metadata based approach for pre-calculating >>> and keeping the list of tables that you need to refresh prior to firing a >>> query on them (?) >>> >>> Hope it helps. >>> >>> regards >>> Devopam >>> >>> >

Re: Parsing Hive Query to get table names and column names

2014-11-05 Thread Ritesh Gautam
where these queries are being > generated [ if not static in code!], that would be the best place to get it. > b) [ if you don't have access to a) ] - try http://zql.sourceforge.net/ , > it should be easier. Also check the licence. > > Thanks > Alok > > On Wed, Nov 5, 2014 at 5

Parsing Hive Query to get table names and column names

2014-11-05 Thread Ritesh Gautam
Hello, I am trying to parse Hive queries so that I can get the names of the tables the query depends on. I have tried the following: 1) downloaded the grammar and used ANTLR to generate the lexer and parser, but there are some errors when I try to build it: .. symbol:

Re: Push down hive filters to hbase store with composite rowkey ..

2014-04-08 Thread Gautam
Any thoughts on this? Am I going down the right track? Is there an easier workaround? -Gautam. On Mon, Apr 7, 2014 at 11:29 PM, Gautam wrote: > Hello All, > *Short version*: Looking to leverage hbase fuzzy row scans > on fields in a composite rowkey ( on hbas

Push down hive filters to hbase store with composite rowkey ..

2014-04-07 Thread Gautam
e user group to provide guidance in this hour of need. Please lead me and I shall follow. Cheers, -Gautam. *Footnote*: Our rowkey is a fixed-length key (no delimiter) that looks like: [bucket][App][MsgType][TimeSlice]
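For a fixed-length key of this shape, a fuzzy row scan is driven by a (key, mask) pair: in HBase's FuzzyRowFilter a mask byte of 0 means the position must match and 1 means it is a wildcard. The sketch below only builds and emulates such a pair in plain Python. The field widths and sample values are assumptions (the post does not give them), and nothing here calls HBase itself.

```python
# Hypothetical field widths in bytes; the real layout is not stated in the post.
FIELD_WIDTHS = {"bucket": 1, "app": 4, "msg_type": 2, "time_slice": 4}


def fuzzy_pair(known):
    """Return (key, mask) where fields missing from `known` are wildcards."""
    key, mask = bytearray(), bytearray()
    for name, width in FIELD_WIDTHS.items():
        value = known.get(name)
        if value is None:
            key += b"\x00" * width   # placeholder bytes, ignored by the mask
            mask += b"\x01" * width  # 1 = position is not fixed
        else:
            if len(value) != width:
                raise ValueError(f"{name} must be exactly {width} bytes")
            key += value
            mask += b"\x00" * width  # 0 = position must match
    return bytes(key), bytes(mask)


def matches(row_key, key, mask):
    """Toy emulation of the fuzzy match for same-length keys."""
    if len(row_key) != len(key):
        return False
    return all(m == 1 or r == k for r, k, m in zip(row_key, key, mask))


key, mask = fuzzy_pair({"app": b"APP1", "msg_type": b"01"})
print(matches(b"7APP101abcd", key, mask), matches(b"7APP201abcd", key, mask))
```

A Hive filter on App and MsgType would translate into one such pair, leaving bucket and TimeSlice as wildcards; whether the storage handler can push that down is the open question in the thread.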

Re: Converting Array to a String

2011-06-25 Thread Raghav Kumar Gautam
Can I use them in Hive the way I use a UDF? If so, how can I do that? With Regards, Raghav. "Sobieray, Aaron" writes: > The join() function in StringUtils or Google's Joiner class is what you're > looking for. > > -Original Message- > From: Ra

Converting Array to a String

2011-06-23 Thread Raghav Kumar Gautam
I used the function collect_set(col) on a table and got an array as output. I want to cast it to a string for further processing. Is there a function to accomplish this? With Regards, Raghav.
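The intended transformation, sketched in plain Python rather than Hive. The two helpers are stand-ins named after the Hive functions they mimic, not the Hive implementations. As far as I know, later Hive releases let `concat_ws` take an `array<string>` directly, but whether that is available depends on the Hive version in use, so treat that as an assumption to verify.

```python
def collect_set(values):
    """Deduplicate values, keeping first-seen order.

    Hive's collect_set does not promise any particular order; keeping
    insertion order here is only a convenience of this sketch.
    """
    return list(dict.fromkeys(values))


def concat_ws(sep, items):
    """Join the items into one string with the given separator."""
    return sep.join(str(item) for item in items)


# Made-up column values for illustration.
column = ["a", "b", "a", "c", "b"]
print(concat_ws(",", collect_set(column)))
```

On a Hive version without a built-in for this, the same join logic would live inside a custom UDF, which is what the StringUtils and Joiner suggestions in the reply amount to.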