Caching intermediate data in tez object registry

2015-12-01 Thread Raajay
have sufficient RAM to store all intermediate data. Raajay

Re: VolcanoPlanner vs HepPlanner

2015-10-07 Thread Raajay Viswanathan
. Thanks Raajay > On Oct 7, 2015, at 11:58 AM, John Pullokkaran > wrote: > > This would be a broad change. > Hep Planner does enumerate different join orders through > "LoptOptimizeJoinRule". > > Volcano planner is not used as it has scalability issues. >

VolcanoPlanner vs HepPlanner

2015-10-06 Thread Raajay
to be passed to the HiveVolcanoPlanner for effective CBO ? Thanks, Raajay

Re: Hive Compile mode

2015-09-09 Thread Raajay
Ah okay thanks! On Wed, Sep 9, 2015 at 10:44 PM, Jeff Zhang wrote: > Use explain > > https://cwiki.apache.org/confluence/display/Hive/LanguageManual+Explain > > > > On Thu, Sep 10, 2015 at 11:07 AM, Raajay wrote: > >> Is it possible to use Hive only in comp

Hive Compile mode

2015-09-09 Thread Raajay
Is it possible to use Hive only in compile mode (and not execute the queries)? The output here would be a DAG, say, to be executed on Tez later. Thanks Raajay
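
The reply in this thread suggests EXPLAIN, which has Hive compile a query and print its plan without running it. A minimal Python sketch of building such an invocation, assuming the `hive` CLI is on the PATH; the helper name `explain_command` is hypothetical. Note that EXPLAIN prints a textual plan; it does not by itself produce a serialized DAG that could be submitted to Tez later.

```python
def explain_command(query):
    """Return a Hive CLI argument list that compiles `query` and prints its plan.

    `hive -e` runs the given query string and exits. Prefixing the query with
    EXPLAIN asks Hive for the plan only, so the query is not executed.
    """
    stripped = query.strip().rstrip(";")
    return ["hive", "-e", f"EXPLAIN {stripped}"]
```

The returned list can be passed to `subprocess.run` to capture the plan text.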

Error upon XML serialization.

2015-09-03 Thread Raajay
wever, I was not successful in deserializing the object after writing the serialized form to disk. So, 1. Does it make sense to serialize QueryPlan ? 2. If yes, what are the correct configurations ? 3. If not, which is the ideal data structure to serialize after the query compilation stage ? Th

Hive - Serializing Query Plans

2015-09-02 Thread Raajay
ery object. For execution, I need the QueryPlan.java object. How to go from api.Query (Thrift Generated) to QueryPlan.java ? Thanks Raajay

Re: Serializing dags

2015-09-01 Thread Raajay
I see from the docs that QueryPlan can be serialized to a string using the toThriftJSONString() function. How do I deserialize it? Any pointers would be helpful. Thanks, Raajay On Tue, Sep 1, 2015 at 11:26 AM, Raajay wrote: > Hi Canan, > > The changes that I am primarily interested a

Re: Serializing dags

2015-09-01 Thread Raajay
Hi Canan, The changes that I am primarily interested in are: a. Altering the parallelism of the DAG b. Changing task location hints, etc. In general, I want to make these alterations and run the DAGs on Tez, without having to go through the Hive pipeline. Raajay On Mon, Aug 31, 2015 at 11:42 PM

Serializing dags

2015-08-31 Thread Raajay
Hello, Currently, I am running Hive on Tez. I wish to make some changes to the DAGs generated by Hive before running them on Tez/YARN. Which data structure should I serialize? DAG or DagPlan? - Raajay

Re: Run multiple queries simultaneously

2015-08-25 Thread Raajay
the impact. For this, I need to be able to run queries simultaneously and measure the running times. What I glean from other threads is that it should be good enough to fire up two CLIs and issue the queries. Raajay On Tue, Aug 25, 2015 at 4:17 PM, Ryan Harris wrote: > You need to be a

Re: Run multiple queries simultaneously

2015-08-25 Thread Raajay
Noam, I am concerned with cases where the network is a bottleneck. Will I be able to control it in YARN? Ideally, I would like to run multiple queries simultaneously. Raajay On Tue, Aug 25, 2015 at 9:31 AM, Noam Hasson wrote: > I would just limit the resources given to the user on YARN. >

Run multiple queries simultaneously

2015-08-25 Thread Raajay
Hello, I want to compare the running time of a query when run alone against its running time in the presence of other queries. What is the ideal setup for this experiment? Should I have two Hive CLIs open and issue queries simultaneously? How can I script such an experiment in Hive? Raajay
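
One way to script the experiment described here is to launch each query as a separate Hive CLI process and time them from a driver script. The Python sketch below is an illustration under stated assumptions, not a recommended benchmark harness: it assumes the `hive` CLI is on the PATH, and the helper names (`hive_command`, `timed_run`, `run_concurrently`, `compare`) are hypothetical. Wall-clock times measured this way include CLI startup and session setup, not just query execution.

```python
import subprocess
import time
from concurrent.futures import ThreadPoolExecutor


def hive_command(query):
    """Build a Hive CLI invocation; `hive -e` runs a query string and exits."""
    return ["hive", "-e", query]


def timed_run(cmd):
    """Run one command to completion and return its wall-clock time in seconds."""
    start = time.perf_counter()
    subprocess.run(cmd, check=True, stdout=subprocess.DEVNULL)
    return time.perf_counter() - start


def run_concurrently(cmds):
    """Start all commands at roughly the same time; return per-command durations."""
    with ThreadPoolExecutor(max_workers=len(cmds)) as pool:
        return list(pool.map(timed_run, cmds))


def compare(target_cmd, background_cmd):
    """Time the target alone, then alongside one background command.

    Returns (seconds_alone, seconds_with_background) for the target command.
    """
    alone = timed_run(target_cmd)
    together = run_concurrently([target_cmd, background_cmd])
    return alone, together[0]
```

For example, `compare(hive_command("SELECT COUNT(*) FROM table_a"), hive_command("SELECT COUNT(*) FROM table_b"))` would time the first query alone and then next to the second; the table names are placeholders. Repeating each measurement several times helps separate contention effects from run-to-run noise.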

Re: CBO - get cost of the plan

2015-08-24 Thread Raajay
le. But here the estimates are |tableA| = |tableB| = 7E6 and |tableA join tableB| = 1.76E7 (values highlighted in red in the log snippet above). Should I be able to explicitly specify it somewhere, so that stats gathering is accurate? Also, what is the definition of cumulative cost? Thanks for th

Re: CBO - get cost of the plan

2015-08-24 Thread Raajay
he analyze commands, I find that CBO optimization is ignored as expected. Perhaps I am missing some configuration. I print out the calcite optimized plans, using the "RelOptUtil.toString()" helper on "calciteOptimizedPlan" at the end of "apply" function in C

CBO - get cost of the plan

2015-08-24 Thread Raajay
cumulative cost = {0}, id = 47 The number of rows as displayed here is 1.0, which is clearly not the correct value. - Raajay.

Hive CBO - Calcite Interface

2015-08-10 Thread Raajay
(read Operator trees) with cost less than a threshold. 2. Is there an interface for Hive to get the absolute cost (based on Hive Cost Factory) of an operator tree returned by Calcite? Thanks, Raajay

Re: Running hive on tez locally

2015-08-07 Thread Raajay
Thanks for the configs. When I run Hive, it crashes because the Tez libraries were not found. How do I point Hive to the Tez libraries? Is it sufficient to populate the CLASSPATH environment variable with the location of the Tez libraries? Raajay On Fri, Aug 7, 2015 at 3:16 PM, Jason Dere wrote: > If you

Running hive on tez locally

2015-08-07 Thread Raajay
I have been running Hive queries on a single node (no HDFS). I realize that the queries get compiled as map-reduce jobs and not as TEZ jobs even though "hive.execution.engine=tez" is set. Is that expected ? If yes, what is the ideal environment for debugging hive on tez? Raajay

Hive debug in eclipse

2015-07-30 Thread Raajay
(involving end to end processing of queries) rather than unit tests for a single module/class. 2. Are there other alternatives for speeding up the editing-testing cycle ? Thanks Raajay

View debug logs

2015-07-30 Thread Raajay
Hello everyone, How do I view the logs generated using the "log4j" logger while running the query tests from "itest/qtest"? Also, how do I set the log4j properties, since I need to view the most detailed logs? Thanks, Raajay

Semantic Analysis Run Through

2015-07-30 Thread Raajay
y pointers / explanations will be helpful. Thanks, Raajay

Re: Cost based optimization

2015-06-26 Thread Raajay
Awesome! Thanks John. I would be grateful if you could point me to the files in the source code that are primarily responsible for query planning. Thanks, Raajay On Thu, Jun 25, 2015 at 4:45 PM, John Pullokkaran < jpullokka...@hortonworks.com> wrote: > Hive does look in to alter

Cost based optimization

2015-06-25 Thread Raajay
Hello Everyone, A quick question on the cost-based optimization module in Hive. Does the latest version support query plan generation with alternate join orders ? Thanks Raajay