have
sufficient RAM to store all intermediate data.
Raajay
Thanks
Raajay
> On Oct 7, 2015, at 11:58 AM, John Pullokkaran
> wrote:
>
> This would be a broad change.
> Hep Planner does enumerate different join orders through
> "LoptOptimizeJoinRule".
>
> Volcano planner is not used as it has scalability issues.
>
to be passed to the HiveVolcanoPlanner for
effective CBO ?
Thanks,
Raajay
Ah okay thanks!
On Wed, Sep 9, 2015 at 10:44 PM, Jeff Zhang wrote:
> Use explain
>
> https://cwiki.apache.org/confluence/display/Hive/LanguageManual+Explain
>
>
>
> On Thu, Sep 10, 2015 at 11:07 AM, Raajay wrote:
>
>> Is it possible to use Hive only in comp
Is it possible to use Hive only in compile mode (and not execute the
queries)?
The output here would be a DAG, say, to be executed on Tez later.
Thanks
Raajay
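As the reply above suggests, prefixing a query with EXPLAIN makes Hive compile it and print the plan without running it. Below is a minimal sketch of scripting that, assuming the `hive` CLI is on the PATH. The helper names (`explain_statement`, `explain_command`, `run_explain`) and the table name `t` are made up for illustration, and the result is the textual plan, not a serialized Tez DAG.

```python
import subprocess


def explain_statement(query, extended=False):
    """Prefix a HiveQL query with EXPLAIN so it is compiled but not executed."""
    body = query.strip().rstrip(";")
    keyword = "EXPLAIN EXTENDED" if extended else "EXPLAIN"
    return f"{keyword} {body};"


def explain_command(query, extended=False, hive_bin="hive"):
    """Build the argv for `hive -e "<EXPLAIN ...>"`.

    hive_bin is an assumption: the name or path of the Hive CLI executable.
    """
    return [hive_bin, "-e", explain_statement(query, extended)]


def run_explain(query, extended=False, hive_bin="hive"):
    """Run the command and return the plan text (needs a working Hive install)."""
    result = subprocess.run(
        explain_command(query, extended, hive_bin),
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout
```

For example, `run_explain("SELECT count(*) FROM t")` would return the stage plan as text; whether that text is enough to drive a later Tez run is a separate question.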
wever, I was not successful
in deserializing the object after writing the serialized form to disk. So,
1. Does it make sense to serialize QueryPlan?
2. If yes, what are the correct configurations?
3. If not, which is the ideal data structure to serialize after the query
compilation stage?
Th
ery object.
For execution, I need the QueryPlan.java object.
How do I go from api.Query (Thrift-generated) to QueryPlan.java?
Thanks
Raajay
I see from the docs that QueryPlan can be serialized to string using the
toThriftJSONString() function.
How do I deserialize it? Any pointers would be helpful.
Thanks,
Raajay
On Tue, Sep 1, 2015 at 11:26 AM, Raajay wrote:
> Hi Canan,
>
> The changes that I am primarily interested a
Hi Canan,
The changes that I am primarily interested in are:
a. Altering the parallelism of the DAG
b. Changing task location hints, etc.
In general, I want to make these alterations and run the DAGs on Tez,
without having to go through the Hive pipeline.
Raajay
On Mon, Aug 31, 2015 at 11:42 PM
Hello,
Currently, I am running Hive on Tez. I wish to make some changes to the
DAGs generated by Hive before running them on Tez/YARN.
Which data structure should I serialize: DAG or DagPlan?
- Raajay
the
impact.
For this, I need to be able to run queries simultaneously and measure the
running times. What I glean from other threads is that it should be good
enough to fire up two CLIs and issue the queries.
Raajay
On Tue, Aug 25, 2015 at 4:17 PM, Ryan Harris
wrote:
> You need to be a
Noam,
I am concerned with cases where the network is a bottleneck. Will I be able
to control it in YARN? Ideally, I would like to run multiple queries
simultaneously.
Raajay
On Tue, Aug 25, 2015 at 9:31 AM, Noam Hasson
wrote:
> I would just limit the resources given to the user on YARN.
>
Hello,
I want to compare the running time of a query when run alone against its
running time in the presence of other queries.
What is the ideal setup for running this experiment? Should I have two
Hive CLIs open and issue queries simultaneously? How do I script such an
experiment in Hive?
Raajay
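One way to script the experiment described above is to launch each query as its own `hive` process and time it from a driver script. The sketch below assumes the `hive` CLI is on the PATH and that each query lives in its own file; the function names and the file names `q1.sql` and `q2.sql` are hypothetical. It measures wall-clock time of the whole CLI invocation, so session startup is included in every number.

```python
import subprocess
import time
from concurrent.futures import ThreadPoolExecutor


def hive_file_command(path, hive_bin="hive"):
    """Build the argv for `hive -f <path>` (hive_bin is an assumed executable name)."""
    return [hive_bin, "-f", path]


def time_command(argv):
    """Run one command to completion and return its wall-clock time in seconds."""
    start = time.perf_counter()
    subprocess.run(
        argv,
        check=True,
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return time.perf_counter() - start


def time_concurrently(commands):
    """Start all commands at roughly the same time, one thread per command.

    Returns the elapsed seconds for each command, in input order.
    """
    if not commands:
        return []
    with ThreadPoolExecutor(max_workers=len(commands)) as pool:
        return list(pool.map(time_command, commands))
```

A run might then compare `time_command(hive_file_command("q1.sql"))` against the first entry of `time_concurrently([hive_file_command("q1.sql"), hive_file_command("q2.sql")])`. Repeating each measurement several times would help separate contention effects from ordinary run-to-run variation.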
le. But here the estimates are
|tableA| = |tableB| = 7E6 and |tableA join tableB| = 1.76E7 (values
highlighted in red in the log snippet above)
Should I be able to explicitly specify it somewhere, so that stats gathering is
accurate?
Also, what is the definition of cumulative cost?
Thanks for th
he analyze commands, I find that CBO optimization is ignored as
expected. Perhaps I am missing some configuration.
I print out the calcite optimized plans, using the "RelOptUtil.toString()"
helper on "calciteOptimizedPlan" at the end of "apply" function in
C
cumulative
cost = {0}, id = 47
The number of rows as displayed here is 1.0, which is clearly not the
correct value.
- Raajay.
(read Operator trees)
with cost less than a threshold.
2. Is there an interface for Hive to get the absolute cost (based on Hive
Cost Factory) of an operator tree returned by Calcite?
Thanks,
Raajay
Thanks for the configs. When I run Hive, it crashes because the Tez libraries
were not found.
How do I point Hive to the Tez libraries? Is it sufficient to populate the
CLASSPATH environment variable with the location of the Tez libraries?
Raajay
On Fri, Aug 7, 2015 at 3:16 PM, Jason Dere wrote:
> If you
I have been running Hive queries on a single node (no HDFS). I notice that
the queries get compiled as MapReduce jobs and not as Tez jobs even though
"hive.execution.engine=tez" is set.
Is that expected? If yes, what is the ideal environment for debugging Hive
on Tez?
Raajay
(involving end to end processing of
queries) rather than unit tests for a single module/class.
2. Are there other alternatives for speeding up the editing-testing cycle?
Thanks
Raajay
Hello everyone,
How do I view the logs generated by the "log4j" logger while running the
query tests from "itest/qtest"?
Also, how do I set the log4j properties, since I need to view the most
detailed logs?
Thanks,
Raajay
y
pointers / explanations will be helpful.
Thanks,
Raajay
Awesome! Thanks John.
I would be grateful if you could point me to the files in the source code
that are primarily responsible for query planning.
Thanks,
Raajay
On Thu, Jun 25, 2015 at 4:45 PM, John Pullokkaran <
jpullokka...@hortonworks.com> wrote:
> Hive does look in to alter
Hello Everyone,
A quick question on the cost-based optimization module in Hive. Does the
latest version support query plan generation with alternate join orders?
Thanks
Raajay