> I want to override the partitionByHash function on Flink in the same way
> as DISTRIBUTE BY (DBY) works on Hive.
> I am working on implementing a benchmark system for these two systems,
> which could be a contribution to Hive as well.
I would be very disappointed if Flink fails to outperform Hive with a
DISTRIBUTE BY.
Hello, the same question about DISTRIBUTE BY on Hive.
According to you, you do not use the hashCode of the Object class for DBY,
Distribute By.
I tried to understand how ObjectInspectorUtils works for distribution, but
it seems to involve a lot of Hive API, and I do not understand much of it.
I want to override partitionByHash.
> So do you think, if we want the same result from Hive and Spark or the
> other frameworks, how could we try this one?
There's a special backwards-compat slow codepath that gets triggered if
you do
set mapred.reduce.tasks=199; (or any number)
This will produce the exact same hash-code as the Java hashCode().
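For context, the classic Hadoop-style reducer partitioning works roughly like the sketch below. This is an illustration of the general hash-modulo technique, not Hive's actual compat codepath, which may differ in detail:

```java
// Illustrative sketch only: how a Hadoop-style HashPartitioner maps a key's
// Java hashCode() to one of N reducers. The sign bit is masked off so the
// partition index is always non-negative.
public class HashPartitionSketch {

    static int partitionFor(Object key, int numPartitions) {
        return (key.hashCode() & Integer.MAX_VALUE) % numPartitions;
    }

    public static void main(String[] args) {
        int reducers = 199; // e.g. after "set mapred.reduce.tasks=199;"
        // The same key always maps to the same reducer index:
        System.out.println(partitionFor("philip", reducers)
                == partitionFor("philip", reducers));
    }
}
```

Because the formula only depends on the key's content-based hashCode() and the reducer count, any framework that reimplements the same formula over the same key type will place equal keys in the same partition.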
Thanks for your help.
So do you think, if we want the same result from Hive and Spark or the other
frameworks, how could we try this one?
Could you tell me in detail?
Regards,
Philip
On Thu, Oct 22, 2015 at 6:25 PM, Gopal Vijayaraghavan wrote:
>
> When applying [Distribute By] on Hive to the framework, the function
> should be partitionByHash on Flink. This is to spread out all the rows
> distributed by a hash key from the Object class in Java.
Hive does not use the Object hashCode - the identityHashCode is
inconsistent, so Object.hashCode() is not used for distribution.
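A small plain-Java illustration of the point (this is not Hive code): the default Object.hashCode() is identity-based and depends on the particular instance, whereas a content-based hash such as String.hashCode() is stable for equal values, which is what distribution needs:

```java
// Sketch: identity hashCode vs. content-based hashCode.
// Not Hive code; just demonstrates why distribution must hash the *value*,
// not the object identity.
public class HashConsistency {
    public static void main(String[] args) {
        // Two distinct Object instances: their default (identity) hashCodes
        // will almost certainly differ, and they vary from run to run.
        Object a = new Object();
        Object b = new Object();
        System.out.println(a.hashCode() == b.hashCode());

        // Two equal Strings: String overrides hashCode() to depend on the
        // characters, so equal values always produce the same hash and
        // therefore land in the same partition.
        String s1 = new String("distribute");
        String s2 = new String("distribute");
        System.out.println(s1.hashCode() == s2.hashCode()); // always true
    }
}
```

This is why Hive derives the hash from the data via its ObjectInspector machinery rather than from Object.hashCode().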
Hello, I am working on Flink and Spark, majoring in Computer Science in
Berlin.
I have an important question.
Well, this question comes from what I am doing these days, which is
translating Hive queries to Flink.
When applying [Distribute By] on Hive to the framework, the function should
be partitionByHash on Flink.