On 1/5/16, 7:51 PM, "Kiran Kolli" <[email protected]> wrote:
>In this case it's data-skew. Of all the keys there will be a few keys which
>have more records. What extra information do you need?

Zipf distributions have different workarounds for joins, group-bys &
windowing functions.

The query, along with the JSON output of something like
"explain formatted <query>;". More specifically, enough information to plot
it out like

http://people.apache.org/~gopalv/q27-plan.svg

Please also collect the runtime info for the automated Tez analyzers
(TEZ-2690).

If you need to run the import tool on an older release, Rajesh has a
standalone backport.

https://github.com/rajeshbalamohan/tez-ats-import

Cheers,
Gopal
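
P.S. To put a number on "few keys which have more records", a rough sketch like the one below could be run over a sample of the join or group-by keys. It only counts key frequencies and reports the share of rows held by the most frequent keys. The function name `top_key_share` and the sample data are made up for illustration; nothing here is part of Hive or Tez.

```python
from collections import Counter


def top_key_share(keys, top_n=3):
    """Return (key, count, share_of_rows) for the top_n most frequent keys."""
    counts = Counter(keys)
    total = sum(counts.values())
    # most_common() yields nothing for an empty input, so no division by zero.
    return [(k, c, c / total) for k, c in counts.most_common(top_n)]


if __name__ == "__main__":
    # Hypothetical sample of keys in which "a" dominates.
    sample = ["a"] * 90 + ["b"] * 6 + ["c"] * 3 + ["d"]
    for key, count, share in top_key_share(sample):
        print(f"{key}\t{count}\t{share:.0%}")
```

If one or two keys hold most of the rows, that is worth mentioning alongside the query and the plan output.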
