Can you post the application logs? It would also help if you could run
with "tez.task.generate.counters.per.io=true". This generates per-IO
statistics, which can be useful for debugging.
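
For example, assuming you launch the script from the command line, the
property can be passed as a -D option. This is only a sketch:
myscript.pig is a placeholder name, and the -D option has to come
before the other Pig arguments.

```shell
# myscript.pig is a hypothetical script name; substitute your own.
# -D properties must precede the other Pig options such as -x.
pig -Dtez.task.generate.counters.per.io=true -x tez myscript.pig
```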


~Rajesh.B

On Thu, Sep 3, 2015 at 1:20 PM, Sandeep Kumar <[email protected]>
wrote:

> Hi All,
>
> I'm using Pig-0.14.0 over Tez-0.7.0 to run some basic Pig scripts.
> I'm not seeing any performance gain from Tez: my Pig scripts take the
> same amount of time as they do with the mapred executionType.
>
> The following parameters are in mapred-site.xml and are being read
> by Tez. I'm not able to override them even if I set them in my
> tez-site.xml:
>
>  tez.runtime.shuffle.merge.percent=0.66
>  tez.runtime.shuffle.fetch.buffer.percent=0.70
>  tez.runtime.io.sort.mb=256
>  tez.runtime.shuffle.memory.limit.percent=0.25
>  tez.runtime.io.sort.factor=64
>  tez.runtime.shuffle.connect.timeout=180000
>  tez.runtime.internal.sorter.class=org.apache.hadoop.util.QuickSort
>  tez.runtime.merge.progress.records=10000
>  tez.runtime.compress=true
>  tez.runtime.sort.spill.percent=0.8
>  tez.runtime.shuffle.ssl.enable=false
>  tez.runtime.ifile.readahead=true
>  tez.runtime.shuffle.parallel.copies=10
>  tez.runtime.ifile.readahead.bytes=4194304
>  tez.runtime.task.input.post-merge.buffer.percent=0.0
>  tez.runtime.shuffle.read.timeout=180000
>  tez.runtime.compress.codec=org.apache.hadoop.io.compress.SnappyCodec
>
>
>
> Please find attached the list of task counters. I can see that a lot of
> data is being spilled, but if I try to increase tez.runtime.io.sort.mb
> through mapred-site.xml, my script terminates with an OOM exception.
>
> Can you please suggest which parameters I should change to improve the
> performance of Pig on Tez?
>
> Regards,
> Sandeep
>
