Is it possible to upload the AM logs alone? That would be helpful.

It appears to be a problem with "scope_38_INPUT_scope_37". But without the
logs and without knowing the DAG, it would be hard to locate the issue.

Otherwise, try:

    yarn logs -applicationId appId | grep "HISTORY" > history.log

If you are using SimpleHistoryLoggingService (which is the default), check
whether the "history.txt" logs are available and can be shared. If you are
not sure about the location, check:

    yarn logs -applicationId appId | grep 'Initializing SimpleHistoryLoggingService, logFileLocation='
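
If it is easier to script, the two greps above could be wrapped roughly as
follows. This is only a sketch: the function names are made up for this
example, and it assumes that the location value on the initialization line
ends at the first comma or space.

```shell
#!/bin/sh
# Sketch only: both function names are hypothetical helpers for this thread.

# Keep only the history-event lines from aggregated logs read on stdin.
filter_history() {
    grep 'HISTORY'
}

# Print whatever follows "logFileLocation=" on the initialization line.
# Assumption: the value ends at the first comma or space.
history_log_location() {
    sed -n 's/.*Initializing SimpleHistoryLoggingService, logFileLocation=\([^, ]*\).*/\1/p'
}
```

With those defined, "yarn logs -applicationId appId | filter_history >
history.log" produces the same file as the first command, and piping the
same "yarn logs" output into history_log_location prints only the path.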

~Rajesh.B

On Thu, Sep 3, 2015 at 3:30 PM, Sandeep Kumar <[email protected]>
wrote:

> @Rohini, I used the new version of Pig, i.e. 0.15.0; unfortunately, the
> performance of my script degraded.
> 2015-09-03 15:15:24,698 [main] INFO  org.apache.pig.Main - Pig script
> completed in 4 minutes, 1 second and 22 milliseconds (241022 ms)
>
> whereas earlier it took barely 3 minutes and 27 seconds.
>
> PFA the task counters. The following software versions are being used:
>
> HadoopVersion:
> 2.6.0-cdh5.4.4
>
> PigVersion:
> 0.15.1-SNAPSHOT
>
> TezVersion:
> 0.7.0
>
>
> Regards,
> Sandeep
>
> On Thu, Sep 3, 2015 at 2:46 PM, Sandeep Kumar <[email protected]>
> wrote:
>
>> @Rajesh, PFA the required statistics. It's difficult to share the
>> application logs because they are huge (167 MB). If you want anything
>> specific from those logs, please let me know.
>>
>> @Rohini,
>> Thanks for suggesting the new version of Pig. I'll give it a try
>> for sure.
>>
>> Regards,
>> Sandeep
>>
>> On Thu, Sep 3, 2015 at 2:31 PM, Rohini Palaniswamy <
>> [email protected]> wrote:
>>
>>> Sandeep,
>>>    Can you try with Pig 0.15 first? A ton of fixes have gone into that
>>> release for Pig on Tez, and many of them are performance fixes.
>>>
>>> Regards,
>>> Rohini
>>>
>>> On Thu, Sep 3, 2015 at 1:05 AM, Rajesh Balamohan <[email protected]>
>>> wrote:
>>>
>>>> Can you post the application logs?  It would be helpful if you could
>>>> run with "tez.task.generate.counters.per.io=true". This would generate
>>>> the per IO statistics which can be useful for debugging.
>>>>
>>>>
>>>> ~Rajesh.B
>>>>
>>>> On Thu, Sep 3, 2015 at 1:20 PM, Sandeep Kumar <[email protected]>
>>>> wrote:
>>>>
>>>>> Hi All,
>>>>>
>>>>> I'm using Pig-0.14.0 over Tez-0.7.0 for running some basic Pig
>>>>> scripts. I'm not able to see any performance gain using Tez. My Pig
>>>>> scripts take the same amount of time with the mapred executionType
>>>>> as well.
>>>>>
>>>>> The following parameters are in mapred-site.xml and are being read
>>>>> by Tez, and I'm not able to override them even if I specify them in
>>>>> my tez-site.xml:
>>>>>
>>>>>  tez.runtime.shuffle.merge.percent=0.66
>>>>>  tez.runtime.shuffle.fetch.buffer.percent=0.70
>>>>>  tez.runtime.io.sort.mb=256
>>>>>  tez.runtime.shuffle.memory.limit.percent=0.25
>>>>>  tez.runtime.io.sort.factor=64
>>>>>  tez.runtime.shuffle.connect.timeout=180000
>>>>>  tez.runtime.internal.sorter.class=org.apache.hadoop.util.QuickSort
>>>>>  tez.runtime.merge.progress.records=10000
>>>>>  tez.runtime.compress=true
>>>>>  tez.runtime.sort.spill.percent=0.8
>>>>>  tez.runtime.shuffle.ssl.enable=false
>>>>>  tez.runtime.ifile.readahead=true
>>>>>  tez.runtime.shuffle.parallel.copies=10
>>>>>  tez.runtime.ifile.readahead.bytes=4194304
>>>>>  tez.runtime.task.input.post-merge.buffer.percent=0.0
>>>>>  tez.runtime.shuffle.read.timeout=180000
>>>>>  tez.runtime.compress.codec=org.apache.hadoop.io.compress.SnappyCodec
>>>>>
>>>>>
>>>>>
>>>>> PFA the list of task counters. I can see that a lot of data is being
>>>>> spilled, but if I try to increase tez.runtime.io.sort.mb through
>>>>> mapred-site.xml then my script terminates with an OOM exception.
>>>>>
>>>>> Can you please suggest which parameters I should change to improve
>>>>> the performance of Pig on Tez?
>>>>>
>>>>> Regards,
>>>>> Sandeep
>>>>>
>>>>
>>>>
>>>
>>
>
