Can you check the yarn.resourcemanager.am.max-attempts setting for YARN
(in yarn-site.xml or yarn-default.xml, whichever you are using)?
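
For reference, a minimal sketch of one way to read that property from the command line. The helper name get_hadoop_prop and the fallback config directory /etc/hadoop/conf are assumptions for illustration, not part of any Hadoop tooling; adjust the path to your installation. It relies on the usual layout where <name> and <value> sit on the same line or on consecutive lines. If the property is absent from yarn-site.xml, the value from yarn-default.xml applies (2 in stock Hadoop 2.x, as far as I know).

```shell
# Hypothetical helper: print the <value> that follows a given <name>
# in a Hadoop-style XML config file. Prints nothing if the property is absent.
get_hadoop_prop() {
  prop="$1"
  file="$2"
  awk -v p="$prop" '
    # Remember that we have seen the property name.
    index($0, "<name>" p "</name>") { found = 1 }
    # The first <value> element at or after the name line is taken as its value.
    found && match($0, /<value>[^<]*<\/value>/) {
      # Strip the 7-character "<value>" and the 8-character "</value>".
      print substr($0, RSTART + 7, RLENGTH - 15)
      exit
    }
  ' "$file"
}

# Assumed location of the site config; override via HADOOP_CONF_DIR.
conf="${HADOOP_CONF_DIR:-/etc/hadoop/conf}/yarn-site.xml"
if [ -f "$conf" ]; then
  get_hadoop_prop yarn.resourcemanager.am.max-attempts "$conf"
fi
```

An empty result means the property is not overridden in that file. This is a plain-text scan rather than a real XML parser, so it is only meant as a quick check.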

Also, can you look at the application master logs for one of the app
instances you did not start, to see why it was shut down?


--
Chetan


On Wed, Aug 26, 2015 at 9:51 AM, Tushar Gosavi <[email protected]>
wrote:

> You can also check the YARN ResourceManager UI and logs to verify which
> applications are being restarted continuously.
>
> On Wed, Aug 26, 2015 at 9:08 AM, David Yan <[email protected]> wrote:
>
>> That's a lot of applications.  I suspect something keeps starting the
>> application, which causes the folder to keep growing. Can you run
>> get-app-info in dtcli on just one application and see what is being
>> spawned?
>>
>> David
>>
>> On Tue, Aug 25, 2015 at 11:44 PM, Shashi Vishwakarma <
>> [email protected]> wrote:
>>
>>> Thanks, David, for the detailed explanation. I checked the apps directory
>>> in HDFS; there are around 12858 applications in that folder, each about
>>> 6.2 M in size. It would be a time-consuming process to find the status of
>>> each application by running get-app-info in dtcli, so I logged in to the
>>> DataTorrent web interface (port 9090), but no application is running at
>>> the moment.
>>>
>>> HDFS space utilization is still increasing. Any pointers on this?
>>>
>>> Thanks and Regards,
>>> Shashi
>>>
>>> On Wed, Aug 26, 2015 at 2:16 AM, Amol Kekre <[email protected]>
>>> wrote:
>>>
>>>>
>>>> Adding [email protected]
>>>>
>>>> Thks,
>>>> Amol
>>>>
>>>>
>>>> On Tue, Aug 25, 2015 at 10:34 AM, David Yan <[email protected]>
>>>> wrote:
>>>>
>>>>> Hi Shashi,
>>>>>
>>>>> That directory is where Apex stores application information, like
>>>>> application jar files, checkpoints, container information, etc.
>>>>> Please run this command to see which directory is taking the most
>>>>> space.
>>>>>
>>>>> $ hdfs dfs -du /user/dtadmin/datatorrent/apps
>>>>>
>>>>> Then open dtcli and use the get-app-info command to look at the
>>>>> information for that application.  For example:
>>>>>
>>>>> dt> get-app-info application_1439598948299_0557
>>>>>
>>>>> The field "state" will tell you whether the application is running or
>>>>> not.
>>>>>
>>>>> If you don't care about the application, you can safely kill it if
>>>>> it's running and delete the HDFS directory by doing hdfs dfs -rm -r
>>>>> /user/dtadmin/datatorrent/apps/application_xxx_yyy (replace xxx and yyy
>>>>> with appropriate values).  Note that doing so will wipe all stored
>>>>> information about that application.
>>>>>
>>>>> David
>>>>>
>>>>> On Tue, Aug 25, 2015 at 6:32 AM, Shashi Vishwakarma <
>>>>> [email protected]> wrote:
>>>>>
>>>>>> Hi,
>>>>>>
>>>>>> I have DataTorrent 3.x installed on my cluster. Even though no
>>>>>> DataTorrent application is running, my HDFS space utilization keeps
>>>>>> increasing. Below is the HDFS path that occupies most of the
>>>>>> space.
>>>>>>
>>>>>> /user/dtadmin/datatorrent/apps
>>>>>>
>>>>>> Why is this happening? Am I missing something here?
>>>>>>
>>>>>> Thanks
>>>>>> Shashi
>>>>>>
>>>>>> --
>>>>>> You received this message because you are subscribed to the Google
>>>>>> Groups "apex-dev" group.
>>>>>> To unsubscribe from this group and stop receiving emails from it,
>>>>>> send an email to [email protected].
>>>>>> To post to this group, send email to [email protected].
>>>>>> To view this discussion on the web visit
>>>>>> https://groups.google.com/d/msgid/apex-dev/8754d662-4948-4920-96f3-cb58f70d5f39%40googlegroups.com
>>>>>> <https://groups.google.com/d/msgid/apex-dev/8754d662-4948-4920-96f3-cb58f70d5f39%40googlegroups.com?utm_medium=email&utm_source=footer>
>>>>>> .
>>>>>> For more options, visit https://groups.google.com/d/optout.
>>>>>>
>>>>>
>>>>
>>>>
>>>
>
>
>
> --
> “I'd have blown my top, because I want to beat this damn thing,
>  as long as I've gone this far. I can't just leave it after I've found
>  out so much about it. I have to keep going to find out ultimately
> what is the matter with it in the end."
>                 Richard P. Feynman
>
