[jira] [Commented] (TWILL-261) Add supports to run TwillApplications against Kubernetes cluster

2018-08-17 Thread Yuliya Feldman (JIRA)


[ 
https://issues.apache.org/jira/browse/TWILL-261?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16584176#comment-16584176
 ] 

Yuliya Feldman commented on TWILL-261:
--

+1

> Add supports to run TwillApplications against Kubernetes cluster
> 
>
> Key: TWILL-261
> URL: https://issues.apache.org/jira/browse/TWILL-261
> Project: Apache Twill
>  Issue Type: Story
>Reporter: Terence Yim
>Assignee: Terence Yim
>Priority: Major
>
> Top level story to brainstorm and gather tasks needed in order to bring 
> TwillApplication to Kubernetes cluster



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


Re: Subject: [VOTE] Release of Apache Twill-0.13.0 [rc1]

2018-07-19 Thread Yuliya Feldman
I would think it's better to remove it, since 0.12.1 is a regular release,
unless it is not.

Thanks,
Yuliya

On Thu, Jul 19, 2018 at 1:26 PM, Poorna Chandra  wrote:

> I added all the bugfixes since 0.12.0 into the changes file. I can remove
> the bugs fixed in 0.12.1 from the file.
>
> Thanks,
> Poorna.
>
> On Thu, Jul 19, 2018, 1:11 PM Yuliya Feldman  wrote:
>
> > In this case we should remove mention of the bugs fixed in 0.12.1
> > Or we should keep incremental list that is updated with each new release.
> >
> > Otherwise it would be a confusion about which release those bugs were
> > fixed in:
> >
> > Bug
> > [TWILL-61]  - Fix to allow higher attempts to relaunch the app after
> > the first attempt failed
> > [TWILL-254] - Update to use ContainerId.fromString in Hadoop 2.6+
> > [TWILL-255] - Incorrect logging after memory was adjusted. Does not
> > show memory before adjustment
> >
> > Thanks,
> > Yuliya
> >
> >
> > On Tue, Jul 17, 2018 at 7:34 PM, Poorna Chandra wrote:
> >
> > > Yes, you are right. TWILL-248 is the only change between 0.12.1 and
> > > 0.13.0.
> > >
> > > Poorna
> > >
> > > On Tue, Jul 17, 2018, 5:00 PM Yuliya Feldman wrote:
> > >
> > > > What's the difference between 0.12.1 and 0.13.0 ?
> > > >
> > > > Looks like only TWILL-248, is it?
> > > >
> > > > On Tue, Jul 17, 2018 at 4:49 PM, Poorna Chandra wrote:
> > > >
> > > > > Hi all,
> > > > >
> > > > > This is a call for a vote on releasing Apache Twill 0.13.0, release
> > > > > candidate 1. This is the 15th release of Twill.
> > > > >
> > > > > The source tarball, including signatures, digests, etc. can be found at:
> > > > > https://dist.apache.org/repos/dist/dev/twill/0.13.0-rc1/src
> > > > >
> > > > > The tag to be voted upon is v0.13.0:
> > > > > https://git-wip-us.apache.org/repos/asf?p=twill.git;a=shortlog;h=refs/tags/v0.13.0
> > > > >
> > > > > The release hash is 26c3c988d3358f1c56f3b9a3471b45c144375804:
> > > > > https://git-wip-us.apache.org/repos/asf?p=twill.git;a=commit;h=26c3c988d3358f1c56f3b9a3471b45c144375804
> > > > >
> > > > > The Nexus Staging URL:
> > > > > https://repository.apache.org/content/repositories/orgapachetwill-1026
> > > > >
> > > > > Release artifacts are signed with the following key:
> > > > > http://people.apache.org/keys/committer/poorna
> > > > >
> > > > > KEYS file available:
> > > > > https://dist.apache.org/repos/dist/dev/twill/KEYS
> > > > >
> > > > > For information about the contents of this release, see:
> > > > > https://dist.apache.org/repos/dist/dev/twill/0.13.0-rc1/CHANGES.txt
> > > > >
> > > > > Please vote on releasing this package as Apache Twill 0.13.0
> > > > >
> > > > > The vote will be open for 72 hours.
> > > > >
> > > > > [ ] +1 Release this package as Apache Twill 0.13.0
> > > > > [ ] +0 no opinion
> > > > > [ ] -1 Do not release this package because ...
> > > > >
> > > > > +1 from myself.
> > > > >
> > > > > Thanks,
> > > > > Poorna
> > > > >
> > > >
> > >
> >
>


[jira] [Created] (TWILL-260) Less invasive change to upgrade version of zkclient

2018-07-13 Thread Yuliya Feldman (JIRA)
Yuliya Feldman created TWILL-260:


 Summary: Less invasive change to upgrade version of zkclient
 Key: TWILL-260
 URL: https://issues.apache.org/jira/browse/TWILL-260
 Project: Apache Twill
  Issue Type: Bug
  Components: yarn, zookeeper
Reporter: Yuliya Feldman
Assignee: Yuliya Feldman


This is related to 
[TWILL-249|https://issues.apache.org/jira/projects/TWILL/issues/TWILL-249]

The less invasive change is simply to upgrade zkclient to the latest version.

This should resolve the problems caused by trying to work around the zkclient 
library bug in processing connected and SASL events.

This is causing major issues with MapR, since it has SASL enabled even when 
security is not enabled.





[jira] [Created] (TWILL-255) incorrect logging after memory/cpu was adjusted

2018-02-28 Thread Yuliya Feldman (JIRA)
Yuliya Feldman created TWILL-255:


 Summary: incorrect logging after memory/cpu was adjusted
 Key: TWILL-255
 URL: https://issues.apache.org/jira/browse/TWILL-255
 Project: Apache Twill
  Issue Type: Bug
  Components: yarn
Reporter: Yuliya Feldman
Assignee: Yuliya Feldman


When resources for containers are adjusted, the log message shows the values 
after the adjustment, so it is not known what they were adjusted from.

Affected are: adjustCapability()

Hadoop20YarnAMClient

Hadoop21YarnAMClient
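
To make the problem concrete, here is a minimal sketch of the logging order that would avoid it: read the old value before mutating the resource, then report both. It is not the actual Twill patch. The class Res is a hypothetical stand-in for YARN's Resource, and the method returns the message instead of calling a logger so that it runs without Hadoop on the classpath.

```java
final class AdjustLoggingSketch {

    /** Hypothetical stand-in for YARN's Resource; only memory (MB) is modelled. */
    static final class Res {
        int memoryMb;

        Res(int memoryMb) {
            this.memoryMb = memoryMb;
        }
    }

    /**
     * Caps the requested memory at maxMb. Returns the message that would be
     * logged, or null when nothing had to change.
     */
    static String adjustMemory(Res resource, int maxMb) {
        int before = resource.memoryMb;          // read the old value first
        int updated = Math.min(before, maxMb);
        if (before == updated) {
            return null;
        }
        resource.memoryMb = updated;             // mutate only after capturing
        return "Adjust memory requirement from " + before + " to " + updated + " MB.";
    }

    public static void main(String[] args) {
        Res requested = new Res(8192);
        System.out.println(adjustMemory(requested, 4096));
    }
}
```

The same pattern would apply to virtual cores: keep the pre-adjustment value in a local variable and log that, not a fresh getter call made after the setter.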





[jira] [Commented] (TWILL-252) Not providing any feedback when size of the container requested can't be allocated

2017-12-18 Thread Yuliya Feldman (JIRA)

[ 
https://issues.apache.org/jira/browse/TWILL-252?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16295356#comment-16295356
 ] 

Yuliya Feldman commented on TWILL-252:
--

To do this before the YARN application is submitted, the app that submits the 
YARN app would need to call YARN APIs itself to get the maximum allowed 
container size, whereas Twill already does that when it adjusts the size.

> Not providing any feedback when size of the container requested can't be 
> allocated
> --
>
> Key: TWILL-252
> URL: https://issues.apache.org/jira/browse/TWILL-252
> Project: Apache Twill
>  Issue Type: Bug
>        Reporter: Yuliya Feldman
>
> It looks like when YARN is configured with a maximum memory per container 
> (yarn.scheduler.maximum-allocation-mb) lower than the amount of memory the end 
> user requests for their application, the container is allocated with just the 
> yarn.scheduler.maximum-allocation-mb value, and there is no way to know about 
> it until the container is allocated.
> We try to divide memory into heap and off-heap and end up setting off-heap 
> to a value higher than what was allocated for the container, because the 
> application assumes it either gets what it asked for or the container is not 
> allocated at all.
> We need either the ability to fail the application in this case or a way to 
> not allocate a container with less memory than requested.
> Currently Twill adjusts memory and CPU with only INFO-level messages in the 
> AppMaster log (from Hadoop21YarnAMClient.java):
> {code:java}
>   protected Resource adjustCapability(Resource resource) {
>     int cores = resource.getVirtualCores();
>     int updatedCores = Math.min(resource.getVirtualCores(), maxCapability.getVirtualCores());
>     if (cores != updatedCores) {
>       resource.setVirtualCores(updatedCores);
>       LOG.info("Adjust virtual cores requirement from {} to {}.", cores, updatedCores);
>     }
>     int updatedMemory = Math.min(resource.getMemory(), maxCapability.getMemory());
>     if (resource.getMemory() != updatedMemory) {
>       resource.setMemory(updatedMemory);
>       LOG.info("Adjust memory requirement from {} to {} MB.", resource.getMemory(), updatedMemory);
>     }
>     return resource;
>   }
> {code}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Created] (TWILL-252) Not providing any feedback when size of the container requested can't be allocated

2017-12-12 Thread Yuliya Feldman (JIRA)
Yuliya Feldman created TWILL-252:


 Summary: Not providing any feedback when size of the container 
requested can't be allocated
 Key: TWILL-252
 URL: https://issues.apache.org/jira/browse/TWILL-252
 Project: Apache Twill
  Issue Type: Bug
Reporter: Yuliya Feldman


It looks like when YARN is configured with a maximum memory per container 
(yarn.scheduler.maximum-allocation-mb) lower than the amount of memory the end 
user requests for their application, the container is allocated with just the 
yarn.scheduler.maximum-allocation-mb value, and there is no way to know about it 
until the container is allocated.
We try to divide memory into heap and off-heap and end up setting off-heap 
to a value higher than what was allocated for the container, because the 
application assumes it either gets what it asked for or the container is not 
allocated at all.
We need either the ability to fail the application in this case or a way to not 
allocate a container with less memory than requested.
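
As an illustration only, the two behaviours can be sketched side by side: silently capping the request at the scheduler maximum (what is described above), and refusing the request instead (one of the two options asked for). The class and method names are hypothetical and nothing here is Twill or YARN API. The numbers in main are made up to show an off-heap plan that no longer fits the granted container.

```java
final class ContainerSizeCheck {

    /** Behaviour described above: silently cap at the scheduler maximum. */
    static int clamp(int requestedMb, int maxAllocationMb) {
        return Math.min(requestedMb, maxAllocationMb);
    }

    /** Requested alternative: fail instead of handing back a smaller container. */
    static int requireFits(int requestedMb, int maxAllocationMb) {
        if (requestedMb > maxAllocationMb) {
            throw new IllegalArgumentException(
                "Requested " + requestedMb + " MB exceeds the maximum container size of "
                    + maxAllocationMb + " MB");
        }
        return requestedMb;
    }

    public static void main(String[] args) {
        int requestedMb = 8192;      // what the application asked for
        int plannedOffHeapMb = 6144; // off-heap size derived from the request
        int maxAllocationMb = 4096;  // yarn.scheduler.maximum-allocation-mb

        int grantedMb = clamp(requestedMb, maxAllocationMb);
        System.out.println("granted=" + grantedMb + " MB, planned off-heap="
            + plannedOffHeapMb + " MB, fits=" + (plannedOffHeapMb <= grantedMb));
    }
}
```

With the capped container, the application still sizes off-heap from the original request, which is how it ends up above the container limit.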





[jira] [Commented] (TWILL-203) irrespective of number of CPUs specified in App Config it is always 1

2017-10-17 Thread Yuliya Feldman (JIRA)

[ 
https://issues.apache.org/jira/browse/TWILL-203?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16208558#comment-16208558
 ] 

Yuliya Feldman commented on TWILL-203:
--

Looks like it depends on the YARN setup: whether or not it is set up to support 
CPU as a resource.

> irrespective of number of CPUs specified in App Config it is always 1
> -
>
> Key: TWILL-203
> URL: https://issues.apache.org/jira/browse/TWILL-203
> Project: Apache Twill
>  Issue Type: Bug
>  Components: yarn
>Affects Versions: 0.8.0
>    Reporter: Yuliya Feldman
> Attachments: Screen Shot 2017-01-06 at 10.17.24 AM.png, rm.log, 
> twillclient.log
>
>
> When trying to deploy Bundled Jar app and specifying number of CPUs > 1 it 
> still defaults to 1 when application is starting.
> Version of YARN is: 2.7.2, Version of Twill is 0.8. Capacity scheduler.
> Looks like (from the logs) all the info is passed through Twill correctly and 
> gets "lost" while getting to RM
> Please see attached logs from Twill Client, RM and Screenshot





Re: HelloWorld Struggle

2017-06-12 Thread Yuliya Feldman
Do those env vars resolve on the NodeManager nodes correctly?
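
One rough way to check is a small program like the sketch below, run on a NodeManager host as the user that runs the NodeManager. It scans classpath entries for $NAME references and reports the names that have no value in the given environment. The class name is made up, and the check only approximates what YARN does when it builds the container launch environment, so treat a clean result as a hint rather than proof.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class ClasspathEnvCheck {

    // Matches shell-style references such as $HADOOP_CONF_DIR.
    private static final Pattern VAR = Pattern.compile("\\$([A-Za-z_][A-Za-z0-9_]*)");

    /** Returns the variable names used in entries that are unset or empty in env. */
    static List<String> unresolved(List<String> entries, Map<String, String> env) {
        Set<String> missing = new LinkedHashSet<>();
        for (String entry : entries) {
            Matcher m = VAR.matcher(entry);
            while (m.find()) {
                String value = env.get(m.group(1));
                if (value == null || value.isEmpty()) {
                    missing.add(m.group(1));
                }
            }
        }
        return new ArrayList<>(missing);
    }

    public static void main(String[] args) {
        // Entries as reported for yarnClasspath in this thread.
        List<String> entries = Arrays.asList(
            "$HADOOP_CONF_DIR",
            "$HADOOP_COMMON_HOME/*", "$HADOOP_COMMON_HOME/lib/*",
            "$HADOOP_HDFS_HOME/*", "$HADOOP_HDFS_HOME/lib/*",
            "$HADOOP_MAPRED_HOME/*", "$HADOOP_MAPRED_HOME/lib/*",
            "$HADOOP_YARN_HOME/*", "$HADOOP_YARN_HOME/lib/*");
        System.out.println("Unresolved: " + unresolved(entries, System.getenv()));
    }
}
```

If any name is reported, the matching classpath entry would expand to a path that does not point at the Hadoop jars, which would be consistent with the NoClassDefFoundError for org/apache/hadoop/conf/Configuration.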

On Mon, Jun 12, 2017 at 2:05 PM, Chris Hebert <
chris.hebert-...@digitalreasoning.com> wrote:

> Other YARN apps work fine. For example, I just successfully ran the stock
> MapReduce wordcount example (and of course, MapReduce is a YARN
> application).
>
> I ran HelloWorld in debug mode earlier and found that yarnClasspath
> contains the following:
>   $HADOOP_CONF_DIR
>   $HADOOP_COMMON_HOME/*
>   $HADOOP_COMMON_HOME/lib/*
>   $HADOOP_HDFS_HOME/*
>   $HADOOP_HDFS_HOME/lib/*
>   $HADOOP_MAPRED_HOME/*
>   $HADOOP_MAPRED_HOME/lib/*
>   $HADOOP_YARN_HOME/*
>   $HADOOP_YARN_HOME/lib/*
>
> I don't know whether these environment variables are supposed to be
> resolved to path variables already or if that happens later. I also don't
> know if I'm supposed to explicitly declare these environment variables
> somewhere, and if so, I do not know where in the Hadoop configuration it is
> ideal for me to declare them. I tried declaring them in hadoop_env.sh and
> core-site.xml, but I'm not sure I did it right, and at least in my initial
> efforts declaring these variables did not seem to prevent the error. If it
> is the correct thing for me to set these variables appropriately somewhere,
> then I will continue to try to do so.
>
> On Mon, Jun 12, 2017 at 3:46 PM, Yuliya Feldman <yul...@dremio.com> wrote:
>
> > I don't think it is an issue with the classpath on the client, since it
> > gets as far as starting the AppMaster container.
> >
> > Do other YARN apps run OK?
> >
> > Maybe YarnConfiguration.YARN_APPLICATION_CLASSPATH is not producing the
> > right jars.
> >
> > Look at the HelloWorld code for yarnClasspath.
> >
> >
> >
> > On Mon, Jun 12, 2017 at 1:22 PM, Chris Hebert <
> > chris.hebert-...@digitalreasoning.com> wrote:
> >
> > > I hate to ask this here, but it won't work, so whatever.
> > >
> > > I followed the HelloWorld section of the Getting Started guide <
> > > http://twill.apache.org/GettingStarted.html> on my cluster with Hadoop
> > and
> > > Zookeeper set up and functioning properly.
> > >
> > > git clone https://github.com/apache/twill.git
> > > cd twill
> > > mvn clean install -DskipTests
> > >
> > > export CP=twill-examples/yarn/target/twill-examples-yarn-0.12.0-SNAPSHOT.jar:`hadoop classpath`
> > > java -cp $CP org.apache.twill.example.yarn.HelloWorld my.zookeeper.domain:2181
> > >
> > > Yes, `hadoop classpath` echoes all the relevant jar directories.
> > >
> > > The command runs well for a bit with multiple:
> > > [ STARTING] DEBUG o.a.twill.yarn.YarnTwillController - Yarn
> application
> > > status for HelloWorldRunnable application_1_0001: ACCEPTED
> > >
> > > until:
> > > [ STARTING] DEBUG o.a.hadoop.service.AbstractService - Service:
> > > org.apache.hadoop.yarn.client.api.impl.YarnClientImpl entered state
> > > STOPPED
> > > [ STARTING] DEBUG org.apache.hadoop.ipc.Client - stopping client from
> > > cache: org.apache.hadoop.ipc.Client@4d465b11
> > > [ STARTING] DEBUG o.a.twill.yarn.YarnTwillController - Yarn
> application
> > > status for HelloWorldRunnable application_1_0001: FAILED
> > > ...
> > > java.util.concurrent.ExecutionException: java.lang.RuntimeException:
> > Yarn
> > > application completed with failure HelloWorldRunnable...
> > >
> > > The ResourceManager reveals:
> > > Application application_1_0001 failed 2 times due to AM
> > > Container for appattempt_1_0001_02 exited with
> > exitCode: 1
> > > ...
> > > Diagnostics: Exception from container-launch.
> > >
> > > The corresponding YARN logs for each DataNode reveal:
> > > Exception in thread "main" java.lang.NoClassDefFoundError:
> > > org/apache/hadoop/conf/Configuration
> > > at java.lang.Class.getDeclaredMethods0(Native Method)
> > > at java.lang.Class.privateGetDeclaredMethods(Class.java:2701)
> > > at java.lang.Class.privateGetMethodRecursive(Class.java:3048)
> > > at java.lang.Class.getMethod0(Class.java:3018)
> > > at java.lang.Class.getMethod(Class.java:1784)
> > > at org.apache.twill.launcher.TwillLauncher.main(TwillLauncher.java:70)
> > > Caused by: java.lang.ClassNotFoundException:
> > > org.apache.hadoop.conf.Configuration
> > > at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
> 

Re: HelloWorld Struggle

2017-06-12 Thread Yuliya Feldman
I don't think it is an issue with the classpath on the client, since it gets
as far as starting the AppMaster container.

Do other YARN apps run OK?

Maybe YarnConfiguration.YARN_APPLICATION_CLASSPATH is not producing the right
jars.

Look at the HelloWorld code for yarnClasspath.



On Mon, Jun 12, 2017 at 1:22 PM, Chris Hebert <
chris.hebert-...@digitalreasoning.com> wrote:

> I hate to ask this here, but it won't work, so whatever.
>
> I followed the HelloWorld section of the Getting Started guide <
> http://twill.apache.org/GettingStarted.html> on my cluster with Hadoop and
> Zookeeper set up and functioning properly.
>
> git clone https://github.com/apache/twill.git
> cd twill
> mvn clean install -DskipTests
>
> export CP=twill-examples/yarn/target/twill-examples-yarn-0.12.0-SNAPSHOT.jar:`hadoop classpath`
> java -cp $CP org.apache.twill.example.yarn.HelloWorld my.zookeeper.domain:2181
>
> Yes, `hadoop classpath` echoes all the relevant jar directories.
>
> The command runs well for a bit with multiple:
> [ STARTING] DEBUG o.a.twill.yarn.YarnTwillController - Yarn application
> status for HelloWorldRunnable application_1_0001: ACCEPTED
>
> until:
> [ STARTING] DEBUG o.a.hadoop.service.AbstractService - Service:
> org.apache.hadoop.yarn.client.api.impl.YarnClientImpl entered state
> STOPPED
> [ STARTING] DEBUG org.apache.hadoop.ipc.Client - stopping client from
> cache: org.apache.hadoop.ipc.Client@4d465b11
> [ STARTING] DEBUG o.a.twill.yarn.YarnTwillController - Yarn application
> status for HelloWorldRunnable application_1_0001: FAILED
> ...
> java.util.concurrent.ExecutionException: java.lang.RuntimeException: Yarn
> application completed with failure HelloWorldRunnable...
>
> The ResourceManager reveals:
> Application application_1_0001 failed 2 times due to AM
> Container for appattempt_1_0001_02 exited with exitCode: 1
> ...
> Diagnostics: Exception from container-launch.
>
> The corresponding YARN logs for each DataNode reveal:
> Exception in thread "main" java.lang.NoClassDefFoundError:
> org/apache/hadoop/conf/Configuration
> at java.lang.Class.getDeclaredMethods0(Native Method)
> at java.lang.Class.privateGetDeclaredMethods(Class.java:2701)
> at java.lang.Class.privateGetMethodRecursive(Class.java:3048)
> at java.lang.Class.getMethod0(Class.java:3018)
> at java.lang.Class.getMethod(Class.java:1784)
> at org.apache.twill.launcher.TwillLauncher.main(TwillLauncher.java:70)
> Caused by: java.lang.ClassNotFoundException:
> org.apache.hadoop.conf.Configuration
> at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
> at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
> at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
> ... 6 more
> Launch class (org.apache.twill.internal.appmaster.ApplicationMasterMain)
> using classloader java.net.URLClassLoader with classpath: [
> *A list of several classpaths like
> "file:/some/path/yarn/local/usercache/my.username/appcache/application_1_0001/container_e10_1_0001_02_01/application.jar/lib/twill-examples-yarn-0.12.0-SNAPSHOT.jar"
> But none of which are paths to any Hadoop jars of the sort that are
> referenced in $CP*
> ]
>
> What am I missing?
>
> I've spent an embarrassingly large amount of time on this fiddling with
> environment variables and Hadoop configuration. (I'm an intern learning
> this stuff the hard way, so it's not really embarrassing, just
> substantial.)
>


[jira] [Commented] (TWILL-233) Apache-Twill 0.11.0 install failure

2017-05-18 Thread Yuliya Feldman (JIRA)

[ 
https://issues.apache.org/jira/browse/TWILL-233?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16015868#comment-16015868
 ] 

Yuliya Feldman commented on TWILL-233:
--

[~narahari] can you point to the exact error(s)? The log is really big.

> Apache-Twill 0.11.0 install failure
> ---
>
> Key: TWILL-233
> URL: https://issues.apache.org/jira/browse/TWILL-233
> Project: Apache Twill
>  Issue Type: Bug
>  Components: yarn
>Affects Versions: 0.11.0
> Environment: Redhat Linux 64 bit
>Reporter: Narahari
>Priority: Critical
> Fix For: 0.11.0
>
> Attachments: install_errors.txt
>
>
> Trying to install Apache-Twill 0.11.0 followed by link 
> http://twill.apache.org/GettingStarted.html.  Getting below errors.  We are 
> trying to install on MapR lab box. MapR version is 5.1.0.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


Re: ENOENT error on upgrading to Twill 0.10.0

2017-03-27 Thread Yuliya Feldman
The code of the application you want to run in YARN, I believe :)

On Mon, Mar 27, 2017 at 3:28 PM, Sam William <sampri...@gmail.com> wrote:

> Yes. 22 bytes looks like an empty zip file.  Any idea what should be there in
> the application jar file?
>
> Sam
> > On Mar 27, 2017, at 13:22, Yuliya Feldman <yul...@dremio.com> wrote:
> >
> > The file is very small, so it may have nothing to do with file not found. It
> > could be permissions or something else.
> >
> > On Mon, Mar 27, 2017 at 1:17 PM, Sam William <sampri...@gmail.com>
> wrote:
> >
> >> I logged into the master host and looked at the nodemanager logs. It
> fails
> >> at localizing the application jar.  The files are there in HDFS.  I can
> >> even see it is able to copy the other files just fine (for example the
> >> launcher jar and runtime.config)
> >>
> >> -rw-r--r--   3 sam supergroup 22 2017-03-27 12:47
> >> /user/sam/Build-shards-GRE/2f30b4ab-d9e1-48bd-9384-
> >> 44a506886fc1/Build-shards-GRE-bd5d893b401041edceec38c78f1ece
> >> c7-application.538b9590-d7f5-4121-824e-448a12a635c1.jar
> >> -rw-r--r--   3 sam supergroup5991970 2017-03-27 12:47
> >> /user/sam/Build-shards-GRE/2f30b4ab-d9e1-48bd-9384-
> >> 44a506886fc1/buil.b0458483-23ca-4243-89f6-d1a40210110d.
> >> -rw-r--r--   3 sam supergroup   5725 2017-03-27 12:47
> >> /user/sam/Build-shards-GRE/2f30b4ab-d9e1-48bd-9384-
> 44a506886fc1/launcher.
> >> 4d7df397-5325-4a5f-8c95-ddcae99867f5.jar
> >> -rw-r--r--   3 sam supergroup   1038 2017-03-27 12:47
> >> /user/sam/Build-shards-GRE/2f30b4ab-d9e1-48bd-9384-
> >> 44a506886fc1/localizeFiles.bbe5dc82-9fe9-4249-8964-df15212a1812.json
> >> -rw-r--r--   3 sam supergroup   2072 2017-03-27 12:47
> >> /user/sam/Build-shards-GRE/2f30b4ab-d9e1-48bd-9384-
> >> 44a506886fc1/runtime.config.9dd1b585-c601-40b7-8831-25383013eb1e.jar
> >> -rw-r--r--   3 sam supergroup   48245414 2017-03-27 12:47
> >> /user/sam/Build-shards-GRE/2f30b4ab-d9e1-48bd-9384-
> >> 44a506886fc1/twill.c765e4d8-958e-4811-b138-c4ef71e2a93e.jar
> >>
> >>
> >> 2017-03-27 12:47:45,632 INFO org.apache.hadoop.yarn.server.
> >> nodemanager.containermanager.localizer.LocalizedResource: Resource
> >> hdfs://pv34-search-dev/user/sam/Build-shards-GRE/2f30b4ab-
> >> d9e1-48bd-9384-44a506886fc1/runtime.config.9dd1b585-c601-
> >> 40b7-8831-25383013eb1e.jar(->/data/8/yarn/nm/usercache/sam/
> >> appcache/application_1484158548936_11282/filecache/
> >> 11/runtime.config.9dd1b585-c601-40b7-8831-25383013eb1e.jar)
> transitioned
> >> from DOWNLOADING to LOCALIZED
> >> 2017-03-27 12:47:45,645 INFO org.apache.hadoop.yarn.server.
> >> nodemanager.containermanager.localizer.LocalizedResource: Resource
> >> hdfs://pv34-search-dev/user/sam/Build-shards-GRE/2f30b4ab-
> >> d9e1-48bd-9384-44a506886fc1/launcher.4d7df397-5325-4a5f-
> >> 8c95-ddcae99867f5.jar(->/data/10/yarn/nm/usercache/sam/
> >> appcache/application_1484158548936_11282/filecache/
> >> 12/launcher.4d7df397-5325-4a5f-8c95-ddcae99867f5.jar) transitioned from
> >> DOWNLOADING to LOCALIZED
> >> 2017-03-27 12:47:45,651 WARN org.apache.hadoop.security.
> UserGroupInformation:
> >> PriviledgedActionException as:sam (auth:SIMPLE) cause:ENOENT: No such
> file
> >> or directory
> >> 2017-03-27 12:47:45,655 WARN org.apache.hadoop.yarn.server.
> >> nodemanager.containermanager.localizer.ResourceLocalizationService: {
> >> hdfs://pv34-search-dev/user/sam/Build-shards-GRE/2f30b4ab-
> >> d9e1-48bd-9384-44a506886fc1/Build-shards-GRE-
> >> bd5d893b401041edceec38c78f1ecec7-application.538b9590-d7f5-
> 4121-824e-448a12a635c1.jar,
> >> 1490644063924, ARCHIVE, null } failed: No such file or directory
> >> ENOENT: No such file or directory
> >>at org.apache.hadoop.io.nativeio.NativeIO$POSIX.chmodImpl(Native
> >> Method)
> >>at org.apache.hadoop.io.nativeio.NativeIO$POSIX.chmod(NativeIO.
> >> java:230)
> >>at org.apache.hadoop.fs.RawLocalFileSystem.setPermission(
> >> RawLocalFileSystem.java:660)
> >>at org.apache.hadoop.fs.DelegateToFileSystem.setPermission(
> >> DelegateToFileSystem.java:206)
> >>at org.apache.hadoop.fs.FilterFs.setPermission(FilterFs.java:
> 251)
> >>at org.apache.hadoop.fs.FileContext$10.next(
> FileContext.java:955)
> >>at org.apache.hadoop.fs.FileContext$10.next(
> FileContext.java:951)
> >>at org.apache

question regarding BundleRunnable behavior in case of failure

2017-03-23 Thread Yuliya Feldman
Hello there,

I am using BundledJarRunnable, and so far my experience has been that when my
runnable fails, the container process never exits.

My impression is that it is stuck pretty much all the time executing shutdown
hooks (not even my application-specific ones).

Here is the thread dump for that thread:

"TwillContainerService" #32 prio=5 os_prio=0 tid=0x7f1590297800 nid=0x4de2 in Object.wait() [0x7f15805d9000]
   java.lang.Thread.State: WAITING (on object monitor)
        at java.lang.Object.wait(Native Method)
        at java.lang.Thread.join(Thread.java:1249)
        - locked <0xff9c41d0> (a org.apache.twill.internal.ServiceMain$1)
        at java.lang.Thread.join(Thread.java:1323)
        at java.lang.ApplicationShutdownHooks.runHooks(ApplicationShutdownHooks.java:106)
        at java.lang.ApplicationShutdownHooks$1.run(ApplicationShutdownHooks.java:46)
        at java.lang.Shutdown.runHooks(Shutdown.java:123)
        at java.lang.Shutdown.sequence(Shutdown.java:167)
        at java.lang.Shutdown.exit(Shutdown.java:212)
        - locked <0xff6363e8> (a java.lang.Class for java.lang.Shutdown)
        at java.lang.Runtime.exit(Runtime.java:109)
        at java.lang.System.exit(System.java:971)
        at org.apache.twill.ext.BundledJarRunnable.run(BundledJarRunnable.java:59)
        at org.apache.twill.internal.container.TwillContainerService.doRun(TwillContainerService.java:130)
        at org.apache.twill.internal.AbstractTwillService.run(AbstractTwillService.java:181)
        at twill.com.google.common.util.concurrent.AbstractExecutionThreadService$1$1.run(AbstractExecutionThreadService.java:52)
        at java.lang.Thread.run(Thread.java:745)


Just wondering if anybody else has experienced something similar.

Thanks
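
For anyone trying to picture the hang, below is a small simulation of one possible explanation. The dump only shows the service thread calling System.exit() and then joining the hook thread (ServiceMain$1). What that hook is waiting for is not in the dump, so the sketch assumes it waits for the service to stop. All names are made up, and the hook gets a timeout so the program terminates. Without that timeout the two threads would wait on each other indefinitely.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

final class ShutdownHookWaitSketch {

    /** Returns true if the simulated hook saw the service stop before timing out. */
    static boolean hookSeesServiceStop(long hookTimeoutMillis) {
        CountDownLatch serviceStopped = new CountDownLatch(1);
        boolean[] seen = new boolean[1];

        // Stands in for the shutdown hook: assumed to wait for the service to stop.
        Thread hook = new Thread(() -> {
            try {
                seen[0] = serviceStopped.await(hookTimeoutMillis, TimeUnit.MILLISECONDS);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }, "simulated-hook");

        // Stands in for the service thread that calls System.exit(): exit runs
        // the hooks and joins them, so the service cannot finish before they do.
        Thread service = new Thread(() -> {
            hook.start();
            try {
                hook.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
            serviceStopped.countDown(); // reachable only after the hook returned
        }, "simulated-service");

        service.start();
        try {
            service.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return seen[0];
    }

    public static void main(String[] args) {
        System.out.println("hook saw service stop: " + hookSeesServiceStop(200));
    }
}
```

The simulated hook always times out, because the service can only signal that it has stopped after the join on the hook returns. If the real hook behaves this way, calling System.exit() from inside the runnable's own thread would be the trigger.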


[jira] [Commented] (TWILL-225) Allow using different configurations per application submission

2017-03-20 Thread Yuliya Feldman (JIRA)

[ 
https://issues.apache.org/jira/browse/TWILL-225?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15933443#comment-15933443
 ] 

Yuliya Feldman commented on TWILL-225:
--

[~chtyim] Great idea. Otherwise it pretty much has to be one TwillRunnerService per 
"run" whenever the Configuration object needs any modification.

> Allow using different configurations per application submission
> ---
>
> Key: TWILL-225
> URL: https://issues.apache.org/jira/browse/TWILL-225
> Project: Apache Twill
>  Issue Type: Improvement
>Reporter: Terence Yim
>Assignee: Terence Yim
> Fix For: 0.11.0
>
>
> Currently there are couple configurations that can be provided via the hadoop 
> {{Configuration}} object to the {{YarnTwillRunnerService}}. However, those 
> configurations are global (same for all app launched through the same 
> {{TwillRunnerService}}). It would be better if the {{TwillPreparer}} exposes 
> method to alter the configuration for a given app submission.





[jira] [Commented] (TWILL-216) Make ratio between total memory and on-heap memory configurable

2017-02-17 Thread Yuliya Feldman (JIRA)

[ 
https://issues.apache.org/jira/browse/TWILL-216?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15872281#comment-15872281
 ] 

Yuliya Feldman commented on TWILL-216:
--

Sorry, missed one style change. Will update PR in a minute

> Make ratio between total memory and on-heap memory configurable
> ---
>
> Key: TWILL-216
> URL: https://issues.apache.org/jira/browse/TWILL-216
> Project: Apache Twill
>  Issue Type: Improvement
>  Components: yarn
>        Reporter: Yuliya Feldman
>    Assignee: Yuliya Feldman
>
> As of now ratio between on-heap memory and total memory provided to yarn 
> container is hardcoded to 0.7, so if app running in the container needs more 
> reserved memory than on-heap it is not possible to achieve.
> Suggestion is to make it configurable as well as amount of reserved memory





[jira] [Commented] (TWILL-216) Make ratio between total memory and on-heap memory configurable

2017-02-16 Thread Yuliya Feldman (JIRA)

[ 
https://issues.apache.org/jira/browse/TWILL-216?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15871248#comment-15871248
 ] 

Yuliya Feldman commented on TWILL-216:
--

Sorry for not adding more details soon.

Essentially, at the moment Twill decides how much of the total requested memory 
to allocate to heap based on the hardcoded ratio of 0.7, meaning it will 
allocate at least 70% to heap.

There could be applications that use direct memory quite a bit and want to 
allocate more than 30% of total memory as direct memory. 

This is the rationale behind this JIRA.
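
A rough sketch of the arithmetic, with the ratio passed in as a parameter instead of being hardcoded. The formula is an assumption based on the description above (heap gets whatever is left after the reserved memory, but never less than ratio times total), and the class and method names are illustrative rather than Twill's actual code.

```java
final class HeapRatioSketch {

    /**
     * Returns the heap size in MB for a container: the memory left after the
     * reserved part, but never less than minHeapRatio of the whole container.
     */
    static int maxHeapMb(int containerMb, int reservedMb, double minHeapRatio) {
        int afterReserve = containerMb - reservedMb;
        int ratioFloor = (int) Math.ceil(containerMb * minHeapRatio);
        return Math.max(afterReserve, ratioFloor);
    }

    public static void main(String[] args) {
        // Small reserve: heap is simply what is left over.
        System.out.println(maxHeapMb(2048, 256, 0.75));
        // Large reserve with a high ratio: the ratio wins and caps the non-heap share.
        System.out.println(maxHeapMb(2048, 1024, 0.75));
        // Same reserve with a lower, configurable ratio: half the container stays off-heap.
        System.out.println(maxHeapMb(2048, 1024, 0.25));
    }
}
```

With a fixed ratio of 0.7, the second case is the limiting one: no matter how much reserved memory is requested, at most 30% of the container is left for direct memory.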

> Make ratio between total memory and on-heap memory configurable
> ---
>
> Key: TWILL-216
> URL: https://issues.apache.org/jira/browse/TWILL-216
> Project: Apache Twill
>  Issue Type: Improvement
>  Components: yarn
>        Reporter: Yuliya Feldman
>    Assignee: Yuliya Feldman
>
> As of now ratio between on-heap memory and total memory provided to yarn 
> container is hardcoded to 0.7, so if app running in the container needs more 
> reserved memory than on-heap it is not possible to achieve.
> Suggestion is to make it configurable as well as amount of reserved memory





[jira] [Comment Edited] (TWILL-216) Make ratio between total memory and on-heap memory configurable

2017-02-16 Thread Yuliya Feldman (JIRA)

[ 
https://issues.apache.org/jira/browse/TWILL-216?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15871248#comment-15871248
 ] 

Yuliya Feldman edited comment on TWILL-216 at 2/17/17 6:24 AM:
---

Sorry for not adding more details sooner.

Essentially, at the moment Twill decides how much of the total requested memory 
to allocate to heap based on the hardcoded ratio of 0.7, meaning it will 
allocate at least 70% to heap.

There could be applications that use direct memory quite a bit and want to 
allocate more than 30% of total memory as direct memory. 

This is the rationale behind this JIRA.


was (Author: yufeldman):
Sorry for not adding more details soon.

Essentially, at the moment Twill decides how much of the total requested memory 
to allocate to heap based on the hardcoded ratio of 0.7, meaning it will 
allocate at least 70% to heap.

There could be applications that use direct memory quite a bit and want to 
allocate more than 30% of total memory as direct memory. 

This is the rationale behind this JIRA.

> Make ratio between total memory and on-heap memory configurable
> ---
>
> Key: TWILL-216
> URL: https://issues.apache.org/jira/browse/TWILL-216
> Project: Apache Twill
>  Issue Type: Improvement
>  Components: yarn
>        Reporter: Yuliya Feldman
>    Assignee: Yuliya Feldman
>
> As of now ratio between on-heap memory and total memory provided to yarn 
> container is hardcoded to 0.7, so if app running in the container needs more 
> reserved memory than on-heap it is not possible to achieve.
> Suggestion is to make it configurable as well as amount of reserved memory





Re: Bundled jar ability to pick up local jars

2017-02-14 Thread Yuliya Feldman
Let me rephrase the question even further.

What should the structure of a bundled jar be?

It feels like it should be the classes of the main jar plus a lib folder with
additional jars. It cannot be the main jar plus a lib folder with jars, because
in that case the main jar is not really loaded, since only the parent jar (the
one that is defined as "bundled") gets loaded.

Thanks

On Tue, Feb 14, 2017 at 6:54 PM, Yuliya Feldman <yul...@dremio.com> wrote:

>  Sorry,
>
> I probably was not clear. I understand that TwillContainer launcher will
> take classpath into consideration.
> I was more wondering about bundledjar loading - we load it in a separate
> classloader, so everything has to be included into bundled jar, otherwise
> it does not seem to work, as it will be missing dependencies - nothing is
> loaded outside of the jar itself.
>
> Thanks
>
> On Tue, Feb 14, 2017 at 6:39 PM, Terence Yim <cht...@gmail.com> wrote:
>
>> Hi,
>>
>> If a jar is already available on the node, you can use
>> TwillPreparer.withClasspath to include those to the container classpath.
>>
>> Terence
>>
>> Sent from my iPhone
>>
>> > On Feb 14, 2017, at 6:32 PM, Yuliya Feldman <yul...@dremio.com> wrote:
>> >
>> > Hello there,
>> >
>> > I have a question regarding Bundled jar.
>> >
>> > Is there any way I could pick up some jar/config from the node where it
>> > is running, so that it is not prepackaged within the bundled jar itself?
>> >
>> > Thanks
>>
>
>


Re: Bundled jar ability to pick up local jars

2017-02-14 Thread Yuliya Feldman
Sorry,

I probably was not clear. I understand that the TwillContainer launcher will
take the classpath into consideration.
I was wondering more about bundled jar loading: we load it in a separate
classloader, so everything has to be included in the bundled jar. Otherwise
it does not seem to work, because dependencies will be missing; nothing is
loaded from outside the jar itself.
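
To illustrate the isolation described above, here is a minimal sketch using only
JDK classes. It is not Twill's loader: it assumes, as this message does, that the
bundle is loaded by a classloader that does not delegate to the application
classpath. The names IsolatedLoaderSketch, canLoad and visibleInIsolation are
made up for the example.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.URL;
import java.net.URLClassLoader;

public class IsolatedLoaderSketch {

    // Returns true if the given loader can resolve the named class.
    public static boolean canLoad(ClassLoader loader, String className) {
        try {
            loader.loadClass(className);
            return true;
        } catch (ClassNotFoundException e) {
            return false;
        }
    }

    // Checks visibility through a loader that has no jars of its own and a null
    // parent (bootstrap classes only). This stands in for a "separate classloader";
    // it is an assumption for illustration, not how Twill builds its loader.
    public static boolean visibleInIsolation(String className) {
        try (URLClassLoader isolated = new URLClassLoader(new URL[0], null)) {
            return canLoad(isolated, className);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // A JDK class is still reachable through the bootstrap loader.
        System.out.println(visibleInIsolation("java.lang.String"));
        // A class that only exists on the application classpath is not.
        System.out.println(visibleInIsolation(IsolatedLoaderSketch.class.getName()));
    }
}
```

With a loader like this, anything that is not inside the loader's own URLs is
simply invisible, which matches the missing-dependency behaviour described above.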

Thanks

On Tue, Feb 14, 2017 at 6:39 PM, Terence Yim <cht...@gmail.com> wrote:

> Hi,
>
> If a jar is already available on the node, you can use
> TwillPreparer.withClasspath to include those to the container classpath.
>
> Terence
>
> Sent from my iPhone
>
> > On Feb 14, 2017, at 6:32 PM, Yuliya Feldman <yul...@dremio.com> wrote:
> >
> > Hello there,
> >
> > I have a question regarding Bundled jar.
> >
> > Is there any way I could pick up some jar/config from the node where it
> > is running, so that it is not prepackaged within the bundled jar itself?
> >
> > Thanks
>


Bundled jar ability to pick up local jars

2017-02-14 Thread Yuliya Feldman
Hello there,

I have a question regarding Bundled jar.

Is there any way I could pick up some jar/config from the node where it
is running, so that it is not prepackaged within the bundled jar itself?

Thanks


Re: java.lang.IncompatibleClassChangeError: Implementing class

2017-02-10 Thread Yuliya Feldman
This is a problem caused by differing Guava versions.

Twill uses a version of Guava that is different from the one Hadoop uses.

I would highly recommend creating a shaded jar with the Twill libraries in
order to shade Guava.

On Fri, Feb 10, 2017 at 6:52 PM, Matteo Pelati 
wrote:

> 2.7.3
>
> And this is the stack trace:
>
> Exception in thread "main" java.lang.IncompatibleClassChangeError: Implementing class
> at java.lang.ClassLoader.defineClass1(Native Method)
> at java.lang.ClassLoader.defineClass(ClassLoader.java:760)
> at java.security.SecureClassLoader.defineClass(SecureClassLoader.java:142)
> at java.net.URLClassLoader.defineClass(URLClassLoader.java:467)
> at java.net.URLClassLoader.access$100(URLClassLoader.java:73)
> at java.net.URLClassLoader$1.run(URLClassLoader.java:368)
> at java.net.URLClassLoader$1.run(URLClassLoader.java:362)
> at java.security.AccessController.doPrivileged(Native Method)
> at java.net.URLClassLoader.findClass(URLClassLoader.java:361)
> at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
> at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:331)
> at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
> at org.apache.twill.internal.zookeeper.DefaultZKClientService.<init>(DefaultZKClientService.java:98)
> at org.apache.twill.zookeeper.ZKClientService$Builder.build(ZKClientService.java:101)
> at org.apache.twill.yarn.YarnTwillRunnerService.getZKClientService(YarnTwillRunnerService.java:450)
> at org.apache.twill.yarn.YarnTwillRunnerService.<init>(YarnTwillRunnerService.java:164)
> at org.apache.twill.yarn.YarnTwillRunnerService.<init>(YarnTwillRunnerService.java:150)
> at com.dataheaps.beanszoo.utils.BundledJarExample.main(BundledJarExample.java:72)
>
>
> Thanks
> Matteo
>
> On Sat, Feb 11, 2017 at 1:10 AM, Terence Yim  wrote:
>
> > Hi,
> >
> > What is the Hadoop version you are using? And do you have the class name
> > of the incompatible class involved?
> >
> > Terence
> >
> > Sent from my iPhone
> >
> > > On Feb 10, 2017, at 8:11 AM, Matteo Pelati 
> > wrote:
> > >
> > > Hello,
> > >
> > > I'm trying to use Twill in BeansZoo and I'm running into the following
> > > exception when I try to run a basic application:
> > >
> > > java.lang.IncompatibleClassChangeError: Implementing class
> > >
> > > Any hints?
> > >
> > > Thanks
> > > Matteo
> > >
> > > --
> > > Matteo Pelati
> > > Phone: +65-91149676
> > > Skype: matteop1976
> >
>
>
>
> --
> Matteo Pelati
> Phone: +65-91149676
> Skype: matteop1976
>


[jira] [Updated] (TWILL-216) Make ratio between total memory and on-heap memory configurable

2017-02-10 Thread Yuliya Feldman (JIRA)

 [ 
https://issues.apache.org/jira/browse/TWILL-216?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yuliya Feldman updated TWILL-216:
-
Summary: Make ratio between total memory and on-heap memory configurable  
(was: Make ration between total memory and onheap memory configurable)

> Make ratio between total memory and on-heap memory configurable
> ---
>
> Key: TWILL-216
> URL: https://issues.apache.org/jira/browse/TWILL-216
> Project: Apache Twill
>  Issue Type: Improvement
>  Components: yarn
>        Reporter: Yuliya Feldman
>    Assignee: Yuliya Feldman
>
> As of now, the ratio between on-heap memory and the total memory provided to a YARN
> container is hardcoded to 0.7, so an app running in the container that needs more
> reserved memory than on-heap memory cannot be accommodated.
> The suggestion is to make the ratio configurable, as well as the amount of reserved memory.
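
The arithmetic behind this request can be sketched as follows. This is only an
illustration of a configurable ratio, not Twill's actual computation; the class
and method names (HeapRatioSketch, heapMb, nonHeapMb) are hypothetical, and the
container size is an example value.

```java
public class HeapRatioSketch {

    // On-heap size in MB for a container, given a configurable heap-to-total ratio.
    public static int heapMb(int containerMb, double heapRatio) {
        if (containerMb <= 0) {
            throw new IllegalArgumentException("containerMb must be positive");
        }
        if (heapRatio <= 0.0 || heapRatio > 1.0) {
            throw new IllegalArgumentException("heapRatio must be in (0, 1]");
        }
        // Truncate toward zero so the heap never exceeds its share.
        return (int) (containerMb * heapRatio);
    }

    // Whatever is left of the container after the heap is carved out.
    public static int nonHeapMb(int containerMb, double heapRatio) {
        return containerMb - heapMb(containerMb, heapRatio);
    }

    public static void main(String[] args) {
        int containerMb = 4096; // example container size
        // With the fixed 0.7 ratio, less than a third of the container is left off-heap.
        System.out.println(heapMb(containerMb, 0.7) + " / " + nonHeapMb(containerMb, 0.7));
        // A configurable ratio lets an off-heap-heavy app invert the split.
        System.out.println(heapMb(containerMb, 0.25) + " / " + nonHeapMb(containerMb, 0.25));
    }
}
```

With the ratio fixed at 0.7 the non-heap share can never exceed roughly 30% of the
container, which is the limitation the issue describes.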





[jira] [Commented] (TWILL-210) ServiceMain does not handle well URI without authority

2017-01-27 Thread Yuliya Feldman (JIRA)

[ 
https://issues.apache.org/jira/browse/TWILL-210?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15843928#comment-15843928
 ] 

Yuliya Feldman commented on TWILL-210:
--

[~chtyim] Thank you. It was really quick.

> ServiceMain does not handle well URI without authority
> --
>
> Key: TWILL-210
> URL: https://issues.apache.org/jira/browse/TWILL-210
> Project: Apache Twill
>  Issue Type: Bug
>  Components: yarn
>Affects Versions: 0.8.0, 0.9.0
>    Reporter: Yuliya Feldman
>    Assignee: Yuliya Feldman
> Fix For: 0.10.0
>
>
> When deriving defaultFS from a path, ServiceMain does not correctly handle
> FileSystems that do not provide a URI authority.
> E.g. maprfs:///





[jira] [Created] (TWILL-210) ServiceMain does not handle well URI without authority

2017-01-27 Thread Yuliya Feldman (JIRA)
Yuliya Feldman created TWILL-210:


 Summary: ServiceMain does not handle well URI without authority
 Key: TWILL-210
 URL: https://issues.apache.org/jira/browse/TWILL-210
 Project: Apache Twill
  Issue Type: Bug
  Components: yarn
Affects Versions: 0.9.0, 0.8.0
Reporter: Yuliya Feldman


When deriving defaultFS from a path, ServiceMain does not correctly handle
FileSystems that do not provide a URI authority.

E.g. maprfs:///
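
For reference, this is how java.net.URI reports such a path. The sketch below is
not the ServiceMain code; AuthoritySketch and fileSystemRoot are hypothetical
names, and the hdfs host and port are example values. It only shows that the
authority is null for a URI like maprfs:/// and one null-safe way to rebuild the
file system root.

```java
import java.net.URI;

public class AuthoritySketch {

    // Rebuilds "scheme://authority/" from a path URI, treating a missing
    // authority as empty instead of concatenating the string "null".
    public static String fileSystemRoot(URI path) {
        String authority = path.getAuthority();
        return path.getScheme() + "://" + (authority == null ? "" : authority) + "/";
    }

    public static void main(String[] args) {
        URI noAuthority = URI.create("maprfs:///apps/twill");
        URI withAuthority = URI.create("hdfs://namenode:8020/apps/twill");

        // getAuthority() is null when nothing follows the "//".
        System.out.println(noAuthority.getAuthority());
        System.out.println(fileSystemRoot(noAuthority));
        System.out.println(fileSystemRoot(withAuthority));
    }
}
```

Code that assumes the authority is always present would need a null check of this
kind to cope with authority-less file systems.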







[jira] [Updated] (TWILL-203) irrespective of number of CPUs specified in App Config it is always 1

2017-01-11 Thread Yuliya Feldman (JIRA)

 [ 
https://issues.apache.org/jira/browse/TWILL-203?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yuliya Feldman updated TWILL-203:
-
Attachment: rm.log

updated rm.log snippet

> irrespective of number of CPUs specified in App Config it is always 1
> -
>
> Key: TWILL-203
> URL: https://issues.apache.org/jira/browse/TWILL-203
> Project: Apache Twill
>  Issue Type: Bug
>  Components: yarn
>Affects Versions: 0.8.0
>    Reporter: Yuliya Feldman
> Attachments: Screen Shot 2017-01-06 at 10.17.24 AM.png, rm.log, 
> twillclient.log
>
>
> When deploying a Bundled Jar app with the number of CPUs set to more than 1, it
> still defaults to 1 when the application starts.
> The YARN version is 2.7.2 and the Twill version is 0.8. Capacity scheduler.
> From the logs, it looks like all the info is passed through Twill correctly and
> gets "lost" on its way to the RM.
> Please see the attached logs from the Twill client and the RM, and the screenshot.





[jira] [Updated] (TWILL-203) irrespective of number of CPUs specified in App Config it is always 1

2017-01-11 Thread Yuliya Feldman (JIRA)

 [ 
https://issues.apache.org/jira/browse/TWILL-203?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yuliya Feldman updated TWILL-203:
-
Attachment: (was: rm.log)

> irrespective of number of CPUs specified in App Config it is always 1
> -
>
> Key: TWILL-203
> URL: https://issues.apache.org/jira/browse/TWILL-203
> Project: Apache Twill
>  Issue Type: Bug
>  Components: yarn
>Affects Versions: 0.8.0
>    Reporter: Yuliya Feldman
> Attachments: Screen Shot 2017-01-06 at 10.17.24 AM.png, rm.log, 
> twillclient.log
>
>
> When deploying a Bundled Jar app with the number of CPUs set to more than 1, it
> still defaults to 1 when the application starts.
> The YARN version is 2.7.2 and the Twill version is 0.8. Capacity scheduler.
> From the logs, it looks like all the info is passed through Twill correctly and
> gets "lost" on its way to the RM.
> Please see the attached logs from the Twill client and the RM, and the screenshot.





Re: CPU count in Resource Spec

2017-01-10 Thread Yuliya Feldman
Created: https://issues.apache.org/jira/browse/TWILL-203

Thanks

On Tue, Jan 10, 2017 at 11:15 PM, Yuliya Feldman <yul...@dremio.com> wrote:

> sure
>
> will do
>
> Thanks
>
> On Tue, Jan 10, 2017 at 11:11 PM, Terence Yim <cht...@gmail.com> wrote:
>
>> The Apache mailing list probably does not allow attachments. Would you
>> mind creating a JIRA and attaching it to the ticket?
>>
>> https://issues.apache.org/jira/browse/TWILL
>>
>> Terence
>>
>> > On Jan 10, 2017, at 11:09 PM, Yuliya Feldman <yul...@dremio.com> wrote:
>> >
>> > sure
>> >
>> > attached
>> >
>> > Hopefully attachments are allowed
>> >
>> > On Tue, Jan 10, 2017 at 11:07 PM, Terence Yim <cht...@gmail.com> wrote:
>> > Hi,
>> >
>> > I don’t see any attachment in your previous email. Would you mind
>> > attaching it again?
>> >
>> > Terence
>> > > On Jan 10, 2017, at 11:02 PM, Yuliya Feldman <yul...@dremio.com> wrote:
>> > >
>> > > Capacity scheduler
>> > >
>> > > On Tue, Jan 10, 2017 at 11:00 PM, Terence Yim <cht...@gmail.com> wrote:
>> > >
>> > >> Hi,
>> > >>
>> > >> Do you know which resource scheduler YARN is running with?
>> > >>
>> > >> Terence
>> > >>
>> > >>> On Jan 6, 2017, at 10:32 AM, Yuliya Feldman <yul...@dremio.com> wrote:
>> > >>>
>> > >>> Yes, I did
>> > >>>
>> > >>> See attached
>> > >>>
>> > >>> On Fri, Jan 6, 2017 at 10:15 AM, Terence Yim <cht...@gmail.com> wrote:
>> > >>> Hi,
>> > >>>
>> > >>> We've never observed this before. The AM container itself is always
>> > >>> running with 1 vcore. Have you looked at the YARN container info in
>> > >>> the RM UI for the container that has that particular TwillRunnable
>> > >>> running inside?
>> > >>>
>> > >>> Terence
>> > >>>
>> > >>> On Fri, Jan 6, 2017 at 9:54 AM, Yuliya Feldman <yul...@dremio.com> wrote:
>> > >>>
>> > >>>> yes
>> > >>>>
>> > >>>> On Fri, Jan 6, 2017 at 9:40 AM, Terence Yim <cht...@gmail.com> wrote:
>> > >>>>
>> > >>>>> Where do you verify the CPU cores? Was it from the YARN resource
>> > >>>>> manager UI?
>> > >>>>>
>> > >>>>> Terence
>> > >>>>>
>> > >>>>> Sent from my iPhone
>> > >>>>>
>> > >>>>>> On Jan 6, 2017, at 9:01 AM, Yuliya Feldman <yul...@dremio.com> wrote:
>> > >>>>>>
>> > >>>>>> Sorry for so many questions - unfortunately there is not much
>> > >>>>>> information I can fish out in the wild :).
>> > >>>>>>
>> > >>>>>> I noticed that the CPU count set on the Application Specification is
>> > >>>>>> not reflected when starting a container - it is always 1 CPU.
>> > >>>>>>
>> > >>>>>> Did anybody experience this? It really feels like it is lost
>> > >>>>>> somewhere in between Twill and YARN. I am using YARN 2.7.2
>> > >>>>>>
>> > >>>>>> Thanks in advance
>> > >>>>>
>> > >>>>
>> > >>>
>> > >>
>> > >>
>> >
>> >
>>
>>
>


[jira] [Created] (TWILL-203) irrespective of number of CPUs specified in App Config it is always 1

2017-01-10 Thread Yuliya Feldman (JIRA)
Yuliya Feldman created TWILL-203:


 Summary: irrespective of number of CPUs specified in App Config it 
is always 1
 Key: TWILL-203
 URL: https://issues.apache.org/jira/browse/TWILL-203
 Project: Apache Twill
  Issue Type: Bug
  Components: yarn
Affects Versions: 0.8.0
Reporter: Yuliya Feldman
 Attachments: Screen Shot 2017-01-06 at 10.17.24 AM.png, rm.log, 
twillclient.log

When deploying a Bundled Jar app with the number of CPUs set to more than 1, it
still defaults to 1 when the application starts.

The YARN version is 2.7.2 and the Twill version is 0.8. Capacity scheduler.

From the logs, it looks like all the info is passed through Twill correctly and
gets "lost" on its way to the RM.

Please see the attached logs from the Twill client and the RM, and the screenshot.





Re: CPU count in Resource Spec

2017-01-10 Thread Yuliya Feldman
sure

will do

Thanks

On Tue, Jan 10, 2017 at 11:11 PM, Terence Yim <cht...@gmail.com> wrote:

> The Apache mailing list probably does not allow attachments. Would you mind
> creating a JIRA and attaching it to the ticket?
>
> https://issues.apache.org/jira/browse/TWILL
>
> Terence
>
> > On Jan 10, 2017, at 11:09 PM, Yuliya Feldman <yul...@dremio.com> wrote:
> >
> > sure
> >
> > attached
> >
> > Hopefully attachments are allowed
> >
> > On Tue, Jan 10, 2017 at 11:07 PM, Terence Yim <cht...@gmail.com> wrote:
> > Hi,
> >
> > I don’t see any attachment in your previous email. Would you mind
> > attaching it again?
> >
> > Terence
> > > On Jan 10, 2017, at 11:02 PM, Yuliya Feldman <yul...@dremio.com> wrote:
> > >
> > > Capacity scheduler
> > >
> > > On Tue, Jan 10, 2017 at 11:00 PM, Terence Yim <cht...@gmail.com> wrote:
> > >
> > >> Hi,
> > >>
> > >> Do you know which resource scheduler YARN is running with?
> > >>
> > >> Terence
> > >>
> > >>> On Jan 6, 2017, at 10:32 AM, Yuliya Feldman <yul...@dremio.com> wrote:
> > >>>
> > >>> Yes, I did
> > >>>
> > >>> See attached
> > >>>
> > >>> On Fri, Jan 6, 2017 at 10:15 AM, Terence Yim <cht...@gmail.com> wrote:
> > >>> Hi,
> > >>>
> > >>> We've never observed this before. The AM container itself is always
> > >>> running with 1 vcore. Have you looked at the YARN container info in
> > >>> the RM UI for the container that has that particular TwillRunnable
> > >>> running inside?
> > >>>
> > >>> Terence
> > >>>
> > >>> On Fri, Jan 6, 2017 at 9:54 AM, Yuliya Feldman <yul...@dremio.com> wrote:
> > >>>
> > >>>> yes
> > >>>>
> > >>>> On Fri, Jan 6, 2017 at 9:40 AM, Terence Yim <cht...@gmail.com> wrote:
> > >>>>
> > >>>>> Where do you verify the CPU cores? Was it from the YARN resource
> > >>>>> manager UI?
> > >>>>>
> > >>>>> Terence
> > >>>>>
> > >>>>> Sent from my iPhone
> > >>>>>
> > >>>>>> On Jan 6, 2017, at 9:01 AM, Yuliya Feldman <yul...@dremio.com> wrote:
> > >>>>>>
> > >>>>>> Sorry for so many questions - unfortunately there is not much
> > >>>>>> information I can fish out in the wild :).
> > >>>>>>
> > >>>>>> I noticed that the CPU count set on the Application Specification is
> > >>>>>> not reflected when starting a container - it is always 1 CPU.
> > >>>>>>
> > >>>>>> Did anybody experience this? It really feels like it is lost
> > >>>>>> somewhere in between Twill and YARN. I am using YARN 2.7.2
> > >>>>>>
> > >>>>>> Thanks in advance
> > >>>>>
> > >>>>
> > >>>
> > >>
> > >>
> >
> >
>
>


Re: CPU count in Resource Spec

2017-01-10 Thread Yuliya Feldman
sure

attached

Hopefully attachments are allowed

On Tue, Jan 10, 2017 at 11:07 PM, Terence Yim <cht...@gmail.com> wrote:

> Hi,
>
> I don’t see any attachment in your previous email. Would you mind
> attaching it again?
>
> Terence
> > On Jan 10, 2017, at 11:02 PM, Yuliya Feldman <yul...@dremio.com> wrote:
> >
> > Capacity scheduler
> >
> > On Tue, Jan 10, 2017 at 11:00 PM, Terence Yim <cht...@gmail.com> wrote:
> >
> >> Hi,
> >>
> >> Do you know which resource scheduler YARN is running with?
> >>
> >> Terence
> >>
> >>> On Jan 6, 2017, at 10:32 AM, Yuliya Feldman <yul...@dremio.com> wrote:
> >>>
> >>> Yes, I did
> >>>
> >>> See attached
> >>>
> >>> On Fri, Jan 6, 2017 at 10:15 AM, Terence Yim <cht...@gmail.com> wrote:
> >>> Hi,
> >>>
> >>> We've never observed this before. The AM container itself is always
> >>> running with 1 vcore. Have you looked at the YARN container info in
> >>> the RM UI for the container that has that particular TwillRunnable
> >>> running inside?
> >>>
> >>> Terence
> >>>
> >>> On Fri, Jan 6, 2017 at 9:54 AM, Yuliya Feldman <yul...@dremio.com> wrote:
> >>>
> >>>> yes
> >>>>
> >>>> On Fri, Jan 6, 2017 at 9:40 AM, Terence Yim <cht...@gmail.com> wrote:
> >>>>
> >>>>> Where do you verify the CPU cores? Was it from the YARN resource
> >>>>> manager UI?
> >>>>>
> >>>>> Terence
> >>>>>
> >>>>> Sent from my iPhone
> >>>>>
> >>>>>> On Jan 6, 2017, at 9:01 AM, Yuliya Feldman <yul...@dremio.com> wrote:
> >>>>>>
> >>>>>> Sorry for so many questions - unfortunately there is not much
> >>>>>> information I can fish out in the wild :).
> >>>>>>
> >>>>>> I noticed that the CPU count set on the Application Specification is
> >>>>>> not reflected when starting a container - it is always 1 CPU.
> >>>>>>
> >>>>>> Did anybody experience this? It really feels like it is lost
> >>>>>> somewhere in between Twill and YARN. I am using YARN 2.7.2
> >>>>>>
> >>>>>> Thanks in advance
> >>>>>
> >>>>
> >>>
> >>
> >>
>
>


Re: CPU count in Resource Spec

2017-01-10 Thread Yuliya Feldman
Capacity scheduler

On Tue, Jan 10, 2017 at 11:00 PM, Terence Yim <cht...@gmail.com> wrote:

> Hi,
>
> Do you know which resource scheduler YARN is running with?
>
> Terence
>
> > On Jan 6, 2017, at 10:32 AM, Yuliya Feldman <yul...@dremio.com> wrote:
> >
> > Yes, I did
> >
> > See attached
> >
> > On Fri, Jan 6, 2017 at 10:15 AM, Terence Yim <cht...@gmail.com> wrote:
> > Hi,
> >
> > We've never observed this before. The AM container itself is always
> > running with 1 vcore. Have you looked at the YARN container info in
> > the RM UI for the container that has that particular TwillRunnable
> > running inside?
> >
> > Terence
> >
> > On Fri, Jan 6, 2017 at 9:54 AM, Yuliya Feldman <yul...@dremio.com> wrote:
> >
> > > yes
> > >
> > > On Fri, Jan 6, 2017 at 9:40 AM, Terence Yim <cht...@gmail.com> wrote:
> > >
> > > > Where do you verify the CPU cores? Was it from the YARN resource
> > > > manager UI?
> > > >
> > > > Terence
> > > >
> > > > Sent from my iPhone
> > > >
> > > > > On Jan 6, 2017, at 9:01 AM, Yuliya Feldman <yul...@dremio.com> wrote:
> > > > >
> > > > > Sorry for so many questions - unfortunately there is not much
> > > > > information I can fish out in the wild :).
> > > > >
> > > > > I noticed that the CPU count set on the Application Specification is
> > > > > not reflected when starting a container - it is always 1 CPU.
> > > > >
> > > > > Did anybody experience this? It really feels like it is lost
> > > > > somewhere in between Twill and YARN. I am using YARN 2.7.2
> > > > >
> > > > > Thanks in advance
> > > >
> > >
> >
>
>


Re: CPU count in Resource Spec

2017-01-06 Thread Yuliya Feldman
Yes, I did

See attached

On Fri, Jan 6, 2017 at 10:15 AM, Terence Yim <cht...@gmail.com> wrote:

> Hi,
>
> We've never observed this before. The AM container itself is always running
> with 1 vcore. Have you looked at the YARN container info in the RM UI for the
> container that has that particular TwillRunnable running inside?
>
> Terence
>
> On Fri, Jan 6, 2017 at 9:54 AM, Yuliya Feldman <yul...@dremio.com> wrote:
>
> > yes
> >
> > On Fri, Jan 6, 2017 at 9:40 AM, Terence Yim <cht...@gmail.com> wrote:
> >
> > > Where do you verify the CPU cores? Was it from the YARN resource
> > > manager UI?
> > >
> > > Terence
> > >
> > > Sent from my iPhone
> > >
> > > > On Jan 6, 2017, at 9:01 AM, Yuliya Feldman <yul...@dremio.com> wrote:
> > > >
> > > > Sorry for so many questions - unfortunately there is not much
> > > > information I can fish out in the wild :).
> > > >
> > > > I noticed that the CPU count set on the Application Specification is
> > > > not reflected when starting a container - it is always 1 CPU.
> > > >
> > > > Did anybody experience this? It really feels like it is lost
> > > > somewhere in between Twill and YARN. I am using YARN 2.7.2
> > > >
> > > > Thanks in advance
> > >
> >
>


Re: CPU count in Resource Spec

2017-01-06 Thread Yuliya Feldman
yes

On Fri, Jan 6, 2017 at 9:40 AM, Terence Yim <cht...@gmail.com> wrote:

> Where do you verify the CPU cores? Was it from the YARN resource manager
> UI?
>
> Terence
>
> Sent from my iPhone
>
> > On Jan 6, 2017, at 9:01 AM, Yuliya Feldman <yul...@dremio.com> wrote:
> >
> > Sorry for so many questions - unfortunately there is not much
> > information I can fish out in the wild :).
> >
> > I noticed that the CPU count set on the Application Specification is
> > not reflected when starting a container - it is always 1 CPU.
> >
> > Did anybody experience this? It really feels like it is lost somewhere in
> > between Twill and YARN. I am using YARN 2.7.2
> >
> > Thanks in advance
>


CPU count in Resource Spec

2017-01-06 Thread Yuliya Feldman
Sorry for so many questions - unfortunately there is not much information I
can fish out in the wild :).

I noticed that the CPU count set on the Application Specification is not
reflected when starting a container - it is always 1 CPU.

Did anybody experience this? It really feels like it is lost somewhere in
between Twill and YARN. I am using YARN 2.7.2

Thanks in advance


logging within an application

2017-01-06 Thread Yuliya Feldman
Hello there,

I have an issue with logging from my app that runs within a container: no
logging comes out except what goes to System.out.

Here is what I have in the stderr of the container (I am not sure whether the
duplicate StaticLoggerBinder is the issue, but I will certainly try to fix it):

OpenJDK 64-Bit Server VM warning: ignoring option MaxPermSize=512m;
support was removed in 8.0
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in
[jar:file:/var/lib/hadoop/yarn/data/usercache/username/appcache/application_1483475903489_0036/container_1483475903489_0036_01_02/tmp/twill.launcher-1483721015382-0/lib/logback-classic-1.0.13.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in
[jar:file:/opt/hadoop/share/hadoop/common/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type
[ch.qos.logback.classic.util.ContextSelectorStaticBinder]
17/01/06 16:43:37 INFO utils.VerifiableProperties: Verifying properties
17/01/06 16:43:37 INFO utils.VerifiableProperties: Property
metadata.broker.list is overridden to host:40040
17/01/06 16:43:37 INFO utils.VerifiableProperties: Property
request.required.acks is overridden to 1
17/01/06 16:43:37 INFO utils.VerifiableProperties: Property
partitioner.class is overridden to
org.apache.twill.internal.kafka.client.IntegerPartitioner
17/01/06 16:43:37 INFO utils.VerifiableProperties: Property
compression.codec is overridden to snappy
17/01/06 16:43:37 INFO utils.VerifiableProperties: Property
key.serializer.class is overridden to
org.apache.twill.internal.kafka.client.IntegerEncoder
17/01/06 16:43:37 INFO utils.VerifiableProperties: Property
serializer.class is overridden to
org.apache.twill.internal.kafka.client.ByteBufferEncoder
17/01/06 16:43:37 INFO client.ClientUtils$: Fetching metadata from
broker id:0,host:host,port:40040 with correlation id 0 for 1 topic(s)
Set(log)
17/01/06 16:43:37 INFO producer.SyncProducer: Connected to host:40040
for producing
17/01/06 16:43:37 INFO producer.SyncProducer: Disconnecting from host:40040
17/01/06 16:43:38 INFO producer.SyncProducer: Connected host:40040 for producing
SLF4J: Failed to load class "org.slf4j.impl.StaticLoggerBinder".
SLF4J: Defaulting to no-operation (NOP) logger implementation
SLF4J: See http://www.slf4j.org/codes.html#StaticLoggerBinder for
further details.
Jan 06, 2017 4:43:55 PM org.hibernate.validator.internal.util.Version 
INFO: HV01: Hibernate Validator 5.2.4.Final
17/01/06 16:44:02 INFO producer.Producer: Shutting down producer
17/01/06 16:44:02 INFO producer.ProducerPool: Closing all sync producers
17/01/06 16:44:02 INFO producer.SyncProducer: Disconnecting from host:40040



Thanks


Re: Retrieving failure when app terminates abnormally

2017-01-06 Thread Yuliya Feldman
Thank you Terence



On Fri, Jan 6, 2017 at 12:37 AM, Terence Yim <cht...@gmail.com> wrote:

> Hi,
>
> It depends on when the failure happens. If the app has already been submitted
> to YARN successfully and fails afterwards, the failure should still be reflected
> through the Future returned by the terminate() call. However, due to
> https://issues.apache.org/jira/browse/TWILL-180, it is currently not
> reflected correctly.
>
> Terence
>
> > On Jan 6, 2017, at 12:30 AM, Yuliya Feldman <yul...@dremio.com> wrote:
> >
> > Thank you
> >
> > What about retrieving error if it fails to start?
> >
> > On Fri, Jan 6, 2017 at 12:12 AM, Terence Yim <cht...@gmail.com> wrote:
> >
> >> Hi,
> >>
> >> Currently the error can be retrieved via the
> >> `TwillController.terminate().get()` call, as stated in the javadoc of the
> >> `terminate()` method.
> >>
> >> * Calling this method multiple times is allowed and a {@link Future}
> >> representing the termination state
> >> * will be returned.
> >>
> >> Terence
> >>
> >>> On Jan 5, 2017, at 11:17 PM, Yuliya Feldman <yul...@dremio.com> wrote:
> >>>
> >>> Hello there,
> >>>
> >>> I am trying to use async APIs to start/stop a Twill-managed YARN
> >>> application.
> >>>
> >>> I am using the onRunning() and onTerminated() APIs for this, but I don't
> >>> see a way of retrieving an error in case of failure:
> >>>
> >>> public void onTerminated(final Runnable runnable, Executor executor) {
> >>>   this.addListener(new ServiceListenerAdapter() {
> >>>     public void failed(State from, Throwable failure) {
> >>>       runnable.run();
> >>>     }
> >>>
> >>>     public void terminated(State from) {
> >>>       runnable.run();
> >>>     }
> >>>   }, executor);
> >>> }
> >>>
> >>>
> >>> Is there a way of retrieving the "Throwable failure"?
> >>>
> >>> Or am I using the wrong APIs?
> >>>
> >>> Thanks
> >>
> >>
>
>
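
For anyone following the thread: reading the error from the Future returned by
terminate(), as suggested above, follows the usual java.util.concurrent pattern
in which get() throws an ExecutionException whose cause is the original failure.
Below is a minimal sketch with JDK classes only. CompletableFuture stands in for
the future that TwillController.terminate() returns, TerminationFailureSketch and
failureOf are made-up names, and, as noted above, TWILL-180 may prevent the real
future from carrying the failure.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.Future;

public class TerminationFailureSketch {

    // Waits for the termination future and returns the failure cause,
    // or null if the future completed normally.
    public static Throwable failureOf(Future<?> termination) {
        try {
            termination.get();
            return null;
        } catch (ExecutionException e) {
            // The original failure is wrapped; unwrap it for the caller.
            return e.getCause();
        } catch (InterruptedException e) {
            // Preserve the interrupt status and report the interruption itself.
            Thread.currentThread().interrupt();
            return e;
        }
    }

    public static void main(String[] args) {
        // Stand-in for a termination future that failed.
        CompletableFuture<String> failed = new CompletableFuture<>();
        failed.completeExceptionally(new IllegalStateException("container exited abnormally"));
        System.out.println(failureOf(failed));

        // Stand-in for a clean termination.
        System.out.println(failureOf(CompletableFuture.completedFuture("TERMINATED")));
    }
}
```

Since get() blocks, an application using the async listener APIs would probably
call something like failureOf from within the onTerminated callback, when the
future is already complete.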


Re: How/does Twill can survive a restart of TwillClient

2016-12-23 Thread Yuliya Feldman
Thank you very much for the reply
Please see inline


On Fri, Dec 23, 2016 at 11:10 AM, Terence Yim <cht...@gmail.com> wrote:

> Hi,
>
> 1. It really depends on how many resources your application needs.
> Twill simply acts as a bridge between your app and YARN; the YARN
> cluster itself needs to have enough resources (memory and vcores) to run
> your application.
>
I definitely agree that YARN should have the capacity. What I am trying to say
is that if I change my mind and resize a second time before the first request
has been satisfied, I cannot do it. What if I mistyped the number of requested
containers (put 100 instead of 10) and YARN will never have this capacity?
If I change it back to 10, nothing changes until the request for 100 is
satisfied.
If I change back to 10 it won't change it unless 100 is satisfied.

>
> 2. You should be able to do that through the TwillRunner.lookup method. Do
> you mean you tried but it doesn't return anything?
>
TwillRunner.lookup works ONLY if the application that uses TwillRunner.lookup
(in other words, the YARN/Twill client) has NEVER restarted. If it has
restarted, all the information is lost, and I am not sure how to make
TwillRunner obtain it again from the running cluster.

>
> Terence
>
> On Thu, Dec 22, 2016 at 2:20 PM, Yuliya Feldman <yul...@dremio.com> wrote:
>
> > Hello there,
> >
> > I started using Twill recently and I came across a couple of issues I
> > wanted to check on:
> >
> > 1. If I resize the YARN cluster to more capacity than it can handle, I can't
> > resize it back down, because the first request has not been satisfied.
> >
> > 2. If my application that spawns the Twill YARN cluster restarts (meaning
> > I lose the YarnTwillRunnerService), I cannot get hold of the
> > TwillController afterwards, even if I know the runId and so on.
> >
> > Could anybody advise/confirm/deny on the issues I am seeing?
> >
> > Thanks in advance
> >
>