Re: Question for Mesos gurus

2015-08-25 Thread yuliya Feldman
Thank you, Adam.
When you say "but before you upgrade Mesos to 0.23, you should upgrade
your scheduler (and executor) libmesos to 0.22.x", do you mean recompile?
Does this sentence from the upgrade instructions you linked mean the
same thing? "Rebuild and install any modules so that upgraded
masters/slaves can use them"
Thanks,
Yuliya

 From: Adam Bordelon a...@mesosphere.io
 To: dev@myriad.incubator.apache.org; yuliya Feldman yufeld...@yahoo.com
 Sent: Tuesday, August 25, 2015 10:06 AM
 Subject: Re: Question for Mesos gurus

Mesos guarantees forward and backward compatibility by one minor version.
It is expected that you upgrade the entire cluster to one consecutive
version before upgrading any component to the next. So, if your scheduler
jar's libmesos is from 0.21.x, you can upgrade your Mesos master/agents to
0.22.x safely, but before you upgrade Mesos to 0.23, you should upgrade
your scheduler (and executor) libmesos to 0.22.x. See
http://mesos.apache.org/documentation/latest/upgrades/ for other special
notes and recommended upgrade order.
Once we reach Mesos 1.0 (when the new HTTP API stabilizes), then we'll have
stronger guarantees about version compatibility within a major version.
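
The one-minor-version rule above can be sketched as a small check. This is only an illustration of the rule as stated in this message, not a Mesos API: the function names `parse_version`, `within_one_minor`, and `upgrade_steps` are made up for the example, and it assumes plain numeric `major.minor[.patch]` version strings.

```python
def parse_version(version):
    """Parse 'major.minor[.patch]' into a (major, minor, patch) tuple of ints."""
    parts = [int(part) for part in version.split(".")]
    if not 2 <= len(parts) <= 3:
        raise ValueError(f"expected major.minor[.patch], got {version!r}")
    while len(parts) < 3:
        parts.append(0)  # treat a missing patch level as 0
    return tuple(parts)


def within_one_minor(component_version, cluster_version):
    """True if both versions share a major version and differ by at most one minor."""
    c_major, c_minor, _ = parse_version(component_version)
    k_major, k_minor, _ = parse_version(cluster_version)
    return c_major == k_major and abs(c_minor - k_minor) <= 1


def upgrade_steps(current, target):
    """List the minor versions to pass through, one consecutive step at a time."""
    major, minor, _ = parse_version(current)
    t_major, t_minor, _ = parse_version(target)
    if major != t_major or t_minor < minor:
        raise ValueError("sketch only covers upgrades within one major version")
    return [f"{major}.{m}" for m in range(minor + 1, t_minor + 1)]
```

Under these assumptions, a scheduler built against 0.21.1 passes the check for a 0.22.1 cluster but not for a 0.23.0 cluster, and `upgrade_steps("0.21.1", "0.23.0")` lists 0.22 as the intermediate stop.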






  

Re: Question for Mesos gurus

2015-08-25 Thread Adam Bordelon
Yes, you'll have to recompile your scheduler/executor against the latest
libmesos.
In the upgrade guide, this is mentioned as "Upgrade the schedulers by
linking the latest native library / jar / egg (if necessary)."
The modules instructions apply to C++ plugins for the Mesos master/slaves
themselves.







Question for Mesos gurus

2015-08-25 Thread yuliya Feldman
Hello guys,
I have a question about Mesos protobuf compatibility for Myriad.
If I compiled Myriad against Mesos 0.22.1/0.21.1 but the cluster runs
Mesos 0.23, is it supposed to be compatible?
Yesterday our guys came across an exception (see below).
When we switched the jars to mesos-0.21.1, the issue went away.
Thanks,
Yuliya
15/08/24 10:57:40 INFO scheduler.TaskFactory$NMTaskFactoryImpl: yarn.resourcemanager.hostname is set to rm.marathon.mesos via YARN_RESOURCEMANAGER_OPTS. Passing it into YARN_NODEMANAGER_OPTS.
Aug 24, 2015 10:57:40 AM com.lmax.disruptor.FatalExceptionHandler handleEventException
SEVERE: Exception processing: 1 com.ebay.myriad.scheduler.event.ResourceOffersEvent@74a1e0a5
java.lang.NoSuchMethodError: org.apache.mesos.Protos$TaskInfo$Builder.setData(Lcom/google/protobuf/ByteString;)Lorg/apache/mesos/Protos$TaskInfo$Builder;
 at com.ebay.myriad.scheduler.TaskFactory$NMTaskFactoryImpl.createTask(TaskFactory.java:310)
 at com.ebay.myriad.scheduler.event.handlers.ResourceOffersEventHandler.onEvent(ResourceOffersEventHandler.java:98)
 at com.ebay.myriad.scheduler.event.handlers.ResourceOffersEventHandler.onEvent(ResourceOffersEventHandler.java:55)
 at com.lmax.disruptor.BatchEventProcessor.run(BatchEventProcessor.java:128)
 at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
 at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
 at java.lang.Thread.run(Thread.java:745)

15/08/24 10:57:40 ERROR yarn.YarnUncaughtExceptionHandler: Thread Thread[pool-2-thread-3,5,main] threw an Exception.
java.lang.RuntimeException: java.lang.NoSuchMethodError: org.apache.mesos.Protos$TaskInfo$Builder.setData(Lcom/google/protobuf/ByteString;)Lorg/apache/mesos/Protos$TaskInfo$Builder;
 at com.lmax.disruptor.FatalExceptionHandler.handleEventException(FatalExceptionHandler.java:45)
 at com.lmax.disruptor.BatchEventProcessor.run(BatchEventProcessor.java:147)
 at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
 at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
 at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.NoSuchMethodError: org.apache.mesos.Protos$TaskInfo$Builder.setData(Lcom/google/protobuf/ByteString;)Lorg/apache/mesos/Protos$TaskInfo$Builder;
 at com.ebay.myriad.scheduler.TaskFactory$NMTaskFactoryImpl.createTask(TaskFactory.java:310)
 at com.ebay.myriad.scheduler.event.handlers.ResourceOffersEventHandler.onEvent(ResourceOffersEventHandler.java:98)
 at com.ebay.myriad.scheduler.event.handlers.ResourceOffersEventHandler.onEvent(ResourceOffersEventHandler.java:55)
 at com.lmax.disruptor.BatchEventProcessor.run(BatchEventProcessor.java:128)
 ... 3 more
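
For readers less used to JVM notation: the NoSuchMethodError above names the missing method by its JVM method descriptor, meaning the class loaded at runtime has no method with exactly those parameter and return types. The sketch below decodes such a descriptor into readable type names. It is a reading aid only; the helper names `_read_type` and `decode_method_descriptor` are invented for this example and are not part of Mesos or Myriad.

```python
# JVM single-letter codes for primitive types and void.
PRIMITIVES = {
    "B": "byte", "C": "char", "D": "double", "F": "float",
    "I": "int", "J": "long", "S": "short", "Z": "boolean", "V": "void",
}


def _read_type(desc, pos):
    """Read one type starting at desc[pos]; return (readable_name, next_pos)."""
    dims = 0
    while desc[pos] == "[":  # each '[' adds one array dimension
        dims += 1
        pos += 1
    if desc[pos] == "L":  # object type: L<internal/class/Name>;
        end = desc.index(";", pos)
        name = desc[pos + 1:end].replace("/", ".")
        pos = end + 1
    else:
        name = PRIMITIVES[desc[pos]]
        pos += 1
    return name + "[]" * dims, pos


def decode_method_descriptor(desc):
    """Split '(params)return' into (list_of_param_types, return_type)."""
    if not desc.startswith("("):
        raise ValueError(f"not a method descriptor: {desc!r}")
    pos = 1
    params = []
    while desc[pos] != ")":
        name, pos = _read_type(desc, pos)
        params.append(name)
    return_type, _ = _read_type(desc, pos + 1)
    return params, return_type
```

Applied to the descriptor in the trace, this yields one parameter of type `com.google.protobuf.ByteString` and a return type of `org.apache.mesos.Protos$TaskInfo$Builder`, which is the exact `setData` signature the JVM could not find in the Mesos classes on the classpath.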


[jira] [Updated] (MYRIAD-7) Run MyriadMesosScheduler inside YARN's resource manager JVM.

2015-08-25 Thread Santosh Marella (JIRA)

 [ 
https://issues.apache.org/jira/browse/MYRIAD-7?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Santosh Marella updated MYRIAD-7:
-
Issue Type: Improvement  (was: Bug)

 Run MyriadMesosScheduler inside YARN's resource manager JVM.
 

 Key: MYRIAD-7
 URL: https://issues.apache.org/jira/browse/MYRIAD-7
 Project: Myriad
  Issue Type: Improvement
Reporter: Santosh Marella
Assignee: Santosh Marella

 The objective of this change is to run MyriadMesosScheduler inside YARN's
 resource manager JVM.
 1. Added Myriad{Fair,Capacity,Fifo}Scheduler classes that extend YARN's
    {Fair,Capacity,Fifo}Scheduler classes respectively.
 2. Added build dependencies on Hadoop.
 3. Encountered slf4j multiple-binding conflicts because Dropwizard uses the
    logback implementation while YARN uses the slf4j-log4j12 implementation
    of the slf4j API. After discussions with Mohit, we've decided to remove
    the dependencies on Dropwizard.
 4. Refactored the rest of the code to that effect. Retained non-conflicting
    deps like codahale metrics, jackson's
    dataformat/databind/annotations/yaml and hibernate validator.
 5. Added build rules to package the myriad jar and its dependencies into
    the build/libs dir. These DO NOT INCLUDE hadoop jars and their
    dependencies, as the intent is to deploy the myriad jar and its deps
    into an existing YARN installation.
 Build and Deployment guidelines:
 1. From the myriad dir in the local git repo, run ./gradlew jar.
 2. This should produce myriad-0.0.1.jar and other deps under the build/libs dir.
 3. Copy build/libs/*.jar to HADOOP_HOME/share/hadoop/yarn.
 4. Modify HADOOP_HOME/etc/hadoop/yarn-site.xml to have the following entry:
 {code:xml}
   <property>
     <name>yarn.resourcemanager.scheduler.class</name>
     <value>com.ebay.myriad.scheduler.yarn.MyriadFairScheduler</value>
   </property>
 {code}
 5. Restart the Resource Manager process.
 CAVEATS:
 - The REST API to myriad is currently broken as we eliminated the
   dependencies on Dropwizard. We need to bring up a web app for myriad and
   expose the REST API through that.
 - Mesos also requires a native library (libmesos.so). Our build process
   currently does not add that to the myriad jar. We need to manually add it
   to YARN's native lib dir, $HADOOP_HOME/lib/native.
 I'll open tasks for both of the above and track them separately.
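
 Step 4 of the deployment guidelines above (pointing yarn.resourcemanager.scheduler.class at the Myriad scheduler) could be scripted roughly as follows. This is a sketch under assumptions, not part of the Myriad build: the helper names `set_property` and `set_scheduler_class` are invented here, and the code works on the XML text rather than a real file so that it does not assume any particular HADOOP_HOME layout.

```python
import xml.etree.ElementTree as ET

SCHEDULER_KEY = "yarn.resourcemanager.scheduler.class"
MYRIAD_SCHEDULER = "com.ebay.myriad.scheduler.yarn.MyriadFairScheduler"


def set_property(root, name, value):
    """Set a Hadoop-style <property> under root, updating it if it already exists."""
    for prop in root.findall("property"):
        if prop.findtext("name") == name:
            value_el = prop.find("value")
            if value_el is None:
                value_el = ET.SubElement(prop, "value")
            value_el.text = value
            return prop
    # No existing entry: append a new <property> with <name> and <value>.
    prop = ET.SubElement(root, "property")
    ET.SubElement(prop, "name").text = name
    ET.SubElement(prop, "value").text = value
    return prop


def set_scheduler_class(xml_text, scheduler=MYRIAD_SCHEDULER):
    """Return yarn-site.xml text with the scheduler class property set."""
    root = ET.fromstring(xml_text)
    set_property(root, SCHEDULER_KEY, scheduler)
    return ET.tostring(root, encoding="unicode")
```

 In practice one would read the existing yarn-site.xml, pass its contents through `set_scheduler_class`, write the result back, and then restart the Resource Manager as in step 5. Updating in place rather than appending avoids ending up with two conflicting entries for the same property.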



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Reopened] (MYRIAD-7) Run MyriadMesosScheduler inside YARN's resource manager JVM.

2015-08-25 Thread Santosh Marella (JIRA)

 [ 
https://issues.apache.org/jira/browse/MYRIAD-7?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Santosh Marella reopened MYRIAD-7:
--
Assignee: Santosh Marella

 Run MyriadMesosScheduler inside YARN's resource manager JVM.
 

 Key: MYRIAD-7
 URL: https://issues.apache.org/jira/browse/MYRIAD-7
 Project: Myriad
  Issue Type: Bug
Reporter: Santosh Marella
Assignee: Santosh Marella






[jira] [Updated] (MYRIAD-55) Add a API to destroy a Myriad/YARN cluster

2015-08-25 Thread Santosh Marella (JIRA)

 [ 
https://issues.apache.org/jira/browse/MYRIAD-55?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Santosh Marella updated MYRIAD-55:
--
Issue Type: Improvement  (was: Bug)

 Add a API to destroy a Myriad/YARN cluster
 

 Key: MYRIAD-55
 URL: https://issues.apache.org/jira/browse/MYRIAD-55
 Project: Myriad
  Issue Type: Improvement
Reporter: Santosh Marella

 This is similar to the destroy option in Marathon.
 We need a way to distinguish between an accidental death of the
 ResourceManager and an explicit request from an admin to shut down the YARN
 cluster (both the RM and the NMs that were launched by the framework). In
 the former case, Mesos needs to wait until a new instance of the framework
 connects back, and the framework's HA should kick in. In the latter case,
 the framework should tell Mesos that it wants to shut down, and it should
 shut down all the tasks (NMs) that it previously launched.
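
 The two cases described above can be modelled as a small decision function. This is only a sketch of the intended behaviour, not a proposed implementation: `FrameworkState`, `handle_rm_exit`, and the action names are hypothetical, and no real Mesos or Myriad call is made.

```python
from dataclasses import dataclass, field


@dataclass
class FrameworkState:
    """Hypothetical in-memory view of what the framework has launched."""
    running_tasks: set = field(default_factory=set)
    registered: bool = True


def handle_rm_exit(state, explicit_destroy):
    """Return the actions to take when the ResourceManager goes away."""
    actions = []
    if explicit_destroy:
        # Admin asked for teardown: stop every NM task, then leave Mesos for good.
        for task_id in sorted(state.running_tasks):
            actions.append(("kill_task", task_id))
        state.running_tasks.clear()
        state.registered = False
        actions.append(("unregister_framework", None))
    else:
        # Accidental death: keep tasks running and wait for a new instance to reconnect.
        actions.append(("await_failover", None))
    return actions
```

 The point of the sketch is that the accidental path leaves the launched NMs and the framework registration untouched, while the explicit path removes both.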


