[jira] [Updated] (SLIDER-976) Add KOYA test
[ https://issues.apache.org/jira/browse/SLIDER-976?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Thomas Weise updated SLIDER-976:
--------------------------------
    Assignee: (was: Thomas Weise)

> Add KOYA test
> -------------
>
>          Key: SLIDER-976
>          URL: https://issues.apache.org/jira/browse/SLIDER-976
>      Project: Slider
>   Issue Type: Sub-task
>     Reporter: Thomas Weise
>
> Add a test that verifies fault tolerance.

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (SLIDER-977) KOYA integration IP clearance
[ https://issues.apache.org/jira/browse/SLIDER-977?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15108821#comment-15108821 ]

Thomas Weise commented on SLIDER-977:
-------------------------------------

[~elserj] something is wrong with how this was merged. The PR had the correct attribution and history for the KOYA work, but what got pushed looks like the result of a squash merge. I would like to see this fixed.

> KOYA integration IP clearance
> -----------------------------
>
>          Key: SLIDER-977
>          URL: https://issues.apache.org/jira/browse/SLIDER-977
>      Project: Slider
>   Issue Type: Sub-task
>     Reporter: Thomas Weise
>     Assignee: Josh Elser
>      Fix For: Slider 0.91
>
> http://incubator.apache.org/ip-clearance/
[jira] [Commented] (SLIDER-977) KOYA integration IP clearance
[ https://issues.apache.org/jira/browse/SLIDER-977?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15108908#comment-15108908 ]

Thomas Weise commented on SLIDER-977:
-------------------------------------

[~elserj] it's for the Slider community to decide. I personally think it is good to retain history in cases where multiple people have worked on something over a longer period of time, and I don't see this as "one change". Being able to look at a file in git, see how it was modified and who worked on it (and who may be able to help with a question) is something I do care about. But I also see how you may want to cut down on the meaningless commits that get created while working on new features in developer forks.
[jira] [Commented] (SLIDER-977) KOYA integration IP clearance
[ https://issues.apache.org/jira/browse/SLIDER-977?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15108859#comment-15108859 ]

Thomas Weise commented on SLIDER-977:
-------------------------------------

If the work was done by 3 people, then should it not show up like that in the history? I can rebase the original PR onto develop and apply your changes on top of it, but you would need to back out the merge from the develop branch first.
[jira] [Commented] (SLIDER-977) KOYA integration IP clearance
[ https://issues.apache.org/jira/browse/SLIDER-977?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15107453#comment-15107453 ]

Thomas Weise commented on SLIDER-977:
-------------------------------------

[~elserj] this is great news. Once you merge the pull request, I will work on the license headers.
[jira] [Comment Edited] (SLIDER-977) KOYA integration IP clearance
[ https://issues.apache.org/jira/browse/SLIDER-977?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15104182#comment-15104182 ]

Thomas Weise edited comment on SLIDER-977 at 1/18/16 5:14 AM:
--------------------------------------------------------------

http://s.apache.org/YLR
http://mail-archives.apache.org/mod_mbox/incubator-general/201601.mbox/%3C5699DD25.5070802%40apache.org%3E

was (Author: thw):
http://s.apache.org/YLR
http://mail-archives.apache.org/mod_mbox/incubator-general/201601.mbox/browser
[jira] [Commented] (SLIDER-977) KOYA integration IP clearance
[ https://issues.apache.org/jira/browse/SLIDER-977?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15104182#comment-15104182 ]

Thomas Weise commented on SLIDER-977:
-------------------------------------

http://s.apache.org/YLR
http://mail-archives.apache.org/mod_mbox/incubator-general/201601.mbox/browser
[jira] [Commented] (SLIDER-977) KOYA integration IP clearance
[ https://issues.apache.org/jira/browse/SLIDER-977?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15030775#comment-15030775 ]

Thomas Weise commented on SLIDER-977:
-------------------------------------

https://github.com/apache/incubator-slider/pull/3
[jira] [Commented] (SLIDER-977) KOYA integration IP clearance
[ https://issues.apache.org/jira/browse/SLIDER-977?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15005994#comment-15005994 ]

Thomas Weise commented on SLIDER-977:
-------------------------------------

Sure, create the branch and I will target the pull request there. License headers are not a problem; we have done a lot of that lately for Apex.
[jira] [Commented] (SLIDER-977) KOYA integration IP clearance
[ https://issues.apache.org/jira/browse/SLIDER-977?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15005648#comment-15005648 ]

Thomas Weise commented on SLIDER-977:
-------------------------------------

Thanks Josh! I will work on preparing the source to get it ready for import (license headers, etc.) along with the changes [~ste...@apache.org] has asked for. I will then raise a pull request. I will also provide you with the DataTorrent CLA.
[jira] [Created] (SLIDER-974) Integrate KOYA
Thomas Weise created SLIDER-974:
--------------------------------
     Summary: Integrate KOYA
         Key: SLIDER-974
         URL: https://issues.apache.org/jira/browse/SLIDER-974
     Project: Slider
  Issue Type: Task
    Reporter: Thomas Weise
[jira] [Updated] (SLIDER-974) Integrate KOYA
[ https://issues.apache.org/jira/browse/SLIDER-974?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Thomas Weise updated SLIDER-974:
--------------------------------
    Description: Bring in the source from https://github.com/DataTorrent/KOYA
[jira] [Created] (SLIDER-975) Add KOYA to build
Thomas Weise created SLIDER-975:
--------------------------------
     Summary: Add KOYA to build
         Key: SLIDER-975
         URL: https://issues.apache.org/jira/browse/SLIDER-975
     Project: Slider
  Issue Type: Sub-task
    Reporter: Thomas Weise
[jira] [Created] (SLIDER-976) Add KOYA test
Thomas Weise created SLIDER-976:
--------------------------------
     Summary: Add KOYA test
         Key: SLIDER-976
         URL: https://issues.apache.org/jira/browse/SLIDER-976
     Project: Slider
  Issue Type: Sub-task
    Reporter: Thomas Weise

Add a test that verifies fault tolerance.
[jira] [Updated] (SLIDER-977) KOYA integration IP clearance
[ https://issues.apache.org/jira/browse/SLIDER-977?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Thomas Weise updated SLIDER-977:
--------------------------------
    Description: http://incubator.apache.org/ip-clearance/
Re: Integrating KOYA into the Slider code base
https://issues.apache.org/jira/browse/SLIDER-974

I don't have permission to assign tickets. 977 should be assigned to the PPMC, the others to me.

Thomas

On Thu, Nov 5, 2015 at 2:10 PM, Thomas Weise <thomas.we...@gmail.com> wrote:
> Sure, I will file the JIRAs.
>
> Thomas
Re: Integrating KOYA into the Slider code base
Thanks, who will be responsible for putting up the clearance page, the Slider PPMC?

On Thu, Nov 5, 2015 at 8:31 AM, Billie Rinaldi <billie.rina...@gmail.com> wrote:
> See http://incubator.apache.org/ip-clearance/.
>
> On Thu, Nov 5, 2015 at 8:08 AM, Josh Elser <els...@apache.org> wrote:
>> +1 sounds like a great idea!
>>
>> We will likely have some "paperwork" for the code grant (per ASF), but I
>> think that's a very minor headache compared to how much integrating KOYA
>> would benefit Slider.
Re: Integrating KOYA into the Slider code base
Sure, I will file the JIRAs.

Thomas

On Thu, Nov 5, 2015 at 2:08 PM, Josh Elser <els...@apache.org> wrote:
> Yep, that's on us (so I've just read).
>
> Want to include that with the other JIRA issue Steve asked for? We can figure
> out which one of the PPMC will be responsible for it then.
Integrating KOYA into the Slider code base
Dear Slider community,

As you may know, KOYA (Kafka on YARN) was initiated by DataTorrent about a year ago with the aim of offering the option to run a Kafka cluster on the same infrastructure that our YARN-native stream processing platform (now Apache Apex (incubating) - http://apex.incubator.apache.org/) was built for.

The KOYA repository is here: https://github.com/DataTorrent/KOYA

This is to propose the merge of KOYA into the Slider repository. We believe the ASF umbrella is the best option to take the project forward and become a viable option for Kafka deployment.

Please share your thoughts on how we can take this forward.

Thanks,
Thomas
[jira] [Comment Edited] (SLIDER-82) Support ANTI_AFFINITY_REQUIRED option
[ https://issues.apache.org/jira/browse/SLIDER-82?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14963518#comment-14963518 ]

Thomas Weise edited comment on SLIDER-82 at 10/19/15 4:07 PM:
--------------------------------------------------------------

Actually there is: https://issues.apache.org/jira/browse/YARN-1412

was (Author: thw):
Not that I know of. We have been chasing this on CDH without success for some time. https://malhar.atlassian.net/browse/APEX-123

> Support ANTI_AFFINITY_REQUIRED option
> -------------------------------------
>
>          Key: SLIDER-82
>          URL: https://issues.apache.org/jira/browse/SLIDER-82
>      Project: Slider
>   Issue Type: Task
>   Components: appmaster
>     Reporter: Steve Loughran
>     Assignee: Steve Loughran
>      Fix For: Slider 2.0.0
>  Attachments: SLIDER-82.002.patch
>
> Original Estimate: 168h
> Remaining Estimate: 168h
>
> Slider has an anti-affinity flag in roles (visible in resources.json?), which is ignored.
> YARN-1042 promises this for YARN; Slider will need:
> # a flag in resources.json
> # use in container requests
> We may also want two policies: anti-affinity-desired and -required. Then if required nodes get >1 container for the same component type on the same node, it would have to request a new one and return the old one (risk: getting the same one back).
[jira] [Comment Edited] (SLIDER-82) Support ANTI_AFFINITY_REQUIRED option
[ https://issues.apache.org/jira/browse/SLIDER-82?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14963518#comment-14963518 ]

Thomas Weise edited comment on SLIDER-82 at 10/19/15 4:10 PM:
--------------------------------------------------------------

Actually there is: https://issues.apache.org/jira/browse/YARN-1412
There is a separate app to reproduce it linked here: https://malhar.atlassian.net/browse/APEX-123

was (Author: thw):
Actually there is: https://issues.apache.org/jira/browse/YARN-1412
[jira] [Commented] (SLIDER-82) Support ANTI_AFFINITY_REQUIRED option
[ https://issues.apache.org/jira/browse/SLIDER-82?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14963518#comment-14963518 ]

Thomas Weise commented on SLIDER-82:
------------------------------------

Not that I know of. We have been chasing this on CDH without success for some time. https://malhar.atlassian.net/browse/APEX-123
[jira] [Commented] (SLIDER-82) Support ANTI_AFFINITY_REQUIRED option
[ https://issues.apache.org/jira/browse/SLIDER-82?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14957186#comment-14957186 ]

Thomas Weise commented on SLIDER-82:
------------------------------------

Since the new approach relies on requesting specific nodes instead of blacklisting: working on Apache Apex, we have seen issues with requesting specific nodes on CDH up until the last release. We have seen the same issue testing KOYA on CDH with component recovery:
https://mail-archives.apache.org/mod_mbox/incubator-slider-dev/201505.mbox/%3cd1823655.e319%25gs...@hortonworks.com%3E

I can volunteer to test the patch on CDH.
Re: Container recovery not working on CDH with yarn.component.placement.policy=1
Jean,

Curious what your findings will be with the capacity scheduler. The cluster I'm using has the fair scheduler (the CDH default) and we see this issue with other applications as well. It works fine on HDP 2.2.

The resource manager is logging at INFO level only and I cannot change that at this time. There is nothing in the log indicating a problem with the container request.

Thomas

On Tue, May 19, 2015 at 11:14 AM, Jean-Baptiste Note <jbn...@gmail.com> wrote:
> Hi Thomas,
>
> I'm also testing on CDH 5.4, so I'll be able to attempt to duplicate this
> after my vacation (next week). I'm using the capacity scheduler on a secure
> cluster, though.
>
> You should be able to see what's going on by increasing the log verbosity of
> the RM and/or NM -- I don't know if debug level will trace the RPCs, but I
> guess it should. Of course you could also log client side, but you may be
> less familiar with the code.
>
> Kind regards,
> JB
Re: Container recovery not working on CDH with yarn.component.placement.policy=1
All resources are freed up. The AM requests the replacement container and nothing happens after that.

Please see: https://www.dropbox.com/sh/8ub0jedh60cgys4/AACPftofPcdhD5Sb2XADRMTga?dl=0

resources.json:

{
  "schema": "http://example.org/specification/v2.0.0",
  "metadata": { },
  "global": {
    "yarn.container.failure.threshold": "10",
    "yarn.container.failure.window.hours": "1"
  },
  "components": {
    "broker": {
      "yarn.role.priority": "1",
      "yarn.component.instances": "3",
      "yarn.memory": "768",
      "yarn.vcores": "1",
      "yarn.component.placement.policy": "1"
    },
    "slider-appmaster": { }
  }
}

On Wed, May 13, 2015 at 5:03 PM, Gour Saha <gs...@hortonworks.com> wrote:
> Can you check the resources (memory, cpu) available on the host after
> killing the container? Are they freed? Can you hit the RM UI and share what
> you see in the "Cluster Metrics" table for that node?
>
> Also, if possible, please share your resources.json.
>
> -Gour
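For reference, the request arithmetic implied by this resources.json can be checked with a few lines of Python. This is only an illustration of what the AM should be asking for based on the file above, not Slider code:

```python
import json

# The resources.json from the message above, embedded for a self-contained check.
RESOURCES_JSON = """
{
  "schema": "http://example.org/specification/v2.0.0",
  "components": {
    "broker": {
      "yarn.role.priority": "1",
      "yarn.component.instances": "3",
      "yarn.memory": "768",
      "yarn.vcores": "1",
      "yarn.component.placement.policy": "1"
    },
    "slider-appmaster": {}
  }
}
"""

resources = json.loads(RESOURCES_JSON)
broker = resources["components"]["broker"]
instances = int(broker["yarn.component.instances"])
total_mb = instances * int(broker["yarn.memory"])      # 3 * 768 = 2304 MB
total_vcores = instances * int(broker["yarn.vcores"])  # 3 vcores
```

So three broker containers of 768 MB / 1 vcore each are requested; if the cluster metrics show that much capacity free on the node after the kill, the unfilled replacement request points at the scheduler rather than at resources.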
Container recovery not working on CDH with yarn.component.placement.policy=1
We are testing KOYA on CDH 5.4. We see that after killing a container, Slider will, as expected, ask for the same host. The request is never filled and the container cannot be redeployed.

We see this behavior on CDH with DataTorrent also; it looks like a CDH bug. Is anyone else trying to run Slider on CDH seeing the same behavior? Any insight on whether this is a CDH configuration issue or a fair scheduler bug?

Thanks,
Thomas
Re: Packaging new apps
Excellent, I will look at the pull request shortly.

Any thoughts on merging the server properties defined in the Slider config into the server.properties that came with the Kafka archive?

Thomas

On Mon, May 11, 2015 at 8:10 AM, Jean-Baptiste Note <jbn...@gmail.com> wrote:
> Hi Thomas,
>
> This is because the app_container_tag is unique under each resource. Given
> your two brokers are in separate resources BROKER0 and BROKER1, they get
> identical (1) container tags. You should put them in the same resource
> (BROKER), and the numbering will be sequential. No idea how it behaves on
> container restart; however, this is good enough to start and flex a Kafka
> cluster here.
>
> I've sent you a pull request on GitHub showing how I did it. There's no
> pretension of an actual merge, but if you want it, I can amend it for
> inclusion at your leisure.
>
> Kind regards,
> JB
Re: Packaging new apps
Hi Jean,

Indeed we would like to use component instances as you outline. So far, I have not found a way to derive the Kafka server id from the Slider configuration. I checked on my cluster and I find 2 containers using the same app_container_tag in the logs:

u'componentName': u'BROKER1',
u'configurations': {u'BROKER-COMMON': {u'broker.id': u'1', u'zookeeper.connect': u'node26:2181,node27:2181,node28:2181'},
                    u'BROKER0': {u'broker.id': u'0'},
                    u'BROKER1': {u'broker.id': u'1'}},
u'global': {u'app_container_id': u'container_1430350563654_0416_01_03',
            u'app_container_tag': u'1',

u'componentName': u'BROKER0',
u'configurations': {u'BROKER-COMMON': {u'broker.id': u'0', u'zookeeper.connect': u'node26:2181,node27:2181,node28:2181'},
                    u'BROKER0': {u'broker.id': u'0'},
                    u'BROKER1': {u'broker.id': u'1'}},
u'global': {u'app_container_id': u'container_1430350563654_0416_01_09',
            u'app_container_tag': u'1',

Any other ideas how to obtain a component instance index that works across container failures?

Thanks,
Thomas

On Mon, May 11, 2015 at 1:44 AM, Jean-Baptiste Note <jbn...@gmail.com> wrote:
> Hi Thomas,
>
> Thanks a lot for the updates you brought to the main KOYA repository.
>
> I saw and can see you're still declaring a resource for each broker. This is
> painful, as it means modifying your metainfo and possibly resources.json in
> case you want to grow your cluster, say beyond 10 machines :) Wouldn't it
> more logically fit into Slider to declare one server.xml configuration, one
> resource type, and actually flex the application / play with the instance
> count to grow it?
>
> I saw from Gour's comment that you were concerned about unique id
> generation. Maybe using the app_container_tag would be a good starting
> point? For what it's worth, it seemed to work out properly for me.
>
> Kind regards,
> JB
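For illustration: if the brokers were declared under a single resource so that app_container_tag came out sequential, the server id could be derived from the tag. `get_broker_id` is a hypothetical helper sketched under that assumption, not part of KOYA:

```python
def get_broker_id(config):
    """Hypothetical helper: derive a Kafka broker.id from the Slider-provided
    container tag. Assumes all brokers are declared under one resource so that
    Slider assigns sequential tags 1, 2, 3, ... (an assumption this thread is
    still probing, especially across container restarts)."""
    tag = int(config['global']['app_container_tag'])
    return tag - 1  # Kafka broker ids conventionally start at 0

# Dictionary shaped like the agent config dump in the message above.
config = {'global': {'app_container_tag': '1'}}
broker_id = get_broker_id(config)
```

Whether the tag survives a container failure with the same value (so that the recovered broker keeps its id and its log directories stay valid) is exactly the open question above.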
Re: Packaging new apps
In order to work for different Kafka versions, it would be nice to pick up whatever server.properties the archive comes with and apply all the properties that are defined in server.xml on top of it. Does that work for you? We can look into making that merge work then.

Everything else looks great, thanks for the pull request!

Thomas

On Mon, May 11, 2015 at 8:21 AM, Jean-Baptiste Note <jbn...@gmail.com> wrote:
> There's a remark on the pull request about this, with more details than in
> this mail, but basically:
>
> * Other apps seem to regenerate the config files directly through a template
> rather than try to do a merge (you seem to be doing a sed on defined
> properties; however, it does not work here, maybe a Python version issue?),
> so that's what I did for server.properties.
>
> Where I come from we use Chef and redefine all configuration files anyway,
> so I was thinking of duplicating a standard configuration file in the
> appConfig-default.json (kind of duplicated from the tarball -- again, all
> other packaged apps are doing it like this), and using Chef to regenerate
> all the appConfig.json in order to deploy infrastructure Kafka (and let
> users do whatever they wish based on the defaults).
>
> Kind regards,
> JB
Re: Packaging new apps
Jean,

We pulled in your changes and added modifications on top of them. It appears we agree that we should not force the user to redefine the default values that ship with server.properties. Please see whether the properties merge as implemented works in your environment or not; if not, what is the Python version?

We can find an alternative to the in-place edit of server.properties if and when needed. The file is an argument to the start script, hence we can do a copy before the merge if necessary.

Thomas

On Mon, May 11, 2015 at 3:26 PM, hsy...@gmail.com <hsy...@gmail.com> wrote:
> Hi Jean,
>
> Thanks for the change. Using the instance tag (is it a new feature in the
> latest version? I didn't see it in older Slider versions) is a really good
> idea.
>
> It might be good for other apps to have a template, but not for Kafka.
> Kafka is evolving at quite a fast pace; I've seen many property keys/values
> change in the last several releases. Our method is to keep most properties
> at their defaults and only override the ones declared in appConfig.json,
> which is actually supported in the current Python script (maybe it needs
> some changes for the latest Slider). And a Kafka broker is bound to its
> local disk once it's launched, so in the real world there would be at most
> one instance per NM.
>
> Best,
> Siyuan
>
> On Mon, May 11, 2015 at 10:16 AM, Jean-Baptiste Note <jbn...@gmail.com> wrote:
>> Hi Thomas,
>>
>> According to Kafka's documentation:
>> http://kafka.apache.org/07/configuration.html
>> there should be a default value for any added property; I would expect the
>> provided server.properties file to actually reflect those default values.
>> Therefore, I'd look twice before overconstraining the problem, and would
>> just generate the file for those and only those dictionary values that
>> have been set in the appConfig (which currently my code does not -- it
>> configures too many properties statically, but that can be arranged),
>> relying on the default properties for the rest.
>>
>> If there's really a case for having all properties at hand, I could:
>> * parse the properties file provided in the tarball
>> * re-generate the whole conf file with the parsed values + overrides
>>
>> This would allow for *added* properties (which the current schemes, either
>> mine or yours, do not look to allow) AND ultimately allow the whole
>> tarball installation to be switched to read-only (which could allow it to
>> be shared among instances running on the same NM; I don't know if Slider
>> currently does this kind of optimization). Maybe guidance from people more
>> familiar with Slider than us would be needed here :)
>>
>> Kind regards,
>> JB
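JB's two-step scheme (parse the server.properties shipped in the tarball, then regenerate the whole file with overrides applied on top) can be sketched as follows. The sample keys and values are illustrative, not taken from an actual Kafka release:

```python
def merge_properties(default_text, overrides):
    # Parse a Java-style .properties file into a dict, skipping blank lines
    # and comments, then apply the overrides on top. Keys that exist only in
    # the overrides are appended, so *added* properties work too.
    props = {}
    for line in default_text.splitlines():
        line = line.strip()
        if not line or line.startswith('#'):
            continue
        key, _, value = line.partition('=')
        props[key.strip()] = value.strip()
    props.update(overrides)
    return '\n'.join('%s=%s' % (k, v) for k, v in sorted(props.items()))

# Illustrative defaults standing in for the tarball's server.properties.
defaults = """
# defaults as shipped (sample)
broker.id=0
log.dirs=/tmp/kafka-logs
num.io.threads=8
"""

merged = merge_properties(defaults, {'broker.id': '2',
                                     'zookeeper.connect': 'node26:2181'})
```

Because the defaults are read from whatever file the archive ships, the same code works across Kafka versions, and properties not mentioned in appConfig keep their shipped values.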
Re: Reading component configuration from application script
Wow, looks like fun :-) Certainly Slider will benefit from more flexibility, for example sharing configuration between some of the components, but not all of them.

But for now, can someone shed some light on how to access the resolved component configuration from the .py script? Let's take the following example from appConfig.json:

{
  "metadata": { },
  "global": { },
  "components": {
    "BROKER0": { "broker.id": "0" },
    "BROKER1": { "broker.id": "1" },
    "slider-appmaster": { "jvm.heapsize": "256M" }
  }
}

How do I pull out broker.id (it is not part of the config dictionary)?

On Sun, May 10, 2015 at 9:16 AM, Steve Loughran <ste...@hortonworks.com> wrote:
> On 10 May 2015, at 05:23, Thomas Weise <thomas.we...@gmail.com> wrote:
>> Also, the configuration docs describe property inheritance and resolution:
>> http://slider.incubator.apache.org/docs/configuration/core.html
>> I could not find this used (other than for the app master) in existing
>> packages? Is there an example that also shows how these values are
>> accessed from the script?
>
> I'm not sure that by the time the .py clients get to see things, they get
> to see the raw data, just the resolved stuff. That's something to look at
> in the future: there's no reason not to expose it.
>
> The other thing we have always talked about is cross-referencing in the
> .json config files, both to other bits of the specification (including
> between appconf.json and resources.json) and with some late-binding env
> variables (so that you really can get the env.PATH at the far end). That
> gets complex very, very fast, introduces loops, and leads to very hard to
> debug configurations. And once you start adding more features, there's the
> leap from declarative to full Turing-equivalence. We've left things dumbed
> down (for now).
>
> However, did you notice the Google Borg paper hinted at their config
> language, with 100K+ lines of config? That's one serious deployment.
> Here's the only public documentation of their GCL:
> http://alexandria.tue.nl/extra1/afstversl/wsk-i/bokharouss2008.pdf
Re: Reading component configuration from application script
Also, the configuration docs describe property inheritance and resolution: http://slider.incubator.apache.org/docs/configuration/core.html I could not find this used (other than for the app master) in existing packages? Is there an example that also shows how these values are accessed from the script?

Thanks,
Thomas

On Sat, May 9, 2015 at 4:28 PM, Thomas Weise thomas.we...@gmail.com wrote:

Slider 0.80.0
https://issues.apache.org/jira/browse/SLIDER-812 Making component configurations in appConfig available on the SliderAgent side

I'm trying to access the component configuration from appConfig.json. For global settings, we have:

config = Script.get_config()
componentName = config['componentName']
myKey = config['configurations']['myKey']

Dumping the dictionary does not show a component configuration. How am I supposed to access it? Thanks.
Reading component configuration from application script
Slider 0.80.0
https://issues.apache.org/jira/browse/SLIDER-812 Making component configurations in appConfig available on the SliderAgent side

I'm trying to access the component configuration from appConfig.json. For global settings, we have:

config = Script.get_config()
componentName = config['componentName']
myKey = config['configurations']['myKey']

Dumping the dictionary does not show a component configuration. How am I supposed to access it? Thanks.
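One way to see exactly which keys the agent does expose is to dump the whole dictionary, as the poster did. A sketch using a stand-in dict in place of the real Script.get_config() result (the nesting of a 'global' section shown here is an assumption for illustration):

```python
import json

# Stand-in for the dict returned by Script.get_config() in a Slider
# agent script; the real contents depend on the package and appConfig.
config = {
    'componentName': 'BROKER0',
    'configurations': {'global': {'myKey': 'value'}},
}

# Pretty-print the resolved config to see which sections are present.
print(json.dumps(config, indent=2, sort_keys=True))

# With this (assumed) layout, a global setting is read like so:
my_key = config['configurations']['global']['myKey']
```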
Re: Packaging new apps
Jean,

You will see updates in the KOYA repository soon. As part of that we will move up to the latest release of Slider and also document the configuration process.

Thanks,
Thomas

On Thu, May 7, 2015 at 5:52 PM, Gour Saha gs...@hortonworks.com wrote:

Hi Jean, Please see answers inline. -Gour

On 5/6/15, 6:16 AM, Jean-Baptiste Note jbn...@gmail.com wrote:

Hi folks, Currently we're using Chef in our organization to deploy a lot of infrastructure services around Hadoop. Of course it makes a lot of sense to offer these as self-services on YARN using Slider, but I'm looking at a number of challenges, so please forgive the broad range of questions :) I'm specifically interested in deploying the following applications:

* HTTPFS service (see https://github.com/jbnote/httpfs-slider) + helpers (nginx)
* OpenTSDB + helpers (varnish)
* Kafka (I had a look at KOYA)
* Druid
* Storm (fine, thanks!)
* HBase (fine, thanks!)

I'm facing a lot of issues with those services which are not yet packaged correctly: httpfs/opentsdb are not released as standalone tarballs, contrary to all services currently packaged. So I've butchered a tarball from Cloudera RPMs, which is not satisfactory. How would you go about handling this?

Not sure exactly what you mean by "handling this". If you are referring to a way to create a Slider package of an app in rpm format, then there are challenges, such as rpm install requiring root access, which YARN does not allow. If you are referring to an issue you are facing with deploying the Slider app (now that you have created a tarball), can you share what issues you are facing? You might also want to take a look at this Tomcat Slider package. Caution: it is not ready for prime-time and has a few issues which need to be resolved, but the scripts and metadata files might be a helpful reference.
https://issues.apache.org/jira/browse/SLIDER-809
https://github.com/apache/incubator-slider/tree/feature/SLIDER-809-tomcat-app-package/app-packages/tomcat

* KOYA has been talked about a lot, however the source I'm looking at (https://github.com/DataTorrent/koya) is kind of disappointing, and activity is a bit low -- would anyone know if DataTorrent is still committed to the project?

What issues are you facing with KOYA? DataTorrent gave a presentation of KOYA and Slider seems to have fit their needs so far. They wanted a few features around data locality (strict placement), which will be in the 0.80.0 release, AND unique ids, which still need some work to be done.

Last but not least, I'm wondering if there would already be a plan to expose the registry somehow (through an internal or an external service) through DNS (that's what we really use for service location for HTTPFS and OpenTSDB). A bash polling script would certainly be sufficient for our needs for now, but longer-term we'd need a more robust solution.

The registry and the REST APIs on the registry come directly from YARN:
https://issues.apache.org/jira/browse/YARN-913
https://issues.apache.org/jira/browse/YARN-2948
http://hadoop.apache.org/docs/current/hadoop-yarn/hadoop-yarn-site/registry/yarn-registry.html

Thanks a lot, kind regards, JB
Re: [DISCUSS] SLIDER-799 terminology: escalate vs relax
We are escalating the placement process because we cannot get the resource we are looking for by means of relaxing the locality constraint.

On Thu, Mar 19, 2015 at 9:25 AM, Sumit Mohanty smoha...@hortonworks.com wrote:

In conjunction with placement, "relax" seems more appropriate. Also, this feature is alluding to relaxing placement constraints.

From: Steve Loughran ste...@hortonworks.com
Sent: Thursday, March 19, 2015 9:20 AM
To: dev@slider.incubator.apache.org
Subject: [DISCUSS] SLIDER-799 terminology: escalate vs relax

I'm documenting the SLIDER-799 changes and its property names. This is our chance to get the terminology right. Should I call the action of going from a specific host to anywhere one of:

[ ] Escalation
[ ] Relaxation

I've been using 'escalation', but think 'relaxation' may be better; we can have properties like 'yarn.placement.relax.time.seconds'. Which do people prefer? Or does anyone have a better term?
Re: Anti-affinity
Absent YARN support, would it make sense to add logic in the Slider AM that ensures anti-affinity? It would be possible to do this by requesting specific hosts if the initial AM response does not return all containers on different hosts.

Thanks,
Thomas

On Tue, Mar 17, 2015 at 11:05 AM, Ted Yu yuzhih...@gmail.com wrote:

Support from YARN is needed. Please see https://issues.apache.org/jira/browse/YARN-1042

Cheers

On Tue, Mar 17, 2015 at 1:56 AM, Krishna Kishore Bonagiri write2kish...@gmail.com wrote:

Hi, Is there any plan to implement anti-affinity? I know it would obviously depend on the support for it from YARN. Can it be expected in the near future? Do you guys know if YARN has any plans for it?

Thanks, Kishore
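The Slider AM is Java, but the workaround proposed above is easy to sketch: after the initial allocation, detect instances that landed on an already-used host and re-request them on specific fresh hosts. All names here are illustrative, not Slider code:

```python
# Illustrative sketch of AM-side anti-affinity enforcement (hypothetical
# helper, not part of Slider): given where the initial containers landed,
# pick unused hosts to re-request the co-located instances on.

def pick_antiaffine_hosts(allocated_hosts, candidate_hosts, needed):
    """Choose `needed` hosts not yet running an instance."""
    used = set(allocated_hosts)
    fresh = [h for h in candidate_hosts if h not in used]
    return fresh[:needed]

# Two instances landed on node1, so one must be relocated:
allocated = ['node1', 'node1', 'node2']
cluster = ['node1', 'node2', 'node3', 'node4']
duplicates = len(allocated) - len(set(allocated))
print(pick_antiaffine_hosts(allocated, cluster, duplicates))
```

As Ted notes, this only approximates real anti-affinity: YARN may still refuse or delay the host-specific requests, which is why YARN-1042 support is the clean solution.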
Reading properties from appConfig.json
Hello, Is there an example of how to read component properties from appConfig.json in the agent Python script? Our current code to read the config is here:

https://github.com/DataTorrent/koya/blob/master/koya-slider-package/package/scripts/params.py

And we would like to read the Kafka broker.id from the per-broker config block rather than flatten it into global scope, as we currently do:

https://github.com/DataTorrent/koya/blob/master/koya-slider-package/appConfig-default.json

It would then look like this:

"components": {
  "BROKER0": { "broker.id": "0" },
  "BROKER1": { "broker.id": "1" },
  "slider-appmaster": { "jvm.heapsize": "256M" }
}

Thanks, Thomas
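Until per-component blocks are readable from the agent script, one possible workaround (sketched here with a hypothetical helper, not taken from the KOYA scripts) is to derive broker.id from the component name itself, since the names BROKER0, BROKER1 already encode the id:

```python
import re

# Hedged workaround sketch: derive the Kafka broker.id from the Slider
# component name (BROKER0 -> 0, BROKER1 -> 1). The naming convention
# matches the appConfig example above; the helper itself is hypothetical.

def broker_id_from_component(component_name):
    match = re.match(r'BROKER(\d+)$', component_name)
    if match is None:
        raise ValueError('unexpected component name: %s' % component_name)
    return int(match.group(1))

print(broker_id_from_component('BROKER1'))
```

This avoids flattening broker.id into the global section, at the cost of coupling the id to the component naming scheme.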
[jira] [Commented] (SLIDER-688) Zero touch install support
[ https://issues.apache.org/jira/browse/SLIDER-688?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14273233#comment-14273233 ] Thomas Weise commented on SLIDER-688: - Using yarn.sh and passing down the 2 options would be better than having to reconfigure HADOOP_CONF_DIR and JAVA_HOME.

Zero touch install support
Key: SLIDER-688
URL: https://issues.apache.org/jira/browse/SLIDER-688
Project: Slider
Issue Type: Improvement
Affects Versions: Slider 0.60
Reporter: Thomas Weise
Assignee: Sumit Mohanty

Currently the user needs to specify environment variables such as HADOOP_CONF_DIR and JAVA_HOME. Typically the environment has Hadoop installed and distros have already provided the dependencies. The user should not have to configure this for Slider.

-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (SLIDER-747) Expose component specific information through Slider AM Web Service
Thomas Weise created SLIDER-747:

Summary: Expose component specific information through Slider AM Web Service
Key: SLIDER-747
URL: https://issues.apache.org/jira/browse/SLIDER-747
Project: Slider
Issue Type: Improvement
Components: appmaster
Reporter: Thomas Weise

It should be possible for the agent to pass on stats/counters as part of the heartbeat and for those to be made available through the Slider AM REST API. The information could be an opaque JSON object that is not interpreted by the AM but available to the client. This can be used in Slider apps to provide stats without reinventing the heartbeat protocol and web service.
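A sketch of the idea in the ticket: the agent attaches an opaque JSON blob to its heartbeat, and the AM relays it unparsed to the REST API. Every name below is hypothetical; this is not the Slider agent protocol:

```python
import json

# Hypothetical illustration of SLIDER-747: the agent serializes its
# app-specific stats into an opaque string; the AM never parses the
# 'stats' field, it only stores and serves it back to clients.

def build_heartbeat(component, instance, stats):
    return {
        'component': component,
        'instance': instance,
        # opaque payload: interpreted only by the application's client
        'stats': json.dumps(stats),
    }

hb = build_heartbeat('BROKER0', 'container_001',
                     {'messages_in': 1200, 'bytes_out': 845312})
print(hb['stats'])
```

Keeping the payload opaque is what lets the AM stay application-agnostic: only the producing agent and the consuming client need to agree on the JSON structure.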
[jira] [Updated] (SLIDER-688) Zero touch install support
[ https://issues.apache.org/jira/browse/SLIDER-688?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thomas Weise updated SLIDER-688: Summary: Zero touch install support (was: Zero touch install option)

Zero touch install support
Key: SLIDER-688
URL: https://issues.apache.org/jira/browse/SLIDER-688
Project: Slider
Issue Type: Improvement
Affects Versions: Slider 0.60
Reporter: Thomas Weise

Currently the user needs to specify environment variables such as HADOOP_CONF_DIR and JAVA_HOME. Typically the environment has Hadoop installed and distros have already provided the dependencies. The user should not have to configure this for Slider.
[jira] [Commented] (SLIDER-665) Allow extensibility of the Slider AM Web UI to provide application specific end points
[ https://issues.apache.org/jira/browse/SLIDER-665?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14225584#comment-14225584 ] Thomas Weise commented on SLIDER-665: - Yes, we are looking to follow the conventions for YARN AM web services. The RM proxy exposes a single, secure endpoint. For KOYA, we would like to leverage the web service in the AM to aggregate stats and make them available to the end user. The lack of full REST support in YARN is a known problem (YARN-156) that we assume will be addressed. The workaround we employ is to get the actual endpoint through the application report and connect directly for non-GET requests. We would also like the ability to send information to the agent via the web service - ideally, support for custom data in the heartbeat would be bidirectional.

Allow extensibility of the Slider AM Web UI to provide application specific end points
Key: SLIDER-665
URL: https://issues.apache.org/jira/browse/SLIDER-665
Project: Slider
Issue Type: Task
Components: appmaster, Web REST
Affects Versions: Slider 0.60
Reporter: Sumit Mohanty
Priority: Critical

The Slider AppMaster UI provides a REST end point for various metadata related to the application and general purpose application status. Applications can also explicitly export config and URLs (any data for that matter), and that is also available through the REST end point. What is not possible is for the application to report back custom data sets at regular intervals and have them available through the AM REST endpoint. The advantage of such support would be for applications that do not need a comprehensive web service and can extend the AppMaster REST endpoint to provide application specific data.
_It's worth investigating whether the REST API should be read-only or is also expected to support PUT and POST._ Such a feature would mean:

* Allow agents to send custom stats to the AM (arbitrary JSON structure)
* Make per-component-instance JSON stats available through the Slider AM Web Service