Re: Packaging new apps
I might be wrong, but I sense there is a requirement here where Slider needs to accept custom, application-specific config files in their original raw formats (properties, XML, JSON, YAML, etc.) in addition to appConfig.json, and is then expected to merge them with appConfig.json and send the complete property bag down to the application containers. If that is true, or even if I got it all wrong, it would be great if you could file JIRAs for what you are looking for. It is good to have these kinds of gaps and ideas captured in JIRAs, so that we can make Slider better.

Siyuan, the instance tag feature has been there since 0.60. Check https://issues.apache.org/jira/browse/SLIDER-463.

-Gour

On 5/11/15, 10:41 PM, "Thomas Weise" wrote:

>Jean,
>
>We pulled in your changes and added modifications on top of them. It
>appears we agree that we should not force the user to redefine the default
>values that ship with server.properties. Please see whether the properties
>merge as implemented works in your environment or not. If not, what is the
>Python version?
>
>We can find an alternative solution to in-place editing of
>server.properties if and when needed. The file is an argument to the start
>script, hence we can do a copy before the merge if necessary.
>
>Thomas
>
>
>On Mon, May 11, 2015 at 3:26 PM, hsy...@gmail.com wrote:
>
>> Hi Jean,
>>
>> Thanks for the change. Using an instance tag (is it a new feature in the
>> latest version? I didn't see it in the older Slider versions) is a
>> really good idea. It might be good for others to have a template, but
>> not for Kafka. Kafka is evolving at quite a fast pace; I've seen many
>> property key/value changes in the last several releases. Our method is
>> to keep most properties at their defaults and only override the ones
>> declared in appConfig.json, which is actually supported in the current
>> Python script (maybe with some changes needed for the latest Slider).
>>
>> And a Kafka broker is bound to its local disk once it's launched, so in
>> the real world there would be at most one instance per NM.
>>
>> Best,
>> Siyuan
>>
>>
>>
>> On Mon, May 11, 2015 at 10:16 AM, Jean-Baptiste Note wrote:
>>
>> > Hi Thomas,
>> >
>> > According to Kafka's documentation
>> > (http://kafka.apache.org/07/configuration.html) there should be a
>> > default value for any added property; I would expect the provided
>> > server.properties file to actually reflect those default values.
>> > Therefore, I'd look twice before overconstraining the problem, and
>> > would just generate the file for those and only those dictionary
>> > values that have been set in the appConfig (which my code currently
>> > does not do -- it configures too many properties statically, but that
>> > can be arranged), relying on the default properties for the rest.
>> >
>> > If there's really a case for having all properties at hand, I could:
>> > * parse the properties file provided in the tarball
>> > * re-generate the whole conf file with the parsed values + overrides
>> >
>> > This would allow for *added* properties (which the current schemes,
>> > either mine or yours, do not appear to allow) AND, ultimately, allow
>> > the whole tarball installation to be switched to read-only (which
>> > could let it be shared among instances running on the same NM; I
>> > don't know if Slider currently does this kind of optimization).
>> >
>> > Maybe guidance from people more familiar with Slider than us would be
>> > needed here :)
>> >
>> > Kind regards,
>> > JB
Re: Packaging new apps
Jean,

We pulled in your changes and added modifications on top of them. It appears we agree that we should not force the user to redefine the default values that ship with server.properties. Please see whether the properties merge as implemented works in your environment or not. If not, what is the Python version?

We can find an alternative solution to in-place editing of server.properties if and when needed. The file is an argument to the start script, hence we can do a copy before the merge if necessary.

Thomas

On Mon, May 11, 2015 at 3:26 PM, hsy...@gmail.com wrote:

> Hi Jean,
>
> Thanks for the change. Using an instance tag (is it a new feature in the
> latest version? I didn't see it in the older Slider versions) is a really
> good idea. It might be good for others to have a template, but not for
> Kafka. Kafka is evolving at quite a fast pace; I've seen many property
> key/value changes in the last several releases. Our method is to keep
> most properties at their defaults and only override the ones declared in
> appConfig.json, which is actually supported in the current Python script
> (maybe with some changes needed for the latest Slider).
>
> And a Kafka broker is bound to its local disk once it's launched, so in
> the real world there would be at most one instance per NM.
>
> Best,
> Siyuan
>
>
>
> On Mon, May 11, 2015 at 10:16 AM, Jean-Baptiste Note wrote:
>
> > Hi Thomas,
> >
> > According to Kafka's documentation
> > (http://kafka.apache.org/07/configuration.html) there should be a
> > default value for any added property; I would expect the provided
> > server.properties file to actually reflect those default values.
> > Therefore, I'd look twice before overconstraining the problem, and
> > would just generate the file for those and only those dictionary values
> > that have been set in the appConfig (which my code currently does not
> > do -- it configures too many properties statically, but that can be
> > arranged), relying on the default properties for the rest.
> >
> > If there's really a case for having all properties at hand, I could:
> > * parse the properties file provided in the tarball
> > * re-generate the whole conf file with the parsed values + overrides
> >
> > This would allow for *added* properties (which the current schemes,
> > either mine or yours, do not appear to allow) AND, ultimately, allow
> > the whole tarball installation to be switched to read-only (which could
> > let it be shared among instances running on the same NM; I don't know
> > if Slider currently does this kind of optimization).
> >
> > Maybe guidance from people more familiar with Slider than us would be
> > needed here :)
> >
> > Kind regards,
> > JB
Re: Packaging new apps
Hi Jean,

Thanks for the change. Using an instance tag (is it a new feature in the latest version? I didn't see it in the older Slider versions) is a really good idea. It might be good for others to have a template, but not for Kafka. Kafka is evolving at quite a fast pace; I've seen many property key/value changes in the last several releases. Our method is to keep most properties at their defaults and only override the ones declared in appConfig.json, which is actually supported in the current Python script (maybe with some changes needed for the latest Slider).

And a Kafka broker is bound to its local disk once it's launched, so in the real world there would be at most one instance per NM.

Best,
Siyuan

On Mon, May 11, 2015 at 10:16 AM, Jean-Baptiste Note wrote:

> Hi Thomas,
>
> According to Kafka's documentation
> (http://kafka.apache.org/07/configuration.html) there should be a default
> value for any added property; I would expect the provided
> server.properties file to actually reflect those default values.
> Therefore, I'd look twice before overconstraining the problem, and would
> just generate the file for those and only those dictionary values that
> have been set in the appConfig (which my code currently does not do -- it
> configures too many properties statically, but that can be arranged),
> relying on the default properties for the rest.
>
> If there's really a case for having all properties at hand, I could:
> * parse the properties file provided in the tarball
> * re-generate the whole conf file with the parsed values + overrides
>
> This would allow for *added* properties (which the current schemes,
> either mine or yours, do not appear to allow) AND, ultimately, allow the
> whole tarball installation to be switched to read-only (which could let
> it be shared among instances running on the same NM; I don't know if
> Slider currently does this kind of optimization).
>
> Maybe guidance from people more familiar with Slider than us would be
> needed here :)
>
> Kind regards,
> JB
Re: Packaging new apps
Hi Thomas,

According to Kafka's documentation (http://kafka.apache.org/07/configuration.html) there should be a default value for any added property; I would expect the provided server.properties file to actually reflect those default values. Therefore, I'd look twice before overconstraining the problem, and would just generate the file for those and only those dictionary values that have been set in the appConfig (which my code currently does not do -- it configures too many properties statically, but that can be arranged), relying on the default properties for the rest.

If there's really a case for having all properties at hand, I could:
* parse the properties file provided in the tarball
* re-generate the whole conf file with the parsed values + overrides

This would allow for *added* properties (which the current schemes, either mine or yours, do not appear to allow) AND, ultimately, allow the whole tarball installation to be switched to read-only (which could let it be shared among instances running on the same NM; I don't know if Slider currently does this kind of optimization).

Maybe guidance from people more familiar with Slider than us would be needed here :)

Kind regards,
JB
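The parse-and-regenerate scheme described above could be sketched roughly as follows. This is a hypothetical illustration, not the actual Koya agent code: function names are made up, the file is treated as simple key=value lines, and comment lines from the tarball's server.properties are dropped for simplicity.

```python
# Sketch of "parse the shipped server.properties, overlay appConfig values,
# regenerate the whole file". Keys present only in the overrides (i.e.
# *added* properties) are appended, which is the point of the scheme.

def parse_properties(text):
    """Parse simple key=value lines, skipping comments and blank lines."""
    props = {}
    for line in text.splitlines():
        stripped = line.strip()
        if not stripped or stripped.startswith('#'):
            continue
        key, _, value = stripped.partition('=')
        props[key.strip()] = value.strip()
    return props

def merge_properties(default_text, overrides):
    """Regenerate the full file: tarball defaults overlaid with appConfig values."""
    merged = parse_properties(default_text)
    merged.update(overrides)
    return '\n'.join('%s=%s' % (k, v) for k, v in sorted(merged.items()))

default_text = """# default broker config (illustrative values)
broker.id=0
log.dirs=/tmp/kafka-logs
num.partitions=1
"""
overrides = {'broker.id': '2', 'zookeeper.connect': 'node26:2181'}
print(merge_properties(default_text, overrides))
```

Because the output is generated from the parsed defaults rather than edited in place, the tarball itself could stay read-only, as suggested above.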
Re: Packaging new apps
In order to work with different Kafka versions, it would be nice to pick up whatever server.properties the archive comes with and apply all the properties that are defined in server.xml on top of it. Does that work for you? We can look into making that merge work then.

Everything else looks great -- thanks for the pull request!

Thomas

On Mon, May 11, 2015 at 8:21 AM, Jean-Baptiste Note wrote:

> There's a remark on the pull request about this, with more details than
> in this mail, but basically:
>
> * Other apps seem to regenerate the config files directly from a template
> rather than trying to do a merge (you seem to be doing a sed on defined
> properties; however, it does not work here -- maybe a Python version
> issue?), so that's what I did for server.properties.
>
> Where I come from we use Chef and redefine all configuration files
> anyway, so I was thinking of duplicating a standard configuration file in
> appConfig-default.json (more or less duplicated from the tarball --
> again, all other packaged apps do it like this), and using Chef to
> regenerate all the appConfig.json files in order to deploy infrastructure
> Kafka (and let users do whatever they wish based on the defaults).
>
> Kind regards,
> JB
Re: Packaging new apps
There's a remark on the pull request about this, with more details than in this mail, but basically:

* Other apps seem to regenerate the config files directly from a template rather than trying to do a merge (you seem to be doing a sed on defined properties; however, it does not work here -- maybe a Python version issue?), so that's what I did for server.properties.

Where I come from we use Chef and redefine all configuration files anyway, so I was thinking of duplicating a standard configuration file in appConfig-default.json (more or less duplicated from the tarball -- again, all other packaged apps do it like this), and using Chef to regenerate all the appConfig.json files in order to deploy infrastructure Kafka (and let users do whatever they wish based on the defaults).

Kind regards,
JB
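The template approach mentioned above (regenerating the file wholesale rather than merging) might look roughly like this sketch. The template contents and placeholder names are illustrative, not the actual package files; note that `string.Template` placeholders cannot contain dots, so real appConfig keys would need to be mapped to placeholder names first.

```python
# Sketch of template-based generation: the packaged template carries a
# placeholder for every property the appConfig is expected to set, and the
# agent script substitutes them at start time. A property the template did
# not anticipate cannot be expressed -- the "added properties" limitation
# discussed in this thread.
from string import Template

SERVER_PROPERTIES_TEMPLATE = Template("""\
broker.id=$broker_id
port=$port
zookeeper.connect=$zookeeper_connect
""")

def render(app_config):
    """app_config: dict of placeholder name -> value (illustrative keys)."""
    return SERVER_PROPERTIES_TEMPLATE.substitute(app_config)

print(render({'broker_id': '0', 'port': '9092',
              'zookeeper_connect': 'node26:2181'}))
```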
Re: Packaging new apps
Excellent, I will look at the pull request shortly.

Any thoughts on merging the server properties defined in the Slider config into the server.properties that comes with the Kafka archive?

Thomas

On Mon, May 11, 2015 at 8:10 AM, Jean-Baptiste Note wrote:

> Hi Thomas,
>
> This is because the app_container_tag is unique within each resource.
> Given your two brokers are in separate resources BROKER0 and BROKER1,
> they get identical (1) container tags.
>
> You should put them in the same resource (BROKER), and the numbering will
> be sequential. I have no idea how it behaves on container restart;
> however, this is good enough to start and flex a Kafka cluster here.
>
> I've sent you a pull request on GitHub showing how I did it. There's no
> pretension that it's ready for an actual merge, but if you want it, I can
> amend it for inclusion at your leisure.
>
> Kind regards,
> JB
Re: Packaging new apps
Hi Thomas,

This is because the app_container_tag is unique within each resource. Given your two brokers are in separate resources BROKER0 and BROKER1, they get identical (1) container tags.

You should put them in the same resource (BROKER), and the numbering will be sequential. I have no idea how it behaves on container restart; however, this is good enough to start and flex a Kafka cluster here.

I've sent you a pull request on GitHub showing how I did it. There's no pretension that it's ready for an actual merge, but if you want it, I can amend it for inclusion at your leisure.

Kind regards,
JB
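As a rough illustration of the idea, deriving a broker id from the container tag might look like the sketch below. The shape of the property bag is simplified from the dumps in this thread, the function name is made up, and whether the tag is stable across container restarts is exactly the open question here.

```python
# Hypothetical sketch: derive broker.id from Slider's app_container_tag.
# Tags are allocated sequentially (1, 2, ...) within a single component
# (resource) definition, so all brokers must live under one component,
# e.g. BROKER, for the ids to be unique.

def broker_id_from_tag(config):
    """config: simplified version of the property bag the agent receives."""
    tag = config['global']['app_container_tag']
    # Kafka broker ids conventionally start at 0; tags start at 1.
    return int(tag) - 1

config = {'global': {'app_container_tag': '3'}}
print(broker_id_from_tag(config))  # prints 2
```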
Re: Packaging new apps
Hi Jean,

Indeed, we would like to use component instances as you outline. So far, I have not found a way to derive the Kafka server id from the Slider configuration. I checked on my cluster and I find two containers using the same app_container_tag in the logs:

u'componentName': u'BROKER1',
u'configurations': {u'BROKER-COMMON': {u'broker.id': u'1',
                        u'zookeeper.connect': u'node26:2181,node27:2181,node28:2181'},
                    u'BROKER0': {u'broker.id': u'0'},
                    u'BROKER1': {u'broker.id': u'1'},
                    u'global': {u'app_container_id': u'container_1430350563654_0416_01_03',
                                u'app_container_tag': u'1',

u'componentName': u'BROKER0',
u'configurations': {u'BROKER-COMMON': {u'broker.id': u'0',
                        u'zookeeper.connect': u'node26:2181,node27:2181,node28:2181'},
                    u'BROKER0': {u'broker.id': u'0'},
                    u'BROKER1': {u'broker.id': u'1'},
                    u'global': {u'app_container_id': u'container_1430350563654_0416_01_09',
                                u'app_container_tag': u'1',

Any other ideas on how to obtain a component instance index that works across container failures?

Thanks,
Thomas

On Mon, May 11, 2015 at 1:44 AM, Jean-Baptiste Note wrote:

> Hi Thomas,
>
> Thanks a lot for the updates you brought to the main Koya repository.
>
> I can see you're still declaring a resource for each broker. This is
> painful, as it means modifying your metainfo and possibly resource.json
> whenever you want to grow your cluster, say beyond 10 machines :)
>
> Wouldn't it fit more logically into Slider to declare one server.xml
> configuration and one resource type, and actually flex the application /
> play with the instance count to grow it?
>
> I saw from Gour's comment that you were concerned about unique id
> generation. Maybe using the app_container_tag would be a good starting
> point? For what it's worth, it seemed to work out properly for me.
>
> Kind regards,
> JB
Re: Packaging new apps
Hi Thomas,

Thanks a lot for the updates you brought to the main Koya repository.

I can see you're still declaring a resource for each broker. This is painful, as it means modifying your metainfo and possibly resource.json whenever you want to grow your cluster, say beyond 10 machines :)

Wouldn't it fit more logically into Slider to declare one server.xml configuration and one resource type, and actually flex the application / play with the instance count to grow it?

I saw from Gour's comment that you were concerned about unique id generation. Maybe using the app_container_tag would be a good starting point? For what it's worth, it seemed to work out properly for me.

Kind regards,
JB
Re: Packaging new apps
Hi Steve,

Thanks a lot for your reply out of your very busy schedule. Actually, we'll get away with a Python daemon watching ZooKeeper and doing dynamic DNS updates. This seems easy enough, and is probably more palatable than running a duplicate full DNS server (I'm on the operations side ;)). I'll keep you posted, as we'll probably share this work.

Kind regards,
JB
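A minimal sketch of such a bridge is below, under the assumption that the service records have already been read out of the registry (the real daemon would watch ZooKeeper with a client library such as kazoo, which is not shown here). It just generates an input script for BIND's nsupdate tool; the zone and record names are made up.

```python
# Hypothetical sketch of the zookeeper-to-DNS bridge: given (name, host)
# pairs discovered in the Slider/YARN registry, emit an nsupdate input
# script that points service CNAMEs at the current container hosts.

def nsupdate_script(zone, records, ttl=60):
    """records: list of (service_name, host) pairs from the registry."""
    lines = ['zone %s' % zone]
    for name, host in records:
        fqdn = '%s.%s' % (name, zone)
        lines.append('update delete %s CNAME' % fqdn)          # drop stale record
        lines.append('update add %s %d CNAME %s.' % (fqdn, ttl, host))
    lines.append('send')
    return '\n'.join(lines)

records = [('httpfs', 'node12.example.com'), ('opentsdb', 'node14.example.com')]
print(nsupdate_script('services.example.com', records))
```

The output would be piped to `nsupdate` (with TSIG keys in a real deployment) each time a ZooKeeper watch fires.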
Re: Packaging new apps
Hi Gour,

Thanks a lot for the detailed answer, and for the pointer to the Tomcat packaging, which does half the work for HttpFS. I'll try to properly wrap unpacking the RPM and extracting the relevant parts for Slider packaging. That was my only gripe; other than that, I can launch HttpFS services and flex them: Slider is just awesome.

Kind regards,
JB
Re: Packaging new apps
> On 8 May 2015, at 01:52, Gour Saha wrote:
>
> Last but not least, I'm wondering if there would already be a plan to
> expose somehow (through an internal or an external service) the registry
> through DNS (that's what we really use for service location for HTTPFS &
> OpenTSDB). A bash polling script would certainly be sufficient for our
> needs for now, but longer-term, we'd need to have a more robust solution.
>
> Registry and REST APIs on the registry come directly from YARN:
> https://issues.apache.org/jira/browse/YARN-913
> https://issues.apache.org/jira/browse/YARN-2948
> http://hadoop.apache.org/docs/current/hadoop-yarn/hadoop-yarn-site/registry/yarn-registry.html

DNS support is something that's always been considered; it's why the paths in the registry spec are required to be valid DNS names (though the check is actually disabled, primarily because usernames aren't, and punycoding doesn't address things like spaces in names, just high-Unicode characters).

We held back on this originally due to (a) the need to scope things for Hadoop 2.6 and (b) worries about how operations teams would like more DNS servers popping up in the organisation.

I think we can try to do the DNS -- it just needs someone to sit down and do it. I'm afraid my todo list is already full.

I'd like to wrap up the registry work with an HTTP service that can be deployed at a fixed location; we have one in Slider, but it's there to show it's possible more than anything else (because it moves around).
Re: Packaging new apps
Jean,

You will see updates in the KOYA repository soon. As part of that, we will move up to the latest release of Slider and also document the configuration process.

Thanks,
Thomas

On Thu, May 7, 2015 at 5:52 PM, Gour Saha wrote:

> Hi Jean,
>
> Please see answers inline.
>
> -Gour
>
> On 5/6/15, 6:16 AM, "Jean-Baptiste Note" <jbn...@gmail.com> wrote:
>
> Hi folks,
>
> Currently we're using Chef in our organization to deploy a lot of
> infrastructure services around Hadoop. Of course it makes a lot of sense
> to offer these as self-services on YARN using Slider, but I'm looking at
> a number of challenges. So please forgive the broad range of questions :)
>
> I'm specifically interested in deploying the following applications:
> * the HTTPFS service (see https://github.com/jbnote/httpfs-slider) &
>   helpers (nginx)
> * OpenTSDB & helpers (varnish)
> * Kafka (I had a look at KOYA)
> * Druid
> * Storm (fine, thanks!)
> * HBase (fine, thanks!)
>
> I'm facing a number of issues with those services which are not yet
> packaged correctly:
>
> * httpfs/opentsdb are not released as standalone tarballs, contrary to
> all services currently packaged, so I've butchered a tarball from
> Cloudera RPMs, which is not satisfactory. How would you go about handling
> this?
>
> Not sure exactly what you mean by "handling this". If you are referring
> to a way to create a Slider package of an app in rpm format, then there
> are challenges, such as rpm install requiring root access, which YARN
> does not allow. If you are referring to an issue you are facing with
> deploying the Slider app (now that you have created a tarball), can you
> share what issues you are facing?
>
> You might also want to take a look at this Tomcat Slider package.
> Caution: it is not ready for prime time and has a few issues which need
> to be resolved, but the scripts and metadata files might be a helpful
> reference.
>
> https://issues.apache.org/jira/browse/SLIDER-809
> https://github.com/apache/incubator-slider/tree/feature/SLIDER-809-tomcat-app-package/app-packages/tomcat
>
> * KOYA has been talked about a lot; however, the source I'm looking at
> (https://github.com/DataTorrent/koya) is kind of disappointing, and
> activity is a bit low -- would anyone know if DataTorrent is still
> committed to the project?
>
> What issues are you facing with KOYA? DataTorrent gave a presentation of
> KOYA, and Slider seems to have fit their needs so far. They wanted a few
> features around data locality (strict placement), which will be there in
> the 0.80.0 release, AND unique ids, which still need some work to be
> done.
>
> Last but not least, I'm wondering if there would already be a plan to
> expose somehow (through an internal or an external service) the registry
> through DNS (that's what we really use for service location for HTTPFS &
> OpenTSDB). A bash polling script would certainly be sufficient for our
> needs for now, but longer-term, we'd need to have a more robust solution.
>
> Registry and REST APIs on the registry come directly from YARN:
> https://issues.apache.org/jira/browse/YARN-913
> https://issues.apache.org/jira/browse/YARN-2948
> http://hadoop.apache.org/docs/current/hadoop-yarn/hadoop-yarn-site/registry/yarn-registry.html
>
> Thanks a lot, kind regards,
> JB
Re: Packaging new apps
Hi Jean,

Please see answers inline.

-Gour

On 5/6/15, 6:16 AM, "Jean-Baptiste Note" <jbn...@gmail.com> wrote:

Hi folks,

Currently we're using Chef in our organization to deploy a lot of infrastructure services around Hadoop. Of course it makes a lot of sense to offer these as self-services on YARN using Slider, but I'm looking at a number of challenges. So please forgive the broad range of questions :)

I'm specifically interested in deploying the following applications:
* the HTTPFS service (see https://github.com/jbnote/httpfs-slider) & helpers (nginx)
* OpenTSDB & helpers (varnish)
* Kafka (I had a look at KOYA)
* Druid
* Storm (fine, thanks!)
* HBase (fine, thanks!)

I'm facing a number of issues with those services which are not yet packaged correctly:

* httpfs/opentsdb are not released as standalone tarballs, contrary to all services currently packaged, so I've butchered a tarball from Cloudera RPMs, which is not satisfactory. How would you go about handling this?

Not sure exactly what you mean by "handling this". If you are referring to a way to create a Slider package of an app in rpm format, then there are challenges, such as rpm install requiring root access, which YARN does not allow. If you are referring to an issue you are facing with deploying the Slider app (now that you have created a tarball), can you share what issues you are facing?

You might also want to take a look at this Tomcat Slider package. Caution: it is not ready for prime time and has a few issues which need to be resolved, but the scripts and metadata files might be a helpful reference.

https://issues.apache.org/jira/browse/SLIDER-809
https://github.com/apache/incubator-slider/tree/feature/SLIDER-809-tomcat-app-package/app-packages/tomcat

* KOYA has been talked about a lot; however, the source I'm looking at (https://github.com/DataTorrent/koya) is kind of disappointing, and activity is a bit low -- would anyone know if DataTorrent is still committed to the project?

What issues are you facing with KOYA? DataTorrent gave a presentation of KOYA, and Slider seems to have fit their needs so far. They wanted a few features around data locality (strict placement), which will be there in the 0.80.0 release, AND unique ids, which still need some work to be done.

Last but not least, I'm wondering if there would already be a plan to expose somehow (through an internal or an external service) the registry through DNS (that's what we really use for service location for HTTPFS & OpenTSDB). A bash polling script would certainly be sufficient for our needs for now, but longer-term, we'd need to have a more robust solution.

Registry and REST APIs on the registry come directly from YARN:
https://issues.apache.org/jira/browse/YARN-913
https://issues.apache.org/jira/browse/YARN-2948
http://hadoop.apache.org/docs/current/hadoop-yarn/hadoop-yarn-site/registry/yarn-registry.html

Thanks a lot, kind regards,
JB