Using Apex attributes is the generic way to configure Apex applications at the end-user level. But the subject of this discussion is how to provide system-level configuration of Apex applications. I think having two separate configuration layers (system and end-user) is a common approach in all well-designed tools.
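To make the two layers concrete, here is a minimal sketch of how a system-level jar list and an end-user jar list could be combined. Every name in it (LayeredJars, Combiner, OVERRIDE, MERGE, the jar file names) is hypothetical and used only for illustration; none of it is existing Apex API. OVERRIDE mimics the override-style layering described in this thread, and MERGE is the merge option that was suggested.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

public class LayeredJars {
    /** Hypothetical combiner: decides how an upper layer is combined with a lower one. */
    interface Combiner {
        List<String> combine(List<String> lower, List<String> upper);
    }

    /** Override: if the upper (end-user) layer sets a value, it replaces the lower one. */
    static final Combiner OVERRIDE = (lower, upper) -> upper.isEmpty() ? lower : upper;

    /** Merge: system-level entries first, then end-user entries, duplicates dropped. */
    static final Combiner MERGE = (lower, upper) -> {
        Set<String> merged = new LinkedHashSet<>(lower);
        merged.addAll(upper);
        return new ArrayList<>(merged);
    };

    public static void main(String[] args) {
        // Assumed example values; real jar names would come from configuration.
        List<String> systemJars = Arrays.asList("fs-impl.jar", "service-client.jar");
        List<String> userJars = Arrays.asList("my-operators.jar", "service-client.jar");

        System.out.println("merge:    " + MERGE.combine(systemJars, userJars));
        System.out.println("override: " + OVERRIDE.combine(systemJars, userJars));
    }
}
```

With a merge combiner the system layer can contribute common jars that the end-user never has to list, while the end-user layer can still add application-specific jars on top.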

Thanks,
Sergey

On Sat, Feb 3, 2018 at 10:02 AM, Pramod Immaneni <pra...@datatorrent.com> wrote:

> Yes generic in the Attribute class
>
> > On Feb 3, 2018, at 10:00 AM, Vlad Rozov <vro...@apache.org> wrote:
> >
> > +1 assuming that support for merge/override will be generic for all attributes that support list/set of values and not limited to LIBRARY_JARS attribute only.
> >
> > Thank you,
> >
> > Vlad
> >
> > On 2/3/18 09:13, Pramod Immaneni wrote:
> >> I too agree that the discussion has veered off from the original topic. Why can't LIBRARY_JARS be used for this, albeit with a minor improvement? Currently, our attribute layering is an override, so if you have an attribute that is specified as apex.application.<appname>.attr.<attrname> it overrides apex.attr.<attrname> for that application. What if were to expand the attribute definition to allow for the specification of how the layering of attributes will be combined, override being one option, merge being another with these being implemented with a combiner interface? This way a set of common jars could be specified using dt.attr.LIBRARY_JARS and applications can still add extra jars on top.
> >>
> >> On Fri, Feb 2, 2018 at 6:32 PM, Vlad Rozov <vro...@apache.org> wrote:
> >>
> >>> IMO, support for Kubernetes, Docker images, Mesos and anything outside of Yarn deployments is a topic by itself and design for such support needs to be discussed. I do not want to propose any specific design, but assume that logic to create proper execution environment would be coded into Apex client. Whether it (hardcoded logic to create an execution environment) can be expressed simply as a list of dependent classes or jars is at minimum questionable. Until design is proposed and agreed upon, I'd prefer to use plugins for the subject.
> >>>
> >>> Thank you,
> >>>
> >>> Vlad
> >>>
> >>> On 2/2/18 13:17, Sanjay Pujare wrote:
> >>>
> >>>> In cases where we have an "über" docker image containing support for multiple execution environments it might be useful for the Apex core to infer what kind of execution environment to use for a particular invocation (say based on configuration values/environment variables) and in that case the core will load the corresponding libraries. And I think this kind of flexibility or support would be difficult through the plugins hence I think Sergey's proposal will be useful.
> >>>>
> >>>> Sanjay
> >>>>
> >>>> On Fri, Feb 2, 2018 at 11:18 AM, Sergey Golovko <ser...@datatorrent.com> wrote:
> >>>>
> >>>>> Unfortunately the moving of .apa file to a docker image cannot resolve all problems with the dependencies. If we assume an Apex application should be run in different execution environments, the application docker image must contain all possible execution environment dependencies.
> >>>>>
> >>>>> I think the better way is to assume that the original application docker image like the current .apa file should contain the application specific dependencies only. And some smart client tool should create the executable application docker image form the original one and include the execution specific environment dependencies into the target application docker image. It means anyway an smart client Apex tool should have an interface to define different environment dependencies or combination of different dimensions of the environment dependencies.
> >>>>>
> >>>>> Thanks,
> >>>>> Sergey
> >>>>>
> >>>>> On Fri, Feb 2, 2018 at 10:23 AM, Thomas Weise <t...@apache.org> wrote:
> >>>>>
> >>>>>> The current dependencies are based on how Apex YARN client works. YARN depends on a DFS implementation for deployment (not necessarily HDFS).
> >>>>>>
> >>>>>> I think a better way to look at this is to consider that instead of an .apa file the application is a docker image, which would contain Apex and all dependencies that the "StramClient" today adds for YARN.
> >>>>>>
> >>>>>> In that world there would be no Apex CLI or Apex specific client.
> >>>>>>
> >>>>>> Thomas
> >>>>>>
> >>>>>> On Thu, Feb 1, 2018 at 5:57 PM, Sergey Golovko <ser...@datatorrent.com> wrote:
> >>>>>>
> >>>>>>> I agree. It can be implemented with usage of plugins. But if I need to enable and configurate the plugin I need to put this information into dt-site.xml. It means The plugin and its parameter must be documented and the list of the added specific jars will be visible and available for updates to the end-user. The implementation via plugins is more dynamic solution that is more convenient for the application developers. But I'm talking about the static configuration of the Apex build or installation that relates more to the platform development.
> >>>>>>>
> >>>>>>> The current Apex core implementation uses the static unchanged list of jars for long time, because the Apex implementation still contains several basic static assumptions (for instance, the usage of YARN, HDSF, etc.). And the current Apex assumptions are hardcoded in the implementation. But if we are going to improve Apex and use Java interfaces in generic Apex implementation, the current static approach in Apex code to hardcode a list of dependent jars will not work anymore. It will require to include a new solution to add/change jars in specific Apex builds/configurations. And I don't think the usage of the plugins will be good for that.
> >>>>>>>
> >>>>>>> Thanks,
> >>>>>>> Sergey
> >>>>>>>
> >>>>>>> On Thu, Feb 1, 2018 at 1:47 PM, Vlad Rozov <vro...@apache.org> wrote:
> >>>>>>>
> >>>>>>>> There is a way to get the same end result by using plugins. It will be good to understand why plugin can't be used and can they be extended to provide the required functionality.
> >>>>>>>>
> >>>>>>>> Thank you,
> >>>>>>>>
> >>>>>>>> Vlad
> >>>>>>>>
> >>>>>>>> On 1/29/18 15:14, Sergey Golovko wrote:
> >>>>>>>>
> >>>>>>>>> Hello All,
> >>>>>>>>>
> >>>>>>>>> In Apex there are two ways to deploy non-Hadoop jars to the deployed cluster.
> >>>>>>>>>
> >>>>>>>>> The first approach is static (hardcoded) and it is used by Apex platform developers only. There are several final static arrays of Java classes in StramClient.java that define which of the available jars should be included into deployment for every Apex application.
> >>>>>>>>>
> >>>>>>>>> The second approach is to add paths of all dependent jar-files to the value of the attribute LIB_JARS. The end-user can set/update the value of the attribute LIB_JARS via dt-site.xml files, command line parameters, application properties and plugins. The usage of the attribute LIB_JARS is the official documented way for all Apex users to manage by the deployment jars.
> >>>>>>>>>
> >>>>>>>>> But some of the dependent jars (not from the Apex core) can be common for all customer's applications for a specific installation and/or execution environment. Unfortunately the Apex implementation does not contain the middle solution that would allow the Apex developers and customer support to define and add new dependent jar-files (jars that should not be configurable/managed by the end-user) without the updates/recompilation of the Apex Java code during the Apex building process and/or installation/configuration.
> >>>>>>>>>
> >>>>>>>>> Also the having of such kind of flexibility would allow the Apex core developers to use Java interfaces during the development to define an abstraction layer in Apex implementation and configurate Apex core to add some specific jars to all Apex applications without recompilation of the Apex source code.
> >>>>>>>>>
> >>>>>>>>> For instance, now the usage of HDFS is hardcoded in Apex platform code but it can be replaced with any other distributed or cloud base file system. The Apex core code can use an interface for all I/O operations but the supporting of a real specific file system implementation can be added as an independent jar-file. Or if the implementation of some of Apex operators depend on a specific service, and it is necessary to add some of the service jars to every Apex application implicitly.
> >>>>>>>>>
> >>>>>>>>> The proposal:
> >>>>>>>>>
> >>>>>>>>> - add a predefined configuration text file (we can make any choice for the file syntax: XML, JSON or Properties) to Apex engine resources with predefined values of some of the Apex attributes (now we can include LIB_JARS attribute only);
> >>>>>>>>> - allow to have a configuration text file with the same functionality in the Apex installation folder "conf";
> >>>>>>>>> - read the content of the predefined configuration text files by the stram client in runtime and add the jars to the list of the dependent jars;
> >>>>>>>>> - allow to use paths to jars and Java classes to refer to the dependent jars (the references can have the extensions: .class and .jar).
> >>>>>>>>>
> >>>>>>>>> Thanks,
> >>>>>>>>> Sergey