I am happy to help (either mentor or volunteer). This is a good idea. Have helped out in Apache projects before.
Debo

Sent from my iPhone

> On Aug 7, 2018, at 6:08 PM, P. Taylor Goetz <ptgo...@gmail.com> wrote:
>
> Henry Saputra (hsaputra) has been added to the mentor list.
>
> We are still interested in proposal feedback and mentor volunteers.
>
> -Taylor
>
>> On Aug 6, 2018, at 10:47 AM, P. Taylor Goetz <ptgo...@apache.org> wrote:
>>
>> I would like to propose DLab as an Apache Incubator project.
>>
>> The text of the proposal can be found below as well as on the Incubator wiki:
>>
>> https://wiki.apache.org/incubator/DLabProposal
>>
>> We are seeking additional mentors and would welcome anyone who would like to
>> volunteer.
>>
>> -Taylor
>>
>>
>> = DLab Proposal =
>>
>> == Abstract ==
>> DLab is a platform for creating self-service, exploratory data science
>> environments in the cloud using best-of-breed data science tools.
>>
>> DLab includes a self-service web console, used to create and manage
>> exploratory environments. It allows teams to spin up analytical environments
>> with just a single click of a mouse. Once established, the environment can
>> be managed by an analytical team itself, leveraging a simple and easy-to-use
>> web-based interface.
>>
>> == Proposal ==
>> In order to work effectively, data scientists rely on a varying suite of
>> analytics tools that are readily available. However, many of those tools are
>> non-trivial to set up in terms of hardware provisioning, software
>> installation, configuration, and deployment. Setting up a collaborative,
>> multi-tenant development environment for data scientists consumes
>> substantial IT and DevOps resources, as well as time. These factors often
>> combine to hinder the agility and effectiveness of data science teams within
>> an organization. Current solutions are largely closed source and/or
>> proprietary, and committing to a given solution introduces the potential for
>> vendor lock-in.
>>
>> EPAM Systems developed DLab in response to the lack of open source,
>> permissively licensed solutions to better enable data science workflows. The
>> ALv2 was selected to encourage open development and user adoption. DLab was
>> open sourced on Dec 29, 2016 and is under active development with support
>> from EPAM Systems.
>>
>> We believe DLab is a unique solution with no current open source equivalent.
>> Our primary goals of incubation are to grow and diversify the DLab community
>> to ensure its long-term sustainability.
>>
>> == Rationale ==
>> DLab is a platform that provides data scientists with the ability to
>> self-provision, without IT support, exploratory and production environments
>> with their preferred set of tools installed and pre-configured. Tool options
>> include, but are not limited to:
>>
>> * Apache Spark
>> * Apache Flink (planned)
>> * Apache Zeppelin
>> * Jupyter
>> * TensorFlow + Jupyter
>> * Deep Learning + Jupyter
>>
>> DLab leverages cloud computing providers for virtual hardware provisioning
>> and currently supports the following:
>>
>> * Amazon Web Services (AWS)
>> * Microsoft Azure
>> * Google Cloud Platform (GCP) (under development)
>>
>> DLab offers git-based collaboration tools for data scientists and developers
>> and integrates with the following git service providers:
>>
>> * GitHub
>> * GitLab
>> * Bitbucket
>>
>> Additionally, DLab includes the option to configure the UnGit tool in an
>> environment to facilitate collaboration.
>> Finally, DLab integrates closely with many security and SSO offerings,
>> including:
>>
>> * LDAP
>> * Microsoft Active Directory
>> * AWS Identity and Access Management (IAM) service
>>
>> DLab was designed from the ground up to be a highly configurable, flexible,
>> and extensible platform. We believe these qualities will encourage community
>> growth by enabling contributors to easily add new integrations and
>> extensions.
>>
>> == Initial Goals ==
>> The initial goal will be to move the existing codebase to Apache and
>> integrate with the Apache development process and infrastructure. A primary
>> goal of incubation will be to grow and diversify the DLab PPMC. We are well
>> aware that the project community is composed of individuals from a single
>> company. We aim to change that during incubation.
>>
>> == Current Status ==
>> As previously mentioned, DLab is under active development at EPAM Systems,
>> and is being used in a number of production deployments:
>>
>> * [An investment company] is using DLab as an AWS-based analytics platform
>> for their data scientists to provide a convenient way to perform
>> multi-tenant data analytics. This enables data scientists to easily
>> provision work environments with integrated data sources based on
>> Elasticsearch, Apache HBase, and Neo4j, and utilizing Apache Spark. This
>> provides a “one-click”, self-service option for users to provision an
>> environment with the necessary tools and data.
>>
>> * [An electronics manufacturing company] leverages DLab for data quality,
>> data exploration, and analytics. The company’s data scientists leverage DLab
>> to work with data sources that have been transferred to the cloud in order
>> to find new insights in the data, and help the implementation team define
>> requirements for data engineering. The main goal is to increase the
>> utilization of various tools by decreasing time to deployment.
>>
>> * [A retail company] is using DLab as an image recognition framework, to
>> enable automated restocking of inventory.
>>
>> * [A travel company] is using DLab to create a recommendation engine that will
>> allow end users to find more relevant accommodations faster and at a lower
>> cost.
>>
>> === Meritocracy ===
>> We value meritocracy and we understand that it is the basis for an open
>> community that encourages multiple companies and individuals to contribute
>> and be invested in the project’s future. We will encourage and monitor
>> participation and make sure to extend privileges and responsibilities to all
>> contributors.
>>
>> === Community ===
>> DLab is currently being used by developers at EPAM and a growing number of
>> customers are actively using it in production environments. By bringing DLab
>> to Apache we hope to broaden and diversify the user and developer community
>> through open collaboration.
>>
>> === Core Developers ===
>> DLab was initially developed at EPAM Systems and is under active
>> development. We believe DLab will be of interest to a broad range of users
>> and developers and that incubating the project at the ASF will help us build
>> a diverse, sustainable community.
>>
>> === Alignment ===
>> DLab utilizes other Apache projects such as Apache Spark, Apache Toree
>> (incubating), and Apache Zeppelin, along with a number of other Apache
>> libraries. We anticipate integration with additional Apache projects as the
>> DLab community and interest in the project grow.
>>
>> == Known Risks ==
>>
>> === Orphaned products ===
>> EPAM Systems is committed to the future development of DLab and understands
>> that graduation to a TLP, while preferable, is not the only positive outcome
>> of incubation.
>>
>> Should the DLab project be accepted by the Incubator, the prospective PPMC
>> would be willing to agree to a target incubation period of 2 years or less,
>> knowing that every Incubator project incurs a certain cost in terms of ASF
>> infrastructure and volunteer time.
>>
>> === Inexperience with Open Source ===
>> Many DLab contributors are already familiar with open source processes and
>> several of them are committers on other Apache projects.
>> We will be actively working with experienced Apache community members to
>> improve our project.
>>
>> === Homogeneous Developers ===
>> The initial committers of DLab all come from EPAM Systems, though we are
>> committed to recruiting and developing additional committers from a wide
>> spectrum of industries and backgrounds.
>>
>> === Reliance on Salaried Developers ===
>> It is expected that DLab development will occur on both salaried time and on
>> volunteer time, after hours. All of the initial committers are paid by EPAM
>> Systems to contribute to this project. However, they are all passionate
>> about the project, and we are both confident and hopeful that the project
>> will continue even if no salaried developers contribute to the project.
>>
>> === Relationships with Other Apache Products ===
>> As mentioned in the Rationale section, DLab utilizes a number of existing
>> Apache projects (Spark, Toree, Zeppelin, et al.), and we expect that list
>> to expand as the community grows and diversifies. Any Apache project in the
>> big data, data science, and/or analytics space would be potentially relevant.
>>
>> === An Excessive Fascination with the Apache Brand ===
>> We are applying to the Incubator process because we think it is the next
>> logical step for the DLab project after open-sourcing the code. This
>> proposal is not for the purpose of generating publicity. Rather, we want to
>> make sure to create a very inclusive and meritocratic community, outside the
>> umbrella of a single company. EPAM has a long history of contributing to
>> Apache projects and the DLab developers and contributors understand the
>> implications of making it an Apache project.
>>
>> == Required Resources ==
>>
>> === Mailing lists ===
>> * d...@dlab.incubator.apache.org
>> * comm...@dlab.incubator.apache.org
>> * priv...@dlab.incubator.apache.org
>>
>> === Source control ===
>> * https://git-wip-us.apache.org/repos/asf/incubator-dlab
>>
>> === Issue tracking ===
>> * JIRA DLab (DLAB)
>>
>> == Documentation ==
>> * DLab Website: http://dlab.opensource.epam.com
>> * DLab code base: https://github.com/epam/DLab
>> * DLab Overview: https://github.com/epam/DLab/blob/master/README.md
>> * DLab User Guide: https://github.com/epam/DLab/blob/master/USER_GUIDE.md
>>
>> == Initial Source ==
>> The DLab codebase is currently hosted on GitHub: https://github.com/epam/DLab
>>
>> == Source and Intellectual Property Submission Plan ==
>> The DLab source code on GitHub is currently licensed under Apache License
>> v2.0 and the copyright is assigned to EPAM Systems. If DLab becomes an
>> Incubator project at the ASF, EPAM Systems will transfer the source code and
>> trademark ownership to the Apache Software Foundation via a Software Grant
>> Agreement.
>>
>> == External Dependencies ==
>> To the best of our knowledge, all of DLab's dependencies are distributed under
>> Apache-compatible licenses.
>>
>> DLab was designed to be highly extensible, and we expect and encourage the
>> development of third-party extensions and plug-ins. We also understand that
>> any such component, if it requires a dependency forbidden by Apache license
>> policy, would not be eligible for inclusion in an Apache release, and would
>> have to be hosted, supported, etc. outside of ASF infrastructure and labeled
>> appropriately.
>>
>> === External dependencies licensed under Apache License 2.0: ===
>> MongoDB Java Driver - org.mongodb:mongo-java-driver
>> (http://mongodb.github.io/mongo-java-driver/3.2/driver)
>>
>> Dropwizard (https://github.com/dropwizard/dropwizard)
>>
>> Dropwizard Template Config
>> (https://github.com/tkrille/dropwizard-template-config)
>>
>> Apache Directory Server (https://github.com/apache/directory-server)
>>
>> Jackson (https://github.com/FasterXML/jackson)
>>
>> AWS Java SDK (https://github.com/aws/aws-sdk-java)
>>
>> Boto3 (https://github.com/boto/boto3)
>>
>> === External dependencies licensed under the MIT License: ===
>> angular2-app (https://www.npmjs.com/package/angular2-app)
>>
>> angular2-seed (https://www.npmjs.com/package/angular2-seed)
>>
>> angular2-seed-advanced (https://www.npmjs.org/package/angular2-seed-advanced)
>>
>> angular2-seed-n3UX (https://www.npmjs.com/package/angular2-seed-n3UX)
>>
>> http-status-enum (https://www.npmjs.com/package/http-status-enum)
>>
>> Mockito (https://github.com/mockito/mockito)
>>
>> ng2-translate (https://www.npmjs.com/package/ng2-translate)
>>
>> SLF4J (http://www.slf4j.org/)
>>
>> === External dependencies licensed under the CDDL License: ===
>> Jersey (https://github.com/jersey/jersey)
>>
>> === External dependencies licensed under the Python Software License Version 2: ===
>> jython (https://github.com/jythontools/jython)
>>
>> === ASF Projects: ===
>> Apache Spark, Apache Toree (incubating), Apache Zeppelin
>>
>> == Cryptography ==
>> Not applicable.
>>
>> == Initial Committers ==
>> * Dmytro Liaskovskyi dmytro_liaskovs...@epam.com
>> * Volodymyr Veres volodymyr_ve...@epam.com
>> * Oleh Hrynets oleh_hryn...@epam.com
>> * Oleh Hrynyk oleh_hry...@epam.com
>> * Oleh Martushevskyi oleh_martushevs...@epam.com
>> * Oleh Moskovych oleh_moskov...@epam.com
>> * Vadym Kuznetsov vadym_kuznet...@epam.com
>> * Usein Faradzhev usein_faradz...@epam.com
>> * Bohdan Hliva bohdan_hl...@epam.com
>> * Oleksandr Melnychuk oleksandr_melnych...@epam.com
>> * Mikhail Teplitskiy mikhail_teplits...@epam.com
>> * Vira Vitanska vira_vitan...@epam.com
>> * Andriana Kovalyshyn andriana_kovalys...@epam.com
>> * Oleksandr Chaparin oleksandr_chapa...@epam.com
>> * Denys Shliakhov denys_shliak...@epam.com
>> * Nazar Barabash nazar_barab...@epam.com
>> * Yuriy Holinko yuriy_holi...@epam.com
>> * Petro Kotsiuba petro_kotsi...@epam.com
>> * Bogdan Rudyi bogdan_ru...@epam.com
>> * Mikhail Teplitskyi mikhail_teplits...@epam.com
>>
>> == Sponsors ==
>>
>> === Champion ===
>> * P. Taylor Goetz ptgo...@apache.org
>>
>> === Nominated Mentors ===
>> * P. Taylor Goetz ptgo...@apache.org
>>
>> === Sponsoring Entity ===
>> * The Apache Incubator
>>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: general-unsubscr...@incubator.apache.org
> For additional commands, e-mail: general-h...@incubator.apache.org