Henry Saputra (hsaputra) has been added to the mentor list.

We are still interested in proposal feedback and mentor volunteers.

-Taylor

> On Aug 6, 2018, at 10:47 AM, P. Taylor Goetz <ptgo...@apache.org> wrote:
> 
> I would like to propose DLab as an Apache Incubator project.
> 
> The text of the proposal can be found below as well as on the Incubator wiki:
> 
> https://wiki.apache.org/incubator/DLabProposal
> 
> We are seeking additional mentors and would welcome anyone who would like to 
> volunteer.
> 
> -Taylor
> 
> 
> = DLab Proposal =
> 
> == Abstract ==
> DLab is a platform for creating self-service, exploratory data science 
> environments in the cloud using best-of-breed data science tools.
> 
> DLab includes a self-service web console, used to create and manage 
> exploratory environments. It allows teams to spin up analytical environments 
> with just a single click of a mouse. Once established, the environment can be 
> managed by the analytical team itself, using a simple, easy-to-use 
> web-based interface.
> 
> == Proposal ==
> In order to work effectively, data scientists rely on a varying suite of 
> analytics tools that are readily available. However, many of those tools are 
> non-trivial to set up in terms of hardware provisioning, software 
> installation, configuration, and deployment. Setting up a collaborative, 
> multi-tenant development environment for data scientists consumes substantial 
> IT and DevOps resources, as well as time. These factors often combine to 
> hinder the agility and effectiveness of data science teams within an 
> organization. Current solutions are largely closed source and/or proprietary, 
> and committing to a given solution introduces the potential for vendor 
> lock-in.
> 
> EPAM Systems developed DLab in response to the lack of open source, 
> permissively licensed solutions to better enable data science workflows. The 
> ALv2 was selected to encourage open development and user adoption. DLab was 
> open sourced on Dec 29, 2016 and is under active development with support 
> from EPAM Systems.
> 
> We believe DLab is a unique solution with no current open source equivalent. 
> Our primary goals of incubation are to grow and diversify the DLab community 
> to ensure its long-term sustainability.
> 
> == Rationale ==
> DLab is a platform that provides data scientists with the ability to 
> self-provision, without IT support, exploratory and production environments 
> with their preferred set of tools installed and pre-configured. Tool options 
> include, but are not limited to:
> 
> * Apache Spark
> * Apache Flink (planned)
> * Apache Zeppelin
> * Jupyter
> * TensorFlow + Jupyter
> * Deep Learning + Jupyter
> 
> DLab leverages cloud computing providers for virtual hardware provisioning 
> and currently supports the following:
> 
> * Amazon Web Services (AWS)
> * Microsoft Azure
> * Google Cloud Platform (GCP) (under development)
> 
> DLab offers git-based collaboration tools for data scientists and developers 
> and integrates with the following git service providers:
> 
> * GitHub
> * GitLab
> * Bitbucket
> 
> Additionally, DLab includes the option to configure the UnGit tool in an 
> environment to facilitate collaboration.
> 
> Finally, DLab integrates closely with many security and SSO offerings, 
> including:
> 
> * LDAP
> * Microsoft Active Directory
> * AWS Identity and Access Management (IAM) service
> 
> DLab was designed from the ground up to be a highly configurable, flexible, 
> and extensible platform. We believe these qualities will encourage community 
> growth by enabling contributors to easily add new integrations and extensions.
> 
> == Initial Goals ==
> The initial goal will be to move the existing codebase to Apache and 
> integrate with the Apache development process and infrastructure. A primary 
> goal of incubation will be to grow and diversify the DLab PPMC. We are well 
> aware that the project community currently consists of individuals from a single 
> company. We aim to change that during incubation.
> 
> == Current Status ==
> As previously mentioned, DLab is under active development at EPAM Systems, 
> and is being used in a number of production deployments:
> 
> * [An investment company] is using DLab as an AWS-based analytics platform 
> for their data scientists to provide a convenient way to perform multi-tenant 
> data analytics. This enables data scientists to easily provision work 
> environments with integrated data sources based on Elasticsearch, Apache 
> HBase, and Neo4j, and utilizing Apache Spark. The result is a “one-click”, 
> self-service option for users to provision an environment with the necessary 
> tools and data.
> 
> * [An electronics manufacturing company] leverages DLab for data quality, 
> data exploration, and analytics. The company’s data scientists leverage DLab 
> to work with data sources that have been transferred to the cloud in order to 
> find new insights on the data, and help the implementation team define 
> requirements for data engineering. The main goal is to increase the 
> utilization of various tools by decreasing time to deployment.
> 
> * [A retail company] is using DLab as an image recognition framework, to 
> enable automated restocking of inventory.
> 
> * [A travel company] is using DLab to create a recommendation engine that will 
> allow end users to find more relevant accommodations faster and at a lower 
> cost.
> 
> === Meritocracy ===
> We value meritocracy and we understand that it is the basis for an open 
> community that encourages multiple companies and individuals to contribute 
> and be invested in the project’s future. We will encourage and monitor 
> participation and make sure to extend privileges and responsibilities to all 
> contributors.
> 
> === Community ===
> DLab is currently used by developers at EPAM, and a growing number of 
> customers actively use it in production environments. By bringing DLab 
> to Apache we hope to broaden and diversify the user and developer community 
> through open collaboration.
> 
> === Core Developers ===
> DLab was initially developed at EPAM Systems and is under active development. 
> We believe DLab will be of interest to a broad range of users and developers 
> and that incubating the project at the ASF will help us build a diverse, 
> sustainable community.
> 
> === Alignment ===
> DLab utilizes other Apache projects such as Apache Spark, Apache Toree 
> (incubating), and Apache Zeppelin, along with a number of other Apache 
> libraries. We anticipate integration with additional Apache projects as the 
> DLab community and interest in the project grows.
> 
> == Known Risks ==
> 
> === Orphaned products ===
> EPAM Systems is committed to the future development of DLab and understands 
> that graduation to a TLP, while preferable, is not the only positive outcome 
> of incubation.
> 
> Should the DLab project be accepted by the Incubator, the prospective PPMC 
> would be willing to agree to a target incubation period of 2 years or less, 
> knowing that every Incubator project incurs a certain cost in terms of ASF 
> infrastructure and volunteer time.
> 
> === Inexperience with Open Source ===
> Many DLab contributors are already familiar with open source processes and 
> several of them are committers on other Apache projects. We will be actively 
> working with experienced Apache community members to improve our project.
> 
> === Homogeneous Developers ===
> The initial committers of DLab all come from EPAM Systems, though we are 
> committed to recruiting and developing additional committers from a wide 
> spectrum of industries and backgrounds.
> 
> === Reliance on Salaried Developers ===
> It is expected that DLab development will occur on both salaried time and on 
> volunteer time, after hours. All of the initial committers are paid by EPAM 
> Systems to contribute to this project. However, they are all passionate about 
> the project, and we are both confident and hopeful that the project will 
> continue even if no salaried developers contribute to the project.
> 
> === Relationships with Other Apache Products ===
> As mentioned in the Rationale section, DLab utilizes a number of existing 
> Apache projects (Spark, Toree, Zeppelin, et al.), and we expect that list to 
> expand as the community grows and diversifies. Any Apache project in the big 
> data, data science, and/or analytics space would be potentially relevant.
> 
> === An Excessive Fascination with the Apache Brand ===
> We are applying to the Incubator process because we think it is the next 
> logical step for the DLab project after open-sourcing the code. This proposal 
> is not for the purpose of generating publicity. Rather, we want to make sure 
> to create a very inclusive and meritocratic community, outside the umbrella 
> of a single company. EPAM has a long history of contributing to Apache 
> projects, and the DLab developers and contributors understand the implications 
> of making it an Apache project.
> 
> == Required Resources ==
> 
> === Mailing lists ===
> * d...@dlab.incubator.apache.org
> * comm...@dlab.incubator.apache.org
> * priv...@dlab.incubator.apache.org
> 
> === Source control ===
> * https://git-wip-us.apache.org/repos/asf/incubator-dlab
> 
> === Issue tracking ===
> * JIRA DLab (DLAB)
> 
> == Documentation ==
> * DLab Website: http://dlab.opensource.epam.com
> * DLab code base: https://github.com/epam/DLab
> * DLab Overview: https://github.com/epam/DLab/blob/master/README.md
> * DLab User Guide: https://github.com/epam/DLab/blob/master/USER_GUIDE.md
> 
> == Initial Source ==
> The DLab codebase is currently hosted on GitHub: https://github.com/epam/DLab
> 
> == Source and Intellectual Property Submission Plan ==
> The DLab source code on GitHub is currently licensed under the Apache License 
> v2.0 and the copyright is assigned to EPAM Systems. If DLab becomes an 
> Incubator project at the ASF, EPAM Systems will transfer the source code and 
> trademark ownership to the Apache Software Foundation via a Software Grant 
> Agreement.
> 
> == External Dependencies ==
> To the best of our knowledge, all of DLab's dependencies are distributed under 
> Apache-compatible licenses.
> 
> DLab was designed to be highly extensible, and we expect and encourage the 
> development of third-party extensions and plug-ins. We also understand that 
> any such component, if it requires a dependency forbidden by Apache license 
> policy, would not be eligible for inclusion in an Apache release, and would 
> have to be hosted, supported, etc. outside of ASF infrastructure and labeled 
> appropriately.
> 
> === External dependencies licensed under Apache License 2.0: ===
> MongoDB Java Driver - org.mongodb:mongo-java-driver 
> (http://mongodb.github.io/mongo-java-driver/3.2/driver)
> 
> Dropwizard (https://github.com/dropwizard/dropwizard)
> 
> Dropwizard Template Config 
> (https://github.com/tkrille/dropwizard-template-config)
> 
> Apache Directory Server (https://github.com/apache/directory-server)
> 
> Jackson (https://github.com/FasterXML/jackson)
> 
> AWS Java SDK (https://github.com/aws/aws-sdk-java)
> 
> Boto3 (https://github.com/boto/boto3)
> 
> === External dependencies licensed under the MIT License: ===
> angular2-app (https://www.npmjs.com/package/angular2-app)
> 
> angular2-seed (https://www.npmjs.com/package/angular2-seed)
> 
> angular2-seed-advanced (https://www.npmjs.org/package/angular2-seed-advanced)
> 
> angular2-seed-n3UX (https://www.npmjs.com/package/angular2-seed-n3UX)
> 
> http-status-enum (https://www.npmjs.com/package/http-status-enum)
> 
> Mockito (https://github.com/mockito/mockito)
> 
> ng2-translate (https://www.npmjs.com/package/ng2-translate)
> 
> SLF4J (http://www.slf4j.org/)
> 
> === External dependencies licensed under the CDDL License: ===
> Jersey (https://github.com/jersey/jersey)
> 
> === External dependencies licensed under the Python Software License Version 2: ===
> jython (https://github.com/jythontools/jython)
> 
> === ASF Projects: ===
> Apache Spark, Apache Toree (incubating), Apache Zeppelin
> 
> == Cryptography ==
> Not applicable.
> 
> == Initial Committers ==
> * Dmytro Liaskovskyi dmytro_liaskovs...@epam.com
> * Volodymyr Veres volodymyr_ve...@epam.com
> * Oleh Hrynets oleh_hryn...@epam.com
> * Oleh Hrynyk oleh_hry...@epam.com
> * Oleh Martushevskyi oleh_martushevs...@epam.com
> * Oleh Moskovych oleh_moskov...@epam.com
> * Vadym Kuznetsov vadym_kuznet...@epam.com
> * Usein Faradzhev usein_faradz...@epam.com
> * Bohdan Hliva bohdan_hl...@epam.com
> * Oleksandr Melnychuk oleksandr_melnych...@epam.com
> * Mikhail Teplitskiy mikhail_teplits...@epam.com
> * Vira Vitanska vira_vitan...@epam.com
> * Andriana Kovalyshyn andriana_kovalys...@epam.com
> * Oleksandr Chaparin oleksandr_chapa...@epam.com
> * Denys Shliakhov denys_shliak...@epam.com
> * Nazar Barabash nazar_barab...@epam.com
> * Yuriy Holinko yuriy_holi...@epam.com
> * Petro Kotsiuba petro_kotsi...@epam.com
> * Bogdan Rudyi bogdan_ru...@epam.com
> * Mikhail Teplitskyi mikhail_teplits...@epam.com
> 
> == Sponsors ==
> 
> === Champion ===
> * P. Taylor Goetz ptgo...@apache.org
> 
> === Nominated Mentors ===
> * P. Taylor Goetz ptgo...@apache.org
> 
> === Sponsoring Entity ===
> * The Apache Incubator
> 

