date:20180126

[RESULT] [VOTE] Apache DataFu 1.3.3 release RC1

2018-01-26 Thread Matthew Hayes

The vote has passed with three +1 binding votes, and no -1s or 0s.

Binding +1s: Justin Mclean (binding), Jakob Homan (binding), Roman
Shaposhnik (binding)

Thanks for taking the time to look at the release!

-Matt

On Fri, Jan 26, 2018 at 3:38 PM, Matthew Hayes <
matthew.terence.ha...@gmail.com> wrote:

> Thanks for taking a look Roman.  I have filed the following JIRA about the
> gradle.properties issue you raised:
>
> https://issues.apache.org/jira/browse/DATAFU-136
>
>
>
> On Fri, Jan 26, 2018 at 2:53 PM, Roman Shaposhnik 
> wrote:
>
>> +1 (binding)
>>
>> - checked sigs
>> - LICENSES/NOTICE/DISCLAIMER look good
>> - Headers look good
>> - checked for the release tag in Git
>> - offending (cat X) material got removed, but otherwise git tag
>> matches release tarball
>>
>> Thanks,
>> Roman.
>>
>> P.S. Weirdly gradle.properties is set to non-release on tag -- you may
>> want to look
>> into that for future releases
>>
>>
>> On Thu, Jan 25, 2018 at 10:59 AM, Jakob Homan  wrote:
>> > +1 (binding)
>> > - Sigs/asc look good
>> > - NOTICE/LICENSE/DISCLAIMER look good
>> > - Licenses look good
>> > - Tests succeed
>> > - Gradle binaries not included
>> >
>> > Good work.
>> > -Jakob
>> >
>> > On 24 January 2018 at 13:01, Justin Mclean  wrote:
>> >> Hi,
>> >>
>> >>> Hi, it's been almost 72 hours since the vote was opened.  How many votes do
>> >>> we need for this to pass?  Can other folks take a look if necessary?
>> >>
>> >> I suggest asking your mentors who are IPMC members to vote.
>> >>
>> >> Thanks,
>> >> Justin
>> >>
>> >> -
>> >> To unsubscribe, e-mail: general-unsubscr...@incubator.apache.org
>> >> For additional commands, e-mail: general-h...@incubator.apache.org
>> >>

Re: [VOTE] Apache DataFu 1.3.3 release RC1

2018-01-26 Thread Matthew Hayes

Thanks for taking a look Roman.  I have filed the following JIRA about the
gradle.properties issue you raised:

https://issues.apache.org/jira/browse/DATAFU-136



On Fri, Jan 26, 2018 at 2:53 PM, Roman Shaposhnik 
wrote:

> +1 (binding)
>
> - checked sigs
> - LICENSES/NOTICE/DISCLAIMER look good
> - Headers look good
> - checked for the release tag in Git
> - offending (cat X) material got removed, but otherwise git tag
> matches release tarball
>
> Thanks,
> Roman.
>
> P.S. Weirdly gradle.properties is set to non-release on tag -- you may
> want to look
> into that for future releases
>
>
> On Thu, Jan 25, 2018 at 10:59 AM, Jakob Homan  wrote:
> > +1 (binding)
> > - Sigs/asc look good
> > - NOTICE/LICENSE/DISCLAIMER look good
> > - Licenses look good
> > - Tests succeed
> > - Gradle binaries not included
> >
> > Good work.
> > -Jakob
> >
> > On 24 January 2018 at 13:01, Justin Mclean  wrote:
> >> Hi,
> >>
> >>> Hi, it's been almost 72 hours since the vote was opened.  How many votes do
> >>> we need for this to pass?  Can other folks take a look if necessary?
> >>
> >> I suggest asking your mentors who are IPMC members to vote.
> >>
> >> Thanks,
> >> Justin
> >>

Re: [VOTE] Apache DataFu 1.3.3 release RC1

2018-01-26 Thread Roman Shaposhnik

+1 (binding)

- checked sigs
- LICENSES/NOTICE/DISCLAIMER look good
- Headers look good
- checked for the release tag in Git
- offending (cat X) material got removed, but otherwise git tag
matches release tarball

Thanks,
Roman.

P.S. Weirdly gradle.properties is set to non-release on tag -- you may
want to look
into that for future releases
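
The "git tag matches release tarball" check above can be sketched roughly as follows. This is a minimal illustration, not the procedure Roman actually used: it assumes the tag has already been exported to one directory (without the .git folder) and the tarball unpacked into another, and the function names are hypothetical.

```python
import hashlib
from pathlib import Path


def tree_digests(root):
    """Map each file path (relative to root) to its SHA-256 hex digest."""
    root = Path(root)
    digests = {}
    for path in sorted(root.rglob("*")):
        if path.is_file():
            rel = path.relative_to(root).as_posix()
            digests[rel] = hashlib.sha256(path.read_bytes()).hexdigest()
    return digests


def compare_trees(tag_dir, tarball_dir):
    """Return (only_in_tag, only_in_tarball, changed) as sorted path lists.

    Assumption: tag_dir is an export of the release tag and tarball_dir is
    the unpacked source release; both are plain directories on disk.
    """
    tag = tree_digests(tag_dir)
    tarball = tree_digests(tarball_dir)
    only_in_tag = sorted(set(tag) - set(tarball))
    only_in_tarball = sorted(set(tarball) - set(tag))
    changed = sorted(p for p in set(tag) & set(tarball) if tag[p] != tarball[p])
    return only_in_tag, only_in_tarball, changed
```

Files that were deliberately removed from the release (such as the category X material mentioned above) would show up in the first list, so a reviewer would still need to judge whether each difference is expected.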


On Thu, Jan 25, 2018 at 10:59 AM, Jakob Homan  wrote:
> +1 (binding)
> - Sigs/asc look good
> - NOTICE/LICENSE/DISCLAIMER look good
> - Licenses look good
> - Tests succeed
> - Gradle binaries not included
>
> Good work.
> -Jakob
>
> On 24 January 2018 at 13:01, Justin Mclean  wrote:
>> Hi,
>>
>>> Hi, it's been almost 72 hours since the vote was opened.  How many votes do
>>> we need for this to pass?  Can other folks take a look if necessary?
>>
>> I suggest asking your mentors who are IPMC members to vote.
>>
>> Thanks,
>> Justin
>>

Re: [PROPOSAL] Onyx - proposal for Apache Incubation

2018-01-26 Thread Romain Manni-Bucau

On 26 Jan 2018 21:53, "Byung-Gon Chun" wrote:

On Sat, Jan 27, 2018 at 5:41 AM, Romain Manni-Bucau 
wrote:

> Why not make it a Beam subproject? Any blockers?
>
>
Thanks for the question, Romain.

We have a flexible, efficient runtime that supports various user programs
(e.g., Beam and Spark programs).
We are taking advantage of Beam as a programming layer, but our focus is
more on optimizing execution on various deployment scenarios.
We also plan to support other programming layers.



I tend to think the two can converge, since Beam is about portability and
is complementary, IMHO. It could be worth a PoC.



> Otherwise +1 to have it @asf, makes a lot of sense.
>
>
Thanks for the support!

-Gon


> On 26 Jan 2018 20:58, "Byung-Gon Chun" wrote:
>
> > On Sat, Jan 27, 2018 at 4:09 AM, Davor Bonaci  wrote:
> >
> > > Great work -- I think this technology has a lot of promise, and I'd love to
> > > see its evolution inside the Foundation.
> > >
> > >
> > Thanks, Davor!
> >
> >
> > > Parts of it, like the Onyx Intermediate Representation [1], overlap with
> > > the work-in-progress inside the Apache Beam project ("portability"). We'd
> > > love to work together on this -- would you be open to such collaboration?
> > > If so, it may not be necessary to start from scratch, and leverage the work
> > > already done.
> > >
> > >
> > Sure. We're open to collaboration.
> >
> >
> > > Regarding the name, Onyx would likely have to be renamed, due to a conflict
> > > with a related technology [2].
> > >
> > >
> > Thanks for pointing it out. It's difficult to come up with a good short
> > name. :)
> > Do you have any suggestion?
> >
> > Thanks!
> > -Gon
> >
> > ---
> > Byung-Gon Chun
> >
> >
> >
> > > Davor
> > >
> > > [1] https://snuspl.github.io/onyx/docs/ir/
> > > [2] http://www.onyxplatform.org/
> > >

Re: [PROPOSAL] Onyx - proposal for Apache Incubation

2018-01-26 Thread Byung-Gon Chun

On Sat, Jan 27, 2018 at 5:41 AM, Romain Manni-Bucau 
wrote:

> Why not make it a Beam subproject? Any blockers?
>
>
Thanks for the question, Romain.

We have a flexible, efficient runtime that supports various user programs
(e.g., Beam and Spark programs).
We are taking advantage of Beam as a programming layer, but our focus is
more on optimizing execution on various deployment scenarios.
We also plan to support other programming layers.


> Otherwise +1 to have it @asf, makes a lot of sense.
>
>
Thanks for the support!

-Gon


> On 26 Jan 2018 20:58, "Byung-Gon Chun" wrote:
>
> > On Sat, Jan 27, 2018 at 4:09 AM, Davor Bonaci  wrote:
> >
> > > Great work -- I think this technology has a lot of promise, and I'd love to
> > > see its evolution inside the Foundation.
> > >
> > >
> > Thanks, Davor!
> >
> >
> > > Parts of it, like the Onyx Intermediate Representation [1], overlap with
> > > the work-in-progress inside the Apache Beam project ("portability"). We'd
> > > love to work together on this -- would you be open to such collaboration?
> > > If so, it may not be necessary to start from scratch, and leverage the work
> > > already done.
> > >
> > >
> > Sure. We're open to collaboration.
> >
> >
> > > Regarding the name, Onyx would likely have to be renamed, due to a conflict
> > > with a related technology [2].
> > >
> > >
> > Thanks for pointing it out. It's difficult to come up with a good short
> > name. :)
> > Do you have any suggestion?
> >
> > Thanks!
> > -Gon
> >
> > ---
> > Byung-Gon Chun
> >
> >
> >
> > > Davor
> > >
> > > [1] https://snuspl.github.io/onyx/docs/ir/
> > > [2] http://www.onyxplatform.org/
> > >

Re: [PROPOSAL] Onyx - proposal for Apache Incubation

2018-01-26 Thread Romain Manni-Bucau

Why not make it a Beam subproject? Any blockers?

Otherwise +1 to have it @asf, makes a lot of sense.

On 26 Jan 2018 20:58, "Byung-Gon Chun" wrote:

> On Sat, Jan 27, 2018 at 4:09 AM, Davor Bonaci  wrote:
>
> > Great work -- I think this technology has a lot of promise, and I'd love to
> > see its evolution inside the Foundation.
> >
> >
> Thanks, Davor!
>
>
> > Parts of it, like the Onyx Intermediate Representation [1], overlap with
> > the work-in-progress inside the Apache Beam project ("portability"). We'd
> > love to work together on this -- would you be open to such collaboration?
> > If so, it may not be necessary to start from scratch, and leverage the work
> > already done.
> >
> >
> Sure. We're open to collaboration.
>
>
> > Regarding the name, Onyx would likely have to be renamed, due to a conflict
> > with a related technology [2].
> >
> >
> Thanks for pointing it out. It's difficult to come up with a good short
> name. :)
> Do you have any suggestion?
>
> Thanks!
> -Gon
>
> ---
> Byung-Gon Chun
>
>
>
> > Davor
> >
> > [1] https://snuspl.github.io/onyx/docs/ir/
> > [2] http://www.onyxplatform.org/
> >

Re: [PROPOSAL] Onyx - proposal for Apache Incubation

2018-01-26 Thread Byung-Gon Chun

On Sat, Jan 27, 2018 at 4:09 AM, Davor Bonaci  wrote:

> Great work -- I think this technology has a lot of promise, and I'd love to
> see its evolution inside the Foundation.
>
>
Thanks, Davor!


> Parts of it, like the Onyx Intermediate Representation [1], overlap with
> the work-in-progress inside the Apache Beam project ("portability"). We'd
> love to work together on this -- would you be open to such collaboration?
> If so, it may not be necessary to start from scratch, and leverage the work
> already done.
>
>
Sure. We're open to collaboration.


> Regarding the name, Onyx would likely have to be renamed, due to a conflict
> with a related technology [2].
>
>
Thanks for pointing it out. It's difficult to come up with a good short
name. :)
Do you have any suggestion?

Thanks!
-Gon

---
Byung-Gon Chun



> Davor
>
> [1] https://snuspl.github.io/onyx/docs/ir/
> [2] http://www.onyxplatform.org/
>
> On Thu, Jan 25, 2018 at 3:28 PM, Byung-Gon Chun  wrote:
>
> > Dear Apache Incubator Community,
> >
> > Please accept the following proposal for presentation and discussion:
> > https://wiki.apache.org/incubator/OnyxProposal
> >
> > Onyx is a data processing system that aims to flexibly control the runtime
> > behaviors of a job to adapt to varying deployment characteristics (e.g.,
> > harnessing transient resources in datacenters, cross-datacenter deployment,
> > changing runtime based on job characteristics, etc.). Onyx provides ways to
> > extend the system’s capabilities and incorporate the extensions to the
> > flexible job execution.
> > Onyx translates a user program (e.g., Apache Beam, Apache Spark) into an
> > Intermediate Representation (IR) DAG, which Onyx optimizes and deploys
> > based on a deployment policy.
> >
> > I've attached the proposal below.
> >
> > Best regards,
> > Byung-Gon Chun
> >
> > = OnyxProposal =
> >
> > == Abstract ==
> > Onyx is a data processing system for flexible employment with
> > different execution scenarios for various deployment characteristics
> > on clusters.
> >
> > == Proposal ==
> > Today, there is a wide variety of data processing systems with
> > different designs for better performance and datacenter efficiency.
> > They include processing data on specific resource environments and
> > running jobs with specific attributes. Although each system
> > successfully solves the problems it targets, most systems are designed
> > in the way that runtime behaviors are built tightly inside the system
> > core to hide the complexity of distributed computing. This makes it
> > hard for a single system to support different deployment
> > characteristics with different runtime behaviors without substantial
> > effort.
> >
> > Onyx is a data processing system that aims to flexibly control the
> > runtime behaviors of a job to adapt to varying deployment
> > characteristics. Moreover, it provides a means of extending the
> > system’s capabilities and incorporating the extensions to the flexible
> > job execution.
> >
> > In order to be able to easily modify runtime behaviors to adapt to
> > varying deployment characteristics, Onyx exposes runtime behaviors to
> > be flexibly configured and modified at both compile-time and runtime
> > through a set of high-level graph pass interfaces.
> >
> > We hope to contribute to the big data processing community by enabling
> > more flexibility and extensibility in job executions. Furthermore, we
> > can benefit more together as a community when we work together as a
> > community to mature the system with more use cases and understanding
> > of diverse deployment characteristics. The Apache Software Foundation
> > is the perfect place to achieve these aspirations.
> >
> > == Background ==
> > Many data processing systems have distinctive runtime behaviors
> > optimized and configured for specific deployment characteristics like
> > different resource environments and for handling special job
> > attributes.
> >
> > For example, much research has been conducted to overcome the
> > challenge of running data processing jobs on cheap, unreliable
> > transient resources. Likewise, techniques for disaggregating different
> > types of resources, like memory, CPU and GPU, are being actively
> > developed to use datacenter resources more efficiently. Many
> > researchers are also working to run data processing jobs in even more
> > diverse environments, such as across distant datacenters. Similarly,
> > for special job attributes, many works take different approaches, such
> > as runtime optimization, to solve problems like data skew, and to
> > optimize systems for data processing jobs with small-scale input data.
> >
> > Although each of the systems performs well with the jobs and in the
> > environments they target, they perform poorly with unconsidered cases,
> > and do not consider supporting multiple deployment characteristics on
> > a single system in their designs.
> >
> > For an application writer to optimize an application to per

Re: [PROPOSAL] Onyx - proposal for Apache Incubation

2018-01-26 Thread Davor Bonaci

Great work -- I think this technology has a lot of promise, and I'd love to
see its evolution inside the Foundation.

Parts of it, like the Onyx Intermediate Representation [1], overlap with
the work-in-progress inside the Apache Beam project ("portability"). We'd
love to work together on this -- would you be open to such collaboration?
If so, it may not be necessary to start from scratch, and leverage the work
already done.

Regarding the name, Onyx would likely have to be renamed, due to a conflict
with a related technology [2].

Davor

[1] https://snuspl.github.io/onyx/docs/ir/
[2] http://www.onyxplatform.org/

On Thu, Jan 25, 2018 at 3:28 PM, Byung-Gon Chun  wrote:

> Dear Apache Incubator Community,
>
> Please accept the following proposal for presentation and discussion:
> https://wiki.apache.org/incubator/OnyxProposal
>
> Onyx is a data processing system that aims to flexibly control the runtime
> behaviors of a job to adapt to varying deployment characteristics (e.g.,
> harnessing transient resources in datacenters, cross-datacenter deployment,
> changing runtime based on job characteristics, etc.). Onyx provides ways to
> extend the system’s capabilities and incorporate the extensions to the
> flexible job execution.
> Onyx translates a user program (e.g., Apache Beam, Apache Spark) into an
> Intermediate Representation (IR) DAG, which Onyx optimizes and deploys
> based on a deployment policy.
>
> I've attached the proposal below.
>
> Best regards,
> Byung-Gon Chun
>
> = OnyxProposal =
>
> == Abstract ==
> Onyx is a data processing system for flexible employment with
> different execution scenarios for various deployment characteristics
> on clusters.
>
> == Proposal ==
> Today, there is a wide variety of data processing systems with
> different designs for better performance and datacenter efficiency.
> They include processing data on specific resource environments and
> running jobs with specific attributes. Although each system
> successfully solves the problems it targets, most systems are designed
> in the way that runtime behaviors are built tightly inside the system
> core to hide the complexity of distributed computing. This makes it
> hard for a single system to support different deployment
> characteristics with different runtime behaviors without substantial
> effort.
>
> Onyx is a data processing system that aims to flexibly control the
> runtime behaviors of a job to adapt to varying deployment
> characteristics. Moreover, it provides a means of extending the
> system’s capabilities and incorporating the extensions to the flexible
> job execution.
>
> In order to be able to easily modify runtime behaviors to adapt to
> varying deployment characteristics, Onyx exposes runtime behaviors to
> be flexibly configured and modified at both compile-time and runtime
> through a set of high-level graph pass interfaces.
>
> We hope to contribute to the big data processing community by enabling
> more flexibility and extensibility in job executions. Furthermore, we
> can benefit more together as a community when we work together as a
> community to mature the system with more use cases and understanding
> of diverse deployment characteristics. The Apache Software Foundation
> is the perfect place to achieve these aspirations.
>
> == Background ==
> Many data processing systems have distinctive runtime behaviors
> optimized and configured for specific deployment characteristics like
> different resource environments and for handling special job
> attributes.
>
> For example, much research has been conducted to overcome the
> challenge of running data processing jobs on cheap, unreliable
> transient resources. Likewise, techniques for disaggregating different
> types of resources, like memory, CPU and GPU, are being actively
> developed to use datacenter resources more efficiently. Many
> researchers are also working to run data processing jobs in even more
> diverse environments, such as across distant datacenters. Similarly,
> for special job attributes, many works take different approaches, such
> as runtime optimization, to solve problems like data skew, and to
> optimize systems for data processing jobs with small-scale input data.
>
> Although each of the systems performs well with the jobs and in the
> environments they target, they perform poorly with unconsidered cases,
> and do not consider supporting multiple deployment characteristics on
> a single system in their designs.
>
> For an application writer to optimize an application to perform well
> on a certain system engraved with its underlying behaviors, it
> requires a deep understanding of the system itself, which is an
> overhead that often requires a lot of time and effort. Moreover, for a
> developer to modify such system behaviors, it requires modifications
> of the system core, which requires an even deeper understanding of the
> system itself.
>
> With this background, Onyx is designed to represent all of its jobs as
> an Inte

Re: License headers on test data (was Re: [VOTE] Release Apache NetBeans 9.0 Beta (incubating) rc2)

2018-01-26 Thread Jaroslav Tulach

2018-01-23 17:36 GMT+01:00 Alex Harui :

> FWIW,  some build and test processes have a "generate-sources" and/or
> "generate-test-sources" step.  Have you considered having a step in your
> test processes copy the source test files into a temporary folder and
> remove the headers as part of that step?   Then you may not need to change
> the test harness and expected result set.
>

Thanks Alex. Should the introduction of headers to test data looking like
Java files be inevitable, this is my favorite solution as well.
-jt
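
The "generate-test-sources" idea discussed above could look roughly like the sketch below. It is only an illustration under stated assumptions: the header is taken to be a single leading /* ... */ comment that mentions the ASF licence, only .java files are rewritten, and the function names and directory layout are hypothetical rather than anything from the NetBeans build.

```python
import re
import shutil
from pathlib import Path

# Assumption: the licence header is the first /* ... */ comment in the file.
LEADING_COMMENT = re.compile(r"\A\s*/\*.*?\*/[ \t]*\r?\n?", re.DOTALL)
LICENSE_MARKER = "Licensed to the Apache Software Foundation"


def strip_header(text):
    """Drop the leading block comment only if it is an ASF licence header."""
    match = LEADING_COMMENT.match(text)
    if match and LICENSE_MARKER in match.group(0):
        return text[match.end():]
    return text


def generate_test_sources(src_dir, out_dir):
    """Copy test data into out_dir, removing licence headers from .java files."""
    src_dir, out_dir = Path(src_dir), Path(out_dir)
    for path in src_dir.rglob("*"):
        if not path.is_file():
            continue
        target = out_dir / path.relative_to(src_dir)
        target.parent.mkdir(parents=True, exist_ok=True)
        if path.suffix == ".java":
            text = path.read_text(encoding="utf-8")
            target.write_text(strip_header(text), encoding="utf-8")
        else:
            shutil.copy2(path, target)  # other test data is copied unchanged
```

With a step like this, the checked-in test data keeps its headers while the tests run against the stripped copies, so the expected results would not need to change.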



> On 1/23/18, 5:46 AM, "Geertjan Wielenga"
>  wrote:
>
> >OK, makes sense, thanks for these insights and ideas.
> >
> >Gj
> >
> >On Tue, Jan 23, 2018 at 2:40 PM, Bertrand Delacretaz
> > wrote:
> >> Hi,
> >>
> >> On Tue, Jan 23, 2018 at 2:35 PM, Geertjan Wielenga
> >>  wrote:
> >>
> >>>...
> >>>
> >>>https://github.com/apache/incubator-netbeans/blob/master/nbbuild/build.xml
> >>> This is what line 2105 says:
> >>>   ...
> >>
> >> Maybe grouping those exclusions by families would make it easier for
> >> reviewers to understand them: first the ones which are not creative,
> >> then those where a header would cause tests to fail etc.
> >>
> >>> ...You're saying the comment isn't needed in the README...
> >>
> >> What I'm saying is that it shouldn't be duplicated - have the README
> >> point to that build.xml file,or as discussed a file that just has RAT
> >> exclusions, and add the comments next to the exclusions, pointing to
> >> apache.org docs where useful.
> >>
> >>> ...can NETBEANS-306 be closed as resolved?...
> >>
> >> I suggest grouping the exclusions that fall in that family and adding
> >> a pointer to the Apache docs that mention that the header is not
> >> required if it causes tests to fail.
> >>
> >> You then get links from README -> commented RAT exclusions -> Apache
> >> documentation which provide a clear justification.
> >>
> >> -Bertrand
> >>

Re: [Request] Write access to the incubator wiki

2018-01-26 Thread John D. Ament

Added, happy editing!

On Thu, Jan 25, 2018 at 9:37 PM Joo Yeon Kim  wrote:

> Hi,
>
> Please grant me write access to the incubator wiki:
> https://wiki.apache.org/incubator.
>
> My user name is JooYeonKim.
>
> Thank you :)
>
> - Joo Yeon Kim
>
