Fwd: 404 issues

2018-03-06 Thread Hen
Apologies for my rustiness :(

Are we still able to manage a mod_rewrite configuration per project, or did
that go away?

Thanks,

Hen

-- Forwarded message --
From: Aaron Markham 
Date: Mon, Mar 5, 2018 at 3:54 PM
Subject: 404 issues
To: d...@mxnet.incubator.apache.org


I've been notified by several parties about 404s for files that are
no longer available on the site.
https://github.com/apache/incubator-mxnet/issues/9917

This page returns 404:
https://mxnet.incubator.apache.org/api/python/module.html

It was moved here:
https://mxnet.incubator.apache.org/api/python/module/module.html

There are many other examples of moved content. Some are temporarily
"fixed" via adding a meta-refresh tag in the html source for the old
pages. For example the meta tag is being used to redirect to faq, the
new location for the how_to docs.

It would seem a better solution for us is to use htaccess files and
publish persistent redirects for the new location(s) of content.
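
As a rough illustration of the redirect idea above (a sketch, not a statement of what Apache infra supports): the Python snippet below turns a table of moved pages into `Redirect permanent` lines, the mod_alias directive that can be placed in an .htaccess file. The `MOVED` table and the `redirect_lines` helper are hypothetical names; only the module.html pair comes from this thread, and whether per-project .htaccess files are honoured on the ASF web servers is exactly the open question.

```python
# Sketch only: build mod_alias "Redirect permanent" lines for an .htaccess file.
# SITE and the single MOVED entry are taken from the thread; the helper name
# and the idea of keeping the mapping in a dict are assumptions.

SITE = "https://mxnet.incubator.apache.org"

MOVED = {
    "/api/python/module.html": "/api/python/module/module.html",
}


def redirect_lines(moved, site=SITE):
    """Return one 'Redirect permanent <old-path> <new-url>' line per moved page."""
    lines = []
    for old_path, new_path in sorted(moved.items()):
        if not old_path.startswith("/"):
            raise ValueError(f"old path must be absolute: {old_path!r}")
        # mod_alias accepts a full URL as the redirect target.
        lines.append(f"Redirect permanent {old_path} {site}{new_path}")
    return lines


if __name__ == "__main__":
    print("\n".join(redirect_lines(MOVED)))
```

Keeping the mapping in one table would also make it easy to regenerate the file whenever more pages move.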

Do we have a way of pushing a config to the Apache infra to facilitate
this? I think we'd need config access if we're to put up some custom
404 pages too (which would be nicer than what we have now.)

Also, is there a good way to access the log files to get a better idea
of the 404 situation?

Cheers,
Aaron


Re: [VOTE]: Apache HAWQ 2.3.0.0-incubating Release

2018-03-06 Thread Makoto Yui
Groeten,

Thanks for clarification.

Makoto

2018-03-07 15:27 GMT+09:00 Henk P. Penning :
> On Wed, 7 Mar 2018, Makoto Yui wrote:
>
>> Date: Wed, 7 Mar 2018 07:15:15 +0100
>> From: Makoto Yui 
>> To: general@incubator.apache.org
>> Subject: Re: [VOTE]: Apache HAWQ 2.3.0.0-incubating Release
>>
>>> Recently the "release policy" [1] changed ; it (now) says :
>>
>>
>> Good to know about it.
>>
>>> MUST supply at least one (SHA or MD5) checksum file,
>>> SHOULD NOT supply a MD5 checksum file (because MD5 is too broken).
>
>
>> This sounds confusing. Could be "MUST supply SHA checksum file(s)," (?)
>
>
>   See RFC 2119 : https://www.ietf.org/rfc/rfc2119.txt
>
>   "MUST", "SHOULD", "SHOULD NOT" etc are technical terms.
>   RFC 2119 explains what they mean.
>
>   Hint :
>
> MUST   == obligation ; a strict requirement
> SHOULD == recommendation ;
>
>   Note : the policy can't say "MUST supply SHA checksum",
>   because many projects have older releases in /dist/
>   that are MD5-only ; under "new policy", that is
>   frowned upon, but not forbidden.
>
>> Makoto
>
>
>   Groeten,
>
>   HPP
>
>
> Henk P. Penning, ICT-beta  R Uithof MG-403
> Faculty of Science, Utrecht University  T +31 30 253 4106
> Leuvenlaan 4, 3584CE Utrecht, NL  F +31 30 253 4553
> http://www.staff.science.uu.nl/~penni101/  M penn...@uu.nl
>
>> 2018-03-07 14:58 GMT+09:00 Henk P. Penning :
>>>
>>> On Wed, 7 Mar 2018, Yi JIN wrote:
>>>
>>>> Date: Wed, 7 Mar 2018 03:56:54 +0100
>>>> From: Yi JIN
>>>> To: general@incubator.apache.org
>>>> Subject: [VOTE]: Apache HAWQ 2.3.0.0-incubating Release
>>>
>>>
>>>
>>>> The artifacts can be downloaded here:
>>>>
>>>> https://dist.apache.org/repos/dist/dev/incubator/hawq/2.3.0.0-incubating.RC2/
>>>
>>>
>>>
>>>   Recently the "release policy" [1] changed ; it (now) says :
>>>
>>> ...
>>> * SHOULD NOT supply a MD5 checksum file
>>> ...
>>>
>>>   [1] https://www.apache.org/dev/release-distribution#sigs-and-sums
>>>
>>>> Yi Jin (yjin)
>>>
>>>
>>>
>>>   Groeten,
>>>
>>>   Henk Penning -- apache.org infrastructure ; dist & mirrors
>>>
>>> Henk P. Penning, ICT-beta  R Uithof MG-403
>>> Faculty of Science, Utrecht University  T +31 30 253 4106
>>> Leuvenlaan 4, 3584CE Utrecht, NL  F +31 30 253 4553
>>> http://www.staff.science.uu.nl/~penni101/  M penn...@uu.nl
>>>
>>> -
>>> To unsubscribe, e-mail: general-unsubscr...@incubator.apache.org
>>> For additional commands, e-mail: general-h...@incubator.apache.org
>>>
>>
>>
>>
>> --
>> Makoto YUI 
>> Research Engineer, Treasure Data, Inc.
>> http://myui.github.io/
>>
>



-- 
Makoto YUI 
Research Engineer, Treasure Data, Inc.
http://myui.github.io/




Re: [VOTE]: Apache HAWQ 2.3.0.0-incubating Release

2018-03-06 Thread Henk P. Penning

On Wed, 7 Mar 2018, Makoto Yui wrote:

> Date: Wed, 7 Mar 2018 07:15:15 +0100
> From: Makoto Yui
> To: general@incubator.apache.org
> Subject: Re: [VOTE]: Apache HAWQ 2.3.0.0-incubating Release
>
>> Recently the "release policy" [1] changed ; it (now) says :
>
> Good to know about it.
>
>> MUST supply at least one (SHA or MD5) checksum file,
>> SHOULD NOT supply a MD5 checksum file (because MD5 is too broken).
>
> This sounds confusing. Could be "MUST supply SHA checksum file(s)," (?)


  See RFC 2119 : https://www.ietf.org/rfc/rfc2119.txt

  "MUST", "SHOULD", "SHOULD NOT" etc are technical terms.
  RFC 2119 explains what they mean.

  Hint :

MUST   == obligation ; a strict requirement
SHOULD == recommendation ;

  Note : the policy can't say "MUST supply SHA checksum",
  because many projects have older releases in /dist/
  that are MD5-only ; under "new policy", that is
  frowned upon, but not forbidden.
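
For anyone preparing a release, a minimal sketch of producing and checking a SHA-512 digest with Python's standard hashlib follows. The helper names (`sha512_of`, `matches`) are made up for illustration, and the layout of a project's .sha512 file is assumed to be known to the caller, so the expected digest is passed in as a plain hex string.

```python
# Sketch only: compute a SHA-512 digest for a release artifact and compare it
# with an expected hex digest. Helper names are hypothetical.
import hashlib


def sha512_of(path, chunk_size=1 << 20):
    """Return the SHA-512 hex digest of the file at `path`, read in chunks."""
    digest = hashlib.sha512()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches(path, expected_hex):
    """True if the file's SHA-512 equals `expected_hex` (case/whitespace ignored)."""
    return sha512_of(path) == expected_hex.strip().lower()
```

Reading in chunks keeps memory use flat even for large source tarballs.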


> Makoto


  Groeten,

  HPP

Henk P. Penning, ICT-beta  R Uithof MG-403
Faculty of Science, Utrecht University  T +31 30 253 4106
Leuvenlaan 4, 3584CE Utrecht, NL  F +31 30 253 4553
http://www.staff.science.uu.nl/~penni101/  M penn...@uu.nl


> 2018-03-07 14:58 GMT+09:00 Henk P. Penning :
>>
>> On Wed, 7 Mar 2018, Yi JIN wrote:
>>
>>> Date: Wed, 7 Mar 2018 03:56:54 +0100
>>> From: Yi JIN
>>> To: general@incubator.apache.org
>>> Subject: [VOTE]: Apache HAWQ 2.3.0.0-incubating Release
>>
>>> The artifacts can be downloaded here:
>>>
>>> https://dist.apache.org/repos/dist/dev/incubator/hawq/2.3.0.0-incubating.RC2/
>>
>>   Recently the "release policy" [1] changed ; it (now) says :
>>
>> ...
>> * SHOULD NOT supply a MD5 checksum file
>> ...
>>
>>   [1] https://www.apache.org/dev/release-distribution#sigs-and-sums
>>
>>> Yi Jin (yjin)
>>
>>   Groeten,
>>
>>   Henk Penning -- apache.org infrastructure ; dist & mirrors
>>
>> Henk P. Penning, ICT-beta  R Uithof MG-403
>> Faculty of Science, Utrecht University  T +31 30 253 4106
>> Leuvenlaan 4, 3584CE Utrecht, NL  F +31 30 253 4553
>> http://www.staff.science.uu.nl/~penni101/  M penn...@uu.nl
>
> --
> Makoto YUI
> Research Engineer, Treasure Data, Inc.
> http://myui.github.io/



Re: [VOTE]: Apache HAWQ 2.3.0.0-incubating Release

2018-03-06 Thread Makoto Yui
> Recently the "release policy" [1] changed ; it (now) says :

Good to know about it.

> MUST supply at least one (SHA or MD5) checksum file,
> SHOULD NOT supply a MD5 checksum file (because MD5 is too broken).

This sounds confusing. Could be "MUST supply SHA checksum file(s)," (?)

Makoto

2018-03-07 14:58 GMT+09:00 Henk P. Penning :
> On Wed, 7 Mar 2018, Yi JIN wrote:
>
>> Date: Wed, 7 Mar 2018 03:56:54 +0100
>> From: Yi JIN 
>> To: general@incubator.apache.org
>> Subject: [VOTE]: Apache HAWQ 2.3.0.0-incubating Release
>
>
>> The artifacts can be downloaded here:
>>
>> https://dist.apache.org/repos/dist/dev/incubator/hawq/2.3.0.0-incubating.RC2/
>
>
>   Recently the "release policy" [1] changed ; it (now) says :
>
> ...
> * SHOULD NOT supply a MD5 checksum file
> ...
>
>   [1] https://www.apache.org/dev/release-distribution#sigs-and-sums
>
>> Yi Jin (yjin)
>
>
>   Groeten,
>
>   Henk Penning -- apache.org infrastructure ; dist & mirrors
>
> Henk P. Penning, ICT-beta  R Uithof MG-403
> Faculty of Science, Utrecht University  T +31 30 253 4106
> Leuvenlaan 4, 3584CE Utrecht, NL  F +31 30 253 4553
> http://www.staff.science.uu.nl/~penni101/  M penn...@uu.nl
>



-- 
Makoto YUI 
Research Engineer, Treasure Data, Inc.
http://myui.github.io/




Re: [VOTE]: Apache HAWQ 2.3.0.0-incubating Release

2018-03-06 Thread Henk P. Penning

On Wed, 7 Mar 2018, Yi JIN wrote:


> Date: Wed, 7 Mar 2018 03:56:54 +0100
> From: Yi JIN
> To: general@incubator.apache.org
> Subject: [VOTE]: Apache HAWQ 2.3.0.0-incubating Release

> The artifacts can be downloaded here:
> https://dist.apache.org/repos/dist/dev/incubator/hawq/2.3.0.0-incubating.RC2/

  Recently the "release policy" [1] changed ; it (now) says :

...
* SHOULD NOT supply a MD5 checksum file
...

  [1] https://www.apache.org/dev/release-distribution#sigs-and-sums

> Yi Jin (yjin)


  Groeten,

  Henk Penning -- apache.org infrastructure ; dist & mirrors

Henk P. Penning, ICT-beta  R Uithof MG-403
Faculty of Science, Utrecht University  T +31 30 253 4106
Leuvenlaan 4, 3584CE Utrecht, NL  F +31 30 253 4553
http://www.staff.science.uu.nl/~penni101/  M penn...@uu.nl




Re: [DISCUSS] Dr. Elephant Incubator Proposal

2018-03-06 Thread Timothy Chen
+1 as well, I think the work Dr. Elephant is doing can also be
potentially applied to more than Spark and Hadoop.

Tim

On Tue, Mar 6, 2018 at 8:38 PM, Kevin A. McGrail  wrote:
> I'm intrigued by the proposal and the product. I'm a 0.5+.
>
> I'd love to know more about why LI put it on GitHub and what problems it's
> having that are leading to a foundation.
>
> --
> Kevin A. McGrail
> Asst. Treasurer & VP Fundraising, Apache Software Foundation
> Chair Emeritus Apache SpamAssassin Project
> https://www.linkedin.com/in/kmcgrail - 703.798.0171
>
> On Tue, Mar 6, 2018 at 8:27 PM, Gangumalla, Uma 
> wrote:
>
>> I would +1 to have as a separate project instead of pushing under Hadoop.
>> When a project can sustain by having potential to build community on its
>> own and can run logically as independent module, I feel that’s good enough
>> to start as separate project.
>>
>> I could not recall the discussions on removal of Vaidya package from
>> Hadoop. If someone remembers, it would be great to know the reasons for
>> removal of that package from Hadoop base. [ probably at the time of
>> mavenization ? ]
>>
>> Regards,
>> Uma
>>
>> On 3/6/18, 3:17 PM, "md...@cloudera.com on behalf of Mike Drob" <
>> md...@cloudera.com on behalf of md...@apache.org> wrote:
>>
>> Why does Dr. Elephant make sense as a separate project instead of
>> contributing to Hadoop directly?
>>
>> What is the relationship between Dr. Elephant and the (now seemingly
>> defunct) Hadoop Vaidya?
>>
>> On Tue, Mar 6, 2018 at 5:08 PM, Carl Steinbach  wrote:
>>
>> > Hi,
>> >
>> > I would like to propose Dr. Elephant as an Apache Incubator
>> > project. The proposal is available as a draft at
>> > https://wiki.apache.org/incubator/DrElephantProposal. I have also
>> > included the text of the proposal below.
>> >
>> > Any feedback from the community is much appreciated.
>> >
>> > Thanks.
>> >
>> > - Carl
>> >
>> >
>> > = ABSTRACT =
>> >
>> > Dr. Elephant is a performance monitoring and tuning service for
>> Apache
>> > Hadoop and Apache Spark jobs and workflows. While the system is
>> > primarily aimed at developers, we have discovered that it is also
>> > popular with cluster operators who use it to monitor the health of
>> > workloads running on their clusters.
>> >
>> > = PROPOSAL =
>> >
>> > Dr. Elephant was open sourced by LinkedIn in 2016 and is currently
>> > hosted on GitHub. We believe that being a part of the Apache Software
>> > Foundation will improve the diversity and help form a strong
>> community
>> > around the project.
>> >
>> > LinkedIn submits this proposal to donate the code base to the Apache
>> > Software Foundation. The code is already under Apache License 2.0.
>> > Both the source code and documentation are hosted on Github.
>> >
>> >  * Code: http://github.com/linkedin/dr-elephant
>> >  * Documentation: https://github.com/linkedin/dr-elephant/wiki
>> >
>> > = Background =
>> >
>> > Dr. Elephant is a service that helps users of Apache Hadoop and
>> Apache
>> > Spark understand, analyze, and improve the performance of jobs and
>> > workflows running on their clusters. It automatically gathers
>> metrics,
>> > performs analysis, and presents the results along with actionable
>> > advice. The goal of the project is to improve developer productivity
>> > and increase cluster efficiency by reducing the time and domain
>> > expertise required to diagnose and treat sick jobs. It analyzes
>> Hadoop
>> > and Spark jobs using a set of configurable, extensible, rule-based
>> > heuristics that provide insights on job performance, and then uses
>> > this information to provide recommendations about how to tune jobs to
>> > make them run more efficiently.
>> >
>> > Dr. Elephant was open sourced in 2016 after two years of
>> > successful production use at Linkedin. In the time since many new
>> > features have been added including support for the Oozie and Airflow
>> > workflow schedulers, improved metrics, and enhancements to the Spark
>> > history fetcher and Spark heuristics. It is also important to note
>> > that many of these contributions came from developers outside of
>> > LinkedIn. We have also been happy to see that many people have been
>> > able to benefit from running Dr. Elephant including companies like
>> > Airbnb, Foursquare, Hulu, and Pinterest.
>> >
>> > = RATIONALE =
>> >
>> > Dr. Elephant's entry to the ASF will be beneficial to both the
>> > Dr. Elephant and Apache communities. Dr. Elephant has greatly
>> > benefited from its open source roots. Its community and adoption has
>> > grown greatly as a result. More importantly, the feedback from the
>> > community, whether 

Re: [DISCUSS] Dr. Elephant Incubator Proposal

2018-03-06 Thread Kevin A. McGrail
I'm intrigued by the proposal and the product. I'm a 0.5+.

I'd love to know more about why LI put it on GitHub and what problems it's
having that are leading to a foundation.

--
Kevin A. McGrail
Asst. Treasurer & VP Fundraising, Apache Software Foundation
Chair Emeritus Apache SpamAssassin Project
https://www.linkedin.com/in/kmcgrail - 703.798.0171

On Tue, Mar 6, 2018 at 8:27 PM, Gangumalla, Uma 
wrote:

> I would +1 to have as a separate project instead of pushing under Hadoop.
> When a project can sustain by having potential to build community on its
> own and can run logically as independent module, I feel that’s good enough
> to start as separate project.
>
> I could not recall the discussions on removal of Vaidya package from
> Hadoop. If someone remembers, it would be great to know the reasons for
> removal of that package from Hadoop base. [ probably at the time of
> mavenization ? ]
>
> Regards,
> Uma
>
> On 3/6/18, 3:17 PM, "md...@cloudera.com on behalf of Mike Drob" <
> md...@cloudera.com on behalf of md...@apache.org> wrote:
>
> Why does Dr. Elephant make sense as a separate project instead of
> contributing to Hadoop directly?
>
> What is the relationship between Dr. Elephant and the (now seemingly
> defunct) Hadoop Vaidya?
>
> On Tue, Mar 6, 2018 at 5:08 PM, Carl Steinbach  wrote:
>
> > Hi,
> >
> > I would like to propose Dr. Elephant as an Apache Incubator
> > project. The proposal is available as a draft at
> > https://wiki.apache.org/incubator/DrElephantProposal. I have also
> > included the text of the proposal below.
> >
> > Any feedback from the community is much appreciated.
> >
> > Thanks.
> >
> > - Carl
> >
> >
> > = ABSTRACT =
> >
> > Dr. Elephant is a performance monitoring and tuning service for
> Apache
> > Hadoop and Apache Spark jobs and workflows. While the system is
> > primarily aimed at developers, we have discovered that it is also
> > popular with cluster operators who use it to monitor the health of
> > workloads running on their clusters.
> >
> > = PROPOSAL =
> >
> > Dr. Elephant was open sourced by LinkedIn in 2016 and is currently
> > hosted on GitHub. We believe that being a part of the Apache Software
> > Foundation will improve the diversity and help form a strong
> community
> > around the project.
> >
> > LinkedIn submits this proposal to donate the code base to the Apache
> > Software Foundation. The code is already under Apache License 2.0.
> > Both the source code and documentation are hosted on Github.
> >
> >  * Code: http://github.com/linkedin/dr-elephant
> >  * Documentation: https://github.com/linkedin/dr-elephant/wiki
> >
> > = Background =
> >
> > Dr. Elephant is a service that helps users of Apache Hadoop and
> Apache
> > Spark understand, analyze, and improve the performance of jobs and
> > workflows running on their clusters. It automatically gathers
> metrics,
> > performs analysis, and presents the results along with actionable
> > advice. The goal of the project is to improve developer productivity
> > and increase cluster efficiency by reducing the time and domain
> > expertise required to diagnose and treat sick jobs. It analyzes
> Hadoop
> > and Spark jobs using a set of configurable, extensible, rule-based
> > heuristics that provide insights on job performance, and then uses
> > this information to provide recommendations about how to tune jobs to
> > make them run more efficiently.
> >
> > Dr. Elephant was open sourced in 2016 after two years of
> > successful production use at Linkedin. In the time since many new
> > features have been added including support for the Oozie and Airflow
> > workflow schedulers, improved metrics, and enhancements to the Spark
> > history fetcher and Spark heuristics. It is also important to note
> > that many of these contributions came from developers outside of
> > LinkedIn. We have also been happy to see that many people have been
> > able to benefit from running Dr. Elephant including companies like
> > Airbnb, Foursquare, Hulu, and Pinterest.
> >
> > = RATIONALE =
> >
> > Dr. Elephant's entry to the ASF will be beneficial to both the
> > Dr. Elephant and Apache communities. Dr. Elephant has greatly
> > benefited from its open source roots. Its community and adoption has
> > grown greatly as a result. More importantly, the feedback from the
> > community, whether through interactions at meetups or through the
> > mailing list, have allowed for a rich exchange of ideas. We believe a
> > partnership with the Apache Foundation is the logical next step. The
> > Dr. Elephant community will greatly benefit from the established
> > development and consensus processes 

Re: IPMC join request

2018-03-06 Thread Carl Steinbach
Obviously I'm +1 on this!

On Mar 6, 2018 6:39 PM, "Felix Cheung"  wrote:

> Hi all,
>
> I'd like to join IPMC, initially to help mentor Dr Elephant as incubator
> project but also looking forward to help mentor other Apache incubator
> projects.
>
> I am PPMC/PMC of Apache Zeppelin (since incubation to TLP) and PMC of
> Apache Spark, Release Manager for releases.
>
> Thanks!
> Felix
>


Re: Wiki conflict for March report

2018-03-06 Thread Suneel Marthi
...and I am the culprit... apologies Mark.

On Tue, Mar 6, 2018 at 8:55 PM, John D. Ament  wrote:

> Thanks Mark.  I went through the changes from when the conflict was
> introduced to now, I don't see anything missing.  I also privately pinged
> the culprit.
>
> On Tue, Mar 6, 2018 at 7:34 PM Mark Thomas  wrote:
>
> > All,
> >
> > I have just spent rather longer than I would have liked cleaning up a
> > conflicted edit in this month's report.
> >
> > I have two requests.
> >
> > 1. If you have submitted your report, please check I didn't accidentally
> > remove content. I tried hard not to but the report was a mess and
> > something might have slipped through the net.
> >
> > 2. If you are editing the report please, please, please take care not to
> > create edit conflicts and if you do manage to create please clean up
> > your own mess rather than leaving it for others to do it for you.
> >
> > Mark
> >
>


[VOTE]: Apache HAWQ 2.3.0.0-incubating Release

2018-03-06 Thread Yi JIN
Hi IPMC members,

The PPMC vote for the Apache HAWQ 2.3.0.0-incubating release has passed.
So I request IPMC now to vote on this release candidate. Thank you!

The release page is here:
https://cwiki.apache.org/confluence/display/HAWQ/Apache+HAWQ+2.3.0.0-incubating+Release

The PPMC vote thread is located here:
https://lists.apache.org/thread.html/fa5b41cd7461bd729146e10d8f7a54156c818f93e5a1160c42e76c79@%3Cdev.hawq.apache.org%3E

The artifacts can be downloaded here:
https://dist.apache.org/repos/dist/dev/incubator/hawq/2.3.0.0-incubating.RC2/
The artifacts have been signed with Key : CE60F90D1333092A

All JIRAs completed for this release are tagged with 'FixVersion
=2.3.0.0-incubating'
https://issues.apache.org/jira/secure/ReleaseNote.jspa?version=12340262&styleName=Html&projectId=12318826

Please vote accordingly:
[ ] +1, accept as the official Apache HAWQ 2.3.0.0-incubating release
[ ] -1, do not accept as the official Apache HAWQ 2.3.0.0-incubating release
because...

The vote will run for at least 72 hours.

Best regards,
Yi Jin (yjin)


IPMC join request

2018-03-06 Thread Felix Cheung
Hi all,

I'd like to join IPMC, initially to help mentor Dr Elephant as incubator
project but also looking forward to help mentor other Apache incubator
projects.

I am PPMC/PMC of Apache Zeppelin (since incubation to TLP) and PMC of
Apache Spark, Release Manager for releases.

Thanks!
Felix


Re: Wiki conflict for March report

2018-03-06 Thread John D. Ament
Thanks Mark.  I went through the changes from when the conflict was
introduced to now, I don't see anything missing.  I also privately pinged
the culprit.

On Tue, Mar 6, 2018 at 7:34 PM Mark Thomas  wrote:

> All,
>
> I have just spent rather longer than I would have liked cleaning up a
> conflicted edit in this month's report.
>
> I have two requests.
>
> 1. If you have submitted your report, please check I didn't accidentally
> remove content. I tried hard not to but the report was a mess and
> something might have slipped through the net.
>
> 2. If you are editing the report please, please, please take care not to
> create edit conflicts and if you do manage to create please clean up
> your own mess rather than leaving it for others to do it for you.
>
> Mark
>


Re: Wiki conflict for March report

2018-03-06 Thread Ted Dunning
Mark

Thanks for doing dirty work.

On Mar 6, 2018 4:34 PM, "Mark Thomas"  wrote:

> All,
>
> I have just spent rather longer than I would have liked cleaning up a
> conflicted edit in this month's report.
>
> I have two requests.
>
> 1. If you have submitted your report, please check I didn't accidentally
> remove content. I tried hard not to but the report was a mess and
> something might have slipped through the net.
>
> 2. If you are editing the report please, please, please take care not to
> create edit conflicts and if you do manage to create please clean up
> your own mess rather than leaving it for others to do it for you.
>
> Mark
>


Re: [DISCUSS] Dr. Elephant Incubator Proposal

2018-03-06 Thread Gangumalla, Uma
I would +1 having this as a separate project instead of pushing it under Hadoop.
When a project can sustain itself by having the potential to build a community
on its own and can run logically as an independent module, I feel that’s good
enough to start as a separate project.

I cannot recall the discussions on the removal of the Vaidya package from Hadoop.
If someone remembers, it would be great to know the reasons for the removal of
that package from the Hadoop base. [ probably at the time of mavenization ? ]

Regards,
Uma

On 3/6/18, 3:17 PM, "md...@cloudera.com on behalf of Mike Drob" wrote:

Why does Dr. Elephant make sense as a separate project instead of
contributing to Hadoop directly?

What is the relationship between Dr. Elephant and the (now seemingly
defunct) Hadoop Vaidya?

On Tue, Mar 6, 2018 at 5:08 PM, Carl Steinbach  wrote:

> Hi,
>
> I would like to propose Dr. Elephant as an Apache Incubator
> project. The proposal is available as a draft at
> https://wiki.apache.org/incubator/DrElephantProposal. I have also
> included the text of the proposal below.
>
> Any feedback from the community is much appreciated.
>
> Thanks.
>
> - Carl
>
>
> = ABSTRACT =
>
> Dr. Elephant is a performance monitoring and tuning service for Apache
> Hadoop and Apache Spark jobs and workflows. While the system is
> primarily aimed at developers, we have discovered that it is also
> popular with cluster operators who use it to monitor the health of
> workloads running on their clusters.
>
> = PROPOSAL =
>
> Dr. Elephant was open sourced by LinkedIn in 2016 and is currently
> hosted on GitHub. We believe that being a part of the Apache Software
> Foundation will improve the diversity and help form a strong community
> around the project.
>
> LinkedIn submits this proposal to donate the code base to the Apache
> Software Foundation. The code is already under Apache License 2.0.
> Both the source code and documentation are hosted on Github.
>
>  * Code: http://github.com/linkedin/dr-elephant
>  * Documentation: https://github.com/linkedin/dr-elephant/wiki
>
> = Background =
>
> Dr. Elephant is a service that helps users of Apache Hadoop and Apache
> Spark understand, analyze, and improve the performance of jobs and
> workflows running on their clusters. It automatically gathers metrics,
> performs analysis, and presents the results along with actionable
> advice. The goal of the project is to improve developer productivity
> and increase cluster efficiency by reducing the time and domain
> expertise required to diagnose and treat sick jobs. It analyzes Hadoop
> and Spark jobs using a set of configurable, extensible, rule-based
> heuristics that provide insights on job performance, and then uses
> this information to provide recommendations about how to tune jobs to
> make them run more efficiently.
>
> Dr. Elephant was open sourced in 2016 after two years of
> successful production use at Linkedin. In the time since many new
> features have been added including support for the Oozie and Airflow
> workflow schedulers, improved metrics, and enhancements to the Spark
> history fetcher and Spark heuristics. It is also important to note
> that many of these contributions came from developers outside of
> LinkedIn. We have also been happy to see that many people have been
> able to benefit from running Dr. Elephant including companies like
> Airbnb, Foursquare, Hulu, and Pinterest.
>
> = RATIONALE =
>
> Dr. Elephant's entry to the ASF will be beneficial to both the
> Dr. Elephant and Apache communities. Dr. Elephant has greatly
> benefited from its open source roots. Its community and adoption has
> grown greatly as a result. More importantly, the feedback from the
> community, whether through interactions at meetups or through the
> mailing list, have allowed for a rich exchange of ideas. We believe a
> partnership with the Apache Foundation is the logical next step. The
> Dr. Elephant community will greatly benefit from the established
> development and consensus processes that have worked well for other
> projects. The Apache process has served many other open source
> projects well and we believe that the Dr. Elephant community will
> greatly benefit from these practices as well.
>
> = CURRENT STATUS =
>
> Dr. Elephant is currently open sourced under the Apache License
> Version 2.0 and is available at github.com/linkedin/dr-elephant. All
> of the development is done using GitHub Pull Requests.
>
> We are aware of at least 10 organizations that are running
> Dr. Elephant, and many of these organizations have also contributed
 

Wiki conflict for March report

2018-03-06 Thread Mark Thomas
All,

I have just spent rather longer than I would have liked cleaning up a
conflicted edit in this month's report.

I have two requests.

1. If you have submitted your report, please check I didn't accidentally
remove content. I tried hard not to but the report was a mess and
something might have slipped through the net.

2. If you are editing the report please, please, please take care not to
create edit conflicts and if you do manage to create please clean up
your own mess rather than leaving it for others to do it for you.

Mark




Re: [DISCUSS] Dr. Elephant Incubator Proposal

2018-03-06 Thread Mike Drob
Thanks for explaining your thoughts, Carl! Those answers all sound great to
me.

Best of luck!


On Tue, Mar 6, 2018, 5:58 PM Carl Steinbach  wrote:

> Hi Mike,
>
> > Why does Dr. Elephant make sense as a separate project instead of
> > contributing to Hadoop directly?
> >
>
> Here are a couple reasons why I think Dr. Elephant is more likely to
> succeed as a separate project:
>
> * Dr. Elephant supports Hadoop *and* Spark, and may support other
>   execution layers in the future. If we make Dr. Elephant a part of
>   Hadoop I expect that it will discourage contributions from people
>   who are interested mainly in Spark support, and vice versa.
>
> * If Dr. Elephant is added to Hadoop it will be necessary for the
>   Hadoop project to declare a dependency on Spark. I doubt this change
>   will get approved.
>
> * We don't want to tie Dr. Elephant to a specific version of Hadoop or
>   Spark, or tie the Dr. Elephant release cycle to the Hadoop or Spark
>   release cycles.
>
> * None of the current Dr. Elephant committers are Hadoop committers,
>   and I doubt that the Hadoop PMC is going to give them a commit bit
>   just to work on Dr. Elephant. As a result the existing committers
>   would be effectively forfeiting their right to continue maintaining
>   their own project. I think this is one of the reasons why many
>   Hadoop contrib projects are poorly maintained.
>
>
>
> > What is the relationship between Dr. Elephant and the (now seemingly
> > defunct) Hadoop Vaidya?
> >
>
> Vaidya was a command line tool for tuning Hadoop jobs. Dr. Elephant is
> an always-on service for tuning Hadoop and Spark jobs. We were unaware
> of Vaidya when we started working on Dr. Elephant.
>
> - Carl
>


Re: Write access to incubator wiki

2018-03-06 Thread Roman Shaposhnik
Done!

On Tue, Mar 6, 2018 at 3:58 PM, Mark Thomas  wrote:
> Hi,
>
> Please grant 'markt' write access to the incubator wiki.
>
> Thanks,
>
> Mark
>




Write access to incubator wiki

2018-03-06 Thread Mark Thomas
Hi,

Please grant 'markt' write access to the incubator wiki.

Thanks,

Mark




Re: [DISCUSS] Dr. Elephant Incubator Proposal

2018-03-06 Thread Carl Steinbach
Hi Mike,

> Why does Dr. Elephant make sense as a separate project instead of
> contributing to Hadoop directly?
>

Here are a couple of reasons why I think Dr. Elephant is more likely to
succeed as a separate project:

* Dr. Elephant supports Hadoop *and* Spark, and may support other
  execution layers in the future. If we make Dr. Elephant a part of
  Hadoop I expect that it will discourage contributions from people
  who are interested mainly in Spark support, and vice versa.

* If Dr. Elephant is added to Hadoop it will be necessary for the
  Hadoop project to declare a dependency on Spark. I doubt this change
  will get approved.

* We don't want to tie Dr. Elephant to a specific version of Hadoop or
  Spark, or tie the Dr. Elephant release cycle to the Hadoop or Spark
  release cycles.

* None of the current Dr. Elephant committers are Hadoop committers,
  and I doubt that the Hadoop PMC is going to give them a commit bit
  just to work on Dr. Elephant. As a result the existing committers
  would be effectively forfeiting their right to continue maintaining
  their own project. I think this is one of the reasons why many
  Hadoop contrib projects are poorly maintained.



> What is the relationship between Dr. Elephant and the (now seemingly
> defunct) Hadoop Vaidya?
>

Vaidya was a command line tool for tuning Hadoop jobs. Dr. Elephant is
an always-on service for tuning Hadoop and Spark jobs. We were unaware
of Vaidya when we started working on Dr. Elephant.

- Carl


Re: [DISCUSS] Dr. Elephant Incubator Proposal

2018-03-06 Thread Roman Shaposhnik
On Tue, Mar 6, 2018 at 3:17 PM, Mike Drob wrote:
> Why does Dr. Elephant make sense as a separate project instead of
> contributing to Hadoop directly?
>
> What is the relationship between Dr. Elephant and the (now seemingly
> defunct) Hadoop Vaidya?

A different way to ask the same question would be: how closely is it tied
to YARN as a scheduler? Does it support other schedulers (as in running
Spark on Mesos for example)?

Thanks,
Roman.




Re: [DISCUSS] Dr. Elephant Incubator Proposal

2018-03-06 Thread Mike Drob
Why does Dr. Elephant make sense as a separate project instead of
contributing to Hadoop directly?

What is the relationship between Dr. Elephant and the (now seemingly
defunct) Hadoop Vaidya?

On Tue, Mar 6, 2018 at 5:08 PM, Carl Steinbach wrote:

> Hi,
>
> I would like to propose Dr. Elephant as an Apache Incubator
> project. The proposal is available as a draft at
> https://wiki.apache.org/incubator/DrElephantProposal. I have also
> included the text of the proposal below.
>
> Any feedback from the community is much appreciated.
>
> Thanks.
>
> - Carl
>
>
> = ABSTRACT =
>
> Dr. Elephant is a performance monitoring and tuning service for Apache
> Hadoop and Apache Spark jobs and workflows. While the system is
> primarily aimed at developers, we have discovered that it is also
> popular with cluster operators who use it to monitor the health of
> workloads running on their clusters.
>
> = PROPOSAL =
>
> Dr. Elephant was open sourced by LinkedIn in 2016 and is currently
> hosted on GitHub. We believe that being a part of the Apache Software
> Foundation will improve the diversity and help form a strong community
> around the project.
>
> LinkedIn submits this proposal to donate the code base to the Apache
> Software Foundation. The code is already under Apache License 2.0.
> Both the source code and documentation are hosted on GitHub.
>
>  * Code: http://github.com/linkedin/dr-elephant
>  * Documentation: https://github.com/linkedin/dr-elephant/wiki
>
> = BACKGROUND =
>
> Dr. Elephant is a service that helps users of Apache Hadoop and Apache
> Spark understand, analyze, and improve the performance of jobs and
> workflows running on their clusters. It automatically gathers metrics,
> performs analysis, and presents the results along with actionable
> advice. The goal of the project is to improve developer productivity
> and increase cluster efficiency by reducing the time and domain
> expertise required to diagnose and treat sick jobs. It analyzes Hadoop
> and Spark jobs using a set of configurable, extensible, rule-based
> heuristics that provide insights on job performance, and then uses
> this information to provide recommendations about how to tune jobs to
> make them run more efficiently.
>
> Dr. Elephant was open sourced in 2016 after two years of
> successful production use at LinkedIn. In the time since, many new
> features have been added including support for the Oozie and Airflow
> workflow schedulers, improved metrics, and enhancements to the Spark
> history fetcher and Spark heuristics. It is also important to note
> that many of these contributions came from developers outside of
> LinkedIn. We have also been happy to see that many people have been
> able to benefit from running Dr. Elephant including companies like
> Airbnb, Foursquare, Hulu, and Pinterest.
>
> = RATIONALE =
>
> Dr. Elephant's entry to the ASF will be beneficial to both the
> Dr. Elephant and Apache communities. Dr. Elephant has greatly
> benefited from its open source roots. Its community and adoption have
> grown greatly as a result. More importantly, the feedback from the
> community, whether through interactions at meetups or through the
> mailing list, has allowed for a rich exchange of ideas. We believe a
> partnership with the Apache Foundation is the logical next step. The
> Dr. Elephant community will greatly benefit from the established
> development and consensus processes that have worked well for other
> projects. The Apache process has served many other open source
> projects well and we believe that the Dr. Elephant community will
> greatly benefit from these practices as well.
>
> = CURRENT STATUS =
>
> Dr. Elephant is currently open sourced under the Apache License
> Version 2.0 and is available at github.com/linkedin/dr-elephant. All
> of the development is done using GitHub Pull Requests.
>
> We are aware of at least 10 organizations that are running
> Dr. Elephant, and many of these organizations have also contributed
> code. Dr. Elephant has also been integrated into commercial products
> such as Pepperdata's Application Profiler.
>
> = INITIAL GOALS =
>
> Our initial goals are as follows:
>
>  * Migrate the existing codebase to Apache
>  * Study and integrate with the Apache development process
>  * Ensure all dependencies are compliant with Apache License version 2.0
>  * Incremental development and releases per Apache guidelines
>  * Diversify the set of core developers and committers
>
> = MERITOCRACY =
>
> Following the Apache meritocracy model, we intend to build an open and
> diverse community around Dr. Elephant. We will encourage the community to
> contribute to discussions and the codebase.
>
> = COMMUNITY =
>
> The need for a simple and understandable performance monitoring and
> tuning service for Hadoop and Spark is tremendous. Dr. Elephant is
> currently being used by at least 10 organizations worldwide (some
> examples are listed here). We hope to extend the contributor 

[DISCUSS] Dr. Elephant Incubator Proposal

2018-03-06 Thread Carl Steinbach
Hi,

I would like to propose Dr. Elephant as an Apache Incubator
project. The proposal is available as a draft at
https://wiki.apache.org/incubator/DrElephantProposal. I have also
included the text of the proposal below.

Any feedback from the community is much appreciated.

Thanks.

- Carl


= ABSTRACT =

Dr. Elephant is a performance monitoring and tuning service for Apache
Hadoop and Apache Spark jobs and workflows. While the system is
primarily aimed at developers, we have discovered that it is also
popular with cluster operators who use it to monitor the health of
workloads running on their clusters.

= PROPOSAL =

Dr. Elephant was open sourced by LinkedIn in 2016 and is currently
hosted on GitHub. We believe that being a part of the Apache Software
Foundation will improve the diversity and help form a strong community
around the project.

LinkedIn submits this proposal to donate the code base to the Apache
Software Foundation. The code is already under Apache License 2.0.
Both the source code and documentation are hosted on GitHub.

 * Code: http://github.com/linkedin/dr-elephant
 * Documentation: https://github.com/linkedin/dr-elephant/wiki

= BACKGROUND =

Dr. Elephant is a service that helps users of Apache Hadoop and Apache
Spark understand, analyze, and improve the performance of jobs and
workflows running on their clusters. It automatically gathers metrics,
performs analysis, and presents the results along with actionable
advice. The goal of the project is to improve developer productivity
and increase cluster efficiency by reducing the time and domain
expertise required to diagnose and treat sick jobs. It analyzes Hadoop
and Spark jobs using a set of configurable, extensible, rule-based
heuristics that provide insights on job performance, and then uses
this information to provide recommendations about how to tune jobs to
make them run more efficiently.

Dr. Elephant was open sourced in 2016 after two years of
successful production use at LinkedIn. In the time since, many new
features have been added including support for the Oozie and Airflow
workflow schedulers, improved metrics, and enhancements to the Spark
history fetcher and Spark heuristics. It is also important to note
that many of these contributions came from developers outside of
LinkedIn. We have also been happy to see that many people have been
able to benefit from running Dr. Elephant including companies like
Airbnb, Foursquare, Hulu, and Pinterest.

= RATIONALE =

Dr. Elephant's entry to the ASF will be beneficial to both the
Dr. Elephant and Apache communities. Dr. Elephant has greatly
benefited from its open source roots. Its community and adoption have
grown greatly as a result. More importantly, the feedback from the
community, whether through interactions at meetups or through the
mailing list, has allowed for a rich exchange of ideas. We believe a
partnership with the Apache Foundation is the logical next step. The
Dr. Elephant community will greatly benefit from the established
development and consensus processes that have worked well for other
projects. The Apache process has served many other open source
projects well and we believe that the Dr. Elephant community will
greatly benefit from these practices as well.

= CURRENT STATUS =

Dr. Elephant is currently open sourced under the Apache License
Version 2.0 and is available at github.com/linkedin/dr-elephant. All
of the development is done using GitHub Pull Requests.

We are aware of at least 10 organizations that are running
Dr. Elephant, and many of these organizations have also contributed
code. Dr. Elephant has also been integrated into commercial products
such as Pepperdata's Application Profiler.

= INITIAL GOALS =

Our initial goals are as follows:

 * Migrate the existing codebase to Apache
 * Study and integrate with the Apache development process
 * Ensure all dependencies are compliant with Apache License version 2.0
 * Incremental development and releases per Apache guidelines
 * Diversify the set of core developers and committers

= MERITOCRACY =

Following the Apache meritocracy model, we intend to build an open and
diverse community around Dr. Elephant. We will encourage the community to
contribute to discussions and the codebase.

= COMMUNITY =

The need for a simple and understandable performance monitoring and
tuning service for Hadoop and Spark is tremendous. Dr. Elephant is
currently being used by at least 10 organizations worldwide (some
examples are listed here). We hope to extend the contributor base
significantly by bringing Dr. Elephant into Apache.

= CORE DEVELOPERS =

Dr. Elephant was started by engineers at LinkedIn. Many other
individuals and organizations have contributed to the project, and
this diversity is reflected in the list of initial committers.

= ALIGNMENT =

Apache is the most natural home for Dr. Elephant because of its close
relationship to Apache Hadoop and Apache Spark, and its integration
with Apache Oozie and Apache 

Re: Apache Traffic Control graduation community vote

2018-03-06 Thread Dave Neuman
Hi All,
I wanted to let you know that this vote passed!  Please see the result
thread for more details [1].  We are excited to take the next steps towards
becoming a top level project!

Thanks,
Dave

[1]
https://lists.apache.org/thread.html/074de5dcc6d01346e26e3a06caabf4bee01303b6a522f11c62ca83fe@%3Cdev.trafficcontrol.apache.org%3E

On Thu, Mar 1, 2018 at 10:10 AM, Dave Neuman wrote:

> Hello IPMC,
> As mentioned in the Guide to Successful Graduation [1] it is recommended
> that we notify the Incubator when a graduation vote is initiated within
> our project.  This email is to let you know that Traffic Control has
> decided to call a vote for graduation [2].  If this vote is successful we
> will proceed with nominating a chair and drafting a resolution.  Please let
> me, anyone on the PMC, or one of our mentors know if you have any questions.
>
> Thanks,
> Dave
>
>
>
> [1] https://incubator.apache.org/guides/graduation.html#community_graduation_vote
> [2] https://lists.apache.org/thread.html/fb1fae0785feb6568cef6deb6fa20723eba54ed63a445462d44564d3@%3Cdev.trafficcontrol.apache.org%3E
>


[RESULT][VOTE] Release Apache Daffodil (Incubating) 2.1.0-rc2

2018-03-06 Thread Steve Lawrence
The Apache Daffodil (Incubating) 2.1.0-rc2 vote failed with one +1
binding vote and one -1 binding vote.

+1 (binding):
John D. Ament (carried over from the Daffodil dev vote)

-1 (binding) and reasons:
Justin Mclean - LICENSE is missing full text of the licenses, NOTICE
contains unneeded information

Vote thread:
https://lists.apache.org/thread.html/1bc33636ec76ae80d390a0ac988d8e96f37190a3d1bc02f9668f78c1@%3Cgeneral.incubator.apache.org%3E

We will prepare another release candidate to resolve the issues raised
during this vote.

Thanks,
- Steve


On 03/02/2018 02:50 PM, Steve Lawrence wrote:
> The Apache Daffodil community has voted and approved the proposed
> release of Apache Daffodil (Incubating) 2.1.0-rc2.
> 
> We now kindly request the Incubator PMC members review and vote on this
> incubator release.
> 
> Daffodil is an open source implementation of the DFDL specification that
> uses DFDL schemas to parse fixed format data into an infoset, which is
> most commonly represented as either XML or JSON. This allows the use of
> well-established XML or JSON technologies and libraries to consume,
> inspect, and manipulate fixed format data in existing solutions.
> Daffodil is also capable of the reverse by serializing or "unparsing" an
> XML or JSON infoset back to the original data format.
> 
> Vote thread:
> https://lists.apache.org/thread.html/4b71db31a6a420098a18139a046c5493d5685137251b4727736a9f18@%3Cdev.daffodil.apache.org%3E
> 
> Specific issues found during the rc2 VOTE that are planned to be
> resolved in the next release include:
> 
>   DAFFODIL-1906: Updates to LICENSE/NOTICE files
> https://issues.apache.org/jira/browse/DAFFODIL-1906
> 
>   LEGAL-369: Guidance on Open Grid Forum (OGF) document inclusion
> https://issues.apache.org/jira/browse/LEGAL-369
> 
> Result thread:
> https://lists.apache.org/thread.html/f883421a96deffee80e59bd2fbbf07062dfe0ee26e4c4c4cfa194ba5@%3Cdev.daffodil.apache.org%3E
> 
> 
> All distribution packages, including signatures, digests, etc. can be
> found at:
> 
> https://dist.apache.org/repos/dist/dev/incubator/daffodil/2.1.0-rc2/
> 
> Staging artifacts can be found at:
> 
> https://repository.apache.org/content/repositories/orgapachedaffodil-1001/
> 
> This release has been signed with PGP key 033AE661, corresponding to
> slawre...@apache.org, which is included in the repository's KEYS file.
> This key can be found on keyservers, such as:
> 
> http://pgp.mit.edu/pks/lookup?op=get&search=0x033AE661
> 
> It is also listed here:
> 
> https://people.apache.org/keys/committer/slawrence.asc
> 
> The release candidate has been tagged in git with v2.1.0-rc2.
> 
> For reference, here is a list of all closed JIRAs tagged with 2.1.0:
> 
> https://issues.apache.org/jira/browse/DAFFODIL-1897?jql=project%20%3D%20DAFFODIL%20AND%20fixVersion%20%3D%202.1.0%20ORDER%20BY%20priority%20DESC%2C%20updated%20DESC
> 
> For a summary of the changes in this release, see:
> 
> https://daffodil.apache.org/releases/2.1.0/
> 
> Please review and vote. The vote will be open for at least 72 hours.
> 
> [ ] +1 approve
> [ ] +0 no opinion
> [ ] -1 disapprove (and reason why)
> 
> Thanks,
> - Steve
> 

