Re: IMPORTANT: automatic changelog creation

2015-07-02 Thread Varun Vasudev
+1

Many thanks to Allen and Andrew for driving this.

-Varun



On 7/3/15, 10:25 AM, "Vinayakumar B"  wrote:

>+1 for the auto generation.
>
>bq. Besides, after a release R1 is out, someone may (accidentally or
>intentionally) modify the JIRA summary.
>Is there any way we can restrict someone from editing the
>issue in JIRA once it's marked as "closed" after release?
>
>Regards,
>Vinay
>
>On Fri, Jul 3, 2015 at 8:32 AM, Karthik Kambatla  wrote:
>
>> Huge +1
>>
>> On Thursday, July 2, 2015, Chris Nauroth  wrote:
>>
>> > +1
>> >
>> > Thank you to Allen for the script, and thank you to Andrew for
>> > volunteering to drive the conversion.
>> >
>> > --Chris Nauroth
>> >
>> >
>> >
>> >
>> > On 7/2/15, 2:01 PM, "Andrew Wang" wrote:
>> >
>> > >Hi all,
>> > >
>> > >I want to revive the discussion on this thread, since the overhead of
>> > >CHANGES.txt came up again in the context of backporting fixes for
>> > >maintenance releases.
>> > >
>> > >Allen's automatic generation script (HADOOP-11731) went into trunk but
>> not
>> > >branch-2, so we're still maintaining CHANGES.txt everywhere. What do
>> > >people
>> > >think about backporting this to branch-2 and then removing CHANGES.txt
>> > >from
>> > >trunk/branch-2 (HADOOP-11792)? Based on discussion on this thread and in
>> > >HADOOP-11731, we seem to agree that CHANGES.txt is an unreliable source
>> of
>> > >information, and JIRA is at least as reliable and probably much more so.
>> > >Thus I don't see any downsides to backporting it.
>> > >
>> > >Would like to hear everyone's thoughts on this, I'm willing to drive the
>> > >effort.
>> > >
>> > >Thanks,
>> > >Andrew
>> > >
>> > >On Thu, Apr 2, 2015 at 2:00 PM, Tsz Wo Sze 
>> > >wrote:
>> > >
>> > >> Generating change log from JIRA is a good idea.  It is based on an
>> > >>assumption
>> > >> that each JIRA has an accurate summary (a.k.a. JIRA title) to reflect
>> > >>the
>> > >> committed change. Unfortunately, the assumption is invalid for many
>> > >>cases
>> > >> since we never enforce that the JIRA summary must be the same as the
>> > >>change
>> > >> log.  We may compare the current CHANGES.txt with the generated change
>> > >> log.  I bet the diff is long.
>> > >> Besides, after a release R1 is out, someone may (accidentally or
>> > >> intentionally) modify the JIRA summary.  Then, the entry for the same
>> > >>item
>> > >> in a later release R2 could be different from the one in R1.
>> > >> I agree that manually editing CHANGES.txt is not a perfect solution.
>> > >> However, it has worked well in the past for many releases.  I suggest we
>> keep
>> > >> the current dev workflow.  Try using the new script provided by
>> > >> HADOOP-11731 to generate the next release.  If everything works well,
>> we
>> > >> shall remove CHANGES.txt and revise the dev workflow.  What do you
>> > >>think?
>> > >> Regards, Tsz-Wo
>> > >>
>> > >>
>> > >>  On Thursday, April 2, 2015 12:57 PM, Allen Wittenauer <
>> > >> a...@altiscale.com > wrote:
>> > >>
>> > >>
>> > >>
>> > >>
>> > >>
>> > >> On Apr 2, 2015, at 12:40 PM, Vinod Kumar Vavilapalli <
>> > >> vino...@hortonworks.com > wrote:
>> > >>
>> > >> >
>> > >> > We'd then be doing two commits for every patch. Let's simply not remove
>> > >> CHANGES.txt from trunk, keep the existing dev workflow, but doc the
>> > >>release
>> > >> process to remove CHANGES.txt in trunk at the time of a release going
>> > >>out
>> > >> of trunk.
>> > >>
>> > >>
>> > >>
>> > >> Might as well copy branch-2’s changes.txt into trunk then. (or 2.7’s.
>> > >> Last I looked, people updated branch-2 and not 2.7’s or vice versa for
>> > >>some
>> > >> patches that went into both branches.)  So that folks who are
>> > >>committing to
>> > >> both branches and want to cherry pick all changes can.
>> > >>
>> > >> I mean, trunk’s is very very very wrong. Right now. Today. Borderline
>> > >> useless. See HADOOP-11718 (which I will now close out as won’t fix)…
>> and
>> > >> that jira is only what is miscategorized, not what is missing.
>> > >>
>> > >>
>> > >>
>> > >>
>> >
>> >
>>
>> --
>> Mobile
>>



[jira] [Created] (YARN-3883) YarnClient.getApplicationReport() doesn't not give diagnostics for the FINISHED state applications some times

2015-07-02 Thread Devaraj K (JIRA)
Devaraj K created YARN-3883:
---

 Summary: YarnClient.getApplicationReport() doesn't not give 
diagnostics for the FINISHED state applications some times 
 Key: YARN-3883
 URL: https://issues.apache.org/jira/browse/YARN-3883
 Project: Hadoop YARN
  Issue Type: Bug
  Components: resourcemanager
Affects Versions: 2.7.0
Reporter: Devaraj K


YarnClient.getApplicationReport() sometimes does not return diagnostics for 
applications in the FINISHED state.

Below is a report from YarnClient.getApplicationReport(). It does not show 
diagnostics for an application whose FinalStatus is FAILED and whose 
YarnApplicationState is FINISHED.
{code:xml}
15/07/03 15:53:27 INFO yarn.Client:
 client token: N/A
 diagnostics: N/A
 ApplicationMaster host: XX.XXX.XX.XX
 ApplicationMaster RPC port: 0
 queue: default
 start time: 1435918986890
 final status: FAILED
 tracking URL: http://stobdtserver2:8088/proxy/application_1435848120635_0015/
 user: root
{code}


However, the diagnostics information is visible in the RM Web UI for the same 
application.
{code:xml}
YarnApplicationState:         FINISHED
Queue:                        default
FinalStatus Reported by AM:   FAILED
Started:                      Fri Jul 03 15:53:06 +0530 2015
Elapsed:                      20sec
Tracking URL:                 History
Log Aggregation Status        DISABLED
Diagnostics:                  User class threw exception: java.lang.NumberFormatException: For input string: "xx"
{code}
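
The mismatch described above can be checked mechanically. The following is a minimal sketch in Python, assuming the client report is available as plain text in the "key: value" layout shown in the first listing. The names parse_report, missing_diagnostics and SAMPLE are hypothetical and are not part of any YARN API.

```python
import re

# Matches lines such as " final status: FAILED". Keys must start with a
# letter, so the timestamped log header line is skipped.
FIELD = re.compile(r"^\s*([A-Za-z][A-Za-z ]*?):\s*(.*)$")

# Hypothetical sample, abbreviated from the report quoted in this issue.
SAMPLE = """15/07/03 15:53:27 INFO yarn.Client:
 client token: N/A
 diagnostics: N/A
 queue: default
 final status: FAILED
 user: root
"""


def parse_report(text):
    """Collect 'key: value' lines into a dict with lower-cased keys."""
    fields = {}
    for line in text.splitlines():
        match = FIELD.match(line)
        if match:
            fields[match.group(1).strip().lower()] = match.group(2).strip()
    return fields


def missing_diagnostics(fields):
    """Return True when the app failed but the report has no diagnostics."""
    failed = fields.get("final status") == "FAILED"
    diagnostics = fields.get("diagnostics", "")
    return failed and diagnostics in ("", "N/A")
```

Calling missing_diagnostics(parse_report(SAMPLE)) flags the report quoted above, because the final status is FAILED while the diagnostics field is N/A.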



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Re: IMPORTANT: automatic changelog creation

2015-07-02 Thread Vinayakumar B
+1 for the auto generation.

bq. Besides, after a release R1 is out, someone may (accidentally or
intentionally) modify the JIRA summary.
Is there any way we can restrict someone from editing the
issue in JIRA once it's marked as "closed" after release?

Regards,
Vinay

On Fri, Jul 3, 2015 at 8:32 AM, Karthik Kambatla  wrote:

> Huge +1
>
> On Thursday, July 2, 2015, Chris Nauroth  wrote:
>
> > +1
> >
> > Thank you to Allen for the script, and thank you to Andrew for
> > volunteering to drive the conversion.
> >
> > --Chris Nauroth
> >
> >
> >
> >
> > On 7/2/15, 2:01 PM, "Andrew Wang" wrote:
> >
> > >Hi all,
> > >
> > >I want to revive the discussion on this thread, since the overhead of
> > >CHANGES.txt came up again in the context of backporting fixes for
> > >maintenance releases.
> > >
> > >Allen's automatic generation script (HADOOP-11731) went into trunk but
> not
> > >branch-2, so we're still maintaining CHANGES.txt everywhere. What do
> > >people
> > >think about backporting this to branch-2 and then removing CHANGES.txt
> > >from
> > >trunk/branch-2 (HADOOP-11792)? Based on discussion on this thread and in
> > >HADOOP-11731, we seem to agree that CHANGES.txt is an unreliable source
> of
> > >information, and JIRA is at least as reliable and probably much more so.
> > >Thus I don't see any downsides to backporting it.
> > >
> > >Would like to hear everyone's thoughts on this, I'm willing to drive the
> > >effort.
> > >
> > >Thanks,
> > >Andrew
> > >
> > >On Thu, Apr 2, 2015 at 2:00 PM, Tsz Wo Sze 
> > >wrote:
> > >
> > >> Generating change log from JIRA is a good idea.  It is based on an
> > >>assumption
> > >> that each JIRA has an accurate summary (a.k.a. JIRA title) to reflect
> > >>the
> > >> committed change. Unfortunately, the assumption is invalid for many
> > >>cases
> > >> since we never enforce that the JIRA summary must be the same as the
> > >>change
> > >> log.  We may compare the current CHANGES.txt with the generated change
> > >> log.  I bet the diff is long.
> > >> Besides, after a release R1 is out, someone may (accidentally or
> > >> intentionally) modify the JIRA summary.  Then, the entry for the same
> > >>item
> > >> in a later release R2 could be different from the one in R1.
> > >> I agree that manually editing CHANGES.txt is not a perfect solution.
> > >> However, it has worked well in the past for many releases.  I suggest we
> keep
> > >> the current dev workflow.  Try using the new script provided by
> > >> HADOOP-11731 to generate the next release.  If everything works well,
> we
> > >> shall remove CHANGES.txt and revise the dev workflow.  What do you
> > >>think?
> > >> Regards, Tsz-Wo
> > >>
> > >>
> > >>  On Thursday, April 2, 2015 12:57 PM, Allen Wittenauer <
> > >> a...@altiscale.com > wrote:
> > >>
> > >>
> > >>
> > >>
> > >>
> > >> On Apr 2, 2015, at 12:40 PM, Vinod Kumar Vavilapalli <
> > >> vino...@hortonworks.com > wrote:
> > >>
> > >> >
> > >> > We'd then be doing two commits for every patch. Let's simply not remove
> > >> CHANGES.txt from trunk, keep the existing dev workflow, but doc the
> > >>release
> > >> process to remove CHANGES.txt in trunk at the time of a release going
> > >>out
> > >> of trunk.
> > >>
> > >>
> > >>
> > >> Might as well copy branch-2’s changes.txt into trunk then. (or 2.7’s.
> > >> Last I looked, people updated branch-2 and not 2.7’s or vice versa for
> > >>some
> > >> patches that went into both branches.)  So that folks who are
> > >>committing to
> > >> both branches and want to cherry pick all changes can.
> > >>
> > >> I mean, trunk’s is very very very wrong. Right now. Today. Borderline
> > >> useless. See HADOOP-11718 (which I will now close out as won’t fix)…
> and
> > >> that jira is only what is miscategorized, not what is missing.
> > >>
> > >>
> > >>
> > >>
> >
> >
>
> --
> Mobile
>


Re: IMPORTANT: automatic changelog creation

2015-07-02 Thread Karthik Kambatla
Huge +1

On Thursday, July 2, 2015, Chris Nauroth  wrote:

> +1
>
> Thank you to Allen for the script, and thank you to Andrew for
> volunteering to drive the conversion.
>
> --Chris Nauroth
>
>
>
>
> On 7/2/15, 2:01 PM, "Andrew Wang" wrote:
>
> >Hi all,
> >
> >I want to revive the discussion on this thread, since the overhead of
> >CHANGES.txt came up again in the context of backporting fixes for
> >maintenance releases.
> >
> >Allen's automatic generation script (HADOOP-11731) went into trunk but not
> >branch-2, so we're still maintaining CHANGES.txt everywhere. What do
> >people
> >think about backporting this to branch-2 and then removing CHANGES.txt
> >from
> >trunk/branch-2 (HADOOP-11792)? Based on discussion on this thread and in
> >HADOOP-11731, we seem to agree that CHANGES.txt is an unreliable source of
> >information, and JIRA is at least as reliable and probably much more so.
> >Thus I don't see any downsides to backporting it.
> >
> >Would like to hear everyone's thoughts on this, I'm willing to drive the
> >effort.
> >
> >Thanks,
> >Andrew
> >
> >On Thu, Apr 2, 2015 at 2:00 PM, Tsz Wo Sze 
> >wrote:
> >
> >> Generating change log from JIRA is a good idea.  It is based on an
> >>assumption
> >> that each JIRA has an accurate summary (a.k.a. JIRA title) to reflect
> >>the
> >> committed change. Unfortunately, the assumption is invalid for many
> >>cases
> >> since we never enforce that the JIRA summary must be the same as the
> >>change
> >> log.  We may compare the current CHANGES.txt with the generated change
> >> log.  I bet the diff is long.
> >> Besides, after a release R1 is out, someone may (accidentally or
> >> intentionally) modify the JIRA summary.  Then, the entry for the same
> >>item
> >> in a later release R2 could be different from the one in R1.
> >> I agree that manually editing CHANGES.txt is not a perfect solution.
> >> However, it has worked well in the past for many releases.  I suggest we keep
> >> the current dev workflow.  Try using the new script provided by
> >> HADOOP-11731 to generate the next release.  If everything works well, we
> >> shall remove CHANGES.txt and revise the dev workflow.  What do you
> >>think?
> >> Regards, Tsz-Wo
> >>
> >>
> >>  On Thursday, April 2, 2015 12:57 PM, Allen Wittenauer <
> >> a...@altiscale.com > wrote:
> >>
> >>
> >>
> >>
> >>
> >> On Apr 2, 2015, at 12:40 PM, Vinod Kumar Vavilapalli <
> >> vino...@hortonworks.com > wrote:
> >>
> >> >
> >> > We'd then be doing two commits for every patch. Let's simply not remove
> >> CHANGES.txt from trunk, keep the existing dev workflow, but doc the
> >>release
> >> process to remove CHANGES.txt in trunk at the time of a release going
> >>out
> >> of trunk.
> >>
> >>
> >>
> >> Might as well copy branch-2’s changes.txt into trunk then. (or 2.7’s.
> >> Last I looked, people updated branch-2 and not 2.7’s or vice versa for
> >>some
> >> patches that went into both branches.)  So that folks who are
> >>committing to
> >> both branches and want to cherry pick all changes can.
> >>
> >> I mean, trunk’s is very very very wrong. Right now. Today. Borderline
> >> useless. See HADOOP-11718 (which I will now close out as won’t fix)… and
> >> that jira is only what is miscategorized, not what is missing.
> >>
> >>
> >>
> >>
>
>

-- 
Mobile


[jira] [Created] (YARN-3882) AggregatedLogFormat should close aclScanner and ownerScanner after create them.

2015-07-02 Thread zhihai xu (JIRA)
zhihai xu created YARN-3882:
---

 Summary: AggregatedLogFormat should close aclScanner and 
ownerScanner after create them.
 Key: YARN-3882
 URL: https://issues.apache.org/jira/browse/YARN-3882
 Project: Hadoop YARN
  Issue Type: Bug
  Components: nodemanager
Affects Versions: 2.7.0
Reporter: zhihai xu
Assignee: zhihai xu
Priority: Minor


AggregatedLogFormat should close aclScanner and ownerScanner after creating them. 
{{aclScanner}} and {{ownerScanner}} are created by createScanner in 
{{getApplicationAcls}} and {{getApplicationOwner}} and are never closed. 
{{TFile.Reader.Scanner}} implements java.io.Closeable. We should close them 
after using them.
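
In Java the usual way to guarantee the close is try-with-resources, since the scanner is Closeable. The same "close on every exit path" pattern is sketched below in Python with contextlib.closing. FakeScanner and read_first are hypothetical stand-ins for illustration only; they do not model the TFile API.

```python
from contextlib import closing


class FakeScanner:
    """Hypothetical stand-in for a scanner that holds a resource until closed."""

    def __init__(self, entries):
        self._entries = list(entries)
        self.closed = False

    def __iter__(self):
        return iter(self._entries)

    def close(self):
        self.closed = True


def read_first(scanner):
    """Return the first entry, closing the scanner on every exit path."""
    with closing(scanner) as open_scanner:
        for entry in open_scanner:
            return entry  # close() still runs when returning early
    return None
```

The point of the sketch is that the close happens whether the loop returns early, finds nothing, or raises.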





Re: IMPORTANT: automatic changelog creation

2015-07-02 Thread Chris Nauroth
+1

Thank you to Allen for the script, and thank you to Andrew for
volunteering to drive the conversion.

--Chris Nauroth




On 7/2/15, 2:01 PM, "Andrew Wang"  wrote:

>Hi all,
>
>I want to revive the discussion on this thread, since the overhead of
>CHANGES.txt came up again in the context of backporting fixes for
>maintenance releases.
>
>Allen's automatic generation script (HADOOP-11731) went into trunk but not
>branch-2, so we're still maintaining CHANGES.txt everywhere. What do
>people
>think about backporting this to branch-2 and then removing CHANGES.txt
>from
>trunk/branch-2 (HADOOP-11792)? Based on discussion on this thread and in
>HADOOP-11731, we seem to agree that CHANGES.txt is an unreliable source of
>information, and JIRA is at least as reliable and probably much more so.
>Thus I don't see any downsides to backporting it.
>
>Would like to hear everyone's thoughts on this, I'm willing to drive the
>effort.
>
>Thanks,
>Andrew
>
>On Thu, Apr 2, 2015 at 2:00 PM, Tsz Wo Sze 
>wrote:
>
>> Generating change log from JIRA is a good idea.  It is based on an
>>assumption
>> that each JIRA has an accurate summary (a.k.a. JIRA title) to reflect
>>the
>> committed change. Unfortunately, the assumption is invalid for many
>>cases
>> since we never enforce that the JIRA summary must be the same as the
>>change
>> log.  We may compare the current CHANGES.txt with the generated change
>> log.  I bet the diff is long.
>> Besides, after a release R1 is out, someone may (accidentally or
>> intentionally) modify the JIRA summary.  Then, the entry for the same
>>item
>> in a later release R2 could be different from the one in R1.
>> I agree that manually editing CHANGES.txt is not a perfect solution.
>> However, it has worked well in the past for many releases.  I suggest we keep
>> the current dev workflow.  Try using the new script provided by
>> HADOOP-11731 to generate the next release.  If everything works well, we
>> shall remove CHANGES.txt and revise the dev workflow.  What do you
>>think?
>> Regards, Tsz-Wo
>>
>>
>>  On Thursday, April 2, 2015 12:57 PM, Allen Wittenauer <
>> a...@altiscale.com> wrote:
>>
>>
>>
>>
>>
>> On Apr 2, 2015, at 12:40 PM, Vinod Kumar Vavilapalli <
>> vino...@hortonworks.com> wrote:
>>
>> >
>> > We'd then be doing two commits for every patch. Let's simply not remove
>> CHANGES.txt from trunk, keep the existing dev workflow, but doc the
>>release
>> process to remove CHANGES.txt in trunk at the time of a release going
>>out
>> of trunk.
>>
>>
>>
>> Might as well copy branch-2’s changes.txt into trunk then. (or 2.7’s.
>> Last I looked, people updated branch-2 and not 2.7’s or vice versa for
>>some
>> patches that went into both branches.)  So that folks who are
>>committing to
>> both branches and want to cherry pick all changes can.
>>
>> I mean, trunk’s is very very very wrong. Right now. Today. Borderline
>> useless. See HADOOP-11718 (which I will now close out as won’t fix)… and
>> that jira is only what is miscategorized, not what is missing.
>>
>>
>>
>>



Re: IMPORTANT: automatic changelog creation

2015-07-02 Thread Andrew Wang
Hi all,

I want to revive the discussion on this thread, since the overhead of
CHANGES.txt came up again in the context of backporting fixes for
maintenance releases.

Allen's automatic generation script (HADOOP-11731) went into trunk but not
branch-2, so we're still maintaining CHANGES.txt everywhere. What do people
think about backporting this to branch-2 and then removing CHANGES.txt from
trunk/branch-2 (HADOOP-11792)? Based on discussion on this thread and in
HADOOP-11731, we seem to agree that CHANGES.txt is an unreliable source of
information, and JIRA is at least as reliable and probably much more so.
Thus I don't see any downsides to backporting it.

Would like to hear everyone's thoughts on this, I'm willing to drive the
effort.

Thanks,
Andrew

On Thu, Apr 2, 2015 at 2:00 PM, Tsz Wo Sze 
wrote:

> Generating change log from JIRA is a good idea.  It is based on an assumption
> that each JIRA has an accurate summary (a.k.a. JIRA title) to reflect the
> committed change. Unfortunately, the assumption is invalid for many cases
> since we never enforce that the JIRA summary must be the same as the change
> log.  We may compare the current CHANGES.txt with the generated change
> log.  I bet the diff is long.
> Besides, after a release R1 is out, someone may (accidentally or
> intentionally) modify the JIRA summary.  Then, the entry for the same item
> in a later release R2 could be different from the one in R1.
> I agree that manually editing CHANGES.txt is not a perfect solution.
> However, it has worked well in the past for many releases.  I suggest we keep
> the current dev workflow.  Try using the new script provided by
> HADOOP-11731 to generate the next release.  If everything works well, we
> shall remove CHANGES.txt and revise the dev workflow.  What do you think?
> Regards, Tsz-Wo
>
>
>  On Thursday, April 2, 2015 12:57 PM, Allen Wittenauer <
> a...@altiscale.com> wrote:
>
>
>
>
>
> On Apr 2, 2015, at 12:40 PM, Vinod Kumar Vavilapalli <
> vino...@hortonworks.com> wrote:
>
> >
> > We'd then be doing two commits for every patch. Let's simply not remove
> CHANGES.txt from trunk, keep the existing dev workflow, but doc the release
> process to remove CHANGES.txt in trunk at the time of a release going out
> of trunk.
>
>
>
> Might as well copy branch-2’s changes.txt into trunk then. (or 2.7’s.
> Last I looked, people updated branch-2 and not 2.7’s or vice versa for some
> patches that went into both branches.)  So that folks who are committing to
> both branches and want to cherry pick all changes can.
>
> I mean, trunk’s is very very very wrong. Right now. Today. Borderline
> useless. See HADOOP-11718 (which I will now close out as won’t fix)… and
> that jira is only what is miscategorized, not what is missing.
>
>
>
>
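
The comparison suggested in the quoted message (current CHANGES.txt versus a change log generated from JIRA) can be sketched roughly as follows. This is a minimal Python illustration, not the HADOOP-11731 script. It assumes single-line CHANGES.txt entries of the form "KEY-123. Summary. (author)" and a dict of JIRA summaries that has already been fetched. The function names and the sample data are made up.

```python
import re

# Assumed entry layout: "    HADOOP-123. Some summary text. (author)"
ENTRY = re.compile(r"^\s*((?:HADOOP|HDFS|YARN|MAPREDUCE)-\d+)\.\s+(.*\S)\s*$")


def parse_changes(text):
    """Map issue key -> text of its single-line CHANGES.txt entry."""
    entries = {}
    for line in text.splitlines():
        match = ENTRY.match(line)
        if match:
            entries[match.group(1)] = match.group(2)
    return entries


def diff_changelog(changes_text, jira_summaries):
    """Return (only in CHANGES.txt, only in JIRA, present in both but reworded)."""
    manual = parse_changes(changes_text)
    only_manual = sorted(set(manual) - set(jira_summaries))
    only_jira = sorted(set(jira_summaries) - set(manual))
    # A manual entry usually ends with "(author)", so compare by prefix.
    reworded = sorted(
        key
        for key in manual.keys() & jira_summaries.keys()
        if not manual[key].startswith(jira_summaries[key].rstrip("."))
    )
    return only_manual, only_jira, reworded


# Made-up sample data, for illustration only.
SAMPLE_CHANGES = """
    HADOOP-1. Fix the frobnicator. (alice)
    HADOOP-2. Add widget support. (bob via carol)
"""
SAMPLE_JIRA = {
    "HADOOP-1": "Fix the frobnicator",
    "HADOOP-2": "Widgets",
    "HADOOP-3": "Remove gadget",
}
```

With the sample data, HADOOP-3 is reported as missing from CHANGES.txt and HADOOP-2 as reworded, which are the two kinds of drift the thread is worried about.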


[jira] [Created] (YARN-3881) Writing RM cluster-level metrics

2015-07-02 Thread Zhijie Shen (JIRA)
Zhijie Shen created YARN-3881:
-

 Summary: Writing RM cluster-level metrics
 Key: YARN-3881
 URL: https://issues.apache.org/jira/browse/YARN-3881
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: timelineserver
Reporter: Zhijie Shen
Assignee: Zhijie Shen


The RM has a bunch of metrics that we may want to write into the timeline backend 
too. I attached the metrics.json that I crawled via 
{{http://localhost:8088/jmx?qry=Hadoop:*}}. IMHO, we need to pay attention to 
three groups of metrics:

1. QueueMetrics
2. JvmMetrics
3. ClusterMetrics

The problem is that, unlike other metrics, which belong to a single application, 
these belong to the RM or are cluster-wide. Therefore, the current write path is 
not going to work for these metrics because they don't have the associated 
user/flow/app context info. We need to rethink how to model cross-app metrics 
and the API to handle them.
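
As a rough illustration of the three groups, the sketch below sorts the beans of a crawled JMX dump by group. It is a minimal Python example, assuming the dump is a JSON document with a top-level "beans" list whose entries carry a "name" such as "Hadoop:service=ResourceManager,name=JvmMetrics". The bean names and values in SAMPLE_JMX are illustrative assumptions, not copied from the attached metrics.json.

```python
import json

GROUPS = ("QueueMetrics", "JvmMetrics", "ClusterMetrics")


def group_beans(jmx_json):
    """Bucket JMX beans into the three metric groups; other beans are ignored."""
    grouped = {group: [] for group in GROUPS}
    for bean in json.loads(jmx_json).get("beans", []):
        name = bean.get("name", "")
        for group in GROUPS:
            if "name=" + group in name:
                grouped[group].append(bean)
                break
    return grouped


# Illustrative payload; names and values are assumptions.
SAMPLE_JMX = json.dumps({
    "beans": [
        {"name": "Hadoop:service=ResourceManager,name=QueueMetrics,q0=root",
         "AppsRunning": 1},
        {"name": "Hadoop:service=ResourceManager,name=JvmMetrics",
         "MemHeapUsedM": 120.5},
        {"name": "Hadoop:service=ResourceManager,name=ClusterMetrics",
         "NumActiveNMs": 4},
        {"name": "Hadoop:service=ResourceManager,name=RpcActivityForPort8032"},
    ]
})
```

None of the grouped beans carries a user, flow or app identifier, which is why the per-application write path has nothing to key them on.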





[jira] [Created] (YARN-3880) Writing more RM side app-level metrics

2015-07-02 Thread Zhijie Shen (JIRA)
Zhijie Shen created YARN-3880:
-

 Summary: Writing more RM side app-level metrics
 Key: YARN-3880
 URL: https://issues.apache.org/jira/browse/YARN-3880
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: timelineserver
Reporter: Zhijie Shen
Assignee: Zhijie Shen


In YARN-3044, we implemented an analog of the metrics publisher for ATS v1. While 
it helps to write app/attempt/container life cycle events, it doesn't 
write many of the app-level system metrics that the RM now has. Here are the 
metrics that I found missing:

* runningContainers
* memorySeconds
* vcoreSeconds
* preemptedResourceMB
* preemptedResourceVCores
* numNonAMContainerPreempted
* numAMContainerPreempted

Please feel free to add more to the list if you find something that is not covered.
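
One way to keep such a list honest is a small check that compares the metric names a publisher emits against the names listed above. The sketch below is a hypothetical Python illustration; missing_metrics and the shape of its input are assumptions, not part of the RM or timeline service code.

```python
# Metric names taken from the list in this issue.
EXPECTED_APP_METRICS = (
    "runningContainers",
    "memorySeconds",
    "vcoreSeconds",
    "preemptedResourceMB",
    "preemptedResourceVCores",
    "numNonAMContainerPreempted",
    "numAMContainerPreempted",
)


def missing_metrics(published_names):
    """Return the expected metric names absent from what was published."""
    published = set(published_names)
    return [name for name in EXPECTED_APP_METRICS if name not in published]
```

For example, missing_metrics(["memorySeconds", "vcoreSeconds"]) returns the other five names in the order listed.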





Re: [VOTE] Release Apache Hadoop 2.7.1 RC0

2015-07-02 Thread Masatake Iwasaki

+1 (non-binding)

+ verified mds of source and binary tarball
+ built from source tarball
+ deployed binary tarball to a 4-node cluster and ran some 
hadoop-mapreduce-examples jobs


Thanks,
Masatake Iwasaki
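
For readers who want to repeat the digest check listed above, here is a minimal Python sketch that computes a file digest and compares it with a published value. It assumes the expected digest has been copied by hand from the release's checksum file, which may print digests upper-case and in space-separated groups. The function names are hypothetical, and the sketch does not parse the checksum file itself or verify signatures.

```python
import hashlib


def file_digest(path, algorithm="sha256", chunk_size=1 << 20):
    """Hash a file in chunks so that large tarballs need not fit in memory."""
    digest = hashlib.new(algorithm)
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def digest_matches(path, expected_hex, algorithm="sha256"):
    """Compare against a published digest, ignoring case and whitespace."""
    normalized = "".join(expected_hex.split()).lower()
    return file_digest(path, algorithm) == normalized
```

A call such as digest_matches("hadoop-2.7.1-src.tar.gz", expected) would then return True only when the tarball matches; the file name here is an assumed example.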


On 6/29/15 17:45, Vinod Kumar Vavilapalli wrote:

Hi all,

I've created a release candidate RC0 for Apache Hadoop 2.7.1.

As discussed before, this is the next stable release to follow up 2.6.0,
and the first stable one in the 2.7.x line.

The RC is available for validation at:
http://people.apache.org/~vinodkv/hadoop-2.7.1-RC0/

The RC tag in git is: release-2.7.1-RC0

The maven artifacts are available via repository.apache.org at
https://repository.apache.org/content/repositories/orgapachehadoop-1019/

Please try the release and vote; the vote will run for the usual 5 days.

Thanks,
Vinod

PS: It took 2 months instead of the planned [1] 2 weeks in getting this
release out: post-mortem in a separate thread.

[1]: A 2.7.1 release to follow up 2.7.0
http://markmail.org/thread/zwzze6cqqgwq4rmw