Re: [DISCUSS] A unified and open Hadoop community sync up schedule?

2019-06-14 Thread Wangda Tan
And please let me know if you can help coordinate logistics, cross-checking,
etc. Let's spend some time next week to get it finalized.

Thanks,
Wangda

On Fri, Jun 14, 2019 at 4:00 PM Wangda Tan  wrote:

> Hi Folks,
>
> Yufei: Agree with all your opinions.
>
> Anu: it might be more efficient to use a Google Doc to track meeting
> minutes so we can keep them together.
>
> I just put the proposal on
> https://calendar.google.com/calendar/b/3?cid=aGFkb29wLmNvbW11bml0eS5zeW5jLnVwQGdtYWlsLmNvbQ,
> so you can check whether the proposed times work for you. If you agree, we
> can go ahead and add the meeting link, Google Doc, etc.
>
> If you want edit permissions, please drop me a private email and I will
> add you.
>
> We still need more hosts: ideally we should have at least 3 hosts per
> track, just like HDFS blocks :). Please volunteer so we have enough members
> to run the meetings.
>
> Let's aim to get all the logistics done by the end of next week and start
> the community sync-up series the week of Jun 25th.
>
> Thanks,
> Wangda
>
>
>
> On Tue, Jun 11, 2019 at 10:23 AM Anu Engineer 
> wrote:
>
>> For Ozone, we have started using the wiki itself as the agenda, and after
>> the meeting is over we convert it into the meeting notes.
>> Here is an example; the project owner can edit and maintain it, it is about
>> 10 minutes of work, and it allows anyone to add items to the agenda too.
>>
>>
>> https://cwiki.apache.org/confluence/display/HADOOP/2019-06-10+Meeting+notes
>>
>> --Anu
>>
>> On Tue, Jun 11, 2019 at 10:20 AM Yufei Gu  wrote:
>>
>>> +1 for this idea. Thanks, Wangda, for bringing this up.
>>>
>>> Some comments to share:
>>>
>>>    - The agenda should be posted ahead of the meeting, and any interested
>>>    party is welcome to contribute topics.
>>>    - We should encourage more people to attend. That's the whole point of
>>>    the meeting.
>>>    - Hopefully this can mitigate the situation where some patches wait for
>>>    review forever, which turns away new contributors.
>>>    - 30 minutes per session sounds a little short; we can try it out and
>>>    see if an extension is needed.
>>>
>>> Best,
>>>
>>> Yufei
>>>
>>> `This is not a contribution`
>>>
>>>
>>> On Fri, Jun 7, 2019 at 4:39 PM Wangda Tan  wrote:
>>>
>>> > Hi Hadoop-devs,
>>> >
>>> > Previously we had a regular YARN community sync-up (1 hr, biweekly, but
>>> > not open to the public). Recently, because of changes in our schedules,
>>> > fewer folks have shown up at the sync-up over the last several months.
>>> >
>>> > I saw the K8s community does a pretty good job of running their SIG
>>> > meetings; there are regular meetings for different topics, with notes,
>>> > agendas, etc. Such as
>>> >
>>> > https://docs.google.com/document/d/13mwye7nvrmV11q9_Eg77z-1w3X7Q1GTbslpml4J7F3A/edit
>>> >
>>> >
>>> > For the Hadoop community, there are fewer such regular meetings open to
>>> > the public, except for the Ozone project and offline meetups or
>>> > Birds-of-a-Feather sessions at Hadoop/DataWorks Summit. Recently a few
>>> > folks joined DataWorks Summit in Washington DC and Barcelona, and lots
>>> > (50+) of folks joined the Ozone/Hadoop/YARN BoFs, asking (good)
>>> > questions and discussing roadmaps. I think it is important to open such
>>> > conversations to the public and let more folks/companies join.
>>> >
>>> > I discussed this with a small group of community members and wrote a
>>> > short proposal about the form, time, and topics of the community sync-ups.
>>> > Thanks to everybody who contributed to the proposal! Please feel free to
>>> > add your thoughts to the proposal Google Doc
>>> > <
>>> > https://docs.google.com/document/d/1GfNpYKhNUERAEH7m3yx6OfleoF3MqoQk3nJ7xqHD9nY/edit#
>>> > >
>>> > .
>>> >
>>> > Especially for the following parts:
>>> > - If you are interested in running any of the community sync-ups, please
>>> > put your name in the table inside the proposal. We need more volunteers
>>> > to help run the sync-ups in different timezones.
>>> > - Please add suggestions on the time, frequency, and themes, and feel
>>> > free to share your thoughts on whether we should do sync-ups for other
>>> > topics not covered by the proposal.
>>> >
>>> > Link to the proposal Google Doc
>>> > <
>>> > https://docs.google.com/document/d/1GfNpYKhNUERAEH7m3yx6OfleoF3MqoQk3nJ7xqHD9nY/edit#
>>> > >
>>> >
>>> > Thanks,
>>> > Wangda Tan
>>> >
>>>
>>


Apache Hadoop qbt Report: trunk+JDK8 on Linux/x86

2019-06-14 Thread Apache Jenkins Server
For more details, see 
https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/1167/

[Jun 13, 2019 2:44:47 AM] (wwei) YARN-9578. Add limit/actions/summarize options 
for app activities REST
[Jun 13, 2019 3:08:15 AM] (xyao) HDDS-1587. Support dynamically adding 
delegated classes from to isolated
[Jun 13, 2019 6:08:35 PM] (gifuma) YARN-9599. 
TestContainerSchedulerQueuing#testQueueShedding fails
[Jun 13, 2019 11:04:14 PM] (bharat) HDDS-1677. Auditparser robot test shold use 
a world writable working
[Jun 13, 2019 11:18:15 PM] (bharat) HDDS-1680. Create missing parent 
directories during the creation of
[Jun 14, 2019 1:17:25 AM] (tasanuma) HADOOP-16369. Fix zstandard shortname 
misspelled as zts. Contributed by
[Jun 14, 2019 1:26:53 AM] (inigoiri) HDFS-14560. Allow block replication 
parameters to be refreshable.




-1 overall


The following subsystems voted -1:
asflicense findbugs hadolint pathlen unit


The following subsystems voted -1 but
were configured to be filtered/ignored:
cc checkstyle javac javadoc pylint shellcheck shelldocs whitespace


The following subsystems are considered long running:
(runtime bigger than 1h  0m  0s)
unit


Specific tests:

FindBugs :

   
module:hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-timelineservice-documentstore
 
   Unread field:TimelineEventSubDoc.java:[line 56] 
   Unread field:TimelineMetricSubDoc.java:[line 44] 

FindBugs :

   
module:hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-mawo/hadoop-yarn-applications-mawo-core
 
   Class org.apache.hadoop.applications.mawo.server.common.TaskStatus 
implements Cloneable but does not define or use clone method At 
TaskStatus.java:does not define or use clone method At TaskStatus.java:[lines 
39-346] 
   Equals method for 
org.apache.hadoop.applications.mawo.server.worker.WorkerId assumes the argument 
is of type WorkerId At WorkerId.java:the argument is of type WorkerId At 
WorkerId.java:[line 114] 
   
org.apache.hadoop.applications.mawo.server.worker.WorkerId.equals(Object) does 
not check for null argument At WorkerId.java:null argument At 
WorkerId.java:[lines 114-115] 
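
   For reference, the contract FindBugs expects here is an equals(Object) that
   rejects null and non-WorkerId arguments before casting, with a hashCode()
   that stays consistent with equals(). A hypothetical sketch only, not the
   actual WorkerId code (the hostname field is assumed):

   // Hypothetical illustration of the equals()/hashCode() contract FindBugs
   // is checking for; not the real WorkerId implementation.
   public final class WorkerIdExample {
     private final String hostname;  // field name assumed for illustration

     public WorkerIdExample(String hostname) {
       this.hostname = hostname;
     }

     @Override
     public boolean equals(Object obj) {
       if (this == obj) {
         return true;
       }
       if (!(obj instanceof WorkerIdExample)) {  // instanceof is false for null
         return false;
       }
       WorkerIdExample other = (WorkerIdExample) obj;
       return hostname.equals(other.hostname);
     }

     @Override
     public int hashCode() {
       return hostname.hashCode();  // consistent with equals()
     }
   }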

Failed junit tests :

   hadoop.hdfs.TestDFSClientRetries 
   
hadoop.hdfs.server.namenode.sps.TestStoragePolicySatisfierWithStripedFile 
   hadoop.hdfs.web.TestWebHdfsTimeouts 
   hadoop.hdfs.tools.TestDFSAdminWithHA 
   hadoop.mapreduce.v2.app.TestRuntimeEstimators 
   hadoop.yarn.sls.appmaster.TestAMSimulator 
   hadoop.ozone.container.common.impl.TestHddsDispatcher 
   hadoop.hdds.scm.node.TestNodeReportHandler 
   hadoop.ozone.client.rpc.TestOzoneAtRestEncryption 
   hadoop.ozone.client.rpc.TestOzoneRpcClientWithRatis 
   hadoop.ozone.client.rpc.TestOzoneRpcClient 
   
hadoop.ozone.container.common.statemachine.commandhandler.TestBlockDeletion 
   hadoop.ozone.client.rpc.TestSecureOzoneRpcClient 
   hadoop.ozone.client.rpc.TestWatchForCommit 
  

   cc:

   
https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/1167/artifact/out/diff-compile-cc-root.txt
  [4.0K]

   javac:

   
https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/1167/artifact/out/diff-compile-javac-root.txt
  [332K]

   checkstyle:

   
https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/1167/artifact/out/diff-checkstyle-root.txt
  [17M]

   hadolint:

   
https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/1167/artifact/out/diff-patch-hadolint.txt
  [8.0K]

   pathlen:

   
https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/1167/artifact/out/pathlen.txt
  [12K]

   pylint:

   
https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/1167/artifact/out/diff-patch-pylint.txt
  [120K]

   shellcheck:

   
https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/1167/artifact/out/diff-patch-shellcheck.txt
  [20K]

   shelldocs:

   
https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/1167/artifact/out/diff-patch-shelldocs.txt
  [44K]

   whitespace:

   
https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/1167/artifact/out/whitespace-eol.txt
  [9.6M]
   
https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/1167/artifact/out/whitespace-tabs.txt
  [1.1M]

   findbugs:

   
https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/1167/artifact/out/branch-findbugs-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-timelineservice-documentstore-warnings.html
  [8.0K]
   
https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/1167/artifact/out/branch-findbugs-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-applications_hadoop-yarn-applications-mawo_hadoop-yarn-applications-mawo-core-warnings.html
  [8.0K]
   
https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/1167/artifact/out/branch-findbugs-hadoop-submarine_hadoop-sub

Re: Mapreduce to and from public clouds

2019-06-14 Thread Amit Kabra
Any help here?

On Thu, Jun 13, 2019 at 12:38 PM Amit Kabra  wrote:

> Hello,
>
> I have a requirement where I need to read/write data to a public cloud via a
> MapReduce job.
>
> Our systems currently read and write data from HDFS using MapReduce, and
> it's working well; we write data in SequenceFile format.
>
> We might have to move data to a public cloud, i.e. S3 / GCP, where everything
> remains the same except that we read/write to S3/GCP.
>
> I did a quick search for GCP and didn't find much info on doing MapReduce
> directly against it. The GCS connector for Hadoop
> 
> looks closest, but I didn't find any MapReduce sample for it.
>
> Any help on where to start, or whether it is even possible, say if an S3/GCP
> output format
> 
> is not there, etc., and we need to do some hack.
>
> Thanks,
> Amit Kabra.
>
>
>
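
In case it helps, a minimal driver sketch of one way to do this (untested; the
class name, bucket, and paths are hypothetical): an existing SequenceFile
MapReduce job can usually keep its InputFormat/OutputFormat and simply point
the input/output paths at a connector URI scheme, s3a:// for S3 (via
hadoop-aws) or gs:// for GCS (via the GCS connector), assuming the connector
jar and credentials are available on the cluster.

// Hypothetical sketch: copy SequenceFiles from HDFS to S3 with an identity,
// map-only job. Assumes hadoop-aws (for s3a://) or the GCS connector
// (for gs://) plus credentials are already configured; CloudCopyJob, the
// bucket name, and the paths below are made-up examples.
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.lib.input.SequenceFileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.SequenceFileOutputFormat;

public class CloudCopyJob {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    // Credentials are usually supplied via core-site.xml or the environment,
    // e.g. fs.s3a.access.key / fs.s3a.secret.key for S3.

    Job job = Job.getInstance(conf, "hdfs-to-s3a-sequencefile-copy");
    job.setJarByClass(CloudCopyJob.class);
    job.setMapperClass(Mapper.class);  // identity mapper
    job.setNumReduceTasks(0);          // map-only copy

    job.setInputFormatClass(SequenceFileInputFormat.class);
    job.setOutputFormatClass(SequenceFileOutputFormat.class);
    // Assumes Text keys/values in the source SequenceFiles.
    job.setOutputKeyClass(Text.class);
    job.setOutputValueClass(Text.class);

    SequenceFileInputFormat.addInputPath(job, new Path("hdfs:///data/events"));
    // Swap the scheme for the target store, e.g. gs://my-bucket/events for GCS.
    SequenceFileOutputFormat.setOutputPath(job, new Path("s3a://my-bucket/events"));

    System.exit(job.waitForCompletion(true) ? 0 : 1);
  }
}

The same approach works in the other direction (cloud storage in, HDFS out);
the heavy lifting is in the FileSystem connectors rather than in the MapReduce
code itself.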


Apache Hadoop qbt Report: branch2+JDK7 on Linux/x86

2019-06-14 Thread Apache Jenkins Server
For more details, see 
https://builds.apache.org/job/hadoop-qbt-branch2-java7-linux-x86/352/

[Jun 14, 2019 1:21:30 AM] (tasanuma) HADOOP-16369. Fix zstandard shortname 
misspelled as zts. Contributed by




-1 overall


The following subsystems voted -1:
asflicense findbugs hadolint pathlen unit xml


The following subsystems voted -1 but
were configured to be filtered/ignored:
cc checkstyle javac javadoc pylint shellcheck shelldocs whitespace


The following subsystems are considered long running:
(runtime bigger than 1h  0m  0s)
unit


Specific tests:

XML :

   Parsing Error(s): 
   
hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/conf/empty-configuration.xml
 
   hadoop-tools/hadoop-azure/src/config/checkstyle-suppressions.xml 
   hadoop-yarn-project/hadoop-yarn/hadoop-yarn-ui/public/crossdomain.xml 
   
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-ui/src/main/webapp/public/crossdomain.xml
 

FindBugs :

   module:hadoop-common-project/hadoop-common 
   Class org.apache.hadoop.fs.GlobalStorageStatistics defines non-transient 
non-serializable instance field map In GlobalStorageStatistics.java:instance 
field map In GlobalStorageStatistics.java 

FindBugs :

   
module:hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-timelineservice-hbase/hadoop-yarn-server-timelineservice-hbase-client
 
   Boxed value is unboxed and then immediately reboxed in 
org.apache.hadoop.yarn.server.timelineservice.storage.common.ColumnRWHelper.readResultsWithTimestamps(Result,
 byte[], byte[], KeyConverter, ValueConverter, boolean) At 
ColumnRWHelper.java:then immediately reboxed in 
org.apache.hadoop.yarn.server.timelineservice.storage.common.ColumnRWHelper.readResultsWithTimestamps(Result,
 byte[], byte[], KeyConverter, ValueConverter, boolean) At 
ColumnRWHelper.java:[line 335] 

Failed junit tests :

   hadoop.hdfs.server.namenode.TestDecommissioningStatus 
   hadoop.hdfs.qjournal.server.TestJournalNodeRespectsBindHostKeys 
   hadoop.hdfs.server.datanode.TestDataNodeUUID 
   hadoop.fs.contract.router.web.TestRouterWebHDFSContractSeek 
   hadoop.registry.secure.TestSecureLogins 
   hadoop.yarn.server.timelineservice.security.TestTimelineAuthFilterForV2 
   hadoop.yarn.client.api.impl.TestAMRMProxy 
   hadoop.yarn.sls.TestSLSRunner 
  

   cc:

   
https://builds.apache.org/job/hadoop-qbt-branch2-java7-linux-x86/352/artifact/out/diff-compile-cc-root-jdk1.7.0_95.txt
  [4.0K]

   javac:

   
https://builds.apache.org/job/hadoop-qbt-branch2-java7-linux-x86/352/artifact/out/diff-compile-javac-root-jdk1.7.0_95.txt
  [328K]

   cc:

   
https://builds.apache.org/job/hadoop-qbt-branch2-java7-linux-x86/352/artifact/out/diff-compile-cc-root-jdk1.8.0_212.txt
  [4.0K]

   javac:

   
https://builds.apache.org/job/hadoop-qbt-branch2-java7-linux-x86/352/artifact/out/diff-compile-javac-root-jdk1.8.0_212.txt
  [308K]

   checkstyle:

   
https://builds.apache.org/job/hadoop-qbt-branch2-java7-linux-x86/352/artifact/out/diff-checkstyle-root.txt
  [16M]

   hadolint:

   
https://builds.apache.org/job/hadoop-qbt-branch2-java7-linux-x86/352/artifact/out/diff-patch-hadolint.txt
  [4.0K]

   pathlen:

   
https://builds.apache.org/job/hadoop-qbt-branch2-java7-linux-x86/352/artifact/out/pathlen.txt
  [12K]

   pylint:

   
https://builds.apache.org/job/hadoop-qbt-branch2-java7-linux-x86/352/artifact/out/diff-patch-pylint.txt
  [24K]

   shellcheck:

   
https://builds.apache.org/job/hadoop-qbt-branch2-java7-linux-x86/352/artifact/out/diff-patch-shellcheck.txt
  [72K]

   shelldocs:

   
https://builds.apache.org/job/hadoop-qbt-branch2-java7-linux-x86/352/artifact/out/diff-patch-shelldocs.txt
  [8.0K]

   whitespace:

   
https://builds.apache.org/job/hadoop-qbt-branch2-java7-linux-x86/352/artifact/out/whitespace-eol.txt
  [12M]
   
https://builds.apache.org/job/hadoop-qbt-branch2-java7-linux-x86/352/artifact/out/whitespace-tabs.txt
  [1.2M]

   xml:

   
https://builds.apache.org/job/hadoop-qbt-branch2-java7-linux-x86/352/artifact/out/xml.txt
  [12K]

   findbugs:

   
https://builds.apache.org/job/hadoop-qbt-branch2-java7-linux-x86/352/artifact/out/branch-findbugs-hadoop-common-project_hadoop-common-warnings.html
  [8.0K]
   
https://builds.apache.org/job/hadoop-qbt-branch2-java7-linux-x86/352/artifact/out/branch-findbugs-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-timelineservice-hbase_hadoop-yarn-server-timelineservice-hbase-client-warnings.html
  [8.0K]

   javadoc:

   
https://builds.apache.org/job/hadoop-qbt-branch2-java7-linux-x86/352/artifact/out/diff-javadoc-javadoc-root-jdk1.7.0_95.txt
  [16K]
   
https://builds.apache.org/job/hadoop-qbt-branch2-java7-linux-x86/352/artifact/out/diff-javadoc-javadoc-root-jdk1.8.0_212.txt
  [1.1M]

   unit:

   
https://builds.apache.org/job/hadoop-qbt-branch2-java7-lin