Re: Hadoop 3.2 Release Plan proposal

2018-08-30 Thread Virajith Jalaparti
Hi Sunil,

Quick correction on the task list (missed this earlier) -- HDFS-12615 is
being done by Inigo Goiri.

-Virajith



On Thu, Aug 30, 2018 at 9:30 AM Sunil G  wrote:

> Hi All,
>
> In line with the earlier communication dated 17th July 2018, I would like
> to provide some updates.
>
> We are approaching previously proposed code freeze date (Aug 31).
>
> The merge discussion/vote for one of the critical features, Node
> Attributes, is ongoing. A few other Blocker bugs also need a bit more
> time. Given this, I suggest pushing the feature/code freeze out by two
> more weeks to accommodate these JIRAs as well.
>
> Proposed updated plan in line with this:
> Feature freeze date: all features to merge by September 7, 2018.
> Code freeze date: blocker/critical bug fixes only, no improvements;
> September 14, 2018.
> Release date: September 28, 2018
>
> If you have any features in branch targeted for 3.2.0, please reply to
> this email thread.
>
> *Here's an updated 3.2.0 feature status:*
>
> 1. Merged & Completed features:
>
> - (Wangda) YARN-8561: Hadoop Submarine project for deep learning workloads.
> Initial cut.
> - (Uma) HDFS-10285: HDFS Storage Policy Satisfier
> - (Sunil) YARN-7494: Multi Node scheduling support in Capacity Scheduler.
> - (Chandni/Eric) YARN-7512: Support service upgrade via YARN Service API
> and CLI.
>
> 2. Features close to finish:
>
> - (Naga/Sunil) YARN-3409: Node Attributes support in YARN. Merge/Vote
> Ongoing.
> - (Rohith) YARN-5742: Serve aggregated logs of historical apps from ATSv2.
> Patch in progress.
> - (Virajith) HDFS-12615: Router-based HDFS federation. Improvement work in
> progress.
> - (Steve) S3Guard Phase III, S3a phase V, Support Windows Azure Storage. In
> progress.
>
> 3. Tentative features:
>
> - (Haibo Chen) YARN-1011: Resource overcommitment. Looks challenging to be
> done before Aug 2018.
> - (Eric) YARN-7129: Application Catalog for YARN applications. Challenging
> as more discussions are on-going.
>
> *Summary of 3.2.0 issues status:*
>
> 26 Blocker and Critical issues [1] are open; I am following up with their
> owners to get each of them in by the code freeze date.
>
> [1] project in (YARN, HADOOP, MAPREDUCE, HDFS) AND priority in (Blocker,
> Critical) AND resolution = Unresolved AND "Target Version/s" = 3.2.0 ORDER
> BY priority DESC
>
> Thanks,
> Sunil
>
> On Tue, Aug 14, 2018 at 10:30 PM Sunil G  wrote:
>
> > Hi All,
> >
> > Thanks for the feedback. In line with the earlier communication dated
> > 17th July 2018, I would like to provide some updates.
> >
> > We are approaching previously proposed feature freeze date (Aug 21, about
> > 7 days from today).
> > If you have any features in branch targeted for 3.2.0, please reply to
> > this email thread.
> > Steve has mentioned the S3 features, which will come close to the code
> > freeze date (Aug 31st).
> >
> > *Here's an updated 3.2.0 feature status:*
> >
> > 1. Merged & Completed features:
> >
> > - (Wangda) YARN-8561: Hadoop Submarine project for deep learning
> > workloads. Initial cut.
> > - (Uma) HDFS-10285: HDFS Storage Policy Satisfier
> >
> > 2. Features close to finish:
> >
> > - (Naga/Sunil) YARN-3409: Node Attributes support in YARN. Major patches
> > are all in; only the last patch is in review.
> > - (Sunil) YARN-7494: Multi Node scheduling support in Capacity Scheduler.
> > Close to commit.
> > - (Chandni/Eric) YARN-7512: Support service upgrade via YARN Service API
> > and CLI. 2 patches are pending, which will be closed by the feature
> > freeze date.
> > - (Rohith) YARN-5742: Serve aggregated logs of historical apps from
> ATSv2.
> > Patch in progress.
> > - (Virajith) HDFS-12615: Router-based HDFS federation. Improvement work
> > in progress.
> > - (Steve) S3Guard Phase III, S3a phase V, Support Windows Azure Storage.
> > In progress.
> >
> > 3. Tentative features:
> >
> > - (Haibo Chen) YARN-1011: Resource overcommitment. Looks challenging to
> be
> > done before Aug 2018.
> > - (Eric) YARN-7129: Application Catalog for YARN applications.
> Challenging
> > as more discussions are on-going.
> >
> > *Summary of 3.2.0 issues status:*
> >
> > 39 Blocker and Critical issues [1] are open; I am checking with their
> > owners to get each of them in by the code freeze date.
> >
> > [1] project in (YARN, HADOOP, MAPREDUCE, HDFS) AND priority in (Blocker,
> > Critical) AND resolution = Unresolved AND "Target Version/s" = 3.2.0
> ORDER
> > BY priority DESC
> >
> > Thanks,
> > Sunil
> >
> > On Fri, Jul 20, 2018 at 8:03 AM Sunil G  wrote:
> >
> >> Thanks Subru for the thoughts.
> >> One of the main reasons for a major release is to push critical
> >> features out to users at a faster cadence. If we pull more and more
> >> different types of features into a minor release, that branch becomes
> >> more destabilized, and it may be tough to say that 3.1.2 is more stable
> >> than 3.1.1, for example. We always tend to improve and stabilize
> >> features in subsequent minor releases.
> >> F

Re: [VOTE] Release Apache Hadoop 3.2.0 - RC1

2019-01-14 Thread Virajith Jalaparti
Thanks Sunil and others who have worked on making this release happen!

+1 (non-binding)

- Built from source
- Deployed a pseudo-distributed one node cluster
- Ran basic wordcount, sort, pi jobs
- Basic HDFS/WebHDFS commands
- Ran all the ABFS driver tests against an ADLS Gen 2 account in EAST US

Non-blockers (AFAICT): The following tests in ABFS (HADOOP-15407) fail:
- The ACL tests ({{ITestAzureBlobFilesystemAcl}}); however, I believe these
have been fixed in trunk.
- {{ITestAzureBlobFileSystemE2EScale#testWriteHeavyBytesToFileAcrossThreads}}
fails with an OutOfMemoryError exception. I see the same failure on trunk
as well.


On Mon, Jan 14, 2019 at 6:21 AM Elek, Marton  wrote:

> Thanks Sunil for managing this release.
>
> +1 (non-binding)
>
> 1. built from the source (with clean local maven repo)
> 2. verified signatures + checksum
> 3. deployed 3 node cluster to Google Kubernetes Engine with generated
> k8s resources [1]
> 4. Executed basic HDFS commands
> 5. Executed basic yarn example jobs
>
> Marton
>
> [1]: FTR: resources:
> https://github.com/flokkr/k8s/tree/master/examples/hadoop , generator:
> https://github.com/elek/flekszible
>
>
> On 1/8/19 12:42 PM, Sunil G wrote:
> > Hi folks,
> >
> >
> > Thanks to all of you who helped in this release [1] and for helping to
> > vote for RC0. I have created the second release candidate (RC1) for
> > Apache Hadoop 3.2.0.
> >
> >
> > Artifacts for this RC are available here:
> >
> > http://home.apache.org/~sunilg/hadoop-3.2.0-RC1/
> >
> >
> > RC tag in git is release-3.2.0-RC1.
> >
> >
> >
> > The maven artifacts are available via repository.apache.org at
> > https://repository.apache.org/content/repositories/orgapachehadoop-1178/
> >
> >
> > This vote will run 7 days (5 weekdays), ending on 14th Jan at 11:59 pm
> PST.
> >
> >
> >
> > 3.2.0 contains 1092 [2] fixed JIRA issues since 3.1.0. The feature
> > additions below are the highlights of this release.
> >
> > 1. Node Attributes Support in YARN
> >
> > 2. Hadoop Submarine project for running Deep Learning workloads on YARN
> >
> > 3. Support service upgrade via YARN Service API and CLI
> >
> > 4. HDFS Storage Policy Satisfier
> >
> > 5. Support Windows Azure Storage - Blob file system in Hadoop
> >
> > 6. Phase 3 improvements for S3Guard and Phase 5 improvements for S3A
> >
> > 7. Improvements in Router-based HDFS federation
> >
> >
> >
> > Thanks to Wangda, Vinod, Marton for helping me in preparing the release.
> >
> > I have done some testing with my pseudo-distributed cluster. My +1 to start.
> >
> >
> >
> > Regards,
> >
> > Sunil
> >
> >
> >
> > [1]
> >
> >
> https://lists.apache.org/thread.html/68c1745dcb65602aecce6f7e6b7f0af3d974b1bf0048e7823e58b06f@%3Cyarn-dev.hadoop.apache.org%3E
> >
> > [2] project in (YARN, HADOOP, MAPREDUCE, HDFS) AND fixVersion in (3.2.0)
> > AND fixVersion not in (3.1.0, 3.0.0, 3.0.0-beta1) AND status = Resolved
> > ORDER BY fixVersion ASC
> >
>
> -
> To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org
> For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org
>
>


[jira] [Created] (HADOOP-16762) Add support for Filesystem#getFileChecksum in ABFS driver

2019-12-13 Thread Virajith Jalaparti (Jira)
Virajith Jalaparti created HADOOP-16762:
---

 Summary: Add support for Filesystem#getFileChecksum in ABFS driver
 Key: HADOOP-16762
 URL: https://issues.apache.org/jira/browse/HADOOP-16762
 Project: Hadoop Common
  Issue Type: Sub-task
Reporter: Virajith Jalaparti


Currently, the ABFS driver does not support FileSystem#getFileChecksum even
though the underlying ADLS REST API does.
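To make the gap concrete: the sketch below is not the ABFS driver or the Hadoop FileChecksum API; it is a plain-Java illustration of the kind of content-derived checksum (here, hex-encoded MD5) that a getFileChecksum implementation could surface for a blob. The class name is invented for illustration.

```java
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Illustration only: a content-derived checksum (hex-encoded MD5), the kind
// of value a getFileChecksum implementation could expose for a blob.
// This is NOT the ABFS driver or the Hadoop FileChecksum API.
class ChecksumSketch {

    // Hex-encoded MD5 of the given bytes.
    static String md5Hex(byte[] data) {
        try {
            byte[] digest = MessageDigest.getInstance("MD5").digest(data);
            StringBuilder sb = new StringBuilder();
            for (byte b : digest) {
                sb.append(String.format("%02x", b));
            }
            return sb.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e); // MD5 is a mandatory JDK algorithm
        }
    }
}
```

For example, `ChecksumSketch.md5Hex("hello".getBytes())` returns `"5d41402abc4b2a76b9719d911017c592"`.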



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-dev-h...@hadoop.apache.org



[jira] [Reopened] (HADOOP-15565) ViewFileSystem.close doesn't close child filesystems and causes FileSystem objects leak.

2020-05-07 Thread Virajith Jalaparti (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-15565?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Virajith Jalaparti reopened HADOOP-15565:
-

Re-opening this issue to backport to 2.10.

> ViewFileSystem.close doesn't close child filesystems and causes FileSystem 
> objects leak.
> 
>
> Key: HADOOP-15565
> URL: https://issues.apache.org/jira/browse/HADOOP-15565
> Project: Hadoop Common
>  Issue Type: Bug
>Reporter: Jinglun
>Assignee: Jinglun
>Priority: Major
> Fix For: 3.3.0
>
> Attachments: HADOOP-15565.0001.patch, HADOOP-15565.0002.patch, 
> HADOOP-15565.0003.patch, HADOOP-15565.0004.patch, HADOOP-15565.0005.patch, 
> HADOOP-15565.0006.bak, HADOOP-15565.0006.patch, HADOOP-15565.0007.patch, 
> HADOOP-15565.0008.patch
>
>
> ViewFileSystem.close() does nothing but remove itself from FileSystem.CACHE. 
> Its child filesystems are cached in FileSystem.CACHE and shared by all 
> ViewFileSystem instances. We couldn't simply close all the child 
> filesystems, because that would break the semantics of FileSystem.newInstance().
> We might add an inner cache to ViewFileSystem and let it cache all the child 
> filesystems. The child filesystems would then no longer be shared. When the 
> ViewFileSystem is closed, we close all the child filesystems in the inner 
> cache. The ViewFileSystem is still cached by FileSystem.CACHE, so there won't 
> be too many FileSystem instances.
> FileSystem.CACHE caches the ViewFileSystem instance, while the other 
> instances (the child filesystems) are cached in the inner cache.
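The inner-cache lifecycle described above can be sketched in plain Java. This is not ViewFileSystem code; the class and method names are hypothetical stand-ins that only illustrate the proposal: children live in a per-view cache, are reused within that view, and are all closed when the view itself is closed.

```java
import java.io.Closeable;
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of the proposed "inner cache": each view owns private
// child filesystems and closes all of them when the view is closed.
// Names are illustrative; this is NOT ViewFileSystem code.
class InnerCacheSketch implements Closeable {

    // Stand-in for a child FileSystem; only records whether it was closed.
    static class Child implements Closeable {
        boolean closed = false;
        @Override public void close() { closed = true; }
    }

    // Inner cache: children are private to this view, not globally shared.
    private final Map<String, Child> innerCache = new HashMap<>();

    Child getChild(String uri) {
        return innerCache.computeIfAbsent(uri, u -> new Child());
    }

    @Override
    public void close() {
        // Closing the view closes every child in the inner cache.
        innerCache.values().forEach(Child::close);
        innerCache.clear();
    }

    // Self-check: children are reused within a view and closed with it.
    static boolean demo() {
        InnerCacheSketch view = new InnerCacheSketch();
        Child a = view.getChild("hdfs://ns1");
        Child b = view.getChild("hdfs://ns2");
        boolean reused = (a == view.getChild("hdfs://ns1"));
        view.close();
        return reused && a.closed && b.closed;
    }
}
```

Because each view closes only its own children, the shared FileSystem.CACHE semantics that newInstance() relies on are left untouched.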






[jira] [Created] (HADOOP-17072) Add getClusterRoot and getClusterRoots methods to FileSystem and ViewFilesystem

2020-06-16 Thread Virajith Jalaparti (Jira)
Virajith Jalaparti created HADOOP-17072:
---

 Summary: Add getClusterRoot and getClusterRoots methods to 
FileSystem and ViewFilesystem
 Key: HADOOP-17072
 URL: https://issues.apache.org/jira/browse/HADOOP-17072
 Project: Hadoop Common
  Issue Type: Task
  Components: fs, viewfs
Reporter: Virajith Jalaparti


In a federated setting (HDFS federation, federation across multiple buckets on 
S3, multiple containers across Azure storage), certain system tools/pipelines 
require the ability to map paths to the clusters/accounts they belong to.

Consider GDPR compliance/retention jobs that need to go over the datasets 
ingested over a period of T days and remove/quarantine datasets that are not 
properly annotated or have reached their retention period. Such jobs can rely 
on renames to a global trash/quarantine directory to accomplish their task. 
However, in a federated setting, efficient, atomic renames (such as those 
within a single HDFS cluster) are not supported across the different 
clusters/shards in the federation. As a result, such jobs need to determine 
the clusters to which different paths map.

To address such cases, this JIRA proposes to add two new methods to 
{{FileSystem}}: {{getClusterRoot}} and {{getClusterRoots}}.
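Since the proposed methods do not exist yet, the following is only a hypothetical sketch of the path-to-cluster resolution they would enable. The mount table and all names are invented for illustration; no such API exists in Hadoop today.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch of what the proposed getClusterRoot could return in a
// federated namespace. The mount table and names are invented; this is NOT
// an existing Hadoop API.
class ClusterRootSketch {

    // Mount table: path prefix -> root of the backing cluster.
    static final Map<String, String> MOUNTS = new LinkedHashMap<>();
    static {
        MOUNTS.put("/data", "hdfs://clusterA");
        MOUNTS.put("/logs", "hdfs://clusterB");
    }

    // Resolve a path to the root of the cluster it lives on, so that (for
    // example) a retention job can group its trash-directory renames
    // per cluster and keep each rename atomic within one cluster.
    static String getClusterRoot(String path) {
        for (Map.Entry<String, String> e : MOUNTS.entrySet()) {
            if (path.startsWith(e.getKey())) {
                return e.getValue();
            }
        }
        throw new IllegalArgumentException("no mount covers " + path);
    }
}
```

A retention job would call this once per candidate path, bucket paths by cluster root, and issue the rename-to-quarantine within each cluster only.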








[jira] [Created] (HADOOP-15292) Distcp's use of pread is slowing it down.

2018-03-05 Thread Virajith Jalaparti (JIRA)
Virajith Jalaparti created HADOOP-15292:
---

 Summary: Distcp's use of pread is slowing it down.
 Key: HADOOP-15292
 URL: https://issues.apache.org/jira/browse/HADOOP-15292
 Project: Hadoop Common
  Issue Type: Bug
Reporter: Virajith Jalaparti


Distcp currently uses positioned reads (in RetriableFileCopyCommand#copyBytes) 
when the source offset is > 0. This results in unnecessary overheads: a new 
BlockReader is created on the client side, and multiple readBlock() calls are 
made to the Datanodes, each of which requires the creation of a BlockSender 
and an input stream to the ReplicaInfo.
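The cheaper pattern this report implies -- seek to the source offset once, then read sequentially, instead of positioned reads that each set up fresh reader state -- can be illustrated on a local file. This is plain Java, not Distcp's actual RetriableFileCopyCommand code; the class name is invented for illustration.

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.file.Files;
import java.nio.file.Path;

// Illustration only: "seek to the source offset once, then copy
// sequentially", the alternative to per-call positioned reads (pread).
class SeekThenReadSketch {

    // Copy `len` bytes starting at `offset` using one seek + sequential reads.
    static byte[] copyRange(Path file, long offset, int len) throws IOException {
        try (RandomAccessFile raf = new RandomAccessFile(file.toFile(), "r")) {
            raf.seek(offset);         // position once...
            byte[] buf = new byte[len];
            raf.readFully(buf);       // ...then read sequentially
            return buf;
        }
    }

    // Self-check: reads "3456" out of "0123456789" starting at offset 3.
    static boolean demo() {
        try {
            Path tmp = Files.createTempFile("seek-sketch", ".bin");
            Files.write(tmp, "0123456789".getBytes());
            byte[] out = copyRange(tmp, 3, 4);
            Files.delete(tmp);
            return new String(out).equals("3456");
        } catch (IOException e) {
            return false;
        }
    }
}
```

In HDFS terms, a single seek-then-read lets one BlockReader stream the rest of the block, where repeated preads pay the reader-setup cost on every call.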



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)




[jira] [Resolved] (HADOOP-18515) Backport HADOOP-17612 to branch-3.3

2022-11-07 Thread Virajith Jalaparti (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-18515?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Virajith Jalaparti resolved HADOOP-18515.
-
Resolution: Fixed

> Backport HADOOP-17612 to branch-3.3
> ---
>
> Key: HADOOP-18515
> URL: https://issues.apache.org/jira/browse/HADOOP-18515
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: auth, common, nfs, registry
>Affects Versions: 3.3.5
>Reporter: Melissa You
>Assignee: Melissa You
>Priority: Major
>  Labels: pull-request-available
>
> This is a sub-task of HADOOP-18518 to upgrade ZooKeeper and Curator on the 
> 3.3 branches. It is a clean cherry-pick from 
> [https://github.com/apache/hadoop/pull/3241].



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
