[jira] [Resolved] (HADOOP-13249) RetryInvocationHandler need wrap InterruptedException in IOException when call Thread.sleep
[ https://issues.apache.org/jira/browse/HADOOP-13249?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jing Zhao resolved HADOOP-13249.
--------------------------------
    Resolution: Fixed
    Fix Version/s: 2.9.0

Thanks for the contribution, Zhihai! I've committed this into trunk and branch-2. Please see if you want to also include it in 2.8, [~zxu].

> RetryInvocationHandler need wrap InterruptedException in IOException when
> call Thread.sleep
> -----------------------------------------------------------------------------
>
>                 Key: HADOOP-13249
>                 URL: https://issues.apache.org/jira/browse/HADOOP-13249
>             Project: Hadoop Common
>          Issue Type: Bug
>          Components: ipc
>    Affects Versions: 2.8.0
>            Reporter: zhihai xu
>            Assignee: zhihai xu
>             Fix For: 2.9.0
>
>         Attachments: HADOOP-13249.000.patch, HADOOP-13249.001.patch, HADOOP-13249.002.patch
>
> RetryInvocationHandler needs to wrap InterruptedException in an IOException
> when it calls Thread.sleep. Otherwise the InterruptedException cannot be
> handled correctly by other components such as HDFS.
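A minimal sketch of the pattern the summary describes, based only on the issue text above (the actual fix is in the attached patches): an interrupt during the retry sleep is converted into an InterruptedIOException, an IOException subclass, and the thread's interrupt status is restored so callers can still observe it.

{code}
import java.io.IOException;
import java.io.InterruptedIOException;

public class RetrySleepSketch {
  // Sleep between retries; surface an interrupt as an IOException subclass
  // instead of letting InterruptedException escape the retry machinery.
  static void sleepBetweenRetries(long delayMillis) throws IOException {
    try {
      Thread.sleep(delayMillis);
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt(); // restore the interrupt status
      throw (IOException) new InterruptedIOException(
          "Interrupted while waiting to retry").initCause(e);
    }
  }

  public static void main(String[] args) throws IOException {
    sleepBetweenRetries(10L);
    System.out.println("slept without interruption");
  }
}
{code}

InterruptedIOException is used here because it lets components that only catch IOException (such as HDFS, per the issue description) handle the interrupt through their normal error paths.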
Re: [Important] Lots of 2.8.0 commits are in branch-2 only
Thanks for reporting the issue, Wangda. Recreating a new branch-2.8 based on the current branch-2 makes sense to me.

Thanks,
-Jing

On Wed, Dec 16, 2015 at 5:38 PM, Wangda Tan wrote:
> Hi folks,
>
> I found there are lots of commits that are in branch-2 only.
> Ran "git log branch-2.8..branch-2".
>
> There are 35 commits for YARN, 41 commits for HDFS and 12 commits for
> COMMON. Only several of them are placed in the 2.9.0 section in CHANGES.txt.
>
> I think we can either hard reset branch-2.8 to branch-2 and make necessary
> changes such as reverting 2.9.0-only patches, or backport them one by one. Any
> suggestions?
>
> Thanks,
> Wangda
Re: [VOTE] Release Apache Hadoop 2.6.2
+1 (binding)

Downloaded the tarball and ran some HDFS tests and MR examples in a pseudo-distributed cluster.

Thanks,
-Jing

On Mon, Oct 26, 2015 at 2:06 PM, Vinod Vavilapalli wrote:
> I was helping Sangjin offline with the release.
>
> We briefly discussed the KEYS problem before, but it missed my attention.
>
> I will get his KEYS committed right away; the release is testable right
> away though.
>
> Regarding the voting period, let’s continue voting for two more days; the
> period also included the weekend, during which a lot of people (at least myself
> and team) didn’t pay attention to this vote.
>
> Thanks
> +Vinod
>
> > On Oct 26, 2015, at 1:50 PM, Andrew Wang wrote:
> >
> > Hey Sangjin, did you add your release signing keys to the KEYS file? I
> > don't see it here:
> >
> > https://dist.apache.org/repos/dist/release/hadoop/common/KEYS
> >
> > Also only PMC votes are binding on releases, so I think we currently still
> > stand at 0 binding +1s.
> >
> > On Mon, Oct 26, 2015 at 1:28 PM, Sangjin Lee wrote:
> >
> >> That makes sense. Thanks for pointing that out. The git commit id is
> >> 0cfd050febe4a30b1ee1551dcc527589509fb681.
> >>
> >> On Mon, Oct 26, 2015 at 12:25 PM, Steve Loughran <ste...@hortonworks.com>
> >> wrote:
> >>
> >>> On 22 Oct 2015, at 22:14, Sangjin Lee wrote:
> >>>> Hi all,
> >>>>
> >>>> I have created a release candidate (RC0) for Hadoop 2.6.2.
> >>>>
> >>>> The RC is available at: http://people.apache.org/~sjlee/hadoop-2.6.2-RC0/
> >>>>
> >>>> The RC tag in git is: release-2.6.2-RC0
> >>>
> >>> Tags can move; we should *never* vote for a release of one.
> >>>
> >>> What is the git commit #?
> >>>
> >>>> The list of JIRAs committed for 2.6.2:
> >>>> https://issues.apache.org/jira/browse/YARN-4101?jql=project%20in%20(HADOOP%2C%20HDFS%2C%20YARN%2C%20MAPREDUCE)%20AND%20fixVersion%20%3D%202.6.2
> >>>>
> >>>> The maven artifacts are staged at
> >>>> https://repository.apache.org/content/repositories/orgapachehadoop-1022/
> >>>>
> >>>> Please try out the release candidate and vote. The vote will run for 5 days.
> >>>>
> >>>> Thanks,
> >>>> Sangjin
Re: [VOTE] Merge HDFS-7285 (erasure coding) branch to trunk
+1 I've been involved in both development and review on the branch, and I believe it's now ready to get merged into trunk. Many thanks to all the contributors and reviewers! Thanks, -Jing On Tue, Sep 22, 2015 at 6:17 PM, Zheng, Kai <kai.zh...@intel.com> wrote: > Non-binding +1 > > According to our extensive performance tests, striping + ISA-L coder based > erasure coding not only can save storage, but also can increase the > throughput of a client or a cluster. It will be a great addition to HDFS > and its users. Based on the latest branch codes, we also observed it's very > reliable in the concurrent tests. We'll provide the perf test report after > it's sorted out and hope it helps. > Thanks! > > Regards, > Kai > > -Original Message- > From: Gangumalla, Uma [mailto:uma.ganguma...@intel.com] > Sent: Wednesday, September 23, 2015 8:50 AM > To: hdfs-...@hadoop.apache.org; common-dev@hadoop.apache.org > Subject: Re: [VOTE] Merge HDFS-7285 (erasure coding) branch to trunk > > +1 > > Great addition to HDFS. Thanks all contributors for the nice work. > > Regards, > Uma > > On 9/22/15, 3:40 PM, "Zhe Zhang" <zhezh...@cloudera.com> wrote: > > >Hi, > > > >I'd like to propose a vote to merge the HDFS-7285 feature branch back > >to trunk. Since November 2014 we have been designing and developing > >this feature under the umbrella JIRAs HDFS-7285 and HADOOP-11264, and > >have committed approximately 210 patches. > > > >The HDFS-7285 feature branch was created to support the first phase of > >HDFS erasure coding (HDFS-EC). The objective of HDFS-EC is to > >significantly reduce storage space usage in HDFS clusters. Instead of > >always creating 3 replicas of each block with 200% storage space > >overhead, HDFS-EC provides data durability through parity data blocks. > >With most EC configurations, the storage overhead is no more than 50%. > >Based on profiling results of production clusters, we decided to > >support EC with the striped block layout in the first phase, so that > >small files can be better handled. This means dividing each logical > >HDFS file block into smaller units (striping cells) and spreading them > >on a set of DataNodes in round-robin fashion. Parity cells are > >generated for each stripe of original data cells. We have made changes > >to NameNode, client, and DataNode to generalize the block concept and > >handle the mapping between a logical file block and its internal > >storage blocks. For further details please see the design doc on > >HDFS-7285. > >HADOOP-11264 focuses on providing flexible and high-performance codec > >calculation support. > > > >The nightly Jenkins job of the branch has reported several successful > >runs, and doesn't show new flaky tests compared with trunk. We have > >posted several versions of the test plan including both unit testing > >and cluster testing, and have executed most tests in the plan. The most > >basic functionalities have been extensively tested and verified in > >several real clusters with different hardware configurations; results > >have been very stable. We have created follow-on tasks for more > >advanced error handling and optimization under the umbrella HDFS-8031. > >We also plan to implement or harden the integration of EC with existing > >features such as WebHDFS, snapshot, append, truncate, hflush, hsync, > >and so forth. > > > >Development of this feature has been a collaboration across many > >companies and institutions. I'd like to thank J. 
Andreina, Takanobu
> Asanuma, Vinayakumar B, Li Bo, Takuya Fukudome, Uma Maheswara Rao G,
> Rui Li, Yi Liu, Colin McCabe, Xinwei Qin, Rakesh R, Gao Rui, Kai
> Sasaki, Walter Su, Tsz Wo Nicholas Sze, Andrew Wang, Yong Zhang, Jing
> Zhao, Hui Zheng and Kai Zheng for their code contributions and reviews.
> Andrew and Kai Zheng also made fundamental contributions to the initial
> design. Rui Li, Gao Rui, Kai Sasaki, Kai Zheng and many other
> contributors have made great efforts in system testing. Many thanks go
> to Weihua Jiang for proposing the JIRA, and ATM, Todd Lipcon, Silvius
> Rus, Suresh, as well as many others for providing helpful feedback.
>
> Following the community convention, this vote will last for 7 days
> (ending September 29th). Votes from Hadoop committers are binding but
> non-binding votes are very welcome as well. And here's my non-binding +1.
>
> Thanks,
> ---
> Zhe Zhang
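For readers new to the striped layout described in this thread, the following illustrative calculation shows how a byte offset in a logical HDFS file block maps to a striping cell, a stripe, and an internal block spread round-robin across DataNodes. The 6+3 schema and 64 KB cell size are assumptions for the example, not necessarily the branch's defaults.

{code}
public class StripedLayoutExample {
  static final int DATA_BLOCKS = 6;          // assumed data units per stripe
  static final int PARITY_BLOCKS = 3;        // assumed parity units per stripe
  static final int CELL_SIZE = 64 * 1024;    // assumed striping cell size

  public static void main(String[] args) {
    long offset = 5L * CELL_SIZE + 123;           // some byte in the logical block
    long cellIndex = offset / CELL_SIZE;          // which striping cell holds it
    long stripe = cellIndex / DATA_BLOCKS;        // which stripe of cells
    int internalBlock = (int) (cellIndex % DATA_BLOCKS); // round-robin DataNode slot
    System.out.printf("offset %d -> stripe %d, internal block %d of %d (+%d parity)%n",
        offset, stripe, internalBlock, DATA_BLOCKS, PARITY_BLOCKS);
  }
}
{code}

Parity cells for each stripe are computed from its DATA_BLOCKS data cells, which is where the codec work under HADOOP-11264 comes in.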
Re: [VOTE] Using rebase and merge for feature branch development
+1. Thanks.

On Mon, Aug 24, 2015 at 2:47 PM, Zhe Zhang zhezh...@cloudera.com wrote:
+1 non-binding. Thanks Andrew! --- Zhe Zhang

On Mon, Aug 24, 2015 at 2:38 PM, Karthik Kambatla ka...@cloudera.com wrote:
+1 Thanks for driving this, Andrew.

On Mon, Aug 24, 2015 at 11:00 AM, Vinayakumar B vinayakum...@apache.org wrote:
+1, -Vinay

On Aug 24, 2015 11:29 PM, Colin P. McCabe cmcc...@apache.org wrote:
+1 cheers, Colin

On Mon, Aug 24, 2015 at 10:04 AM, Steve Loughran ste...@hortonworks.com wrote:
+1 (binding)

On 21 Aug 2015, at 13:44, Andrew Wang andrew.w...@cloudera.com wrote:

Hi common-dev,

As promised, here is an official vote thread. Let's run it for the standard 7 days, closing on Aug 28th at noon. Only PMC members have binding votes, but of course everyone's input is welcomed. If the vote passes, I'll put the text on the website somewhere as recommended by Steve.

Previous discussion threads:
http://mail-archives.apache.org/mod_mbox/hadoop-common-dev/201508.mbox/%3CCAGB5D2bPWeV2Hk%2B67%3DDamWpVfLTM6nkjb_wG3n4%3DWAN890zqfA%40mail.gmail.com%3E
http://mail-archives.apache.org/mod_mbox/hadoop-common-dev/201508.mbox/%3CCAGB5D2aDXujQjwdmadVtg2-qrPAJeOgCS2_NHydv8jke8or1UA%40mail.gmail.com%3E

Proposal: Feature branch development can use either a merge or rebase workflow, as decided by contributors working on the branch.

When using a rebase workflow, the feature branch is periodically rebased on trunk via "git rebase trunk" and force-pushed. Before performing a force-push, a tag should be created of the current feature branch HEAD to preserve history. The tag should identify the feature and date of most recent commit, e.g. tag_feature_HDFS-7285_2015-08-11. It can also be convenient to use a temporary branch to review rebase conflict resolution before force-pushing the main feature branch, e.g. HDFS-7285-rebase. Temporary branches should be deleted after they are force-pushed over the feature branch. Developers are allowed to squash and reorder commits to make rebasing easier. Use this judiciously. When squashing, please maintain the original commit messages in the squashed commit message to preserve history.

When using a merge workflow, changes are periodically integrated from trunk to the branch via "git merge trunk". Merge conflict resolution can be reviewed by posting the diff of the merge commit.

For both rebase and merge workflows, integration of the branch into trunk should happen via "git merge --no-ff". --no-ff is important since it generates a merge commit even if the branch applies cleanly on top of trunk. This clearly denotes the set of commits that were made on the branch, and makes it easier to revert the branch if necessary. "git merge --no-ff" is also the preferred way of integrating a feature branch to other branches, e.g. branch-2.

Thanks,
Andrew
Re: [DISCUSS] git rebase vs. git merge for branch development
I think we should allow merge-based workflows. I worked and am working in several big feature branches, including HDFS-2802 (100 subtasks) and HDFS-7285 (currently already 200 subtasks), and have tried both the merge-based and rebase-based workflows. When the feature change becomes big, the rebase becomes a big pain, considering a small change in trunk can cause conflicts when rebasing a large number of commits in the feature branch. Using git merge to merge trunk changes into the feature branch is much easier in this case.

Thanks,
-Jing

On Mon, Aug 17, 2015 at 12:17 PM, Andrew Wang andrew.w...@cloudera.com wrote:

Hi all,

I've thought about this topic more over the last week, and felt I should play devil's advocate for a merge workflow. A few comments:

- The issue of merges polluting history is mainly an issue when using a github PR workflow, which results in one merge per PR. Clearly this is not okay, but it is a separate issue from feature branches. We only have a handful of merge commits per feature branch.

- The issue of changes hiding in merge commits can happen when resolving rebase conflicts too, except it's harder to track. Right now neither goes through code review, which is sketchy. We probably should review these too, and it's easier to review a single merge commit vs. an entire rebased branch. Merge is also a more natural way of integrating changes from trunk, since you just resolve all conflicts at once at the end.

- Merge gives us a linear history on the branch but worse history on trunk/branch-2. Rebase has worse history on the branch but a linear history on trunk/branch-2. This means for quick/small feature branches that don't have a lot of conflicts, rebase is preferred. For large features with lots of conflicts, merge is preferred. This is basically what we're running into on HDFS-7285.

- Rebase also comes with increased coordination costs, since public history is being rewritten. This is again okay for smaller efforts (where there are fewer contributors), but more painful with bigger ones. There have been a number of HDFS-7285 branches created basically as a result of rebase, with corresponding JIRA discussions about where to commit things.

- The issue of a single squashed commit for the branch-2 backport is arguably an issue with how we structure our branches. If release branches forked off of trunk rather than branch-2, we wouldn't have this problem. We could require branch-2 integration to also happen via git merge. Or we kick trunk out to a feature branch based off of branch-2. Or we shrug and keep the status quo.

I'd definitely appreciate commentary from others who've worked on feature branches in git, even in communities outside of Hadoop. If there is support for allowing merge-based workflows in addition to rebase, we'd need to kick off a [VOTE] thread since the last [VOTE] only allows rebase.

Best,
Andrew

On Mon, Aug 17, 2015 at 11:33 AM, Andrew Wang andrew.w...@cloudera.com wrote:

@Sangjin, I believe this is covered by the [VOTE] I linked to above, key excerpt being:

3. Force-push on feature-branches is allowed. Before pulling in a feature, the feature-branch should be rebased on latest trunk and the changes applied to trunk through "git rebase --onto" or "git cherry-pick <commit-range>".

This specifies that the last uprev, the final integration of the branch into trunk, happens with rebase. It doesn't say anything about the periodic uprevs, but it'd be very strange to merge periodically and then rebase once at the end. So I take it to mean doing periodic uprevs with rebase too.
On Mon, Aug 17, 2015 at 11:23 AM, Sangjin Lee sj...@apache.org wrote:

Just to be clear, are we discussing the process of uprev'ing the feature development branch with the latest from the trunk from time to time, or making the final merge of the feature branch onto the trunk?

On Mon, Aug 17, 2015 at 10:21 AM, Steve Loughran ste...@hortonworks.com wrote:

I haven't done a big piece of work in the ASF code repo since the migration to git, though I have done it in the svn era. Currently with private git repos:

- anyone gets SCM control of their source
- you can commit for your own reasons (about to make a change, want a private jenkins run, ...) and gain from having many small checkins. More succinctly: if you aren't checking in your work 2+ times a day, why not?
- rebasing is a painful necessity on personal, private branches to keep the final patch to hadoop git a single diff

With the private git process that's the de facto standard, we lose history anyway. I know what I've done, and somewhere there's a tag in my own github repo of my work to create a JIRA. But we don't always need that entire history of "trying to debug kerberos", "typo in exception", and other stuff that accrues
Re: [VOTE] Release Apache Hadoop 2.7.0 RC0
+1 (binding)

Downloaded the source tarball, built it, and ran some examples and a distcp job.

Thanks,
-Jing

On Tue, Apr 14, 2015 at 11:06 AM, Robert Kanter rkan...@cloudera.com wrote:

+1 (non-binding)
+ verified checksum
+ deployed binary tarball, ran some examples, clicked around in the Web UIs

thanks
- Robert

On Tue, Apr 14, 2015 at 10:22 AM, Masatake Iwasaki iwasak...@oss.nttdata.co.jp wrote:

+1 (non-binding)
+ verified signature and mds of source and binary tarball
+ built from source tarball
+ deployed binary tarball to a 3-node cluster and ran some hadoop-mapreduce-examples jobs

Thanks,
Masatake Iwasaki

On 4/11/15 08:44, Vinod Kumar Vavilapalli wrote:

Hi all,

I've created a release candidate RC0 for Apache Hadoop 2.7.0.

The RC is available at: http://people.apache.org/~vinodkv/hadoop-2.7.0-RC0/

The RC tag in git is: release-2.7.0-RC0

The maven artifacts are available via repository.apache.org at https://repository.apache.org/content/repositories/orgapachehadoop-1017/

As discussed before:
- This release will only work with JDK 1.7 and above
- I’d like to use this as a starting release for 2.7.x [1] and, depending on how it goes, get it stabilized and potentially use a 2.7.1 in a few weeks as the stable release.

Please try the release and vote; the vote will run for the usual 5 days.

Thanks,
Vinod

[1]: A 2.7.1 release to follow up 2.7.0 http://markmail.org/thread/zwzze6cqqgwq4rmw
Re: [VOTE] Release Apache Hadoop 2.6.0
+1 (binding)

Verified checksums and signatures. Successfully deployed a cluster with a NameNode HA setup and ran some HDFS commands.

Thanks,
-Jing

On Fri, Nov 14, 2014 at 3:58 PM, Chris Nauroth cnaur...@hortonworks.com wrote:

+1 (binding)

- Verified checksums and signatures for source and binary tarballs.
- Started a pseudo-distributed HDFS cluster in secure mode with SSL.
- Tested various file system operations.
- Verified HDFS-2856, the new feature to run a secure DataNode without requiring root.
- Verified HDFS-7385, the recent blocker related to incorrect ACLs serialized to the edit log.

Thank you to Arun as release manager, and thank you to all of the contributors for their hard work on this release.

Chris Nauroth
Hortonworks
http://hortonworks.com/

On Fri, Nov 14, 2014 at 10:57 AM, Yongjun Zhang yzh...@cloudera.com wrote:

Thanks Arun for leading the 2.6 release effort.

+1 (non-binding)

- Downloaded rc1 source and did a build
- Created two single-node clusters running 2.6
- Ran a sample mapreduce job
- Ran distcp between the two clusters

--Yongjun

On Thu, Nov 13, 2014 at 3:08 PM, Arun C Murthy a...@hortonworks.com wrote:

Folks,

I've created another release candidate (rc1) for hadoop-2.6.0 based on the feedback.

The RC is available at: http://people.apache.org/~acmurthy/hadoop-2.6.0-rc1

The RC tag in git is: release-2.6.0-rc1

The maven artifacts are available via repository.apache.org at https://repository.apache.org/content/repositories/orgapachehadoop-1013.

Please try the release and vote; the vote will run for the usual 5 days.

thanks,
Arun
Re: Thinking ahead to hadoop-2.6
Just to give a quick update about the current status of HDFS-6584 (archival storage). So far, after HDFS-7081 got committed this morning, the main functionalities have already been finished. As a summary, two new DistributedFileSystem APIs are added: getStoragePolicies and setStoragePolicy. We have also been doing system tests for weeks and we will continue testing. There are still one or two pending issues but we're actively working on them. So I'm pretty confident that the archival storage work can get ready given the current plan to release 2.6 next week.

On Wed, Sep 24, 2014 at 3:35 PM, Jitendra Pandey jiten...@hortonworks.com wrote:

I also believe it's worth a week's wait to include HDFS-6584 and HDFS-6581 in 2.6.

On Wed, Sep 24, 2014 at 3:28 PM, Suresh Srinivas sur...@hortonworks.com wrote:

Given some of the features are in the final stages of stabilization, Arun, should we hold off creating the 2.6 branch or building an RC by a week? All the features in flux are important ones and worth delaying the release by a week.

On Wed, Sep 24, 2014 at 11:36 AM, Andrew Wang andrew.w...@cloudera.com wrote:

Hey Nicholas,

My concern about Archival Storage isn't related to the code quality or the size of the feature. I think that you and Jing did good work. My concern is that once we ship, we're locked into that set of archival storage APIs, and these APIs are not yet finalized. Simply being able to turn off the feature does not change the compatibility story.

I'm willing to devote time to help review these JIRAs and kick the tires on the APIs, but my point above was that I'm not sure it'd all be done by the end of the week. Testing might also reveal additional changes that need to be made, which also might not happen by end-of-week.

I guess the question before us is whether we're comfortable putting something in branch-2.6 and then potentially adding API changes after. I'm okay with that as long as we're all aware that this might happen. Arun, as RM, is this cool with you? Again, I like this feature and I'm fine with its inclusion, just a heads up that we might need some extra time to finalize things before an RC can be cut.

Thanks,
Andrew

On Tue, Sep 23, 2014 at 7:30 PM, Tsz Wo (Nicholas), Sze s29752-hadoop...@yahoo.com.invalid wrote:

Hi,

I am worried about KMS and transparent encryption since quite a few bugs were discovered after it got merged to branch-2. It gives us an impression that the feature is not yet well tested. Indeed, transparent encryption is a complicated feature which changes the core part of HDFS. It is not easy to get everything right.

HDFS-6584: Archival Storage, by contrast, is a relatively simple and low-risk feature. It introduces a new storage type ARCHIVE and the concept of block storage policy to HDFS. When a cluster is configured with ARCHIVE storage, the blocks will be stored using the appropriate storage types specified by storage policies assigned to the files/directories. A cluster admin could disable the feature by simply not configuring any storage type and not setting any storage policy, as before.

As Suresh mentioned, HDFS-6584 is in the final stages to be merged to branch-2.

Regards,
Tsz-Wo

On Wednesday, September 24, 2014 7:00 AM, Suresh Srinivas sur...@hortonworks.com wrote:

I actually would like to see both archival storage and single replica memory writes in the 2.6 release. Archival storage is in the final stages of getting ready for the branch-2 merge, as Nicholas has already indicated on the dev mailing list. Hopefully HDFS-6581 gets ready sooner.
Both of these features have been in development for some time.

On Tue, Sep 23, 2014 at 3:27 PM, Andrew Wang andrew.w...@cloudera.com wrote:

Hey Arun,

Maybe we could do a quick run-through of the Roadmap wiki and add/retarget things accordingly? I think the KMS and transparent encryption are ready to go. We've got a few further bug fixes pending, but that's it.

Two HDFS things that I think probably won't make the end of the week are archival storage (HDFS-6584) and single replica memory writes (HDFS-6581), which I believe are under the HSM banner. HDFS-6584 was just merged to trunk and I think needs a little more work before it goes into branch-2. HDFS-6581 hasn't even been merged to trunk yet, so it seems a bit further off. Just my 2c as I did not work directly on these features; I just generally shy away from shipping bits quite this fresh.

Thanks,
Andrew

On Tue, Sep 23, 2014 at 3:03 PM, Arun Murthy a...@hortonworks.com wrote:

Looks like most of the content
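A hedged usage sketch of the two DistributedFileSystem APIs named at the top of this thread. The exact signatures and built-in policy names are defined by the HDFS-6584 branch; the policy name "COLD" and the paths below are illustrative assumptions.

{code}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.hdfs.DistributedFileSystem;

public class StoragePolicySketch {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    // Assumes fs.defaultFS points at an HDFS cluster, so the cast holds.
    try (DistributedFileSystem dfs =
        (DistributedFileSystem) new Path("/").getFileSystem(conf)) {
      // List the storage policies the cluster knows about.
      for (Object policy : dfs.getStoragePolicies()) {
        System.out.println(policy);
      }
      // Pin a directory's blocks to an archival policy by name.
      dfs.setStoragePolicy(new Path("/archive/logs"), "COLD");
    }
  }
}
{code}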
[jira] [Created] (HADOOP-10630) Possible race condition in RetryInvocationHandler
Jing Zhao created HADOOP-10630:
----------------------------------

             Summary: Possible race condition in RetryInvocationHandler
                 Key: HADOOP-10630
                 URL: https://issues.apache.org/jira/browse/HADOOP-10630
             Project: Hadoop Common
          Issue Type: Bug
            Reporter: Jing Zhao

In one of our system tests with a NameNode HA setup, we ran 300 threads in LoadGenerator. While one of the NameNodes was already in the active state and had started to serve, we still saw one of the client threads fail all the retries in a 20-second window. In the meanwhile, we saw a lot of the following warning msg in the log:

{noformat}
WARN retry.RetryInvocationHandler: A failover has occurred since the start of this method invocation attempt.
{noformat}

After checking the code, we see the following in RetryInvocationHandler:

{code}
while (true) {
  // The number of times this invocation handler has ever been failed over,
  // before this method invocation attempt. Used to prevent concurrent
  // failed method invocations from triggering multiple failover attempts.
  long invocationAttemptFailoverCount;
  synchronized (proxyProvider) {
    invocationAttemptFailoverCount = proxyProviderFailoverCount;
  }
  ..
  if (action.action == RetryAction.RetryDecision.FAILOVER_AND_RETRY) {
    // Make sure that concurrent failed method invocations only cause a
    // single actual fail over.
    synchronized (proxyProvider) {
      if (invocationAttemptFailoverCount == proxyProviderFailoverCount) {
        proxyProvider.performFailover(currentProxy.proxy);
        proxyProviderFailoverCount++;
        currentProxy = proxyProvider.getProxy();
      } else {
        LOG.warn("A failover has occurred since the start of this method"
            + " invocation attempt.");
      }
    }
    invocationFailoverCount++;
  }
  ..
{code}

We can see that we refresh the value of currentProxy only when the thread performs the failover (while holding the monitor of the proxyProvider). Because currentProxy is not volatile, a thread that does not perform the failover (in which case it will log the warning msg) may fail to get the new value of currentProxy.
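A toy reproduction of the visibility hazard described above (not the Hadoop code): a reader thread that never synchronizes on the shared lock has no guarantee of seeing the failover thread's write. Declaring the shared field volatile, as the report implies, is one way to restore visibility.

{code}
public class StaleProxyExample {
  static class Provider {
    // Without volatile, the reader below may never see the new value.
    volatile String currentProxy = "proxy-0";
  }

  public static void main(String[] args) throws InterruptedException {
    Provider p = new Provider();
    Thread reader = new Thread(() -> {
      while ("proxy-0".equals(p.currentProxy)) {
        // spin until the failover thread's write becomes visible
      }
      System.out.println("saw " + p.currentProxy);
    });
    reader.start();
    Thread.sleep(100);
    p.currentProxy = "proxy-1"; // simulated failover on another thread
    reader.join();
  }
}
{code}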
[jira] [Created] (HADOOP-10608) Support appending data in DistCp
Jing Zhao created HADOOP-10608:
----------------------------------

             Summary: Support appending data in DistCp
                 Key: HADOOP-10608
                 URL: https://issues.apache.org/jira/browse/HADOOP-10608
             Project: Hadoop Common
          Issue Type: Improvement
            Reporter: Jing Zhao
            Assignee: Jing Zhao

Currently when doing distcp with the -update option, for two files with the same file names but with different file lengths or checksums, we overwrite the whole file. It would be good if we could detect the case where (sourceFile = targetFile + appended_data), and only transfer the appended data segment to the target. This will be very useful if we're doing incremental distcp.
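A hedged sketch of the detection idea above: if the source is strictly longer than the target and the checksum of the source's first targetLen bytes matches the target's checksum, only the tail needs copying. The getFileChecksum(Path, long) length-limited overload used here is exposed by later FileSystem versions; treat it as an assumption.

{code}
import java.io.IOException;
import org.apache.hadoop.fs.FileChecksum;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;

public class AppendCheckSketch {
  // Returns true when the target looks like a strict prefix of the source,
  // i.e. only the appended tail needs to be copied.
  static boolean canAppend(FileSystem srcFs, FileStatus src,
                           FileSystem dstFs, FileStatus dst) throws IOException {
    if (src.getLen() <= dst.getLen()) {
      return false; // source is not strictly longer; nothing to append
    }
    // Checksum of the source's first dst.getLen() bytes vs. the whole target.
    FileChecksum srcPrefix = srcFs.getFileChecksum(src.getPath(), dst.getLen());
    FileChecksum dstSum = dstFs.getFileChecksum(dst.getPath());
    // Checksums may be null (e.g. stores without checksum support);
    // fall back to a full overwrite in that case.
    return srcPrefix != null && srcPrefix.equals(dstSum);
  }
}
{code}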
[jira] [Created] (HADOOP-10535) Make the retry numbers in ActiveStandbyElector configurable
Jing Zhao created HADOOP-10535:
----------------------------------

             Summary: Make the retry numbers in ActiveStandbyElector configurable
                 Key: HADOOP-10535
                 URL: https://issues.apache.org/jira/browse/HADOOP-10535
             Project: Hadoop Common
          Issue Type: Improvement
            Reporter: Jing Zhao
            Assignee: Jing Zhao
            Priority: Minor

Currently in ActiveStandbyElector, when its zookeeper client cannot successfully communicate with the ZooKeeper server, the retry number is hard-coded to 3 (ActiveStandbyElector#NUM_RETRIES). After retrying 3 times, a fatal error will be thrown and the ZKFC will quit. It would be better to make the retry count configurable.
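A sketch of the proposed change: read the limit from Configuration, keeping today's hard-coded value as the default. The key name below is illustrative; the actual key is whatever the patch settles on.

{code}
import org.apache.hadoop.conf.Configuration;

public class ConfigurableRetriesSketch {
  // Illustrative key name; the committed patch may choose a different one.
  static final String ZK_RETRIES_KEY =
      "ha.failover-controller.active-standby-elector.zk.retries";
  static final int DEFAULT_ZK_RETRIES = 3; // today's hard-coded NUM_RETRIES

  public static void main(String[] args) {
    Configuration conf = new Configuration();
    int maxRetries = conf.getInt(ZK_RETRIES_KEY, DEFAULT_ZK_RETRIES);
    System.out.println("ZK attempts before fatal error: " + maxRetries);
  }
}
{code}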
[jira] [Created] (HADOOP-10489) UserGroupInformation#getTokens and UserGroupInformation#addToken can lead to ConcurrentModificationException
Jing Zhao created HADOOP-10489:
----------------------------------

             Summary: UserGroupInformation#getTokens and UserGroupInformation#addToken can lead to ConcurrentModificationException
                 Key: HADOOP-10489
                 URL: https://issues.apache.org/jira/browse/HADOOP-10489
             Project: Hadoop Common
          Issue Type: Bug
            Reporter: Jing Zhao

Currently UserGroupInformation#getTokens and UserGroupInformation#addToken use UGI's monitor to protect the iteration and modification of Credentials#tokenMap. Per [discussion|https://issues.apache.org/jira/browse/HADOOP-10475?focusedCommentId=13965851&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13965851] in HADOOP-10475, this can still lead to ConcurrentModificationException.
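A toy illustration of the hazard (not the UGI code): iterating a shared map while another thread mutates it can throw ConcurrentModificationException unless both sides agree on a lock, or the getter hands out a defensive copy as sketched in snapshot().

{code}
import java.util.ArrayList;
import java.util.Collection;
import java.util.HashMap;
import java.util.Map;

public class TokenMapExample {
  private final Map<String, String> tokenMap = new HashMap<>();

  synchronized void addToken(String alias, String token) {
    tokenMap.put(alias, token);
  }

  // Returning a copy means callers can iterate without holding the lock,
  // so a concurrent addToken cannot invalidate their iterator.
  synchronized Collection<String> snapshot() {
    return new ArrayList<>(tokenMap.values());
  }

  public static void main(String[] args) {
    TokenMapExample ex = new TokenMapExample();
    ex.addToken("nn1", "token-1");
    for (String t : ex.snapshot()) { // safe even if addToken runs concurrently
      System.out.println(t);
    }
  }
}
{code}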
[jira] [Created] (HADOOP-10472) KerberosAuthenticator should use org.apache.commons.logging.LogFactory instead of org.slf4j.LoggerFactory
Jing Zhao created HADOOP-10472:
----------------------------------

             Summary: KerberosAuthenticator should use org.apache.commons.logging.LogFactory instead of org.slf4j.LoggerFactory
                 Key: HADOOP-10472
                 URL: https://issues.apache.org/jira/browse/HADOOP-10472
             Project: Hadoop Common
          Issue Type: Bug
            Reporter: Jing Zhao
            Assignee: Jing Zhao
            Priority: Minor
         Attachments: HADOOP-10472.000.patch
[jira] [Created] (HADOOP-10441) Namenode metric rpc.RetryCache/NameNodeRetryCache.CacheHit can't be correctly processed by Ganglia
Jing Zhao created HADOOP-10441:
----------------------------------

             Summary: Namenode metric rpc.RetryCache/NameNodeRetryCache.CacheHit can't be correctly processed by Ganglia
                 Key: HADOOP-10441
                 URL: https://issues.apache.org/jira/browse/HADOOP-10441
             Project: Hadoop Common
          Issue Type: Bug
    Affects Versions: 2.4.0
            Reporter: Jing Zhao
            Assignee: Jing Zhao
            Priority: Minor

The issue was reported by [~dsen]:

The recently added Namenode metric rpc.RetryCache/NameNodeRetryCache.CacheHit can't be correctly processed by Ganglia because its name contains "/".

Proposal: the Namenode metric rpc.RetryCache/NameNodeRetryCache.CacheHit should be renamed to rpc.RetryCache.NameNodeRetryCache.CacheHit.

The name is built in org.apache.hadoop.ipc.metrics.RetryCacheMetrics#RetryCacheMetrics:

{code}
RetryCacheMetrics(RetryCache retryCache) {
  name = "RetryCache/" + retryCache.getCacheName();
  registry = new MetricsRegistry(name);
  if (LOG.isDebugEnabled()) {
    LOG.debug("Initialized " + registry);
  }
}
{code}
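The one-line change the proposal implies, shown standalone: build the registry name with '.' instead of '/' so the full metric name contains no separator Ganglia mishandles.

{code}
public class MetricNameSketch {
  public static void main(String[] args) {
    String cacheName = "NameNodeRetryCache";
    String name = "RetryCache." + cacheName;         // '.' instead of '/'
    System.out.println("rpc." + name + ".CacheHit"); // Ganglia-safe metric name
  }
}
{code}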
[jira] [Created] (HADOOP-10323) Allow users to do a dryrun of distcp
Jing Zhao created HADOOP-10323:
----------------------------------

             Summary: Allow users to do a dryrun of distcp
                 Key: HADOOP-10323
                 URL: https://issues.apache.org/jira/browse/HADOOP-10323
             Project: Hadoop Common
          Issue Type: Improvement
          Components: tools/distcp
            Reporter: Jing Zhao
            Assignee: Jing Zhao

This jira plans to add a dryrun option in distcp which will make distcp go through all the steps except the real data copying. In this way, users can quickly understand potential issues.
[jira] [Resolved] (HADOOP-10170) Unable to compile source code from stable 2.2.0 release.
[ https://issues.apache.org/jira/browse/HADOOP-10170?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jing Zhao resolved HADOOP-10170.
--------------------------------
    Resolution: Duplicate

This should have been fixed by HADOOP-10110. Closed as a duplicate.

Unable to compile source code from stable 2.2.0 release.
---------------------------------------------------------

                 Key: HADOOP-10170
                 URL: https://issues.apache.org/jira/browse/HADOOP-10170
             Project: Hadoop Common
          Issue Type: Bug
          Components: build
    Affects Versions: 2.2.0
         Environment: Windows 7 (64bit), Maven 3.1.1, JDK 1.7.0_45
            Reporter: Hadoop Developer

I have downloaded the source (hadoop-2.2.0-src.tar.gz) from http://www.carfab.com/apachesoftware/hadoop/common/stable/. While running the maven build I am getting the following errors:

{noformat}
[ERROR] C:\hdfs\hadoop-common-project\hadoop-auth\src\test\java\org\apache\hadoop\security\authentication\client\AuthenticatorTestCase.java:[88,11] error: cannot access AbstractLifeCycle
[ERROR] class file for org.mortbay.component.AbstractLifeCycle not found
[ERROR] C:\hdfs\hadoop-common-project\hadoop-auth\src\test\java\org\apache\hadoop\security\authentication\client\AuthenticatorTestCase.java:[96,29] error: cannot access LifeCycle
[ERROR] class file for org.mortbay.component.LifeCycle not found
[ERROR] C:\hdfs\hadoop-common-project\hadoop-auth\src\test\java\org\apache\hadoop\security\authentication\client\AuthenticatorTestCase.java
{noformat}

Unable to compile source code from the stable 2.2.0 release. (There is a Jira, HADOOP-10117, which says it's fixed, but I couldn't get the stable version.)
[jira] [Created] (HADOOP-10028) Malformed ssl-server.xml.example
Jing Zhao created HADOOP-10028:
----------------------------------

             Summary: Malformed ssl-server.xml.example
                 Key: HADOOP-10028
                 URL: https://issues.apache.org/jira/browse/HADOOP-10028
             Project: Hadoop Common
          Issue Type: Bug
            Reporter: Jing Zhao
            Assignee: Haohui Mai
            Priority: Minor

The ssl-server.xml.example file has malformed XML, leading to a DN start error if the example file is reused.

{code}
2013-10-07 16:52:01,639 FATAL conf.Configuration (Configuration.java:loadResource(2151)) - error parsing conf ssl-server.xml
org.xml.sax.SAXParseException: The element type "description" must be terminated by the matching end-tag "</description>".
        at com.sun.org.apache.xerces.internal.parsers.DOMParser.parse(DOMParser.java:249)
        at com.sun.org.apache.xerces.internal.jaxp.DocumentBuilderImpl.parse(DocumentBuilderImpl.java:284)
        at javax.xml.parsers.DocumentBuilder.parse(DocumentBuilder.java:153)
        at org.apache.hadoop.conf.Configuration.parse(Configuration.java:1989)
{code}
[jira] [Resolved] (HADOOP-10009) Backport HADOOP-7808 to branch-1
[ https://issues.apache.org/jira/browse/HADOOP-10009?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jing Zhao resolved HADOOP-10009.
--------------------------------
    Resolution: Fixed
    Fix Version/s: 1.3.0
    Hadoop Flags: Reviewed

Thanks for the work, Haohui! I've committed it to branch-1.

Backport HADOOP-7808 to branch-1
--------------------------------

                 Key: HADOOP-10009
                 URL: https://issues.apache.org/jira/browse/HADOOP-10009
             Project: Hadoop Common
          Issue Type: Bug
    Affects Versions: 1.2.1
            Reporter: Haohui Mai
            Assignee: Haohui Mai
             Fix For: 1.3.0
         Attachments: HADOOP-10009.000.patch, HADOOP-10009.001.patch

In branch-1, SecurityUtil::setTokenService() might throw a NullPointerException, which is fixed in HADOOP-7808. The patch should be backported into branch-1.
[jira] [Created] (HADOOP-10017) Fix NPE in DFSClient#getDelegationToken when doing Distcp from a secured cluster to an insecured cluster
Jing Zhao created HADOOP-10017:
----------------------------------

             Summary: Fix NPE in DFSClient#getDelegationToken when doing Distcp from a secured cluster to an insecured cluster
                 Key: HADOOP-10017
                 URL: https://issues.apache.org/jira/browse/HADOOP-10017
             Project: Hadoop Common
          Issue Type: Bug
    Affects Versions: 2.1.1-beta, 3.0.0
            Reporter: Jing Zhao
            Assignee: Haohui Mai

Currently if we run Distcp from a secure cluster and copy data to an insecure cluster, DFSClient#getDelegationToken will throw an NPE when processing the NULL token returned by the NN in the insecure cluster. We should be able to handle the NULL token here.
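A sketch of the defensive pattern the issue calls for: an insecure NameNode returns a null delegation token, so the client must tolerate null instead of dereferencing it. fetchTokenFromNameNode is a hypothetical stand-in for the RPC call, and the service name is an example.

{code}
import java.io.IOException;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.security.token.Token;

public class NullTokenSketch {
  static Token<?> getDelegationToken(Text renewer) throws IOException {
    Token<?> token = fetchTokenFromNameNode(renewer);
    if (token == null) {
      // Insecure cluster: no token to decorate; return null instead of NPE-ing.
      return null;
    }
    token.setService(new Text("ha-hdfs:mycluster")); // example service name
    return token;
  }

  private static Token<?> fetchTokenFromNameNode(Text renewer) {
    return null; // simulate an insecure cluster's response
  }

  public static void main(String[] args) throws IOException {
    System.out.println("token = " + getDelegationToken(new Text("mapred")));
  }
}
{code}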
[jira] [Resolved] (HADOOP-9896) TestIPC fail on trunk with error VM crash or System.exit
[ https://issues.apache.org/jira/browse/HADOOP-9896?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jing Zhao resolved HADOOP-9896.
-------------------------------
    Resolution: Duplicate
  Release Note:   (was: Resolving this as HADOOP-9916 should fix the root cause in the IPC Client implementation.)

Resolving this as HADOOP-9916 should fix the root cause in the IPC Client implementation.

TestIPC fail on trunk with error VM crash or System.exit
---------------------------------------------------------

                 Key: HADOOP-9896
                 URL: https://issues.apache.org/jira/browse/HADOOP-9896
             Project: Hadoop Common
          Issue Type: Bug
          Components: ipc
    Affects Versions: 3.0.0, 2.3.0
            Reporter: shanyu zhao
            Assignee: Chuan Liu
         Attachments: HADOOP-9896.2.patch, HADOOP-9896.patch, org.apache.hadoop.ipc.TestIPC-output.txt

I'm running hadoop unit tests on a Ubuntu 12.04 64-bit virtual machine. Every time I try to run all unit tests with the command "mvn test", the TestIPC unit test fails, and the console shows "The forked VM terminated without saying properly goodbye. VM crash or System.exit called?"

To reproduce:
$ cd hadoop-common-project/hadoop-common
$ mvn clean install -Pdist -DskipTests
$ mvn test -Pdist -Dtest=TestIPC
[jira] [Reopened] (HADOOP-9896) TestIPC fail on trunk with error VM crash or System.exit
[ https://issues.apache.org/jira/browse/HADOOP-9896?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jing Zhao reopened HADOOP-9896:
-------------------------------

I think we should mark it as a duplicate?

TestIPC fail on trunk with error VM crash or System.exit
---------------------------------------------------------

                 Key: HADOOP-9896
                 URL: https://issues.apache.org/jira/browse/HADOOP-9896
             Project: Hadoop Common
          Issue Type: Bug
          Components: ipc
    Affects Versions: 3.0.0, 2.3.0
            Reporter: shanyu zhao
            Assignee: Chuan Liu
         Attachments: HADOOP-9896.2.patch, HADOOP-9896.patch, org.apache.hadoop.ipc.TestIPC-output.txt

I'm running hadoop unit tests on a Ubuntu 12.04 64-bit virtual machine. Every time I try to run all unit tests with the command "mvn test", the TestIPC unit test fails, and the console shows "The forked VM terminated without saying properly goodbye. VM crash or System.exit called?"

To reproduce:
$ cd hadoop-common-project/hadoop-common
$ mvn clean install -Pdist -DskipTests
$ mvn test -Pdist -Dtest=TestIPC
[jira] [Created] (HADOOP-9786) RetryInvocation#isRpcInvocation should support ProtocolTranslator
Jing Zhao created HADOOP-9786:
---------------------------------

             Summary: RetryInvocation#isRpcInvocation should support ProtocolTranslator
                 Key: HADOOP-9786
                 URL: https://issues.apache.org/jira/browse/HADOOP-9786
             Project: Hadoop Common
          Issue Type: Bug
    Affects Versions: 3.0.0
            Reporter: Jing Zhao
            Assignee: Jing Zhao

Currently RetryInvocation#isRpcInvocation directly uses Proxy#isProxyClass to check RetryInvocation#currentProxy. However, if currentProxy is an instance of ProtocolTranslator (e.g., ClientNamenodeProtocolTranslatorPB), the real dynamically-generated proxy object is contained within currentProxy and needs to be retrieved by calling ProtocolTranslator#getUnderlyingProxyObject.
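A sketch of the check the issue proposes: unwrap a ProtocolTranslator before asking whether the object is a dynamically generated RPC proxy. This is an illustration of the idea, not the committed patch.

{code}
import java.lang.reflect.Proxy;
import org.apache.hadoop.ipc.ProtocolTranslator;

public class IsRpcInvocationSketch {
  static boolean isRpcInvocation(Object proxy) {
    if (proxy instanceof ProtocolTranslator) {
      // The translator wraps the real dynamic proxy; test the wrapped object.
      proxy = ((ProtocolTranslator) proxy).getUnderlyingProxyObject();
    }
    return Proxy.isProxyClass(proxy.getClass());
  }

  public static void main(String[] args) {
    System.out.println(isRpcInvocation(new Object())); // false: not a proxy
  }
}
{code}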
[jira] [Created] (HADOOP-9735) Deprecated configuration property can overwrite non-deprecated property
Jing Zhao created HADOOP-9735:
---------------------------------

             Summary: Deprecated configuration property can overwrite non-deprecated property
                 Key: HADOOP-9735
                 URL: https://issues.apache.org/jira/browse/HADOOP-9735
             Project: Hadoop Common
          Issue Type: Bug
    Affects Versions: 3.0.0, 2.1.0-beta
            Reporter: Jing Zhao
            Assignee: Jing Zhao
            Priority: Minor
         Attachments: deprecated-conf.test.patch

In the current Configuration implementation, if a conf file contains definitions for both a non-deprecated property and its corresponding deprecated property (e.g., fs.defaultFS and fs.default.name), the latter will overwrite the former. In the fs.defaultFS example, this may cause client failover to not work. It would be better to keep the non-deprecated property's value unchanged.

In the meanwhile, Configuration#getPropertySources may return wrong source information for a deprecated property. E.g., after setting fs.defaultFS, Configuration#getPropertySources(fs.default.name) will return because fs.defaultFS is deprecated.
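A small demonstration of the aliasing described above, setting the keys programmatically rather than from a conf file. Per the report, the deprecated key and the non-deprecated key resolve to the same underlying property, so whichever is applied last wins.

{code}
import org.apache.hadoop.conf.Configuration;

public class DeprecatedKeyExample {
  public static void main(String[] args) {
    Configuration conf = new Configuration(false); // skip default resources
    conf.set("fs.defaultFS", "hdfs://mycluster");   // non-deprecated key
    conf.set("fs.default.name", "hdfs://nn1:8020"); // deprecated alias, set last
    // The deprecated key's value has overwritten the non-deprecated one.
    System.out.println(conf.get("fs.defaultFS"));   // prints hdfs://nn1:8020
  }
}
{code}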
[jira] [Resolved] (HADOOP-9735) Deprecated configuration property can overwrite non-deprecated property
[ https://issues.apache.org/jira/browse/HADOOP-9735?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jing Zhao resolved HADOOP-9735.
-------------------------------
    Resolution: Not A Problem

Deprecated configuration property can overwrite non-deprecated property
------------------------------------------------------------------------

                 Key: HADOOP-9735
                 URL: https://issues.apache.org/jira/browse/HADOOP-9735
             Project: Hadoop Common
          Issue Type: Bug
    Affects Versions: 3.0.0, 2.1.0-beta
            Reporter: Jing Zhao
            Assignee: Jing Zhao
            Priority: Minor
         Attachments: deprecated-conf.test.patch

In the current Configuration implementation, if a conf file contains definitions for both a non-deprecated property and its corresponding deprecated property (e.g., fs.defaultFS and fs.default.name), the latter will overwrite the former. In the fs.defaultFS example, this may cause client failover to not work. It would be better to keep the non-deprecated property's value unchanged.

In the meanwhile, Configuration#getPropertySources may return wrong source information for a deprecated property. E.g., after setting fs.defaultFS, Configuration#getPropertySources(fs.default.name) will return because fs.defaultFS is deprecated.
[jira] [Resolved] (HADOOP-9473) typo in FileUtil copy() method
[ https://issues.apache.org/jira/browse/HADOOP-9473?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jing Zhao resolved HADOOP-9473.
-------------------------------
    Resolution: Fixed

typo in FileUtil copy() method
------------------------------

                 Key: HADOOP-9473
                 URL: https://issues.apache.org/jira/browse/HADOOP-9473
             Project: Hadoop Common
          Issue Type: Bug
          Components: fs
    Affects Versions: 2.0.0-alpha, 1.1.2
            Reporter: Glen Mazza
            Priority: Trivial
             Fix For: 1.2.0, 2.0.5-beta
         Attachments: HADOOP-9473.branch-1.patch, HADOOP-9473.patch, HADOOP-9473-update-testConf-b1.2.patch

typo:

{code}
Index: src/core/org/apache/hadoop/fs/FileUtil.java
===================================================================
--- src/core/org/apache/hadoop/fs/FileUtil.java (revision 1467295)
+++ src/core/org/apache/hadoop/fs/FileUtil.java (working copy)
@@ -178,7 +178,7 @@
     // Check if dest is directory
     if (!dstFS.exists(dst)) {
       throw new IOException("`" + dst + "': specified destination directory " +
-                            "doest not exist");
+                            "does not exist");
     } else {
       FileStatus sdst = dstFS.getFileStatus(dst);
       if (!sdst.isDir())
{code}
[jira] [Created] (HADOOP-9492) Fix the typo in testConf.xml to make it consistent with FileUtil#copy()
Jing Zhao created HADOOP-9492:
---------------------------------

             Summary: Fix the typo in testConf.xml to make it consistent with FileUtil#copy()
                 Key: HADOOP-9492
                 URL: https://issues.apache.org/jira/browse/HADOOP-9492
             Project: Hadoop Common
          Issue Type: Bug
    Affects Versions: 1.3.0
            Reporter: Jing Zhao
            Assignee: Jing Zhao
            Priority: Trivial

HADOOP-9473 fixed a typo in FileUtil#copy(). We need to fix the same typo in testConf.xml accordingly. Otherwise TestCLI will fail in branch-1.
[jira] [Reopened] (HADOOP-9473) typo in FileUtil copy() method
[ https://issues.apache.org/jira/browse/HADOOP-9473?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jing Zhao reopened HADOOP-9473:
-------------------------------

We also need to fix the same typo in testConf.xml accordingly. Otherwise TestCLI will fail.

typo in FileUtil copy() method
------------------------------

                 Key: HADOOP-9473
                 URL: https://issues.apache.org/jira/browse/HADOOP-9473
             Project: Hadoop Common
          Issue Type: Bug
          Components: fs
    Affects Versions: 2.0.0-alpha, 1.1.2
            Reporter: Glen Mazza
            Priority: Trivial
             Fix For: 1.2.0, 2.0.5-beta
         Attachments: HADOOP-9473.branch-1.patch, HADOOP-9473.patch

typo:

{code}
Index: src/core/org/apache/hadoop/fs/FileUtil.java
===================================================================
--- src/core/org/apache/hadoop/fs/FileUtil.java (revision 1467295)
+++ src/core/org/apache/hadoop/fs/FileUtil.java (working copy)
@@ -178,7 +178,7 @@
     // Check if dest is directory
     if (!dstFS.exists(dst)) {
       throw new IOException("`" + dst + "': specified destination directory " +
-                            "doest not exist");
+                            "does not exist");
     } else {
       FileStatus sdst = dstFS.getFileStatus(dst);
       if (!sdst.isDir())
{code}
[jira] [Created] (HADOOP-8995) Remove unnecessary bogus exception string from Configuration
Jing Zhao created HADOOP-8995:
---------------------------------

             Summary: Remove unnecessary bogus exception string from Configuration
                 Key: HADOOP-8995
                 URL: https://issues.apache.org/jira/browse/HADOOP-8995
             Project: Hadoop Common
          Issue Type: Bug
    Affects Versions: 1.0.0
            Reporter: Jing Zhao
            Assignee: Jing Zhao
            Priority: Minor

In Configuration#Configuration(boolean) and Configuration#Configuration(Configuration), bogus exceptions are thrown when the log level is DEBUG.
[jira] [Created] (HADOOP-8988) Backport HADOOP-8343 to branch-1
Jing Zhao created HADOOP-8988:
---------------------------------

             Summary: Backport HADOOP-8343 to branch-1
                 Key: HADOOP-8988
                 URL: https://issues.apache.org/jira/browse/HADOOP-8988
             Project: Hadoop Common
          Issue Type: New Feature
    Affects Versions: 1.0.0
            Reporter: Jing Zhao
            Assignee: Jing Zhao
         Attachments: Hadoop.8343.backport.001.patch

Backport HADOOP-8343 to branch-1 so as to specifically control the authorization requirements for accessing /jmx, /metrics, and /conf in branch-1.
[jira] [Created] (HADOOP-8841) In trunk for command rm, the flags -[rR] and -f are not documented
Jing Zhao created HADOOP-8841:
---------------------------------

             Summary: In trunk for command rm, the flags -[rR] and -f are not documented
                 Key: HADOOP-8841
                 URL: https://issues.apache.org/jira/browse/HADOOP-8841
             Project: Hadoop Common
          Issue Type: Improvement
          Components: documentation
    Affects Versions: 3.0.0
            Reporter: Jing Zhao
            Assignee: Jing Zhao
            Priority: Minor

We need to add descriptions of the -[rR] and -f flags to the documentation for trunk.
[jira] [Created] (HADOOP-8831) FSEditLog preallocate() needs to reset the position of PREALLOCATE_BUFFER when more than 1MB size is needed
Jing Zhao created HADOOP-8831:
---------------------------------

             Summary: FSEditLog preallocate() needs to reset the position of PREALLOCATE_BUFFER when more than 1MB size is needed
                 Key: HADOOP-8831
                 URL: https://issues.apache.org/jira/browse/HADOOP-8831
             Project: Hadoop Common
          Issue Type: Bug
    Affects Versions: 1.2.0
            Reporter: Jing Zhao
            Priority: Critical

In the new preallocate() function, when the required size is larger than 1MB, we need to reset the position for PREALLOCATION_BUFFER every time we have allocated 1MB. Otherwise it seems only 1MB can be allocated, even if the needed size is larger than 1MB.
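A sketch of the fix the issue describes (not the actual FSEditLog code): when preallocating more than one buffer's worth, the shared fill buffer's position must be rewound before each write, or every write after the first sees an exhausted buffer and appends nothing.

{code}
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardOpenOption;

public class PreallocateSketch {
  static final ByteBuffer FILL = ByteBuffer.allocateDirect(1024 * 1024); // 1MB of zeros

  static void preallocate(FileChannel fc, long need) throws IOException {
    long position = fc.size();
    while (need > 0) {
      FILL.position(0);                                // the reset the bug was missing
      FILL.limit((int) Math.min(FILL.capacity(), need));
      int written = fc.write(FILL, position);
      position += written;
      need -= written;
    }
  }

  public static void main(String[] args) throws IOException {
    Path p = Paths.get("prealloc.tmp");
    try (FileChannel fc = FileChannel.open(p, StandardOpenOption.CREATE,
        StandardOpenOption.WRITE)) {
      preallocate(fc, 3L * 1024 * 1024);
      System.out.println("size = " + fc.size()); // 3145728
    }
  }
}
{code}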