[jira] [Created] (HDFS-17375) Take down docs for all Ozone versions prior to 1.3.0

2024-02-07 Thread Arpit Agarwal (Jira)
Arpit Agarwal created HDFS-17375:


 Summary: Take down docs for all Ozone versions prior to 1.3.0
 Key: HDFS-17375
 URL: https://issues.apache.org/jira/browse/HDFS-17375
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: documentation
Reporter: Arpit Agarwal


Can we take the docs for all Ozone versions prior to 1.3.0 offline? They are 
being indexed with higher priority by Google and contain commands that fail on 
the latest releases.






Re: [DISCUSS] Migrate hadoop from log4j1 to log4j2

2022-01-20 Thread Arpit Agarwal
Hi Duo,

Thank you for starting this discussion. The log4j 1.2 bridge seems like a 
practical short-term solution. However, the bridge will silently affect 
applications that add appenders or filters; the NameNode audit logger and 
metrics come to mind. There may be others.
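
To illustrate the kind of pattern at risk, here is a hypothetical snippet
against the log4j 1.x API (the logger name and file path are assumptions, not
the actual NameNode code). Under the bridge, events are routed through log4j2,
so a programmatically attached log4j1 appender like this may never see them:

{code:java}
import org.apache.log4j.FileAppender;
import org.apache.log4j.Logger;
import org.apache.log4j.PatternLayout;

public class AuditAppenderExample {
  public static void main(String[] args) throws Exception {
    // An application-managed appender attached via the log4j1 API.
    Logger audit = Logger.getLogger("NameNodeAuditLogger"); // assumed name
    audit.addAppender(new FileAppender(
        new PatternLayout("%d %m%n"), "/tmp/audit.log"));
    // This compiles and runs under the log4j1.2 bridge, but the event may
    // be silently dropped because logging actually flows through log4j2.
    audit.info("audit event");
  }
}
{code}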

Thanks,
Arpit


> On Jan 20, 2022, at 5:55 AM, Duo Zhang  wrote:
> 
> There are 3 new CVEs for log4j1 reported recently[1][2][3], so I think it
> is time to speed up the log4j2 migration work[4] now.
> 
> You can see the discussion on the jira issue[4]: our goal is to fully
> migrate to log4j2, and the most blocking issue at the moment is the lack of
> support for the "log4j.rootLogger=INFO,Console" grammar in log4j2. I've
> already started a discussion thread on the log4j dev mailing list[5], the
> response is optimistic, and I've filed an issue for log4j2[6], but I do not
> think it can be addressed and released soon. If we want to fully migrate to
> log4j2, then we either introduce new environment variables or split the old
> HADOOP_ROOT_LOGGER variable in the startup scripts. Considering the
> complexity of our current startup scripts, the work is not easy, and it
> will also break lots of other hadoop deployment systems if they do not use
> our startup scripts...
> 
> So after reconsidering the current situation, I prefer that we use the
> log4j1.2 bridge to remove the log4j1 dependency first, and once LOG4J2-3341
> is addressed and released, we start to fully migrate to log4j2. Of course
> the log4j1.2 bridge has problems of its own: our TaskLogAppender,
> ContainerLogAppender and ContainerRollingLogAppender inherit from
> FileAppender and RollingFileAppender in log4j1, which are not part of the
> log4j1.2 bridge. But at least we could copy their source code into hadoop,
> as we have WriterAppender in the log4j1.2 bridge, and these two classes do
> not have related CVEs.
> 
> Thoughts? For my part, I would like us to make a new 3.4.x release line to
> remove the log4j1 dependencies ASAP.
> 
> Thanks.
> 
> 1. https://nvd.nist.gov/vuln/detail/CVE-2022-23302
> 2. https://nvd.nist.gov/vuln/detail/CVE-2022-23305
> 3. https://nvd.nist.gov/vuln/detail/CVE-2022-23307
> 4. https://issues.apache.org/jira/browse/HADOOP-16206
> 5. https://lists.apache.org/thread/gvfb3jkg6t11cyds4jmpo7lrswmx28w3
> 6. https://issues.apache.org/jira/browse/LOG4J2-3341
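
As context for the blocking issue above, a minimal comparison of the two
grammars (the log4j2 lines use the standard upstream properties syntax;
appender definitions are omitted):

{code}
# log4j1: the grammar HADOOP_ROOT_LOGGER expands into today
log4j.rootLogger=INFO,Console

# log4j2 properties syntax expressing the same intent
rootLogger.level = INFO
rootLogger.appenderRef.console.ref = Console
{code}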





Re: [DISCUSS] which release lines should we still consider actively maintained?

2021-05-24 Thread Arpit Agarwal
+1 to EOL 3.1.x at least.


> On May 23, 2021, at 9:51 PM, Wei-Chiu Chuang  
> wrote:
> 
> Sean,
> 
> For reasons I don't understand, I never received emails from your new
> address in the mailing list. Only Akira's response.
> 
> I was just able to start a thread like this.
> 
> I am +1 to EOL 3.1.5.
> Reason? Spark is already on Hadoop 3.2. Hive and Tez are actively working
> to support Hadoop 3.3. HBase supports Hadoop 3.3 already. They are the most
> common Hadoop applications, so I think 3.1 isn't necessarily that
> important.
> 
> With Hadoop 3.3.1, we have a number of improvements to support a better
> HDFS upgrade experience, so upgrading from Hadoop 3.1 should be relatively
> easy. Application upgrade takes some effort though (the commons-lang ->
> commons-lang3 migration, for example).
> I've been maintaining the HDFS code in branch-3.1, so from an
> HDFS perspective the branch is always in a ready-to-release state.
> 
> The Hadoop 3.1 line is more than 3 years old. Maintaining this branch is
> getting trickier. I am +100 to reducing the number of actively maintained
> release lines. IMO, 2 Hadoop 3 lines + 1 Hadoop 2 line is a good idea.
> 
> 
> 
> For the Hadoop 3.3 line: if no one beats me to it, I plan to make a 3.3.2
> in 2-3 months, and another one 2-3 months after that.
> Hadoop 3.3.1 has nearly 700 commits not in 3.3.0. It is very difficult
> to make/validate a maintenance release with such a big divergence in the
> code.
> 
> 
> On Mon, May 24, 2021 at 12:06 PM Akira Ajisaka wrote:
> 
>> Hi Sean,
>> 
>> Thank you for starting the discussion.
>> 
>> I think branch-2.10, branch-3.1, branch-3.2, branch-3.3, and trunk
>> (3.4.x) are actively maintained.
>> 
>> The next releases will be:
>> - 3.4.0
>> - 3.3.1 (Thanks, Wei-Chiu!)
>> - 3.2.3
>> - 3.1.5
>> - 2.10.2
>> 
>>> Are there folks willing to go through being release managers to get more
>> of these release lines on a steady cadence?
>> 
>> Now I'm interested in becoming a release manager of 3.1.5.
>> 
>>> If I were to take up maintenance release for one of them which should it
>> be?
>> 
>> 3.2.3 or 2.10.2 seems to be a good choice.
>> 
>>> Should we declare to our downstream users that some of these lines
>> aren’t going to get more releases?
>> 
>> Now I think we don't need to declare that. I believe 3.3.1, 3.2.3,
>> 3.1.5, and 2.10.2 will be released in the near future.
>> There are some earlier discussions of 3.1.x EoL, so 3.1.5 may be the
>> final release of the 3.1.x release line.
>> 
>>> Is there downstream facing documentation somewhere that I missed for
>> setting expectations about our release cadence and actively maintained
>> branches?
>> 
>> As you commented, the confluence wiki pages for Hadoop releases were
>> out of date. Updated [1].
>> 
>>> Do we have a backlog of work written up that could make the release
>> process easier for our release managers?
>> 
>> The release process is documented and maintained:
>> https://cwiki.apache.org/confluence/display/HADOOP2/HowToRelease
>> Also, there are some backlogs [1], [2].
>> 
>> [1]:
>> https://cwiki.apache.org/confluence/display/HADOOP/Hadoop+Active+Release+Lines
>> [2]: https://cwiki.apache.org/confluence/display/HADOOP/Roadmap
>> 
>> Thanks,
>> Akira
>> 
>> On Fri, May 21, 2021 at 7:12 AM Sean Busbey 
>> wrote:
>>> 
>>> 
>>> Hi folks!
>>> 
>>> Which release lines do we as a community still consider actively
>> maintained?
>>> 
>>> I found an earlier discussion[1] where we had consensus to consider
>> branches that don’t get maintenance releases on a regular basis end-of-life
>> for practical purposes. The result of that discussion was written up in our
>> wiki docs in the “EOL Release Branches” page, summarized here
>>> 
 If no volunteer to do a maintenance release in a short to mid-term
>> (like 3 months to 1 or 1.5 year).
>>> 
>>> Looking at release lines that are still on our download page[3]:
>>> 
>>> * Hadoop 2.10.z - last release 8 months ago
>>> * Hadoop 3.1.z - last release 9.5 months ago
>>> * Hadoop 3.2.z - last release 4.5 months ago
>>> * Hadoop 3.3.z - last release 10 months ago
>>> 
>>> And then trunk holds 3.4 which hasn’t had a release since the branch-3.3
>> fork ~14 months ago.
>>> 
>>> I can see that Wei-Chiu has been actively working on getting the 3.3.1
>> release out[4] (thanks Wei-Chiu!) but I do not see anything similar for the
>> other release lines.
>>> 
>>> We also have pages on the wiki for our project roadmap of release[5],
>> but it seems out of date since it lists in progress releases that have
>> happened or branches we have announced as end of life, i.e. 2.8.
>>> 
>>> We also have a group of pages (sorry, I’m not sure what the confluence
>> jargon is for this) for “hadoop active release lines”[6] but this list has
>> 2.8, 2.9, 3.0, 3.1, and 3.3. So several declared end of life lines and no
>> 2.10 or 3.2 despite those being our release lines with the most recent
>> releases.
>>> 
>>> Are there folks willing to go through being release managers to get more
>>> of these release lines on a steady cadence?

[jira] [Resolved] (HDFS-15854) Make some parameters configurable for SlowDiskTracker and SlowPeerTracker

2021-03-01 Thread Arpit Agarwal (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15854?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arpit Agarwal resolved HDFS-15854.
--
Fix Version/s: 3.4.0
 Hadoop Flags: Reviewed
   Resolution: Fixed

Thanks for the contribution [~tomscut].

> Make some parameters configurable for SlowDiskTracker and SlowPeerTracker
> -
>
> Key: HDFS-15854
> URL: https://issues.apache.org/jira/browse/HDFS-15854
> Project: Hadoop HDFS
>  Issue Type: Wish
>Reporter: tomscut
>Assignee: tomscut
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.4.0
>
>  Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
> Make some parameters configurable for SlowDiskTracker and SlowPeerTracker. 
> Related to https://issues.apache.org/jira/browse/HDFS-15814.






Re: [VOTE] Apache Hadoop Ozone 0.5.0-beta RC2

2020-03-21 Thread Arpit Agarwal
+1 binding.

- Verified hashes and signatures
- Built from source
- Deployed to 5 node cluster
- Tried ozone shell and filesystem operations
- Ran freon stress test for a while with write validation
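
For anyone repeating these checks, the hash and signature verification is
typically along these lines (artifact names are assumed to match the RC
directory, and the exact checksum invocation depends on how the .sha512 file
was generated):

{code}
gpg --import KEYS
gpg --verify hadoop-ozone-0.5.0-beta.tar.gz.asc hadoop-ozone-0.5.0-beta.tar.gz
sha512sum -c hadoop-ozone-0.5.0-beta.tar.gz.sha512
{code}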


I couldn’t find the RC2 tag in the gitbox repo, although it is there in GitHub.

Thanks,
Arpit



> On Mar 21, 2020, at 3:57 PM, Hanisha Koneru  
> wrote:
> 
> Thank you Dinesh for putting up the RCs.
> 
> +1 binding.
> 
> Verified the following:
>  - Built from source
>  - Deployed to 5 node docker cluster and ran sanity tests.
>  - Ran smoke tests
> 
> Thanks
> Hanisha
> 
>> On Mar 15, 2020, at 7:27 PM, Dinesh Chitlangia  wrote:
>> 
>> Hi Folks,
>> 
>> We have put together RC2 for Apache Hadoop Ozone 0.5.0-beta.
>> 
>> The RC artifacts are at:
>> https://home.apache.org/~dineshc/ozone-0.5.0-rc2/
>> 
>> The public key used for signing the artifacts can be found at:
>> https://dist.apache.org/repos/dist/release/hadoop/common/KEYS
>> 
>> The maven artifacts are staged at:
>> https://repository.apache.org/content/repositories/orgapachehadoop-1262
>> 
>> The RC tag in git is at:
>> https://github.com/apache/hadoop-ozone/tree/ozone-0.5.0-beta-RC2
>> 
>> This release contains 800+ fixes/improvements [1].
>> Thanks to everyone who put in the effort to make this happen.
>> 
>> *The vote will run for 7 days, ending on March 22nd 2020 at 11:59 pm PST.*
>> 
>> Note: This release is beta quality. It is not recommended for production
>> use, but we believe that it is stable enough to try out the feature set
>> and collect feedback.
>> 
>> 
>> [1] https://s.apache.org/ozone-0.5.0-fixed-issues
>> 
>> Thanks,
>> Dinesh Chitlangia
> 





Re: [DISCUSS] Hadoop 3.3.0 Release include ARM binary

2020-03-17 Thread Arpit Agarwal
Thanks for the clarification, Brahma. Can you update the proposal to state 
that it is optional (it may help to put the proposal on cwiki)?

Also, if we go ahead, the RM documentation should make clear that this is an 
optional step.


> On Mar 17, 2020, at 11:06 AM, Brahma Reddy Battula  wrote:
> 
> Sure, we can't make it mandatory for voting, and we can upload to downloads
> once the release vote has passed.
> 
> On Tue, 17 Mar 2020 at 11:24 PM, Arpit Agarwal
>  wrote:
> 
>>> Sorry, didn't get you... do you mean, once release voting is
>>> processed and uploaded by the RM?
>> 
>> Yes, that is what I meant. I don’t want us to make more mandatory work for
>> the release manager because the job is hard enough already.
>> 
>> 
>>> On Mar 17, 2020, at 10:46 AM, Brahma Reddy Battula 
>> wrote:
>>> 
>>> Sorry, didn't get you... do you mean, once release voting is processed
>>> and uploaded by the RM?
>>> 
>>> FYI, there is a docker image for ARM as well, which supports all the
>>> scripts (createrelease, start-build-env.sh, etc.).
>>> 
>>> https://issues.apache.org/jira/browse/HADOOP-16797
>>> 
>>> On Tue, Mar 17, 2020 at 10:59 PM Arpit Agarwal
>>>  wrote:
>>> 
>>>> Can ARM binaries be provided after the fact? We cannot increase the RM’s
>>>> burden by asking them to generate an extra set of binaries.
>>>> 
>>>> 
>>>>> On Mar 17, 2020, at 10:23 AM, Brahma Reddy Battula 
>>>> wrote:
>>>>> 
>>>>> + Dev mailing list.
>>>>> 
>>>>> -- Forwarded message -
>>>>> From: Brahma Reddy Battula 
>>>>> Date: Tue, Mar 17, 2020 at 10:31 PM
>>>>> Subject: Re: [DISCUSS] Hadoop 3.3.0 Release include ARM binary
>>>>> To: junping_du 
>>>>> 
>>>>> 
>>>>> thanks junping for your reply.
>>>>> 
>>>>> bq. I think most of us in the Hadoop community don't want to be biased
>>>>> toward ARM or any other platform.
>>>>> 
>>>>> Yes, release voting will be based on the source code. AFAIK, the binary
>>>>> we provide is for users to download and verify easily.
>>>>> 
>>>>> bq. The only thing I try to understand is how much complexity gets
>>>>> involved in our RM work. Does that potentially become a blocker for
>>>>> future releases? And how can we get rid of this risk?
>>>>> 
>>>>> As I mentioned earlier, the RM needs to access the ARM machine (it will
>>>>> be donated; the current qbt also uses one ARM machine) and build the
>>>>> tar using the keys. As it can be a shared machine, the RM can delete
>>>>> their keys once the release is approved.
>>>>> This can be sorted out as I mentioned earlier (for accessing the ARM
>>>>> machine).
>>>>> 
>>>>> bq. If you can list the concrete extra work that the RM needs to do
>>>>> for an ARM release, that would help us to better understand.
>>>>> 
>>>>> I can write it up and keep it updated for future reference.
>>>>> 
>>>>> 
>>>>> 
>>>>> 
>>>>> 
>>>>> 
>>>>> 
>>>>> 
>>>>> 
>>>>> On Tue, Mar 17, 2020 at 10:41 AM 俊平堵  wrote:
>>>>> 
>>>>>> Hi Brahma,
>>>>>>   I think most of us in the Hadoop community don't want to be biased
>>>>>> toward ARM or any other platform.
>>>>>>   The only thing I try to understand is how much complexity gets
>>>>>> involved in our RM work. Does that potentially become a blocker for
>>>>>> future releases? And how can we get rid of this risk?
>>>>>>   If you can list the concrete extra work that the RM needs to do for
>>>>>> an ARM release, that would help us to better understand.
>>>>>> 
>>>>>> Thanks,
>>>>>> 
>>>>>> Junping
>>>>>> 
>>>>>> Akira Ajisaka wrote on Fri, Mar 13, 2020 at 12:34 AM:
>>>>>> 
>>>>>>> If you can provide an ARM release for future releases, I'm fine with
>>>>>>> that.
>>>>>>> 
>>>>>>> Thanks,
>>>>>>> Akira
>>>>>>> 
>>>>>>> 

Re: [DISCUSS] Hadoop 3.3.0 Release include ARM binary

2020-03-17 Thread Arpit Agarwal
> Sorry, didn't get you... do you mean, once release voting is
> processed and uploaded by the RM?

Yes, that is what I meant. I don’t want us to make more mandatory work for the 
release manager because the job is hard enough already.


> On Mar 17, 2020, at 10:46 AM, Brahma Reddy Battula  wrote:
> 
> Sorry, didn't get you... do you mean, once release voting is processed
> and uploaded by the RM?
> 
> FYI, there is a docker image for ARM as well, which supports all the
> scripts (createrelease, start-build-env.sh, etc.).
> 
> https://issues.apache.org/jira/browse/HADOOP-16797
> 
> On Tue, Mar 17, 2020 at 10:59 PM Arpit Agarwal
>  wrote:
> 
>> Can ARM binaries be provided after the fact? We cannot increase the RM’s
>> burden by asking them to generate an extra set of binaries.
>> 
>> 
>>> On Mar 17, 2020, at 10:23 AM, Brahma Reddy Battula 
>> wrote:
>>> 
>>> + Dev mailing list.
>>> 
>>> -- Forwarded message -
>>> From: Brahma Reddy Battula 
>>> Date: Tue, Mar 17, 2020 at 10:31 PM
>>> Subject: Re: [DISCUSS] Hadoop 3.3.0 Release include ARM binary
>>> To: junping_du 
>>> 
>>> 
>>> thanks junping for your reply.
>>> 
>>> bq. I think most of us in the Hadoop community don't want to be biased
>>> toward ARM or any other platform.
>>> 
>>> Yes, release voting will be based on the source code. AFAIK, the binary
>>> we provide is for users to download and verify easily.
>>> 
>>> bq. The only thing I try to understand is how much complexity gets
>>> involved in our RM work. Does that potentially become a blocker for
>>> future releases? And how can we get rid of this risk?
>>> 
>>> As I mentioned earlier, the RM needs to access the ARM machine (it will
>>> be donated; the current qbt also uses one ARM machine) and build the tar
>>> using the keys. As it can be a shared machine, the RM can delete their
>>> keys once the release is approved.
>>> This can be sorted out as I mentioned earlier (for accessing the ARM
>>> machine).
>>> 
>>> bq. If you can list the concrete extra work that the RM needs to do for
>>> an ARM release, that would help us to better understand.
>>> 
>>> I can write it up and keep it updated for future reference.
>>> 
>>> 
>>> 
>>> 
>>> 
>>> 
>>> 
>>> 
>>> 
>>> On Tue, Mar 17, 2020 at 10:41 AM 俊平堵  wrote:
>>> 
>>>> Hi Brahma,
>>>>    I think most of us in the Hadoop community don't want to be biased
>>>> toward ARM or any other platform.
>>>>    The only thing I try to understand is how much complexity gets
>>>> involved in our RM work. Does that potentially become a blocker for
>>>> future releases? And how can we get rid of this risk?
>>>>    If you can list the concrete extra work that the RM needs to do for
>>>> an ARM release, that would help us to better understand.
>>>> 
>>>> Thanks,
>>>> 
>>>> Junping
>>>> 
>>>> Akira Ajisaka wrote on Fri, Mar 13, 2020 at 12:34 AM:
>>>> 
>>>>> If you can provide an ARM release for future releases, I'm fine with that.
>>>>> 
>>>>> Thanks,
>>>>> Akira
>>>>> 
>>>>> On Thu, Mar 12, 2020 at 9:41 PM Brahma Reddy Battula <
>> bra...@apache.org>
>>>>> wrote:
>>>>> 
>>>>>> thanks Akira.
>>>>>> 
>>>>>> Currently the only problem is a dedicated ARM machine for future RMs.
>>>>>> I want to sort this out as below; if you have other ideas, please let
>>>>>> me know.
>>>>>> 
>>>>>> i) A single machine with credentials shared with future RMs (as we
>>>>>> can delete the keys once a release is over).
>>>>>> ii) Creating a jenkins project (maybe we need to discuss this with
>>>>>> the board..)
>>>>>> iii) I can provide the ARM release for future releases.
>>>>>> 
>>>>>> 
>>>>>> 
>>>>>> 
>>>>>> 
>>>>>> 
>>>>>> 
>>>>>> On Thu, Mar 12, 2020 at 5:14 PM Akira Ajisaka 
>>>>> wrote:
>>>>>> 
>>>>>>> Hi Brahma,
>>>>>>> 
>>>>>>> I think we cannot do any of your proposed actions.
>>>>>>> 
>>>

Re: [DISCUSS] Hadoop 3.3.0 Release include ARM binary

2020-03-17 Thread Arpit Agarwal
Can ARM binaries be provided after the fact? We cannot increase the RM’s burden 
by asking them to generate an extra set of binaries.


> On Mar 17, 2020, at 10:23 AM, Brahma Reddy Battula  wrote:
> 
> + Dev mailing list.
> 
> -- Forwarded message -
> From: Brahma Reddy Battula 
> Date: Tue, Mar 17, 2020 at 10:31 PM
> Subject: Re: [DISCUSS] Hadoop 3.3.0 Release include ARM binary
> To: junping_du 
> 
> 
> thanks junping for your reply.
> 
> bq. I think most of us in the Hadoop community don't want to be biased
> toward ARM or any other platform.
> 
> Yes, release voting will be based on the source code. AFAIK, the binary we
> provide is for users to download and verify easily.
> 
> bq. The only thing I try to understand is how much complexity gets
> involved in our RM work. Does that potentially become a blocker for future
> releases? And how can we get rid of this risk?
> 
> As I mentioned earlier, the RM needs to access the ARM machine (it will be
> donated; the current qbt also uses one ARM machine) and build the tar using
> the keys. As it can be a shared machine, the RM can delete their keys once
> the release is approved.
> This can be sorted out as I mentioned earlier (for accessing the ARM
> machine).
> 
> bq. If you can list the concrete extra work that the RM needs to do for an
> ARM release, that would help us to better understand.
> 
> I can write it up and keep it updated for future reference.
> 
> 
> 
> 
> 
> 
> 
> 
> 
> On Tue, Mar 17, 2020 at 10:41 AM 俊平堵  wrote:
> 
>> Hi Brahma,
>> I think most of us in the Hadoop community don't want to be biased toward
>> ARM or any other platform.
>> The only thing I try to understand is how much complexity gets
>> involved in our RM work. Does that potentially become a blocker for future
>> releases? And how can we get rid of this risk?
>>  If you can list the concrete extra work that the RM needs to do for an
>> ARM release, that would help us to better understand.
>> 
>> Thanks,
>> 
>> Junping
>> 
>> Akira Ajisaka wrote on Fri, Mar 13, 2020 at 12:34 AM:
>> 
>>> If you can provide an ARM release for future releases, I'm fine with that.
>>> 
>>> Thanks,
>>> Akira
>>> 
>>> On Thu, Mar 12, 2020 at 9:41 PM Brahma Reddy Battula 
>>> wrote:
>>> 
 thanks Akira.
 
 Currently the only problem is a dedicated ARM machine for future RMs.
 I want to sort this out as below; if you have other ideas, please let me
 know.
 
 i) A single machine with credentials shared with future RMs (as we can
 delete the keys once a release is over).
 ii) Creating a jenkins project (maybe we need to discuss this with the
 board..)
 iii) I can provide the ARM release for future releases.
 
 
 
 
 
 
 
 On Thu, Mar 12, 2020 at 5:14 PM Akira Ajisaka 
>>> wrote:
 
> Hi Brahma,
> 
> I think we cannot do any of your proposed actions.
> 
> 
 
> http://www.apache.org/legal/release-policy.html#owned-controlled-hardware
>> Strictly speaking, releases must be verified on hardware owned and
>> controlled by the committer. That means hardware the committer has
>> physical possession and control of and exclusively full
>> administrative/superuser access to. That's because only such hardware is
>> qualified to hold a PGP private key, and the release should be verified on
>> the machine the private key lives on or on a machine as trusted as that.
> 
> https://www.apache.org/dev/release-distribution.html#sigs-and-sums
>> Private keys MUST NOT be stored on any ASF machine. Likewise, signatures
>> for releases MUST NOT be created on ASF machines.
> 
> We need to have dedicated physical ARM machines for each release manager,
> and that is not feasible right now.
> If you provide an unofficial ARM binary release in some repository, that's
> okay.
> 
> -Akira
> 
> On Thu, Mar 12, 2020 at 7:57 PM Brahma Reddy Battula <bra...@apache.org>
> wrote:
> 
>> Hello folks,
>> 
>> As trunk currently supports ARM-based compilation and qbt(1) has been
>> running for several months quite stably, I am planning to propose an ARM
>> binary this time.
>> 
>> (Note: as we all know, voting will be based on the source, so this will
>> not be an issue.)
>> 
>> *Proposed Change:*
>> Currently in downloads we keep only the x86 binary(2). Can we keep the
>> ARM binary as well?
>> 
>> *Actions:*
>> a) *Dedicated Machine*:
>>   i) A dedicated ARM machine will be donated, which I have confirmed.
>>   ii) Or we can use the jenkins ARM machine itself, which is currently
>>   used for ARM.
>> b) *Automate Release:* How about having one release project in jenkins,
>> so that future RMs just trigger the jenkins project?
>> 
>> Please let me know your thoughts on this.
>> 
>> 
>> 1.
>> 
>> 
 
>>> https://builds.apache.org/view/H-L/view/Hadoop/job/Hadoop-qbt-linux-ARM-trunk/
>> 2.http

Re: [VOTE] Apache Hadoop Ozone 0.5.0-beta RC1

2020-03-13 Thread Arpit Agarwal
HDDS-3116 is now fixed in the ozone-0.5.0 branch.

Folks - any more potential blockers before Dinesh spins RC2? I don’t see 
anything in Jira at the moment:

https://issues.apache.org/jira/issues/?jql=project%20in%20(%22HDDS%22)%20and%20%22Target%20Version%2Fs%22%20in%20(0.5.0)%20and%20resolution%20in%20(Unresolved)%20and%20priority%20in%20(Blocker)
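
Decoded, the JQL in that link reads:

{code}
project in ("HDDS") and "Target Version/s" in (0.5.0)
  and resolution in (Unresolved) and priority in (Blocker)
{code}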

Thanks,
Arpit


> On Mar 9, 2020, at 6:15 PM, Shashikant Banerjee 
>  wrote:
> 
> I think https://issues.apache.org/jira/browse/HDDS-3116 is a blocker for
> the release. Because of this, datanodes fail to communicate with SCM, are
> marked dead, and don't seem to recover.
> This has been observed in multiple test setups.
> 
> Thanks
> Shashi
> 
> On Mon, Mar 9, 2020 at 9:20 PM Attila Doroszlai wrote:
> 
>> +1
>> 
>> * Verified GPG signature and SHA512 checksum
>> * Compiled sources
>> * Ran ozone smoke test against both binary and locally compiled versions
>> 
>> Thanks Dinesh for RC1.
>> 
>> -Attila
>> 
>> On Sun, Mar 8, 2020 at 2:34 AM Arpit Agarwal
>>  wrote:
>>> 
>>> +1 (binding)
>>> Verified mds, sha512
>>> Verified signatures
>>> Built from source
>>> Deployed to 3 node cluster
>>> Tried a few ozone shell and filesystem commands
>>> Ran freon load generator
>>> Thanks Dinesh for putting the RC1 together.
>>> 
>>> 
>>> 
>>>> On Mar 6, 2020, at 4:46 PM, Dinesh Chitlangia 
>> wrote:
>>>> 
>>>> Hi Folks,
>>>> 
>>>> We have put together RC1 for Apache Hadoop Ozone 0.5.0-beta.
>>>> 
>>>> The RC artifacts are at:
>>>> https://home.apache.org/~dineshc/ozone-0.5.0-rc1/
>>>> 
>>>> The public key used for signing the artifacts can be found at:
>>>> https://dist.apache.org/repos/dist/release/hadoop/common/KEYS
>>>> 
>>>> The maven artifacts are staged at:
>>>> 
>> https://repository.apache.org/content/repositories/orgapachehadoop-1260
>>>> 
>>>> The RC tag in git is at:
>>>> https://github.com/apache/hadoop-ozone/tree/ozone-0.5.0-beta-RC1
>>>> 
>>>> This release contains 800+ fixes/improvements [1].
>>>> Thanks to everyone who put in the effort to make this happen.
>>>> 
>>>> *The vote will run for 7 days, ending on March 13th 2020 at 11:59 pm
>> PST.*
>>>> 
>>>> Note: This release is beta quality. It is not recommended for
>>>> production use, but we believe that it is stable enough to try out the
>>>> feature set and collect feedback.
>>>> 
>>>> 
>>>> [1] https://s.apache.org/ozone-0.5.0-fixed-issues
>>>> 
>>>> Thanks,
>>>> Dinesh Chitlangia
>>> 
>> 


Re: [VOTE] Apache Hadoop Ozone 0.5.0-beta RC1

2020-03-07 Thread Arpit Agarwal
+1 (binding)
Verified mds, sha512
Verified signatures
Built from source
Deployed to 3 node cluster
Tried a few ozone shell and filesystem commands
Ran freon load generator 
Thanks Dinesh for putting the RC1 together.



> On Mar 6, 2020, at 4:46 PM, Dinesh Chitlangia  wrote:
> 
> Hi Folks,
> 
> We have put together RC1 for Apache Hadoop Ozone 0.5.0-beta.
> 
> The RC artifacts are at:
> https://home.apache.org/~dineshc/ozone-0.5.0-rc1/
> 
> The public key used for signing the artifacts can be found at:
> https://dist.apache.org/repos/dist/release/hadoop/common/KEYS
> 
> The maven artifacts are staged at:
> https://repository.apache.org/content/repositories/orgapachehadoop-1260
> 
> The RC tag in git is at:
> https://github.com/apache/hadoop-ozone/tree/ozone-0.5.0-beta-RC1
> 
> This release contains 800+ fixes/improvements [1].
> Thanks to everyone who put in the effort to make this happen.
> 
> *The vote will run for 7 days, ending on March 13th 2020 at 11:59 pm PST.*
> 
> Note: This release is beta quality. It is not recommended for production
> use, but we believe that it is stable enough to try out the feature set
> and collect feedback.
> 
> 
> [1] https://s.apache.org/ozone-0.5.0-fixed-issues
> 
> Thanks,
> Dinesh Chitlangia



Re: [VOTE] Apache Hadoop Ozone 0.5.0-beta RC0

2020-02-28 Thread Arpit Agarwal
Hi Dinesh,

Thanks for spinning up this RC! It looks like we still had ~15 issues tagged 
as blockers for 0.5.0 in Jira.

I've moved out most of them; however, the remaining 4 look like must-fixes.

https://issues.apache.org/jira/issues/?jql=project%20%3D%20%22HDDS%22%20and%20%22Target%20Version%2Fs%22%20in%20(0.5.0)%20and%20resolution%20%3D%20Unresolved%20and%20priority%20%3D%20Blocker
 


Thanks,
Arpit


> On Feb 27, 2020, at 8:23 PM, Dinesh Chitlangia  wrote:
> 
> Hi Folks,
> 
> We have put together RC0 for Apache Hadoop Ozone 0.5.0-beta.
> 
> The RC artifacts are at:
> https://home.apache.org/~dineshc/ozone-0.5.0-rc0/
> 
> The public key used for signing the artifacts can be found at:
> https://dist.apache.org/repos/dist/release/hadoop/common/KEYS
> 
> The maven artifacts are staged at:
> https://repository.apache.org/content/repositories/orgapachehadoop-1259
> 
> The RC tag in git is at:
> https://github.com/apache/hadoop-ozone/tree/ozone-0.5.0-beta-RC0
> 
> This release contains 800+ fixes/improvements [1].
> Thanks to everyone who put in the effort to make this happen.
> 
> *The vote will run for 7 days, ending on March 4th 2020 at 11:59 pm PST.*
> 
> Note: This release is beta quality. It is not recommended for production
> use, but we believe that it is stable enough to try out the feature set
> and collect feedback.
> 
> 
> [1] https://s.apache.org/ozone-0.5.0-fixed-issues
> 
> Thanks,
> Dinesh Chitlangia



Re: [DISCUSS] Feature branch for HDFS-14978 In-place Erasure Coding Conversion

2020-01-23 Thread Arpit Agarwal
+1


> On Jan 23, 2020, at 2:51 PM, Jitendra Pandey  wrote:
> 
> +1 for the feature branch. 
> 
> On Thu, Jan 23, 2020 at 1:34 PM Wei-Chiu Chuang 
>  wrote:
> Hi, we are working on a feature to improve Erasure Coding, and I would like
> to seek your opinion on creating a feature branch for it (HDFS-14978).
> 
> Reasons for a feature branch:
> (1) It turns out we need to update the NameNode layout version.
> (2) It's a medium-size project and we want to get this feature merged in
> its entirety.
> 
> Aravindan Vijayan and I are planning to work on this feature.
> 
> Thoughts?



Re: [DISCUSS] Ozone 0.4.2 release

2019-12-07 Thread Arpit Agarwal
+1



> On Dec 6, 2019, at 5:25 PM, Dinesh Chitlangia  wrote:
> 
> All,
> Since the Apache Hadoop Ozone 0.4.1 release, we have had significant
> bug fixes for performance & stability.
> 
> With that in mind, a 0.4.2 release would be a good way to consolidate all
> those fixes.
> 
> Please share your thoughts.
> 
> 
> Thanks,
> Dinesh Chitlangia





[jira] [Resolved] (HDDS-529) Some Ozone DataNode logs go to a separate ozone.log file

2019-11-06 Thread Arpit Agarwal (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-529?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arpit Agarwal resolved HDDS-529.

Resolution: Done

Thanks for the note [~Huachao]. This appears to be fixed now. I no longer see 
log output going to the ozone.log file.

> Some Ozone DataNode logs go to a separate ozone.log file
> 
>
> Key: HDDS-529
> URL: https://issues.apache.org/jira/browse/HDDS-529
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>  Components: Ozone Datanode
>    Reporter: Arpit Agarwal
>Assignee: YiSheng Lien
>Priority: Blocker
>  Labels: beta1
>
> Some, but not all, DataNode logs go to a separate ozone.log file. A couple 
> of things to fix here:
> # The behavior should be consistent. All log messages should go to the new 
> log file.
> # The new log file name should follow the Hadoop log file convention, e.g. 
> _hadoop-<user>-<daemon>-<hostname>.log_






Re: [VOTE] create ozone-dev and ozone-issues mailing lists

2019-11-06 Thread Arpit Agarwal
Good idea. Thanks Ayush.

Filed https://issues.apache.org/jira/browse/HADOOP-16688


> On Nov 6, 2019, at 10:29 AM, Ayush Saxena  wrote:
> 
> Hi,
> Probably we should mention the new lists in the document too. Here :
> https://hadoop.apache.org/mailing_lists.html
> 
> -Ayush
> 
> On Mon, 4 Nov 2019 at 16:19, Elek, Marton  wrote:
> 
>> Thanks Arpit.
>> 
>> Notification rules have also been adjusted (thanks to INFRA).
>> 
>> https://issues.apache.org/jira/browse/INFRA-19378
>> 
>> Marton
>> 
>> On 11/1/19 4:46 PM, Arpit Agarwal wrote:
>>> Thanks for kicking this off Marton. Submitted INFRA requests to create
>> the following. The lists should be live soon.
>>> 
>>>  - ozone-dev@h.a.o
>>>  - ozone-issues@h.a.o
>>>  - ozone-commits@h.a.o
>>> 
>>> 
>>> 
>>>> On Oct 31, 2019, at 3:32 AM, Elek, Marton  wrote:
>>>> 
>>>> 
>>>> Thanks for all the votes and feedback.
>>>> 
>>>> The vote has passed with no -1 and many +1s.
>>>> 
>>>> The mailing lists will be created soon and the notification settings
>> will be updated.
>>>> 
>>>> Thank you for your patience.
>>>> Marton
>>>> 
>>>> 
>>>> On 10/27/19 9:25 AM, Elek, Marton wrote:
>>>>> As discussed earlier in the thread "Hadoop-Ozone repository mailing
>>>>> list configurations" [1], I suggested solving the current
>>>>> misconfiguration problem by creating separate mailing lists
>>>>> (dev/issues) for Hadoop Ozone.
>>>>> It would have some additional benefits: for example, it would make it
>>>>> easier to follow Ozone development and future plans.
>>>>> Here I am starting a new vote thread (open for at least 72 hours) to
>> collect more feedback about this.
>>>>> Please express your opinion / vote.
>>>>> Thanks a lot,
>>>>> Marton
>>>>> [1]
>> https://lists.apache.org/thread.html/dc66a30f48a744534e748c418bf7ab6275896166ca5ade11560ebaef@%3Chdfs-dev.hadoop.apache.org%3E
>>>> 
>>>> 
>>> 
>>> 
>>> 
>> 
>> 
>> 



[jira] [Reopened] (HDDS-2208) Propagate System Exceptions from OM transaction apply phase

2019-11-04 Thread Arpit Agarwal (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-2208?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arpit Agarwal reopened HDDS-2208:
-

> Propagate System Exceptions from OM transaction apply phase
> ---
>
> Key: HDDS-2208
> URL: https://issues.apache.org/jira/browse/HDDS-2208
> Project: Hadoop Distributed Data Store
>  Issue Type: Sub-task
>  Components: Ozone Manager
>Reporter: Supratim Deka
>Assignee: Supratim Deka
>Priority: Major
>  Labels: pull-request-available
> Fix For: 0.5.0
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> The change for HDDS-2206 tracks system exceptions during preExecute phase of 
> OM request handling.
> The current jira is to implement exception propagation once the OM request is 
> submitted to Ratis - when the handler is running validateAndUpdateCache for 
> the request.






[jira] [Reopened] (HDDS-1847) Datanode Kerberos principal and keytab config key looks inconsistent

2019-11-01 Thread Arpit Agarwal (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-1847?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arpit Agarwal reopened HDDS-1847:
-

I've reverted this to unblock other CI runs which may get stuck on the failing 
tests.

We can recommit with UT fixes.

> Datanode Kerberos principal and keytab config key looks inconsistent
> 
>
> Key: HDDS-1847
> URL: https://issues.apache.org/jira/browse/HDDS-1847
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>Affects Versions: 0.5.0
>Reporter: Eric Yang
>Assignee: Chris Teoh
>Priority: Major
>  Labels: newbie, pull-request-available
> Fix For: 0.5.0
>
>  Time Spent: 1h 20m
>  Remaining Estimate: 0h
>
> Ozone Kerberos configuration can be very confusing:
> | config name | Description |
> | hdds.scm.kerberos.principal | SCM service principal |
> | hdds.scm.kerberos.keytab.file | SCM service keytab file |
> | ozone.om.kerberos.principal | Ozone Manager service principal |
> | ozone.om.kerberos.keytab.file | Ozone Manager keytab file |
> | hdds.scm.http.kerberos.principal | SCM service spnego principal |
> | hdds.scm.http.kerberos.keytab.file | SCM service spnego keytab file |
> | ozone.om.http.kerberos.principal | Ozone Manager spnego principal |
> | ozone.om.http.kerberos.keytab.file | Ozone Manager spnego keytab file |
> | hdds.datanode.http.kerberos.keytab | Datanode spnego keytab file |
> | hdds.datanode.http.kerberos.principal | Datanode spnego principal |
> | dfs.datanode.kerberos.principal | Datanode service principal |
> | dfs.datanode.keytab.file | Datanode service keytab file |
> The prefixes are very different for each of the datanode configurations. It 
> would be nice to have some consistency for the datanode.






[jira] [Resolved] (HDDS-2393) HDDS-1847 broke some unit tests

2019-11-01 Thread Arpit Agarwal (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-2393?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arpit Agarwal resolved HDDS-2393.
-
Resolution: Not A Problem

Reverted HDDS-1847 for now, so this should not be necessary.

Let's include the full fix there.

> HDDS-1847 broke some unit tests
> ---
>
> Key: HDDS-2393
> URL: https://issues.apache.org/jira/browse/HDDS-2393
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>Reporter: Chris Teoh
>Assignee: Chris Teoh
>Priority: Major
>
> Siyao Meng commented on HDDS-1847:
> --
> Looks like this commit breaks {{TestKeyManagerImpl}} in {{setUp()}} and 
> {{cleanup()}}. Run {{TestKeyManagerImpl#testListStatus()}} to reliably 
> repro. I believe there could be other tests that are broken by this.
> {code}
> java.lang.NullPointerException
> at 
> org.apache.hadoop.hdds.scm.server.StorageContainerManagerHttpServer.getSpnegoPrincipal(StorageContainerManagerHttpServer.java:74)
> at 
> org.apache.hadoop.hdds.server.BaseHttpServer.<init>(BaseHttpServer.java:81)
> at 
> org.apache.hadoop.hdds.scm.server.StorageContainerManagerHttpServer.<init>(StorageContainerManagerHttpServer.java:36)
> at 
> org.apache.hadoop.hdds.scm.server.StorageContainerManager.<init>(StorageContainerManager.java:330)
> at org.apache.hadoop.hdds.scm.TestUtils.getScm(TestUtils.java:544)
> at 
> org.apache.hadoop.ozone.om.TestKeyManagerImpl.setUp(TestKeyManagerImpl.java:150)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:498)
> at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47)
> at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
> at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44)
> at 
> org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:24)
> at 
> org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
> at org.junit.runners.ParentRunner.run(ParentRunner.java:309)
> at org.junit.runner.JUnitCore.run(JUnitCore.java:160)
> at 
> com.intellij.junit4.JUnit4IdeaTestRunner.startRunnerWithArgs(JUnit4IdeaTestRunner.java:68)
> at 
> com.intellij.rt.execution.junit.IdeaTestRunner$Repeater.startRunnerWithArgs(IdeaTestRunner.java:47)
> at 
> com.intellij.rt.execution.junit.JUnitStarter.prepareStreamsAndStart(JUnitStarter.java:242)
> at 
> com.intellij.rt.execution.junit.JUnitStarter.main(JUnitStarter.java:70)
> {code}
> {code}
> java.lang.NullPointerException
> at 
> org.apache.hadoop.ozone.om.TestKeyManagerImpl.cleanup(TestKeyManagerImpl.java:176)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:498)
> at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47)
> at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
> at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44)
> at 
> org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:33)
> at org.junit.runners.ParentRunner.run(ParentRunner.java:309)
> at org.junit.runner.JUnitCore.run(JUnitCore.java:160)
> at 
> com.intellij.junit4.JUnit4IdeaTestRunner.startRunnerWithArgs(JUnit4IdeaTestRunner.java:68)
> at 
> com.intellij.rt.execution.junit.IdeaTestRunner$Repeater.startRunnerWithArgs(IdeaTestRunner.java:47)
> at 
> com.intellij.rt.execution.junit.JUnitStarter.prepareStreamsAndStart(JUnitStarter.java:242)
> at 
> com.intellij.rt.execution.junit.JUnitStarter.main(JUnitStarter.java:70)
> {code}






Re: [VOTE] create ozone-dev and ozone-issues mailing lists

2019-11-01 Thread Arpit Agarwal
Thanks for kicking this off Marton. Submitted INFRA requests to create the 
following. The lists should be live soon.

- ozone-dev@h.a.o
- ozone-issues@h.a.o
- ozone-commits@h.a.o



> On Oct 31, 2019, at 3:32 AM, Elek, Marton  wrote:
> 
> 
> Thanks for all the votes and feedback.
> 
> The vote has passed with no -1 and many +1s.
> 
> The mailing lists will be created soon and the notification settings will be 
> updated.
> 
> Thank you for your patience.
> Marton
> 
> 
> On 10/27/19 9:25 AM, Elek, Marton wrote:
>> As discussed earlier in the thread "Hadoop-Ozone repository mailing list 
>> configurations" [1], I suggested solving the current misconfiguration 
>> problem by creating separate mailing lists (dev/issues) for Hadoop Ozone.
>> It would have some additional benefits: for example, it would make it 
>> easier to follow Ozone development and future plans.
>> Here I am starting a new vote thread (open for at least 72 hours) to collect 
>> more feedback about this.
>> Please express your opinion / vote.
>> Thanks a lot,
>> Marton
>> [1] 
>> https://lists.apache.org/thread.html/dc66a30f48a744534e748c418bf7ab6275896166ca5ade11560ebaef@%3Chdfs-dev.hadoop.apache.org%3E
> 
> 





Re: [VOTE] create ozone-dev and ozone-issues mailing lists

2019-10-30 Thread Arpit Agarwal
+1

> On Oct 27, 2019, at 1:25 AM, Elek, Marton  wrote:
> 
> 
> As discussed earlier in the thread "Hadoop-Ozone repository mailing list 
> configurations" [1], I suggested solving the current misconfiguration 
> problem by creating separate mailing lists (dev/issues) for Hadoop Ozone.
> 
> It would have some additional benefits: for example, it would make it 
> easier to follow Ozone development and future plans.
> 
> Here I am starting a new vote thread (open for at least 72 hours) to collect 
> more feedback about this.
> 
> Please express your opinion / vote.
> 
> Thanks a lot,
> Marton
> 
> [1] 
> https://lists.apache.org/thread.html/dc66a30f48a744534e748c418bf7ab6275896166ca5ade11560ebaef@%3Chdfs-dev.hadoop.apache.org%3E
> 
> 





[jira] [Resolved] (HDDS-426) Add field modificationTime for Volume and Bucket

2019-10-29 Thread Arpit Agarwal (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-426?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arpit Agarwal resolved HDDS-426.

Resolution: Duplicate

> Add field modificationTime for Volume and Bucket
> 
>
> Key: HDDS-426
> URL: https://issues.apache.org/jira/browse/HDDS-426
> Project: Hadoop Distributed Data Store
>  Issue Type: Improvement
>  Components: Ozone Manager
>Reporter: Dinesh Chitlangia
>Assignee: YiSheng Lien
>Priority: Major
>  Labels: newbie
>
> There are update operations that can be performed for Volume, Bucket and Key.
> While Key records the modification time, Volume and Bucket do not capture 
> this.
>  
> This Jira proposes to add the required field to Volume and Bucket in order 
> to capture the modificationTime.
>  
> Current Status:
> {noformat}
> hadoop@1987b5de4203:~$ ./bin/ozone oz -infoVolume /dummyvol
> 2018-09-10 17:16:12 WARN NativeCodeLoader:60 - Unable to load native-hadoop 
> library for your platform... using builtin-java classes where applicable
> {
> "owner" : {
> "name" : "bilbo"
> },
> "quota" : {
> "unit" : "TB",
> "size" : 1048576
> },
> "volumeName" : "dummyvol",
> "createdOn" : "Mon, 10 Sep 2018 17:11:32 GMT",
> "createdBy" : "bilbo"
> }
> hadoop@1987b5de4203:~$ ./bin/ozone oz -infoBucket /dummyvol/mybuck
> 2018-09-10 17:15:25 WARN NativeCodeLoader:60 - Unable to load native-hadoop 
> library for your platform... using builtin-java classes where applicable
> {
> "volumeName" : "dummyvol",
> "bucketName" : "mybuck",
> "createdOn" : "Mon, 10 Sep 2018 17:12:09 GMT",
> "acls" : [ {
> "type" : "USER",
> "name" : "hadoop",
> "rights" : "READ_WRITE"
> }, {
> "type" : "GROUP",
> "name" : "users",
> "rights" : "READ_WRITE"
> }, {
> "type" : "USER",
> "name" : "spark",
> "rights" : "READ_WRITE"
> } ],
> "versioning" : "DISABLED",
> "storageType" : "DISK"
> }
> hadoop@1987b5de4203:~$ ./bin/ozone oz -infoKey /dummyvol/mybuck/myk1
> 2018-09-10 17:19:43 WARN NativeCodeLoader:60 - Unable to load native-hadoop 
> library for your platform... using builtin-java classes where applicable
> {
> "version" : 0,
> "md5hash" : null,
> "createdOn" : "Mon, 10 Sep 2018 17:19:04 GMT",
> "modifiedOn" : "Mon, 10 Sep 2018 17:19:04 GMT",
> "size" : 0,
> "keyName" : "myk1",
> "keyLocations" : [ ]
> }{noformat}






[jira] [Reopened] (HDDS-426) Add field modificationTime for Volume and Bucket

2019-10-29 Thread Arpit Agarwal (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-426?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arpit Agarwal reopened HDDS-426:


> Add field modificationTime for Volume and Bucket
> 
>
> Key: HDDS-426
> URL: https://issues.apache.org/jira/browse/HDDS-426
> Project: Hadoop Distributed Data Store
>  Issue Type: Improvement
>  Components: Ozone Manager
>Reporter: Dinesh Chitlangia
>Assignee: YiSheng Lien
>Priority: Major
>  Labels: newbie
>
> There are update operations that can be performed for Volume, Bucket and Key.
> While Key records the modification time, Volume and Bucket do not capture 
> this.
>  
> This Jira proposes to add the required field to Volume and Bucket in order 
> to capture the modificationTime.
>  
> Current Status:
> {noformat}
> hadoop@1987b5de4203:~$ ./bin/ozone oz -infoVolume /dummyvol
> 2018-09-10 17:16:12 WARN NativeCodeLoader:60 - Unable to load native-hadoop 
> library for your platform... using builtin-java classes where applicable
> {
> "owner" : {
> "name" : "bilbo"
> },
> "quota" : {
> "unit" : "TB",
> "size" : 1048576
> },
> "volumeName" : "dummyvol",
> "createdOn" : "Mon, 10 Sep 2018 17:11:32 GMT",
> "createdBy" : "bilbo"
> }
> hadoop@1987b5de4203:~$ ./bin/ozone oz -infoBucket /dummyvol/mybuck
> 2018-09-10 17:15:25 WARN NativeCodeLoader:60 - Unable to load native-hadoop 
> library for your platform... using builtin-java classes where applicable
> {
> "volumeName" : "dummyvol",
> "bucketName" : "mybuck",
> "createdOn" : "Mon, 10 Sep 2018 17:12:09 GMT",
> "acls" : [ {
> "type" : "USER",
> "name" : "hadoop",
> "rights" : "READ_WRITE"
> }, {
> "type" : "GROUP",
> "name" : "users",
> "rights" : "READ_WRITE"
> }, {
> "type" : "USER",
> "name" : "spark",
> "rights" : "READ_WRITE"
> } ],
> "versioning" : "DISABLED",
> "storageType" : "DISK"
> }
> hadoop@1987b5de4203:~$ ./bin/ozone oz -infoKey /dummyvol/mybuck/myk1
> 2018-09-10 17:19:43 WARN NativeCodeLoader:60 - Unable to load native-hadoop 
> library for your platform... using builtin-java classes where applicable
> {
> "version" : 0,
> "md5hash" : null,
> "createdOn" : "Mon, 10 Sep 2018 17:19:04 GMT",
> "modifiedOn" : "Mon, 10 Sep 2018 17:19:04 GMT",
> "size" : 0,
> "keyName" : "myk1",
> "keyLocations" : [ ]
> }{noformat}






[jira] [Reopened] (HDDS-2206) Separate handling for OMException and IOException in the Ozone Manager

2019-10-25 Thread Arpit Agarwal (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-2206?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arpit Agarwal reopened HDDS-2206:
-

Reverted this based on offline conversation with [~aengineer].

Anu has requested we add a config key to control this behavior.

> Separate handling for OMException and IOException in the Ozone Manager
> --
>
> Key: HDDS-2206
> URL: https://issues.apache.org/jira/browse/HDDS-2206
> Project: Hadoop Distributed Data Store
>  Issue Type: Sub-task
>  Components: Ozone Manager
>Reporter: Supratim Deka
>Assignee: Supratim Deka
>Priority: Major
>  Labels: pull-request-available
> Fix For: 0.5.0
>
>  Time Spent: 2h 10m
>  Remaining Estimate: 0h
>
> As part of improving error propagation from the OM for ease of 
> troubleshooting and diagnosis, the proposal is to handle IOExceptions 
> separately from the business exceptions which are thrown as OMExceptions.
> Handling for OMExceptions will not be changed in this jira.
> Handling for IOExceptions will include logging the stacktrace on the server, 
> and propagation to the client under the control of a config parameter.
> Similar handling is also proposed for SCMException.






[jira] [Resolved] (HDDS-2333) Enable sync option for OM non-HA

2019-10-21 Thread Arpit Agarwal (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-2333?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arpit Agarwal resolved HDDS-2333.
-
Fix Version/s: 0.5.0
   Resolution: Fixed

Merged this with [~aengineer]'s +1 on the PR. Thanks for the review Anu and 
thanks Bharat for the contribution.

> Enable sync option for OM non-HA 
> -
>
> Key: HDDS-2333
> URL: https://issues.apache.org/jira/browse/HDDS-2333
> Project: Hadoop Distributed Data Store
>  Issue Type: Task
>Reporter: Bharat Viswanadham
>Assignee: Bharat Viswanadham
>Priority: Major
>  Labels: pull-request-available
> Fix For: 0.5.0
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> In OM non-HA, when the double buffer flushes, it should commit with sync 
> turned on. In non-HA, on a power failure or system crash, operations that 
> were acknowledged by the OM might otherwise be lost (in RocksDB with sync 
> set to false, the flush is asynchronous and will not persist to the storage 
> system).
>  
> In HA, this is not a problem because the guarantee is provided by ratis and 
> the ratis logs.
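
To make the RocksDB behavior concrete, a minimal standalone sketch (this is 
illustrative only, not the actual Ozone double-buffer code; it assumes 
rocksdbjni on the classpath and an arbitrary database path):

{code:java}
import org.rocksdb.Options;
import org.rocksdb.RocksDB;
import org.rocksdb.RocksDBException;
import org.rocksdb.WriteBatch;
import org.rocksdb.WriteOptions;

public class SyncWriteExample {
  public static void main(String[] args) throws RocksDBException {
    RocksDB.loadLibrary();
    try (Options opts = new Options().setCreateIfMissing(true);
         RocksDB db = RocksDB.open(opts, "/tmp/example-db");
         WriteBatch batch = new WriteBatch();
         WriteOptions writeOpts = new WriteOptions().setSync(true)) {
      batch.put("key".getBytes(), "value".getBytes());
      // With sync=true the write is flushed durably before this returns,
      // so an acknowledged operation survives a power failure.
      db.write(writeOpts, batch);
    }
  }
}
{code}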






[jira] [Created] (HDDS-2303) [IGNORE] Test Jira

2019-10-14 Thread Arpit Agarwal (Jira)
Arpit Agarwal created HDDS-2303:
---

 Summary: [IGNORE] Test Jira
 Key: HDDS-2303
 URL: https://issues.apache.org/jira/browse/HDDS-2303
 Project: Hadoop Distributed Data Store
  Issue Type: Bug
Reporter: Arpit Agarwal


Ignore this.






[jira] [Resolved] (HDDS-2213) Reduce key provider loading log level in OzoneFileSystem#getAdditionalTokenIssuers

2019-10-11 Thread Arpit Agarwal (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-2213?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arpit Agarwal resolved HDDS-2213.
-
Fix Version/s: 0.5.0
   Resolution: Fixed

I've merged this.

> Reduce key provider loading log level in 
> OzoneFileSystem#getAdditionalTokenIssuers
> --
>
> Key: HDDS-2213
> URL: https://issues.apache.org/jira/browse/HDDS-2213
> Project: Hadoop Distributed Data Store
>  Issue Type: Improvement
>Reporter: Vivek Ratnavel Subramanian
>Assignee: Shweta
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 0.5.0
>
>  Time Spent: 1h
>  Remaining Estimate: 0h
>
> OzoneFileSystem#getAdditionalTokenIssuers logs an error when a secure client 
> tries to collect an ozone delegation token to run MR/Spark jobs but the 
> ozone file system does not have a KMS provider configured. In this case, we 
> simply return a null provider in the code below. This is a benign error, 
> and we should reduce the log level to debug.
> {code:java}
> KeyProvider keyProvider;
> try {
>   keyProvider = getKeyProvider();
> } catch (IOException ioe) {
>   LOG.error("Error retrieving KeyProvider.", ioe);
>   return null;
> }
> {code}
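
The proposed change is simply to log at debug level instead. A minimal 
self-contained sketch of the pattern (a hypothetical wrapper class for 
illustration, not the actual OzoneFileSystem code):

{code:java}
import java.io.IOException;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

class KeyProviderLookup {
  private static final Logger LOG =
      LoggerFactory.getLogger(KeyProviderLookup.class);

  interface Source {
    Object getKeyProvider() throws IOException;
  }

  static Object keyProviderOrNull(Source source) {
    try {
      return source.getKeyProvider();
    } catch (IOException ioe) {
      // Benign when no KMS provider is configured, so debug is enough.
      LOG.debug("Error retrieving KeyProvider.", ioe);
      return null;
    }
  }
}
{code}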






[jira] [Reopened] (HDFS-14305) Serial number in BlockTokenSecretManager could overlap between different namenodes

2019-10-11 Thread Arpit Agarwal (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-14305?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arpit Agarwal reopened HDFS-14305:
--

Reopening because this still needs to be fixed correctly.

> Serial number in BlockTokenSecretManager could overlap between different 
> namenodes
> --
>
> Key: HDFS-14305
> URL: https://issues.apache.org/jira/browse/HDFS-14305
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode, security
>Reporter: Chao Sun
>Assignee: Konstantin Shvachko
>Priority: Major
>  Labels: multi-sbnn, release-blocker
> Fix For: 2.10.0, 3.3.0, 3.1.4, 3.2.2
>
> Attachments: HDFS-14305-007.patch, HDFS-14305-008.patch, 
> HDFS-14305.001.patch, HDFS-14305.002.patch, HDFS-14305.003.patch, 
> HDFS-14305.004.patch, HDFS-14305.005.patch, HDFS-14305.006.patch
>
>
> Currently, a {{BlockTokenSecretManager}} starts with a random integer as the 
> initial serial number, and then use this formula to rotate it:
> {code:java}
> this.intRange = Integer.MAX_VALUE / numNNs;
> this.nnRangeStart = intRange * nnIndex;
> this.serialNo = (this.serialNo % intRange) + (nnRangeStart);
>  {code}
> where {{numNNs}} is the total number of NameNodes in the cluster, and 
> {{nnIndex}} is the index of the current NameNode specified in the 
> configuration {{dfs.ha.namenodes.<nameservice>}}.
> However, with this approach, different NameNode could have overlapping ranges 
> for serial number. For simplicity, let's assume {{Integer.MAX_VALUE}} is 100, 
> and we have 2 NameNodes {{nn1}} and {{nn2}} in configuration. Then the ranges 
> for these two are:
> {code}
> nn1 -> [-49, 49]
> nn2 -> [1, 99]
> {code}
> This is because the initial serial number could be any negative integer.
> Moreover, when the keys are updated, the serial number will again be updated 
> with the formula:
> {code}
> this.serialNo = (this.serialNo % intRange) + (nnRangeStart);
> {code}
> which means the new serial number could be updated to a range that belongs to 
> a different NameNode, thus increasing the chance of collision again.
> When the collision happens, DataNodes could overwrite an existing key which 
> will cause clients to fail because of {{InvalidToken}} error.
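To make the overlap concrete with the example above: Java's % operator keeps 
the sign of its left operand, so with intRange = 50 an initial serialNo of -30 
on nn2 becomes (-30 % 50) + 50 = 20, which also lies inside nn1's range 
[-49, 49]. A sketch of one collision-free variant, an illustration rather than 
the committed patch, that folds the counter into the owner's half-open range 
with a non-negative modulus:

{code:java}
// Sketch: each NameNode owns the disjoint range
// [rangeStart, rangeStart + intRange) and the counter is always folded
// back into that range with Math.floorMod, which is never negative.
class SerialNumberRange {
  private final int intRange;
  private final int rangeStart;
  private int serialNo;

  SerialNumberRange(int numNNs, int nnIndex, int initialSerial) {
    this.intRange = Integer.MAX_VALUE / numNNs;
    this.rangeStart = intRange * nnIndex;
    // A random (possibly negative) initial value cannot escape into
    // another NameNode's range.
    this.serialNo = rangeStart + Math.floorMod(initialSerial, intRange);
  }

  int next() {
    // Wraps from rangeStart + intRange - 1 back to rangeStart.
    serialNo = rangeStart + Math.floorMod(serialNo + 1, intRange);
    return serialNo;
  }
}
{code}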



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org



[jira] [Resolved] (HDDS-2170) Add Object IDs and Update ID to Volume Object

2019-09-24 Thread Arpit Agarwal (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-2170?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arpit Agarwal resolved HDDS-2170.
-
Fix Version/s: 0.5.0
   Resolution: Fixed

> Add Object IDs and Update ID to Volume Object
> -
>
> Key: HDDS-2170
> URL: https://issues.apache.org/jira/browse/HDDS-2170
> Project: Hadoop Distributed Data Store
>  Issue Type: Improvement
>Reporter: Anu Engineer
>Assignee: Anu Engineer
>Priority: Major
>  Labels: pull-request-available
> Fix For: 0.5.0
>
>  Time Spent: 1.5h
>  Remaining Estimate: 0h
>
> This patch proposes to add an object ID and an update ID when a volume is created. 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org



Re: [DISCUSS] Separate Hadoop Core trunk and Hadoop Ozone trunk source tree

2019-09-18 Thread Arpit Agarwal
+1


> On Sep 17, 2019, at 2:49 AM, Elek, Marton  wrote:
> 
> 
> 
> TLDR; I propose to move Ozone related code out from Hadoop trunk and store it 
> in a separated *Hadoop* git repository apache/hadoop-ozone.git
> 
> 
> 
> 
> When Ozone was adopted as a new Hadoop subproject it was proposed[1] to be 
> part of the source tree but with a separate release cadence, mainly because 
> it had hadoop-trunk/SNAPSHOT as a compile-time dependency.
> 
> During the last Ozone releases this dependency was removed to provide more 
> stable releases. Instead of using the latest trunk/SNAPSHOT build from 
> Hadoop, Ozone uses the latest stable Hadoop (3.2.0 as of now).
> 
> As there is no longer a strict dependency between Hadoop trunk SNAPSHOT and 
> Ozone trunk, I propose to separate the two code bases by creating a 
> new Hadoop git repository (apache/hadoop-ozone.git):
> 
> With moving Ozone to a separated git repository:
> 
> * It would be easier to contribute and understand the build (as of now we 
> always need `-f pom.ozone.xml` as a Maven parameter)
> * It would be possible to adjust build process without breaking Hadoop/Ozone 
> builds.
> * It would be possible to use different Readme/.asf.yaml/github template for 
> the Hadoop Ozone and core Hadoop. (For example the current github template 
> [2] has a link to the contribution guideline [3]. Ozone has an extended 
> version [4] from this guideline with additional information.)
> * Testing would be safer as it won't be possible to change core Hadoop 
> and Hadoop Ozone in the same patch.
> * It would be easier to cut branches for Hadoop releases (based on the 
> original consensus, Ozone should be removed from all the release branches 
> after creating release branches from trunk)
> 
> 
> What do you think?
> 
> Thanks,
> Marton
> 
> [1]: 
> https://lists.apache.org/thread.html/c85e5263dcc0ca1d13cbbe3bcfb53236784a39111b8c353f60582eb4@%3Chdfs-dev.hadoop.apache.org%3E
> [2]: 
> https://github.com/apache/hadoop/blob/trunk/.github/pull_request_template.md
> [3]: https://cwiki.apache.org/confluence/display/HADOOP/How+To+Contribute
> [4]: 
> https://cwiki.apache.org/confluence/display/HADOOP/How+To+Contribute+to+Ozone
> 
> -
> To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org
> For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org
> 


-
To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org



[jira] [Created] (HDDS-2129) Using dist profile fails with pom.ozone.xml as parent pom

2019-09-13 Thread Arpit Agarwal (Jira)
Arpit Agarwal created HDDS-2129:
---

 Summary: Using dist profile fails with pom.ozone.xml as parent pom
 Key: HDDS-2129
 URL: https://issues.apache.org/jira/browse/HDDS-2129
 Project: Hadoop Distributed Data Store
  Issue Type: Bug
Reporter: Arpit Agarwal


The build fails with the {{dist}} profile. Details in a comment below.



--
This message was sent by Atlassian Jira
(v8.3.2#803003)

-
To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org



[jira] [Reopened] (HDDS-2057) Incorrect Default OM Port in Ozone FS URI Error Message

2019-09-13 Thread Arpit Agarwal (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-2057?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arpit Agarwal reopened HDDS-2057:
-

Reverted based on discussion with [~bharatviswa].

Bharat, can you comment with the details?

> Incorrect Default OM Port in Ozone FS URI Error Message
> ---
>
> Key: HDDS-2057
> URL: https://issues.apache.org/jira/browse/HDDS-2057
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>  Components: Ozone Filesystem
>Reporter: Supratim Deka
>Assignee: Supratim Deka
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 0.5.0
>
>  Time Spent: 2.5h
>  Remaining Estimate: 0h
>
> The error message displayed from BasicOzoneFilesystem.initialize specifies 
> 5678 as the OM port. This is not the default port.
> "Ozone file system URL " +
>  "should be one of the following formats: " +
>  "o3fs://bucket.volume/key OR " +
>  "o3fs://bucket.volume.om-host.example.com/key OR " +
>  "o3fs://bucket.volume.om-host.example.com:5678/key";
>  
> This should be fixed to pull the default value from the configuration 
> parameter, instead of a hard-coded value.
>  
>  
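A small sketch of the suggested fix, deriving the example port from 
configuration instead of hard-coding it. The key name and default below 
(ozone.om.port / 9862) are assumptions for illustration, not necessarily the 
exact constants used in the patch:

{code:java}
import org.apache.hadoop.conf.Configuration;

// Sketch: build the example URI in the error message from the configured
// OM port rather than a hard-coded 5678.
class UriErrorMessageSketch {
  static String buildUsage(Configuration conf) {
    int omPort = conf.getInt("ozone.om.port", 9862);  // assumed key/default
    return "Ozone file system URL "
        + "should be one of the following formats: "
        + "o3fs://bucket.volume/key OR "
        + "o3fs://bucket.volume.om-host.example.com/key OR "
        + "o3fs://bucket.volume.om-host.example.com:" + omPort + "/key";
  }
}
{code}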



--
This message was sent by Atlassian Jira
(v8.3.2#803003)

-
To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org



[jira] [Created] (HDDS-2121) Create a shaded ozone file system (client) jar

2019-09-12 Thread Arpit Agarwal (Jira)
Arpit Agarwal created HDDS-2121:
---

 Summary: Create a shaded ozone file system (client) jar
 Key: HDDS-2121
 URL: https://issues.apache.org/jira/browse/HDDS-2121
 Project: Hadoop Distributed Data Store
  Issue Type: Improvement
  Components: build
Reporter: Arpit Agarwal


We need a shaded Hadoop filesystem jar that does not include Hadoop ecosystem 
components (Hadoop, HDFS, Ratis, Zookeeper).

A common expected use case for Ozone is Hadoop clients (3.2.0 and later) 
wanting to access Ozone via the Ozone Filesystem interface. For these clients, 
we want to add the Ozone file system jar to the classpath; however, we want to 
use Hadoop ecosystem dependencies that are `provided` and already expected to 
be in the client classpath.

Note that this is different from the legacy jar which bundles a shaded Hadoop 
3.2.0.



--
This message was sent by Atlassian Jira
(v8.3.2#803003)

-
To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org



Re: [VOTE] Move Submarine source code, documentation, etc. to a separate Apache Git repo

2019-08-29 Thread Arpit Agarwal
+1

> On Aug 23, 2019, at 7:05 PM, Wangda Tan  wrote:
> 
> Hi devs,
> 
> This is a voting thread to move Submarine source code, documentation from
> Hadoop repo to a separate Apache Git repo. Which is based on discussions of
> https://lists.apache.org/thread.html/e49d60b2e0e021206e22bb2d430f4310019a8b29ee5020f3eea3bd95@%3Cyarn-dev.hadoop.apache.org%3E
> 
> Contributors who have permissions to push to Hadoop Git repository will
> have permissions to push to the new Submarine repository.
> 
> This voting thread will run for 7 days and will end at Aug 30th.
> 
> Please let me know if you have any questions.
> 
> Thanks,
> Wangda Tan


-
To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org



[jira] [Created] (HDDS-2046) Fix NOTICE file

2019-08-27 Thread Arpit Agarwal (Jira)
Arpit Agarwal created HDDS-2046:
---

 Summary: Fix NOTICE file
 Key: HDDS-2046
 URL: https://issues.apache.org/jira/browse/HDDS-2046
 Project: Hadoop Distributed Data Store
  Issue Type: Bug
Affects Versions: 0.4.1
Reporter: Arpit Agarwal
Assignee: Arpit Agarwal


NOTICE file needs to be updated based on Justin's comments here:

 

[https://mail-archives.apache.org/mod_mbox/incubator-general/201908.mbox/%3C8EA21F57-A972-4CBE-AC2F-D3830FE6BDB4%40classsoftware.com%3E]

 

 



--
This message was sent by Atlassian Jira
(v8.3.2#803003)

-
To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org



[jira] [Resolved] (HDDS-1829) On OM reload/restart OmMetrics#numKeys should be updated

2019-08-08 Thread Arpit Agarwal (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDDS-1829?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arpit Agarwal resolved HDDS-1829.
-
   Resolution: Fixed
Fix Version/s: 0.5.0

Committed to trunk. Thanks for the contribution [~smeng].

> On OM reload/restart OmMetrics#numKeys should be updated
> 
>
> Key: HDDS-1829
> URL: https://issues.apache.org/jira/browse/HDDS-1829
> Project: Hadoop Distributed Data Store
>  Issue Type: Sub-task
>Reporter: Hanisha Koneru
>Assignee: Siyao Meng
>Priority: Major
>  Labels: pull-request-available
> Fix For: 0.5.0
>
>  Time Spent: 5h 20m
>  Remaining Estimate: 0h
>
> When OM is restarted or the state is reloaded, OM Metrics is re-initialized. 
> The saved numKeys value might not be valid as the DB state could have 
> changed. Hence, the numKeys metric must be updated with the correct value on 
> metrics re-initialization.
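A sketch of the intended behavior, assuming a table API that can report its 
key count; KeyTable and OmMetrics below are simplified stand-ins for the real 
Ozone types:

{code:java}
import java.io.IOException;

// Sketch: on OM start/reload, recompute numKeys from the DB instead of
// reusing a value captured before the restart.
class MetricsReinitSketch {
  interface KeyTable {
    long getEstimatedKeyCount() throws IOException;
  }

  interface OmMetrics {
    void setNumKeys(long value);
  }

  static void reinitialize(KeyTable keyTable, OmMetrics metrics)
      throws IOException {
    // The DB is the source of truth after a restart; the previously saved
    // metric may not reflect keys added or deleted since it was captured.
    metrics.setNumKeys(keyTable.getEstimatedKeyCount());
  }
}
{code}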



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org



[jira] [Resolved] (HDDS-1739) Handle Apply Transaction Failure in State Machine

2019-08-06 Thread Arpit Agarwal (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDDS-1739?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arpit Agarwal resolved HDDS-1739.
-
Resolution: Duplicate

> Handle Apply Transaction Failure in State Machine
> -
>
> Key: HDDS-1739
> URL: https://issues.apache.org/jira/browse/HDDS-1739
> Project: Hadoop Distributed Data Store
>  Issue Type: Sub-task
>  Components: Ozone Datanode
>Reporter: Supratim Deka
>Assignee: Supratim Deka
>Priority: Major
>
> The scope of this jira is to handle failure of applyTransaction() for the 
> Container State Machine.
> 1. Introduce new Replica state - STALE to indicate container is missing 
> transactions. Mark failed container as STALE.
> 2. Trigger immediate ICR to SCM
> 3. Fail new transactions on STALE container
> 4. Notify volume error to the DN (to trigger background volume check)



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org



[jira] [Reopened] (HDDS-1440) Convert all MPU related operations to HA model

2019-07-31 Thread Arpit Agarwal (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDDS-1440?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arpit Agarwal reopened HDDS-1440:
-

> Convert all MPU related operations to HA model
> --
>
> Key: HDDS-1440
> URL: https://issues.apache.org/jira/browse/HDDS-1440
> Project: Hadoop Distributed Data Store
>  Issue Type: Sub-task
>Reporter: Bharat Viswanadham
>Assignee: Bharat Viswanadham
>Priority: Major
> Fix For: 0.5.0
>
>
> In this jira, we shall convert all OM related operations to the OM HA model, 
> which is a two-step process.
>  # StartTransaction, where we validate the request, check for any errors, and 
> return the response.
>  # ApplyTransaction, where the original OM request has a response which 
> needs to be applied to the OM DB. This step just applies the response to the OM DB.
> In this way, all validation failures, such as a volume not being found, or an 
> unmet precondition such as a volume having to be empty before deletion, are 
> caught during startTransaction, and requests that fail there are never 
> written to the raft log.
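A minimal sketch of this two-step flow; the interfaces are illustrative 
placeholders, not the actual OM request classes:

{code:java}
// Sketch of the two-step HA flow: validation failures are caught in
// step 1, before anything is written to the Ratis log.
class TwoPhaseRequestSketch {
  interface Request {
    // e.g. volume exists, bucket is empty before deletion
    void validate() throws IllegalStateException;
  }

  interface Response { }

  interface OmDb {
    Response apply(Request r);
  }

  // Step 1: validate only; nothing reaches the Raft log if this throws.
  static Request startTransaction(Request request) {
    request.validate();
    return request;
  }

  // Step 2: apply the already-validated request to the OM DB.
  static Response applyTransaction(OmDb db, Request validated) {
    return db.apply(validated);
  }
}
{code}

Because validation happens before replication, the Raft log never has to 
record requests that were doomed to fail.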



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org



[jira] [Resolved] (HDDS-1440) Convert all MPU related operations to HA model

2019-07-31 Thread Arpit Agarwal (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDDS-1440?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arpit Agarwal resolved HDDS-1440.
-
Resolution: Done

> Convert all MPU related operations to HA model
> --
>
> Key: HDDS-1440
> URL: https://issues.apache.org/jira/browse/HDDS-1440
> Project: Hadoop Distributed Data Store
>  Issue Type: Sub-task
>Reporter: Bharat Viswanadham
>Assignee: Bharat Viswanadham
>Priority: Major
> Fix For: 0.5.0
>
>
> In this jira, we shall convert all OM related operations to the OM HA model, 
> which is a two-step process.
>  # StartTransaction, where we validate the request, check for any errors, and 
> return the response.
>  # ApplyTransaction, where the original OM request has a response which 
> needs to be applied to the OM DB. This step just applies the response to the OM DB.
> In this way, all validation failures, such as a volume not being found, or an 
> unmet precondition such as a volume having to be empty before deletion, are 
> caught during startTransaction, and requests that fail there are never 
> written to the raft log.



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org



[jira] [Resolved] (HDDS-152) Support HA for Ozone Manager

2019-07-30 Thread Arpit Agarwal (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDDS-152?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arpit Agarwal resolved HDDS-152.

Resolution: Duplicate

> Support HA for Ozone Manager
> 
>
> Key: HDDS-152
> URL: https://issues.apache.org/jira/browse/HDDS-152
> Project: Hadoop Distributed Data Store
>  Issue Type: Sub-task
>Reporter: Xiaoyu Yao
>Assignee: DENG FEI
>Priority: Major
>
> Ozone Manager (OM) provides the name services on top of HDDS (SCM). This 
> ticket is opened to add HA support for OM. 



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org



[jira] [Reopened] (HDDS-152) Support HA for Ozone Manager

2019-07-30 Thread Arpit Agarwal (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDDS-152?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arpit Agarwal reopened HDDS-152:


> Support HA for Ozone Manager
> 
>
> Key: HDDS-152
> URL: https://issues.apache.org/jira/browse/HDDS-152
> Project: Hadoop Distributed Data Store
>  Issue Type: Sub-task
>Reporter: Xiaoyu Yao
>Assignee: DENG FEI
>Priority: Major
>
> Ozone Manager (OM) provides the name services on top of HDDS (SCM). This 
> ticket is opened to add HA support for OM. 



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org



[jira] [Resolved] (HDDS-151) Add HA support for Ozone

2019-07-30 Thread Arpit Agarwal (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDDS-151?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arpit Agarwal resolved HDDS-151.

Resolution: Duplicate

Resolving as a duplicate of HDDS-505. This was filed first, however OM HA 
development has been happening on HDDS-505 for a while now.

> Add HA support for Ozone
> 
>
> Key: HDDS-151
> URL: https://issues.apache.org/jira/browse/HDDS-151
> Project: Hadoop Distributed Data Store
>  Issue Type: New Feature
>Reporter: Xiaoyu Yao
>Assignee: Xiaoyu Yao
>Priority: Major
>
> This includes HA for OM and SCM and their clients.  For OM and SCM, our 
> initial proposal is to use RATIS to ensure consistent/reliable replication of 
> metadata. We will post a design doc and create a separate branch for the 
> feature development.
> cc: [~anu], [~jnpandey], [~szetszwo], [~msingh], [~hellodengfei]



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org



[jira] [Created] (HDDS-1862) Verify result of exceptional completion of notifyInstallSnapshotFromLeader

2019-07-25 Thread Arpit Agarwal (JIRA)
Arpit Agarwal created HDDS-1862:
---

 Summary: Verify result of exceptional completion of 
notifyInstallSnapshotFromLeader
 Key: HDDS-1862
 URL: https://issues.apache.org/jira/browse/HDDS-1862
 Project: Hadoop Distributed Data Store
  Issue Type: Sub-task
 Environment: What happens if the future returned to Ratis from 
{{notifyInstallSnapshotFromLeader}} is completed exceptionally?

The safest option sounds like Ratis should kill the process. Or potentially it 
can retry after a short time.

This jira is to investigate the answer.
Reporter: Arpit Agarwal






--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org



[jira] [Resolved] (HDDS-1830) OzoneManagerDoubleBuffer#stop should wait for daemon thread to die

2019-07-25 Thread Arpit Agarwal (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDDS-1830?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arpit Agarwal resolved HDDS-1830.
-
   Resolution: Fixed
Fix Version/s: 0.5.0

+1

Merged via GitHub. Thanks for the contribution [~smeng] and thanks for the 
review [~bharatviswa]!

> OzoneManagerDoubleBuffer#stop should wait for daemon thread to die
> --
>
> Key: HDDS-1830
> URL: https://issues.apache.org/jira/browse/HDDS-1830
> Project: Hadoop Distributed Data Store
>  Issue Type: Sub-task
>Reporter: Hanisha Koneru
>Assignee: Siyao Meng
>Priority: Major
>  Labels: pull-request-available
> Fix For: 0.5.0
>
>  Time Spent: 1h
>  Remaining Estimate: 0h
>
> Based on [~arp]'s comment on HDDS-1649, OzoneManagerDoubleBuffer#stop() calls 
> interrupt() on the daemon thread but not join(). The thread might still be 
> running when the call returns. 
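For reference, a minimal sketch of the join-after-interrupt pattern being 
asked for here; the class below is a simplified stand-in, not the actual 
OzoneManagerDoubleBuffer:

{code:java}
// Sketch: stop() interrupts the daemon and then joins it, so the flusher
// is guaranteed to be dead before stop() returns.
class DoubleBufferStopSketch {
  private final Thread daemon = new Thread(this::flushLoop, "flusher");
  private volatile boolean running = true;

  void start() {
    daemon.start();
  }

  void stop() {
    running = false;
    daemon.interrupt();
    try {
      daemon.join();  // wait for the thread to actually die
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();  // preserve interrupt status
    }
  }

  private void flushLoop() {
    while (running) {
      // flush batched transactions to the OM DB (elided)
    }
  }
}
{code}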



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org



[jira] [Created] (HDDS-1860) importContainer hard-codes KeyValue Container classes

2019-07-24 Thread Arpit Agarwal (JIRA)
Arpit Agarwal created HDDS-1860:
---

 Summary: importContainer hard-codes KeyValue Container classes
 Key: HDDS-1860
 URL: https://issues.apache.org/jira/browse/HDDS-1860
 Project: Hadoop Distributed Data Store
  Issue Type: Bug
Reporter: Arpit Agarwal


importContainer should use the same instantiation logic as the DataNode startup 
code. It should not hard-code the KeyValueContainer classes:
{code}
KeyValueContainerData containerData =
    new KeyValueContainerData(containerID,
        maxSize, originPipelineId, originNodeId);

KeyValueContainer container =
    new KeyValueContainer(containerData, conf);
{code}

This will break when we revise the container layout. Sharing the 
instantiation logic will also save code.
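A sketch of what shared, layout-driven instantiation could look like; the 
type-driven factory below is illustrative, not the actual DataNode startup 
code:

{code:java}
// Sketch: dispatch on the container type recorded with the replica instead
// of hard-coding KeyValueContainer, so a new layout only touches the
// factory, not every call site such as importContainer.
class ContainerFactorySketch {
  enum ContainerType { KEY_VALUE }

  interface Container { }

  static class KeyValueContainer implements Container {
    KeyValueContainer(long containerId, long maxSize) {
      // construction details elided
    }
  }

  static Container create(ContainerType type, long containerId,
      long maxSize) {
    switch (type) {
    case KEY_VALUE:
      return new KeyValueContainer(containerId, maxSize);
    default:
      throw new IllegalArgumentException("Unknown container type: " + type);
    }
  }
}
{code}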



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org



[jira] [Created] (HDDS-1859) Need a way to determine relative position of OM instances when cluster is down

2019-07-24 Thread Arpit Agarwal (JIRA)
Arpit Agarwal created HDDS-1859:
---

 Summary: Need a way to determine relative position of OM instances 
when cluster is down
 Key: HDDS-1859
 URL: https://issues.apache.org/jira/browse/HDDS-1859
 Project: Hadoop Distributed Data Store
  Issue Type: Sub-task
Reporter: Arpit Agarwal


It will be useful to figure out the relative positions of the OM instances when 
the cluster is down.

In the HDFS world, we can do this by examining the edit log files and the 
{{seen_txid}} file.

In Ozone, this is slightly more difficult. We have to determine
# the relative positions of the Ratis logs, i.e. log index and committed index. 
We may need to build an offline Ratis log parser for this.
# the relative positions of the state machines, i.e. last applied index.



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org



[jira] [Resolved] (HDDS-1315) datanode process dies if it runs out of disk space

2019-07-23 Thread Arpit Agarwal (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDDS-1315?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arpit Agarwal resolved HDDS-1315.
-
Resolution: Duplicate

This was fixed by HDDS-1603. [~Sandeep Nemuri] I've made you the reporter for 
HDDS-1603. Thanks.

> datanode process dies if it runs out of disk space
> --
>
> Key: HDDS-1315
> URL: https://issues.apache.org/jira/browse/HDDS-1315
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>  Components: Ozone Datanode
>Reporter: Sandeep Nemuri
>Assignee: Supratim Deka
>Priority: Major
>
> As of now, the datanode process dies if it runs out of disk space, which makes 
> the data present in that DN inaccessible.
> datanode logs: 
> {code:java}
> 2019-03-11 04:01:27,141 ERROR org.apache.ratis.server.storage.RaftLogWorker: 
> Terminating with exit status 1: 
> fb635e52-e2eb-46b1-b109-a831c10d3bf8-RaftLogWorker failed.
> java.io.FileNotFoundException: 
> /opt/data/meta/ratis/68e315f3-312c-4c9f-a7bd-590194deb5e7/current/log_inprogress_8705582
>  (No space left on device)
>   at java.io.RandomAccessFile.open0(Native Method)
>   at java.io.RandomAccessFile.open(RandomAccessFile.java:316)
>   at java.io.RandomAccessFile.(RandomAccessFile.java:243)
>   at 
> org.apache.ratis.server.storage.LogOutputStream.(LogOutputStream.java:66)
>   at 
> org.apache.ratis.server.storage.RaftLogWorker$StartLogSegment.execute(RaftLogWorker.java:436)
>   at 
> org.apache.ratis.server.storage.RaftLogWorker.run(RaftLogWorker.java:219)
>   at java.lang.Thread.run(Thread.java:745)
> {code}
> {code:java}
> 2019-03-11 04:01:25,531 [grpc-default-executor-9192] INFO   - Operation: 
> WriteChunk : Trace ID:  : Message: java.nio.file.FileSystemException: 
> /opt/data/hdds/a83a7108-91c7-4357-9f68-46753641d429/current/containerDir0/88/chunks/ba29bb91559179cbf7ab5d86cac47ba1_stream_9fb1e802-dca6-46e0-be12-5ac743d8563d_chunk_1.tmp.11076.8705539:
>  No space left on device : Result: IO_EXCEPTION
> 2019-03-11 04:01:25,543 [grpc-default-executor-9192] INFO   - Operation: 
> WriteChunk : Trace ID:  : Message: java.nio.file.FileSystemException: 
> /opt/data/hdds/a83a7108-91c7-4357-9f68-46753641d429/current/containerDir0/86/chunks/19ef3c1d36eadbc9538116c68c6e494f_stream_c58e8b91-dc18-4b61-918f-ab1eeda41c02_chunk_1.tmp.11076.8705540:
>  No space left on device : Result: IO_EXCEPTION
> 2019-03-11 04:01:25,546 [grpc-default-executor-9192] INFO   - Operation: 
> WriteChunk : Trace ID:  : Message: java.nio.file.FileSystemException: 
> /opt/data/hdds/a83a7108-91c7-4357-9f68-46753641d429/current/containerDir0/87/chunks/83a6a81f2f703f49a7e0a1413eebfc4c_stream_cae1ed30-c613-4278-8404-c9e37d0b690f_chunk_1.tmp.11076.8705541:
>  No space left on device : Result: IO_EXCEPTION
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org



[jira] [Created] (HDDS-1838) Fix illegal reflective access warning in KerberosUtil

2019-07-19 Thread Arpit Agarwal (JIRA)
Arpit Agarwal created HDDS-1838:
---

 Summary: Fix illegal reflective access warning in KerberosUtil
 Key: HDDS-1838
 URL: https://issues.apache.org/jira/browse/HDDS-1838
 Project: Hadoop Distributed Data Store
  Issue Type: Bug
Reporter: Arpit Agarwal


Let's fix the following warning. This can be seen when running {{ozonesecure}} 
tests.
{code:java}
WARNING: An illegal reflective access operation has occurred
WARNING: Illegal reflective access by 
org.apache.hadoop.security.authentication.util.KerberosUtil 
(file:/opt/hadoop/share/ozone/lib/hadoop-auth-3.2.0.jar) to method 
sun.security.krb5.Config.getInstance()
WARNING: Please consider reporting this to the maintainers of 
org.apache.hadoop.security.authentication.util.KerberosUtil
WARNING: Use --illegal-access=warn to enable warnings of further illegal 
reflective access operations
WARNING: All illegal access operations will be denied in a future release{code}



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org



[jira] [Created] (HDDS-1837) Move secure tests docker setup to Apache Hadoop repo

2019-07-19 Thread Arpit Agarwal (JIRA)
Arpit Agarwal created HDDS-1837:
---

 Summary: Move secure tests docker setup to Apache Hadoop repo
 Key: HDDS-1837
 URL: https://issues.apache.org/jira/browse/HDDS-1837
 Project: Hadoop Distributed Data Store
  Issue Type: Bug
Reporter: Arpit Agarwal


The {{ozonesecure}} smoke tests download docker image build scripts from 
[https://github.com/ajayydv/docker/.|https://github.com/ajayydv/docker/]

Let's move these to an Apache repo. The code is Apache licensed and contributed 
by an Ozone contributor, so there will be no legal concerns.



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org



[jira] [Created] (HDDS-1831) Use annotation based configuration for OZONE_OM_RATIS_LOG_PURGE_GAP

2019-07-18 Thread Arpit Agarwal (JIRA)
Arpit Agarwal created HDDS-1831:
---

 Summary: Use annotation based configuration for 
OZONE_OM_RATIS_LOG_PURGE_GAP
 Key: HDDS-1831
 URL: https://issues.apache.org/jira/browse/HDDS-1831
 Project: Hadoop Distributed Data Store
  Issue Type: Improvement
Reporter: Arpit Agarwal


HDDS-1649 introduced a new configuration {{OZONE_OM_RATIS_LOG_PURGE_GAP}}. It 
should be defined using the new annotation-based format:

[https://cwiki.apache.org/confluence/display/HADOOP/Java-based+configuration+API]
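For context, a sketch of what the annotation-based definition might look like, 
following the pattern described on the wiki page above; the exact annotation 
attributes and the default value are assumptions and may differ in the final 
patch:

{code:java}
import org.apache.hadoop.hdds.conf.Config;
import org.apache.hadoop.hdds.conf.ConfigGroup;

// Sketch only: attribute names follow the wiki's pattern and the default
// value is illustrative.
@ConfigGroup(prefix = "ozone.om.ratis")
class OmRatisConfigSketch {
  private int logPurgeGap;

  @Config(key = "log.purge.gap",
      defaultValue = "1000000",
      description = "Purge Ratis log entries once the gap between the last "
          + "purged index and the last applied index exceeds this value.")
  public void setLogPurgeGap(int logPurgeGap) {
    this.logPurgeGap = logPurgeGap;
  }

  public int getLogPurgeGap() {
    return logPurgeGap;
  }
}
{code}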

 



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org



[jira] [Resolved] (HDDS-1705) Recon: Add estimatedTotalCount to the response of containers and containers/{id} endpoints

2019-07-08 Thread Arpit Agarwal (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDDS-1705?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arpit Agarwal resolved HDDS-1705.
-
  Resolution: Fixed
   Fix Version/s: 0.4.1
Target Version/s:   (was: 0.5.0)

I've committed this. Thanks for the contribution [~vivekratnavel] and thanks 
for the review [~swagle].

> Recon: Add estimatedTotalCount to the response of containers and 
> containers/{id} endpoints
> --
>
> Key: HDDS-1705
> URL: https://issues.apache.org/jira/browse/HDDS-1705
> Project: Hadoop Distributed Data Store
>  Issue Type: Sub-task
>  Components: Ozone Recon
>Affects Versions: 0.4.0
>Reporter: Vivek Ratnavel Subramanian
>Assignee: Vivek Ratnavel Subramanian
>Priority: Major
>  Labels: pull-request-available
> Fix For: 0.4.1
>
>  Time Spent: 3h 50m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org



[jira] [Resolved] (HDDS-1696) RocksDB use separate Write-ahead-log location for RocksDB.

2019-06-24 Thread Arpit Agarwal (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDDS-1696?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arpit Agarwal resolved HDDS-1696.
-
Resolution: Won't Fix

Resolving as Won't Fix. We can load RocksDB settings from a separate ini file, 
including the WAL location.

> RocksDB use separate Write-ahead-log location for RocksDB.
> --
>
> Key: HDDS-1696
> URL: https://issues.apache.org/jira/browse/HDDS-1696
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>Reporter: Bharat Viswanadham
>Assignee: Bharat Viswanadham
>Priority: Major
>  Time Spent: 1h 40m
>  Remaining Estimate: 0h
>
> This will help on production systems where WAL logs and the actual DB data 
> files will be in different locations. During compaction, it will not affect 
> actual writes.
>  
> Suggested by [~msingh]



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org



[jira] [Resolved] (HDDS-1665) Issue in openKey when allocating block

2019-06-10 Thread Arpit Agarwal (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDDS-1665?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arpit Agarwal resolved HDDS-1665.
-
Resolution: Duplicate

> Issue in openKey when allocating block
> --
>
> Key: HDDS-1665
> URL: https://issues.apache.org/jira/browse/HDDS-1665
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>Reporter: Bharat Viswanadham
>Assignee: Bharat Viswanadham
>Priority: Major
>
> We set size as below



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org



Re: [DISCUSS] Merge HDFS-13891(RBF) to trunk

2019-06-10 Thread Arpit Agarwal
Thanks for the explanation Brahma and Iñigo!

+0 from me (haven’t looked at it closely enough to give a +1).

Regards,
Arpit


> On Jun 10, 2019, at 10:12 AM, Brahma Reddy Battula  wrote:
> 
> Dear Arpit,
> 
> Thanks for taking look into it.
> 
> ECBlockGroupStats.merge() is a utility method which was moved to the 
> hadoop-hdfs-client module. Ideally it could have been a separate jira. But these 
> changes will not induce any issues; I will take necessary action for this.
> 
> On Mon, Jun 10, 2019 at 8:40 PM Arpit Agarwal  <mailto:aagar...@cloudera.com>> wrote:
> I scanned the merge payload for changes to non-RBF code. The changes are 
> minimal, which is good.
> 
> The only commit that I didn’t understand was:
> https://issues.apache.org/jira/browse/HDFS-14268 
> <https://issues.apache.org/jira/browse/HDFS-14268>
> 
> The jira description doesn’t make it clear why ECBlockGroupStats is modified.
> 
> +0 apart from that.
> 
> 
>> On Jun 1, 2019, at 8:40 PM, Brahma Reddy Battula > <mailto:bra...@apache.org>> wrote:
>> 
>> Dear Hadoop Developers
>> 
>> I would like to propose merging the RBF branch (HDFS-13891) into trunk. We have
>> been working on this feature for the last several months.
>> This feature work received the contributions from different companies. All
>> of the feature development happened smoothly and collaboratively in JIRAs.
>> 
>> Kindly do take a look at the branch and raise issues/concerns that need to
>> be addressed before the merge.
>> 
>> *Highlights of HDFS-13891 Branch:*
>> =
>> 
>> Adding Security to RBF(1)
>> Adding Missing Client API's(2)
>> Improvements/Bug Fixing
>>  Critical - HDFS-13637, HDFS-13834
>> 
>> *Commits:*
>> 
>> 
>> No of JIRAs Resolved: 72
>> 
>> All this commits are in RBF Module. No changes in hdfs/common.
>> 
>> *Tested Cluster:*
>> =
>> 
>> Most of these changes were verified at Uber, Microsoft, Huawei and some other
>> companies.
>> 
>> *Uber*: Most changes are running in production @Uber including the critical
>> security changes, HDFS Clusters are 4000+ nodes with 8 HDFS Routers.
>> Zookeeper as a state store to hold delegation tokens were also stress
>> tested to hold more than 2 Million tokens. --CR Hota
>> 
>> *Microsoft*: Most of these changes are currently running in production at
>> Microsoft. The security has also been tested in a 500-server cluster with 4
>> subclusters. --Inigo Goiri
>> 
>> *Huawei*: Deployed all these changes in a 20-node cluster with 3
>> routers. Planning to deploy on a 10K-node production cluster.
>> 
>> *Contributors:*
>> ===
>> 
>> Many thanks to Akira Ajisaka, Mohammad Arshad, Takanobu Asanuma, Shubham
>> Dewan, CR Hota, Fei Hui, Inigo Goiri, Dibyendu Karmakar, Fengna Li, Gang
>> Li, Surendra Singh Lihore, Ranith Sardar, Ayush Saxena, He Xiaoqiao, Sherwood
>> Zheng, Daryn Sharp, VinayaKumar B, Anu Engineer for joining the discussions and
>> contributing to this.
>> 
>> *Future Tasks:*
>> 
>> 
>> Will clean up the jiras under this umbrella and continue the work.
>> 
>> Reference:
>> 1) https://issues.apache.org/jira/browse/HDFS-13532 
>> <https://issues.apache.org/jira/browse/HDFS-13532>
>> 2) https://issues.apache.org/jira/browse/HDFS-13655 
>> <https://issues.apache.org/jira/browse/HDFS-13655>
>> 
>> 
>> 
>> 
>> --Brahma Reddy Battula
> 
> 
> 
> -- 
> 
> 
> 
> --Brahma Reddy Battula



Re: [DISCUSS] Merge HDFS-13891(RBF) to trunk

2019-06-10 Thread Arpit Agarwal
I scanned the merge payload for changes to non-RBF code. The changes are 
minimal, which is good.

The only commit that I didn’t understand was:
https://issues.apache.org/jira/browse/HDFS-14268 


The jira description doesn’t make it clear why ECBlockGroupStats is modified.

+0 apart from that.


> On Jun 1, 2019, at 8:40 PM, Brahma Reddy Battula  wrote:
> 
> Dear Hadoop Developers
> 
> I would like to propose merging the RBF branch (HDFS-13891) into trunk. We have
> been working on this feature for the last several months.
> This feature work received the contributions from different companies. All
> of the feature development happened smoothly and collaboratively in JIRAs.
> 
> Kindly do take a look at the branch and raise issues/concerns that need to
> be addressed before the merge.
> 
> *Highlights of HDFS-13891 Branch:*
> =
> 
> Adding Security to RBF(1)
> Adding Missing Client API's(2)
> Improvements/Bug Fixing
>  Critical - HDFS-13637, HDFS-13834
> 
> *Commits:*
> 
> 
> No of JIRAs Resolved: 72
> 
> All this commits are in RBF Module. No changes in hdfs/common.
> 
> *Tested Cluster:*
> =
> 
> Most of these changes were verified at Uber, Microsoft, Huawei and some other
> companies.
> 
> *Uber*: Most changes are running in production @Uber including the critical
> security changes, HDFS Clusters are 4000+ nodes with 8 HDFS Routers.
> Zookeeper as a state store to hold delegation tokens were also stress
> tested to hold more than 2 Million tokens. --CR Hota
> 
> *Microsoft*: Most of these changes are currently running in production at
> Microsoft. The security has also been tested in a 500-server cluster with 4
> subclusters. --Inigo Goiri
> 
> *Huawei*: Deployed all these changes in a 20-node cluster with 3
> routers. Planning to deploy on a 10K-node production cluster.
> 
> *Contributors:*
> ===
> 
> Many thanks to Akira Ajisaka, Mohammad Arshad, Takanobu Asanuma, Shubham
> Dewan, CR Hota, Fei Hui, Inigo Goiri, Dibyendu Karmakar, Fengna Li, Gang
> Li, Surendra Singh Lihore, Ranith Sardar, Ayush Saxena, He Xiaoqiao, Sherwood
> Zheng, Daryn Sharp, VinayaKumar B, Anu Engineer for joining the discussions and
> contributing to this.
> 
> *Future Tasks:*
> 
> 
> Will clean up the jiras under this umbrella and continue the work.
> 
> Reference:
> 1) https://issues.apache.org/jira/browse/HDFS-13532
> 2) https://issues.apache.org/jira/browse/HDFS-13655
> 
> 
> 
> 
> --Brahma Reddy Battula



[jira] [Resolved] (HDDS-1559) Include committedBytes to determine Out of Space in VolumeChoosingPolicy

2019-05-28 Thread Arpit Agarwal (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDDS-1559?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arpit Agarwal resolved HDDS-1559.
-
  Resolution: Fixed
   Fix Version/s: 0.5.0
Target Version/s:   (was: 0.5.0)

I've committed this to trunk. Thanks for the contribution [~sdeka].

> Include committedBytes to determine Out of Space in VolumeChoosingPolicy
> 
>
> Key: HDDS-1559
> URL: https://issues.apache.org/jira/browse/HDDS-1559
> Project: Hadoop Distributed Data Store
>  Issue Type: Improvement
>  Components: Ozone Datanode
>Reporter: Supratim Deka
>Assignee: Supratim Deka
>Priority: Major
>  Labels: pull-request-available
> Fix For: 0.5.0
>
>  Time Spent: 1h 50m
>  Remaining Estimate: 0h
>
> This is a follow-up from HDDS-1511 and HDDS-1535
> Currently  when creating a new Container, the DN invokes 
> RoundRobinVolumeChoosingPolicy:chooseVolume(). This routine checks for 
> (volume available space > container max size). If no eligible volume is 
> found, the policy throws a DiskOutOfSpaceException. This is the current 
> behaviour.
> However, the computation of available space does not take into consideration 
> the space
> that is going to be consumed by writes to existing containers which are still 
> Open and accepting chunk writes.
> This Jira proposes to enhance the space availability check in chooseVolume by 
> including committed space (committedBytes in HddsVolume) in the equation.
> The handling/management of the exception in Ratis will not be modified in 
> this Jira. That will be scoped separately as part of Datanode IO Failure 
> handling work.
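The enhanced check reduces to one inequality per volume: available minus 
committed must still fit a max-size container. A first-fit sketch of the idea 
(the real policy is round-robin and throws DiskOutOfSpaceException; the 
interface below is a simplified stand-in for HddsVolume):

{code:java}
import java.util.List;

// Sketch: a volume qualifies only if the space not yet claimed by open
// containers can still hold a new max-size container.
class CommittedSpaceCheckSketch {
  interface Volume {
    long getAvailable();       // free bytes on disk
    long getCommittedBytes();  // bytes promised to open containers
  }

  static Volume chooseVolume(List<Volume> volumes, long maxContainerSize) {
    for (Volume v : volumes) {
      if (v.getAvailable() - v.getCommittedBytes() >= maxContainerSize) {
        return v;
      }
    }
    // The real policy throws DiskOutOfSpaceException here.
    throw new IllegalStateException("Out of space on all volumes");
  }
}
{code}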



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org



[jira] [Created] (HDDS-1592) TestReplicationManager failed in pre-commit run

2019-05-25 Thread Arpit Agarwal (JIRA)
Arpit Agarwal created HDDS-1592:
---

 Summary: TestReplicationManager failed in pre-commit run
 Key: HDDS-1592
 URL: https://issues.apache.org/jira/browse/HDDS-1592
 Project: Hadoop Distributed Data Store
  Issue Type: Bug
Reporter: Arpit Agarwal


E.g. https://ci.anzix.net/job/ozone/16892/testReport/

Exception details in comment below.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org



[jira] [Resolved] (HDDS-1535) Space tracking for Open Containers : Handle Node Startup

2019-05-23 Thread Arpit Agarwal (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDDS-1535?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arpit Agarwal resolved HDDS-1535.
-
  Resolution: Fixed
   Fix Version/s: 0.5.0
Target Version/s:   (was: 0.5.0)

I've committed this. Thanks for the contribution [~sdeka].

> Space tracking for Open Containers : Handle Node Startup
> 
>
> Key: HDDS-1535
> URL: https://issues.apache.org/jira/browse/HDDS-1535
> Project: Hadoop Distributed Data Store
>  Issue Type: Improvement
>  Components: Ozone Datanode
>Reporter: Supratim Deka
>Assignee: Supratim Deka
>Priority: Major
>  Labels: pull-request-available
> Fix For: 0.5.0
>
>  Time Spent: 2h
>  Remaining Estimate: 0h
>
> This is related to HDDS-1511
> Space tracking for Open Containers (committed space in the volume) relies on 
> usedBytes in the Container state. usedBytes is not persisted for every update 
> (chunkWrite). So on a node restart the value is stale.
> The proposal is to:
> iterate the block DB for each open container during startup and compute the 
> used space.
> The block DB scan will be accelerated by spawning an executor task for each 
> container.
> This process will be carried out as part of building the container set during 
> startup.
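A sketch of the proposed startup scan, one executor task per open container; 
the Container interface is a simplified stand-in for the real datanode types:

{code:java}
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Sketch: recompute usedBytes for every open container at startup by
// scanning its block DB, parallelized across a fixed thread pool.
class StartupSpaceScanSketch {
  interface Container {
    boolean isOpen();
    long computeUsedBytesFromBlockDb();  // iterate block DB, sum sizes
    void setUsedBytes(long bytes);
  }

  static void rebuildUsedBytes(List<Container> containers, int threads)
      throws InterruptedException {
    ExecutorService pool = Executors.newFixedThreadPool(threads);
    for (Container c : containers) {
      if (c.isOpen()) {
        pool.execute(() -> c.setUsedBytes(c.computeUsedBytesFromBlockDb()));
      }
    }
    pool.shutdown();
    pool.awaitTermination(1, TimeUnit.HOURS);  // bound for the whole scan
  }
}
{code}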



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org



[jira] [Resolved] (HDDS-994) Unable to start OM from secure docker compose

2019-05-15 Thread Arpit Agarwal (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDDS-994?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arpit Agarwal resolved HDDS-994.

Resolution: Resolved

I am resolving this since our docker tests have seen significant changes since 
Jan.

Please reopen if you still see the same issue.

> Unable to start OM from secure docker compose
> -
>
> Key: HDDS-994
> URL: https://issues.apache.org/jira/browse/HDDS-994
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>Reporter: Xiaoyu Yao
>Assignee: Ajay Kumar
>Priority: Major
>
> {code:java}
> om_1        | 2019-01-23 00:50:58 ERROR OzoneManager:418 - Unable to read key 
> pair for OM.
> om_1        | org.apache.hadoop.ozone.security.OzoneSecurityException: Error 
> reading private file for OzoneManager
> om_1        | at 
> org.apache.hadoop.ozone.om.OzoneManager.readKeyPair(OzoneManager.java:460)
> om_1        | at 
> org.apache.hadoop.ozone.om.OzoneManager.startSecretManager(OzoneManager.java:416)
> om_1        | at 
> org.apache.hadoop.ozone.om.OzoneManager.startSecretManagerIfNecessary(OzoneManager.java:980)
> om_1        | at 
> org.apache.hadoop.ozone.om.OzoneManager.start(OzoneManager.java:802)
> om_1        | at 
> org.apache.hadoop.ozone.om.OzoneManager.main(OzoneManager.java:589)
> om_1        | Caused by: java.lang.NullPointerException
> om_1        | at 
> org.apache.hadoop.ozone.om.OzoneManager.readKeyPair(OzoneManager.java:457)
> om_1        | ... 4 more
> om_1        | 2019-01-23 00:50:58 ERROR OzoneManager:593 - Failed to start 
> the OzoneManager.
> om_1        | java.lang.RuntimeException: 
> org.apache.hadoop.ozone.security.OzoneSecurityException: Error reading 
> private file for OzoneManager
> om_1        | at 
> org.apache.hadoop.ozone.om.OzoneManager.startSecretManager(OzoneManager.java:419)
> om_1        | at 
> org.apache.hadoop.ozone.om.OzoneManager.startSecretManagerIfNecessary(OzoneManager.java:980)
> om_1        | at 
> org.apache.hadoop.ozone.om.OzoneManager.start(OzoneManager.java:802)
> om_1        | at 
> org.apache.hadoop.ozone.om.OzoneManager.main(OzoneManager.java:589)
> om_1        | Caused by: 
> org.apache.hadoop.ozone.security.OzoneSecurityException: Error reading 
> private file for OzoneManager
> om_1        | at 
> org.apache.hadoop.ozone.om.OzoneManager.readKeyPair(OzoneManager.java:460)
> om_1        | at 
> org.apache.hadoop.ozone.om.OzoneManager.startSecretManager(OzoneManager.java:416)
> om_1        | ... 3 more
> om_1        | Caused by: java.lang.NullPointerException
> om_1        | at 
> org.apache.hadoop.ozone.om.OzoneManager.readKeyPair(OzoneManager.java:457)
> om_1        | ... 4 more
> om_1        | 2019-01-23 00:50:58 INFO  ExitUtil:210 - Exiting with status 1: 
> java.lang.RuntimeException: 
> org.apache.hadoop.ozone.security.OzoneSecurityException: Error reading 
> private file for OzoneManager{code}
>  
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org



Re: [VOTE] Unprotect HDFS-13891 (HDFS RBF Branch)

2019-05-14 Thread Arpit Agarwal
The request is specific to HDFS-13891, correct?

We should not allow force push on trunk.


> On May 14, 2019, at 8:07 AM, Anu Engineer  
> wrote:
> 
> Is it possible to unprotect the branches and not the trunk? Generally, a
> force push to trunk indicates a mistake and we have had that in the past.
> This is just a suggestion,  even if this request is not met, I am still +1.
> 
> Thanks
> Anu
> 
> 
> 
> On Tue, May 14, 2019 at 4:58 AM Takanobu Asanuma 
> wrote:
> 
>> +1.
>> 
>> Thanks!
>> - Takanobu
>> 
>> 
>> From: Akira Ajisaka 
>> Sent: Tuesday, May 14, 2019 4:26:30 PM
>> To: Giovanni Matteo Fumarola
>> Cc: Iñigo Goiri; Brahma Reddy Battula; Hadoop Common; Hdfs-dev
>> Subject: Re: [VOTE] Unprotect HDFS-13891 (HDFS RBF Branch)
>> 
>> +1 to unprotect the branch.
>> 
>> Thanks,
>> Akira
>> 
>> On Tue, May 14, 2019 at 3:11 PM Giovanni Matteo Fumarola
>>  wrote:
>>> 
>>> +1 to unprotect the branches for rebases.
>>> 
>>> On Mon, May 13, 2019 at 11:01 PM Iñigo Goiri  wrote:
>>> 
>>>> Syncing the branch to trunk should be a fairly standard task.
>>>> Is there a way to do this without rebasing and forcing the push?
>>>> As far as I know this has been the standard for other branches and I
>>>> don't know of any alternative.
>>>> We should clarify the process, as having to get PMC consensus to rebase
>>>> a branch seems a little overkill to me.
>>>> 
>>>> +1 from my side to unprotect the branch to do the rebase.
>>>> 
>>>> On Mon, May 13, 2019, 22:46 Brahma Reddy Battula 
>>>> wrote:
>>>> 
>>>>> Hi Folks,
>>>>> 
>>>>> INFRA-18181 made all the Hadoop branches protected.
>>>>> Unfortunately, the HDFS-13891 branch needs to be rebased as we
>>>>> contribute core patches to trunk. So, currently we are stuck with the
>>>>> rebase as force pushes are not allowed. Hence raised INFRA-18361.
>>>>> 
>>>>> Can we have a quick vote for INFRA sign-off to proceed, as this is
>>>>> blocking all branch commits?
>>>>> 
>>>>> --
>>>>> 
>>>>> 
>>>>> 
>>>>> --Brahma Reddy Battula
>>>>> 
>> 
>> -
>> To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org
>> For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org
>> 
>> 


-
To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org



[jira] [Created] (HDDS-1524) Ozone Developer Documentation Project

2019-05-13 Thread Arpit Agarwal (JIRA)
Arpit Agarwal created HDDS-1524:
---

 Summary: Ozone Developer Documentation Project
 Key: HDDS-1524
 URL: https://issues.apache.org/jira/browse/HDDS-1524
 Project: Hadoop Distributed Data Store
  Issue Type: Task
Reporter: Arpit Agarwal


There are many great developer design documents for Ozone features and 
sub-tasks. Most of them are attached to different jiras as PDFs.

We should collect them, convert to Markdown and make them part of the Ozone 
documentation under a separate 'Design Docs' section for easy reference.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org



Re: VOTE: Hadoop Ozone 0.4.0-alpha RC2

2019-05-05 Thread Arpit Agarwal
Thanks for building this RC Ajay.

+1 binding.

- verified signatures and checksums
- built from source 
- deployed to 3 node cluster, tried out basic operations
- ran smoke tests
- ran unit tests
- LICENSE/NOTICE files look ok

There is an extra file in the source root named JenkinsFile.



> On Apr 29, 2019, at 9:04 PM, Ajay Kumar  wrote:
> 
> Hi All,
> 
> 
> 
> We have created the third release candidate (RC2) for Apache Hadoop Ozone 
> 0.4.0-alpha.
> 
> 
> 
> This release contains security payload for Ozone. Below are some important 
> features in it:
> 
> 
> 
>  *   Hadoop Delegation Tokens and Block Tokens supported for Ozone.
>  *   Transparent Data Encryption (TDE) Support - Allows data blocks to be 
> encrypted-at-rest.
>  *   Kerberos support for Ozone.
>  *   Certificate Infrastructure for Ozone  - Tokens use PKI instead of shared 
> secrets.
>  *   Datanode to Datanode communication secured via mutual TLS.
>  *   Ability to secure an Ozone cluster that works with Yarn, Hive, and Spark.
>  *   Skaffold support to deploy Ozone clusters on K8s.
>  *   Support for S3 authentication mechanisms like the S3 v4 authentication 
> protocol.
>  *   S3 Gateway supports Multipart upload.
>  *   S3A file system is tested and supported.
>  *   Support for Tracing and Profiling for all Ozone components.
>  *   Audit Support - including Audit Parser tools.
>  *   Apache Ranger Support in Ozone.
>  *   Extensive failure testing for Ozone.
> 
> The RC artifacts are available at 
> https://home.apache.org/~ajay/ozone-0.4.0-alpha-rc2/
> 
> 
> 
> The RC tag in git is ozone-0.4.0-alpha-RC2 (git hash 
> 4ea602c1ee7b5e1a5560c6cbd096de4b140f776b)
> 
> 
> 
> Please try it out, vote, or just give us feedback.
> 
> 
> 
> The vote will run for 5 days, ending on May 4, 2019, 04:00 UTC.
> 
> 
> 
> Thank you very much,
> 
> Ajay


-
To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org



[jira] [Created] (HDDS-1489) Unnecessary log messages on console with Ozone shell

2019-05-03 Thread Arpit Agarwal (JIRA)
Arpit Agarwal created HDDS-1489:
---

 Summary: Unnecessary log messages on console with Ozone shell 
 Key: HDDS-1489
 URL: https://issues.apache.org/jira/browse/HDDS-1489
 Project: Hadoop Distributed Data Store
  Issue Type: Bug
  Components: Ozone CLI
Reporter: Arpit Agarwal


The following log messages are printed on the console when running putkey
{code}
$ ozone sh key put /vol1/bucket1/key1 myfile
2019-05-03 23:25:15 INFO  GrpcClientProtocolClient:254 - 
client-9A5E39BD681D->96c8bede-ba3f-4e01-86d8-53f97957f140: receive 
RaftClientReply:client-9A5E39BD681D->96c8bede-ba3f-4e01-86d8-53f97957f140@group-8B0913807C4D,
 cid=0, SUCCESS, logIndex=1, commits[96c8bede-ba3f-4e01-86d8-53f97957f140:c2]
2019-05-03 23:25:16 INFO  GrpcClientProtocolClient:254 - 
client-9A5E39BD681D->96c8bede-ba3f-4e01-86d8-53f97957f140: receive 
RaftClientReply:client-9A5E39BD681D->96c8bede-ba3f-4e01-86d8-53f97957f140@group-8B0913807C4D,
 cid=1, SUCCESS, logIndex=3, commits[96c8bede-ba3f-4e01-86d8-53f97957f140:c4]
{code}

These are unnecessary noise and should be suppressed by default.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org



[jira] [Created] (HDDS-1473) DataNode ID file should be human readable

2019-04-26 Thread Arpit Agarwal (JIRA)
Arpit Agarwal created HDDS-1473:
---

 Summary: DataNode ID file should be human readable
 Key: HDDS-1473
 URL: https://issues.apache.org/jira/browse/HDDS-1473
 Project: Hadoop Distributed Data Store
  Issue Type: Improvement
  Components: Ozone Datanode
Reporter: Arpit Agarwal


The DataNode ID file should be human readable to make debugging easier. We 
should use YAML as we have used it elsewhere for meta files.

Currently it is a binary file whose contents are protobuf encoded. This is a 
tiny file read once on startup, so performance is not a concern.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org



[jira] [Created] (HDDS-1465) Document the container replica state machine

2019-04-24 Thread Arpit Agarwal (JIRA)
Arpit Agarwal created HDDS-1465:
---

 Summary: Document the container replica state machine
 Key: HDDS-1465
 URL: https://issues.apache.org/jira/browse/HDDS-1465
 Project: Hadoop Distributed Data Store
  Issue Type: Improvement
  Components: documentation
Reporter: Arpit Agarwal
Assignee: Arpit Agarwal


Let's document the container states and the state transitions.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org



Re: VOTE: Hadoop Ozone 0.4.0-alpha RC1

2019-04-22 Thread Arpit Agarwal
Thanks Ajay for putting together this RC.

Unfortunately HDDS-1425  looks 
like a blocker. We should make the docker experience smooth for anyone trying 
out 0.4.0.

I’ve just committed Marton’s patch for HDDS-1425 this morning. Let’s roll a new 
RC.



> On Apr 15, 2019, at 4:09 PM, Ajay Kumar  wrote:
> 
> Hi all,
> 
> We have created the second release candidate (RC1) for Apache Hadoop Ozone 
> 0.4.0-alpha.
> 
> This release contains security payload for Ozone. Below are some important 
> features in it:
> 
>  *   Hadoop Delegation Tokens and Block Tokens supported for Ozone.
>  *   Transparent Data Encryption (TDE) Support - Allows data blocks to be 
> encrypted-at-rest.
>  *   Kerberos support for Ozone.
>  *   Certificate Infrastructure for Ozone  - Tokens use PKI instead of shared 
> secrets.
>  *   Datanode to Datanode communication secured via mutual TLS.
>  *   Ability to secure an Ozone cluster that works with Yarn, Hive, and Spark.
>  *   Skaffold support to deploy Ozone clusters on K8s.
>  *   Support for S3 authentication mechanisms like the S3 v4 authentication 
> protocol.
>  *   S3 Gateway supports Multipart upload.
>  *   S3A file system is tested and supported.
>  *   Support for Tracing and Profiling for all Ozone components.
>  *   Audit Support - including Audit Parser tools.
>  *   Apache Ranger Support in Ozone.
>  *   Extensive failure testing for Ozone.
> 
> The RC artifacts are available at 
> https://home.apache.org/~ajay/ozone-0.4.0-alpha-rc1
> 
> The RC tag in git is ozone-0.4.0-alpha-RC1 (git hash 
> d673e16d14bb9377f27c9017e2ffc1bcb03eebfb)
> 
> Please try it out, vote, or just give us feedback.
> 
> The vote will run for 5 days, ending on April 20, 2019, 19:00 UTC.
> 
> Thank you very much,
> 
> Ajay
> 
> 



[jira] [Resolved] (HDDS-1425) Ozone compose files are not compatible with the latest docker-compose

2019-04-22 Thread Arpit Agarwal (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDDS-1425?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arpit Agarwal resolved HDDS-1425.
-
   Resolution: Fixed
Fix Version/s: 0.4.1

I've merged this to ozone-0.4. Thanks for the backport patch also [~elek].

If we decide to roll a new 0.4.0 RC let's update the fix version to 0.4.0.

> Ozone compose files are not compatible with the latest docker-compose
> -
>
> Key: HDDS-1425
> URL: https://issues.apache.org/jira/browse/HDDS-1425
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>Reporter: Elek, Marton
>Assignee: Elek, Marton
>Priority: Blocker
>  Labels: pull-request-available
> Fix For: 0.4.1, 0.5.0
>
>  Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
> I upgraded my docker-compose to the latest available one (1.24.0).
> But after the upgrade I can't start the docker-compose based cluster any more:
> {code}
> ./test.sh 
> -
> Executing test(s): [basic]
>   Cluster type:  ozone
>   Compose file:  
> /home/elek/projects/hadoop-review/hadoop-ozone/dist/target/ozone-0.4.0-SNAPSHOT/smoketest/../compose/ozone/docker-compose.yaml
>   Output dir:
> /home/elek/projects/hadoop-review/hadoop-ozone/dist/target/ozone-0.4.0-SNAPSHOT/smoketest/result
>   Command to rerun:  ./test.sh --keep --env ozone basic
> -
> ERROR: In file 
> /home/elek/projects/hadoop-review/hadoop-ozone/dist/target/ozone-0.4.0-SNAPSHOT/compose/ozone/docker-config:
>  environment variable name 'LOG4J2.PROPERTIES_appender.rolling.file 
> {code}
> It turned out that the LOG4J2.PROPERTIES_appender.rolling.file line 
> contains an unnecessary space which is no longer accepted by the latest 
> docker-compose.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org



[jira] [Resolved] (HDFS-14441) Provide a way to query INodeId (fileId) via Hadoop shell

2019-04-18 Thread Arpit Agarwal (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-14441?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arpit Agarwal resolved HDFS-14441.
--
Resolution: Duplicate

Thanks for the pointer [~ayushtkn]!

> Provide a way to query INodeId (fileId) via Hadoop shell
> 
>
> Key: HDFS-14441
> URL: https://issues.apache.org/jira/browse/HDFS-14441
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: shell
>    Reporter: Arpit Agarwal
>Priority: Major
>
> There doesn't seem to be a way to get the INodeId of a file using the Hadoop 
> shell, e.g. stat.
> It can be obtained via a webhdfs LISTSTATUS request.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org



[jira] [Created] (HDFS-14441) Provide a way to query INodeId (fileId) via Hadoop shell

2019-04-18 Thread Arpit Agarwal (JIRA)
Arpit Agarwal created HDFS-14441:


 Summary: Provide a way to query INodeId (fileId) via Hadoop shell
 Key: HDFS-14441
 URL: https://issues.apache.org/jira/browse/HDFS-14441
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: shell
Reporter: Arpit Agarwal


There doesn't seem to be a way to get the INodeId of a file using the Hadoop 
shell, e.g. stat.

It can be obtained via a webhdfs LISTSTATUS request.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org



[jira] [Created] (HDDS-1441) Remove usage of getRetryFailureException

2019-04-15 Thread Arpit Agarwal (JIRA)
Arpit Agarwal created HDDS-1441:
---

 Summary: Remove usage of getRetryFailureException
 Key: HDDS-1441
 URL: https://issues.apache.org/jira/browse/HDDS-1441
 Project: Hadoop Distributed Data Store
  Issue Type: Improvement
Reporter: Arpit Agarwal


Per [~szetszwo]'s comment on RATIS-518, we can remove the usage of 
getRetryFailureException.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org



[jira] [Resolved] (HDDS-1387) ConcurrentModificationException in TestMiniChaosOzoneCluster

2019-04-12 Thread Arpit Agarwal (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDDS-1387?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arpit Agarwal resolved HDDS-1387.
-
  Resolution: Fixed
   Fix Version/s: 0.5.0
Target Version/s:   (was: 0.5.0)

I've committed this. Thanks [~elek], [~nandakumar131].

> ConcurrentModificationException in TestMiniChaosOzoneCluster
> 
>
> Key: HDDS-1387
> URL: https://issues.apache.org/jira/browse/HDDS-1387
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>  Components: test
>Reporter: Nanda kumar
>Assignee: Elek, Marton
>Priority: Major
>  Labels: ozone-flaky-test, pull-request-available
> Fix For: 0.5.0
>
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> TestMiniChaosOzoneCluster is failing with the below exception
> {noformat}
> [ERROR] org.apache.hadoop.ozone.TestMiniChaosOzoneCluster  Time elapsed: 
> 265.679 s  <<< ERROR!
> java.util.ConcurrentModificationException
>   at java.util.ArrayList$Itr.checkForComodification(ArrayList.java:909)
>   at java.util.ArrayList$Itr.next(ArrayList.java:859)
>   at 
> org.apache.hadoop.ozone.MiniOzoneClusterImpl.stop(MiniOzoneClusterImpl.java:350)
>   at 
> org.apache.hadoop.ozone.MiniOzoneClusterImpl.shutdown(MiniOzoneClusterImpl.java:325)
>   at 
> org.apache.hadoop.ozone.MiniOzoneChaosCluster.shutdown(MiniOzoneChaosCluster.java:130)
>   at 
> org.apache.hadoop.ozone.TestMiniChaosOzoneCluster.shutdown(TestMiniChaosOzoneCluster.java:92)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47)
>   at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>   at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44)
>   at 
> org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:33)
>   at org.junit.runners.ParentRunner.run(ParentRunner.java:309)
>   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:365)
>   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.executeWithRerun(JUnit4Provider.java:273)
>   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:238)
>   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:159)
>   at 
> org.apache.maven.surefire.booter.ForkedBooter.invokeProviderInSameClassLoader(ForkedBooter.java:384)
>   at 
> org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(ForkedBooter.java:345)
>   at 
> org.apache.maven.surefire.booter.ForkedBooter.execute(ForkedBooter.java:126)
>   at 
> org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:418)
> {noformat}
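
The usual shape of this class of fix, as a sketch (the committed patch may
differ): iterate over a snapshot of the datanode list so a concurrent
modification during shutdown cannot invalidate the iterator.

{code}
// Assumes a field List<HddsDatanodeService> hddsDatanodes, as in
// MiniOzoneClusterImpl. Copying first makes the iteration safe even if
// another thread mutates the list while the cluster is stopping.
List<HddsDatanodeService> snapshot = new ArrayList<>(hddsDatanodes);
for (HddsDatanodeService datanode : snapshot) {
  datanode.stop();
}
{code}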



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org



[jira] [Created] (HDDS-1429) Avoid using common fork join pool

2019-04-11 Thread Arpit Agarwal (JIRA)
Arpit Agarwal created HDDS-1429:
---

 Summary: Avoid using common fork join pool
 Key: HDDS-1429
 URL: https://issues.apache.org/jira/browse/HDDS-1429
 Project: Hadoop Distributed Data Store
  Issue Type: Improvement
Reporter: Arpit Agarwal
Assignee: Arpit Agarwal


After enabling thread context in Ozone log messages, we see some Ratis 
operations being executed in the common fork-join pool. E.g.

{code}
2019-04-11 14:25:54,583 ForkJoinPool.commonPool-worker-3 INFO  
storage.RaftLogWorker (RaftLogWorker.java:rollLogSegment(303)) - 
3e59bcbc-e6f0-4681-b8a6-76e163e9ff19-RaftLogWorker: Rolling segment log-0_70 to 
index:70
{code}

and

{code}
2019-04-11 14:25:56,715 ForkJoinPool.commonPool-worker-1 INFO  
client.GrpcClientProtocolService 
(GrpcClientProtocolService.java:lambda$processClientRequest$0(264)) - Failed 
RaftClientRequest:client-B5832CAE4B89->8a182bb4-96a8-42e8-a7da-549c9663fc30@group-7136CD304607,
 cid=333, seq=0, Watch-ALL_COMMITTED(79), Message:, 
reply=RaftClientReply:client-B5832CAE4B89->8a182bb4-96a8-42e8-a7da-549c9663fc30@group-7136CD304607,
 cid=333, FAILED org.apache.ratis.protocol.NotLeaderException: Server 
8a182bb4-96a8-42e8-a7da-549c9663fc30 is not the leader 
(beff1b4d-05f9-4f7d-a6f2-1405b0950e8c:10.22.8.149:57253). Request must be sent 
to leader., logIndex=0, commits[8a182bb4-96a8-42e8-a7da-549c9663fc30:c129, 
48a1cbdc-86a2-40d3-9831-13e044923ee4:c72, 
beff1b4d-05f9-4f7d-a6f2-1405b0950e8c:c129]
{code}

It's better to use a dedicated ExecutorService or ForkJoinPool instead.
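
A minimal sketch of the suggested direction (illustrative, not the actual
Ratis/Ozone code): CompletableFuture.runAsync() without an executor argument
runs on ForkJoinPool.commonPool(), so passing a dedicated executor keeps the
work off the shared pool, and a named thread factory makes log lines easier to
attribute than "ForkJoinPool.commonPool-worker-3".

{code}
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class DedicatedPoolExample {
  private static final ExecutorService RAFT_EXECUTOR =
      Executors.newFixedThreadPool(4, r -> new Thread(r, "raft-log-worker"));

  static CompletableFuture<Void> rollLogSegment() {
    // Bad: CompletableFuture.runAsync(DedicatedPoolExample::doRoll) would run
    // on the JVM-wide common pool.
    return CompletableFuture.runAsync(DedicatedPoolExample::doRoll, RAFT_EXECUTOR);
  }

  private static void doRoll() { /* roll the segment */ }
}
{code}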



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org



[jira] [Created] (HDDS-1428) Remove benign warning in handleCreateContainer

2019-04-11 Thread Arpit Agarwal (JIRA)
Arpit Agarwal created HDDS-1428:
---

 Summary: Remove benign warning in handleCreateContainer
 Key: HDDS-1428
 URL: https://issues.apache.org/jira/browse/HDDS-1428
 Project: Hadoop Distributed Data Store
  Issue Type: Improvement
Reporter: Arpit Agarwal


The following log message in KeyValueHandler#handleCreateContainer can be 
removed or moved to _debug_ level.

{code}
// The create container request for an already existing container can
// arrive in case the ContainerStateMachine reapplies the transaction
// on datanode restart. Just log a warning msg here.
LOG.warn("Container already exists." +
"container Id " + containerID);
{code}
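
The change itself is small; a sketch of the demoted message (parameterized
logging also avoids the string concatenation in the original):

{code}
// Debug level: expected during ContainerStateMachine replay on restart.
LOG.debug("Container already exists. containerID: {}", containerID);
{code}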



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org



[jira] [Created] (HDDS-1427) Differentiate log messages by service instance in test output

2019-04-11 Thread Arpit Agarwal (JIRA)
Arpit Agarwal created HDDS-1427:
---

 Summary: Differentiate log messages by service instance in test 
output
 Key: HDDS-1427
 URL: https://issues.apache.org/jira/browse/HDDS-1427
 Project: Hadoop Distributed Data Store
  Issue Type: Improvement
  Components: test
Reporter: Arpit Agarwal


When running tests, the log output from multiple services is interleaved. This 
makes it very hard to follow the sequence of events.

This is especially seen with MiniOzoneChaosCluster which starts 20 DataNodes in 
the same process.

One way we can do this is by using [Log4j 
NDC|https://logging.apache.org/log4j/1.2/apidocs/org/apache/log4j/NDC.html] or 
[slf4j|https://www.slf4j.org/api/org/slf4j/MDC.html] to print the PID and 
thread name/thread ID. It probably won't be a simple change.
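
A minimal sketch of the slf4j MDC approach (illustrative; the key name
"service" and the wrapper class are assumptions, not existing Ozone code):

{code}
import org.slf4j.MDC;

public final class ServiceLogContext {
  // Tag every log line emitted by this task with a service identifier,
  // e.g. "datanode-17"; the log4j pattern can then reference %X{service}.
  public static void runAs(String serviceId, Runnable task) {
    MDC.put("service", serviceId);
    try {
      task.run();
    } finally {
      MDC.remove("service");
    }
  }
}
{code}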



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org



[jira] [Created] (HDDS-1426) Minor logging improvements for MiniOzoneChaosCluster

2019-04-11 Thread Arpit Agarwal (JIRA)
Arpit Agarwal created HDDS-1426:
---

 Summary: Minor logging improvements for MiniOzoneChaosCluster
 Key: HDDS-1426
 URL: https://issues.apache.org/jira/browse/HDDS-1426
 Project: Hadoop Distributed Data Store
  Issue Type: Bug
Reporter: Arpit Agarwal
Assignee: Arpit Agarwal


- Add log messages when starting/stopping services
- Change log file name so output files are sorted by date/time. Also {{:}} is 
not a valid file character on some platforms.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org



[jira] [Created] (HDDS-1422) Exception during DataNode shutdown

2019-04-10 Thread Arpit Agarwal (JIRA)
Arpit Agarwal created HDDS-1422:
---

 Summary: Exception during DataNode shutdown
 Key: HDDS-1422
 URL: https://issues.apache.org/jira/browse/HDDS-1422
 Project: Hadoop Distributed Data Store
  Issue Type: Bug
Reporter: Arpit Agarwal
Assignee: Arpit Agarwal


The following exception during DN shutdown should be avoided, as it adds noise 
to the logs and is not a real issue.
{code}
2019-04-10 17:48:27,307 WARN  volume.VolumeSet 
(VolumeSet.java:getNodeReport(476)) - Failed to get scmUsed and remaining for 
container storage location 
/Users/agarwal/src/hadoop/hadoop-ozone/integration-test/target/test/data/MiniOzoneClusterImpl-f4d89966-146a-4690-8841-36af1993522f/datanode-17/data/containers
java.io.IOException: Volume Usage thread is not running. This error is usually 
seen during DataNode shutdown.
  at 
org.apache.hadoop.ozone.container.common.volume.VolumeInfo.getScmUsed(VolumeInfo.java:119)
  at 
org.apache.hadoop.ozone.container.common.volume.VolumeSet.getNodeReport(VolumeSet.java:472)
  at 
org.apache.hadoop.ozone.container.ozoneimpl.OzoneContainer.getNodeReport(OzoneContainer.java:238)
  at 
org.apache.hadoop.ozone.container.common.states.endpoint.RegisterEndpointTask.call(RegisterEndpointTask.java:115)
  at 
org.apache.hadoop.ozone.container.common.states.endpoint.RegisterEndpointTask.call(RegisterEndpointTask.java:47)
  at java.util.concurrent.FutureTask.run(FutureTask.java:266)
  at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
  at java.util.concurrent.FutureTask.run(FutureTask.java:266)
  at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
  at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
  at java.lang.Thread.run(Thread.java:748)
{code}
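
One possible shape of the fix, as a sketch (the flag and method names here are
hypothetical, not the actual VolumeInfo code): treat a stopped usage thread as
an expected shutdown condition instead of throwing.

{code}
long getScmUsed() {
  if (!usageThreadRunning) {   // hypothetical flag, set false on shutdown
    LOG.debug("Volume usage thread stopped; skipping usage query during shutdown.");
    return 0L;                 // callers treat 0 as "unknown" in this sketch
  }
  return usage.getUsed();
}
{code}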



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org



[jira] [Created] (HDDS-1421) Avoid unnecessary object allocations in TracingUtil

2019-04-10 Thread Arpit Agarwal (JIRA)
Arpit Agarwal created HDDS-1421:
---

 Summary: Avoid unnecessary object allocations in TracingUtil
 Key: HDDS-1421
 URL: https://issues.apache.org/jira/browse/HDDS-1421
 Project: Hadoop Distributed Data Store
  Issue Type: Bug
  Components: tracing
Reporter: Arpit Agarwal
Assignee: Arpit Agarwal


Avoid unnecessary object allocations in TracingUtil#exportCurrentSpan and 
#exportSpan.
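
A sketch of the allocation-avoidance idea (not the actual TracingUtil code):
short-circuit before allocating anything when there is no active span, so the
common hot path allocates nothing. The encodeSpan helper is a placeholder.

{code}
import io.opentracing.Span;
import io.opentracing.Tracer;

public final class TracingSketch {
  public static String exportCurrentSpan(Tracer tracer) {
    Span span = tracer.activeSpan();
    if (span == null) {
      return null;                      // hot path: zero allocations
    }
    // Only pay for carrier/builder objects when a span actually exists.
    return encodeSpan(span);
  }

  private static String encodeSpan(Span span) {
    return span.context().toString();   // placeholder encoding
  }
}
{code}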



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org



[jira] [Created] (HDDS-1420) Tracing exception in DataNode HddsDispatcher

2019-04-10 Thread Arpit Agarwal (JIRA)
Arpit Agarwal created HDDS-1420:
---

 Summary: Tracing exception in DataNode HddsDispatcher
 Key: HDDS-1420
 URL: https://issues.apache.org/jira/browse/HDDS-1420
 Project: Hadoop Distributed Data Store
  Issue Type: Bug
  Components: tracing, Ozone Datanode
Reporter: Arpit Agarwal


The following exception is seen in some unit tests:
{code}
2019-04-10 13:00:27,537 WARN  
internal.PropagationRegistry$ExceptionCatchingExtractorDecorator 
(PropagationRegistry.java:extract(60)) - Error when extracting SpanContext from 
carrier. Handling gracefully.
io.jaegertracing.internal.exceptions.MalformedTracerStateStringException: 
String does not match tracer state format: 90041ce6-81f3-4733-8e2b-6aceaa697b77
at org.apache.hadoop.hdds.tracing.StringCodec.extract(StringCodec.java:49)
at org.apache.hadoop.hdds.tracing.StringCodec.extract(StringCodec.java:34)
at 
io.jaegertracing.internal.PropagationRegistry$ExceptionCatchingExtractorDecorator.extract(PropagationRegistry.java:57)
at io.jaegertracing.internal.JaegerTracer.extract(JaegerTracer.java:208)
at io.jaegertracing.internal.JaegerTracer.extract(JaegerTracer.java:61)
at io.opentracing.util.GlobalTracer.extract(GlobalTracer.java:143)
at 
org.apache.hadoop.hdds.tracing.TracingUtil.importAndCreateScope(TracingUtil.java:98)
at 
org.apache.hadoop.ozone.container.common.impl.HddsDispatcher.dispatch(HddsDispatcher.java:148)
at 
org.apache.hadoop.ozone.container.common.transport.server.ratis.ContainerStateMachine.dispatchCommand(ContainerStateMachine.java:347)
at 
org.apache.hadoop.ozone.container.common.transport.server.ratis.ContainerStateMachine.runCommand(ContainerStateMachine.java:354)
at 
org.apache.hadoop.ozone.container.common.transport.server.ratis.ContainerStateMachine.lambda$applyTransaction$5(ContainerStateMachine.java:613)
at 
java.util.concurrent.CompletableFuture$AsyncSupply.run(CompletableFuture.java:1590)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
{code}
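
A defensive sketch for StringCodec.extract (not necessarily the committed fix;
TextMapCodec.contextFromString is assumed to be Jaeger's own parser): ignore
carrier values that are not in Jaeger's "traceId:spanId:parentSpanId:flags"
form, such as the plain UUID above, instead of letting them surface as warnings.

{code}
public JaegerSpanContext extract(StringBuilder carrier) {
  if (carrier == null) {
    return null;
  }
  String value = carrier.toString();
  // Plain UUIDs and other non-tracer strings: start a fresh trace instead.
  if (!value.matches("[0-9a-fA-F]+:[0-9a-fA-F]+:[0-9a-fA-F]+:[0-9a-fA-F]+")) {
    return null;
  }
  return TextMapCodec.contextFromString(value);
}
{code}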



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org



[jira] [Created] (HDDS-1419) Fix shellcheck errors in start-chaos.sh

2019-04-10 Thread Arpit Agarwal (JIRA)
Arpit Agarwal created HDDS-1419:
---

 Summary: Fix shellcheck errors in start-chaos.sh
 Key: HDDS-1419
 URL: https://issues.apache.org/jira/browse/HDDS-1419
 Project: Hadoop Distributed Data Store
  Issue Type: Bug
  Components: test
Reporter: Arpit Agarwal


Fix the following shellcheck errors in start-chaos.sh:
{code}
hadoop-ozone/integration-test/src/test/bin/start-chaos.sh:18:6: note: Use $(..) 
instead of legacy `..`. [SC2006]
hadoop-ozone/integration-test/src/test/bin/start-chaos.sh:27:19: note: Double 
quote to prevent globbing and word splitting. [SC2086]
hadoop-ozone/integration-test/src/test/bin/start-chaos.sh:28:20: note: Double 
quote to prevent globbing and word splitting. [SC2086]
hadoop-ozone/integration-test/src/test/bin/start-chaos.sh:31:33: note: Double 
quote to prevent globbing and word splitting. [SC2086]
hadoop-ozone/integration-test/src/test/bin/start-chaos.sh:35:23: note: Double 
quote to prevent globbing and word splitting. [SC2086]
{code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org



[jira] [Resolved] (HDDS-1418) Move bang line to the start of the start-chaos.sh script

2019-04-10 Thread Arpit Agarwal (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDDS-1418?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arpit Agarwal resolved HDDS-1418.
-
   Resolution: Fixed
Fix Version/s: 0.5.0

> Move bang line to the start of the start-chaos.sh script
> 
>
> Key: HDDS-1418
> URL: https://issues.apache.org/jira/browse/HDDS-1418
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>        Reporter: Arpit Agarwal
>    Assignee: Arpit Agarwal
>Priority: Major
>  Labels: pull-request-available
> Fix For: 0.5.0
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> The start-chaos.sh script has a shebang line, but it is not the first line in 
> the script.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org



[jira] [Created] (HDDS-1418) Move bang line to the start of the start-chaos.sh script

2019-04-10 Thread Arpit Agarwal (JIRA)
Arpit Agarwal created HDDS-1418:
---

 Summary: Move bang line to the start of the start-chaos.sh script
 Key: HDDS-1418
 URL: https://issues.apache.org/jira/browse/HDDS-1418
 Project: Hadoop Distributed Data Store
  Issue Type: Bug
Reporter: Arpit Agarwal
Assignee: Arpit Agarwal


The start-chaos.sh script has a shebang line, but it is not the first line in the script.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org



[jira] [Created] (HDDS-1332) Skip flaky test - testStartStopDatanodeStateMachine

2019-03-23 Thread Arpit Agarwal (JIRA)
Arpit Agarwal created HDDS-1332:
---

 Summary: Skip flaky test - testStartStopDatanodeStateMachine
 Key: HDDS-1332
 URL: https://issues.apache.org/jira/browse/HDDS-1332
 Project: Hadoop Distributed Data Store
  Issue Type: Bug
Reporter: Arpit Agarwal


testStartStopDatanodeStateMachine fails frequently in Jenkins. It also seems to 
have a timing issue which may be different from the Jenkins failure.

E.g. if I add a 10-second sleep as below, I can get the test to fail 100% of the time.

{code}
@@ -163,6 +163,7 @@ public void testStartStopDatanodeStateMachine() throws 
IOException,
 try (DatanodeStateMachine stateMachine =
 new DatanodeStateMachine(getNewDatanodeDetails(), conf, null)) {
   stateMachine.startDaemon();
+  Thread.sleep(10_000L);
{code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org



[jira] [Created] (HDDS-1324) TestOzoneManagerHA seems to be flaky

2019-03-21 Thread Arpit Agarwal (JIRA)
Arpit Agarwal created HDDS-1324:
---

 Summary: TestOzoneManagerHA seems to be flaky
 Key: HDDS-1324
 URL: https://issues.apache.org/jira/browse/HDDS-1324
 Project: Hadoop Distributed Data Store
  Issue Type: Bug
  Components: test
Affects Versions: 0.5.0
Reporter: Arpit Agarwal


TestOzoneManagerHA failed once with the following error:
{code}
[ERROR] Tests run: 8, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 105.931 
s <<< FAILURE! - in org.apache.hadoop.ozone.om.TestOzoneManagerHA
[ERROR] testOMRetryProxy(org.apache.hadoop.ozone.om.TestOzoneManagerHA)  Time 
elapsed: 21.781 s  <<< FAILURE!
java.lang.AssertionError: expected:<30> but was:<10>
at org.junit.Assert.fail(Assert.java:88)
at org.junit.Assert.failNotEquals(Assert.java:743)
at org.junit.Assert.assertEquals(Assert.java:118)
at org.junit.Assert.assertEquals(Assert.java:555)
at org.junit.Assert.assertEquals(Assert.java:542)
at 
org.apache.hadoop.ozone.om.TestOzoneManagerHA.testOMRetryProxy(TestOzoneManagerHA.java:305)
{code}




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org



[jira] [Created] (HDDS-1323) Ignore unit test TestFailureHandlingByClient

2019-03-21 Thread Arpit Agarwal (JIRA)
Arpit Agarwal created HDDS-1323:
---

 Summary: Ignore unit test TestFailureHandlingByClient
 Key: HDDS-1323
 URL: https://issues.apache.org/jira/browse/HDDS-1323
 Project: Hadoop Distributed Data Store
  Issue Type: Bug
  Components: test
Reporter: Arpit Agarwal


TestFailureHandlingByClient seems to be failing consistently. Let's ignore it 
for now until it can be fixed.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org



[jira] [Created] (HDDS-1322) Hugo errors when building Ozone

2019-03-21 Thread Arpit Agarwal (JIRA)
Arpit Agarwal created HDDS-1322:
---

 Summary: Hugo errors when building Ozone
 Key: HDDS-1322
 URL: https://issues.apache.org/jira/browse/HDDS-1322
 Project: Hadoop Distributed Data Store
  Issue Type: Bug
Affects Versions: 0.4.0
Reporter: Arpit Agarwal


I see some odd hugo errors when building Ozone, even though I am not building 
docs.
{code}
$ mvn -B -q clean compile install -DskipTests=true -Dmaven.javadoc.skip=true 
-Dmaven.site.skip=true -DskipShade -Phdds

Error: unknown command "0.4.0-SNAPSHOT" for "hugo"
Run 'hugo --help' for usage.
.../hadoop-hdds/docs/target
Error: unknown command "0.4.0-SNAPSHOT" for "hugo"
Run 'hugo --help' for usage.
.../hadoop-hdds/docs/target
{code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org



[jira] [Resolved] (HDDS-1321) TestOzoneManagerHttpServer depends on hard-coded port numbers

2019-03-21 Thread Arpit Agarwal (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDDS-1321?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arpit Agarwal resolved HDDS-1321.
-
  Resolution: Fixed
   Fix Version/s: 0.4.0
Target Version/s:   (was: 0.4.0)

Thanks for the review [~ajayydv]. I've committed this.

> TestOzoneManagerHttpServer depends on hard-coded port numbers
> -
>
> Key: HDDS-1321
> URL: https://issues.apache.org/jira/browse/HDDS-1321
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>    Reporter: Arpit Agarwal
>    Assignee: Arpit Agarwal
>Priority: Major
>  Labels: pull-request-available
> Fix For: 0.4.0
>
>  Time Spent: 1h
>  Remaining Estimate: 0h
>
> TestOzoneManagerHttpServer depends on a hard-coded port number due to a bug 
> in config initialization.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org



[jira] [Created] (HDDS-1321) TestOzoneManagerHttpServer depends on hard-coded port numbers

2019-03-21 Thread Arpit Agarwal (JIRA)
Arpit Agarwal created HDDS-1321:
---

 Summary: TestOzoneManagerHttpServer depends on hard-coded port 
numbers
 Key: HDDS-1321
 URL: https://issues.apache.org/jira/browse/HDDS-1321
 Project: Hadoop Distributed Data Store
  Issue Type: Bug
Reporter: Arpit Agarwal
Assignee: Arpit Agarwal


TestOzoneManagerHttpServer depends on a hard-coded port number due to a bug in 
config initialization.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org



Re: [DISCUSS] Docker build process

2019-03-19 Thread Arpit Agarwal
Hi Eric,

> Dockerfile is most likely to change to apply the security fix.

I am not sure this is always the case. Marton’s point about revising docker images 
independently of Hadoop versions is valid.


> When the maven release is automated through Jenkins, this is a breeze
> of clicking a button.  Jenkins even increments the target version
> automatically, with an option to edit. 

I did not understand this suggestion. Could you please explain in simpler terms 
or share a link to the description?


> I will make adjustments accordingly unless 7 more people come
> out and say otherwise.

What adjustment is this?

Thanks,
Arpit


> On Mar 19, 2019, at 10:19 AM, Eric Yang  wrote:
> 
> Hi Marton,
> 
> Thank you for your input.  I agree with most of what you said, with a few 
> exceptions.  A security fix should result in a different version of the image 
> instead of replacing an existing version.  Dockerfile is most likely to change 
> to apply the security fix.  If it did not change, the source would become 
> unstable over time and result in non-buildable code.  When the maven release is 
> automated through Jenkins, this is a breeze of clicking a button.  Jenkins 
> even increments the target version automatically, with an option to edit.  It 
> makes the release manager's job easier than Homer Simpson's job.
> 
> If versioning is done correctly, older branches can have the same docker 
> subproject, and Hadoop 2.7.8 can be released for older Hadoop branches.  We 
> don't create a timeline paradox by changing the history of Hadoop 2.7.1.  That 
> release has passed, so let it stay that way.
> 
> There is mounting evidence that the Hadoop community wants a docker profile for 
> the developer image.  Precommit builds will not catch some build errors because 
> more code is allowed to slip through using a profile build process.  I will 
> make adjustments accordingly unless 7 more people come out and say otherwise.
> 
> Regards,
> Eric
> 
> On 3/19/19, 1:18 AM, "Elek, Marton"  wrote:
> 
> 
> 
>Thank you Eric for describing the problem.
> 
>I have multiple small comments, trying to separate them.
> 
>I. separated vs in-build container image creation
> 
>> The disadvantages are:
>> 
>> 1.  Require developer to have access to docker.
>> 2.  Default build takes longer.
> 
> 
>These are not the only disadvantages (IMHO), as I wrote in the
>previous thread and in the issue [1]
> 
>Using in-build container image creation doesn't enable:
> 
>1. to modify the image later (eg. apply security fixes to the container
>itself or apply improvements for the startup scripts)
>2. create images for older releases (eg. hadoop 2.7.1)
> 
>I think there are two kind of images:
> 
>a) images for released artifacts
>b) developer images
> 
>I would prefer to manage a) with separated branch repositories but b)
>with (optional!) in-build process.
> 
>II. Agree with Steve. I think it's better to make it optional, as most of
>the time it's not required, and to support the default
>dev build with the default settings (= just enough to start).
> 
>III. Maven best practices
> 
>(https://dzone.com/articles/maven-profile-best-practices)
> 
>I think this is a good article. But it argues not against profiles but against
>creating multiple versions of the same artifact with the same name
>(e.g. jdk8/jdk11). In Hadoop, profiles are used to introduce optional
>steps. I think it's fine as the maven lifecycle/phase model is very
>static (compare it with the tree based approach in Gradle).
> 
>Marton
> 
>[1]: https://issues.apache.org/jira/browse/HADOOP-16091
> 
>On 3/13/19 11:24 PM, Eric Yang wrote:
>> Hi Hadoop developers,
>> 
>> In recent months, there have been various discussions on creating a docker 
>> build process for Hadoop.  On the mailing list last month there was convergence 
>> toward making the docker build process inline, when the Ozone team was planning 
>> a new repository for Hadoop/Ozone docker images.  Work has started to 
>> add an inline docker image build process to the Hadoop build.
>> A few lessons were learnt from making the docker build inline in YARN-7129.  The 
>> build environment must have docker for a successful docker build.  
>> BUILD.txt states that for an easy build environment, use Docker.  There is logic 
>> in place to ensure that the absence of docker does not trigger a docker build.  
>> The inline process tries to be as non-disruptive as possible to the existing 
>> development environment, with one exception: if docker’s presence is 
>> detected but the user does not have rights to run docker, the 
>> build will fail.
>> 
>> Now, some developers are pushing back on the inline docker build process because 
>> the existing environment did not make the docker build process mandatory.  
>> However, there are benefits to using an inline docker build process.  The listed 
>> benefits are:
>> 
>> 1.  Source code tag, maven repository artifacts and docker hub artifacts can 
>> all be pr

[jira] [Created] (HDDS-1306) TestContainerStateManagerIntegration fails in Ratis shutdown

2019-03-18 Thread Arpit Agarwal (JIRA)
Arpit Agarwal created HDDS-1306:
---

 Summary: TestContainerStateManagerIntegration fails in Ratis 
shutdown
 Key: HDDS-1306
 URL: https://issues.apache.org/jira/browse/HDDS-1306
 Project: Hadoop Distributed Data Store
  Issue Type: Bug
Reporter: Arpit Agarwal


TestContainerStateManagerIntegration occasionally fails in Ratis shutdown. 
Other test cases like TestScmChillMode may be failing due to the same error.

The full stack trace is in a comment below since it's a lot of text.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org



[jira] [Reopened] (HDDS-1284) Adjust default values of pipline recovery for more resilient service restart

2019-03-18 Thread Arpit Agarwal (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDDS-1284?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arpit Agarwal reopened HDDS-1284:
-

I've reopened this since it caused HDDS-1297.

> Adjust default values of pipline recovery for more resilient service restart
> 
>
> Key: HDDS-1284
> URL: https://issues.apache.org/jira/browse/HDDS-1284
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>Reporter: Elek, Marton
>Assignee: Elek, Marton
>Priority: Critical
>  Labels: pull-request-available
> Fix For: 0.4.0
>
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> As of now we have the following algorithm to handle node failures:
> 1. In case of a missing node, the leader of the pipeline or the SCM can 
> detect the missing heartbeats.
> 2. SCM will start to close the pipeline (CLOSING state) and try to close the 
> containers with the remaining nodes in the pipeline.
> 3. After 5 minutes the pipeline will be destroyed (CLOSED) and a new pipeline 
> can be created from the healthy nodes (one node can be part of only one pipeline 
> at a time).
> While this algorithm can work well with a big cluster, it doesn't provide very 
> good usability on small clusters:
> Use case 1:
> Given 3 nodes, in case of a service restart, if the restart takes more than 
> 90s, the pipeline will be moved to the CLOSING state. For the next 5 minutes 
> (ozone.scm.pipeline.destroy.timeout) the container will remain in the CLOSING 
> state. As there are no more nodes and we can't assign the same node to two 
> different pipelines, the cluster will be unavailable for 5 minutes.
> Use case 2:
> Given 90 nodes and 30 pipelines where all the pipelines are spread across 3 
> racks. Let's stop one rack. As all the pipelines are affected, all the 
> pipelines will be moved to the CLOSING state. We have no free nodes, 
> therefore we need to wait for 5 minutes to write any data to the cluster.
> These problems can be solved in multiple ways:
> 1.) Instead of waiting 5 minutes, destroy the pipeline when all the 
> containers are reported to be closed. (Most of the time this is enough, but some 
> container reports can be missing.)
> 2.) Support multi-raft and open a pipeline as soon as we have enough nodes 
> (even if the nodes already have CLOSING pipelines).
> Both options require more work on the pipeline management side. For 0.4.0 
> we can adjust the following parameters to get a better user experience:
> {code}
>   
> ozone.scm.pipeline.destroy.timeout
> 60s
> OZONE, SCM, PIPELINE
> 
>   Once a pipeline is closed, SCM should wait for the above configured time
>   before destroying a pipeline.
> 
>   
> ozone.scm.stale.node.interval
> 90s
> OZONE, MANAGEMENT
> 
>   The interval for stale node flagging. Please
>   see ozone.scm.heartbeat.thread.interval before changing this value.
> 
>   
>  {code}
> First of all, we can be more optimistic and mark a node as stale only after 5 
> minutes instead of 90s. 5 minutes should be enough most of the time to recover 
> the nodes.
> Second: we can decrease the time of ozone.scm.pipeline.destroy.timeout. 
> Ideally the close command is sent by the SCM to the datanode with a heartbeat 
> (HB). Between two HBs we have enough time to close all the containers via Ratis. 
> With the next HB, the datanode can report the successful close. (If the 
> containers can't be closed, the SCM can manage the QUASI_CLOSED containers.)
> We need to wait 29 seconds (worst case) for the next HB, and 29+30 seconds 
> for the confirmation. --> 66 seconds seems to be a safe choice (assuming that 
> 6 seconds is enough to process the report about the successful closing).



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org



[jira] [Resolved] (HDDS-1244) ozone sh put not working in docker container

2019-03-10 Thread Arpit Agarwal (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDDS-1244?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arpit Agarwal resolved HDDS-1244.
-
Resolution: Invalid

[~msingh] helped me out. This was a side effect of a docker disk space issue.

HDDS-1243 should improve the error propagation so this is easier to debug.

> ozone sh put not working in docker container
> 
>
> Key: HDDS-1244
> URL: https://issues.apache.org/jira/browse/HDDS-1244
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>        Reporter: Arpit Agarwal
>Priority: Blocker
> Attachments: client-logs.txt
>
>
> Steps to repro:
> # Bring up docker cluster with
> {code}
> docker-compose up -d --scale datanode=3
> {code}
> # ssh to datanode
> {code}
> docker-compose exec datanode /bin/bash
> {code}
> # Try to put a key
> {code}
> ozone sh volume create /vol1
> ozone sh bucket create /vol1/bucket1
> ozone sh key put /vol1/bucket1/key1 /tmp/hadoop-hadoop-datanode.pid
> {code}
> This gives the following error.
> {code}
> 2019-03-10 21:58:09 ERROR BlockOutputStream:558 - Unexpected Storage 
> Container Exception:
> org.apache.hadoop.hdds.scm.container.common.helpers.StorageContainerException:
>  ContainerID 1 does not exist
>   at 
> org.apache.hadoop.hdds.scm.storage.ContainerProtocolCalls.validateContainerResponse(ContainerProtocolCalls.java:573)
>   at 
> org.apache.hadoop.hdds.scm.storage.BlockOutputStream.validateResponse(BlockOutputStream.java:556)
>   at 
> org.apache.hadoop.hdds.scm.storage.BlockOutputStream.lambda$writeChunkToContainer$5(BlockOutputStream.java:634)
>   at 
> java.util.concurrent.CompletableFuture.uniApply(CompletableFuture.java:602)
> {code}
> Debug client logs attached.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org



[jira] [Created] (HDDS-1244) ozone sh put not working in docker container

2019-03-10 Thread Arpit Agarwal (JIRA)
Arpit Agarwal created HDDS-1244:
---

 Summary: ozone sh put not working in docker container
 Key: HDDS-1244
 URL: https://issues.apache.org/jira/browse/HDDS-1244
 Project: Hadoop Distributed Data Store
  Issue Type: Bug
Reporter: Arpit Agarwal
 Attachments: client-logs.txt

Steps to repro:
# Bring up docker cluster with
```
docker-compose up -d --scale datanode=3
```
# ssh to datanode
```
docker-compose exec datanode /bin/bash
```
# Try to put a key
```
ozone sh volume create /vol1
ozone sh bucket create /vol1/bucket1
ozone sh key put /vol1/bucket1/key1 /tmp/hadoop-hadoop-datanode.pid
```

This gives the following error.
```
2019-03-10 21:58:09 ERROR BlockOutputStream:558 - Unexpected Storage Container 
Exception:
org.apache.hadoop.hdds.scm.container.common.helpers.StorageContainerException: 
ContainerID 1 does not exist
at 
org.apache.hadoop.hdds.scm.storage.ContainerProtocolCalls.validateContainerResponse(ContainerProtocolCalls.java:573)
at 
org.apache.hadoop.hdds.scm.storage.BlockOutputStream.validateResponse(BlockOutputStream.java:556)
at 
org.apache.hadoop.hdds.scm.storage.BlockOutputStream.lambda$writeChunkToContainer$5(BlockOutputStream.java:634)
at 
java.util.concurrent.CompletableFuture.uniApply(CompletableFuture.java:602)
```

Debug client logs attached.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org



[jira] [Created] (HDDS-1243) DataNodes should propagate exceptions to clients

2019-03-10 Thread Arpit Agarwal (JIRA)
Arpit Agarwal created HDDS-1243:
---

 Summary: DataNodes should propagate exceptions to clients
 Key: HDDS-1243
 URL: https://issues.apache.org/jira/browse/HDDS-1243
 Project: Hadoop Distributed Data Store
  Issue Type: Improvement
  Components: Ozone Client, Ozone Datanode
Reporter: Arpit Agarwal


DataNodes should propagate full exceptions to clients instead of just a status 
code to simplify debugging. 
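
A sketch of the idea on the server side (illustrative; the field names are from
memory and may differ from the actual ContainerProtos definition): carry the
exception text alongside the result code so the client can surface it.

{code}
// Instead of returning only a status code, include the server-side
// exception message in the response.
ContainerCommandResponseProto response = ContainerCommandResponseProto.newBuilder()
    .setCmdType(request.getCmdType())
    .setResult(Result.CONTAINER_NOT_FOUND)
    .setMessage(ex.getMessage())       // full exception detail for the client
    .build();
{code}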



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org



[jira] [Resolved] (HDDS-453) OM and SCM should use picocli to parse arguments

2019-03-04 Thread Arpit Agarwal (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDDS-453?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arpit Agarwal resolved HDDS-453.

Resolution: Won't Fix

Makes sense. Let's pass on this. Thanks for looking into it, Aravindan and 
Vinicius.

> OM and SCM should use picocli to parse arguments
> 
>
> Key: HDDS-453
> URL: https://issues.apache.org/jira/browse/HDDS-453
> Project: Hadoop Distributed Data Store
>  Issue Type: Improvement
>  Components: Ozone Manager, SCM
>Reporter: Arpit Agarwal
>Assignee: Aravindan Vijayan
>Priority: Major
>  Labels: alpha2, newbie
>
> SCM and OM can use picocli to parse command-line arguments.
> Suggested in HDDS-415 by [~anu].
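
For context, a minimal picocli sketch of what was being considered
(illustrative only; the class and option names are made up):

{code}
import picocli.CommandLine;
import picocli.CommandLine.Command;
import picocli.CommandLine.Option;

@Command(name = "scm", description = "Storage Container Manager")
class ScmCli implements Runnable {
  @Option(names = "--init", description = "Initialize SCM metadata")
  boolean init;

  public void run() { /* start or initialize SCM based on the flags */ }

  public static void main(String[] args) {
    System.exit(new CommandLine(new ScmCli()).execute(args));
  }
}
{code}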



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org



[jira] [Created] (HDDS-1198) Rename chill mode to safe mode

2019-02-28 Thread Arpit Agarwal (JIRA)
Arpit Agarwal created HDDS-1198:
---

 Summary: Rename chill mode to safe mode
 Key: HDDS-1198
 URL: https://issues.apache.org/jira/browse/HDDS-1198
 Project: Hadoop Distributed Data Store
  Issue Type: Improvement
Reporter: Arpit Agarwal


Let's go back to calling it safe mode. HDFS admins already understand what it 
means.

 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org



[jira] [Resolved] (HDDS-1123) Add timeout when submitting request to Raft server on leader OM

2019-02-20 Thread Arpit Agarwal (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDDS-1123?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arpit Agarwal resolved HDDS-1123.
-
Resolution: Not A Problem

We'll continue to use Raft client on the OM so resolving this as 'Not a 
Problem'.

> Add timeout when submitting request to Raft server on leader OM
> ---
>
> Key: HDDS-1123
> URL: https://issues.apache.org/jira/browse/HDDS-1123
> Project: Hadoop Distributed Data Store
>  Issue Type: Sub-task
>Reporter: Bharat Viswanadham
>Assignee: Bharat Viswanadham
>Priority: Major
>
> Add a timeout when submitting requests to the Raft server on the leader OM.
> The Raft server keeps retrying even if the followers are down, so we need to 
> add timeout logic when submitting a request in the OM, in order to send a 
> response back to the client.
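
For illustration, one shape the timeout could have taken on the OM side (a
sketch; raftSubmitFuture, requestTimeout, toOmResponse and ResultCodes.TIMEOUT
are assumptions, and other exception handling is elided):

{code}
try {
  // Bound the wait on the Ratis submission so a client always gets a reply.
  RaftClientReply reply =
      raftSubmitFuture.get(requestTimeout.toMillis(), TimeUnit.MILLISECONDS);
  return toOmResponse(reply);
} catch (TimeoutException e) {
  throw new OMException("Request timed out in the OM Ratis server",
      OMException.ResultCodes.TIMEOUT);
}
{code}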



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org



Re: [VOTE] Release Apache Hadoop 3.1.2 - RC1

2019-02-05 Thread Arpit Agarwal
+1 binding for updated source package.

  - Rechecked signatures and checksums
  - Source matches release git tag
  - Built from source


> On Feb 5, 2019, at 10:50 AM, Sunil G  wrote:
> 
> Thanks Billie for pointing out.
> I have updated source by removing patchprocess and extra line create
> release.
> 
> Also updated checksum as well.
> 
> @bil...@apache.org   @Wangda Tan 
> please help to verify this changed bit once.
> 
> Thanks
> Sunil
> 
> On Tue, Feb 5, 2019 at 5:23 AM Billie Rinaldi 
> wrote:
> 
>> Hey Sunil and Wangda, thanks for the RC. The source tarball has a
>> patchprocess directory with some yetus code in it. Also, the file
>> dev-support/bin/create-release file has the following line added:
>>  export GPG_AGENT_INFO="/home/sunilg/.gnupg/S.gpg-agent:$(pgrep
>> gpg-agent):1"
>> 
>> I think we are probably due for an overall review of LICENSE and NOTICE. I
>> saw some idiosyncrasies there but nothing that looked like a blocker.
>> 
>> On Mon, Jan 28, 2019 at 10:20 PM Sunil G  wrote:
>> 
>>> Hi Folks,
>>> 
>>> On behalf of Wangda, we have an RC1 for Apache Hadoop 3.1.2.
>>> 
>>> The artifacts are available here:
>>> http://home.apache.org/~sunilg/hadoop-3.1.2-RC1/
>>> 
>>> The RC tag in git is release-3.1.2-RC1:
>>> https://github.com/apache/hadoop/commits/release-3.1.2-RC1
>>> 
>>> The maven artifacts are available via repository.apache.org at
>>> https://repository.apache.org/content/repositories/orgapachehadoop-1215
>>> 
>>> This vote will run 5 days from now.
>>> 
>>> 3.1.2 contains 325 [1] fixed JIRA issues since 3.1.1.
>>> 
>>> We have done testing with a pseudo cluster and distributed shell job.
>>> 
>>> My +1 to start.
>>> 
>>> Best,
>>> Wangda Tan and Sunil Govindan
>>> 
>>> [1] project in (YARN, HADOOP, MAPREDUCE, HDFS) AND fixVersion in (3.1.2)
>>> ORDER BY priority DESC
>>> 
>> 


-
To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org



Re: [VOTE] Release Apache Hadoop 3.1.2 - RC1

2019-02-04 Thread Arpit Agarwal
+1 (binding)

- Verified signatures
- Verified checksums
- Built from source
- Verified Maven artifacts on staging repo
- Deployed 3 node cluster
- Tried out HDFS commands, MapReduce jobs.

Confirmed the issues Billie pointed out. Not sure if you need to spin up a new 
RC or can just update the tarball - the contents of the git tag look fine.


> On Jan 28, 2019, at 10:19 PM, Sunil G  wrote:
> 
> Hi Folks,
> 
> On behalf of Wangda, we have an RC1 for Apache Hadoop 3.1.2.
> 
> The artifacts are available here:
> http://home.apache.org/~sunilg/hadoop-3.1.2-RC1/
> 
> The RC tag in git is release-3.1.2-RC1:
> https://github.com/apache/hadoop/commits/release-3.1.2-RC1
> 
> The maven artifacts are available via repository.apache.org at
> https://repository.apache.org/content/repositories/orgapachehadoop-1215
> 
> This vote will run 5 days from now.
> 
> 3.1.2 contains 325 [1] fixed JIRA issues since 3.1.1.
> 
> We have done testing with a pseudo cluster and distributed shell job.
> 
> My +1 to start.
> 
> Best,
> Wangda Tan and Sunil Govindan
> 
> [1] project in (YARN, HADOOP, MAPREDUCE, HDFS) AND fixVersion in (3.1.2)
> ORDER BY priority DESC


-
To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org



Re: proposed new repository for hadoop/ozone docker images (+update on docker works)

2019-01-29 Thread Arpit Agarwal
I’ve requested a new repo hadoop-docker-ozone.git in gitbox.


> On Jan 22, 2019, at 4:59 AM, Elek, Marton  wrote:
> 
> 
> 
> TLDR;
> 
> I proposed creating a separate git repository for ozone docker images
> in HDDS-851 (hadoop-docker-ozone.git)
> 
> If there are no objections in the next 3 days I will ask an Apache Member
> to create the repository.
> 
> 
> 
> 
> LONG VERSION:
> 
> In HADOOP-14898 multiple docker containers and helper scripts are
> created for Hadoop.
> 
> The main goal was to:
> 
> 1.) help the development with easy-to-use docker images
> 2.) provide official hadoop images to make it easy to test new features
> 
> As of now we have:
> 
> - apache/hadoop-runner image (which contains the required dependency
> but no hadoop)
> - apache/hadoop:2 and apache/hadoop:3 images (to try out latest hadoop
> from 2/3 lines)
> 
> The base image to run hadoop (apache/hadoop-runner) is also heavily used
> for Ozone distribution/development.
> 
> The Ozone distribution contains docker-compose based cluster definitions
> to start various type of clusters and scripts to do smoketesting. (See
> HADOOP-16063 for more details).
> 
> Note: I personally believe that these definitions help a lot to start
> different types of clusters. For example, it could be tricky to try out
> router-based federation as it requires multiple HA clusters. But with a
> simple docker-compose definition [1] it can be started in under 3
> minutes. (HADOOP-16063 is about creating these definitions for various
> hdfs/yarn use cases.)
> 
> As of now we have dedicated branches in the hadoop git repository for
> the docker images (docker-hadoop-runner, docker-hadoop-2,
> docker-hadoop-3). It turns out that a separate repository would be more
> effective, as Docker Hub can use only full branch names as tags.
> 
> We would like to provide ozone docker images to make the evaluation as
> easy as 'docker run -d apache/hadoop-ozone:0.3.0', therefore in HDDS-851
> we agreed to create a separated repository for the hadoop-ozone docker
> images.
> 
> If this approach works well we can also move out the existing
> docker-hadoop-2/docker-hadoop-3/docker-hadoop-runner branches from
> hadoop.git to another separate hadoop-docker.git repository.
> 
> Please let me know if you have any comments,
> 
> Thanks,
> Marton
> 
> 1: see
> https://github.com/flokkr/runtime-compose/tree/master/hdfs/routerfeder
> as an example
> 
> -
> To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org
> For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org
> 


-
To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org



[jira] [Created] (HDDS-1026) Reads should fail over to alternate replica

2019-01-28 Thread Arpit Agarwal (JIRA)
Arpit Agarwal created HDDS-1026:
---

 Summary: Reads should fail over to alternate replica
 Key: HDDS-1026
 URL: https://issues.apache.org/jira/browse/HDDS-1026
 Project: Hadoop Distributed Data Store
  Issue Type: Improvement
Reporter: Arpit Agarwal


Read requests should fail over to an alternate replica when one replica is bad.
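
For illustration, the failover loop could look roughly like this (a sketch;
readChunkFrom is a hypothetical helper, not the actual client code):

{code}
IOException lastFailure = null;
// Try each datanode hosting the block until one read succeeds.
for (DatanodeDetails dn : pipeline.getNodes()) {
  try {
    return readChunkFrom(dn, chunkInfo);
  } catch (IOException e) {
    lastFailure = e;       // remember the failure and try the next replica
  }
}
throw lastFailure;         // all replicas failed
{code}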



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org



[jira] [Created] (HDDS-1023) Avoid holding container lock during disk updates

2019-01-28 Thread Arpit Agarwal (JIRA)
Arpit Agarwal created HDDS-1023:
---

 Summary: Avoid holding container lock during disk updates
 Key: HDDS-1023
 URL: https://issues.apache.org/jira/browse/HDDS-1023
 Project: Hadoop Distributed Data Store
  Issue Type: Improvement
Reporter: Arpit Agarwal


We can avoid holding the container lock during disk updates by applying the 
in-memory state change after the on-disk state change has completed.

We should also serialize multiple updates with respect to each other (we get this 
for 'free' today since we hold the lock while updating).
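
A sketch of the proposed ordering (illustrative; the lock fields and helpers
are assumptions, not the actual container code): do the slow disk write outside
the container lock, take the lock only for the cheap in-memory swap, and use a
separate update lock to keep writers serialized.

{code}
void updateContainer(ContainerData newData) throws IOException {
  synchronized (diskUpdateLock) {     // serializes updates wrt each other
    persistToDisk(newData);           // slow I/O, container lock not held
    containerLock.writeLock().lock(); // short critical section
    try {
      inMemoryState = newData;        // in-memory change applied last
    } finally {
      containerLock.writeLock().unlock();
    }
  }
}
{code}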



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org


