[jira] [Created] (HDFS-12308) Erasure Coding: Provide DistributedFileSystem & DFSClient API to return the effective EC policy on a directory or file, including the replication policy

2017-08-15 Thread SammiChen (JIRA)
SammiChen created HDFS-12308:


 Summary: Erasure Coding: Provide DistributedFileSystem &  
DFSClient API to return the effective EC policy on a directory or file, 
including the replication policy
 Key: HDFS-12308
 URL: https://issues.apache.org/jira/browse/HDFS-12308
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: erasure-coding
 Environment: Provide DistributedFileSystem & DFSClient API to return 
the effective EC policy on a directory or file, including the replication 
policy. The API names would be something like {{getNominalErasureCodingPolicy(PATH)}} and 
{{getAllNominalErasureCodingPolicies}}.
Reporter: SammiChen
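
To make the proposal concrete, here is a rough sketch written against the existing client call that already resolves the effective policy by inheritance; the helper name and the null-means-replication convention below are illustrative assumptions, not the proposed API itself.

{code}
import java.io.IOException;

import org.apache.hadoop.fs.Path;
import org.apache.hadoop.hdfs.DistributedFileSystem;
import org.apache.hadoop.hdfs.protocol.ErasureCodingPolicy;

/**
 * Illustrative sketch only. Today a client can ask for the EC policy in
 * effect on a path; a null return means the path falls back to plain
 * replication. The proposed API would presumably surface replication as an
 * explicit policy object instead of null.
 */
public final class EffectiveEcPolicyExample {
  private EffectiveEcPolicyExample() {}

  /** Returns the inherited/effective EC policy, or null for replication. */
  public static ErasureCodingPolicy effectivePolicy(
      DistributedFileSystem dfs, Path path) throws IOException {
    // getErasureCodingPolicy already resolves the policy set on the nearest
    // ancestor directory, if any.
    return dfs.getErasureCodingPolicy(path);
  }
}
{code}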









[jira] [Created] (HDFS-12307) Ozone: TestKeys#testPutAndGetKeyWithDnRestart fails

2017-08-15 Thread Weiwei Yang (JIRA)
Weiwei Yang created HDFS-12307:
--

 Summary: Ozone: TestKeys#testPutAndGetKeyWithDnRestart fails
 Key: HDFS-12307
 URL: https://issues.apache.org/jira/browse/HDFS-12307
 Project: Hadoop HDFS
  Issue Type: Sub-task
  Components: ozone
Reporter: Weiwei Yang


It seems this UT constantly fails with the following error:

{noformat}
org.apache.hadoop.ozone.web.exceptions.OzoneException: Exception getting XceiverClient.
    at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
    at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
    at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
    at com.fasterxml.jackson.databind.introspect.AnnotatedConstructor.call(AnnotatedConstructor.java:119)
    at com.fasterxml.jackson.databind.deser.std.StdValueInstantiator.createUsingDefault(StdValueInstantiator.java:243)
    at com.fasterxml.jackson.databind.deser.std.ThrowableDeserializer.deserializeFromObject(ThrowableDeserializer.java:146)
    at com.fasterxml.jackson.databind.deser.BeanDeserializer.deserialize(BeanDeserializer.java:133)
    at com.fasterxml.jackson.databind.ObjectReader._bindAndClose(ObjectReader.java:1579)
    at com.fasterxml.jackson.databind.ObjectReader.readValue(ObjectReader.java:1200)
    at org.apache.hadoop.ozone.web.exceptions.OzoneException.parse(OzoneException.java:248)
    at org.apache.hadoop.ozone.web.client.OzoneBucket.executeGetKey(OzoneBucket.java:395)
    at org.apache.hadoop.ozone.web.client.OzoneBucket.getKey(OzoneBucket.java:321)
    at org.apache.hadoop.ozone.web.client.TestKeys.runTestPutAndGetKeyWithDnRestart(TestKeys.java:288)
    at org.apache.hadoop.ozone.web.client.TestKeys.testPutAndGetKeyWithDnRestart(TestKeys.java:265)
{noformat}






[jira] [Created] (HDFS-12306) [branch-2]Separate class InnerNode from class NetworkTopology and make it extendable

2017-08-15 Thread Chen Liang (JIRA)
Chen Liang created HDFS-12306:
-

 Summary: [branch-2]Separate class InnerNode from class 
NetworkTopology and make it extendable
 Key: HDFS-12306
 URL: https://issues.apache.org/jira/browse/HDFS-12306
 Project: Hadoop HDFS
  Issue Type: Improvement
Reporter: Chen Liang
Assignee: Chen Liang


This JIRA is to backport HDFS-11430 to branch-2.






Re: Are binary artifacts part of a release?

2017-08-15 Thread Ravi Prakash
bq. My stance is that if we're going to publish something, it should be
good, or we shouldn't publish it at all.

I agree

On Tue, Aug 15, 2017 at 2:57 AM, Steve Loughran 
wrote:

>
> > On 15 Aug 2017, at 07:14, Andrew Wang  wrote:
> >
> > To close the thread on this, I'll try to summarize the LEGAL JIRA. I
> wasn't
> > able to convince anyone to make changes to the apache.org docs.
> >
> > Convenience binary artifacts are not official release artifacts and thus
> > are not voted on. However, since they are distributed by Apache, they are
> > still subject to the same distribution requirements as official release
> > artifacts. This means they need to have a LICENSE and NOTICE file, follow
> > ASF licensing rules, etc. The PMC needs to ensure that binary artifacts
> > meet these requirements.
> >
> > However, being a "convenience" artifact doesn't mean it isn't important.
> > The appropriate level of quality for binary artifacts is left up to the
> > project. An OpenOffice person mentioned the quality of their binary
> > artifacts is super important since very few of their users will compile
> > their own office suite.
> >
> > I don't know if we've discussed the topic of binary artifact quality in
> > Hadoop. My stance is that if we're going to publish something, it should
> be
> > good, or we shouldn't publish it at all. I think we do want to publish
> > binary tarballs (it's the easiest way for new users to get started with
> > Hadoop), so it's fair to consider them when evaluating a release.
> >
> > Best,
> > Andrew
> >
>
>
> Given that we publish the artifacts to the m2 repo, that is very much a
> downstream distribution mechanism. For other redistribution mechanisms (yum,
> apt-get), it's implicitly handled by whoever manages those repos.
>
> > On Mon, Jul 31, 2017 at 8:43 PM, Konstantin Shvachko <
> shv.had...@gmail.com>
> > wrote:
> >
> >> It does not. Just adding historical references, as Andrew raised the
> >> question.
> >>
> >> On Mon, Jul 31, 2017 at 7:38 PM, Allen Wittenauer <
> >> a...@effectivemachines.com> wrote:
> >>
> >>>
> >>> ... that doesn't contradict anything I said.
> >>>
>  On Jul 31, 2017, at 7:23 PM, Konstantin Shvachko <
> shv.had...@gmail.com>
> >>> wrote:
> 
>  The issue was discussed on several occasions in the past.
>  Took me a while to dig this out as an example:
>  http://mail-archives.apache.org/mod_mbox/hadoop-general/2011
> >>> 11.mbox/%3C4EB0827C.6040204%40apache.org%3E
> 
>  Doug Cutting:
>  "Folks should not primarily evaluate binaries when voting. The ASF
> >>> primarily produces and publishes source-code
>  so voting artifacts should be optimized for evaluation of that."
> 
>  Thanks,
>  --Konst
> 
>  On Mon, Jul 31, 2017 at 4:51 PM, Allen Wittenauer <
> >>> a...@effectivemachines.com> wrote:
> 
> > On Jul 31, 2017, at 4:18 PM, Andrew Wang 
> >>> wrote:
> >
> > Forking this off to not distract from release activities.
> >
> > I filed https://issues.apache.org/jira/browse/LEGAL-323 to get
> >>> clarity on the matter. I read the entire webpage, and it could be
> improved
> >>> one way or the other.
> 
> 
> IANAL, my read has always led me to believe:
> 
> * An artifact is anything that is uploaded to dist.a.o
> >>> and repository.a.o
> * A release consists of one or more artifacts
> >>> ("Releases are, by definition, anything that is published beyond the
> group
> >>> that owns it. In our case, that means any publication outside the
> group of
> >>> people on the product dev list.")
> * One of those artifacts MUST be source
> * (insert voting rules here)
> * They must be built on a machine in control of the RM
> * There are no exceptions for alpha, nightly, etc
> * (various other requirements)
> 
> i.e., release != artifact  it's more like release =
> >>> artifact * n .
> 
> Do you have to have binaries?  No (e.g., Apache SpamAssassin
> >>> has no binaries to create).  But if you place binaries in dist.a.o or
> >>> repository.a.o, they are effectively part of your release and must
> follow
> >>> the same rules.  (Votes, etc.)
> 
> 
> >>>
> >>>
> >>
>
>
> -
> To unsubscribe, e-mail: mapreduce-dev-unsubscr...@hadoop.apache.org
> For additional commands, e-mail: mapreduce-dev-h...@hadoop.apache.org
>
>


[jira] [Created] (HDFS-12305) Ozone: SCM: Add StateMachine for pipeline/container

2017-08-15 Thread Xiaoyu Yao (JIRA)
Xiaoyu Yao created HDFS-12305:
-

 Summary: Ozone: SCM: Add StateMachine for pipeline/container
 Key: HDFS-12305
 URL: https://issues.apache.org/jira/browse/HDFS-12305
 Project: Hadoop HDFS
  Issue Type: Sub-task
  Components: HDFS-7240
Reporter: Xiaoyu Yao
Assignee: Xiaoyu Yao


Add a template class that can be shared by the pipeline and container state machines.
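
As a rough illustration, here is a minimal transition-table sketch of what a shared template could look like; the class and method names are assumptions for illustration, not the design to be committed here.

{code}
import java.util.Collections;
import java.util.EnumMap;
import java.util.Map;

/**
 * Minimal sketch of a state machine template keyed by enum states and
 * events, usable by both pipeline and container lifecycles. Illustrative
 * only; not the actual HDFS-12305 design.
 */
public class StateMachineSketch<S extends Enum<S>, E extends Enum<E>> {
  private final Map<S, Map<E, S>> transitions;
  private final Class<E> eventClass;
  private S current;

  public StateMachineSketch(Class<S> stateClass, Class<E> eventClass,
      S initialState) {
    this.transitions = new EnumMap<S, Map<E, S>>(stateClass);
    this.eventClass = eventClass;
    this.current = initialState;
  }

  /** Register a legal transition: from --event--> to. */
  public void addTransition(S from, E event, S to) {
    transitions.computeIfAbsent(from, k -> new EnumMap<E, S>(eventClass))
        .put(event, to);
  }

  /** Apply an event, failing if it is not legal in the current state. */
  public synchronized S fireEvent(E event) {
    S next = transitions
        .getOrDefault(current, Collections.<E, S>emptyMap())
        .get(event);
    if (next == null) {
      throw new IllegalStateException(
          "Event " + event + " is not valid in state " + current);
    }
    current = next;
    return current;
  }

  public synchronized S getCurrentState() {
    return current;
  }
}
{code}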






[jira] [Created] (HDFS-12304) Remove unused parameter from FsDatasetImpl#addVolume

2017-08-15 Thread Chen Liang (JIRA)
Chen Liang created HDFS-12304:
-

 Summary: Remove unused parameter from FsDatasetImpl#addVolume
 Key: HDFS-12304
 URL: https://issues.apache.org/jira/browse/HDFS-12304
 Project: Hadoop HDFS
  Issue Type: Improvement
Reporter: Chen Liang
Assignee: Chen Liang
Priority: Minor


FsDatasetImpl has this method:
{code}
  private void addVolume(Collection<StorageLocation> dataLocations,
      Storage.StorageDirectory sd) throws IOException
{code}
The {{dataLocations}} parameter was introduced in HDFS-6740 and was used to 
obtain the storage type. HDFS-10637 changed how the storage type is determined 
in this method, so {{dataLocations}} is no longer used here at all. We should 
remove it for a cleaner interface.
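
For illustration only, the cleaned-up shape could be as simple as dropping the parameter (assuming no caller still needs to pass it in):

{code}
// Sketch of the post-cleanup signature; body elided, not the actual patch.
private void addVolume(Storage.StorageDirectory sd) throws IOException {
  // ... existing logic, which now derives the storage type without
  // consulting dataLocations ...
}
{code}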






Re: Why aren't delegation token operations audit logged?

2017-08-15 Thread Xiao Chen
Thanks Allen for bringing more history to this, and Erik for the discussion.

True, they're logged in the NN logs, but those have different log rotation and 
are less convenient for investigation, so I can see value in adding these.
I will pursue the implementation in HDFS-12300 - more discussion is welcome!
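
To make that concrete, here is a hypothetical sketch of what auditing one token op inside FSNamesystem might look like; the command string, placement, and surrounding logic are assumptions, not the HDFS-12300 patch.

{code}
// Hypothetical sketch only. It assumes the existing
// FSNamesystem#logAuditEvent(boolean, String, String) helper and elides the
// real token-creation logic.
Token<DelegationTokenIdentifier> getDelegationToken(Text renewer)
    throws IOException {
  final String operationName = "getDelegationToken";
  boolean success = false;
  Token<DelegationTokenIdentifier> token = null;
  try {
    // ... existing safe-mode checks and DelegationTokenSecretManager call ...
    success = true;
  } finally {
    // Record the attempt alongside the other metadata operations.
    logAuditEvent(success, operationName, null);
  }
  return token;
}
{code}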

-Xiao

On Tue, Aug 15, 2017 at 8:10 AM, Erik Krogen  wrote:

> Given that the current audit log also includes the majority of read-only
> operations (getfileinfo, liststatus, etc.) it seems to me that the audit
> log's purpose has changed to be more of a record of both modifications and
> queries against the file system's metadata. The delegation token related
> operations match closely with what is currently in the audit log. Our team
> was also surprised to find that they were not currently present. Especially
> given that we have HDFS-6888 to limit the size of the audit log by omitting
> common operations, it does not seem harmful to add these token ops.
>
> Erik
>
> On 8/14/17, 5:44 PM, "Allen Wittenauer"  wrote:
>
>
> On 2017-08-14 11:52, Xiao Chen  wrote:
>
> > When inspecting the code, I found that the following methods in
> > FSNamesystem are not audit logged:
>
> ...
>
> > I checked with ATM hoping for some history, but no known to him.
> Anyone
> > know the reason to not audit log these?
>
> The audit log was designed for keeping track of things that
> actually change the contents/metadata of the file system. Other HDFS
> operations were getting logged to the NN log or some other, more appropriate,
> place to limit the noise.
>
> https://effectivemachines.com/2017/03/08/unofficial-history-
> of-the-hdfs-audit-log/
> -
> To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org
> For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org
>
>
>
>
> -
> To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org
> For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org
>
>


Apache Hadoop qbt Report: trunk+JDK8 on Linux/x86

2017-08-15 Thread Apache Jenkins Server
For more details, see 
https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/493/

[Aug 14, 2017 5:27:47 PM] (lei) HDFS-12221. Replace xcerces in XmlEditsVisitor. 
(Ajay Kumar via lei)
[Aug 14, 2017 5:51:30 PM] (jianhe) YARN-6959. RM may allocate wrong AM 
Container for new attempt.
[Aug 14, 2017 6:10:00 PM] (subru) YARN-6996. Change javax.cache library 
implementation from JSR107 to
[Aug 14, 2017 6:32:49 PM] (aengineer) HDFS-12162. Update listStatus document to 
describe the behavior when the
[Aug 14, 2017 6:41:11 PM] (vrushali) YARN-6905 Multiple HBaseTimelineStorage 
test failures due to missing
[Aug 14, 2017 6:55:33 PM] (templedf) YARN-6881. LOG is unused in 
AllocationConfiguration (Contributed by
[Aug 14, 2017 7:40:08 PM] (jlowe) YARN-6987. Log app attempt during 
InvalidStateTransition. Contributed by
[Aug 14, 2017 8:31:34 PM] (jlowe) YARN-6917. Queue path is recomputed from 
scratch on every allocation.
[Aug 14, 2017 10:53:35 PM] (arp) HADOOP-14732. ProtobufRpcEngine should use 
Time.monotonicNow to measure
[Aug 14, 2017 11:22:10 PM] (arp) HADOOP-14673. Remove leftover 
hadoop_xml_escape from functions.
[Aug 15, 2017 2:46:17 AM] (Arun Suresh) YARN-5978. ContainerScheduler and 
ContainerManager changes to support
[Aug 15, 2017 4:57:20 AM] (cdouglas) HADOOP-14726. Mark FileStatus::isDir as 
final




-1 overall


The following subsystems voted -1:
findbugs unit


The following subsystems voted -1 but
were configured to be filtered/ignored:
cc checkstyle javac javadoc pylint shellcheck shelldocs whitespace


The following subsystems are considered long running:
(runtime bigger than 1h  0m  0s)
unit


Specific tests:

FindBugs :

   module:hadoop-hdfs-project/hadoop-hdfs-client 
   Possible exposure of partially initialized object in 
org.apache.hadoop.hdfs.DFSClient.initThreadsNumForStripedReads(int) At 
DFSClient.java:object in 
org.apache.hadoop.hdfs.DFSClient.initThreadsNumForStripedReads(int) At 
DFSClient.java:[line 2906] 
   org.apache.hadoop.hdfs.server.protocol.SlowDiskReports.equals(Object) 
makes inefficient use of keySet iterator instead of entrySet iterator At 
SlowDiskReports.java:keySet iterator instead of entrySet iterator At 
SlowDiskReports.java:[line 105] 
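
As background on the keySet-vs-entrySet warning above, here is a generic illustration of the flagged pattern; it is not the SlowDiskReports code itself.

{code}
import java.util.Map;

public final class MapIterationExample {
  private MapIterationExample() {}

  // Flagged pattern: iterating keySet() and calling get() performs a second
  // lookup for every key.
  static long sumViaKeySet(Map<String, Long> m) {
    long total = 0;
    for (String key : m.keySet()) {
      total += m.get(key);
    }
    return total;
  }

  // Preferred pattern: entrySet() returns key and value together.
  static long sumViaEntrySet(Map<String, Long> m) {
    long total = 0;
    for (Map.Entry<String, Long> e : m.entrySet()) {
      total += e.getValue();
    }
    return total;
  }
}
{code}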

FindBugs :

   module:hadoop-hdfs-project/hadoop-hdfs 
   Possible null pointer dereference in 
org.apache.hadoop.hdfs.qjournal.server.JournalNode.getJournalsStatus() due to 
return value of called method Dereferenced at 
JournalNode.java:org.apache.hadoop.hdfs.qjournal.server.JournalNode.getJournalsStatus()
 due to return value of called method Dereferenced at JournalNode.java:[line 
302] 
   
org.apache.hadoop.hdfs.server.common.HdfsServerConstants$StartupOption.setClusterId(String)
 unconditionally sets the field clusterId At HdfsServerConstants.java:clusterId 
At HdfsServerConstants.java:[line 193] 
   
org.apache.hadoop.hdfs.server.common.HdfsServerConstants$StartupOption.setForce(int)
 unconditionally sets the field force At HdfsServerConstants.java:force At 
HdfsServerConstants.java:[line 217] 
   
org.apache.hadoop.hdfs.server.common.HdfsServerConstants$StartupOption.setForceFormat(boolean)
 unconditionally sets the field isForceFormat At 
HdfsServerConstants.java:isForceFormat At HdfsServerConstants.java:[line 229] 
   
org.apache.hadoop.hdfs.server.common.HdfsServerConstants$StartupOption.setInteractiveFormat(boolean)
 unconditionally sets the field isInteractiveFormat At 
HdfsServerConstants.java:isInteractiveFormat At HdfsServerConstants.java:[line 
237] 
   Possible null pointer dereference in 
org.apache.hadoop.hdfs.server.datanode.DataStorage.linkBlocksHelper(File, File, 
int, HardLink, boolean, File, List) due to return value of called method 
Dereferenced at 
DataStorage.java:org.apache.hadoop.hdfs.server.datanode.DataStorage.linkBlocksHelper(File,
 File, int, HardLink, boolean, File, List) due to return value of called method 
Dereferenced at DataStorage.java:[line 1339] 
   Possible null pointer dereference in 
org.apache.hadoop.hdfs.server.namenode.NNStorageRetentionManager.purgeOldLegacyOIVImages(String,
 long) due to return value of called method Dereferenced at 
NNStorageRetentionManager.java:org.apache.hadoop.hdfs.server.namenode.NNStorageRetentionManager.purgeOldLegacyOIVImages(String,
 long) due to return value of called method Dereferenced at 
NNStorageRetentionManager.java:[line 258] 
   Useless condition:argv.length >= 1 at this point At DFSAdmin.java:[line 
2100] 
   Useless condition:numBlocks == -1 at this point At 
ImageLoaderCurrent.java:[line 727] 

FindBugs :

   
module:hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager
 
   Hard coded reference to an absolute pathname in 
org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.runtime.DockerLinuxContainerRuntime.launchContainer(ContainerRuntimeContext)
 At DockerLinuxContainerRuntime.java:absolute pathname in 

[jira] [Resolved] (HDFS-11814) Benchmark and tune for preferred default cell size

2017-08-15 Thread Andrew Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-11814?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Wang resolved HDFS-11814.

Resolution: Done

Thanks Wei! Let's resolve this one and continue with changing the defaults in 
HDFS-12303.

> Benchmark and tune for preferred default cell size
> -
>
> Key: HDFS-11814
> URL: https://issues.apache.org/jira/browse/HDFS-11814
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: SammiChen
>Assignee: Wei Zhou
>  Labels: hdfs-ec-3.0-must-do
> Attachments: RS-6-3-Concurrent.png, RS-Read.png, RS-Write.png
>
>
> Doing some benchmarking to see which cell size is more desirable, other than 
> the current 64k.






Re: Are binary artifacts part of a release?

2017-08-15 Thread Andrew Wang
To close the thread on this, I'll try to summarize the LEGAL JIRA. I wasn't
able to convince anyone to make changes to the apache.org docs.

Convenience binary artifacts are not official release artifacts and thus
are not voted on. However, since they are distributed by Apache, they are
still subject to the same distribution requirements as official release
artifacts. This means they need to have a LICENSE and NOTICE file, follow
ASF licensing rules, etc. The PMC needs to ensure that binary artifacts
meet these requirements.

However, being a "convenience" artifact doesn't mean it isn't important.
The appropriate level of quality for binary artifacts is left up to the
project. An OpenOffice person mentioned the quality of their binary
artifacts is super important since very few of their users will compile
their own office suite.

I don't know if we've discussed the topic of binary artifact quality in
Hadoop. My stance is that if we're going to publish something, it should be
good, or we shouldn't publish it at all. I think we do want to publish
binary tarballs (it's the easiest way for new users to get started with
Hadoop), so it's fair to consider them when evaluating a release.

Best,
Andrew

On Mon, Jul 31, 2017 at 8:43 PM, Konstantin Shvachko 
wrote:

> It does not. Just adding historical references, as Andrew raised the
> question.
>
> On Mon, Jul 31, 2017 at 7:38 PM, Allen Wittenauer <
> a...@effectivemachines.com> wrote:
>
>>
>> ... that doesn't contradict anything I said.
>>
>> > On Jul 31, 2017, at 7:23 PM, Konstantin Shvachko 
>> wrote:
>> >
>> > The issue was discussed on several occasions in the past.
>> > Took me a while to dig this out as an example:
>> > http://mail-archives.apache.org/mod_mbox/hadoop-general/2011
>> 11.mbox/%3C4EB0827C.6040204%40apache.org%3E
>> >
>> > Doug Cutting:
>> > "Folks should not primarily evaluate binaries when voting. The ASF
>> primarily produces and publishes source-code
>> > so voting artifacts should be optimized for evaluation of that."
>> >
>> > Thanks,
>> > --Konst
>> >
>> > On Mon, Jul 31, 2017 at 4:51 PM, Allen Wittenauer <
>> a...@effectivemachines.com> wrote:
>> >
>> > > On Jul 31, 2017, at 4:18 PM, Andrew Wang 
>> wrote:
>> > >
>> > > Forking this off to not distract from release activities.
>> > >
>> > > I filed https://issues.apache.org/jira/browse/LEGAL-323 to get
>> clarity on the matter. I read the entire webpage, and it could be improved
>> one way or the other.
>> >
>> >
>> > IANAL, my read has always led me to believe:
>> >
>> > * An artifact is anything that is uploaded to dist.a.o
>> and repository.a.o
>> > * A release consists of one or more artifacts
>> ("Releases are, by definition, anything that is published beyond the group
>> that owns it. In our case, that means any publication outside the group of
>> people on the product dev list.")
>> > * One of those artifacts MUST be source
>> > * (insert voting rules here)
>> > * They must be built on a machine in control of the RM
>> > * There are no exceptions for alpha, nightly, etc
>> > * (various other requirements)
>> >
>> > i.e., release != artifact  it's more like release =
>> artifact * n .
>> >
>> > Do you have to have binaries?  No (e.g., Apache SpamAssassin
>> has no binaries to create).  But if you place binaries in dist.a.o or
>> repository.a.o, they are effectively part of your release and must follow
>> the same rules.  (Votes, etc.)
>> >
>> >
>>
>>
>