[jira] [Comment Edited] (HDFS-12345) Scale testing HDFS NameNode with real metadata and workloads (Dynamometer)

2019-06-21 Thread Wei-Chiu Chuang (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-12345?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16869253#comment-16869253
 ] 

Wei-Chiu Chuang edited comment on HDFS-12345 at 6/21/19 7:11 AM:
-

I think it's good. I've taken part of the code to develop a KMS audit replay 
tool, which helped us debug a tricky issue.

The only thing I noticed is that the Dynamometer audit replay tool assumes 
that NameNode audit log timestamps use a 12-hour format in UTC.
{code}
  private static final DateFormat AUDIT_DATE_FORMAT = new SimpleDateFormat(
      "yyyy-MM-dd hh:mm:ss,SSS");
  static {
    AUDIT_DATE_FORMAT.setTimeZone(TimeZone.getTimeZone("UTC"));
  }
{code}
But in reality, the NameNode audit log format is not fixed; it can be changed 
in log4j.properties.

I found this while developing the KMS audit replay tool, because KMS writes 
timestamps in 24-hour format.
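
To illustrate the mismatch, here is a minimal standalone sketch (not Dynamometer code; the class AuditDateFormatDemo and its helper methods are hypothetical names used only for this example). It parses the same 24-hour timestamp once with the 12-hour "hh" pattern and once with the 24-hour "HH" pattern. With "hh", an hour value of 12 is read as midnight, so a record logged at noon ends up 12 hours early.

```java
import java.text.DateFormat;
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.TimeZone;

public class AuditDateFormatDemo {

  // Builds a formatter pinned to UTC, mirroring how the snippet above
  // configures AUDIT_DATE_FORMAT. A new instance is created per call
  // because SimpleDateFormat is not thread-safe.
  static DateFormat utcFormat(String pattern) {
    DateFormat format = new SimpleDateFormat(pattern);
    format.setTimeZone(TimeZone.getTimeZone("UTC"));
    return format;
  }

  // Parses a timestamp with the given pattern and returns epoch milliseconds.
  // ParseException is rethrown unchecked to keep the example short.
  static long parseMillis(String pattern, String timestamp) {
    try {
      return utcFormat(pattern).parse(timestamp).getTime();
    } catch (ParseException e) {
      throw new IllegalArgumentException("Unparseable timestamp: " + timestamp, e);
    }
  }

  public static void main(String[] args) {
    // A timestamp as a 24-hour log layout would write it at half past noon.
    String noon = "2019-06-21 12:30:00,000";

    long twelveHour = parseMillis("yyyy-MM-dd hh:mm:ss,SSS", noon);
    long twentyFourHour = parseMillis("yyyy-MM-dd HH:mm:ss,SSS", noon);

    System.out.println("hh pattern (epoch ms): " + twelveHour);
    System.out.println("HH pattern (epoch ms): " + twentyFourHour);
    System.out.println("difference in hours:   "
        + (twentyFourHour - twelveHour) / 3_600_000L);
  }
}
```

One possible way to handle this, assuming the tool keeps using SimpleDateFormat, would be to make the pattern configurable so it can match whatever layout log4j.properties defines.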


was (Author: jojochuang):
I think it's good. I've taken part of the code to develop a KMS audit replay 
tool and helped us debug a tricky issue.

The only thing I noticed is that the Dynamometer audit replay tool assumes 
NameNode audit log format is in 12-hour format, UTC time.
{code}
  private static final DateFormat AUDIT_DATE_FORMAT = new SimpleDateFormat(
      "yyyy-MM-dd hh:mm:ss,SSS");
  static {
    AUDIT_DATE_FORMAT.setTimeZone(TimeZone.getTimeZone("UTC"));
  }
{code}
But in reality, NameNode audit log format is not fixed, can be changed by 
log4j.properties.

> Scale testing HDFS NameNode with real metadata and workloads (Dynamometer)
> --
>
> Key: HDFS-12345
> URL: https://issues.apache.org/jira/browse/HDFS-12345
> Project: Hadoop HDFS
>  Issue Type: New Feature
>  Components: namenode, test
>Reporter: Zhe Zhang
>Assignee: Erik Krogen
>Priority: Major
> Attachments: HDFS-12345.000.patch, HDFS-12345.001.patch, 
> HDFS-12345.002.patch, HDFS-12345.003.patch, HDFS-12345.004.patch, 
> HDFS-12345.005.patch, HDFS-12345.006.patch, HDFS-12345.007.patch
>
>
> Dynamometer has now been open sourced on our [GitHub 
> page|https://github.com/linkedin/dynamometer]. Read more at our [recent blog 
> post|https://engineering.linkedin.com/blog/2018/02/dynamometer--scale-testing-hdfs-on-minimal-hardware-with-maximum].
> To encourage getting the tool into the open for others to use as quickly as 
> possible, we went through our standard open sourcing process of releasing on 
> GitHub. However we are interested in the possibility of donating this to 
> Apache as part of Hadoop itself and would appreciate feedback on whether or 
> not this is something that would be supported by the community.
> Also of note, previous [discussions on the dev mail 
> lists|http://mail-archives.apache.org/mod_mbox/hadoop-hdfs-dev/201707.mbox/%3c98fceffa-faff-4cf1-a14d-4faab6567...@gmail.com%3e]



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Comment Edited] (HDFS-12345) Scale testing HDFS NameNode with real metadata and workloads (Dynamometer)

2019-03-13 Thread Yiqun Lin (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-12345?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16791392#comment-16791392
 ] 

Yiqun Lin edited comment on HDFS-12345 at 3/13/19 7:13 AM:
---

I'd like to help fix some checkstyle warnings so that the Dynamometer codebase 
follows the Hadoop code format.
Attached the v03 patch.


was (Author: linyiqun):
I'd like to help some checkstyle warnings to let codes follow the Hadoop code 
format.
Attach the v03 patch.

> Scale testing HDFS NameNode with real metadata and workloads (Dynamometer)
> --
>
> Key: HDFS-12345
> URL: https://issues.apache.org/jira/browse/HDFS-12345
> Project: Hadoop HDFS
>  Issue Type: New Feature
>  Components: namenode, test
>Reporter: Zhe Zhang
>Assignee: Siyao Meng
>Priority: Major
> Attachments: HDFS-12345.000.patch, HDFS-12345.001.patch, 
> HDFS-12345.002.patch, HDFS-12345.003.patch



[jira] [Comment Edited] (HDFS-12345) Scale testing HDFS NameNode with real metadata and workloads (Dynamometer)

2019-02-28 Thread Siyao Meng (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-12345?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16780957#comment-16780957
 ] 

Siyao Meng edited comment on HDFS-12345 at 2/28/19 10:15 PM:
-

We might have a small dependency issue with JUnit here.
I noticed that in trunk (3.3.0) the junit jar was removed from the Hadoop 
tarball at some point (it was under ./share/hadoop/mapreduce/lib/junit-4.11.jar 
in Hadoop 3.2.1 and before).
Since dynamometer-infra uses MiniDFSCluster, which depends on JUnit, we won't be 
able to run the dynamometer-infra jar in trunk. It still compiles; the JUnit jar 
is just not included in the distribution tarball.
I've tried adding the junit jar back somewhere in the tarball, such as 
./share/hadoop/tools/lib/, but no luck so far. Question: is the Maven 
"copy-dependencies" goal the right way to do it? I need some help here.


was (Author: smeng):
We might have a little dependency issue here on JUnit.
I noticed that in the trunk (3.3.0), junit jar has been removed from the Hadoop 
tarball at some point (it was under ./share/hadoop/mapreduce/lib/junit-4.11.jar 
in Hadoop 3.2.1 and before).
Since dynamometer-infra uses MiniDFSCluster that depends on JUnit, we won't be 
able to run dynamometer-infra jar in trunk. It can still be compiled, just not 
included in the distribution tarball.
I've tried to add junit jar back under somewhere in the tarball like 
./share/hadoop/tools/lib/ but no luck so far. Question: is maven 
"copy-dependencies" the right way to do it? Need some help here.

> Scale testing HDFS NameNode with real metadata and workloads (Dynamometer)
> --
>
> Key: HDFS-12345
> URL: https://issues.apache.org/jira/browse/HDFS-12345
> Project: Hadoop HDFS
>  Issue Type: New Feature
>  Components: namenode, test
>Reporter: Zhe Zhang
>Assignee: Erik Krogen
>Priority: Major
> Attachments: HDFS-12345.000.patch, HDFS-12345.001.patch



[jira] [Comment Edited] (HDFS-12345) Scale testing HDFS NameNode with real metadata and workloads (Dynamometer)

2019-02-13 Thread Siyao Meng (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-12345?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16767673#comment-16767673
 ] 

Siyao Meng edited comment on HDFS-12345 at 2/13/19 10:55 PM:
-

[~xkrogen] Excited to see Dynamometer being merged upstream!

I fixed two small issues while applying the rev 000 patch to my local trunk and 
compiling it.
1. fsimage_0061740 is a binary file, so its content is not included in 
patch rev 000. I grabbed the file 
[here|https://github.com/xkrogen/dynamometer/raw/ekrogen-hadoop-3-support/dynamometer-infra/src/test/resources/hadoop_3_1/fsimage_0061740]
 and regenerated the patch with "git diff --binary".
2. pom.xml is missing the mockito version. I took version 1.10.19 from 
[build.gradle|https://github.com/xkrogen/dynamometer/blob/ekrogen-hadoop-3-support/dynamometer-infra/build.gradle].
{code:title=mvn package -Pdist -DskipTests -Dtar -e}
[ERROR]   The project org.apache.hadoop:hadoop-dynamometer-infra:3.3.0-SNAPSHOT (/Users/smeng/repo/trunk/hadoop-tools/hadoop-dynamometer/hadoop-dynamometer-infra/pom.xml) has 1 error
[ERROR] 'dependencies.dependency.version' for org.mockito:mockito-all:jar is missing. @ line 75, column 17
[ERROR]
[ERROR]   The project org.apache.hadoop:hadoop-dynamometer-blockgen:3.3.0-SNAPSHOT (/Users/smeng/repo/trunk/hadoop-tools/hadoop-dynamometer/hadoop-dynamometer-blockgen/pom.xml) has 1 error
[ERROR] 'dependencies.dependency.version' for org.mockito:mockito-all:jar is missing. @ line 37, column 17
{code}

 [^HDFS-12345.001.patch] 


was (Author: smeng):
[~xkrogen] Excited to see Dynamometer merging to upstream!

I fixed two small issues when applying rev 000 patch to my local trunk and 
compiling it.
1. fsimage_0061740 is binary, content of which is not included in 
patch rev 000. I grabbed the file 
[here](https://github.com/xkrogen/dynamometer/raw/ekrogen-hadoop-3-support/dynamometer-infra/src/test/resources/hadoop_3_1/fsimage_0061740)
 and generated the patch with "git diff --binary"
2. pom.xml missing mockito version. I grabbed ver 1.10.19 from 
[build.gradle](https://github.com/xkrogen/dynamometer/blob/ekrogen-hadoop-3-support/dynamometer-infra/build.gradle).
{code:title=mvn package -Pdist -DskipTests -Dtar -e}
[ERROR]   The project org.apache.hadoop:hadoop-dynamometer-infra:3.3.0-SNAPSHOT (/Users/smeng/repo/trunk/hadoop-tools/hadoop-dynamometer/hadoop-dynamometer-infra/pom.xml) has 1 error
[ERROR] 'dependencies.dependency.version' for org.mockito:mockito-all:jar is missing. @ line 75, column 17
[ERROR]
[ERROR]   The project org.apache.hadoop:hadoop-dynamometer-blockgen:3.3.0-SNAPSHOT (/Users/smeng/repo/trunk/hadoop-tools/hadoop-dynamometer/hadoop-dynamometer-blockgen/pom.xml) has 1 error
[ERROR] 'dependencies.dependency.version' for org.mockito:mockito-all:jar is missing. @ line 37, column 17
{code}

 [^HDFS-12345.001.patch] 

> Scale testing HDFS NameNode with real metadata and workloads (Dynamometer)
> --
>
> Key: HDFS-12345
> URL: https://issues.apache.org/jira/browse/HDFS-12345
> Project: Hadoop HDFS
>  Issue Type: New Feature
>  Components: namenode, test
>Reporter: Zhe Zhang
>Assignee: Erik Krogen
>Priority: Major
> Attachments: HDFS-12345.000.patch, HDFS-12345.001.patch