[jira] [Resolved] (HADOOP-15207) Hadoop performance Issues
[ https://issues.apache.org/jira/browse/HADOOP-15207?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Aaron Fabbri resolved HADOOP-15207.
-----------------------------------
    Resolution: Invalid

Can someone please ban this user? Similar SEO spam has been pasted on other sites I can see. We should remove the link and/or delete the Jira too.

> Hadoop performance Issues
> -------------------------
>
>                 Key: HADOOP-15207
>                 URL: https://issues.apache.org/jira/browse/HADOOP-15207
>             Project: Hadoop Common
>          Issue Type: Bug
>          Components: common
>    Affects Versions: HADOOP-13345
>            Reporter: nicole wells
>            Priority: Minor
>
> I am doing a hadoop project where I am working with 100MB, 500MB, and 1GB files. A multinode hadoop cluster with 4 nodes is implemented for the purpose. The time taken to run the mapreduce program on the multinode cluster is much larger than the time taken on the single node cluster setup. Also, it is shocking to observe that a basic Java program (without [Hadoop BigData|https://mindmajix.com/hadoop-training]) finishes the operation faster than both the single and multi node clusters.
> Here is the code for the mapper class:
> {code:java}
> public class myMapperClass extends MapReduceBase implements Mapper<LongWritable, Text, Text, IntWritable> {
>     private final static IntWritable one = new IntWritable(1);
>     private final static IntWritable two = new IntWritable(2);
>     private final static IntWritable three = new IntWritable(3);
>     private final static IntWritable four = new IntWritable(4);
>     private final static IntWritable five = new IntWritable(5);
>     private final static IntWritable six = new IntWritable(6);
>     private final static IntWritable seven = new IntWritable(7);
>     private final static IntWritable eight = new IntWritable(8);
>     private final static IntWritable nine = new IntWritable(9);
>     private Text srcIP, srcIPN;
>     private Text dstIP, dstIPN;
>     private Text srcPort, srcPortN;
>     private Text dstPort, dstPortN;
>     private Text counter1, counter2, counter3, counter4, counter5;
>     //private Text total_records;
>     int ddos_line = 0;
>
>     //map method that performs the tokenizer job and framing the initial key value pairs
>     @Override
>     public void map(LongWritable key, Text value, OutputCollector<Text, IntWritable> output, Reporter reporter) throws IOException
>     {
>         String line1 = value.toString();
>         ddos_line++;
>         int pos1 = 0;
>         int lineno = 0;
>         int[] count = {10, 10, 10, 10, 10};
>         int[] lineIndex = {0, 0, 0, 0, 0};
>         for (int i = 0; i < 9; i++)
>         {
>             pos1 = line1.indexOf("|", pos1 + 1);
>         }
>         srcIP = new Text(line1.substring(0, line1.indexOf("|")));
>         String srcIPP = srcIP.toString();
>         dstIP = new Text(line1.substring(srcIPP.length() + 1, line1.indexOf("|", srcIPP.length() + 1)));
>         srcPort = new Text(line1.substring(pos1 + 1, line1.indexOf("|", pos1 + 1)));
>         pos1 = line1.indexOf("|", pos1 + 1);
>         dstPort = new Text(line1.substring(pos1 + 1, line1.indexOf("|", pos1 + 1)));
>         //BufferedReader br = new BufferedReader(new FileReader("/home/yogi/Desktop/normal_small"));
>         FileSystem fs = FileSystem.get(new Configuration());
>         FileStatus[] status = fs.listStatus(new Path("hdfs://master:54310/usr/local/hadoop/input/normal_small"));
>         BufferedReader br = new BufferedReader(new InputStreamReader(fs.open(status[0].getPath())));
>         String line = br.readLine();
>         lineno++;
>         boolean bool = true;
>         while (bool) {
>             for (int i = 0; i < 5; i++)
>             {
>                 if (bool == false)
>                     break;
>                 int pos = 0;
>                 int temp;
>                 for (int j = 0; j < 9; j++)
>                 {
>                     pos = line.indexOf("|", pos + 1);
>                 }
>                 srcIPN = new Text(line.substring(0, line.indexOf("|")));
>                 String srcIPP2 = srcIPN.toString();
>                 dstIPN = new Text(line.substring(srcIPP2.length() + 1, line.indexOf("|", srcIPP2.length() + 1)));
>                 srcPortN = new Text(line.substring(pos + 1, line.indexOf("|", pos + 1)));
>                 pos = line.indexOf("|", pos + 1);
>                 dstPortN = new Text(line.substring(pos + 1, line.indexOf("|", pos + 1)));
>                 if (srcIP.equals(srcIPN) && dstIP.equals(dstIPN))
>                 {
>                     int tmp, tmp2;
>                     tmp = Integer.parseInt(srcPort.toString()) - Integer.parseInt(srcPortN.toString());
>                     if (tmp < 0)
>                         tmp *= -1;
>                     tmp2 = Integer.parseInt(dstPort.toString()) - Integer.parseInt(dstPortN.toString());
[jira] [Created] (HADOOP-15207) Hadoop performance Issues
nicole wells created HADOOP-15207:
-------------------------------------

             Summary: Hadoop performance Issues
                 Key: HADOOP-15207
                 URL: https://issues.apache.org/jira/browse/HADOOP-15207
             Project: Hadoop Common
          Issue Type: Bug
          Components: common
    Affects Versions: HADOOP-13345
            Reporter: nicole wells

I am doing a hadoop project where I am working with 100MB, 500MB, and 1GB files. A multinode hadoop cluster with 4 nodes is implemented for the purpose. The time taken to run the mapreduce program on the multinode cluster is much larger than the time taken on the single node cluster setup. Also, it is shocking to observe that a basic Java program (without [Hadoop BigData|https://mindmajix.com/hadoop-training]) finishes the operation faster than both the single and multi node clusters.

Here is the code for the mapper class:

{code:java}
public class myMapperClass extends MapReduceBase implements Mapper<LongWritable, Text, Text, IntWritable> {
    private final static IntWritable one = new IntWritable(1);
    private final static IntWritable two = new IntWritable(2);
    private final static IntWritable three = new IntWritable(3);
    private final static IntWritable four = new IntWritable(4);
    private final static IntWritable five = new IntWritable(5);
    private final static IntWritable six = new IntWritable(6);
    private final static IntWritable seven = new IntWritable(7);
    private final static IntWritable eight = new IntWritable(8);
    private final static IntWritable nine = new IntWritable(9);
    private Text srcIP, srcIPN;
    private Text dstIP, dstIPN;
    private Text srcPort, srcPortN;
    private Text dstPort, dstPortN;
    private Text counter1, counter2, counter3, counter4, counter5;
    //private Text total_records;
    int ddos_line = 0;

    //map method that performs the tokenizer job and framing the initial key value pairs
    @Override
    public void map(LongWritable key, Text value, OutputCollector<Text, IntWritable> output, Reporter reporter) throws IOException
    {
        String line1 = value.toString();
        ddos_line++;
        int pos1 = 0;
        int lineno = 0;
        int[] count = {10, 10, 10, 10, 10};
        int[] lineIndex = {0, 0, 0, 0, 0};
        for (int i = 0; i < 9; i++)
        {
            pos1 = line1.indexOf("|", pos1 + 1);
        }
        srcIP = new Text(line1.substring(0, line1.indexOf("|")));
        String srcIPP = srcIP.toString();
        dstIP = new Text(line1.substring(srcIPP.length() + 1, line1.indexOf("|", srcIPP.length() + 1)));
        srcPort = new Text(line1.substring(pos1 + 1, line1.indexOf("|", pos1 + 1)));
        pos1 = line1.indexOf("|", pos1 + 1);
        dstPort = new Text(line1.substring(pos1 + 1, line1.indexOf("|", pos1 + 1)));
        //BufferedReader br = new BufferedReader(new FileReader("/home/yogi/Desktop/normal_small"));
        FileSystem fs = FileSystem.get(new Configuration());
        FileStatus[] status = fs.listStatus(new Path("hdfs://master:54310/usr/local/hadoop/input/normal_small"));
        BufferedReader br = new BufferedReader(new InputStreamReader(fs.open(status[0].getPath())));
        String line = br.readLine();
        lineno++;
        boolean bool = true;
        while (bool) {
            for (int i = 0; i < 5; i++)
            {
                if (bool == false)
                    break;
                int pos = 0;
                int temp;
                for (int j = 0; j < 9; j++)
                {
                    pos = line.indexOf("|", pos + 1);
                }
                srcIPN = new Text(line.substring(0, line.indexOf("|")));
                String srcIPP2 = srcIPN.toString();
                dstIPN = new Text(line.substring(srcIPP2.length() + 1, line.indexOf("|", srcIPP2.length() + 1)));
                srcPortN = new Text(line.substring(pos + 1, line.indexOf("|", pos + 1)));
                pos = line.indexOf("|", pos + 1);
                dstPortN = new Text(line.substring(pos + 1, line.indexOf("|", pos + 1)));
                if (srcIP.equals(srcIPN) && dstIP.equals(dstIPN))
                {
                    int tmp, tmp2;
                    tmp = Integer.parseInt(srcPort.toString()) - Integer.parseInt(srcPortN.toString());
                    if (tmp < 0)
                        tmp *= -1;
                    tmp2 = Integer.parseInt(dstPort.toString()) - Integer.parseInt(dstPortN.toString());
                    if (tmp2 < 0)
                        tmp2 *= -1;
                    temp = tmp + tmp2;
                    if (count[4] > temp)
                    {
                        count[4] = temp;
                        lineIndex[4] = lineno;
                    }
                    for (int k = 0; k < 5; k++)
                    {
                        for (int j = 0; j < 4; j++)
                        {
                            if (count[j] > count[j+1])
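Two things stand out in the mapper above. First, the chained indexOf("|") arithmetic is hard to verify; it implies a '|'-delimited record with the IPs in the first two fields and the ports in the tenth and eleventh. As a minimal, self-contained sketch of that same extraction (the record layout and the sample line below are assumptions inferred from the indexOf arithmetic, not taken from the reporter's data):

```java
// Hypothetical sketch: same field extraction as the mapper, written with split().
// Assumption: '|'-delimited records, IPs in fields 0-1, ports in fields 9-10.
public class FlowRecordParser {
    /** Returns { srcIP, dstIP, srcPort, dstPort } from one record line. */
    public static String[] parse(String line) {
        String[] f = line.split("\\|");          // split on the literal '|'
        return new String[] { f[0], f[1], f[9], f[10] };
    }

    public static void main(String[] args) {
        // hypothetical sample record with 12 fields
        String sample = "10.0.0.1|10.0.0.2|a|b|c|d|e|f|g|5050|80|x";
        System.out.println(String.join(",", parse(sample)));
        // prints 10.0.0.1,10.0.0.2,5050,80
    }
}
```

Second, and likely the dominant cost: the mapper opens the HDFS FileSystem and re-reads the entire normal_small file inside every map() call, so each input record triggers a full remote file scan. Hoisting that read into configure() (or setup() in the new API) would probably remove most of the multinode slowdown being reported.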
Apache Hadoop qbt Report: branch2+JDK7 on Linux/x86
For more details, see https://builds.apache.org/job/hadoop-qbt-branch2-java7-linux-x86/124/

[Feb 1, 2018 6:38:39 PM] (inigoiri) HDFS-13043. RBF: Expose the state of the Routers in the federation.

-1 overall

The following subsystems voted -1:
    docker

Powered by Apache Yetus 0.8.0-SNAPSHOT
http://yetus.apache.org

---------------------------------------------------------------------
To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-dev-h...@hadoop.apache.org
Apache Hadoop qbt Report: trunk+JDK9 on Linux/x86
For more details, see https://builds.apache.org/job/hadoop-qbt-trunk-java9-linux-x86/9/

No changes

-1 overall

The following subsystems voted -1:
    compile findbugs mvninstall mvnsite shadedclient unit xml

The following subsystems voted -1 but were configured to be filtered/ignored:
    cc checkstyle javac javadoc pylint shellcheck shelldocs whitespace

   mvninstall:
       https://builds.apache.org/job/hadoop-qbt-trunk-java9-linux-x86/9/artifact/out/patch-mvninstall-root.txt [1.7M]

   compile:
       https://builds.apache.org/job/hadoop-qbt-trunk-java9-linux-x86/9/artifact/out/patch-compile-root.txt [48K]

   cc:
       https://builds.apache.org/job/hadoop-qbt-trunk-java9-linux-x86/9/artifact/out/patch-compile-root.txt [48K]

   javac:
       https://builds.apache.org/job/hadoop-qbt-trunk-java9-linux-x86/9/artifact/out/patch-compile-root.txt [48K]

   checkstyle:
       https://builds.apache.org/job/hadoop-qbt-trunk-java9-linux-x86/9/artifact/out/diff-checkstyle-root.txt [3.1M]

   mvnsite:
       https://builds.apache.org/job/hadoop-qbt-trunk-java9-linux-x86/9/artifact/out/patch-mvnsite-root.txt [48K]

   pylint:
       https://builds.apache.org/job/hadoop-qbt-trunk-java9-linux-x86/9/artifact/out/diff-patch-pylint.txt [24K]

   shellcheck:
       https://builds.apache.org/job/hadoop-qbt-trunk-java9-linux-x86/9/artifact/out/diff-patch-shellcheck.txt [20K]

   shelldocs:
       https://builds.apache.org/job/hadoop-qbt-trunk-java9-linux-x86/9/artifact/out/diff-patch-shelldocs.txt [12K]

   whitespace:
       https://builds.apache.org/job/hadoop-qbt-trunk-java9-linux-x86/9/artifact/out/whitespace-eol.txt [9.2M]
       https://builds.apache.org/job/hadoop-qbt-trunk-java9-linux-x86/9/artifact/out/whitespace-tabs.txt [292K]

   xml:
       https://builds.apache.org/job/hadoop-qbt-trunk-java9-linux-x86/9/artifact/out/xml.txt [8.0K]

   findbugs:
       https://builds.apache.org/job/hadoop-qbt-trunk-java9-linux-x86/9/artifact/out/branch-findbugs-hadoop-common-project_hadoop-annotations.txt [260K]
       https://builds.apache.org/job/hadoop-qbt-trunk-java9-linux-x86/9/artifact/out/branch-findbugs-hadoop-common-project_hadoop-auth.txt [8.0K]
       https://builds.apache.org/job/hadoop-qbt-trunk-java9-linux-x86/9/artifact/out/branch-findbugs-hadoop-common-project_hadoop-auth-examples.txt [8.0K]
       https://builds.apache.org/job/hadoop-qbt-trunk-java9-linux-x86/9/artifact/out/branch-findbugs-hadoop-common-project_hadoop-common.txt [12K]
       https://builds.apache.org/job/hadoop-qbt-trunk-java9-linux-x86/9/artifact/out/branch-findbugs-hadoop-common-project_hadoop-kms.txt [8.0K]
       https://builds.apache.org/job/hadoop-qbt-trunk-java9-linux-x86/9/artifact/out/branch-findbugs-hadoop-common-project_hadoop-minikdc.txt [8.0K]
       https://builds.apache.org/job/hadoop-qbt-trunk-java9-linux-x86/9/artifact/out/branch-findbugs-hadoop-common-project_hadoop-nfs.txt [4.0K]
       https://builds.apache.org/job/hadoop-qbt-trunk-java9-linux-x86/9/artifact/out/branch-findbugs-hadoop-hdfs-project_hadoop-hdfs.txt [16K]
       https://builds.apache.org/job/hadoop-qbt-trunk-java9-linux-x86/9/artifact/out/branch-findbugs-hadoop-hdfs-project_hadoop-hdfs-client.txt [16K]
       https://builds.apache.org/job/hadoop-qbt-trunk-java9-linux-x86/9/artifact/out/branch-findbugs-hadoop-hdfs-project_hadoop-hdfs-httpfs.txt [8.0K]
       https://builds.apache.org/job/hadoop-qbt-trunk-java9-linux-x86/9/artifact/out/branch-findbugs-hadoop-hdfs-project_hadoop-hdfs-nfs.txt [8.0K]
       https://builds.apache.org/job/hadoop-qbt-trunk-java9-linux-x86/9/artifact/out/branch-findbugs-hadoop-mapreduce-project_hadoop-mapreduce-client_hadoop-mapreduce-client-app.txt [36K]
       https://builds.apache.org/job/hadoop-qbt-trunk-java9-linux-x86/9/artifact/out/branch-findbugs-hadoop-mapreduce-project_hadoop-mapreduce-client_hadoop-mapreduce-client-common.txt [12K]
       https://builds.apache.org/job/hadoop-qbt-trunk-java9-linux-x86/9/artifact/out/branch-findbugs-hadoop-mapreduce-project_hadoop-mapreduce-client_hadoop-mapreduce-client-core.txt [12K]
       https://builds.apache.org/job/hadoop-qbt-trunk-java9-linux-x86/9/artifact/out/branch-findbugs-hadoop-mapreduce-project_hadoop-mapreduce-client_hadoop-mapreduce-client-hs.txt [12K]
       https://builds.apache.org/job/hadoop-qbt-trunk-java9-linux-x86/9/artifact/out/branch-findbugs-hadoop-mapreduce-project_hadoop-mapreduce-client_hadoop-mapreduce-client-hs-plugins.txt [12K]
       https://builds.apache.org/job/hadoop-qbt-trunk-java9-linux-x86/9/artifact/out/branch-findbugs-hadoop-mapreduce-project_hadoop-mapreduce-client_hadoop-mapreduce-client-jobclient.txt [20K]
       https://builds.apache.org/job/hadoop-qbt-trunk-java9-linux-x86/9/artifact/out/branch-findbugs-hadoop-mapreduce-project_hadoop-mapreduce-client_hadoop-mapreduce-client-nativetask.txt [12K]
Apache Hadoop qbt Report: trunk+JDK8 on Linux/x86
For more details, see https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/676/

[Feb 1, 2018 6:14:09 PM] (jlowe) Revert "YARN-7677. Docker image cannot set HADOOP_CONF_DIR. Contributed
[Feb 1, 2018 6:37:14 PM] (inigoiri) HDFS-13043. RBF: Expose the state of the Routers in the federation.
[Feb 1, 2018 6:45:34 PM] (xyao) HDFS-12997. Move logging to slf4j in BlockPoolSliceStorage and Storage.
[Feb 1, 2018 8:28:17 PM] (hanishakoneru) HDFS-13062. Provide support for JN to use separate journal disk per
[Feb 1, 2018 11:33:52 PM] (xiao) HADOOP-15197. Remove tomcat from the Hadoop-auth test bundle.
Apache Hadoop qbt Report: trunk+JDK9 on Linux/x86
For more details, see https://builds.apache.org/job/hadoop-qbt-trunk-java9-linux-x86/8/

[Oct 29, 2017 11:44:16 PM] (yufei) YARN-6747. TestFSAppStarvation.testPreemptionEnable fails
[Oct 30, 2017 1:54:33 AM] (templedf) YARN-7374. Improve performance of DRF comparisons for resource types in
[Oct 30, 2017 4:49:22 PM] (cdouglas) HADOOP-14992. Upgrade Avro patch version. Contributed by Bharat
[Oct 30, 2017 6:04:22 PM] (templedf) YARN-6927. Add support for individual resource types requests in
[Oct 30, 2017 7:41:28 PM] (templedf) YARN-7336. Unsafe cast from long to int Resource.hashCode() method
[Oct 30, 2017 10:16:51 PM] (junping_du) HADOOP-14990. Clean up jdiff xml files added for 2.8.2 release.
[Oct 31, 2017 4:49:15 AM] (aajisaka) HADOOP-14980. [JDK9] Upgrade maven-javadoc-plugin to 3.0.0-M1.
[Oct 31, 2017 4:50:28 AM] (aajisaka) Revert "HADOOP-14980. [JDK9] Upgrade maven-javadoc-plugin to 3.0.0-M1.
[Oct 31, 2017 4:51:26 AM] (aajisaka) HADOOP-14980. [JDK9] Upgrade maven-javadoc-plugin to 3.0.0-M1.
[Oct 31, 2017 7:36:02 AM] (aajisaka) YARN-7407. Moving logging APIs over to slf4j in
[Oct 31, 2017 8:09:45 AM] (aajisaka) YARN-7379. Moving logging APIs over to slf4j in hadoop-yarn-client.
[Oct 31, 2017 2:30:13 PM] (jlowe) HADOOP-14919. BZip2 drops records when reading data in splits.
[Oct 31, 2017 2:34:01 PM] (mackrorysd) HDFS-206. Support for head in FSShell. Contributed by Gabor Bota.
[Oct 31, 2017 4:44:01 PM] (cdouglas) HDFS-7878. API - expose a unique file identifier.
[Oct 31, 2017 5:21:42 PM] (inigoiri) HDFS-12699. TestMountTable fails with Java 7. Contributed by Inigo
[Oct 31, 2017 5:23:00 PM] (arp) HDFS-12499. dfs.namenode.shared.edits.dir property is currently namenode
[Oct 31, 2017 5:46:10 PM] (wang) Revert "HDFS-12499. dfs.namenode.shared.edits.dir property is currently
[Oct 31, 2017 7:05:43 PM] (subru) YARN-6413. FileSystem based Yarn Registry implementation. (Ellen Hui via
[Nov 1, 2017 4:58:14 AM] (lei) HDFS-12482. Provide a configuration to adjust the weight of EC recovery
[Nov 1, 2017 5:44:16 AM] (jzhuge) HDFS-12714. Hadoop 3 missing fix for HDFS-5169. Contributed by Joe
[Nov 1, 2017 6:37:08 AM] (yqlin) HDFS-12219. Javadoc for FSNamesystem#getMaxObjects is incorrect.
[Nov 1, 2017 8:41:45 AM] (wwei) HDFS-12744. More logs when short-circuit read is failed and disabled.
[Nov 1, 2017 8:26:37 PM] (inigoiri) YARN-7276 addendum to add timeline service depencies. Contributed by
[Nov 1, 2017 9:48:16 PM] (junping_du) YARN-7400. Incorrect log preview displayed in jobhistory server ui.
[Nov 1, 2017 10:39:56 PM] (eyang) YARN-7412. Fix unit test for docker mount check on ubuntu. (Contributed
[Nov 2, 2017 12:00:32 AM] (jianhe) YARN-7396. NPE when accessing container logs due to null dirsHandler.
[Nov 2, 2017 8:25:19 AM] (rohithsharmaks) addendum patch for YARN-7289.
[Nov 2, 2017 8:43:08 AM] (aajisaka) MAPREDUCE-6983. Moving logging APIs over to slf4j in
[Nov 2, 2017 9:12:04 AM] (sammi.chen) HADOOP-14997. Add hadoop-aliyun as dependency of hadoop-cloud-storage.
[Nov 2, 2017 9:32:24 AM] (aajisaka) MAPREDUCE-6999. Fix typo onf in DynamicInputChunk.java. Contributed by
[Nov 2, 2017 2:37:17 PM] (jlowe) YARN-7286. Add support for docker to have no capabilities. Contributed
[Nov 2, 2017 4:51:28 PM] (wangda) YARN-7364. Queue dash board in new YARN UI has incorrect values. (Sunil
[Nov 2, 2017 5:37:33 PM] (epayne) YARN-7370: Preemption properties should be refreshable. Contrubted by
[Nov 3, 2017 12:15:33 AM] (arun suresh) HADOOP-15013. Fix ResourceEstimator findbugs issues. (asuresh)
[Nov 3, 2017 12:39:23 AM] (subru) YARN-7432. Fix DominantResourceFairnessPolicy serializable findbugs
[Nov 3, 2017 1:55:29 AM] (sunilg) YARN-7410. Cleanup FixedValueResource to avoid dependency to
[Nov 3, 2017 4:27:35 AM] (xiao) HDFS-12682. ECAdmin -listPolicies will always show
[Nov 3, 2017 4:29:53 AM] (inigoiri) YARN-7434. Router getApps REST invocation fails with multiple RMs.
[Nov 3, 2017 4:53:13 AM] (xiao) HDFS-12725. BlockPlacementPolicyRackFaultTolerant fails with very uneven
[Nov 3, 2017 6:15:50 AM] (sunilg) YARN-7392. Render cluster information on new YARN web ui. Contributed by
[Nov 3, 2017 7:05:45 PM] (xiao) HDFS-11467. Support ErasureCoding section in OIV XML/ReverseXML.
[Nov 3, 2017 8:16:46 PM] (kihwal) HDFS-12771. Add genstamp and block size to metasave Corrupt blocks list.
[Nov 3, 2017 9:30:57 PM] (cdouglas) HDFS-12681. Fold HdfsLocatedFileStatus into HdfsFileStatus.
[Nov 3, 2017 11:10:37 PM] (xyao) HADOOP-14987. Improve KMSClientProvider log around delegation token
[Nov 4, 2017 3:34:40 AM] (xyao) HDFS-10528. Add logging to successful standby checkpointing. Contributed
[Nov 4, 2017 4:01:56 AM] (liuml07) HADOOP-15015. TestConfigurationFieldsBase to use SLF4J for logging.
[Nov 6, 2017 7:28:38 AM] (naganarasimha_gr) MAPREDUCE-6975. Logging task counters. Contributed by Prabhu Joseph.
[Nov 6, 2017 5:09:10 PM] (bibinchundatt) Add containerId to Localizer failed logs. Contributed by
Re: Apache Hadoop 3.0.1 Release plan
Sounds good to me, ATM.

On Thu, Feb 1, 2018 at 2:34 PM, Aaron T. Myers wrote:
> Hey Anu,
>
> My feeling on HDFS-12990 is that we've discussed it quite a bit already and
> it doesn't seem at this point like either side is going to budge. I'm
> certainly happy to have a phone call about it, but I don't expect that we'd
> make much progress.
>
> My suggestion is that we simply include the patch posted to HDFS-12990 in
> the 3.0.1 RC and call this issue out clearly in the subsequent VOTE thread
> for the 3.0.1 release. Eddy, are you up for that?
>
> Best,
> Aaron
>
> On Thu, Feb 1, 2018 at 1:13 PM, Lei Xu wrote:
>>
>> +Xiao
>>
>> My understanding is that we will have this for 3.0.1. Xiao, could
>> you give your inputs here?
>>
>> On Thu, Feb 1, 2018 at 11:55 AM, Anu Engineer wrote:
>> > Hi Eddy,
>> >
>> > Thanks for driving this release. Just a quick question, do we have time
>> > to close this issue?
>> > https://issues.apache.org/jira/browse/HDFS-12990
>> >
>> > or are we abandoning it? I believe that this is the last window for us
>> > to fix this issue.
>> >
>> > Should we have a call and get this resolved one way or another?
>> >
>> > Thanks
>> > Anu
>> >
>> > On 2/1/18, 10:51 AM, "Lei Xu" wrote:
>> >
>> >     Hi, All
>> >
>> >     I just cut branch-3.0.1 from branch-3.0. Please make sure all patches
>> >     targeted to 3.0.1 being checked in both branch-3.0 and branch-3.0.1.
>> >
>> >     Thanks!
>> >     Eddy
>> >
>> >     On Tue, Jan 9, 2018 at 11:17 AM, Lei Xu wrote:
>> >     > Hi, All
>> >     >
>> >     > We have released Apache Hadoop 3.0.0 in December [1]. To further
>> >     > improve the quality of release, we plan to cut branch-3.0.1 branch
>> >     > tomorrow for the preparation of Apache Hadoop 3.0.1 release. The focus
>> >     > of 3.0.1 will be fixing blockers (3), critical bugs (1) and bug fixes
>> >     > [2]. No new features and improvement should be included.
>> >     >
>> >     > We plan to cut branch-3.0.1 tomorrow (Jan 10th) and vote for RC on Feb
>> >     > 1st, targeting for Feb 9th release.
>> >     >
>> >     > Please feel free to share your insights.
>> >     >
>> >     > [1] https://www.mail-archive.com/general@hadoop.apache.org/msg07757.html
>> >     > [2] https://issues.apache.org/jira/issues/?filter=12342842
>> >     >
>> >     > Best,
>> >     > --
>> >     > Lei (Eddy) Xu
>> >     > Software Engineer, Cloudera
>> >
>> >     --
>> >     Lei (Eddy) Xu
>> >     Software Engineer, Cloudera
>>
>> --
>> Lei (Eddy) Xu
>> Software Engineer, Cloudera

--
Lei (Eddy) Xu
Software Engineer, Cloudera
Re: Apache Hadoop 3.0.1 Release plan
Hey Anu,

My feeling on HDFS-12990 is that we've discussed it quite a bit already and it doesn't seem at this point like either side is going to budge. I'm certainly happy to have a phone call about it, but I don't expect that we'd make much progress.

My suggestion is that we simply include the patch posted to HDFS-12990 in the 3.0.1 RC and call this issue out clearly in the subsequent VOTE thread for the 3.0.1 release. Eddy, are you up for that?

Best,
Aaron

On Thu, Feb 1, 2018 at 1:13 PM, Lei Xu wrote:
> +Xiao
>
> My understanding is that we will have this for 3.0.1. Xiao, could
> you give your inputs here?
>
> On Thu, Feb 1, 2018 at 11:55 AM, Anu Engineer wrote:
> > Hi Eddy,
> >
> > Thanks for driving this release. Just a quick question, do we have time
> > to close this issue?
> > https://issues.apache.org/jira/browse/HDFS-12990
> >
> > or are we abandoning it? I believe that this is the last window for us
> > to fix this issue.
> >
> > Should we have a call and get this resolved one way or another?
> >
> > Thanks
> > Anu
> >
> > On 2/1/18, 10:51 AM, "Lei Xu" wrote:
> >
> >     Hi, All
> >
> >     I just cut branch-3.0.1 from branch-3.0. Please make sure all patches
> >     targeted to 3.0.1 being checked in both branch-3.0 and branch-3.0.1.
> >
> >     Thanks!
> >     Eddy
> >
> >     On Tue, Jan 9, 2018 at 11:17 AM, Lei Xu wrote:
> >     > Hi, All
> >     >
> >     > We have released Apache Hadoop 3.0.0 in December [1]. To further
> >     > improve the quality of release, we plan to cut branch-3.0.1 branch
> >     > tomorrow for the preparation of Apache Hadoop 3.0.1 release. The focus
> >     > of 3.0.1 will be fixing blockers (3), critical bugs (1) and bug fixes
> >     > [2]. No new features and improvement should be included.
> >     >
> >     > We plan to cut branch-3.0.1 tomorrow (Jan 10th) and vote for RC on Feb
> >     > 1st, targeting for Feb 9th release.
> >     >
> >     > Please feel free to share your insights.
> >     >
> >     > [1] https://www.mail-archive.com/general@hadoop.apache.org/msg07757.html
> >     > [2] https://issues.apache.org/jira/issues/?filter=12342842
> >     >
> >     > Best,
> >     > --
> >     > Lei (Eddy) Xu
> >     > Software Engineer, Cloudera
> >
> >     --
> >     Lei (Eddy) Xu
> >     Software Engineer, Cloudera
>
> --
> Lei (Eddy) Xu
> Software Engineer, Cloudera
Re: Apache Hadoop 3.0.1 Release plan
+Xiao

My understanding is that we will have this for 3.0.1. Xiao, could you give your inputs here?

On Thu, Feb 1, 2018 at 11:55 AM, Anu Engineer wrote:
> Hi Eddy,
>
> Thanks for driving this release. Just a quick question, do we have time to
> close this issue?
> https://issues.apache.org/jira/browse/HDFS-12990
>
> or are we abandoning it? I believe that this is the last window for us to fix
> this issue.
>
> Should we have a call and get this resolved one way or another?
>
> Thanks
> Anu
>
> On 2/1/18, 10:51 AM, "Lei Xu" wrote:
>
>     Hi, All
>
>     I just cut branch-3.0.1 from branch-3.0. Please make sure all patches
>     targeted to 3.0.1 being checked in both branch-3.0 and branch-3.0.1.
>
>     Thanks!
>     Eddy
>
>     On Tue, Jan 9, 2018 at 11:17 AM, Lei Xu wrote:
>     > Hi, All
>     >
>     > We have released Apache Hadoop 3.0.0 in December [1]. To further
>     > improve the quality of release, we plan to cut branch-3.0.1 branch
>     > tomorrow for the preparation of Apache Hadoop 3.0.1 release. The focus
>     > of 3.0.1 will be fixing blockers (3), critical bugs (1) and bug fixes
>     > [2]. No new features and improvement should be included.
>     >
>     > We plan to cut branch-3.0.1 tomorrow (Jan 10th) and vote for RC on Feb
>     > 1st, targeting for Feb 9th release.
>     >
>     > Please feel free to share your insights.
>     >
>     > [1] https://www.mail-archive.com/general@hadoop.apache.org/msg07757.html
>     > [2] https://issues.apache.org/jira/issues/?filter=12342842
>     >
>     > Best,
>     > --
>     > Lei (Eddy) Xu
>     > Software Engineer, Cloudera
>
>     --
>     Lei (Eddy) Xu
>     Software Engineer, Cloudera

--
Lei (Eddy) Xu
Software Engineer, Cloudera
[jira] [Created] (HADOOP-15206) BZip2 drops and duplicates records when input split size is small
Aki Tanaka created HADOOP-15206:
-----------------------------------

             Summary: BZip2 drops and duplicates records when input split size is small
                 Key: HADOOP-15206
                 URL: https://issues.apache.org/jira/browse/HADOOP-15206
             Project: Hadoop Common
          Issue Type: Bug
    Affects Versions: 3.0.0, 2.8.3
            Reporter: Aki Tanaka

BZip2 can drop and duplicate records when the input split size is small. I confirmed that this issue happens when the input split size is between 1 byte and 4 bytes. I am seeing the following two problem behaviors.

1. Dropped record: BZip2 skips the first record in the input file when the input split size is small.

Set the split size to 3 and tested loading 100 records (0, 1, 2, ..., 99):

{code:java}
2018-02-01 10:52:33,502 INFO [Thread-17] mapred.TestTextInputFormat (TestTextInputFormat.java:verifyPartitions(317)) - splits[1]=file:/work/count-mismatch2/hadoop/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/target/test-dir/TestTextInputFormat/test.bz2:3+3 count=99
{code}

The input format read only 99 records, not 100.

2. Duplicated record: two input splits contain the same BZip2 record when the input split size is small.

Set the split size to 1 and tested loading 100 records (0, 1, 2, ..., 99):

{code:java}
2018-02-01 11:18:49,309 INFO [Thread-17] mapred.TestTextInputFormat (TestTextInputFormat.java:verifyPartitions(318)) - splits[3]=file:/work/count-mismatch2/hadoop/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/target/test-dir/TestTextInputFormat/test.bz2:3+1 count=99
2018-02-01 11:18:49,310 WARN [Thread-17] mapred.TestTextInputFormat (TestTextInputFormat.java:verifyPartitions(308)) - conflict with 1 in split 4 at position 8
{code}

I experienced this error when I executed a Spark (SparkSQL) job under the following conditions:
* The input files are small (around 1KB)
* The Hadoop cluster has many slave nodes (able to launch many executor tasks)

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
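For background, the contract the splits must honor can be illustrated with a minimal, self-contained simulation (this sketches the invariant only; it is not Hadoop's actual LineRecordReader or BZip2 codec): a reader whose split does not start at byte 0 discards everything up to and including the first newline, and every reader reads one record past its end offset to compensate, so each record is emitted by exactly one split. The bug reported above means BZip2's block-boundary handling breaks this invariant for very small splits.

```java
import java.util.*;

// Simplified model of how text splits assign records: a record belongs to
// the split in which it starts. Not Hadoop code; an illustration only.
public class SplitSim {
    /** Records a reader for split [start, end) would emit from newline-terminated data. */
    public static List<String> readSplit(String data, int start, int end) {
        int pos = start;
        if (start != 0) {
            // discard the (possibly partial) line the split starts inside;
            // the previous split's reader already emitted it
            int skip = data.indexOf('\n', start);
            if (skip < 0) return Collections.emptyList();
            pos = skip + 1;
        }
        List<String> out = new ArrayList<>();
        // read while the record *starts* at or before the split end
        while (pos <= end && pos < data.length()) {
            int nl = data.indexOf('\n', pos);
            if (nl < 0) nl = data.length();
            out.add(data.substring(pos, nl));
            pos = nl + 1;
        }
        return out;
    }

    public static void main(String[] args) {
        String data = "aa\nbb\ncc\ndd\n";
        // even with a tiny 3-byte split size, each record appears exactly once
        for (int s = 0; s < data.length(); s += 3) {
            System.out.println("[" + s + "," + (s + 3) + ") -> " + readSplit(data, s, s + 3));
        }
    }
}
```

With this invariant intact, shrinking the split size changes only how work is partitioned, never the record count; the report above shows BZip2 violating both halves of it (a dropped first record and a record claimed by two splits).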
Re: Apache Hadoop 3.0.1 Release plan
Hi Eddy,

Thanks for driving this release. Just a quick question, do we have time to close this issue?
https://issues.apache.org/jira/browse/HDFS-12990

or are we abandoning it? I believe that this is the last window for us to fix this issue.

Should we have a call and get this resolved one way or another?

Thanks
Anu

On 2/1/18, 10:51 AM, "Lei Xu" wrote:

    Hi, All

    I just cut branch-3.0.1 from branch-3.0. Please make sure all patches
    targeted to 3.0.1 being checked in both branch-3.0 and branch-3.0.1.

    Thanks!
    Eddy

    On Tue, Jan 9, 2018 at 11:17 AM, Lei Xu wrote:
    > Hi, All
    >
    > We have released Apache Hadoop 3.0.0 in December [1]. To further
    > improve the quality of release, we plan to cut branch-3.0.1 branch
    > tomorrow for the preparation of Apache Hadoop 3.0.1 release. The focus
    > of 3.0.1 will be fixing blockers (3), critical bugs (1) and bug fixes
    > [2]. No new features and improvement should be included.
    >
    > We plan to cut branch-3.0.1 tomorrow (Jan 10th) and vote for RC on Feb
    > 1st, targeting for Feb 9th release.
    >
    > Please feel free to share your insights.
    >
    > [1] https://www.mail-archive.com/general@hadoop.apache.org/msg07757.html
    > [2] https://issues.apache.org/jira/issues/?filter=12342842
    >
    > Best,
    > --
    > Lei (Eddy) Xu
    > Software Engineer, Cloudera

    --
    Lei (Eddy) Xu
    Software Engineer, Cloudera
Re: Apache Hadoop 3.0.1 Release plan
Hi, All

I just cut branch-3.0.1 from branch-3.0. Please make sure all patches targeted to 3.0.1 being checked in both branch-3.0 and branch-3.0.1.

Thanks!
Eddy

On Tue, Jan 9, 2018 at 11:17 AM, Lei Xu wrote:
> Hi, All
>
> We have released Apache Hadoop 3.0.0 in December [1]. To further
> improve the quality of release, we plan to cut branch-3.0.1 branch
> tomorrow for the preparation of Apache Hadoop 3.0.1 release. The focus
> of 3.0.1 will be fixing blockers (3), critical bugs (1) and bug fixes
> [2]. No new features and improvement should be included.
>
> We plan to cut branch-3.0.1 tomorrow (Jan 10th) and vote for RC on Feb
> 1st, targeting for Feb 9th release.
>
> Please feel free to share your insights.
>
> [1] https://www.mail-archive.com/general@hadoop.apache.org/msg07757.html
> [2] https://issues.apache.org/jira/issues/?filter=12342842
>
> Best,
> --
> Lei (Eddy) Xu
> Software Engineer, Cloudera

--
Lei (Eddy) Xu
Software Engineer, Cloudera
Apache Hadoop qbt Report: branch2+JDK7 on Linux/x86
For more details, see https://builds.apache.org/job/hadoop-qbt-branch2-java7-linux-x86/123/

[Jan 31, 2018 5:14:31 PM] (inigoiri) HDFS-13044. RBF: Add a safe mode for the Router. Contributed by Inigo

-1 overall

The following subsystems voted -1:
    docker

Powered by Apache Yetus 0.8.0-SNAPSHOT
http://yetus.apache.org
Apache Hadoop qbt Report: trunk+JDK8 on Linux/x86
For more details, see https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/675/

[Jan 31, 2018 5:37:54 PM] (jlowe) YARN-7677. Docker image cannot set HADOOP_CONF_DIR. Contributed by Jim
[Jan 31, 2018 6:47:02 PM] (xyao) HDFS-13061. SaslDataTransferClient#checkTrustAndSend should not trust a
[Jan 31, 2018 7:05:17 PM] (hanishakoneru) HDFS-13092. Reduce verbosity for ThrottledAsyncChecker#schedule.
[Jan 31, 2018 9:45:30 PM] (epayne) MAPREDUCE-7033: Map outputs implicitly rely on permissive umask for
[Feb 1, 2018 1:51:40 AM] (eyang) YARN-7816. Allow same application name submitted by multiple users.
[Feb 1, 2018 6:39:51 AM] (xyao) HDFS-13060. Adding a BlacklistBasedTrustedChannelResolver for
[Feb 1, 2018 6:50:25 AM] (xiao) HDFS-12897. getErasureCodingPolicy should handle .snapshot dir better.
[jira] [Created] (HADOOP-15205) maven release: missing source attachments for hadoop-mapreduce-client-core
Zoltan Haindrich created HADOOP-15205:
-----------------------------------------

             Summary: maven release: missing source attachments for hadoop-mapreduce-client-core
                 Key: HADOOP-15205
                 URL: https://issues.apache.org/jira/browse/HADOOP-15205
             Project: Hadoop Common
          Issue Type: Bug
    Affects Versions: 3.0.0, 2.7.5
            Reporter: Zoltan Haindrich

I wanted to use the source attachment; however, it looks like that artifact has not been present on Maven Central since 2.7.5. The last release that had source attachments / javadocs was 2.7.4:

http://central.maven.org/maven2/org/apache/hadoop/hadoop-mapreduce-client-core/2.7.4/
http://central.maven.org/maven2/org/apache/hadoop/hadoop-mapreduce-client-core/2.7.5/

This does not seem to be limited to mapreduce; the same change is present for yarn-common as well:

http://central.maven.org/maven2/org/apache/hadoop/hadoop-yarn-common/2.7.4/
http://central.maven.org/maven2/org/apache/hadoop/hadoop-yarn-common/2.7.5/

and also hadoop-common:

http://central.maven.org/maven2/org/apache/hadoop/hadoop-common/2.7.4/
http://central.maven.org/maven2/org/apache/hadoop/hadoop-common/2.7.5/
http://central.maven.org/maven2/org/apache/hadoop/hadoop-common/3.0.0/
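If the missing attachments trace back to the release build no longer running the source/javadoc attach goals, the conventional Maven-side fix is to bind maven-source-plugin and maven-javadoc-plugin executions into the profile used for releases. A hedged sketch only; the profile name and bindings below are illustrative assumptions, not taken from Hadoop's actual poms:

```xml
<profile>
  <id>release</id>
  <build>
    <plugins>
      <!-- attach *-sources.jar alongside the main artifact -->
      <plugin>
        <groupId>org.apache.maven.plugins</groupId>
        <artifactId>maven-source-plugin</artifactId>
        <executions>
          <execution>
            <id>attach-sources</id>
            <goals>
              <goal>jar-no-fork</goal>
            </goals>
          </execution>
        </executions>
      </plugin>
      <!-- attach *-javadoc.jar -->
      <plugin>
        <groupId>org.apache.maven.plugins</groupId>
        <artifactId>maven-javadoc-plugin</artifactId>
        <executions>
          <execution>
            <id>attach-javadocs</id>
            <goals>
              <goal>jar</goal>
            </goals>
          </execution>
        </executions>
      </plugin>
    </plugins>
  </build>
</profile>
```

With such a profile active during deployment, the `-sources.jar` and `-javadoc.jar` classifiers would be published next to the main jar on Maven Central, which is what the 2.7.4 directory listings above show and the 2.7.5/3.0.0 listings lack.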