[jira] [Resolved] (HADOOP-15207) Hadoop performance Issues

2018-02-01 Thread Aaron Fabbri (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-15207?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aaron Fabbri resolved HADOOP-15207.
---
Resolution: Invalid

Can someone please ban this user? Similar SEO spam has been posted on other 
sites that I can see.  We should remove the link and/or delete the Jira too.

> Hadoop performance Issues
> -
>
> Key: HADOOP-15207
> URL: https://issues.apache.org/jira/browse/HADOOP-15207
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: common
>Affects Versions: HADOOP-13345
>Reporter: nicole wells
>Priority: Minor
>
> I am doing a Hadoop project where I am working with 100 MB, 500 MB, and 1 GB 
> files. A multi-node Hadoop cluster with 4 nodes is implemented for the purpose. 
> The time taken to run the MapReduce program on the multi-node cluster is much 
> larger than the time taken on the single-node cluster setup. It is also 
> surprising to observe that a basic Java program (without Hadoop) finishes the 
> operation faster than both the single-node and multi-node clusters. Here is the 
> code for the mapper class:
>  
> {code:java}
> public class myMapperClass extends MapReduceBase implements 
> Mapper<LongWritable, Text, Text, IntWritable>
> {
>  private final static IntWritable one = new IntWritable(1);
>  private final static IntWritable two = new IntWritable(2);
>  private final static IntWritable three = new IntWritable(3);
>  private final static IntWritable four = new IntWritable(4);
>  private final static IntWritable five = new IntWritable(5);
>  private final static IntWritable six = new IntWritable(6);
>  private final static IntWritable seven = new IntWritable(7);
>  private final static IntWritable eight = new IntWritable(8);
>  private final static IntWritable nine= new IntWritable(9);
>   private Text srcIP,srcIPN;
>   private Text dstIP,dstIPN;
>   private Text srcPort,srcPortN;
>   private Text dstPort,dstPortN;
>   private Text counter1,counter2,counter3,counter4,counter5 ;
>   //private Text total_records;
>   int ddos_line = 0;
>   //map method that performs the tokenizer job and framing the initial key 
> value pairs
>   @Override
> public void map(LongWritable key, Text value, OutputCollector<Text, IntWritable> output, Reporter reporter) throws IOException
>   {
> String line1 = value.toString();
> ddos_line++;
> int pos1=0;
> int lineno=0;
> int[] count = {10, 10, 10, 10, 10};
> int[] lineIndex = {0, 0, 0, 0, 0};
> for(int i=0;i<9;i++)
> {
> pos1 = line1.indexOf("|",pos1+1);
> }
> srcIP =  new Text( line1.substring(0,line1.indexOf("|")) );
> String srcIPP = srcIP.toString();
> dstIP = new Text(line1.substring( 
> srcIPP.length()+1,line1.indexOf("|",srcIPP.length()+1)) ) ;
> srcPort = new Text( line1.substring(pos1+1,line1.indexOf("|",pos1+1)) 
> );
> pos1 = line1.indexOf("|",pos1+1);
> dstPort = new Text( line1.substring(pos1+1,line1.indexOf("|",pos1+1)) 
> );
> //BufferedReader br = new BufferedReader(new 
> FileReader("/home/yogi/Desktop/normal_small"));
> FileSystem fs = FileSystem.get(new Configuration());
> FileStatus[] status = fs.listStatus(new 
> Path("hdfs://master:54310/usr/local/hadoop/input/normal_small"));
> BufferedReader br = new BufferedReader(new 
> InputStreamReader(fs.open(status[0].getPath())));
> String line=br.readLine();
> lineno++;
> boolean bool = true;
> while (bool) {
> for(int i=0; i<5;i++)
> {
> if(bool==false)
> break;
> int pos=0;
> int temp;
> for(int j=0;j<9;j++)
> {
> pos = line.indexOf("|",pos+1);
> }
> srcIPN =  new Text( line.substring(0,line.indexOf("|")) );
> String srcIPP2 = srcIPN.toString();
> dstIPN = new Text(line.substring( 
> srcIPP2.length()+1,line.indexOf("|",srcIPP2.length()+1)) ) ;
> srcPortN = new Text( 
> line.substring(pos+1,line.indexOf("|",pos+1)) );
> pos = line.indexOf("|",pos+1);
> dstPortN = new Text( 
> line.substring(pos+1,line.indexOf("|",pos+1)) );
> if(srcIP.equals(srcIPN) && dstIP.equals(dstIPN))
> {
> int tmp, tmp2;
> tmp = Integer.parseInt(srcPort.toString()) - 
> Integer.parseInt(srcPortN.toString());
> if(tmp<0)
> tmp*=-1;
> tmp2 = Integer.parseInt(dstPort.toString()) - 
> Integer.parseInt(dstPortN.toString());

[jira] [Created] (HADOOP-15207) Hadoop performance Issues

2018-02-01 Thread nicole wells (JIRA)
nicole wells created HADOOP-15207:
-

 Summary: Hadoop performance Issues
 Key: HADOOP-15207
 URL: https://issues.apache.org/jira/browse/HADOOP-15207
 Project: Hadoop Common
  Issue Type: Bug
  Components: common
Affects Versions: HADOOP-13345
Reporter: nicole wells


I am doing a Hadoop project where I am working with 100 MB, 500 MB, and 1 GB 
files. A multi-node Hadoop cluster with 4 nodes is implemented for the purpose. 
The time taken to run the MapReduce program on the multi-node cluster is much 
larger than the time taken on the single-node cluster setup. It is also 
surprising to observe that a basic Java program (without Hadoop) finishes the 
operation faster than both the single-node and multi-node clusters. Here is the 
code for the mapper class:

 
{code:java}
public class myMapperClass extends MapReduceBase implements 
Mapper<LongWritable, Text, Text, IntWritable>
{

 private final static IntWritable one = new IntWritable(1);
 private final static IntWritable two = new IntWritable(2);
 private final static IntWritable three = new IntWritable(3);
 private final static IntWritable four = new IntWritable(4);
 private final static IntWritable five = new IntWritable(5);
 private final static IntWritable six = new IntWritable(6);
 private final static IntWritable seven = new IntWritable(7);
 private final static IntWritable eight = new IntWritable(8);
 private final static IntWritable nine= new IntWritable(9);

  private Text srcIP,srcIPN;
  private Text dstIP,dstIPN;
  private Text srcPort,srcPortN;
  private Text dstPort,dstPortN;
  private Text counter1,counter2,counter3,counter4,counter5 ;
  //private Text total_records;

  int ddos_line = 0;
  //map method that performs the tokenizer job and framing the initial key 
value pairs
  @Override
public void map(LongWritable key, Text value, OutputCollector<Text, IntWritable> output, Reporter reporter) throws IOException
  {
String line1 = value.toString();
ddos_line++;
int pos1=0;
int lineno=0;

int[] count = {10, 10, 10, 10, 10};
int[] lineIndex = {0, 0, 0, 0, 0};

for(int i=0;i<9;i++)
{
pos1 = line1.indexOf("|",pos1+1);
}

srcIP =  new Text( line1.substring(0,line1.indexOf("|")) );
String srcIPP = srcIP.toString();
dstIP = new Text(line1.substring( 
srcIPP.length()+1,line1.indexOf("|",srcIPP.length()+1)) ) ;

srcPort = new Text( line1.substring(pos1+1,line1.indexOf("|",pos1+1)) );
pos1 = line1.indexOf("|",pos1+1);
dstPort = new Text( line1.substring(pos1+1,line1.indexOf("|",pos1+1)) );

//BufferedReader br = new BufferedReader(new 
FileReader("/home/yogi/Desktop/normal_small"));

FileSystem fs = FileSystem.get(new Configuration());
FileStatus[] status = fs.listStatus(new 
Path("hdfs://master:54310/usr/local/hadoop/input/normal_small"));
BufferedReader br = new BufferedReader(new 
InputStreamReader(fs.open(status[0].getPath())));

String line=br.readLine();

lineno++;
boolean bool = true;
while (bool) {

for(int i=0; i<5;i++)
{
if(bool==false)
break;
int pos=0;
int temp;
for(int j=0;j<9;j++)
{
pos = line.indexOf("|",pos+1);
}


srcIPN =  new Text( line.substring(0,line.indexOf("|")) );
String srcIPP2 = srcIPN.toString();
dstIPN = new Text(line.substring( 
srcIPP2.length()+1,line.indexOf("|",srcIPP2.length()+1)) ) ;

srcPortN = new Text( 
line.substring(pos+1,line.indexOf("|",pos+1)) );
pos = line.indexOf("|",pos+1);
dstPortN = new Text( 
line.substring(pos+1,line.indexOf("|",pos+1)) );


if(srcIP.equals(srcIPN) && dstIP.equals(dstIPN))
{
int tmp, tmp2;

tmp = Integer.parseInt(srcPort.toString()) - 
Integer.parseInt(srcPortN.toString());
if(tmp<0)
tmp*=-1;

tmp2 = Integer.parseInt(dstPort.toString()) - 
Integer.parseInt(dstPortN.toString());
if(tmp2<0)  
tmp2*=-1;

temp=tmp+tmp2;


if(count[4] > temp)
{
count[4] = temp;
lineIndex[4]=lineno;
} 


for(int k=0;k<5;k++)
{
for(int j=0;j<4;j++)
{   
if(count[j] > count[j+1])
// ... (the rest of the mapper is truncated in the archived message)
{code}

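One likely contributor to the slowdown is visible in the mapper itself: map() re-opens 
and re-reads the HDFS side file normal_small for every input record, so the job spends 
most of its time on repeated HDFS reads rather than on the comparison logic. The sketch 
below is illustrative only and is not the reporter's code: the class name 
CachedSideFileMapper, the field sideFileLines, and the output logic are made up here, 
while the HDFS path and the "|"-delimited record layout are taken from the snippet 
above. It shows how, with the old mapred API, the side file can be loaded once per task 
in configure() and reused from memory in map().

{code:java}
// Illustrative sketch (not the reporter's code): load the side file once per
// map task in configure() instead of re-reading it from HDFS in every map() call.
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.util.ArrayList;
import java.util.List;

import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.mapred.MapReduceBase;
import org.apache.hadoop.mapred.Mapper;
import org.apache.hadoop.mapred.OutputCollector;
import org.apache.hadoop.mapred.Reporter;

public class CachedSideFileMapper extends MapReduceBase
        implements Mapper<LongWritable, Text, Text, IntWritable> {

    // Contents of normal_small, read once per task and kept in memory.
    private final List<String> sideFileLines = new ArrayList<String>();

    @Override
    public void configure(JobConf job) {
        try {
            FileSystem fs = FileSystem.get(job);
            Path sideFile = new Path(
                    "hdfs://master:54310/usr/local/hadoop/input/normal_small");
            BufferedReader br = new BufferedReader(
                    new InputStreamReader(fs.open(sideFile)));
            String line;
            while ((line = br.readLine()) != null) {
                sideFileLines.add(line);
            }
            br.close();
        } catch (IOException e) {
            throw new RuntimeException("Could not cache side file", e);
        }
    }

    @Override
    public void map(LongWritable key, Text value,
                    OutputCollector<Text, IntWritable> output, Reporter reporter)
            throws IOException {
        // Records are "|"-delimited and the first field is the source IP,
        // matching the layout assumed by the original mapper.
        String record = value.toString();
        String srcIp = record.substring(0, record.indexOf("|"));

        // Compare against the cached lines instead of re-opening the HDFS file.
        int similar = 0;
        for (String cached : sideFileLines) {
            if (cached.startsWith(srcIp + "|")) {
                similar++;
            }
        }
        output.collect(new Text(srcIp), new IntWritable(similar));
    }
}
{code}

With the side file cached in memory, each map() call becomes a pure in-memory scan, 
which should make the single-node vs. multi-node comparison far less dominated by 
redundant I/O.
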
Apache Hadoop qbt Report: branch2+JDK7 on Linux/x86

2018-02-01 Thread Apache Jenkins Server
For more details, see 
https://builds.apache.org/job/hadoop-qbt-branch2-java7-linux-x86/124/

[Feb 1, 2018 6:38:39 PM] (inigoiri) HDFS-13043. RBF: Expose the state of the 
Routers in the federation.




-1 overall


The following subsystems voted -1:
docker


Powered by Apache Yetus 0.8.0-SNAPSHOT   http://yetus.apache.org

-
To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-dev-h...@hadoop.apache.org

Apache Hadoop qbt Report: trunk+JDK9 on Linux/x86

2018-02-01 Thread Apache Jenkins Server
For more details, see 
https://builds.apache.org/job/hadoop-qbt-trunk-java9-linux-x86/9/

No changes




-1 overall


The following subsystems voted -1:
compile findbugs mvninstall mvnsite shadedclient unit xml


The following subsystems voted -1 but
were configured to be filtered/ignored:
cc checkstyle javac javadoc pylint shellcheck shelldocs whitespace


   mvninstall:

   
https://builds.apache.org/job/hadoop-qbt-trunk-java9-linux-x86/9/artifact/out/patch-mvninstall-root.txt
  [1.7M]

   compile:

   
https://builds.apache.org/job/hadoop-qbt-trunk-java9-linux-x86/9/artifact/out/patch-compile-root.txt
  [48K]

   cc:

   
https://builds.apache.org/job/hadoop-qbt-trunk-java9-linux-x86/9/artifact/out/patch-compile-root.txt
  [48K]

   javac:

   
https://builds.apache.org/job/hadoop-qbt-trunk-java9-linux-x86/9/artifact/out/patch-compile-root.txt
  [48K]

   checkstyle:

   
https://builds.apache.org/job/hadoop-qbt-trunk-java9-linux-x86/9/artifact/out/diff-checkstyle-root.txt
  [3.1M]

   mvnsite:

   
https://builds.apache.org/job/hadoop-qbt-trunk-java9-linux-x86/9/artifact/out/patch-mvnsite-root.txt
  [48K]

   pylint:

   
https://builds.apache.org/job/hadoop-qbt-trunk-java9-linux-x86/9/artifact/out/diff-patch-pylint.txt
  [24K]

   shellcheck:

   
https://builds.apache.org/job/hadoop-qbt-trunk-java9-linux-x86/9/artifact/out/diff-patch-shellcheck.txt
  [20K]

   shelldocs:

   
https://builds.apache.org/job/hadoop-qbt-trunk-java9-linux-x86/9/artifact/out/diff-patch-shelldocs.txt
  [12K]

   whitespace:

   
https://builds.apache.org/job/hadoop-qbt-trunk-java9-linux-x86/9/artifact/out/whitespace-eol.txt
  [9.2M]
   
https://builds.apache.org/job/hadoop-qbt-trunk-java9-linux-x86/9/artifact/out/whitespace-tabs.txt
  [292K]

   xml:

   
https://builds.apache.org/job/hadoop-qbt-trunk-java9-linux-x86/9/artifact/out/xml.txt
  [8.0K]

   findbugs:

   
https://builds.apache.org/job/hadoop-qbt-trunk-java9-linux-x86/9/artifact/out/branch-findbugs-hadoop-common-project_hadoop-annotations.txt
  [260K]
   
https://builds.apache.org/job/hadoop-qbt-trunk-java9-linux-x86/9/artifact/out/branch-findbugs-hadoop-common-project_hadoop-auth.txt
  [8.0K]
   
https://builds.apache.org/job/hadoop-qbt-trunk-java9-linux-x86/9/artifact/out/branch-findbugs-hadoop-common-project_hadoop-auth-examples.txt
  [8.0K]
   
https://builds.apache.org/job/hadoop-qbt-trunk-java9-linux-x86/9/artifact/out/branch-findbugs-hadoop-common-project_hadoop-common.txt
  [12K]
   
https://builds.apache.org/job/hadoop-qbt-trunk-java9-linux-x86/9/artifact/out/branch-findbugs-hadoop-common-project_hadoop-kms.txt
  [8.0K]
   
https://builds.apache.org/job/hadoop-qbt-trunk-java9-linux-x86/9/artifact/out/branch-findbugs-hadoop-common-project_hadoop-minikdc.txt
  [8.0K]
   
https://builds.apache.org/job/hadoop-qbt-trunk-java9-linux-x86/9/artifact/out/branch-findbugs-hadoop-common-project_hadoop-nfs.txt
  [4.0K]
   
https://builds.apache.org/job/hadoop-qbt-trunk-java9-linux-x86/9/artifact/out/branch-findbugs-hadoop-hdfs-project_hadoop-hdfs.txt
  [16K]
   
https://builds.apache.org/job/hadoop-qbt-trunk-java9-linux-x86/9/artifact/out/branch-findbugs-hadoop-hdfs-project_hadoop-hdfs-client.txt
  [16K]
   
https://builds.apache.org/job/hadoop-qbt-trunk-java9-linux-x86/9/artifact/out/branch-findbugs-hadoop-hdfs-project_hadoop-hdfs-httpfs.txt
  [8.0K]
   
https://builds.apache.org/job/hadoop-qbt-trunk-java9-linux-x86/9/artifact/out/branch-findbugs-hadoop-hdfs-project_hadoop-hdfs-nfs.txt
  [8.0K]
   
https://builds.apache.org/job/hadoop-qbt-trunk-java9-linux-x86/9/artifact/out/branch-findbugs-hadoop-mapreduce-project_hadoop-mapreduce-client_hadoop-mapreduce-client-app.txt
  [36K]
   
https://builds.apache.org/job/hadoop-qbt-trunk-java9-linux-x86/9/artifact/out/branch-findbugs-hadoop-mapreduce-project_hadoop-mapreduce-client_hadoop-mapreduce-client-common.txt
  [12K]
   
https://builds.apache.org/job/hadoop-qbt-trunk-java9-linux-x86/9/artifact/out/branch-findbugs-hadoop-mapreduce-project_hadoop-mapreduce-client_hadoop-mapreduce-client-core.txt
  [12K]
   
https://builds.apache.org/job/hadoop-qbt-trunk-java9-linux-x86/9/artifact/out/branch-findbugs-hadoop-mapreduce-project_hadoop-mapreduce-client_hadoop-mapreduce-client-hs.txt
  [12K]
   
https://builds.apache.org/job/hadoop-qbt-trunk-java9-linux-x86/9/artifact/out/branch-findbugs-hadoop-mapreduce-project_hadoop-mapreduce-client_hadoop-mapreduce-client-hs-plugins.txt
  [12K]
   
https://builds.apache.org/job/hadoop-qbt-trunk-java9-linux-x86/9/artifact/out/branch-findbugs-hadoop-mapreduce-project_hadoop-mapreduce-client_hadoop-mapreduce-client-jobclient.txt
  [20K]
   
https://builds.apache.org/job/hadoop-qbt-trunk-java9-linux-x86/9/artifact/out/branch-findbugs-hadoop-mapreduce-project_hadoop-mapreduce-client_hadoop-mapreduce-client-nativetask.txt
  [12K]
   

Apache Hadoop qbt Report: trunk+JDK8 on Linux/x86

2018-02-01 Thread Apache Jenkins Server
For more details, see 
https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/676/

[Feb 1, 2018 6:14:09 PM] (jlowe) Revert "YARN-7677. Docker image cannot set 
HADOOP_CONF_DIR. Contributed
[Feb 1, 2018 6:37:14 PM] (inigoiri) HDFS-13043. RBF: Expose the state of the 
Routers in the federation.
[Feb 1, 2018 6:45:34 PM] (xyao) HDFS-12997. Move logging to slf4j in 
BlockPoolSliceStorage and Storage.
[Feb 1, 2018 8:28:17 PM] (hanishakoneru) HDFS-13062. Provide support for JN to 
use separate journal disk per
[Feb 1, 2018 11:33:52 PM] (xiao) HADOOP-15197. Remove tomcat from the 
Hadoop-auth test bundle.

-
To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-dev-h...@hadoop.apache.org

Apache Hadoop qbt Report: trunk+JDK9 on Linux/x86

2018-02-01 Thread Apache Jenkins Server
For more details, see 
https://builds.apache.org/job/hadoop-qbt-trunk-java9-linux-x86/8/

[Oct 29, 2017 11:44:16 PM] (yufei) YARN-6747. 
TestFSAppStarvation.testPreemptionEnable fails
[Oct 30, 2017 1:54:33 AM] (templedf) YARN-7374. Improve performance of DRF 
comparisons for resource types in
[Oct 30, 2017 4:49:22 PM] (cdouglas) HADOOP-14992. Upgrade Avro patch version. 
Contributed by Bharat
[Oct 30, 2017 6:04:22 PM] (templedf) YARN-6927. Add support for individual 
resource types requests in
[Oct 30, 2017 7:41:28 PM] (templedf) YARN-7336. Unsafe cast from long to int 
Resource.hashCode() method
[Oct 30, 2017 10:16:51 PM] (junping_du) HADOOP-14990. Clean up jdiff xml files 
added for 2.8.2 release.
[Oct 31, 2017 4:49:15 AM] (aajisaka) HADOOP-14980. [JDK9] Upgrade 
maven-javadoc-plugin to 3.0.0-M1.
[Oct 31, 2017 4:50:28 AM] (aajisaka) Revert "HADOOP-14980. [JDK9] Upgrade 
maven-javadoc-plugin to 3.0.0-M1.
[Oct 31, 2017 4:51:26 AM] (aajisaka) HADOOP-14980. [JDK9] Upgrade 
maven-javadoc-plugin to 3.0.0-M1.
[Oct 31, 2017 7:36:02 AM] (aajisaka) YARN-7407. Moving logging APIs over to 
slf4j in
[Oct 31, 2017 8:09:45 AM] (aajisaka) YARN-7379. Moving logging APIs over to 
slf4j in hadoop-yarn-client.
[Oct 31, 2017 2:30:13 PM] (jlowe) HADOOP-14919. BZip2 drops records when 
reading data in splits.
[Oct 31, 2017 2:34:01 PM] (mackrorysd) HDFS-206. Support for head in FSShell. 
Contributed by Gabor Bota.
[Oct 31, 2017 4:44:01 PM] (cdouglas) HDFS-7878. API - expose a unique file 
identifier.
[Oct 31, 2017 5:21:42 PM] (inigoiri) HDFS-12699. TestMountTable fails with Java 
7. Contributed by Inigo
[Oct 31, 2017 5:23:00 PM] (arp) HDFS-12499. dfs.namenode.shared.edits.dir 
property is currently namenode
[Oct 31, 2017 5:46:10 PM] (wang) Revert "HDFS-12499. 
dfs.namenode.shared.edits.dir property is currently
[Oct 31, 2017 7:05:43 PM] (subru) YARN-6413. FileSystem based Yarn Registry 
implementation. (Ellen Hui via
[Nov 1, 2017 4:58:14 AM] (lei) HDFS-12482. Provide a configuration to adjust 
the weight of EC recovery
[Nov 1, 2017 5:44:16 AM] (jzhuge) HDFS-12714. Hadoop 3 missing fix for 
HDFS-5169. Contributed by Joe
[Nov 1, 2017 6:37:08 AM] (yqlin) HDFS-12219. Javadoc for 
FSNamesystem#getMaxObjects is incorrect.
[Nov 1, 2017 8:41:45 AM] (wwei) HDFS-12744. More logs when short-circuit read 
is failed and disabled.
[Nov 1, 2017 8:26:37 PM] (inigoiri) YARN-7276 addendum to add timeline service 
depencies. Contributed by
[Nov 1, 2017 9:48:16 PM] (junping_du) YARN-7400. Incorrect log preview 
displayed in jobhistory server ui.
[Nov 1, 2017 10:39:56 PM] (eyang) YARN-7412. Fix unit test for docker mount 
check on ubuntu.  (Contributed
[Nov 2, 2017 12:00:32 AM] (jianhe) YARN-7396. NPE when accessing container logs 
due to null dirsHandler.
[Nov 2, 2017 8:25:19 AM] (rohithsharmaks) addendum patch for YARN-7289.
[Nov 2, 2017 8:43:08 AM] (aajisaka) MAPREDUCE-6983. Moving logging APIs over to 
slf4j in
[Nov 2, 2017 9:12:04 AM] (sammi.chen) HADOOP-14997. Add hadoop-aliyun as 
dependency of hadoop-cloud-storage.
[Nov 2, 2017 9:32:24 AM] (aajisaka) MAPREDUCE-6999. Fix typo onf in 
DynamicInputChunk.java. Contributed by
[Nov 2, 2017 2:37:17 PM] (jlowe) YARN-7286. Add support for docker to have no 
capabilities. Contributed
[Nov 2, 2017 4:51:28 PM] (wangda) YARN-7364. Queue dash board in new YARN UI 
has incorrect values. (Sunil
[Nov 2, 2017 5:37:33 PM] (epayne) YARN-7370: Preemption properties should be 
refreshable. Contrubted by
[Nov 3, 2017 12:15:33 AM] (arun suresh) HADOOP-15013. Fix ResourceEstimator 
findbugs issues. (asuresh)
[Nov 3, 2017 12:39:23 AM] (subru) YARN-7432. Fix DominantResourceFairnessPolicy 
serializable findbugs
[Nov 3, 2017 1:55:29 AM] (sunilg) YARN-7410. Cleanup FixedValueResource to 
avoid dependency to
[Nov 3, 2017 4:27:35 AM] (xiao) HDFS-12682. ECAdmin -listPolicies will always 
show
[Nov 3, 2017 4:29:53 AM] (inigoiri) YARN-7434. Router getApps REST invocation 
fails with multiple RMs.
[Nov 3, 2017 4:53:13 AM] (xiao) HDFS-12725. 
BlockPlacementPolicyRackFaultTolerant fails with very uneven
[Nov 3, 2017 6:15:50 AM] (sunilg) YARN-7392. Render cluster information on new 
YARN web ui. Contributed by
[Nov 3, 2017 7:05:45 PM] (xiao) HDFS-11467. Support ErasureCoding section in 
OIV XML/ReverseXML.
[Nov 3, 2017 8:16:46 PM] (kihwal) HDFS-12771. Add genstamp and block size to 
metasave Corrupt blocks list.
[Nov 3, 2017 9:30:57 PM] (cdouglas) HDFS-12681. Fold HdfsLocatedFileStatus into 
HdfsFileStatus.
[Nov 3, 2017 11:10:37 PM] (xyao) HADOOP-14987. Improve KMSClientProvider log 
around delegation token
[Nov 4, 2017 3:34:40 AM] (xyao) HDFS-10528. Add logging to successful standby 
checkpointing. Contributed
[Nov 4, 2017 4:01:56 AM] (liuml07) HADOOP-15015. TestConfigurationFieldsBase to 
use SLF4J for logging.
[Nov 6, 2017 7:28:38 AM] (naganarasimha_gr) MAPREDUCE-6975. Logging task 
counters. Contributed by Prabhu Joseph.
[Nov 6, 2017 5:09:10 PM] (bibinchundatt) Add containerId to Localizer failed 
logs. Contributed by 

Re: Apache Hadoop 3.0.1 Release plan

2018-02-01 Thread Lei Xu
Sounds good to me, ATM.

On Thu, Feb 1, 2018 at 2:34 PM, Aaron T. Myers  wrote:
> Hey Anu,
>
> My feeling on HDFS-12990 is that we've discussed it quite a bit already and
> it doesn't seem at this point like either side is going to budge. I'm
> certainly happy to have a phone call about it, but I don't expect that we'd
> make much progress.
>
> My suggestion is that we simply include the patch posted to HDFS-12990 in
> the 3.0.1 RC and call this issue out clearly in the subsequent VOTE thread
> for the 3.0.1 release. Eddy, are you up for that?
>
> Best,
> Aaron
>
> On Thu, Feb 1, 2018 at 1:13 PM, Lei Xu  wrote:
>>
>> +Xiao
>>
>> My understanding is that we will have this for 3.0.1.   Xiao, could
>> you give your inputs here?
>>
>> On Thu, Feb 1, 2018 at 11:55 AM, Anu Engineer 
>> wrote:
>> > Hi Eddy,
>> >
>> > Thanks for driving this release. Just a quick question, do we have time
>> > to close this issue?
>> > https://issues.apache.org/jira/browse/HDFS-12990
>> >
>> > or are we abandoning it? I believe that this is the last window for us
>> > to fix this issue.
>> >
>> > Should we have a call and get this resolved one way or another?
>> >
>> > Thanks
>> > Anu
>> >
>> > On 2/1/18, 10:51 AM, "Lei Xu"  wrote:
>> >
>> > Hi, All
>> >
>> > I just cut branch-3.0.1 from branch-3.0.  Please make sure all
>> > patches
>> > targeted to 3.0.1 being checked in both branch-3.0 and branch-3.0.1.
>> >
>> > Thanks!
>> > Eddy
>> >
>> > On Tue, Jan 9, 2018 at 11:17 AM, Lei Xu  wrote:
>> > > Hi, All
>> > >
>> > > We have released Apache Hadoop 3.0.0 in December [1]. To further
>> > > improve the quality of release, we plan to cut branch-3.0.1 branch
>> > > tomorrow for the preparation of Apache Hadoop 3.0.1 release. The
>> > focus
>> > > of 3.0.1 will be fixing blockers (3), critical bugs (1) and bug
>> > fixes
>> > > [2].  No new features and improvement should be included.
>> > >
>> > > We plan to cut branch-3.0.1 tomorrow (Jan 10th) and vote for RC on
>> > Feb
>> > > 1st, targeting for Feb 9th release.
>> > >
>> > > Please feel free to share your insights.
>> > >
>> > > [1]
>> > https://www.mail-archive.com/general@hadoop.apache.org/msg07757.html
>> > > [2] https://issues.apache.org/jira/issues/?filter=12342842
>> > >
>> > > Best,
>> > > --
>> > > Lei (Eddy) Xu
>> > > Software Engineer, Cloudera
>> >
>> >
>> >
>> > --
>> > Lei (Eddy) Xu
>> > Software Engineer, Cloudera
>> >
>> >
>> > -
>> > To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
>> > For additional commands, e-mail: common-dev-h...@hadoop.apache.org
>> >
>> >
>> >
>>
>>
>>
>> --
>> Lei (Eddy) Xu
>> Software Engineer, Cloudera
>>
>> -
>> To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org
>> For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org
>>
>



-- 
Lei (Eddy) Xu
Software Engineer, Cloudera

-
To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-dev-h...@hadoop.apache.org



Re: Apache Hadoop 3.0.1 Release plan

2018-02-01 Thread Aaron T. Myers
Hey Anu,

My feeling on HDFS-12990 is that we've discussed it quite a bit already and
it doesn't seem at this point like either side is going to budge. I'm
certainly happy to have a phone call about it, but I don't expect that we'd
make much progress.

My suggestion is that we simply include the patch posted to HDFS-12990 in
the 3.0.1 RC and call this issue out clearly in the subsequent VOTE thread
for the 3.0.1 release. Eddy, are you up for that?

Best,
Aaron

On Thu, Feb 1, 2018 at 1:13 PM, Lei Xu  wrote:

> +Xiao
>
> My understanding is that we will have this for 3.0.1.   Xiao, could
> you give your inputs here?
>
> On Thu, Feb 1, 2018 at 11:55 AM, Anu Engineer 
> wrote:
> > Hi Eddy,
> >
> > Thanks for driving this release. Just a quick question, do we have time
> to close this issue?
> > https://issues.apache.org/jira/browse/HDFS-12990
> >
> > or are we abandoning it? I believe that this is the last window for us
> to fix this issue.
> >
> > Should we have a call and get this resolved one way or another?
> >
> > Thanks
> > Anu
> >
> > On 2/1/18, 10:51 AM, "Lei Xu"  wrote:
> >
> > Hi, All
> >
> > I just cut branch-3.0.1 from branch-3.0.  Please make sure all
> patches
> > targeted to 3.0.1 being checked in both branch-3.0 and branch-3.0.1.
> >
> > Thanks!
> > Eddy
> >
> > On Tue, Jan 9, 2018 at 11:17 AM, Lei Xu  wrote:
> > > Hi, All
> > >
> > > We have released Apache Hadoop 3.0.0 in December [1]. To further
> > > improve the quality of release, we plan to cut branch-3.0.1 branch
> > > tomorrow for the preparation of Apache Hadoop 3.0.1 release. The
> focus
> > > of 3.0.1 will be fixing blockers (3), critical bugs (1) and bug
> fixes
> > > [2].  No new features and improvement should be included.
> > >
> > > We plan to cut branch-3.0.1 tomorrow (Jan 10th) and vote for RC on
> Feb
> > > 1st, targeting for Feb 9th release.
> > >
> > > Please feel free to share your insights.
> > >
> > > [1] https://www.mail-archive.com/general@hadoop.apache.org/
> msg07757.html
> > > [2] https://issues.apache.org/jira/issues/?filter=12342842
> > >
> > > Best,
> > > --
> > > Lei (Eddy) Xu
> > > Software Engineer, Cloudera
> >
> >
> >
> > --
> > Lei (Eddy) Xu
> > Software Engineer, Cloudera
> >
> > 
> -
> > To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
> > For additional commands, e-mail: common-dev-h...@hadoop.apache.org
> >
> >
> >
>
>
>
> --
> Lei (Eddy) Xu
> Software Engineer, Cloudera
>
> -
> To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org
> For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org
>
>


Re: Apache Hadoop 3.0.1 Release plan

2018-02-01 Thread Lei Xu
+Xiao

My understanding is that we will have this for 3.0.1.   Xiao, could
you give your inputs here?

On Thu, Feb 1, 2018 at 11:55 AM, Anu Engineer  wrote:
> Hi Eddy,
>
> Thanks for driving this release. Just a quick question, do we have time to 
> close this issue?
> https://issues.apache.org/jira/browse/HDFS-12990
>
> or are we abandoning it? I believe that this is the last window for us to fix 
> this issue.
>
> Should we have a call and get this resolved one way or another?
>
> Thanks
> Anu
>
> On 2/1/18, 10:51 AM, "Lei Xu"  wrote:
>
> Hi, All
>
> I just cut branch-3.0.1 from branch-3.0.  Please make sure all patches
> targeted to 3.0.1 being checked in both branch-3.0 and branch-3.0.1.
>
> Thanks!
> Eddy
>
> On Tue, Jan 9, 2018 at 11:17 AM, Lei Xu  wrote:
> > Hi, All
> >
> > We have released Apache Hadoop 3.0.0 in December [1]. To further
> > improve the quality of release, we plan to cut branch-3.0.1 branch
> > tomorrow for the preparation of Apache Hadoop 3.0.1 release. The focus
> > of 3.0.1 will be fixing blockers (3), critical bugs (1) and bug fixes
> > [2].  No new features and improvement should be included.
> >
> > We plan to cut branch-3.0.1 tomorrow (Jan 10th) and vote for RC on Feb
> > 1st, targeting for Feb 9th release.
> >
> > Please feel free to share your insights.
> >
> > [1] https://www.mail-archive.com/general@hadoop.apache.org/msg07757.html
> > [2] https://issues.apache.org/jira/issues/?filter=12342842
> >
> > Best,
> > --
> > Lei (Eddy) Xu
> > Software Engineer, Cloudera
>
>
>
> --
> Lei (Eddy) Xu
> Software Engineer, Cloudera
>
> -
> To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
> For additional commands, e-mail: common-dev-h...@hadoop.apache.org
>
>
>



-- 
Lei (Eddy) Xu
Software Engineer, Cloudera

-
To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-dev-h...@hadoop.apache.org



[jira] [Created] (HADOOP-15206) BZip2 drops and duplicates records when input split size is small

2018-02-01 Thread Aki Tanaka (JIRA)
Aki Tanaka created HADOOP-15206:
---

 Summary: BZip2 drops and duplicates records when input split size 
is small
 Key: HADOOP-15206
 URL: https://issues.apache.org/jira/browse/HADOOP-15206
 Project: Hadoop Common
  Issue Type: Bug
Affects Versions: 3.0.0, 2.8.3
Reporter: Aki Tanaka


BZip2 can drop and duplicate records when the input split size is small. I 
confirmed that this issue happens when the input split size is between 1 byte 
and 4 bytes.

I am seeing the following two problem behaviors.

 

1. Dropped record:

BZip2 skips the first record in the input file when the input split size is 
small.

 

Set the split size to 3 bytes and tested loading 100 records (0, 1, 2, ..., 99):
{code:java}
2018-02-01 10:52:33,502 INFO  [Thread-17] mapred.TestTextInputFormat 
(TestTextInputFormat.java:verifyPartitions(317)) - 
splits[1]=file:/work/count-mismatch2/hadoop/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/target/test-dir/TestTextInputFormat/test.bz2:3+3
 count=99{code}
> The input format read only 99 records instead of 100.

 

2. Duplicated record:

Two input splits contain the same BZip2 records when the input split size is 
small.

 

Set the split size to 1 byte and tested loading 100 records (0, 1, 2, ..., 99):

 
{code:java}
2018-02-01 11:18:49,309 INFO [Thread-17] mapred.TestTextInputFormat 
(TestTextInputFormat.java:verifyPartitions(318)) - splits[3]=file 
/work/count-mismatch2/hadoop/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/target/test-dir/TestTextInputFormat/test.bz2:3+1
 count=99
2018-02-01 11:18:49,310 WARN [Thread-17] mapred.TestTextInputFormat 
(TestTextInputFormat.java:verifyPartitions(308)) - conflict with 1 in split 4 
at position 8
{code}
 

I experienced this error when I executed a Spark (Spark SQL) job under the 
following conditions:

* The input files are small (around 1 KB each)

* The Hadoop cluster has many slave nodes (able to launch many executor tasks)
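
For reference, a minimal standalone reproduction along the lines described above could 
look like the sketch below. It is not the TestTextInputFormat test quoted in the logs: 
it uses the new mapreduce API, assumes a bzip2 file containing 100 newline-separated 
records (the path test.bz2 is a placeholder), and forces tiny input splits via 
FileInputFormat.setMaxInputSplitSize. With the behavior reported here, the total record 
count would come out below 100 (dropped records) or above 100 (duplicated records).

{code:java}
import java.util.List;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.InputSplit;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.RecordReader;
import org.apache.hadoop.mapreduce.TaskAttemptID;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.input.TextInputFormat;
import org.apache.hadoop.mapreduce.task.TaskAttemptContextImpl;

public class BZip2SplitCountCheck {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    Job job = Job.getInstance(conf);
    // Placeholder input: a bzip2 file holding 100 newline-separated records.
    FileInputFormat.addInputPath(job, new Path("test.bz2"));
    // Force very small splits (3 bytes) to mimic the scenario described above.
    FileInputFormat.setMaxInputSplitSize(job, 3);

    TextInputFormat format = new TextInputFormat();
    List<InputSplit> splits = format.getSplits(job);

    long total = 0;
    for (InputSplit split : splits) {
      TaskAttemptContextImpl context =
          new TaskAttemptContextImpl(job.getConfiguration(), new TaskAttemptID());
      RecordReader<LongWritable, Text> reader = format.createRecordReader(split, context);
      reader.initialize(split, context);
      while (reader.nextKeyValue()) {
        total++;
      }
      reader.close();
    }
    // Expect 100 records in total; fewer indicates drops, more indicates duplicates.
    System.out.println("splits=" + splits.size() + " records=" + total);
  }
}
{code}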

 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-dev-h...@hadoop.apache.org



Re: Apache Hadoop 3.0.1 Release plan

2018-02-01 Thread Anu Engineer
Hi Eddy,

Thanks for driving this release. Just a quick question, do we have time to 
close this issue? 
https://issues.apache.org/jira/browse/HDFS-12990

or are we abandoning it? I believe that this is the last window for us to fix 
this issue.

Should we have a call and get this resolved one way or another?

Thanks
Anu

On 2/1/18, 10:51 AM, "Lei Xu"  wrote:

Hi, All

I just cut branch-3.0.1 from branch-3.0.  Please make sure all patches
targeted to 3.0.1 being checked in both branch-3.0 and branch-3.0.1.

Thanks!
Eddy

On Tue, Jan 9, 2018 at 11:17 AM, Lei Xu  wrote:
> Hi, All
>
> We have released Apache Hadoop 3.0.0 in December [1]. To further
> improve the quality of release, we plan to cut branch-3.0.1 branch
> tomorrow for the preparation of Apache Hadoop 3.0.1 release. The focus
> of 3.0.1 will be fixing blockers (3), critical bugs (1) and bug fixes
> [2].  No new features and improvement should be included.
>
> We plan to cut branch-3.0.1 tomorrow (Jan 10th) and vote for RC on Feb
> 1st, targeting for Feb 9th release.
>
> Please feel free to share your insights.
>
> [1] https://www.mail-archive.com/general@hadoop.apache.org/msg07757.html
> [2] https://issues.apache.org/jira/issues/?filter=12342842
>
> Best,
> --
> Lei (Eddy) Xu
> Software Engineer, Cloudera



-- 
Lei (Eddy) Xu
Software Engineer, Cloudera

-
To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-dev-h...@hadoop.apache.org




-
To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-dev-h...@hadoop.apache.org


Re: Apache Hadoop 3.0.1 Release plan

2018-02-01 Thread Lei Xu
Hi, All

I just cut branch-3.0.1 from branch-3.0.  Please make sure all patches
targeted to 3.0.1 are checked into both branch-3.0 and branch-3.0.1.

Thanks!
Eddy

On Tue, Jan 9, 2018 at 11:17 AM, Lei Xu  wrote:
> Hi, All
>
> We have released Apache Hadoop 3.0.0 in December [1]. To further
> improve the quality of release, we plan to cut branch-3.0.1 branch
> tomorrow for the preparation of Apache Hadoop 3.0.1 release. The focus
> of 3.0.1 will be fixing blockers (3), critical bugs (1) and bug fixes
> [2].  No new features and improvement should be included.
>
> We plan to cut branch-3.0.1 tomorrow (Jan 10th) and vote for RC on Feb
> 1st, targeting for Feb 9th release.
>
> Please feel free to share your insights.
>
> [1] https://www.mail-archive.com/general@hadoop.apache.org/msg07757.html
> [2] https://issues.apache.org/jira/issues/?filter=12342842
>
> Best,
> --
> Lei (Eddy) Xu
> Software Engineer, Cloudera



-- 
Lei (Eddy) Xu
Software Engineer, Cloudera

-
To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-dev-h...@hadoop.apache.org



Apache Hadoop qbt Report: branch2+JDK7 on Linux/x86

2018-02-01 Thread Apache Jenkins Server
For more details, see 
https://builds.apache.org/job/hadoop-qbt-branch2-java7-linux-x86/123/

[Jan 31, 2018 5:14:31 PM] (inigoiri) HDFS-13044. RBF: Add a safe mode for the 
Router. Contributed by Inigo




-1 overall


The following subsystems voted -1:
docker


Powered by Apache Yetus 0.8.0-SNAPSHOT   http://yetus.apache.org

-
To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-dev-h...@hadoop.apache.org

Apache Hadoop qbt Report: trunk+JDK8 on Linux/x86

2018-02-01 Thread Apache Jenkins Server
For more details, see 
https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/675/

[Jan 31, 2018 5:37:54 PM] (jlowe) YARN-7677. Docker image cannot set 
HADOOP_CONF_DIR. Contributed by Jim
[Jan 31, 2018 6:47:02 PM] (xyao) HDFS-13061. 
SaslDataTransferClient#checkTrustAndSend should not trust a
[Jan 31, 2018 7:05:17 PM] (hanishakoneru) HDFS-13092. Reduce verbosity for 
ThrottledAsyncChecker#schedule.
[Jan 31, 2018 9:45:30 PM] (epayne) MAPREDUCE-7033: Map outputs implicitly rely 
on permissive umask for
[Feb 1, 2018 1:51:40 AM] (eyang) YARN-7816.  Allow same application name 
submitted by multiple users. 
[Feb 1, 2018 6:39:51 AM] (xyao) HDFS-13060. Adding a 
BlacklistBasedTrustedChannelResolver for
[Feb 1, 2018 6:50:25 AM] (xiao) HDFS-12897. getErasureCodingPolicy should 
handle .snapshot dir better.

-
To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-dev-h...@hadoop.apache.org

[jira] [Created] (HADOOP-15205) maven release: missing source attachments for hadoop-mapreduce-client-core

2018-02-01 Thread Zoltan Haindrich (JIRA)
Zoltan Haindrich created HADOOP-15205:
-

 Summary: maven release: missing source attachments for 
hadoop-mapreduce-client-core
 Key: HADOOP-15205
 URL: https://issues.apache.org/jira/browse/HADOOP-15205
 Project: Hadoop Common
  Issue Type: Bug
Affects Versions: 3.0.0, 2.7.5
Reporter: Zoltan Haindrich


I wanted to use the source attachment; however, it looks like that artifact has 
not been present at Maven Central since 2.7.5. The last release that had source 
attachments / javadocs appears to be 2.7.4:
http://central.maven.org/maven2/org/apache/hadoop/hadoop-mapreduce-client-core/2.7.4/
http://central.maven.org/maven2/org/apache/hadoop/hadoop-mapreduce-client-core/2.7.5/

This does not seem to be limited to mapreduce; the same change is present for 
yarn-common as well:
http://central.maven.org/maven2/org/apache/hadoop/hadoop-yarn-common/2.7.4/
http://central.maven.org/maven2/org/apache/hadoop/hadoop-yarn-common/2.7.5/

and also for hadoop-common:
http://central.maven.org/maven2/org/apache/hadoop/hadoop-common/2.7.4/
http://central.maven.org/maven2/org/apache/hadoop/hadoop-common/2.7.5/
http://central.maven.org/maven2/org/apache/hadoop/hadoop-common/3.0.0/
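
A quick way to double-check this is an HTTP HEAD request against the corresponding 
-sources.jar URLs. The sketch below is illustrative only (the class SourcesJarCheck and 
its helper are made up for this purpose); it simply asks central.maven.org, as used in 
the links above, whether the source attachment exists for a given artifact and version, 
so a 200 response for 2.7.4 and a 404 for 2.7.5 would confirm the report.

{code:java}
import java.net.HttpURLConnection;
import java.net.URL;

public class SourcesJarCheck {

  // Returns the HTTP status code for the -sources.jar of the given artifact/version.
  static int sourcesStatus(String artifact, String version) throws Exception {
    String url = "http://central.maven.org/maven2/org/apache/hadoop/" + artifact + "/"
        + version + "/" + artifact + "-" + version + "-sources.jar";
    HttpURLConnection conn = (HttpURLConnection) new URL(url).openConnection();
    conn.setRequestMethod("HEAD");
    int code = conn.getResponseCode();
    conn.disconnect();
    return code;
  }

  public static void main(String[] args) throws Exception {
    for (String version : new String[] {"2.7.4", "2.7.5"}) {
      System.out.println("hadoop-mapreduce-client-core " + version + " sources: "
          + sourcesStatus("hadoop-mapreduce-client-core", version));
    }
  }
}
{code}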





--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-dev-h...@hadoop.apache.org