[jira] [Commented] (MAPREDUCE-6417) MapReduceClient's primitives.h is toxic and should be extirpated

2015-12-09 Thread Colin Patrick McCabe (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6417?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15048950#comment-15048950
 ] 

Colin Patrick McCabe commented on MAPREDUCE-6417:
-

I have to admit, it would be nice to get rid of all those reimplementations of 
standard library functions.  [~decster], [~tlipcon], [~clockfly], any objection 
to this patch?

> MapReduceClient's primitives.h is toxic and should be extirpated
> 
>
> Key: MAPREDUCE-6417
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6417
> Project: Hadoop Map/Reduce
>  Issue Type: Sub-task
>  Components: client
>Affects Versions: 3.0.0
>Reporter: Alan Burlison
>Assignee: Alan Burlison
>Priority: Blocker
> Attachments: MAPREDUCE-6417.001.patch
>
>
> MapReduceClient's primitives.h attempts to provide optimised versions of 
> standard library memory copy and comparison functions. It has been the 
> subject of several portability-related bugs:
> * HADOOP-11505 hadoop-mapreduce-client-nativetask uses bswap where be32toh is 
> needed, doesn't work on non-x86
> * HADOOP-11665 Provide and unify cross platform byteorder support in native 
> code
> * MAPREDUCE-6397 MAPREDUCE makes many endian-dependent assumptions
> * HADOOP-11484 hadoop-mapreduce-client-nativetask fails to build on ARM 
> AARCH64 due to x86 asm statements
> At present it only works on x86 and ARM64 as it lacks definitions for bswap 
> and bswap64 for any platforms other than those.
> However it has even more serious problems on non-x86 architectures, for 
> example on SPARC simple_memcpy simply doesn't work at all:
> {code}
> $ cat bang.cc
> #include 
> #define SIMPLE_MEMCPY
> #include "primitives.h"
> int main(int argc, char **argv)
> {
> char b1[9];
> char b2[9];
> simple_memcpy(b2, b1, sizeof(b1));
> }
> $ gcc -o bang bang.cc && ./bang
> Bus Error (core dumped)
> {code}
> That's because simple_memcpy does pointer fiddling that results in misaligned 
> accesses, which are illegal on SPARC.
> fmemcmp is also broken. Even if a definition of bswap is provided, on 
> big-endian architectures the result is simply wrong because of its 
> unconditional use of bswap:
> {code}
> $ cat thud.cc
> #include 
> #include 
> #include "primitives.h"
> int main(int argc, char **argv)
> {
> char a[] = { 0,1,2,0 };
> char b[] = { 0,2,1,0 };
> printf("%lld %d\n", fmemcmp(a, b, sizeof(a), memcmp(a, b, sizeof(a;
> }
> $ g++ -o thud thud.cc && ./thud
> 65280 -1
> {code}
> And in addition fmemcmp suffers from the same misalignment issues as 
> simple_memcpy and coredumps on SPARC when asked to compare odd-sized buffers.
> primitives.h provides the following functions:
> * bswap - used in 12 files in MRC but as HADOOP-11505 points out, mostly 
> incorrectly as it takes no account of platform endianness
> * bswap64 - used in 4 files in MRC, same comments as per bswap apply
> * simple_memcpy - used in 3 files in MRC, should be replaced with the 
> standard memcpy
> * fmemcmp - used in 1 file, should be replaced with the standard memcmp
> * fmemeq - used in 1 file, should be replaced with the standard memcmp
> * frmemeq - not used at all, should just be removed
> *Summary*: primitives.h should simply be deleted and replaced with the 
> standard memory copy & compare functions, or with thin wrappers around them 
> where the APIs are different.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (MAPREDUCE-6538) Deprecate hadoop-pipes

2015-11-06 Thread Colin Patrick McCabe (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6538?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14994608#comment-14994608
 ] 

Colin Patrick McCabe commented on MAPREDUCE-6538:
-

bq. \[The Java client APIs provide significant advantages that neither 
streaming nor pipes provide\]... is a false statement. Partitioning, for 
example, can't be done natively in streaming code but can in pipes. In 
streaming, you can only provide a Java class.

I agree that supporting partitioning is an advantage of pipes that streaming 
doesn't have.  There are still advantages that the Java API has over both, 
which is the point I was making.  I also don't see a fundamental reason why 
streaming couldn't be extended to provide this, which would be beneficial to 
languages like Python that can't use pipes.

bq. Correct. Because if the code is being written MR in C++, why would one use 
the less functional streaming API? If one believes that MR jobs consist of 
nothing but reading and writing KVs I could see that, but there's a lot more 
going on under the hood in more advanced jobs. That functionality is just 
flat-out not available in streaming.

I would personally prefer to either use a JVM language or deal with the simple 
and clean stdout/stdin paradigm of streaming, than deal with pipes.

There is a lot of technical debt in pipes.  It is hardcoded to output log 
messages to stderr using {{fprintf}}.  Keys and values need to be serialized to 
C++ {{std::string}} objects.  It doesn't follow the same coding style as the 
other C++ code in Hadoop.  It builds at {{\-O0}} and doesn't generate a 
{{.so}}, just a {{.a}}.  There is no unit test suite, no concept of what the 
API is or how it's allowed to change over time, and very little documentation.

[~aw], since you are committed to keeping pipes around, can you please file 
follow-on JIRAs for fixing these issues and link them to this JIRA?  I will 
close this as WONTFIX.  We can always revisit this later if things change.

> Deprecate hadoop-pipes
> --
>
> Key: MAPREDUCE-6538
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6538
> Project: Hadoop Map/Reduce
>  Issue Type: Wish
>  Components: pipes
>Reporter: Colin Patrick McCabe
>Assignee: Colin Patrick McCabe
>Priority: Minor
>
> Development appears to have stopped on hadoop-pipes upstream for the last few 
> years, aside from very basic maintenance.  Hadoop streaming seems to be a 
> better alternative, since it supports more programming languages and is 
> better implemented.
> There were no responses to a message on the mailing list asking for users of 
> Hadoop pipes... and in my experience, I have never seen anyone use this.  We 
> should remove it to reduce our maintenance burden and build times.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (MAPREDUCE-6538) Deprecate hadoop-pipes

2015-11-06 Thread Colin Patrick McCabe (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-6538?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Colin Patrick McCabe resolved MAPREDUCE-6538.
-
Resolution: Won't Fix

> Deprecate hadoop-pipes
> --
>
> Key: MAPREDUCE-6538
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6538
> Project: Hadoop Map/Reduce
>  Issue Type: Wish
>  Components: pipes
>Reporter: Colin Patrick McCabe
>Assignee: Colin Patrick McCabe
>Priority: Minor
>
> Development appears to have stopped on hadoop-pipes upstream for the last few 
> years, aside from very basic maintenance.  Hadoop streaming seems to be a 
> better alternative, since it supports more programming languages and is 
> better implemented.
> There were no responses to a message on the mailing list asking for users of 
> Hadoop pipes... and in my experience, I have never seen anyone use this.  We 
> should remove it to reduce our maintenance burden and build times.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (MAPREDUCE-6241) Native compilation fails for Checksum.cc due to an incompatibility of assembler register constraint for PowerPC

2015-09-10 Thread Colin Patrick McCabe (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6241?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14740050#comment-14740050
 ] 

Colin Patrick McCabe commented on MAPREDUCE-6241:
-

We should understand why the code is there even if it copied from somewhere 
else (actually, especially if it was copied).

In general, the organization of bulk_crc32.c could be improved.  Having so many 
ifdefs makes it difficult to figure out what code is actually being called and 
do reviews.  I would like to see the hardware-specific parts into separate 
files rather than having so many ifdefs in the code.

> Native compilation fails for Checksum.cc due to an  incompatibility of 
> assembler register constraint for PowerPC
> 
>
> Key: MAPREDUCE-6241
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6241
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: build
>Affects Versions: 3.0.0, 2.6.0
> Environment: Debian/Jessie, kernel 3.18.5,  ppc64 GNU/Linux
> gcc (Debian 4.9.1-19)
> protobuf 2.6.1
> OpenJDK Runtime Environment (IcedTea 2.5.3) (7u71-2.5.3-2)
> OpenJDK Zero VM (build 24.65-b04, interpreted mode)
> source was cloned (and updated) from Apache-Hadoop's git repository 
>Reporter: Stephan Drescher
>Assignee: Binglin Chang
>  Labels: BB2015-05-TBR, features
> Attachments: MAPREDUCE-6241.001.patch, MAPREDUCE-6241.002.patch, 
> MAPREDUCE-6241.003.patch
>
>
> Issue when using assembler code for performance optimization on the powerpc 
> platform (compiled for 32bit)
> mvn compile -Pnative -DskipTests
> [exec] /usr/bin/c++   -Dnativetask_EXPORTS -m32  -DSIMPLE_MEMCPY 
> -fno-strict-aliasing -Wall -Wno-sign-compare -g -O2 -DNDEBUG -fPIC 
> -I/home/hadoop/Developer/hadoop/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-nativetask/target/native/javah
>  
> -I/home/hadoop/Developer/hadoop/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-nativetask/src/main/native/src
>  
> -I/home/hadoop/Developer/hadoop/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-nativetask/src/main/native/src/util
>  
> -I/home/hadoop/Developer/hadoop/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-nativetask/src/main/native/src/lib
>  
> -I/home/hadoop/Developer/hadoop/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-nativetask/src/main/native/test
>  
> -I/home/hadoop/Developer/hadoop/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-nativetask/src
>  
> -I/home/hadoop/Developer/hadoop/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-nativetask/target/native
>  -I/home/hadoop/Java/java7/include -I/home/hadoop/Java/java7/include/linux 
> -isystem 
> /home/hadoop/Developer/hadoop/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-nativetask/src/main/native/gtest/include
> -o CMakeFiles/nativetask.dir/main/native/src/util/Checksum.cc.o -c 
> /home/hadoop/Developer/hadoop/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-nativetask/src/main/native/src/util/Checksum.cc
>  [exec] CMakeFiles/nativetask.dir/build.make:744: recipe for target 
> 'CMakeFiles/nativetask.dir/main/native/src/util/Checksum.cc.o' failed
>  [exec] make[2]: Leaving directory 
> '/home/hadoop/Developer/hadoop/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-nativetask/target/native'
>  [exec] CMakeFiles/Makefile2:95: recipe for target 
> 'CMakeFiles/nativetask.dir/all' failed
>  [exec] make[1]: Leaving directory 
> '/home/hadoop/Developer/hadoop/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-nativetask/target/native'
>  [exec] Makefile:76: recipe for target 'all' failed
>  [exec] 
> /home/hadoop/Developer/hadoop/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-nativetask/src/main/native/src/util/Checksum.cc:
>  In function ‘void NativeTask::init_cpu_support_flag()’:
> /home/hadoop/Developer/hadoop/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-nativetask/src/main/native/src/util/Checksum.cc:611:14:
>  error: impossible register constraint in ‘asm’
> -->
> "popl %%ebx" : "=a" (eax), [ebx] "=r"(ebx), "=c"(ecx), "=d"(edx) : "a" 
> (eax_in) : "cc");
> <--



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (MAPREDUCE-6241) Native compilation fails for Checksum.cc due to an incompatibility of assembler register constraint for PowerPC

2015-09-10 Thread Colin Patrick McCabe (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6241?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14739390#comment-14739390
 ] 

Colin Patrick McCabe commented on MAPREDUCE-6241:
-

Why is the patch checking for __GNUC__?

> Native compilation fails for Checksum.cc due to an  incompatibility of 
> assembler register constraint for PowerPC
> 
>
> Key: MAPREDUCE-6241
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6241
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: build
>Affects Versions: 3.0.0, 2.6.0
> Environment: Debian/Jessie, kernel 3.18.5,  ppc64 GNU/Linux
> gcc (Debian 4.9.1-19)
> protobuf 2.6.1
> OpenJDK Runtime Environment (IcedTea 2.5.3) (7u71-2.5.3-2)
> OpenJDK Zero VM (build 24.65-b04, interpreted mode)
> source was cloned (and updated) from Apache-Hadoop's git repository 
>Reporter: Stephan Drescher
>Assignee: Binglin Chang
>  Labels: BB2015-05-TBR, features
> Attachments: MAPREDUCE-6241.001.patch, MAPREDUCE-6241.002.patch, 
> MAPREDUCE-6241.003.patch
>
>
> Issue when using assembler code for performance optimization on the powerpc 
> platform (compiled for 32bit)
> mvn compile -Pnative -DskipTests
> [exec] /usr/bin/c++   -Dnativetask_EXPORTS -m32  -DSIMPLE_MEMCPY 
> -fno-strict-aliasing -Wall -Wno-sign-compare -g -O2 -DNDEBUG -fPIC 
> -I/home/hadoop/Developer/hadoop/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-nativetask/target/native/javah
>  
> -I/home/hadoop/Developer/hadoop/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-nativetask/src/main/native/src
>  
> -I/home/hadoop/Developer/hadoop/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-nativetask/src/main/native/src/util
>  
> -I/home/hadoop/Developer/hadoop/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-nativetask/src/main/native/src/lib
>  
> -I/home/hadoop/Developer/hadoop/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-nativetask/src/main/native/test
>  
> -I/home/hadoop/Developer/hadoop/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-nativetask/src
>  
> -I/home/hadoop/Developer/hadoop/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-nativetask/target/native
>  -I/home/hadoop/Java/java7/include -I/home/hadoop/Java/java7/include/linux 
> -isystem 
> /home/hadoop/Developer/hadoop/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-nativetask/src/main/native/gtest/include
> -o CMakeFiles/nativetask.dir/main/native/src/util/Checksum.cc.o -c 
> /home/hadoop/Developer/hadoop/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-nativetask/src/main/native/src/util/Checksum.cc
>  [exec] CMakeFiles/nativetask.dir/build.make:744: recipe for target 
> 'CMakeFiles/nativetask.dir/main/native/src/util/Checksum.cc.o' failed
>  [exec] make[2]: Leaving directory 
> '/home/hadoop/Developer/hadoop/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-nativetask/target/native'
>  [exec] CMakeFiles/Makefile2:95: recipe for target 
> 'CMakeFiles/nativetask.dir/all' failed
>  [exec] make[1]: Leaving directory 
> '/home/hadoop/Developer/hadoop/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-nativetask/target/native'
>  [exec] Makefile:76: recipe for target 'all' failed
>  [exec] 
> /home/hadoop/Developer/hadoop/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-nativetask/src/main/native/src/util/Checksum.cc:
>  In function ‘void NativeTask::init_cpu_support_flag()’:
> /home/hadoop/Developer/hadoop/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-nativetask/src/main/native/src/util/Checksum.cc:611:14:
>  error: impossible register constraint in ‘asm’
> -->
> "popl %%ebx" : "=a" (eax), [ebx] "=r"(ebx), "=c"(ecx), "=d"(edx) : "a" 
> (eax_in) : "cc");
> <--



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (MAPREDUCE-6407) Migrate MAPREDUCE nativetask build to new CMake framework

2015-06-30 Thread Colin Patrick McCabe (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-6407?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Colin Patrick McCabe updated MAPREDUCE-6407:

   Resolution: Fixed
Fix Version/s: 3.0.0
   Status: Resolved  (was: Patch Available)

> Migrate MAPREDUCE nativetask build to new CMake framework
> -
>
> Key: MAPREDUCE-6407
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6407
> Project: Hadoop Map/Reduce
>  Issue Type: Sub-task
>  Components: build
>Affects Versions: 3.0.0
>Reporter: Alan Burlison
>Assignee: Alan Burlison
> Fix For: 3.0.0
>
> Attachments: MAPREDUCE-6407.001.patch, MAPREDUCE-6407.002.patch, 
> MAPREDUCE-6407.003.patch
>
>
> As per HADOOP-12036, the CMake infrastructure should be refactored and made 
> common across all Hadoop components. This bug covers the migration of 
> MAPREDUCE to the new CMake infrastructure. This change will also add support 
> for building MAPREDUCE Native components on Solaris.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (MAPREDUCE-6407) Migrate MAPREDUCE nativetask build to new CMake framework

2015-06-30 Thread Colin Patrick McCabe (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-6407?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Colin Patrick McCabe updated MAPREDUCE-6407:

Affects Version/s: (was: 2.7.0)
   3.0.0
 Target Version/s: 3.0.0

this is a trunk issue only, since the nativetask code is not in earlier 
branches.

> Migrate MAPREDUCE nativetask build to new CMake framework
> -
>
> Key: MAPREDUCE-6407
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6407
> Project: Hadoop Map/Reduce
>  Issue Type: Sub-task
>  Components: build
>Affects Versions: 3.0.0
>Reporter: Alan Burlison
>Assignee: Alan Burlison
> Attachments: MAPREDUCE-6407.001.patch, MAPREDUCE-6407.002.patch, 
> MAPREDUCE-6407.003.patch
>
>
> As per HADOOP-12036, the CMake infrastructure should be refactored and made 
> common across all Hadoop components. This bug covers the migration of 
> MAPREDUCE to the new CMake infrastructure. This change will also add support 
> for building MAPREDUCE Native components on Solaris.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (MAPREDUCE-6407) Migrate MAPREDUCE nativetask build to new CMake framework

2015-06-30 Thread Colin Patrick McCabe (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6407?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14609229#comment-14609229
 ] 

Colin Patrick McCabe commented on MAPREDUCE-6407:
-

+1, thanks

> Migrate MAPREDUCE nativetask build to new CMake framework
> -
>
> Key: MAPREDUCE-6407
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6407
> Project: Hadoop Map/Reduce
>  Issue Type: Sub-task
>  Components: build
>Affects Versions: 2.7.0
>Reporter: Alan Burlison
>Assignee: Alan Burlison
> Attachments: MAPREDUCE-6407.001.patch, MAPREDUCE-6407.002.patch, 
> MAPREDUCE-6407.003.patch
>
>
> As per HADOOP-12036, the CMake infrastructure should be refactored and made 
> common across all Hadoop components. This bug covers the migration of 
> MAPREDUCE to the new CMake infrastructure. This change will also add support 
> for building MAPREDUCE Native components on Solaris.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (MAPREDUCE-6407) Migrate MAPREDUCE nativetask build to new CMake framework

2015-06-30 Thread Colin Patrick McCabe (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-6407?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Colin Patrick McCabe updated MAPREDUCE-6407:

Summary: Migrate MAPREDUCE nativetask build to new CMake framework  (was: 
Migrate MAPREDUCE native build to new CMake framework)

> Migrate MAPREDUCE nativetask build to new CMake framework
> -
>
> Key: MAPREDUCE-6407
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6407
> Project: Hadoop Map/Reduce
>  Issue Type: Sub-task
>  Components: build
>Affects Versions: 2.7.0
>Reporter: Alan Burlison
>Assignee: Alan Burlison
> Attachments: MAPREDUCE-6407.001.patch, MAPREDUCE-6407.002.patch, 
> MAPREDUCE-6407.003.patch
>
>
> As per HADOOP-12036, the CMake infrastructure should be refactored and made 
> common across all Hadoop components. This bug covers the migration of 
> MAPREDUCE to the new CMake infrastructure. This change will also add support 
> for building MAPREDUCE Native components on Solaris.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (MAPREDUCE-5430) TestMRApps#testSetClasspathWithArchives is failing

2013-07-29 Thread Colin Patrick McCabe (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5430?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Colin Patrick McCabe updated MAPREDUCE-5430:


Resolution: Duplicate
  Assignee: Colin Patrick McCabe
Status: Resolved  (was: Patch Available)

Andrew is rolling this into his new HADOOP-9652 patch

> TestMRApps#testSetClasspathWithArchives is failing
> --
>
> Key: MAPREDUCE-5430
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5430
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: distributed-cache, mrv2
>Affects Versions: 2.3.0
>Reporter: Jason Lowe
>Assignee: Colin Patrick McCabe
>
> TestMRApps#testSetClasspathWithArchives is failing, stacktrace to follow.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (MAPREDUCE-5430) TestMRApps#testSetClasspathWithArchives is failing

2013-07-29 Thread Colin Patrick McCabe (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5430?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13723210#comment-13723210
 ] 

Colin Patrick McCabe commented on MAPREDUCE-5430:
-

I have managed to isolate the problem.  The old code stripped the fragment in 
{{FileSystem#resolvePath}}; the new code does not.

Here's a unit test which only passes on the previous code:
{code}
  @Test (timeout = 12)
  public void testResolvePath() throws Exception {
FileSystem fs = FileSystem.get(URI.create("file:///tmp/"), new 
Configuration());
URI preUri = new URI("file:///tmp#fragment");
Path pre = new Path(preUri);
Path post = fs.resolvePath(pre);
Assert.assertEquals("resolvePath did not strip the fragment",
"file:/tmp", post.toString());
  }
{code}

I guess we can continue this discussion on HADOOP-9652, now that we've decided 
to roll this fix into that one.

> TestMRApps#testSetClasspathWithArchives is failing
> --
>
> Key: MAPREDUCE-5430
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5430
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: distributed-cache, mrv2
>Affects Versions: 2.3.0
>Reporter: Jason Lowe
>
> TestMRApps#testSetClasspathWithArchives is failing, stacktrace to follow.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (MAPREDUCE-5430) TestMRApps#testSetClasspathWithArchives is failing

2013-07-29 Thread Colin Patrick McCabe (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5430?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13723027#comment-13723027
 ] 

Colin Patrick McCabe commented on MAPREDUCE-5430:
-

I think what's going on here is that {{DistributedCache}} is expecting 
{{LocalFileSystem}} to preserve "fragments" (i.e., the things that come after 
the hash mark in URIs) that are found in paths.

Example:
{code}
conf.set(MRJobConfig.CACHE_ARCHIVES, testTGZQualifiedPath + "#testTGZ");
{code}

I'll remove my earlier patch.

> TestMRApps#testSetClasspathWithArchives is failing
> --
>
> Key: MAPREDUCE-5430
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5430
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: distributed-cache, mrv2
>Affects Versions: 2.3.0
>Reporter: Jason Lowe
>
> TestMRApps#testSetClasspathWithArchives is failing, stacktrace to follow.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (MAPREDUCE-5430) TestMRApps#testSetClasspathWithArchives is failing

2013-07-29 Thread Colin Patrick McCabe (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5430?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Colin Patrick McCabe updated MAPREDUCE-5430:


Attachment: (was: HDFS-5027.001.patch)

> TestMRApps#testSetClasspathWithArchives is failing
> --
>
> Key: MAPREDUCE-5430
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5430
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: distributed-cache, mrv2
>Affects Versions: 2.3.0
>Reporter: Jason Lowe
>
> TestMRApps#testSetClasspathWithArchives is failing, stacktrace to follow.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (MAPREDUCE-5430) TestMRApps#testSetClasspathWithArchives is failing

2013-07-29 Thread Colin Patrick McCabe (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5430?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Colin Patrick McCabe updated MAPREDUCE-5430:


Attachment: HDFS-5027.001.patch

Shouldn't it be checking for "test.tgz" in the CLASSPATH, not "testTGZ"?  Here 
is a patch which does that, after which the test passes.

> TestMRApps#testSetClasspathWithArchives is failing
> --
>
> Key: MAPREDUCE-5430
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5430
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: distributed-cache, mrv2
>Affects Versions: 2.3.0
>Reporter: Jason Lowe
> Attachments: HDFS-5027.001.patch
>
>
> TestMRApps#testSetClasspathWithArchives is failing, stacktrace to follow.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (MAPREDUCE-5430) TestMRApps#testSetClasspathWithArchives is failing

2013-07-29 Thread Colin Patrick McCabe (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5430?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Colin Patrick McCabe updated MAPREDUCE-5430:


Status: Patch Available  (was: Open)

> TestMRApps#testSetClasspathWithArchives is failing
> --
>
> Key: MAPREDUCE-5430
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5430
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: distributed-cache, mrv2
>Affects Versions: 2.3.0
>Reporter: Jason Lowe
> Attachments: HDFS-5027.001.patch
>
>
> TestMRApps#testSetClasspathWithArchives is failing, stacktrace to follow.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (MAPREDUCE-4953) HadoopPipes misuses fprintf

2013-02-04 Thread Colin Patrick McCabe (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4953?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13571039#comment-13571039
 ] 

Colin Patrick McCabe commented on MAPREDUCE-4953:
-

looks good to me

> HadoopPipes misuses fprintf
> ---
>
> Key: MAPREDUCE-4953
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4953
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: pipes
>Affects Versions: 3.0.0, 2.0.3-alpha
>Reporter: Andy Isaacson
>Assignee: Andy Isaacson
> Attachments: mapreduce-4953.txt
>
>
> {code}
>  [exec] 
> /mnt/trunk/hadoop-tools/hadoop-pipes/src/main/native/pipes/impl/HadoopPipes.cc:130:58:
>  warning: format not a string literal and no format arguments 
> [-Wformat-security]
> {code}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (MAPREDUCE-4654) TestDistCp is @ignored

2012-09-26 Thread Colin Patrick McCabe (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4654?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13463943#comment-13463943
 ] 

Colin Patrick McCabe commented on MAPREDUCE-4654:
-

Looks good to me.  It would be nice to have a test for {{\-skipcrccheck}}, but 
I guess we can do that later (there wasn't one in the original TestDistCp)

> TestDistCp is @ignored
> --
>
> Key: MAPREDUCE-4654
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4654
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: test
>Affects Versions: 2.0.2-alpha
>Reporter: Colin Patrick McCabe
>Assignee: Sandy Ryza
>Priority: Critical
> Attachments: MAPREDUCE-4654.patch
>
>
> We should fix TestDistCp so that it actually runs, rather than being ignored.
> {code}
> @ignore
> public class TestDistCp {
>   private static final Log LOG = LogFactory.getLog(TestDistCp.class);
>   private static List pathList = new ArrayList();
>   ...
> {code}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (MAPREDUCE-4644) mapreduce-client-jobclient-tests do not run from dist tarball

2012-09-13 Thread Colin Patrick McCabe (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4644?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13455325#comment-13455325
 ] 

Colin Patrick McCabe commented on MAPREDUCE-4644:
-

Is there a workaround?  Not being able to run the mapreduce client tests is 
frustrating.

> mapreduce-client-jobclient-tests do not run from dist tarball
> -
>
> Key: MAPREDUCE-4644
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4644
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: build, test
>Affects Versions: 2.0.2-alpha
>Reporter: Jason Lowe
>Priority: Blocker
>
> The mapreduce jobclient tests rely on junit which is missing from the dist 
> tarball.  This prevents running often-used tests like sleep jobs.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Resolved] (MAPREDUCE-4656) Can't run TestDFSIO due to junit dependency

2012-09-13 Thread Colin Patrick McCabe (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4656?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Colin Patrick McCabe resolved MAPREDUCE-4656.
-

Resolution: Duplicate

> Can't run TestDFSIO due to junit dependency
> ---
>
> Key: MAPREDUCE-4656
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4656
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Reporter: Colin Patrick McCabe
>
> TestDFSIO can't be run from the tarball any more.  The tarball does not 
> bundle junit, and TestDFSIO makes use of that library.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (MAPREDUCE-4656) Can't run TestDFSIO due to junit dependency

2012-09-13 Thread Colin Patrick McCabe (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4656?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13455284#comment-13455284
 ] 

Colin Patrick McCabe commented on MAPREDUCE-4656:
-

Full exception text:

{code}
 cmccabe@keter:/h> ./bin/hadoop jar
./share/hadoop/hdfs/hadoop-hdfs-3.0.0-SNAPSHOT-tests.jar
./share/hadoop/mapreduce/hadoop-mapreduce-client-jobclient-3.0.0-S
NAPSHOT-tests.jar TestDFSIO -write -nrFiles 4
Exception in thread "main" java.lang.ClassNotFoundException:
//share/hadoop/mapreduce/hadoop-mapreduce-client-jobclient-3/0/0-SNAPSHOT-tests/jar
at java.lang.Class.forName0(Native Method)
at java.lang.Class.forName(Class.java:247)
at org.apache.hadoop.util.RunJar.main(RunJar.java:201)
cmccabe@keter:/h> ./bin/hadoop jar
./share/hadoop/mapreduce/hadoop-mapreduce-client-jobclient-3.0.0-SNAPSHOT-tests.jar
TestDFSIO -write -nrFiles 4
java.lang.NoClassDefFoundError: junit/framework/TestCase
at java.lang.ClassLoader.defineClass1(Native Method)
at java.lang.ClassLoader.defineClassCond(ClassLoader.java:631)
at java.lang.ClassLoader.defineClass(ClassLoader.java:615)
at 
java.security.SecureClassLoader.defineClass(SecureClassLoader.java:141)
at java.net.URLClassLoader.defineClass(URLClassLoader.java:283)
at java.net.URLClassLoader.access$000(URLClassLoader.java:58)
at java.net.URLClassLoader$1.run(URLClassLoader.java:197)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
at java.lang.ClassLoader.loadClass(ClassLoader.java:306)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301)
at java.lang.ClassLoader.loadClass(ClassLoader.java:247)
at 
org.apache.hadoop.test.MapredTestDriver.(MapredTestDriver.java:60)
at 
org.apache.hadoop.test.MapredTestDriver.(MapredTestDriver.java:54)
at 
org.apache.hadoop.test.MapredTestDriver.main(MapredTestDriver.java:123)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at org.apache.hadoop.util.RunJar.main(RunJar.java:208)
Caused by: java.lang.ClassNotFoundException: junit.framework.TestCase
at java.net.URLClassLoader$1.run(URLClassLoader.java:202)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
at java.lang.ClassLoader.loadClass(ClassLoader.java:306)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301)
at java.lang.ClassLoader.loadClass(ClassLoader.java:247)
... 20 more
Unknown program 'TestDFSIO' chosen.
Valid program names are:
{code}

Note that there is nothing that appears after TestDFSIO (i.e., no valid program 
names are listed.)

> Can't run TestDFSIO due to junit dependency
> ---
>
> Key: MAPREDUCE-4656
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4656
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Reporter: Colin Patrick McCabe
>
> TestDFSIO can't be run from the tarball any more.  The tarball does not 
> bundle junit, and TestDFSIO makes use of that library.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Created] (MAPREDUCE-4656) Can't run TestDFSIO due to junit dependency

2012-09-13 Thread Colin Patrick McCabe (JIRA)
Colin Patrick McCabe created MAPREDUCE-4656:
---

 Summary: Can't run TestDFSIO due to junit dependency
 Key: MAPREDUCE-4656
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4656
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Reporter: Colin Patrick McCabe


TestDFSIO can't be run from the tarball any more.  The tarball does not bundle 
junit, and TestDFSIO makes use of that library.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Created] (MAPREDUCE-4489) fuse_dfs: incorrect configuration value checked for connection expiry timer period

2012-07-26 Thread Colin Patrick McCabe (JIRA)
Colin Patrick McCabe created MAPREDUCE-4489:
---

 Summary: fuse_dfs: incorrect configuration value checked for 
connection expiry timer period
 Key: MAPREDUCE-4489
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4489
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Reporter: Colin Patrick McCabe
Assignee: Colin Patrick McCabe
Priority: Minor


In fuse_dfs, we check an incorrect hdfs configuration value checked for the 
connection expiry timer period.

{code}
gTimerPeriod = FUSE_CONN_DEFAULT_TIMER_PERIOD;
ret = hdfsConfGetInt(HADOOP_FUSE_CONNECTION_TIMEOUT, &gTimerPeriod);
if (ret) {
  fprintf(stderr, "Unable to determine the configured value for %s.",
HADOOP_FUSE_TIMER_PERIOD);
  return -EINVAL
}
{code}

We should be checking HADOOP_FUSE_TIMER_PERIOD.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (MAPREDUCE-4485) container-executor should deal with stdout, stderr better

2012-07-25 Thread Colin Patrick McCabe (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4485?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Colin Patrick McCabe updated MAPREDUCE-4485:


  Description: 
container-executor.c contains the following code:

{code}
  fclose(stdin);
  fflush(LOGFILE);
  if (LOGFILE != stdout) {
fclose(stdout);
  }
  if (ERRORFILE != stderr) {
fclose(stderr);
  }
  if (chdir(primary_app_dir) != 0) {
fprintf(LOGFILE, "Failed to chdir to app dir - %s\n", strerror(errno));
return -1;
  }
  execvp(args[0], args);
{code}

Whenever you open a new file descriptor, its number is the lowest available 
number.  So if {{stdout}} (fd number 1) has been closed, and you do 
open("/my/important/file"), you'll get assigned file descriptor 1.  This means 
that any printf statements in the program will be now printing to 
/my/important/file.  Oops!

The correct way to get rid of stdin, stdout, or stderr is not to close them, 
but to make them point to /dev/null.  {{dup2}} can be used for this purpose.

It looks like LOGFILE and ERRORFILE are always set to stdout and stderr at the 
moment.  However, this is a latent bug that should be fixed in case these are 
ever made configurable (which seems to have been the intent).

  was:
container-executor.c contains the following code:

{code}
  fclose(stdin);
  fflush(LOGFILE);
  if (LOGFILE != stdout) {
fclose(stdout);
  }
  if (ERRORFILE != stderr) {
fclose(stderr);
  }
  if (chdir(primary_app_dir) != 0) {
fprintf(LOGFILE, "Failed to chdir to app dir - %s\n", strerror(errno));
return -1;
  }
  execvp(args[0], args);
{code}

Whenever you open a new file descriptor, its number is the lowest available 
number.  So if {{stdout}} (fd number 1) has been closed, and you do 
open("/my/important/file"), you'll get assigned file descriptor 1.  This means 
that any printf statements in the program will be now printing to 
/my/important/file.  Oops!

The correct way to get rid of stdin, stdout, or stderr is not to close them, 
but to make them point to /dev/null.  {{dup2}} can be used for this purpose.

Another thing we should be doing in container-executor.c is closing any file 
descriptors we don't need.  Because container-executor was forked off of the 
JVM, any file that was open at the time the JVM called fork() will also be open 
for us.  These FDs will continue to be open even after the {{execve}}, unless 
we close them manually.  This could be both a resource leak and a security 
breach.

 Target Version/s: 2.0.1-alpha
Affects Version/s: 2.0.1-alpha
  Summary: container-executor should deal with stdout, stderr 
better  (was: container-executor should deal with file descriptors better)

> container-executor should deal with stdout, stderr better
> -
>
> Key: MAPREDUCE-4485
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4485
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: nodemanager
>Affects Versions: 2.0.1-alpha
>Reporter: Colin Patrick McCabe
>Priority: Minor
>
> container-executor.c contains the following code:
> {code}
>   fclose(stdin);
>   fflush(LOGFILE);
>   if (LOGFILE != stdout) {
> fclose(stdout);
>   }
>   if (ERRORFILE != stderr) {
> fclose(stderr);
>   }
>   if (chdir(primary_app_dir) != 0) {
> fprintf(LOGFILE, "Failed to chdir to app dir - %s\n", strerror(errno));
> return -1;
>   }
>   execvp(args[0], args);
> {code}
> Whenever you open a new file descriptor, its number is the lowest available 
> number.  So if {{stdout}} (fd number 1) has been closed, and you do 
> open("/my/important/file"), you'll get assigned file descriptor 1.  This 
> means that any printf statements in the program will be now printing to 
> /my/important/file.  Oops!
> The correct way to get rid of stdin, stdout, or stderr is not to close them, 
> but to make them point to /dev/null.  {{dup2}} can be used for this purpose.
> It looks like LOGFILE and ERRORFILE are always set to stdout and stderr at 
> the moment.  However, this is a latent bug that should be fixed in case these 
> are ever made configurable (which seems to have been the intent).

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (MAPREDUCE-4485) container-executor should deal with file descriptors better

2012-07-25 Thread Colin Patrick McCabe (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4485?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13422821#comment-13422821
 ] 

Colin Patrick McCabe commented on MAPREDUCE-4485:
-

@Todd: good catch.  I'll update the JIRA description.

@Andy: Yeah, it appears that LOGFILE and ERRORFILE can't be changed.  So it's a 
latent bug.

> container-executor should deal with file descriptors better
> ---
>
> Key: MAPREDUCE-4485
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4485
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: nodemanager
>Reporter: Colin Patrick McCabe
>Priority: Minor
>
> container-executor.c contains the following code:
> {code}
>   fclose(stdin);
>   fflush(LOGFILE);
>   if (LOGFILE != stdout) {
> fclose(stdout);
>   }
>   if (ERRORFILE != stderr) {
> fclose(stderr);
>   }
>   if (chdir(primary_app_dir) != 0) {
> fprintf(LOGFILE, "Failed to chdir to app dir - %s\n", strerror(errno));
> return -1;
>   }
>   execvp(args[0], args);
> {code}
> Whenever you open a new file descriptor, its number is the lowest available 
> number.  So if {{stdout}} (fd number 1) has been closed, and you do 
> open("/my/important/file"), you'll get assigned file descriptor 1.  This 
> means that any printf statements in the program will be now printing to 
> /my/important/file.  Oops!
> The correct way to get rid of stdin, stdout, or stderr is not to close them, 
> but to make them point to /dev/null.  {{dup2}} can be used for this purpose.
> Another thing we should be doing in container-executor.c is closing any file 
> descriptors we don't need.  Because container-executor was forked off of the 
> JVM, any file that was open at the time the JVM called fork() will also be 
> open for us.  These FDs will continue to be open even after the {{execve}}, 
> unless we close them manually.  This could be both a resource leak and a 
> security breach.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (MAPREDUCE-4485) container-executor should deal with file descriptors better

2012-07-25 Thread Colin Patrick McCabe (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4485?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Colin Patrick McCabe updated MAPREDUCE-4485:


Description: 
container-executor.c contains the following code:

{code}
  fclose(stdin);
  fflush(LOGFILE);
  if (LOGFILE != stdout) {
fclose(stdout);
  }
  if (ERRORFILE != stderr) {
fclose(stderr);
  }
  if (chdir(primary_app_dir) != 0) {
fprintf(LOGFILE, "Failed to chdir to app dir - %s\n", strerror(errno));
return -1;
  }
  execvp(args[0], args);
{code}

Whenever you open a new file descriptor, its number is the lowest available 
number.  So if {{stdout}} (fd number 1) has been closed, and you do 
open("/my/important/file"), you'll get assigned file descriptor 1.  This means 
that any printf statements in the program will be now printing to 
/my/important/file.  Oops!

The correct way to get rid of stdin, stdout, or stderr is not to close them, 
but to make them point to /dev/null.  {{dup2}} can be used for this purpose.

Another thing we should be doing in container-executor.c is closing any file 
descriptors we don't need.  Because container-executor was forked off of the 
JVM, any file that was open at the time the JVM called fork() will also be open 
for us.  These FDs will continue to be open even after the {{execve}}, unless 
we close them manually.  This could be both a resource leak and a security 
breach.

  was:
container-executor.c contains the following code:

{code}
  fclose(stdin);
  fflush(LOGFILE);
  if (LOGFILE != stdout) {
fclose(stdout);
  }
  if (ERRORFILE != stderr) {
fclose(stderr);
  }
  if (chdir(primary_app_dir) != 0) {
fprintf(LOGFILE, "Failed to chdir to app dir - %s\n", strerror(errno));
return -1;
  }
  execvp(args[0], args);
{code}

Whenever you open a new file descriptor, its number is the lowest available 
number.  So if {{stdout}} (fd number 1) has been closed, and you do 
open("/my/important/file"), you'll get assigned file descriptor 1.  This means 
that any printf statements in the program will be now printing to 
/my/important/file.  Oops!

The correct way to get rid of stdin, stdout, or stderr is not to close them, 
but to make them point to /dev/null.  {{dup2}} can be used for this purpose.

Another thing we should be doing in container-executor.c is closing any file 
descriptors we don't need.  Because container-executor was forked off of the 
JVM, any file that was open at the time the JVM called fork() will also be open 
for us.  These FDs will contain to be open even after the {{execve}}, unless we 
close them manually.  This could be both a resource leak and a security breach.


> container-executor should deal with file descriptors better
> ---
>
> Key: MAPREDUCE-4485
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4485
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: nodemanager
>Reporter: Colin Patrick McCabe
>Priority: Minor
>
> container-executor.c contains the following code:
> {code}
>   fclose(stdin);
>   fflush(LOGFILE);
>   if (LOGFILE != stdout) {
> fclose(stdout);
>   }
>   if (ERRORFILE != stderr) {
> fclose(stderr);
>   }
>   if (chdir(primary_app_dir) != 0) {
> fprintf(LOGFILE, "Failed to chdir to app dir - %s\n", strerror(errno));
> return -1;
>   }
>   execvp(args[0], args);
> {code}
> Whenever you open a new file descriptor, its number is the lowest available 
> number.  So if {{stdout}} (fd number 1) has been closed, and you do 
> open("/my/important/file"), you'll get assigned file descriptor 1.  This 
> means that any printf statements in the program will be now printing to 
> /my/important/file.  Oops!
> The correct way to get rid of stdin, stdout, or stderr is not to close them, 
> but to make them point to /dev/null.  {{dup2}} can be used for this purpose.
> Another thing we should be doing in container-executor.c is closing any file 
> descriptors we don't need.  Because container-executor was forked off of the 
> JVM, any file that was open at the time the JVM called fork() will also be 
> open for us.  These FDs will continue to be open even after the {{execve}}, 
> unless we close them manually.  This could be both a resource leak and a 
> security breach.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Created] (MAPREDUCE-4485) container-executor should deal with file descriptors better

2012-07-25 Thread Colin Patrick McCabe (JIRA)
Colin Patrick McCabe created MAPREDUCE-4485:
---

 Summary: container-executor should deal with file descriptors 
better
 Key: MAPREDUCE-4485
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4485
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: nodemanager
Reporter: Colin Patrick McCabe
Priority: Minor


container-executor.c contains the following code:

{code}
  fclose(stdin);
  fflush(LOGFILE);
  if (LOGFILE != stdout) {
fclose(stdout);
  }
  if (ERRORFILE != stderr) {
fclose(stderr);
  }
  if (chdir(primary_app_dir) != 0) {
fprintf(LOGFILE, "Failed to chdir to app dir - %s\n", strerror(errno));
return -1;
  }
  execvp(args[0], args);
{code}

Whenever you open a new file descriptor, its number is the lowest available 
number.  So if {{stdout}} (fd number 1) has been closed, and you do 
open("/my/important/file"), you'll get assigned file descriptor 1.  This means 
that any printf statements in the program will be now printing to 
/my/important/file.  Oops!

The correct way to get rid of stdin, stdout, or stderr is not to close them, 
but to make them point to /dev/null.  {{dup2}} can be used for this purpose.

Another thing we should be doing in container-executor.c is closing any file 
descriptors we don't need.  Because container-executor was forked off of the 
JVM, any file that was open at the time the JVM called fork() will also be open 
for us.  These FDs will contain to be open even after the {{execve}}, unless we 
close them manually.  This could be both a resource leak and a security breach.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (MAPREDUCE-2374) Should not use PrintWriter to write taskjvm.sh

2012-07-23 Thread Colin Patrick McCabe (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-2374?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13421198#comment-13421198
 ] 

Colin Patrick McCabe commented on MAPREDUCE-2374:
-

I just thought of something.  Suppose that the JVM is holding blahblahblah.sh 
open for write, and meanwhile another thread forks a bash process (or 
something).  After the fork completes, that process will hold blahblahblah.sh 
open for write with O_WRONLY.  At the very least, this is a race condition that 
could lead to "mysterious" failures, since you don't know when the fork'ed 
process will next get scheduled in relation to the parent process.

The O_CLOEXEC flag was introduced in Linux 2.6.23 to solve this problem, by 
atomically closing the FDs on a fork.  However, I didn't see it being used in 
the strace output you posted.  And it's certainly not around on RHEL5 and 
earlier.

If this is true, then I guess the solution Andy posted earlier is probably the 
best way to go.  Just get rid of the -c and this behavior will be masked.

> Should not use PrintWriter to write taskjvm.sh
> --
>
> Key: MAPREDUCE-2374
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-2374
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Affects Versions: 0.22.0
>Reporter: Todd Lipcon
>Assignee: Todd Lipcon
> Fix For: 0.22.1
>
> Attachments: failed_taskjvmsh.strace, mapreduce-2374-on-20sec.txt, 
> mapreduce-2374.txt, mapreduce-2374.txt, successfull_taskjvmsh.strace
>
>
> Our use of PrintWriter in TaskController.writeCommand is unsafe, since that 
> class swallows all IO exceptions. We're not currently checking for errors, 
> which I'm seeing result in occasional task failures with the message "Text 
> file busy" - assumedly because the close() call is failing silently for some 
> reason.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (MAPREDUCE-2374) Should not use PrintWriter to write taskjvm.sh

2012-07-23 Thread Colin Patrick McCabe (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-2374?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13421196#comment-13421196
 ] 

Colin Patrick McCabe commented on MAPREDUCE-2374:
-

Thanks for the strace output, Shrinivas.  Unfortunately, it doesn't seem to 
show the place where you're opening up 
/local5/sj_mrv1_trunk/hadoop-local/ttprivate/taskTracker/root/jobcache/job_201207131350_0001/attempt_201207131350_0001_m_000201_0/taskjvm.sh
 for writing.

If there is no file descriptor leak, you'd expect to see something like this:
{code}
open("blahblahblah.sh", {st_mode=S_IFREG|0700, ...}) = 5
close(5) = 0
...
execve("blahblahblah.sh") ...
{code}

On the other hand, if there is a leak, there should be no corresponding close() 
call.  Things can get more complicated than that because of dup() and stuff 
like that, but that's the basic idea...

In the absence of that, we can't really draw any conclusions either way.  It 
may be helpful to use strace -f to follow forks.  I realize there's a lot of 
output, but those are the lines we need, I think.

> Should not use PrintWriter to write taskjvm.sh
> --
>
> Key: MAPREDUCE-2374
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-2374
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Affects Versions: 0.22.0
>Reporter: Todd Lipcon
>Assignee: Todd Lipcon
> Fix For: 0.22.1
>
> Attachments: failed_taskjvmsh.strace, mapreduce-2374-on-20sec.txt, 
> mapreduce-2374.txt, mapreduce-2374.txt, successfull_taskjvmsh.strace
>
>
> Our use of PrintWriter in TaskController.writeCommand is unsafe, since that 
> class swallows all IO exceptions. We're not currently checking for errors, 
> which I'm seeing result in occasional task failures with the message "Text 
> file busy" - assumedly because the close() call is failing silently for some 
> reason.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (MAPREDUCE-2374) Should not use PrintWriter to write taskjvm.sh

2012-07-23 Thread Colin Patrick McCabe (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-2374?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13420970#comment-13420970
 ] 

Colin Patrick McCabe commented on MAPREDUCE-2374:
-

Ah, you're absolutely right.  Without the -c, bash won't execve the file 
itself.  It will read the file into an already running interpreter.  So no 
ETXTBSY.

I'm a little nervous about this proposed "fix" because it doesn't really seem 
to fix the problem.  Are we absolutely sure that the file is closed and not 
still being written to?  If it's not, we could be getting file descriptor 
leaks, partially written scripts, and all the usual evils.

I guess Shrinivas did some testing with lsof.  Shrinivas, can you try 
deliberately keeping the file open for write and verifying that your lsof test 
detects it?  Also, it would be interesting to see if the same problem shows up 
when using the raw FileChannel API (as opposed to the FileWriter API).

If this is a real kernel bug, I guess we need to file it upstream on the Red 
Hat bug tracker and/or the kernel mailing list.

> Should not use PrintWriter to write taskjvm.sh
> --
>
> Key: MAPREDUCE-2374
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-2374
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Affects Versions: 0.22.0
>Reporter: Todd Lipcon
>Assignee: Todd Lipcon
> Fix For: 0.22.1
>
> Attachments: mapreduce-2374-on-20sec.txt, mapreduce-2374.txt, 
> mapreduce-2374.txt
>
>
> Our use of PrintWriter in TaskController.writeCommand is unsafe, since that 
> class swallows all IO exceptions. We're not currently checking for errors, 
> which I'm seeing result in occasional task failures with the message "Text 
> file busy" - assumedly because the close() call is failing silently for some 
> reason.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (MAPREDUCE-2374) Should not use PrintWriter to write taskjvm.sh

2012-07-23 Thread Colin Patrick McCabe (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-2374?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13420918#comment-13420918
 ] 

Colin Patrick McCabe commented on MAPREDUCE-2374:
-

bq. We can avoid the ETXTBSY by avoiding the execve. If I'm reading launchTask 
correctly, the script we're execing isn't even a valid shell script anyway – 
it's just a sequence of shell commands, without the leading "#!/bin/sh" header. 
By running it as bash -c "/path/to/script" we're relying on the ancient 
pre-Bourne-shell convention that if execve() fails with ENOEXEC, the shell 
tries to interpret the file as a script.

bq. Instead, we can ask bash to run the script directly as a script by running 
bash "/path/to/script", leaving out the -c. This avoids the code path that 
triggers the ETXTBSY failure and is slightly less reliant on random 
backwards-compatibility kludges. And it doesn't break if we do have the 
#!/bin/sh line, since that's just a comment.

That's a very interesting analysis, Andy.

I agree that it would be simpler without the -c, but I don't see how this 
"avoids the code path that triggers the ETXTBSY."  We're still calling execve 
at some point, and if someone has that FD open for write it will squawk.  (All 
the other exec flavors are just wrappers around execve and return the same 
error codes.)  Am I missing something?

As you pointed out, it's possible that SELinux is doing something odd.  
Shrinivas, can you confirm that SELinux is off during your testing?

Just run this command as root and then we will be sure:
{code}
/usr/sbin/setenforce 0
{code}

It will stay in effect until you reboot.  (You can confirm with 
/usr/sbin/getenforce, which should then print "Permissive".)

> Should not use PrintWriter to write taskjvm.sh
> --
>
> Key: MAPREDUCE-2374
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-2374
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Affects Versions: 0.22.0
>Reporter: Todd Lipcon
>Assignee: Todd Lipcon
> Fix For: 0.22.1
>
> Attachments: mapreduce-2374-on-20sec.txt, mapreduce-2374.txt, 
> mapreduce-2374.txt
>
>
> Our use of PrintWriter in TaskController.writeCommand is unsafe, since that 
> class swallows all IO exceptions. We're not currently checking for errors, 
> which I'm seeing result in occasional task failures with the message "Text 
> file busy" - assumedly because the close() call is failing silently for some 
> reason.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (MAPREDUCE-2374) Should not use PrintWriter to write taskjvm.sh

2012-07-23 Thread Colin Patrick McCabe (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-2374?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13420830#comment-13420830
 ] 

Colin Patrick McCabe commented on MAPREDUCE-2374:
-

In OpenJDK 6 at least, FileWriter#close doesn't always close the file 
descriptor.

Starting here:
http://grepcode.com/file/repository.grepcode.com/java/root/jdk/openjdk/6-b14/java/io/FileWriter.java

I traced down to OutputStreamWriter.  OutputStreamWriter#close calls 
StreamEncoder#close, which has:

{code}
void implClose() throws IOException {
    flushLeftoverChar(null, true);
    try {
        for (;;) {
            CoderResult cr = encoder.flush(bb);
            if (cr.isUnderflow())
                break;
            if (cr.isOverflow()) {
                assert bb.position() > 0;
                writeBytes();
                continue;
            }
            cr.throwException();
        }
        if (bb.position() > 0)
            writeBytes();
        if (ch != null)
            ch.close();
        else
            out.close();
    } catch (IOException x) {
        encoder.reset();
        throw x;
    }
}
{code}

So you can see that if any of those flush or write calls throws, neither 
ch.close() nor out.close() is ever reached, so the underlying file descriptor 
is not closed.  The finalizer probably takes care of it eventually, but that's 
cold comfort to you in this case.

Maybe it would be best to use the FileChannel API directly rather than using 
FileWriter.
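
Something along these lines (a rough sketch, not the actual patch; names are 
made up) keeps the descriptor lifetime explicit and propagates write errors 
instead of swallowing them:
{code}
import java.io.FileOutputStream;
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;

public class ScriptWriter {
  /** Write 'contents' to 'path', propagating any I/O error. */
  static void writeScript(String path, String contents) throws IOException {
    FileOutputStream fos = new FileOutputStream(path);
    try {
      FileChannel ch = fos.getChannel();
      ByteBuffer buf = ByteBuffer.wrap(contents.getBytes("UTF-8"));
      while (buf.hasRemaining()) {
        ch.write(buf);      // any failure throws; nothing is swallowed
      }
      ch.force(false);      // push the data to disk before anyone execs it
    } finally {
      fos.close();          // always releases the file descriptor
    }
  }
}
{code}
Unlike PrintWriter, every call here either succeeds or throws, so a failed 
close() can't silently leave a truncated taskjvm.sh behind.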

> Should not use PrintWriter to write taskjvm.sh
> --
>
> Key: MAPREDUCE-2374
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-2374
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Affects Versions: 0.22.0
>Reporter: Todd Lipcon
>Assignee: Todd Lipcon
> Fix For: 0.22.1
>
> Attachments: mapreduce-2374-on-20sec.txt
>
>
> Our use of PrintWriter in TaskController.writeCommand is unsafe, since that 
> class swallows all IO exceptions. We're not currently checking for errors, 
> which I'm seeing result in occasional task failures with the message "Text 
> file busy" - assumedly because the close() call is failing silently for some 
> reason.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (MAPREDUCE-4267) mavenize pipes

2012-06-04 Thread Colin Patrick McCabe (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4267?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Colin Patrick McCabe updated MAPREDUCE-4267:


Attachment: MAPREDUCE-4267.002.trimmed.patch

* fix pom.xml to skip 'make install' and just call 'make' (it's harmless to 
call 'make install' here, but unnecessary)

* set VERBOSE=1 for the make command, so that we can easily identify all errors.

The rm patch file is the same as before.
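
For reference, the native part of the build now effectively boils down to this 
(illustrative commands, not the literal pom contents):
{code}
cmake <path-to-native-source>
make VERBOSE=1
{code}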

> mavenize pipes
> --
>
> Key: MAPREDUCE-4267
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4267
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: mrv2
>Affects Versions: 0.23.3
>Reporter: Thomas Graves
>Assignee: Thomas Graves
>Priority: Critical
> Attachments: MAPREDUCE-4267.001.rm.patch, 
> MAPREDUCE-4267.001.trimmed.patch, MAPREDUCE-4267.002.trimmed.patch
>
>
> We are still building pipes out of the old mrv1 directories using ant.  Move 
> it over to the mrv2 dir structure.  

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (MAPREDUCE-4267) mavenize pipes

2012-06-04 Thread Colin Patrick McCabe (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4267?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13288934#comment-13288934
 ] 

Colin Patrick McCabe commented on MAPREDUCE-4267:
-

MAPREDUCE-4267.001.trimmed.patch is slightly cleaned up beyond what was in 
HADOOP-8368.018.trimmed.patch.  I think the biggest improvement is that it now 
builds both static and shared versions of hadooputils and hadooppipes.  It also 
handles the 32-bit build correctly, and uses make rather than make install.

Good luck.  I'm looking forward to a fully mavenized MapReduce project!

> mavenize pipes
> --
>
> Key: MAPREDUCE-4267
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4267
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: mrv2
>Affects Versions: 0.23.3
>Reporter: Thomas Graves
>Assignee: Thomas Graves
>Priority: Critical
> Attachments: MAPREDUCE-4267.001.rm.patch, 
> MAPREDUCE-4267.001.trimmed.patch
>
>
> We are still building pipes out of the old mrv1 directories using ant.  Move 
> it over to the mrv2 dir structure.  

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (MAPREDUCE-4267) mavenize pipes

2012-06-04 Thread Colin Patrick McCabe (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4267?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Colin Patrick McCabe updated MAPREDUCE-4267:


Attachment: MAPREDUCE-4267.001.trimmed.patch
MAPREDUCE-4267.001.rm.patch

Hi Thomas,

This patch might serve as a good starting point for MAPREDUCE-4267.  It doesn't 
get rid of all the ant build stuff, but it does move the native mapreduce build 
to use CMake and Maven.

Let me know if you think this should be in a separate JIRA, or if it belongs 
here.

> mavenize pipes
> --
>
> Key: MAPREDUCE-4267
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4267
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: mrv2
>Affects Versions: 0.23.3
>Reporter: Thomas Graves
>Assignee: Thomas Graves
>Priority: Critical
> Attachments: MAPREDUCE-4267.001.rm.patch, 
> MAPREDUCE-4267.001.trimmed.patch
>
>
> We are still building pipes out of the old mrv1 directories using ant.  Move 
> it over to the mrv2 dir structure.  

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira