[jira] [Created] (MAPREDUCE-6171) The visibilities of the distributed cache files and archives should be determined by both their permissions and whether they are located in an HDFS encryption zone

2014-11-24 Thread Dian Fu (JIRA)
Dian Fu created MAPREDUCE-6171:
--

 Summary: The visibilities of the distributed cache files and 
archives should be determined by both their permissions and whether they are 
located in an HDFS encryption zone
 Key: MAPREDUCE-6171
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6171
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: security
Reporter: Dian Fu


The visibilities of the distributed cache files and archives are currently 
determined solely by the permissions of these files or archives. 
The following is the logic of the isPublic() method in class 
ClientDistributedCacheManager:
{code}
static boolean isPublic(Configuration conf, URI uri,
    Map<URI, FileStatus> statCache) throws IOException {
  FileSystem fs = FileSystem.get(uri, conf);
  Path current = new Path(uri.getPath());
  // the leaf level file should be readable by others
  if (!checkPermissionOfOther(fs, current, FsAction.READ, statCache)) {
    return false;
  }
  return ancestorsHaveExecutePermissions(fs, current.getParent(), statCache);
}
{code}
On the NodeManager side, the "yarn" user is used to download public files, while 
the user who submitted the job is used to download private files. In normal 
cases this works fine. However, if the files are located in an encryption zone 
(HDFS-6134) and the "yarn" user is not allowed by KMS to decrypt the Data 
Encryption Key (DEK) of that encryption zone, the download of these files will 
fail.

You can reproduce this issue with the following steps (assume you submit the 
job as user "testUser"): 
# create a clean cluster with the HDFS cryptographic FileSystem (encryption 
zone) feature
# create directory "/data/" in HDFS and make it an encryption zone with key 
name "testKey"
# configure KMS so that only user "testUser" is allowed to decrypt the DEK of 
key "testKey":
{code}
  <property>
    <name>key.acl.testKey.DECRYPT_EEK</name>
    <value>testUser</value>
  </property>
{code}
# execute job "teragen" with user "testUser":
{code}
su -s /bin/bash testUser -c "hadoop jar hadoop-mapreduce-examples*.jar teragen 1 /data/terasort-input"
{code}
# execute job "terasort" with user "testUser":
{code}
su -s /bin/bash testUser -c "hadoop jar hadoop-mapreduce-examples*.jar terasort /data/terasort-input /data/terasort-output"
{code}

You will see logs like this at the job submitter's console:
{code}
INFO mapreduce.Job: Job job_1416860917658_0002 failed with state FAILED due to: 
Application application_1416860917658_0002 failed 2 times due to AM Container 
for appattempt_1416860917658_0002_02 exited with  exitCode: -1000 due to: 
org.apache.hadoop.security.authorize.AuthorizationException: User [yarn] is not 
authorized to perform [DECRYPT_EEK] on key with ACL name [testKey]!!
{code}

The initial idea to solve this issue is to modify the logic in 
ClientDistributedCacheManager.isPublic() to also consider whether the file is 
in an encryption zone. If it is in an encryption zone, the file should be 
treated as private.
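
A minimal sketch of how such a check might look, assuming the 
HdfsAdmin#getEncryptionZoneForPath() client API introduced with HDFS-6134 is 
available; the class and helper names below are purely illustrative and are not 
part of the existing ClientDistributedCacheManager:
{code}
import java.io.IOException;
import java.net.URI;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.hdfs.DistributedFileSystem;
import org.apache.hadoop.hdfs.client.HdfsAdmin;

/** Illustrative helper only, not part of ClientDistributedCacheManager. */
public class EncryptionZoneCheck {

  /**
   * Returns true if the given URI points into an HDFS encryption zone.
   * isPublic() could return false early whenever this returns true, so that
   * such files are localized as private resources by the submitting user.
   */
  public static boolean isInEncryptionZone(Configuration conf, URI uri)
      throws IOException {
    FileSystem fs = FileSystem.get(uri, conf);
    if (!(fs instanceof DistributedFileSystem)) {
      // Encryption zones only exist on HDFS.
      return false;
    }
    HdfsAdmin admin = new HdfsAdmin(fs.getUri(), conf);
    return admin.getEncryptionZoneForPath(new Path(uri.getPath())) != null;
  }
}
{code}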



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Hadoop-Mapreduce-trunk-Java8 - Build # 15 - Still Failing

2014-11-24 Thread Apache Jenkins Server
See https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Java8/15/

###
## LAST 60 LINES OF THE CONSOLE 
###
[...truncated 31052 lines...]
  
TestMRIntermediateDataEncryption.testSingleReducer:55->doEncryptionTest:69->doEncryptionTest:88->runMergeTest:149->verifyOutput:188
 expected:<3000> but was:<0>
  TestMiniMRChildTask.testTaskTempDir:415->launchTest:194 null
  TestMiniMRChildTask.testMapRedExecutionEnv:475->launchTest:194 null

Tests in error: 
  TestLazyOutput.testLazyOutput:156->runTestLazyOutput:120 » IO Job failed!
  
TestClusterMRNotification>NotificationTestCase.testMR:156->NotificationTestCase.launchWordCount:241
 » IO
  TestClusterMapReduceTestCase.testMapReduceRestarting:93->_testMapReduce:67 » 
IO
  TestClusterMapReduceTestCase.testMapReduce:89->_testMapReduce:67 » IO Job 
fail...
  TestReduceFetchFromPartialMem.testReduceFromPartialMem:93->runJob:300 » IO 
Job...
  TestJobName.testComplexNameWithRegex:89 » IO Job failed!
  TestJobName.testComplexName:55 » IO Job failed!
  TestReduceFetchFromPartialMem.testReduceFromPartialMem:93->runJob:300 » IO 
Job...

Tests run: 514, Failures: 39, Errors: 8, Skipped: 11

[INFO] 
[INFO] Reactor Summary:
[INFO] 
[INFO] hadoop-mapreduce-client ... SUCCESS [  2.587 s]
[INFO] hadoop-mapreduce-client-core .. SUCCESS [ 59.325 s]
[INFO] hadoop-mapreduce-client-common  SUCCESS [ 27.675 s]
[INFO] hadoop-mapreduce-client-shuffle ... SUCCESS [  4.342 s]
[INFO] hadoop-mapreduce-client-app ... SUCCESS [10:00 min]
[INFO] hadoop-mapreduce-client-hs  SUCCESS [05:11 min]
[INFO] hadoop-mapreduce-client-jobclient . FAILURE [  01:57 h]
[INFO] hadoop-mapreduce-client-hs-plugins  SKIPPED
[INFO] hadoop-mapreduce-client-nativetask  SKIPPED
[INFO] Apache Hadoop MapReduce Examples .. SKIPPED
[INFO] hadoop-mapreduce .. SKIPPED
[INFO] 
[INFO] BUILD FAILURE
[INFO] 
[INFO] Total time: 02:13 h
[INFO] Finished at: 2014-11-24T15:37:24+00:00
[INFO] Final Memory: 43M/591M
[INFO] 
[ERROR] Failed to execute goal 
org.apache.maven.plugins:maven-surefire-plugin:2.16:test (default-test) on 
project hadoop-mapreduce-client-jobclient: There was a timeout or other error 
in the fork -> [Help 1]
[ERROR] 
[ERROR] To see the full stack trace of the errors, re-run Maven with the -e 
switch.
[ERROR] Re-run Maven using the -X switch to enable full debug logging.
[ERROR] 
[ERROR] For more information about the errors and possible solutions, please 
read the following articles:
[ERROR] [Help 1] 
http://cwiki.apache.org/confluence/display/MAVEN/MojoFailureException
[ERROR] 
[ERROR] After correcting the problems, you can resume the build with the command
[ERROR]   mvn  -rf :hadoop-mapreduce-client-jobclient
Build step 'Execute shell' marked build as failure
[FINDBUGS] Skipping publisher since build result is FAILURE
Archiving artifacts
Sending artifact delta relative to Hadoop-Mapreduce-trunk-Java8 #12
Archived 2 artifacts
Archive block size is 32768
Received 1 blocks and 20215467 bytes
Compression is 0.2%
Took 11 sec
Updating HDFS-7403
Email was triggered for: Failure
Sending email for trigger: Failure



###
## FAILED TESTS (if any) 
##
No tests ran.

Hadoop-Mapreduce-trunk - Build # 1967 - Still Failing

2014-11-24 Thread Apache Jenkins Server
See https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1967/

###
## LAST 60 LINES OF THE CONSOLE 
###
[...truncated 31098 lines...]
  TestMiniMRChildTask.testMapRedExecutionEnv:475->launchTest:194 null

Tests in error: 
  TestReduceFetchFromPartialMem.testReduceFromPartialMem:93->runJob:300 » IO 
Job...
  
TestClusterMRNotification>NotificationTestCase.testMR:156->NotificationTestCase.launchWordCount:241
 » IO
  TestJobName.testComplexNameWithRegex:89 » IO Job failed!
  TestJobName.testComplexName:55 » IO Job failed!
  TestJavaSerialization.testMapReduceJob:127 » IO Job failed!
  TestJavaSerialization.testWriteToSequencefile:179 » IO Job failed!
  TestClusterMapReduceTestCase.testMapReduceRestarting:93->_testMapReduce:67 » 
IO
  TestClusterMapReduceTestCase.testMapReduce:89->_testMapReduce:67 » IO Job 
fail...
  TestLazyOutput.testLazyOutput:156->runTestLazyOutput:120 » IO Job failed!
  TestReduceFetchFromPartialMem.testReduceFromPartialMem:93->runJob:300 » IO 
Job...

Tests run: 514, Failures: 39, Errors: 10, Skipped: 11

[INFO] 
[INFO] Reactor Summary:
[INFO] 
[INFO] hadoop-mapreduce-client ... SUCCESS [  2.734 s]
[INFO] hadoop-mapreduce-client-core .. SUCCESS [01:49 min]
[INFO] hadoop-mapreduce-client-common  SUCCESS [ 27.863 s]
[INFO] hadoop-mapreduce-client-shuffle ... SUCCESS [  4.348 s]
[INFO] hadoop-mapreduce-client-app ... SUCCESS [11:16 min]
[INFO] hadoop-mapreduce-client-hs  SUCCESS [05:12 min]
[INFO] hadoop-mapreduce-client-jobclient . FAILURE [  01:57 h]
[INFO] hadoop-mapreduce-client-hs-plugins  SKIPPED
[INFO] hadoop-mapreduce-client-nativetask  SKIPPED
[INFO] Apache Hadoop MapReduce Examples .. SKIPPED
[INFO] hadoop-mapreduce .. SKIPPED
[INFO] 
[INFO] BUILD FAILURE
[INFO] 
[INFO] Total time: 02:16 h
[INFO] Finished at: 2014-11-24T15:40:07+00:00
[INFO] Final Memory: 42M/603M
[INFO] 
[ERROR] Failed to execute goal 
org.apache.maven.plugins:maven-surefire-plugin:2.16:test (default-test) on 
project hadoop-mapreduce-client-jobclient: There was a timeout or other error 
in the fork -> [Help 1]
[ERROR] 
[ERROR] To see the full stack trace of the errors, re-run Maven with the -e 
switch.
[ERROR] Re-run Maven using the -X switch to enable full debug logging.
[ERROR] 
[ERROR] For more information about the errors and possible solutions, please 
read the following articles:
[ERROR] [Help 1] 
http://cwiki.apache.org/confluence/display/MAVEN/MojoFailureException
[ERROR] 
[ERROR] After correcting the problems, you can resume the build with the command
[ERROR]   mvn  -rf :hadoop-mapreduce-client-jobclient
Build step 'Execute shell' marked build as failure
[FINDBUGS] Skipping publisher since build result is FAILURE
Archiving artifacts
Sending artifact delta relative to Hadoop-Mapreduce-trunk #1870
Archived 2 artifacts
Archive block size is 32768
Received 0 blocks and 2024 bytes
Compression is 0.0%
Took 10 sec
Updating HDFS-7403
Email was triggered for: Failure
Sending email for trigger: Failure



###
## FAILED TESTS (if any) 
##
No tests ran.

[jira] [Created] (MAPREDUCE-6172) TestDbClasses timeouts are too aggressive

2014-11-24 Thread Jason Lowe (JIRA)
Jason Lowe created MAPREDUCE-6172:
-

 Summary: TestDbClasses timeouts are too aggressive
 Key: MAPREDUCE-6172
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6172
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: test
Affects Versions: 2.6.0
Reporter: Jason Lowe
Priority: Minor


Some of the TestDbClasses test timeouts are only 1 second, and some of those 
tests perform disk I/O which could easily exceed the test timeout if the disk 
is busy or there's some other hiccup on the system at the time.  We should 
increase these timeouts to something more reasonable (e.g. 10 or 20 seconds).
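
For illustration only (the class and method names below are hypothetical, not 
the actual TestDbClasses tests), the proposed change amounts to relaxing the 
JUnit timeout annotations:
{code}
import static org.junit.Assert.assertTrue;

import org.junit.Test;

/** Hypothetical example; the real TestDbClasses methods differ. */
public class TestDbClassesTimeoutExample {

  // A 1000 ms timeout is easily blown by a single slow disk access on a busy
  // build machine; 10000 ms leaves comfortable headroom without noticeably
  // slowing down a healthy run.
  @Test(timeout = 10000)
  public void testWithRelaxedTimeout() throws Exception {
    assertTrue(true);
  }
}
{code}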



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Reopened] (MAPREDUCE-5785) Derive heap size or mapreduce.*.memory.mb automatically

2014-11-24 Thread Karthik Kambatla (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5785?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karthik Kambatla reopened MAPREDUCE-5785:
-

> Derive heap size or mapreduce.*.memory.mb automatically
> ---
>
> Key: MAPREDUCE-5785
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5785
> Project: Hadoop Map/Reduce
>  Issue Type: New Feature
>  Components: mr-am, task
>Reporter: Gera Shegalov
>Assignee: Gera Shegalov
> Fix For: 3.0.0
>
> Attachments: MAPREDUCE-5785.v01.patch, MAPREDUCE-5785.v02.patch, 
> MAPREDUCE-5785.v03.patch, mr-5785-4.patch, mr-5785-5.patch, mr-5785-6.patch
>
>
> Currently users have to set 2 memory-related configs per Job / per task type. 
>  One first chooses some container size mapreduce.\*.memory.mb and then a 
> corresponding maximum Java heap size Xmx < mapreduce.\*.memory.mb. This 
> makes sure that the JVM's total memory (native memory + Java heap) does not 
> exceed this mapreduce.\*.memory.mb. If one forgets to tune Xmx, the MR-AM 
> might be 
> - allocating big containers whereas the JVM will only use the default 
> -Xmx200m.
> - allocating small containers that will OOM because Xmx is too high.
> With this JIRA, we propose to set Xmx automatically based on an empirical 
> ratio that can be adjusted. Xmx is not changed automatically if provided by 
> the user.
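
A rough, self-contained sketch of the proposed derivation; the 0.8 ratio and 
the class and method names below are illustrative assumptions, not values taken 
from the attached patches:
{code}
/** Hypothetical illustration: derive -Xmx from the container size and a ratio. */
public class HeapSizeDerivationExample {

  // Illustrative ratio of Java heap to container memory; the JIRA proposes
  // making this an adjustable, empirically chosen value.
  static final double HEAP_TO_CONTAINER_RATIO = 0.8;

  /** Derive an -Xmx value (in MB) from mapreduce.*.memory.mb. */
  static int deriveXmxMb(int containerMemoryMb) {
    return (int) (containerMemoryMb * HEAP_TO_CONTAINER_RATIO);
  }

  public static void main(String[] args) {
    // e.g. a 2048 MB container would get roughly -Xmx1638m, leaving the
    // remaining ~410 MB for native memory, thread stacks, etc.
    System.out.println("-Xmx" + deriveXmxMb(2048) + "m");
  }
}
{code}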



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)