[jira] [Created] (HADOOP-9325) KerberosAuthenticationHandler and AuthenticationFilter should be able to reference Hadoop configurations

2013-02-22 Thread Kai Zheng (JIRA)
Kai Zheng created HADOOP-9325:
-

 Summary: KerberosAuthenticationHandler and AuthenticationFilter 
should be able to reference Hadoop configurations
 Key: HADOOP-9325
 URL: https://issues.apache.org/jira/browse/HADOOP-9325
 Project: Hadoop Common
  Issue Type: Improvement
  Components: security
Reporter: Kai Zheng


In KerberosAuthenticationHandler's SPNEGO processing, KerberosName is used to 
derive the short name for the client principal. In some Kerberos authentication 
situations this needs to reference the translation rules defined in a Hadoop 
configuration file such as core-site.xml, as follows:

  <property>
    <name>hadoop.security.auth_to_local</name>
    <value>...</value>
  </property>

Note that this is an issue only when the default rule cannot meet the 
requirement and custom rules need to be defined.
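
For illustration, a custom mapping could look roughly like the following; the 
realm and rule below are hypothetical, not taken from this issue. The rule maps 
a two-component principal such as hdfs/host@EXAMPLE.COM to the short name hdfs:

  <property>
    <name>hadoop.security.auth_to_local</name>
    <value>
      RULE:[2:$1@$0](hdfs@EXAMPLE\.COM)s/.*/hdfs/
      DEFAULT
    </value>
  </property>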

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Created] (HADOOP-9326) BUG: when I run the tests, or when I skip the tests and run the package step, I get the same problem

2013-02-22 Thread JLASSI Aymen (JIRA)
JLASSI Aymen created HADOOP-9326:


 Summary: BUG: when I run the tests, or when I skip the tests and run 
the package step, I get the same problem
 Key: HADOOP-9326
 URL: https://issues.apache.org/jira/browse/HADOOP-9326
 Project: Hadoop Common
  Issue Type: Bug
  Components: build, test
 Environment: For information, I got Hadoop from Git and I run it on 
Mac OS (latest version)
Reporter: JLASSI Aymen


I'd like to compile Hadoop from source code. When I launch the test step I get 
the failures shown below, and when I skip the tests and go straight to the 
package step I get the same problem, with the same failure output.
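
A minimal sketch of the commands presumably used (the exact invocations here 
are an assumption, not taken from the report):

  # run the unit tests
  mvn test

  # skip the tests and go straight to packaging
  mvn package -DskipTests

In both cases the build ends with the failures below: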



Results :

Failed tests:   testFailFullyDelete(org.apache.hadoop.fs.TestFileUtil): The 
directory xSubDir *should* not have been deleted. expected:<true> but 
was:<false>
  testFailFullyDeleteContents(org.apache.hadoop.fs.TestFileUtil): The directory 
xSubDir *should* not have been deleted. expected:<true> but was:<false>
  
testListStatusThrowsExceptionForUnreadableDir(org.apache.hadoop.fs.TestFSMainOperationsLocalFileSystem):
 Should throw IOException
  test0[0](org.apache.hadoop.fs.TestLocalDirAllocator): Checking for 
build/test/temp/RELATIVE1 in 
build/test/temp/RELATIVE0/block4197707426846287299.tmp - FAILED!
  testROBufferDirAndRWBufferDir[0](org.apache.hadoop.fs.TestLocalDirAllocator): 
Checking for build/test/temp/RELATIVE2 in 
build/test/temp/RELATIVE1/block138767728739012230.tmp - FAILED!
  testRWBufferDirBecomesRO[0](org.apache.hadoop.fs.TestLocalDirAllocator): 
Checking for build/test/temp/RELATIVE3 in 
build/test/temp/RELATIVE4/block4888615109050601773.tmp - FAILED!
  test0[1](org.apache.hadoop.fs.TestLocalDirAllocator): Checking for 
/Users/aymenjlassi/Desktop/hadoop_source/releaseGit/hadoop-common/hadoop-common-project/hadoop-common/build/test/temp/ABSOLUTE1
 in 
/Users/aymenjlassi/Desktop/hadoop_source/releaseGit/hadoop-common/hadoop-common-project/hadoop-common/build/test/temp/ABSOLUTE0/block4663369813226761504.tmp
 - FAILED!
  testROBufferDirAndRWBufferDir[1](org.apache.hadoop.fs.TestLocalDirAllocator): 
Checking for 
/Users/aymenjlassi/Desktop/hadoop_source/releaseGit/hadoop-common/hadoop-common-project/hadoop-common/build/test/temp/ABSOLUTE2
 in 
/Users/aymenjlassi/Desktop/hadoop_source/releaseGit/hadoop-common/hadoop-common-project/hadoop-common/build/test/temp/ABSOLUTE1/block2846944239985650460.tmp
 - FAILED!
  testRWBufferDirBecomesRO[1](org.apache.hadoop.fs.TestLocalDirAllocator): 
Checking for 
/Users/aymenjlassi/Desktop/hadoop_source/releaseGit/hadoop-common/hadoop-common-project/hadoop-common/build/test/temp/ABSOLUTE3
 in 
/Users/aymenjlassi/Desktop/hadoop_source/releaseGit/hadoop-common/hadoop-common-project/hadoop-common/build/test/temp/ABSOLUTE4/block4367331619344952181.tmp
 - FAILED!
  test0[2](org.apache.hadoop.fs.TestLocalDirAllocator): Checking for 
file:/Users/aymenjlassi/Desktop/hadoop_source/releaseGit/hadoop-common/hadoop-common-project/hadoop-common/build/test/temp/QUALIFIED1
 in 
/Users/aymenjlassi/Desktop/hadoop_source/releaseGit/hadoop-common/hadoop-common-project/hadoop-common/build/test/temp/QUALIFIED0/block5687619346377173125.tmp
 - FAILED!
  testROBufferDirAndRWBufferDir[2](org.apache.hadoop.fs.TestLocalDirAllocator): 
Checking for 
file:/Users/aymenjlassi/Desktop/hadoop_source/releaseGit/hadoop-common/hadoop-common-project/hadoop-common/build/test/temp/QUALIFIED2
 in 
/Users/aymenjlassi/Desktop/hadoop_source/releaseGit/hadoop-common/hadoop-common-project/hadoop-common/build/test/temp/QUALIFIED1/block2235209534902942511.tmp
 - FAILED!
  testRWBufferDirBecomesRO[2](org.apache.hadoop.fs.TestLocalDirAllocator): 
Checking for 
file:/Users/aymenjlassi/Desktop/hadoop_source/releaseGit/hadoop-common/hadoop-common-project/hadoop-common/build/test/temp/QUALIFIED3
 in 
/Users/aymenjlassi/Desktop/hadoop_source/releaseGit/hadoop-common/hadoop-common-project/hadoop-common/build/test/temp/QUALIFIED4/block6994640486900109274.tmp
 - FAILED!
  testReportChecksumFailure(org.apache.hadoop.fs.TestLocalFileSystem)
  
testListStatusThrowsExceptionForUnreadableDir(org.apache.hadoop.fs.viewfs.TestFSMainOperationsLocalFileSystem):
 Should throw IOException
  testCount(org.apache.hadoop.metrics2.util.TestSampleQuantiles): 
expected:50[.00 %ile +/- 5.00%: 1337(..)
  testCheckDir_notDir_local(org.apache.hadoop.util.TestDiskChecker): checkDir 
success
  testCheckDir_notReadable_local(org.apache.hadoop.util.TestDiskChecker): 
checkDir success
  testCheckDir_notWritable_local(org.apache.hadoop.util.TestDiskChecker): 
checkDir success
  testCheckDir_notListable_local(org.apache.hadoop.util.TestDiskChecker): 
checkDir success

Tests run: 1842, Failures: 19, Errors: 0, Skipped: 22

[INFO] 
[INFO] Reactor Summary:
[INFO] 
[INFO] Apache Hadoop Main  SUCCESS [1.805s]
[INFO] 

[jira] [Created] (HADOOP-9327) Out of date code examples

2013-02-22 Thread Hao Zhong (JIRA)
Hao Zhong created HADOOP-9327:
-

 Summary: Out of date code examples
 Key: HADOOP-9327
 URL: https://issues.apache.org/jira/browse/HADOOP-9327
 Project: Hadoop Common
  Issue Type: Bug
Affects Versions: 2.0.3-alpha
Reporter: Hao Zhong


1. This page contains code examples that use JobConfigurationParser:
http://hadoop.apache.org/docs/current/api/org/apache/hadoop/tools/rumen/package-summary.html
JobConfigurationParser jcp = 
  new JobConfigurationParser(interestedProperties);
JobConfigurationParser was deleted in 2.0.3.

2. This page contains code examples that use ContextFactory:
http://hadoop.apache.org/docs/current/api/org/apache/hadoop/metrics/package-summary.html
 ContextFactory factory = ContextFactory.getFactory();
... examine and/or modify factory attributes ...
MetricsContext context = factory.getContext("myContext");
ContextFactory was deleted in 2.0.3.

3. This page contains code examples that use LoggedNetworkTopology:
http://hadoop.apache.org/docs/current/api/org/apache/hadoop/tools/rumen/package-summary.html
 do.init("topology.json", conf);

  // get the job summary using TopologyBuilder
  LoggedNetworkTopology topology = topologyBuilder.build();
LoggedNetworkTopology was deleted in 2.0.3.

Please revise the documentation to reflect the code.


--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


APIs to move data blocks within HDFS

2013-02-22 Thread Karthiek C
Hi,

Are there any APIs to move data blocks in HDFS from one node to another
*after* they have been added to HDFS? Also, can we write some sort of
pluggable module (like a scheduler) that controls how data gets placed in
the Hadoop cluster? I am working with the hadoop-1.0.3 version and I
couldn't find any filesystem APIs available to do that.

PS: I am working on a research project where we want to investigate how to
optimally place data in Hadoop.

Thanks,
Karthiek


Re: APIs to move data blocks within HDFS

2013-02-22 Thread Harsh J
There are no filesystem (i.e. client) level APIs to do this, but the
Balancer tool of HDFS does exactly this. Reading its sources should
help you understand what kind of calls you need to make to reuse the
balancer protocol and achieve what you need.

In trunk, the balancer is at
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/balancer/Balancer.java
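
For reference, the same rebalancing can also be driven from the command line;
a minimal sketch, assuming the Hadoop 1.x CLI:

  # move blocks until every DataNode is within 10% of the cluster's
  # average utilization
  hadoop balancer -threshold 10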

HTH, and feel free to ask any relevant follow up questions.

On Fri, Feb 22, 2013 at 11:43 PM, Karthiek C karthi...@gmail.com wrote:
 Hi,

 Is there any APIs to move data blocks in HDFS from one node to another *
 after* they have been added to HDFS? Also can we write some sort of
 pluggable module (like scheduler) that controls how data gets placed in
 hadoop cluster? I am working with hadoop-1.0.3 version and I couldn't find
 any filesystem APIs available to do that.

 PS: I am working on a research project where we want to investigate how to
 optimally place data in hadoop.

 Thanks,
 Karthiek



--
Harsh J


Re: APIs to move data blocks within HDFS

2013-02-22 Thread Chris Nauroth
Regarding your question about a pluggable module to control placement of
data, try taking a look at the abstract class BlockPlacementPolicy and
BlockPlacementPolicyDefault, which is its default implementation.

On branch-1, you can find these classes
at src/hdfs/org/apache/hadoop/hdfs/server/namenode.  On trunk, the package
structure is different, and these classes are
at 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement.
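
A custom policy is typically wired in through configuration; a minimal sketch,
assuming the dfs.block.replicator.classname key and a hypothetical
implementation class:

  <property>
    <name>dfs.block.replicator.classname</name>
    <!-- com.example.MyBlockPlacementPolicy is a hypothetical subclass of
         BlockPlacementPolicy, shown for illustration only -->
    <value>com.example.MyBlockPlacementPolicy</value>
  </property>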

Best of luck with your research!

--Chris


On Fri, Feb 22, 2013 at 11:17 AM, Harsh J ha...@cloudera.com wrote:

 There's no filesystem (i.e. client) level APIs to do this, but the
 Balancer tool of HDFS does exactly this. Reading its sources should
 let you understand what kinda calls you need to make to reuse the
 balancer protocol and achieve what you need.

 In trunk, the balancer is at

 hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/balancer/Balancer.java

 HTH, and feel free to ask any relevant follow up questions.

 On Fri, Feb 22, 2013 at 11:43 PM, Karthiek C karthi...@gmail.com wrote:
  Hi,
 
  Is there any APIs to move data blocks in HDFS from one node to another *
  after* they have been added to HDFS? Also can we write some sort of
  pluggable module (like scheduler) that controls how data gets placed in
  hadoop cluster? I am working with hadoop-1.0.3 version and I couldn't
 find
  any filesystem APIs available to do that.
 
  PS: I am working on a research project where we want to investigate how
 to
  optimally place data in hadoop.
 
  Thanks,
  Karthiek



 --
 Harsh J



Re: APIs to move data blocks within HDFS

2013-02-22 Thread Karthiek C
Thank you Harsh and Chris. This really helps!

-Karthiek

On Fri, Feb 22, 2013 at 2:46 PM, Chris Nauroth cnaur...@hortonworks.com wrote:

 Regarding your question about a pluggable module to control placement of
 data, try taking a look at the abstract class BlockPlacementPolicy and
 BlockPlacementPolicyDefault, which is its default implementation.

 On branch-1, you can find these classes
 at src/hdfs/org/apache/hadoop/hdfs/server/namenode.  On trunk, the package
 structure is different, and these classes are
 at
 hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement.

 Best of luck with your research!

 --Chris


 On Fri, Feb 22, 2013 at 11:17 AM, Harsh J ha...@cloudera.com wrote:

  There's no filesystem (i.e. client) level APIs to do this, but the
  Balancer tool of HDFS does exactly this. Reading its sources should
  let you understand what kinda calls you need to make to reuse the
  balancer protocol and achieve what you need.
 
  In trunk, the balancer is at
 
 
 hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/balancer/Balancer.java
 
  HTH, and feel free to ask any relevant follow up questions.
 
  On Fri, Feb 22, 2013 at 11:43 PM, Karthiek C karthi...@gmail.com
 wrote:
   Hi,
  
   Is there any APIs to move data blocks in HDFS from one node to another
 *
   after* they have been added to HDFS? Also can we write some sort of
   pluggable module (like scheduler) that controls how data gets placed in
   hadoop cluster? I am working with hadoop-1.0.3 version and I couldn't
  find
   any filesystem APIs available to do that.
  
   PS: I am working on a research project where we want to investigate how
  to
   optimally place data in hadoop.
  
   Thanks,
   Karthiek
 
 
 
  --
  Harsh J
 



[jira] [Created] (HADOOP-9328) INSERT INTO a S3 external table with no reduce phase results in FileNotFoundException

2013-02-22 Thread Marc Limotte (JIRA)
Marc Limotte created HADOOP-9328:


 Summary: INSERT INTO a S3 external table with no reduce phase 
results in FileNotFoundException
 Key: HADOOP-9328
 URL: https://issues.apache.org/jira/browse/HADOOP-9328
 Project: Hadoop Common
  Issue Type: Bug
Affects Versions: 0.9.0
 Environment: YARN, Hadoop 2.0.2-alpha
Ubuntu
Reporter: Marc Limotte


With YARN and Hadoop 2.0.2-alpha, Hive 0.9.0.

The destination is an S3 external table; the source for the query is a small 
Hive-managed table.

CREATE EXTERNAL TABLE payout_state_product (
  state STRING,
  product_id STRING,
  element_id INT,
  element_value DOUBLE,
  number_of_fields INT)
ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
STORED AS TEXTFILE
LOCATION 's3://com.weatherbill.foo/bar/payout_state_product/';

A simple query copies the results from the Hive-managed table into S3:

hive> INSERT OVERWRITE TABLE payout_state_product 
SELECT * FROM payout_state_product_cached; 

Total MapReduce jobs = 2 
Launching Job 1 out of 2 
Number of reduce tasks is set to 0 since there's no reduce operator 
Starting Job = job_1360884012490_0014, Tracking URL = 
http://i-9ff9e9ef.us-east-1.production.climatedna.net:8088/proxy/application_1360884012490_0014/
 
Kill Command = /usr/lib/hadoop/bin/hadoop job 
-Dmapred.job.tracker=i-9ff9e9ef.us-east-1.production.climatedna.net:8032 -kill 
job_1360884012490_0014 
Hadoop job information for Stage-1: number of mappers: 100; number of reducers: 
0 
2013-02-22 19:15:46,709 Stage-1 map = 0%, reduce = 0% 
...snip... 
2013-02-22 19:17:02,374 Stage-1 map = 100%, reduce = 0%, Cumulative CPU 427.13 
sec 
MapReduce Total cumulative CPU time: 7 minutes 7 seconds 130 msec 
Ended Job = job_1360884012490_0014 
Ended Job = -1776780875, job is filtered out (removed at runtime). 
Launching Job 2 out of 2 
Number of reduce tasks is set to 0 since there's no reduce operator 
java.io.FileNotFoundException: File does not exist: 
/tmp/hive-marc/hive_2013-02-22_19-15-31_691_7365912335285010827/-ext-10002/00_0
 
at 
org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:782)
 
at 
org.apache.hadoop.mapreduce.lib.input.CombineFileInputFormat$OneFileInfo.<init>(CombineFileInputFormat.java:493)
 
at 
org.apache.hadoop.mapreduce.lib.input.CombineFileInputFormat.getMoreSplits(CombineFileInputFormat.java:284)
 
at 
org.apache.hadoop.mapreduce.lib.input.CombineFileInputFormat.getSplits(CombineFileInputFormat.java:244)
 
at 
org.apache.hadoop.mapred.lib.CombineFileInputFormat.getSplits(CombineFileInputFormat.java:69)
 
at 
org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileInputFormatShim.getSplits(HadoopShimsSecure.java:386)
 
at 
org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileInputFormatShim.getSplits(HadoopShimsSecure.java:352)
 
at 
org.apache.hadoop.hive.ql.io.CombineHiveInputFormat.processPaths(CombineHiveInputFormat.java:419)
 
at 
org.apache.hadoop.hive.ql.io.CombineHiveInputFormat.getSplits(CombineHiveInputFormat.java:390)
 
at 
org.apache.hadoop.mapreduce.JobSubmitter.writeOldSplits(JobSubmitter.java:479) 
at org.apache.hadoop.mapreduce.JobSubmitter.writeSplits(JobSubmitter.java:471) 
at 
org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:366)
 
at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1218) 
at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1215) 
at java.security.AccessController.doPrivileged(Native Method) 
at javax.security.auth.Subject.doAs(Subject.java:396) 
at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1367)
 
at org.apache.hadoop.mapreduce.Job.submit(Job.java:1215) 
at org.apache.hadoop.mapred.JobClient$1.run(JobClient.java:617) 
at org.apache.hadoop.mapred.JobClient$1.run(JobClient.java:612) 
at java.security.AccessController.doPrivileged(Native Method) 
at javax.security.auth.Subject.doAs(Subject.java:396) 
at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1367)
 
at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:612) 
at org.apache.hadoop.hive.ql.exec.ExecDriver.execute(ExecDriver.java:435) 
at org.apache.hadoop.hive.ql.exec.MapRedTask.execute(MapRedTask.java:137) 
at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:134) 
at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:57) 
at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1326) 
at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1118) 
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:951) 
at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:258) 
at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:215) 
at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:406) 
at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:689) 
at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:557) 
at 

[jira] [Created] (HADOOP-9329) document native build dependencies in BUILDING.txt

2013-02-22 Thread Colin Patrick McCabe (JIRA)
Colin Patrick McCabe created HADOOP-9329:


 Summary: document native build dependencies in BUILDING.txt
 Key: HADOOP-9329
 URL: https://issues.apache.org/jira/browse/HADOOP-9329
 Project: Hadoop Common
  Issue Type: Task
Affects Versions: 2.0.4-beta
Reporter: Colin Patrick McCabe
Priority: Trivial


{{BUILDING.txt}} describes {{-Pnative}}, but it does not specify what native 
libraries are needed for the build.  We should address this.
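
As a rough sketch, the added section could look like the following; the exact 
dependency list is an assumption and would need to be verified against the 
native code:

  Native build requirements (-Pnative):
    * cmake
    * zlib development headers
    * optionally, snappy and openssl development headers, depending on which
      native features are enabled

  Build with:
    $ mvn package -Pnative -DskipTests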

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira