Re: oozie issue java.lang.UnsupportedOperationException: Not implemented by the TFS FileSystem implementation

2017-01-17 Thread 권병창
Hi.
I think there are too many jars in sharelib/spark.
Try deleting every jar in sharelib/spark except oozie-sharelib-spark-*.jar and
spark-assembly-*.jar.
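
For reference, below is a minimal Java sketch of that cleanup using the Hadoop
FileSystem API. The sharelib path is only an example; adjust it to whatever your
oozie.service.WorkflowAppService.system.libpath points to. This helper class is not
part of Oozie itself, it just assumes core-site.xml on the classpath points at the
cluster holding the sharelib.

import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class PruneSparkSharelib {
    public static void main(String[] args) throws IOException {
        // Uses the default filesystem from core-site.xml (assumed to be the cluster's HDFS).
        FileSystem fs = FileSystem.get(new Configuration());
        // Example path only; point this at your actual sharelib spark directory.
        Path sparkLib = new Path("/user/oozie/sharelib/sharelib/spark");
        for (FileStatus status : fs.listStatus(sparkLib)) {
            String name = status.getPath().getName();
            boolean keep = name.startsWith("oozie-sharelib-spark")
                    || name.startsWith("spark-assembly");
            if (name.endsWith(".jar") && !keep) {
                fs.delete(status.getPath(), false); // remove the conflicting jar
            }
        }
        fs.close();
    }
}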
 
-Original Message-
From: "Rohit Mishra"rohitkmis...@mindagroup.com 
To: "권병창"magnu...@navercorp.com; user@hadoop.apache.org; 
Cc: 
Sent: 2017-01-17 (화) 20:38:38
Subject: Re: oozie issue java.lang.UnsupportedOperationException: Not 
implemented by the TFS FileSystem implementatio
 
Hi there,

Please find below the relevant portion of my oozie-site.xml:

<property>
    <name>oozie.service.HadoopAccessorService.hadoop.configurations</name>
    <value>*=/disk2/oozie/conf/hadoop-conf/</value>
    <description>Comma separated AUTHORITY=HADOOP_CONF_DIR, where
    AUTHORITY is the HOST:PORT of the Hadoop service (JobTracker, HDFS).
    The wildcard '*' configuration is used when there is no exact match
    for an authority. The HADOOP_CONF_DIR contains the relevant Hadoop
    *-site.xml files. If the path is relative it is looked up within the Oozie
    configuration directory; the path can also be absolute (i.e. to point
    to Hadoop client conf/ directories in the local filesystem).
    </description>
</property>
<property>
    <name>oozie.service.WorkflowAppService.system.libpath</name>
    <value>/user/oozie/sharelib/sharelib</value>
    <description>System library path to use for workflow
    applications. This path is added to workflow applications if their
    job properties set the property 'oozie.use.system.libpath' to true.
    </description>
</property>

In HDFS, at location /user/oozie/sharelib/sharelib, I have the following content:
distcp  hcatalog  hive  hive2  mapreduce-streaming  oozie  pig  sharelib.properties  spark  sqoop

In the spark folder I do have spark-assembly-1.5.2-hadoop2.6.0.jar. Please let me
know if this is the required setup, otherwise what am I missing here?

Regards,
Rohit Mishra

On 16-Jan-2017, at 1:33 pm, 권병창 <magnu...@navercorp.com> wrote:
Hi, try to make sure there is spark-assembly-1.5.2-hadoop2.6.0.jar in the Oozie
Spark sharelib. The Spark assembly jar must be located in the Oozie Spark sharelib.
-Original Message-
From: "Rohit Mishra"rohitkmis...@mindagroup.com 
To: user@hadoop.apache.org; 
Cc: 
Sent: 2017-01-16 (월) 15:04:26
Subject: oozie issue java.lang.UnsupportedOperationException: Not implemented 
by the TFS FileSystem implementatio
Hello,

I am new to Hadoop. I am having an issue running a Spark job in Oozie.
Individually I am able to run the Spark job, but with Oozie, after the job
is launched I am getting the following error:

2017-01-12 13:51:57,696 INFO [main] org.apache.hadoop.service.AbstractService: Service
org.apache.hadoop.mapreduce.v2.app.MRAppMaster failed in state INITED; cause:
java.lang.UnsupportedOperationException: Not implemented by the TFS FileSystem implementation
java.lang.UnsupportedOperationException: Not implemented by the TFS FileSystem implementation
    at org.apache.hadoop.fs.FileSystem.getScheme(FileSystem.java:216)
    at org.apache.hadoop.fs.FileSystem.loadFileSystems(FileSystem.java:2564)
    at org.apache.hadoop.fs.FileSystem.getFileSystemClass(FileSystem.java:2574)
    at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2591)
    at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:91)
    at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2630)
    at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2612)
    at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:370)
    at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:169)
    at org.apache.hadoop.mapreduce.v2.app.MRAppMaster.getFileSystem(MRAppMaster.java:497)
    at org.apache.hadoop.mapreduce.v2.app.MRAppMaster.serviceInit(MRAppMaster.java:281)
    at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
    at org.apache.hadoop.mapreduce.v2.app.MRAppMaster$4.run(MRAppMaster.java:1499)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
    at org.apache.hadoop.mapreduce.v2.app.MRAppMaster.initAndStartAppMaster(MRAppMaster.java:1496)
    at org.apache.hadoop.mapreduce.v2.app.MRAppMaster.main(MRAppMaster.java:1429)
Spark version: spark-1.5.2-bin-hadoop2.6
Hadoop: hadoop-2.6.2
HBase: hbase-1.1.5
Oozie: oozie-4.2.0

A snapshot of my pom.xml is:

<dependency>
   <groupId>org.apache.zookeeper</groupId>
   <artifactId>zookeeper</artifactId>
   <version>3.4.8</version>
   <type>pom</type>
</dependency>
<dependency>
   <groupId>org.apache.hbase</groupId>
   <artifactId>hbase-common</artifactId>
   <version>1.1.5</version>
   <exclusions>
      <exclusion>
         <groupId>org.slf4j</groupId>
         <artifactId>slf4j-log4j12</artifactId>
      </exclusion>
   </exclusions>
</dependency>

<dependency>
   <groupId>org.apache.hbase</groupId>
   art

RE: oozie issue java.lang.UnsupportedOperationException: Not implemented by the TFS FileSystem implementation

2017-01-16 Thread 권병창
Hi,
try to make sure there is spark-assembly-1.5.2-hadoop2.6.0.jar in the Oozie Spark
sharelib.
The Spark assembly jar must be located in the Oozie Spark sharelib.
 
-Original Message-
From: "Rohit Mishra"rohitkmis...@mindagroup.com 
To: user@hadoop.apache.org; 
Cc: 
Sent: 2017-01-16 (월) 15:04:26
Subject: oozie issue java.lang.UnsupportedOperationException: Not implemented 
by the TFS FileSystem implementatio
 
Hello,

I am new to Hadoop. I am having an issue running a Spark job in Oozie.
Individually I am able to run the Spark job, but with Oozie, after the job
is launched I am getting the following error:

2017-01-12 13:51:57,696 INFO [main] org.apache.hadoop.service.AbstractService: Service
org.apache.hadoop.mapreduce.v2.app.MRAppMaster failed in state INITED; cause:
java.lang.UnsupportedOperationException: Not implemented by the TFS FileSystem implementation
java.lang.UnsupportedOperationException: Not implemented by the TFS FileSystem implementation
    at org.apache.hadoop.fs.FileSystem.getScheme(FileSystem.java:216)
    at org.apache.hadoop.fs.FileSystem.loadFileSystems(FileSystem.java:2564)
    at org.apache.hadoop.fs.FileSystem.getFileSystemClass(FileSystem.java:2574)
    at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2591)
    at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:91)
    at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2630)
    at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2612)
    at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:370)
    at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:169)
    at org.apache.hadoop.mapreduce.v2.app.MRAppMaster.getFileSystem(MRAppMaster.java:497)
    at org.apache.hadoop.mapreduce.v2.app.MRAppMaster.serviceInit(MRAppMaster.java:281)
    at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
    at org.apache.hadoop.mapreduce.v2.app.MRAppMaster$4.run(MRAppMaster.java:1499)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
    at org.apache.hadoop.mapreduce.v2.app.MRAppMaster.initAndStartAppMaster(MRAppMaster.java:1496)
    at org.apache.hadoop.mapreduce.v2.app.MRAppMaster.main(MRAppMaster.java:1429)
Spark version: spark-1.5.2-bin-hadoop2.6
Hadoop: hadoop-2.6.2
HBase: hbase-1.1.5
Oozie: oozie-4.2.0

A snapshot of my pom.xml is:

<dependency>
   <groupId>org.apache.zookeeper</groupId>
   <artifactId>zookeeper</artifactId>
   <version>3.4.8</version>
   <type>pom</type>
</dependency>
<dependency>
   <groupId>org.apache.hbase</groupId>
   <artifactId>hbase-common</artifactId>
   <version>1.1.5</version>
   <exclusions>
      <exclusion>
         <groupId>org.slf4j</groupId>
         <artifactId>slf4j-log4j12</artifactId>
      </exclusion>
   </exclusions>
</dependency>

<dependency>
   <groupId>org.apache.hbase</groupId>
   <artifactId>hbase-client</artifactId>
   <version>1.1.5</version>
   <exclusions>
      <exclusion>
         <groupId>org.slf4j</groupId>
         <artifactId>slf4j-log4j12</artifactId>
      </exclusion>
   </exclusions>
</dependency>

<dependency>
   <groupId>org.apache.hbase</groupId>
   <artifactId>hbase-server</artifactId>
   <version>1.1.5</version>
   <exclusions>
      <exclusion>
         <groupId>org.slf4j</groupId>
         <artifactId>slf4j-log4j12</artifactId>
      </exclusion>
   </exclusions>
</dependency>
<dependency>
   <groupId>org.apache.hbase</groupId>
   <artifactId>hbase-testing-util</artifactId>
   <version>1.1.5</version>
</dependency>
<dependency>
   <groupId>org.apache.spark</groupId>
   <artifactId>spark-core_2.11</artifactId>
   <version>1.5.2</version>
   <exclusions>
      <exclusion>
         <artifactId>javax.servlet</artifactId>
         <groupId>org.eclipse.jetty.orbit</groupId>
      </exclusion>
   </exclusions>
</dependency>
<!-- https://mvnrepository.com/artifact/org.apache.spark/spark-sql_2.10 -->
<dependency>
   <groupId>org.apache.spark</groupId>
   <artifactId>spark-sql_2.11</artifactId>
   <version>1.5.2</version>
</dependency>
<!-- https://mvnrepository.com/artifact/org.apache.spark/spark-yarn_2.10 -->
<dependency>
   <groupId>org.apache.spark</groupId>
   <artifactId>spark-yarn_2.11</artifactId>
   <version>1.5.2</version>
</dependency>

<!-- https://mvnrepository.com/artifact/org.mongodb.mongo-hadoop/mongo-hadoop-core -->
<dependency>
   <groupId>org.mongodb.mongo-hadoop</groupId>
   <artifactId>mongo-hadoop-core</artifactId>
   <version>1.5.2</version>
</dependency>
<!-- https://mvnrepository.com/artifact/org.apache.hadoop/hadoop-common -->
<dependency>
   <groupId>org.apache.hadoop</groupId>
   <artifactId>hadoop-common</artifactId>
   <version>2.6.2</version>
   <exclusions>
      <exclusion>
         <artifactId>servlet-api</artifactId>
         <groupId>javax.servlet</groupId>
      </exclusion>
      <exclusion>
         <artifactId>jetty-util</artifactId>
         <groupId>org.mortbay.jetty</groupId>
      </exclusion>
      <exclusion>
         <artifactId>jsp-api</artifactId>
         <groupId>javax.servlet.jsp</groupId>
      </exclusion>
   </exclusions>
</dependency>
<!--

Re: Connecting Hadoop HA cluster via java client

2016-10-18 Thread 권병창
Those properties are needed when using webhdfs://${nameservice}.
Try "hdfs dfs -ls webhdfs://${nameservice}/some/files".
 
 
-Original Message-
From: "Pushparaj Motamari"pushpara...@gmail.com 
To: "권병창"magnu...@navercorp.com; 
Cc: user@hadoop.apache.org; 
Sent: 2016-10-18 (화) 23:02:14
Subject: Re: Connecting Hadoop HA cluster via java client
 
Hi,
The following are not required, I guess. I am able to connect to the cluster
without them. Is there any reason to include them?
dfs.namenode.http-address.${dfs.nameservices}.nn1 
dfs.namenode.http-address.${dfs.nameservices}.nn2 
Regards
Pushparaj 
 
On Wed, Oct 12, 2016 at 6:39 AM, 권병창 magnu...@navercorp.com wrote:
Hi.

1. The minimal configuration to connect to HA NameNodes is the set of properties
below; ZooKeeper information is not necessary.

dfs.nameservices
dfs.ha.namenodes.${dfs.nameservices}
dfs.namenode.rpc-address.${dfs.nameservices}.nn1
dfs.namenode.rpc-address.${dfs.nameservices}.nn2
dfs.namenode.http-address.${dfs.nameservices}.nn1
dfs.namenode.http-address.${dfs.nameservices}.nn2
dfs.client.failover.proxy.provider.c3=org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider

2. The client tries the configured NameNodes in a round-robin manner to select the
active NameNode.
 
 
-Original Message-
From: "Pushparaj Motamari"pushpara...@gmail.com 
To: user@hadoop.apache.org; 
Cc: 
Sent: 2016-10-12 (수) 03:20:53
Subject: Connecting Hadoop HA cluster via java client
 
Hi,
I have two questions pertaining to accessing the Hadoop HA cluster from a Java
client.

1. Is it necessary to supply

conf.setBoolean("dfs.ha.automatic-failover.enabled", true);
and
conf.set("ha.zookeeper.quorum","zk1.example.com:2181,zk2.example.com:2181,zk3.example.com:2181");

in addition to the other properties set in the code below?
private Configuration initHAConf(URI journalURI, Configuration conf) {
  conf.set(DFSConfigKeys.DFS_NAMENODE_SHARED_EDITS_DIR_KEY,
  journalURI.toString());
  
  String address1 = "127.0.0.1:" + NN1_IPC_PORT;
  String address2 = "127.0.0.1:" + NN2_IPC_PORT;
  conf.set(DFSUtil.addKeySuffixes(DFS_NAMENODE_RPC_ADDRESS_KEY,
  NAMESERVICE, NN1), address1);
  conf.set(DFSUtil.addKeySuffixes(DFS_NAMENODE_RPC_ADDRESS_KEY,
  NAMESERVICE, NN2), address2);
  conf.set(DFSConfigKeys.DFS_NAMESERVICES, NAMESERVICE);
  conf.set(DFSUtil.addKeySuffixes(DFS_HA_NAMENODES_KEY_PREFIX, NAMESERVICE),
  NN1 + "," + NN2);
  conf.set(DFS_CLIENT_FAILOVER_PROXY_PROVIDER_KEY_PREFIX + "." + NAMESERVICE,
  ConfiguredFailoverProxyProvider.class.getName());
  conf.set("fs.defaultFS", "hdfs://" + NAMESERVICE);
  
  return conf;
}

2. If we supply the ZooKeeper configuration details mentioned in question 1, is it
necessary to set the primary and standby NameNode addresses as in the code above?
Since we have given the ZooKeeper connection details, the client should be able to
figure out the active NameNode connection details on its own.


Regards

Pushparaj




 




RE: Connecting Hadoop HA cluster via java client

2016-10-11 Thread 권병창
Hi.

1. The minimal configuration to connect to HA NameNodes is the set of properties
below; ZooKeeper information is not necessary. A sketch of these settings in client
code follows after point 2.

dfs.nameservices
dfs.ha.namenodes.${dfs.nameservices}
dfs.namenode.rpc-address.${dfs.nameservices}.nn1
dfs.namenode.rpc-address.${dfs.nameservices}.nn2
dfs.namenode.http-address.${dfs.nameservices}.nn1
dfs.namenode.http-address.${dfs.nameservices}.nn2
dfs.client.failover.proxy.provider.c3=org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider

2. The client tries the configured NameNodes in a round-robin manner to select the
active NameNode.
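
For reference, a minimal client-side sketch of those properties set
programmatically. The nameservice name "mycluster" and the host:port values are
placeholders; in practice they usually live in hdfs-site.xml rather than in code:

import java.net.URI;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;

public class HaClientSketch {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        conf.set("fs.defaultFS", "hdfs://mycluster");
        conf.set("dfs.nameservices", "mycluster");
        conf.set("dfs.ha.namenodes.mycluster", "nn1,nn2");
        conf.set("dfs.namenode.rpc-address.mycluster.nn1", "nn1.example.com:8020");
        conf.set("dfs.namenode.rpc-address.mycluster.nn2", "nn2.example.com:8020");
        conf.set("dfs.namenode.http-address.mycluster.nn1", "nn1.example.com:50070");
        conf.set("dfs.namenode.http-address.mycluster.nn2", "nn2.example.com:50070");
        conf.set("dfs.client.failover.proxy.provider.mycluster",
                "org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider");
        // The failover proxy provider finds the active NameNode; no ZooKeeper settings needed.
        FileSystem fs = FileSystem.get(URI.create("hdfs://mycluster"), conf);
        System.out.println(fs.getUri());
        fs.close();
    }
}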
 
 
-Original Message-
From: "Pushparaj Motamari"pushpara...@gmail.com 
To: user@hadoop.apache.org; 
Cc: 
Sent: 2016-10-12 (수) 03:20:53
Subject: Connecting Hadoop HA cluster via java client
 
Hi,
I have two questions pertaining to accessing the Hadoop HA cluster from a Java
client.

1. Is it necessary to supply

conf.setBoolean("dfs.ha.automatic-failover.enabled", true);
and
conf.set("ha.zookeeper.quorum","zk1.example.com:2181,zk2.example.com:2181,zk3.example.com:2181");

in addition to the other properties set in the code below?
private Configuration initHAConf(URI journalURI, Configuration conf) {
  conf.set(DFSConfigKeys.DFS_NAMENODE_SHARED_EDITS_DIR_KEY,
  journalURI.toString());
  
  String address1 = "127.0.0.1:" + NN1_IPC_PORT;
  String address2 = "127.0.0.1:" + NN2_IPC_PORT;
  conf.set(DFSUtil.addKeySuffixes(DFS_NAMENODE_RPC_ADDRESS_KEY,
  NAMESERVICE, NN1), address1);
  conf.set(DFSUtil.addKeySuffixes(DFS_NAMENODE_RPC_ADDRESS_KEY,
  NAMESERVICE, NN2), address2);
  conf.set(DFSConfigKeys.DFS_NAMESERVICES, NAMESERVICE);
  conf.set(DFSUtil.addKeySuffixes(DFS_HA_NAMENODES_KEY_PREFIX, NAMESERVICE),
  NN1 + "," + NN2);
  conf.set(DFS_CLIENT_FAILOVER_PROXY_PROVIDER_KEY_PREFIX + "." + NAMESERVICE,
  ConfiguredFailoverProxyProvider.class.getName());
  conf.set("fs.defaultFS", "hdfs://" + NAMESERVICE);
  
  return conf;
}

2. If we supply the ZooKeeper configuration details mentioned in question 1, is it
necessary to set the primary and standby NameNode addresses as in the code above?
Since we have given the ZooKeeper connection details, the client should be able to
figure out the active NameNode connection details on its own.


Regards

Pushparaj





Re: Installing just the HDFS client

2016-08-29 Thread 권병창
Place the default core-site.xml, hdfs-site.xml and log4j.properties (not your
customized *.xml) in the tree below, and rebuild the jar using 'mvn clean package'.
The resources (*.xml, *.properties) will then be packaged into the jar.

.
├── pom.xml
└── src
    └── main
        └── resources
            ├── core-site.xml
            ├── hdfs-site.xml
            └── log4j.properties

Then you can use the jar like this:

java -jar ${build_jar} -conf /opt/hadoop/etc/hdfs-site.xml -ls hdfs://${nameservices}/user/home


 
-Original Message-
From: "F21"f21.gro...@gmail.com 
To: "권병창"magnu...@navercorp.com; 
Cc: 
Sent: 2016-08-30 (화) 12:47:16
Subject: Re: Installing just the HDFS client
 

  

  
  
I am still getting the same error despite setting the variable. In my case,
hdfs-site.xml and core-site.xml are customized at run time, so they cannot be
compiled into the jar.

My core-site.xml and hdfs-site.xml are located in /opt/hbase/conf

Here's what I did:

bash-4.3# export HADOOP_CONF_DIR="/opt/hbase/conf"
bash-4.3# echo $HADOOP_CONF_DIR
/opt/hbase/conf
bash-4.3# cd /opt/hadoop/
bash-4.3# java -jar hdfs-fs.jar
log4j:WARN No appenders could be found for logger (org.apache.hadoop.util.Shell).
log4j:WARN Please initialize the log4j system properly.
log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more info.
Exception in thread "main" java.lang.RuntimeException: core-site.xml not found
    at org.apache.hadoop.conf.Configuration.loadResource(Configuration.java:2566)
    at org.apache.hadoop.conf.Configuration.loadResources(Configuration.java:2492)
    at org.apache.hadoop.conf.Configuration.getProps(Configuration.java:2405)
    at org.apache.hadoop.conf.Configuration.set(Configuration.java:1143)
    at org.apache.hadoop.conf.Configuration.set(Configuration.java:1115)
    at org.apache.hadoop.conf.Configuration.setBoolean(Configuration.java:1451)
    at org.apache.hadoop.util.GenericOptionsParser.processGeneralOptions(GenericOptionsParser.java:321)
    at org.apache.hadoop.util.GenericOptionsParser.parseGeneralOptions(GenericOptionsParser.java:487)
    at org.apache.hadoop.util.GenericOptionsParser.init(GenericOptionsParser.java:170)
    at org.apache.hadoop.util.GenericOptionsParser.init(GenericOptionsParser.java:153)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:64)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:84)
    at org.apache.hadoop.fs.FsShell.main(FsShell.java:340)

  On 30/08/2016 11:20 AM, 권병창 wrote:



  
Set the environment variable below:

export HADOOP_CONF_DIR=/opt/hadoop/etc

or hdfs-site.xml and core-site.xml can be placed inside the jar.
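
One note on why the export alone may not be enough here: HADOOP_CONF_DIR is normally
put on the classpath by the hadoop/hdfs launcher scripts, so a shaded jar started
with plain "java -jar" will not see it unless the config directory is added to the
classpath or the files are loaded explicitly. Below is a minimal sketch of a
hypothetical wrapper main class (not part of the jar discussed in this thread) that
loads them explicitly; the /opt/hbase/conf paths are examples:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FsShell;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.util.ToolRunner;

public class ConfAwareFsShell {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // Example locations only; point these at your real configuration directory.
        conf.addResource(new Path("file:///opt/hbase/conf/core-site.xml"));
        conf.addResource(new Path("file:///opt/hbase/conf/hdfs-site.xml"));
        // Delegate to the standard FsShell with the explicitly loaded configuration.
        System.exit(ToolRunner.run(conf, new FsShell(), args));
    }
}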


  


-Original Message-
From: F21 <f21.gro...@gmail.com>
To: 권병창 <magnu...@navercorp.com>
Cc: 
Date: 2016-08-30 09:12:58 AM
Subject: Re: Installing just the HDFS client




  

  

Hi,

Thanks for the pom.xml. I was able to build it successfully. How do I point it to
the config files? My core-site.xml and hdfs-site.xml are located in /opt/hadoop/etc.

I tried the following:

java -jar hdfs-fs.jar -ls /
java -jar hdfs-fs.jar --config /opt/hbase/etc -ls /
java -jar hdfs-fs.jar -conf /opt/hbase/etc -ls /

This is the error I am getting:

log4j:WARN No appenders could be found for logger (org.apache.hadoop.util.Shell).
log4j:WARN Please initialize the log4j system properly.
log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more info.
Exception in thread "main" java.lang.RuntimeException: core-site.xml not found
    at org.apache.hadoop.conf.Configuration.loadResource(Configuration.java:2566)
    at org.apache.hadoop.conf.Configuration.loadResources(Configuration.java:2492)
    at org.apache.hadoop.conf.Configuration.getProps(Configuration.java:2405)
    at org.apache.hadoop.conf.Configuration.set(Configuration.java:1143)
    at org.apache.had

RE: Installing just the HDFS client

2016-08-29 Thread 권병창
Hi,

Refer to the pom.xml below; change the Hadoop version to the one you want.

The build is simple: mvn clean package
It will produce a jar of about 34 MB in size.

Usage is simple:
java -jar ${build_jar}.jar -mkdir /user/home
java -jar ${build_jar}.jar -ls /user/home
 
 
pom.xml:

<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">

  <modelVersion>4.0.0</modelVersion>

  <groupId>com.naver.c3</groupId>
  <artifactId>hdfs-connector</artifactId>
  <version>1.0-SNAPSHOT</version>

  <dependencies>
    <dependency>
      <groupId>org.apache.hadoop</groupId>
      <artifactId>hadoop-common</artifactId>
      <version>2.7.1</version>
    </dependency>
    <dependency>
      <groupId>org.apache.hadoop</groupId>
      <artifactId>hadoop-hdfs</artifactId>
      <version>2.7.1</version>
    </dependency>
  </dependencies>

  <build>
    <plugins>
      <plugin>
        <groupId>org.apache.maven.plugins</groupId>
        <artifactId>maven-shade-plugin</artifactId>
        <version>2.3</version>
        <executions>
          <execution>
            <phase>package</phase>
            <goals>
              <goal>shade</goal>
            </goals>
            <configuration>
              <minimizeJar>false</minimizeJar>
              <transformers>
                <transformer implementation="org.apache.maven.plugins.shade.resource.ManifestResourceTransformer">
                  <mainClass>org.apache.hadoop.fs.FsShell</mainClass>
                </transformer>
                <transformer implementation="org.apache.maven.plugins.shade.resource.ServicesResourceTransformer"/>
              </transformers>
            </configuration>
          </execution>
        </executions>
      </plugin>
    </plugins>
  </build>

</project>
 
-Original Message-
From: "F21"f21.gro...@gmail.com 
To: user@hadoop.apache.org; 
Cc: 
Sent: 2016-08-29 (월) 14:25:09
Subject: Installing just the HDFS client
 
Hi all,

I am currently building a HBase docker image. As part of the bootstrap 
process, I need to run some `hdfs dfs` commands to create directories on 
HDFS.

The whole Hadoop distribution is pretty heavy and contains things to run 
namenodes, etc. I just need a copy of the dfs client for my docker 
image. I have done some poking around and see that I need to include the 
files in bin/, libexec/, lib/, share/hadoop/common and share/hadoop/hdfs.

However, including the above still takes up quite a bit of space. Is 
there a single JAR I can add to my image to perform operations against HDFS?


Cheers,

Francis


-
To unsubscribe, e-mail: user-unsubscr...@hadoop.apache.org
For additional commands, e-mail: user-h...@hadoop.apache.org





Re: Output File could only be replicated to 0 nodes

2016-08-01 Thread 권병창
 
The relevant setting is:

<property>
  <name>dfs.namenode.replication.considerLoad</name>
  <value>true</value>
  <description>Decide if chooseTarget considers the target's load or not</description>
</property>

There is wide variation in Xceivers across the datanodes below.
Try dfs.namenode.replication.considerLoad=false
 
 
-Original Message-
From: "Madhav Sharan"msha...@usc.edu 
To: "Gagan Brahmi"gaganbra...@gmail.com; 
Cc: "Gabriel Balan"gabriel.ba...@oracle.com; 
"user"user@hadoop.apache.org; ananthk_gan...@yahoo.com; 
Sent: 2016-08-01 (월) 15:57:35
Subject: Re: Output File could only be replicated to 0 nodes
 
Thanks everyone for the help. I believe the error was coming because of an increase
in Non DFS usage.

Name: 172.31.11.49:50010
Decommission Status : Normal
Configured Capacity: 74033672192 (68.95 GB)
DFS Used: 6814511104 (6.35 GB)
Non DFS Used: 4128133120 (3.84 GB)
DFS Remaining: 63091027968 (58.76 GB)
DFS Used%: 9.20%
DFS Remaining%: 85.22%
Configured Cache Capacity: 0 (0 B)
Cache Used: 0 (0 B)
Cache Remaining: 0 (0 B)
Cache Used%: 100.00%
Cache Remaining%: 0.00%
Xceivers: 1

Name: 172.31.11.49:50010
Decommission Status : Normal
Configured Capacity: 74033672192 (68.95 GB)
DFS Used: 7336382464 (6.83 GB)
Non DFS Used: 60541867008 (56.38 GB)
DFS Remaining: 6155422720 (5.73 GB)
DFS Used%: 9.91%
DFS Remaining%: 8.31%
Configured Cache Capacity: 0 (0 B)
Cache Used: 0 (0 B)
Cache Remaining: 0 (0 B)
Cache Used%: 100.00%
Cache Remaining%: 0.00%
Xceivers: 847

--Madhav Sharan

On Mon, Jul 25, 2016 at 10:00 AM, Gagan Brahmi <gaganbra...@gmail.com> wrote:
There can be several reasons you see this error. The most common ones are:

- Disk space on Datanodes - like mentioned earlier in the thread.
- Inconsistent DataNodes - you can try to restart HDFS, which should clean it up.
- Bad or unresponsive Datanode.
- Negative 'Block Size' in hdfs-site.xml.
- Network communication issues.

Regards,
Gagan Brahmi
On Mon, Jul 25, 2016 at 9:14 AM, Gabriel Balan <gabriel.ba...@oracle.com> wrote:

Hi

org.apache.hadoop.ipc.RemoteException(java.io.IOException): File
/user/pts/output/OTSOutput/_temporary/1/_temporary/attempt_1463_0008_m_18_2/video.mp4.of.txt
could only be replicated to 0 nodes instead of minReplication (=1). There are 9
datanode(s) running and no node(s) are excluded in this operation.

Can it be there is no more space left (for HDFS) on the hosts running the data nodes?

Try running "hdfs dfsadmin -report"

hth
Gabriel Balan

On 7/24/2016 7:53 PM, Madhav Sharan wrote:



  
  
Hi hadoop users,

We are running a mapreduce job with 10 nodes. Each map task processes a video and
generates a .txt file as output. We are getting a DataStreamer exception that the
file could only be replicated to 0 nodes instead of minReplication (=1). This is
the output file we expect after a successful run.

Any help is appreciated. We checked that HDFS has space available and all nodes
are responding.

Full trace -

2016-07-24 21:55:08,343 WARN [Thread-214] org.apache.hadoop.hdfs.DFSClient: DataStreamer Exception
org.apache.hadoop.ipc.RemoteException(java.io.IOException): File
/user/pts/output/OTSOutput/_temporary/1/_temporary/attempt_1463_0008_m_18_2/video.mp4.of.txt
could only be replicated to 0 nodes instead of minReplication (=1). There are 9
datanode(s) running and no node(s) are excluded in this operation.
    at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.chooseTarget4NewBlock(BlockManager.java:1547)
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getNewBlockTargets(FSNamesystem.java:3107)
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:3031)
    at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.addBlock(NameNodeRpcServer.java:724)
    at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.addBlock(ClientNamenodeProtocolServerSideTranslatorPB.java:492)
    at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
    at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:616)
    at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:969)
    at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2049)
    at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2045)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at

RE: Reload an update hdfs-site.xml configuration without restart

2016-07-31 Thread 권병창
https://hadoop.apache.org/docs/r2.7.1/hadoop-project-dist/hadoop-hdfs/HDFSCommands.html#dfsadmin
 
Refer to the 'hdfs dfsadmin -reconfig' command. 
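
For reference, a minimal Java sketch of driving the same operation through the
DFSAdmin tool; the datanode address is a placeholder, and this wrapper is only an
illustration, not something from this thread:

import org.apache.hadoop.hdfs.HdfsConfiguration;
import org.apache.hadoop.hdfs.tools.DFSAdmin;
import org.apache.hadoop.util.ToolRunner;

public class ReconfigStart {
    public static void main(String[] args) throws Exception {
        // Equivalent of: hdfs dfsadmin -reconfig datanode <host:ipc_port> start
        // "dn1.example.com:50020" is a placeholder datanode IPC address.
        String[] cmd = {"-reconfig", "datanode", "dn1.example.com:50020", "start"};
        int rc = ToolRunner.run(new HdfsConfiguration(), new DFSAdmin(), cmd);
        System.exit(rc);
    }
}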
 
-Original Message-
From: "Himawan Mahardianto"mahardia...@ugm.ac.id 
To: user@hadoop.apache.org; 
Cc: 
Sent: 2016-08-01 (월) 08:43:10
Subject: Reload an update hdfs-site.xml configuration without restart
 
Hi guys, is there any way to reload an updated hdfs-site.xml configuration on the
namenode and datanodes without restarting the whole Hadoop cluster with stop-dfs.sh
and start-dfs.sh?

Best regards,
Himawan Mahardianto




mismatch corrupt blocks from fsck and dfsadmin report.

2016-01-12 Thread 권병창
Hi, hadoopers!
 
I use hadoop-2.7.1 and my cluster has 130 nodes.
 
Recently I have been facing a problem.
 
I found a corrupt block via Nagios.
 
Nagios requests 
http://namenode01:50070/jmx?qry=Hadoop:service=NameNode,name=FSNamesystem. 
 
Below is the result. Notice that CorruptBlocks is 1. 

 {
  "beans" : [ {
"name" : "Hadoop:service=NameNode,name=FSNamesystem",
"modelerType" : "FSNamesystem",
"tag.Context" : "dfs",
"tag.HAState" : "active",
"tag.Hostname" : "css0700.nhnsystem.com",
"MissingBlocks" : 0,
"MissingReplOneBlocks" : 0,
"ExpiredHeartbeats" : 10,
"TransactionsSinceLastCheckpoint" : 820630,
"TransactionsSinceLastLogRoll" : 1916,
"LastWrittenTransactionId" : 376685578,
"LastCheckpointTime" : 1452583950883,
"CapacityTotal" : 1650893130075660,
"CapacityTotalGB" : 1537514.0,
"CapacityUsed" : 1237079990848257,
"CapacityUsedGB" : 1152121.0,
"CapacityRemaining" : 410189981364473,
"CapacityRemainingGB" : 382019.0,
"CapacityUsedNonDFS" : 3623157862930,
"TotalLoad" : 6717,
"SnapshottableDirectories" : 0,
"Snapshots" : 0,
"BlocksTotal" : 4034155,
"FilesTotal" : 2866690,
"PendingReplicationBlocks" : 0,
"UnderReplicatedBlocks" : 0,
"CorruptBlocks" : 1,
"ScheduledReplicationBlocks" : 0,
"PendingDeletionBlocks" : 0,
"ExcessBlocks" : 87,
"PostponedMisreplicatedBlocks" : 0,
"PendingDataNodeMessageCount" : 0,
"MillisSinceLastLoadedEdits" : 0,
"BlockCapacity" : 67108864,
"StaleDataNodes" : 0,
"TotalFiles" : 2866690
  } ]
}


 
 
However, 'hdfs fsck / -list-corruptfileblocks' does not find it.

Below is the fsck result:

 The filesystem under path '/' has 0 CORRUPT files


'hdfs dfsadmin -report' is similar to the first (JMX) result:

 Configured Capacity: 1650892318461452 (1.47 PB)
Present Capacity: 1647353181258422 (1.46 PB)
DFS Remaining: 408711410865856 (371.72 TB)
DFS Used: 1238641770392566 (1.10 PB)
DFS Used%: 75.19%
Under replicated blocks: 0
Blocks with corrupt replicas: 1
Missing blocks: 0
Missing blocks (with replication factor 1): 0



 
My questions are:
 
1. Why are the results different?
2. How do I find the corrupt file name? I wonder which file is corrupt.
 
Thank you.