Unsubscribe

2015-11-06 Thread Kiran Dangeti
On Nov 5, 2015 4:56 PM, "Arvind Thakur"  wrote:

> Unsubscribe
>


Re: hadoop 2.4.0 streaming generic parser options using TAB as separator

2015-06-09 Thread Kiran Dangeti
On Jun 10, 2015 10:58 AM, "anvesh ragi"  wrote:

> Hello all,
>
> I know that the tab is the default input separator for these fields:
>
> stream.map.output.field.separator
> stream.reduce.input.field.separator
> stream.reduce.output.field.separator
> mapreduce.textoutputformat.separator
>
> but if I try to write the generic parser option:
>
> stream.map.output.field.separator=\t (or)
> stream.map.output.field.separator="\t"
>
> to test how Hadoop parses whitespace characters like "\t" and "\n" when used
> as separators, I observed that Hadoop reads it as the literal characters \t
> rather than as an actual tab. I checked this by printing each line in the
> reducer (Python) as it is read, using:
>
> sys.stdout.write(str(line))
>
> My mapper emits key/value pairs as: key value1 value2
>
> using the print(key, value1, value2, sep='\t', end='\n') call.
>
> So I expected my reducer to read each line as: key value1 value2 too,
> but instead sys.stdout.write(str(line)) printed:
>
> key value1 value2  (with a trailing space)
>
> From the "Hadoop streaming - remove trailing tab from reducer output" thread,
> I understood that the trailing space is due to
> mapreduce.textoutputformat.separator not being set and left at its default.
>
> So, this confirmed my assumption that Hadoop treated my entire map
> output:
>
> key value1 value2
>
> as the key, with an empty Text object as the value, since it read the
> separator from stream.map.output.field.separator=\t as the literal
> characters "\t" instead of an actual tab.
>
> Please help me understand this behavior, and how I can use \t as a
> separator if I want to.
>
> Thanks & Regards,
> Anvesh R
>
>
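As observed above, the value of stream.map.output.field.separator is taken
literally, so typing \t on the command line gives a two-character separator
rather than a real tab; one commonly suggested workaround (untested here) is
to pass an actual tab character from the shell, for example with bash's $'\t'
quoting. A minimal Python reducer sketch, assuming tab-separated key/value
lines as produced by the mapper above, that makes the difference visible:

#!/usr/bin/env python
# Minimal sketch of a streaming reducer that splits each line on a real tab.
# Inside Python, "\t" is an actual tab character; the problem described above
# is that the job property arrived as the two characters backslash + t.
import sys

for line in sys.stdin:
    line = line.rstrip("\n")
    key, sep, value = line.partition("\t")
    if not sep:
        # No real tab found: the whole line becomes the key and the value is
        # empty, which matches the behaviour described in the question.
        sys.stderr.write("no tab separator in: %r\n" % line)
        continue
    sys.stdout.write("%s\t%s\n" % (key, value))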


Re: Unable to start Hive

2015-05-15 Thread Kiran Dangeti
Anand,

Sometimes it errors out because some resources are not yet available. Stop
and restart the Hadoop cluster and check again.
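
The trace below shows the NameNode was still in safe mode when Hive started,
and it reports that safe mode will turn off automatically within a few
seconds. A small Python sketch, assuming the hdfs command from the 2.6.0
install is on the PATH, that waits for safe mode to clear before starting
Hive:

#!/usr/bin/env python
# Minimal sketch: poll "hdfs dfsadmin -safemode get" until the NameNode
# reports that safe mode is OFF; only then start Hive.
import subprocess
import time

def safemode_off(timeout_s=120, interval_s=5):
    deadline = time.time() + timeout_s
    while time.time() < deadline:
        out = subprocess.check_output(["hdfs", "dfsadmin", "-safemode", "get"])
        if b"OFF" in out:
            return True
        time.sleep(interval_s)
    return False

if __name__ == "__main__":
    print("safe mode is off, OK to start hive" if safemode_off()
          else "NameNode is still in safe mode")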
On May 15, 2015 12:24 PM, "Anand Murali"  wrote:

> Dear All:
>
> I am running Hadoop-2.6 (pseudo mode) on Ubuntu 15.04, and trying to
> connect Hive to it after installation. I run . .hadoop as start-up script
> which contain environment variables setting. Find below
>
> *. .hadoop*
> export HADOOP_HOME=/home/anand_vihar/hadoop-2.6.0
> export JAVA_HOME=/home/anand_vihar/jdk1.7.0_75/
> export HADOOP_PREFIX=/home/anand_vihar/hadoop-2.6.0
> export HADOOP_INSTALL=/home/anand_vihar/hadoop-2.6.0
> export PIG_HOME=/home/anand_vihar/pig-0.14.0
> export PIG_INSTALL=/home/anand_vihar/pig-0.14.0
> export PIG_CLASSPATH=/home/anand_vihar/hadoop-2.6.0/etc/hadoop/
> export HIVE_HOME=/home/anand_vihar/hive-1.1.0
> export HIVE_INSTALL=/home/anand_vihar/hive-1.1.0
> export
> PATH=$PATH:$HADOOP_INSTALL/bin:$HADOOP_INSTALL/sbin:$HADOOP_HOME:$JAVA_HOME:$PIG_INSTALL/bin:$PIG_CLASSPATH:$HIVE_HOME:$HIVE_INSTALL/bin
> echo $HADOOP_HOME
> echo $JAVA_HOME
> echo $HADOOP_INSTALL
> echo $PIG_HOME
> echo $PIG_INSTALL
> echo $PIG_CLASSPATH
> echo $HIVE_HOME
> echo $PATH
>
>
> *Error*
>
> anand_vihar@Latitude-E5540:~$ hive
>
> Logging initialized using configuration in
> jar:file:/home/anand_vihar/hive-1.1.0/lib/hive-common-1.1.0.jar!/hive-log4j.properties
> SLF4J: Class path contains multiple SLF4J bindings.
> SLF4J: Found binding in
> [jar:file:/home/anand_vihar/hadoop-2.6.0/share/hadoop/common/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
> SLF4J: Found binding in
> [jar:file:/home/anand_vihar/hive-1.1.0/lib/hive-jdbc-1.1.0-standalone.jar!/org/slf4j/impl/StaticLoggerBinder.class]
> SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an
> explanation.
> SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
> Exception in thread "main" java.lang.RuntimeException:
> org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.hdfs.server.namenode.SafeModeException):
> Cannot create directory
> /tmp/hive/anand_vihar/a9eb2cf7-9890-4ec3-af6c-ae0c40d9e9d7. Name node is in
> safe mode.
> The reported blocks 2 has reached the threshold 0.9990 of total blocks 2.
> The number of live datanodes 1 has reached the minimum number 0. In safe
> mode extension. Safe mode will be turned off automatically in 6 seconds.
> at
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkNameNodeSafeMode(FSNamesystem.java:1364)
> at
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.mkdirsInt(FSNamesystem.java:4216)
> at
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.mkdirs(FSNamesystem.java:4191)
> at
> org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.mkdirs(NameNodeRpcServer.java:813)
> at
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.mkdirs(ClientNamenodeProtocolServerSideTranslatorPB.java:600)
> at
> org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
> at
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:619)
> at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:962)
> at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2039)
> at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2035)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:415)
> at
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
> at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2033)
>
> at
> org.apache.hadoop.hive.ql.session.SessionState.start(SessionState.java:472)
> at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:671)
> at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:615)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
> at
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:606)
> at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
> at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
> Caused by:
> org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.hdfs.server.namenode.SafeModeException):
> Cannot create directory
> /tmp/hive/anand_vihar/a9eb2cf7-9890-4ec3-af6c-ae0c40d9e9d7. Name node is in
> safe mode.
> The reported blocks 2 has reached the threshold 0.9990 of total blocks 2.
> The number of live datanodes 1 has reached the minimum number 0. In safe
> mode extension. Safe mode will be turned off automatically in 6 seconds.
> at
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkNameNodeSafeMode(FSNamesystem.java:1364)
> at
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.mkdirsInt(FSNam

Re: How to debug why example not finishing (or even starting)

2015-01-28 Thread Kiran Dangeti
Frank,

Did you set the debug mode?
On Jan 28, 2015 7:10 PM, "Frank Lanitz"  wrote:

> Hi,
>
> I've got a simple 3-node setup where I wanted to test the grep function
> based on some examples. So
>
> $ hadoop fs -put /home/hadoop/hadoop/etc/hadoop hadoop-config
> $ hadoop jar
> hadoop/share/hadoop/mapreduce/hadoop-mapreduce-examples-*.jar grep
> /user/hadoop/hadoop-config /user/hadoop/output 'dfs[a-z.]+'
>
> When running this I get
>
> > 15/01/28 14:32:14 WARN util.NativeCodeLoader: Unable to load
> native-hadoop library for your platform... using builtin-java classes where
> applicable
> > 15/01/28 14:32:15 INFO client.RMProxy: Connecting to ResourceManager at
> hadoopm/:8032
> > 15/01/28 14:32:15 WARN mapreduce.JobSubmitter: No job jar file set.
> User classes may not be found. See Job or Job#setJar(String).
> > 15/01/28 14:32:15 INFO input.FileInputFormat: Total input paths to
> process : 30
> > 15/01/28 14:32:16 INFO mapreduce.JobSubmitter: number of splits:30
> > 15/01/28 14:32:16 INFO mapreduce.JobSubmitter: Submitting tokens for
> job: job_1422451252723_0002
> > 15/01/28 14:32:16 INFO mapred.YARNRunner: Job jar is not present. Not
> adding any jar to the list of resources.
> > 15/01/28 14:32:16 INFO impl.YarnClientImpl: Submitted application
> application_1422451252723_0002
> > 15/01/28 14:32:16 INFO mapreduce.Job: The url to track the job:
> http://hadoopm:8088/proxy/application_1422451252723_0002/
> > 15/01/28 14:32:16 INFO mapreduce.Job: Running job: job_1422451252723_0002
>
> and nothing more seems to happen. When checking
> http://hadoopm:8088/cluster/apps I can see that the job is accepted,
> but in an undefined state. However, after killing the job and submitting a
> new one, it sometimes starts to work. Obviously something is not working well
> here -- so I'm wondering how to debug what's going wrong.
>
> Cheers,
> Frank
>
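One way to see why the application stays in the accepted/undefined state is
to ask the ResourceManager for its report; a job stuck at ACCEPTED usually
means no container could be allocated. A rough Python sketch, assuming the
yarn CLI is on the PATH and using the application id printed by the client
above:

#!/usr/bin/env python
# Minimal sketch: print the State and Diagnostics lines that the
# ResourceManager reports for the submitted application.
import subprocess
import time

APP_ID = "application_1422451252723_0002"  # id from the client output above

for _ in range(6):  # poll for about a minute
    report = subprocess.check_output(
        ["yarn", "application", "-status", APP_ID]).decode("utf-8", "replace")
    for line in report.splitlines():
        if "State" in line or "Diagnostics" in line:
            print(line.strip())
    time.sleep(10)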


Re: not able to run map reduce job example on aws machine

2014-04-10 Thread Kiran Dangeti
Rahul,

Please check the host:port value given in mapred-site.xml.
Thanks,
Kiran

On Thu, Apr 10, 2014 at 3:23 PM, Rahul Singh wrote:

>  Hi,
>   I am getting the following exception while running the word count example:
>
> 14/04/10 15:17:09 INFO mapreduce.Job: Task Id :
> attempt_1397123038665_0001_m_00_2, Status : FAILED
> Container launch failed for container_1397123038665_0001_01_04 :
> java.lang.IllegalArgumentException: Does not contain a valid host:port
> authority: poc_hadoop04:46162
> at org.apache.hadoop.net.NetUtils.createSocketAddr(NetUtils.java:211)
> at org.apache.hadoop.net.NetUtils.createSocketAddr(NetUtils.java:163)
> at org.apache.hadoop.net.NetUtils.createSocketAddr(NetUtils.java:152)
> at
> org.apache.hadoop.yarn.client.api.impl.ContainerManagementProtocolProxy$ContainerManagementProtocolProxyData.newProxy(ContainerManagementProtocolProxy.java:210)
> at
> org.apache.hadoop.yarn.client.api.impl.ContainerManagementProtocolProxy$ContainerManagementProtocolProxyData.(ContainerManagementProtocolProxy.java:196)
> at
> org.apache.hadoop.yarn.client.api.impl.ContainerManagementProtocolProxy.getProxy(ContainerManagementProtocolProxy.java:117)
> at
> org.apache.hadoop.mapreduce.v2.app.launcher.ContainerLauncherImpl.getCMProxy(ContainerLauncherImpl.java:403)
> at
> org.apache.hadoop.mapreduce.v2.app.launcher.ContainerLauncherImpl$Container.launch(ContainerLauncherImpl.java:138)
> at
> org.apache.hadoop.mapreduce.v2.app.launcher.ContainerLauncherImpl$EventProcessor.run(ContainerLauncherImpl.java:369)
> at
> java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
> at
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
> at java.lang.Thread.run(Thread.java:662)
>
>
> I have everything configured, with HDFS running: I am able to create
> files and directories. Running jps on my machine shows all components
> running.
>
> 10290 NameNode
> 10416 DataNode
> 10738 ResourceManager
> 11634 Jps
> 10584 SecondaryNameNode
> 10844 NodeManager
>
>
> Any pointers will be appreciated.
>
> Thanks and Regards,
> -Rahul Singh
>
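For what it's worth, the authority that fails to parse is poc_hadoop04:46162,
and the underscore in the hostname is the likely culprit: Hadoop rejects host
names that are not valid DNS names. A rough stand-alone approximation of that
check in Python (not Hadoop's actual code):

#!/usr/bin/env python
# Rough approximation of the host:port validation that fails above: the host
# part must look like a valid DNS name (letters, digits, hyphens, dots) and
# the port must be numeric. Underscores make the host invalid.
import re

LABEL = r"[A-Za-z0-9](?:[A-Za-z0-9-]*[A-Za-z0-9])?"
HOST_RE = re.compile(r"^%s(?:\.%s)*$" % (LABEL, LABEL))

def valid_authority(authority):
    host, sep, port = authority.rpartition(":")
    return bool(sep) and port.isdigit() and HOST_RE.match(host) is not None

print(valid_authority("poc_hadoop04:46162"))  # False: underscore in hostname
print(valid_authority("poc-hadoop04:46162"))  # True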


Re: Hadoop property precedence

2013-07-13 Thread Kiran Dangeti
Shalish,

The default block size is 64MB, and the value that takes effect is the one in
the client-side configuration, so make sure the same value is set in your conf
as well. You can increase the block size to 128MB or more; processing will be
faster with larger blocks, but losing a single block then means losing more
data.

Thanks,
Kiran
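
As far as I know, the block size actually used for a write is the one the
client requests, so it can also be overridden per command. A small sketch,
assuming the hadoop CLI is on the PATH and using hypothetical file paths
(the property is dfs.blocksize on 2.x; older releases use dfs.block.size):

#!/usr/bin/env python
# Minimal sketch: upload one file with an explicit 128 MB block size,
# overriding whatever default the client-side configuration would use.
import subprocess

subprocess.check_call([
    "hadoop", "fs",
    "-D", "dfs.blocksize=134217728",          # 128 MB
    "-put", "localfile.txt", "/user/shalish/localfile.txt",  # hypothetical paths
])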


On Fri, Jul 12, 2013 at 10:20 PM, Shalish VJ  wrote:

> Hi,
>
>
> Suppose the block size set in the configuration file on the client side is
> 64MB, the block size set in the configuration file on the namenode side is
> 128MB, and the block size set in the configuration file on the datanode side
> is something else.
> Please advise: if the client is writing a file to HDFS, which property
> would take effect?
>
> Thanks,
> Shalish.
>


Re: Issues Running Hadoop 1.1.2 on multi-node cluster

2013-07-09 Thread Kiran Dangeti
Hi Siddharth,

When running multi-node, we need to take care of the hosts entries on the
slave machines: from the error messages, the TaskTracker is not able to get
the system directory from the master. Please check this and rerun.

Thanks,
Kiran
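
A quick way to narrow this down is to check, from each slave, that the
JobTracker address configured in mapred-site.xml resolves and accepts
connections. A small sketch, assuming master:54311 as in the Michael Noll
tutorial referenced below:

#!/usr/bin/env python
# Minimal sketch, run on each slave: "Failed to get system directory"
# typically means the TaskTracker could not reach the JobTracker, so verify
# that the configured mapred.job.tracker host resolves and its port accepts
# connections.
import socket

JOBTRACKER_HOST = "master"   # value from mapred.job.tracker (tutorial default)
JOBTRACKER_PORT = 54311

try:
    print("resolves to", socket.gethostbyname(JOBTRACKER_HOST))
    with socket.create_connection((JOBTRACKER_HOST, JOBTRACKER_PORT), timeout=5):
        print("port %d reachable" % JOBTRACKER_PORT)
except OSError as exc:
    print("cannot reach %s:%d -> %s" % (JOBTRACKER_HOST, JOBTRACKER_PORT, exc))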


On Tue, Jul 9, 2013 at 10:26 PM, siddharth mathur wrote:

> Hi,
>
> I have installed Hadoop 1.1.2 on a 5-node cluster. I installed it
> following this tutorial:
> http://www.michael-noll.com/tutorials/running-hadoop-on-ubuntu-linux-multi-node-cluster/
>
> When I start up Hadoop, I get the following error in *all* the
> tasktrackers.
>
> "
> 2013-07-09 12:15:22,301 INFO org.apache.hadoop.mapred.UserLogCleaner:
> Adding job_201307051203_0001 for user-log deletion with
> retainTimeStamp:1373472921775
> 2013-07-09 12:15:22,301 INFO org.apache.hadoop.mapred.UserLogCleaner:
> Adding job_201307051611_0001 for user-log deletion with
> retainTimeStamp:1373472921775
> 2013-07-09 12:15:22,601 INFO org.apache.hadoop.mapred.TaskTracker: Failed to
> get system directory ...
> 2013-07-09 12:15:25,164 INFO org.apache.hadoop.mapred.TaskTracker: Failed
> to get system directory...
> 2013-07-09 12:15:27,901 INFO org.apache.hadoop.mapred.TaskTracker: Failed
> to get system directory...
> 2013-07-09 12:15:30,144 INFO org.apache.hadoop.mapred.TaskTracker: Failed
> to get system directory...
> "
>
> *But everything looks fine in the webUI. *
>
> When I run a job, I get the following error, but the job completes anyway.
> I have *attached the screenshots* of the failed map task's error log in
> the UI.
>
> *"*
> 13/07/09 12:29:37 INFO input.FileInputFormat: Total input paths to process
> : 2
> 13/07/09 12:29:37 INFO util.NativeCodeLoader: Loaded the native-hadoop
> library
> 13/07/09 12:29:37 WARN snappy.LoadSnappy: Snappy native library not loaded
> 13/07/09 12:29:37 INFO mapred.JobClient: Running job: job_201307091215_0001
> 13/07/09 12:29:38 INFO mapred.JobClient:  map 0% reduce 0%
> 13/07/09 12:29:41 INFO mapred.JobClient: Task Id :
> attempt_201307091215_0001_m_01_0, Status : FAILED
> Error initializing attempt_201307091215_0001_m_01_0:
> ENOENT: No such file or directory
> at org.apache.hadoop.io.nativeio.NativeIO.chmod(Native Method)
> at org.apache.hadoop.fs.FileUtil.execSetPermission(FileUtil.java:699)
> at org.apache.hadoop.fs.FileUtil.setPermission(FileUtil.java:654)
> at
> org.apache.hadoop.fs.RawLocalFileSystem.setPermission(RawLocalFileSystem.java:509)
> at
> org.apache.hadoop.fs.RawLocalFileSystem.mkdirs(RawLocalFileSystem.java:344)
> at
> org.apache.hadoop.mapred.JobLocalizer.initializeJobLogDir(JobLocalizer.java:240)
> at
> org.apache.hadoop.mapred.DefaultTaskController.initializeJob(DefaultTaskController.java:205)
> at org.apache.hadoop.mapred.TaskTracker$4.run(TaskTracker.java:1331)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:415)
> at
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1149)
> at
> org.apache.hadoop.mapred.TaskTracker.initializeJob(TaskTracker.java:1306)
> at
> org.apache.hadoop.mapred.TaskTracker.localizeJob(TaskTracker.java:1221)
> at org.apache.hadoop.mapred.TaskTracker$5.run(TaskTracker.java:2581)
> at java.lang.Thread.run(Thread.java:724)
>
> 13/07/09 12:29:41 WARN mapred.JobClient: Error reading task output
> http://dmkd-1:50060/tasklog?plaintext=true&attemptid=attempt_201307091215_0001_m_01_0&filter=stdout
> 13/07/09 12:29:41 WARN mapred.JobClient: Error reading task output
> http://dmkd-1:50060/tasklog?plaintext=true&attemptid=attempt_201307091215_0001_m_01_0&filter=stderr
> 13/07/09 12:29:45 INFO mapred.JobClient:  map 50% reduce 0%
> 13/07/09 12:29:53 INFO mapred.JobClient:  map 50% reduce 16%
> 13/07/09 12:30:38 INFO mapred.JobClient: Task Id :
> attempt_201307091215_0001_m_00_1, Status : FAILED
> Error initializing attempt_201307091215_0001_m_00_1:
> ENOENT: No such file or directory
> at org.apache.hadoop.io.nativeio.NativeIO.chmod(Native Method)
> at org.apache.hadoop.fs.FileUtil.execSetPermission(FileUtil.java:699)
> at org.apache.hadoop.fs.FileUtil.setPermission(FileUtil.java:654)
> at
> org.apache.hadoop.fs.RawLocalFileSystem.setPermission(RawLocalFileSystem.java:509)
> at
> org.apache.hadoop.fs.RawLocalFileSystem.mkdirs(RawLocalFileSystem.java:344)
> at
> org.apache.hadoop.mapred.JobLocalizer.initializeJobLogDir(JobLocalizer.java:240)
> at
> org.apache.hadoop.mapred.DefaultTaskController.initializeJob(DefaultTaskController.java:205)
> at org.apache.hadoop.mapred.TaskTracker$4.run(TaskTracker.java:1331)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:415)
> at
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1149)
> at
> org.apache.hadoop.mapred.TaskTracker.initializeJob(TaskTracker.java:1306)
> at
> org.apac