Can we use Kryo instead of WritableSerialization to serialize objects in Hadoop?

2014-04-22 Thread Tao Xiao
The default serialization implementation in Hadoop is *org.apache.hadoop.io.serializer.WritableSerialization*, but it is not as efficient as *Kryo*. Can we replace WritableSerialization with Kryo? How can this be done?
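
Hadoop selects serializers through the io.serializations property, so plugging in Kryo amounts to writing a class that implements org.apache.hadoop.io.serializer.Serialization and adding it to that list. Below is a minimal sketch, assuming a Kryo 2.x-style API is on the classpath; the class name KryoSerialization and the package are illustrative, not a built-in Hadoop class.

    import java.io.*;
    import com.esotericsoftware.kryo.Kryo;
    import com.esotericsoftware.kryo.io.Input;
    import com.esotericsoftware.kryo.io.Output;
    import org.apache.hadoop.io.serializer.*;

    // Hypothetical Kryo-backed Serialization plug-in.
    public class KryoSerialization<T> implements Serialization<T> {
      // In practice you would restrict accept() to the classes you want Kryo to
      // handle; anything listed earlier in io.serializations still wins.
      public boolean accept(Class<?> c) { return true; }

      public Serializer<T> getSerializer(final Class<T> c) {
        return new Serializer<T>() {
          private final Kryo kryo = new Kryo();
          private Output out;
          public void open(OutputStream os) { out = new Output(os); }
          public void serialize(T t) { kryo.writeObject(out, t); out.flush(); }
          public void close() throws IOException { out.close(); }
        };
      }

      public Deserializer<T> getDeserializer(final Class<T> c) {
        return new Deserializer<T>() {
          private final Kryo kryo = new Kryo();
          private Input in;
          public void open(InputStream is) { in = new Input(is); }
          public T deserialize(T t) { return kryo.readObject(in, c); }
          public void close() throws IOException { in.close(); }
        };
      }
    }

The class is then registered alongside the default, for example programmatically (the package com.example is a placeholder):

    conf.set("io.serializations",
        "org.apache.hadoop.io.serializer.WritableSerialization,"
        + "com.example.KryoSerialization");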

Re: Which version to learn?

2014-04-22 Thread Fengyun RAO
Try CDH5. 2014-04-21 14:11 GMT+08:00 老赵 : > Hello, I am new to Hadoop and am currently learning hadoop-1.2.1, > but the stable version is 2.2.0. I want to find a Hadoop-related job. > Which one should I focus on mastering? > Thank you all.

Re: Can we use Kryo instead of WritableSerialization to serialize objects in Hadoop?

2014-04-22 Thread niksheibm
Kryo may have some limitations, such as weaker support for field metadata. -- Tuesday, 22 April 2014, 07:16 PM +0800 from Tao Xiao : The default serialization implementation in Hadoop is org.apache.hadoop.io.serializer.WritableSerialization, but it is not as efficient as Kryo. Can we rep

Strange error in Hadoop 2.2.0: FileNotFoundException: file:/tmp/hadoop-hadoop/mapred/

2014-04-22 Thread Natalia Connolly
Hello, I am running Hadoop 2.2.0 in a single-node "cluster" mode. My application dies with the following strange error: Caused by: java.lang.RuntimeException: java.io.FileNotFoundException: file:/tmp/hadoop-hadoop/mapred/local/1398179594286/part-0 (No such file or directory) This looks

Re: Strange error in Hadoop 2.2.0: FileNotFoundException: file:/tmp/hadoop-hadoop/mapred/

2014-04-22 Thread Jay Vyas
Is this happening in the job client? or the mappers? On Tue, Apr 22, 2014 at 11:21 AM, Natalia Connolly < natalia.v.conno...@gmail.com> wrote: > Hello, > > I am running Hadoop 2.2.0 in a single-node "cluster" mode. My > application dies with the following strange error: > > Caused by: java.la

Re: Strange error in Hadoop 2.2.0: FileNotFoundException: file:/tmp/hadoop-hadoop/mapred/

2014-04-22 Thread Natalia Connolly
Hi Jay, I am really not sure how to answer this question. Here is the full error: 14/04/22 11:31:02 INFO mapred.LocalJobRunner: Map task executor complete. 14/04/22 11:31:02 WARN mapred.LocalJobRunner: job_local607122693_0003 java.lang.Exception: java.lang.RuntimeException: Error in configur

Re: Warning: $HADOOP_HOME is deprecated

2014-04-22 Thread saurabh chhajed
I think this should give you an answer - http://mail-archives.apache.org/mod_mbox/hadoop-common-user/201202.mbox/%3ccb4ecc21.33727%25ev...@yahoo-inc.com%3E On Wed, Apr 16, 2014 at 12:04 AM, Radhe Radhe wrote: > Hello All, > > I have configured Apache Hadoop 1.2.0 and set the $HADOOP_HOME env. >

Network partitions and Failover Times

2014-04-22 Thread Paul K. Harter, Jr.
I am trying to understand the mechanisms and timing involved when Hadoop is faced with a network partition. Suppose we have a large Hadoop cluster configured with automatic failover: 1) Active Name node 2) Standby NameNode 3) Quorum journal nodes (which we'll ignore for now)
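
Roughly speaking, automatic failover in this setup is driven by the ZKFCs' health monitoring plus the ZooKeeper session timeout, so most of the failover-time tuning lives in the properties below. This is only a sketch with made-up nameservice and host names; the property keys are the standard HDFS HA keys, and in practice they belong in hdfs-site.xml / core-site.xml rather than code.

    import org.apache.hadoop.conf.Configuration;

    public class HaFailoverConfSketch {
      public static void main(String[] args) {
        Configuration conf = new Configuration();
        conf.set("dfs.nameservices", "mycluster");                   // hypothetical nameservice
        conf.set("dfs.ha.namenodes.mycluster", "nn1,nn2");
        conf.set("dfs.namenode.rpc-address.mycluster.nn1", "nn1-host:8020");
        conf.set("dfs.namenode.rpc-address.mycluster.nn2", "nn2-host:8020");
        conf.setBoolean("dfs.ha.automatic-failover.enabled", true);  // enables the ZKFCs
        conf.set("ha.zookeeper.quorum", "zk1:2181,zk2:2181,zk3:2181");
        // Detection of a partitioned active NN is roughly bounded by the ZK
        // session timeout plus fencing time; this key controls the former.
        conf.setInt("ha.zookeeper.session-timeout.ms", 5000);
        conf.set("dfs.ha.fencing.methods", "sshfence\nshell(/bin/true)");
      }
    }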

Re: Can we use Kryo instead of WritableSerialization to serialize objects in Hadoop?

2014-04-22 Thread Tao Xiao
So Kryo is not good enough to replace *org.apache.hadoop.io.serializer.WritableSerialization*, Hadoop's default serialization implementation? Is any other serialization implementation better than *WritableSerialization*? 2014-04-22 19:24 GMT+08:00 : > Kryo may have some limitations, such as less
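
One alternative worth noting: Hadoop itself ships Avro-based serializations (org.apache.hadoop.io.serializer.avro.AvroSpecificSerialization and AvroReflectSerialization), which can be made available by listing them in io.serializations; recent 2.x releases may already include them in core-default.xml. A short sketch of the programmatic form:

    import org.apache.hadoop.conf.Configuration;

    // Sketch: make the Avro serializations visible to the SerializationFactory.
    // Setting them explicitly is harmless even if core-default.xml already lists them.
    Configuration conf = new Configuration();
    conf.setStrings("io.serializations",
        "org.apache.hadoop.io.serializer.WritableSerialization",
        "org.apache.hadoop.io.serializer.avro.AvroSpecificSerialization",
        "org.apache.hadoop.io.serializer.avro.AvroReflectSerialization");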

Map executes twice

2014-04-22 Thread EdwardKing
I use Hadoop 2.2.0. I understand that Hadoop executes the map phase first; when map reaches 100%, it executes reduce, and after reduce reaches 100% the job ends. When I run a job, the map goes from 0% to 100% and then goes from 0% to 100% again. Why does map execute twice? Thanks. Hadoop job information for Stage-1: number of mapper

Re: Strange error in Hadoop 2.2.0: FileNotFoundException: file:/tmp/hadoop-hadoop/mapred/

2014-04-22 Thread Shumin Guo
Can you list the file using Hadoop commands? For example, hadoop fs -ls ...? On Tue, Apr 22, 2014 at 10:32 AM, Natalia Connolly < natalia.v.conno...@gmail.com> wrote: > Hi Jay, > >I am really not sure how to answer this question. Here is the full > error: > > 14/04/22 11:31:02 INFO mapred
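
If listing from the shell is awkward, the same check can be done from Java against whatever file system the job configuration resolves to. A small sketch, using the path from the stack trace above:

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    public class CheckLocalMapredFile {
      public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // In LocalJobRunner mode the default FS is file:///, so this resolves to
        // the local disk; on a real cluster it would resolve to HDFS.
        FileSystem fs = FileSystem.get(conf);
        Path p = new Path("/tmp/hadoop-hadoop/mapred/local/1398179594286/part-0");
        System.out.println(fs.getUri() + " exists=" + fs.exists(p));
      }
    }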

Re: analyzing s3 data

2014-04-22 Thread Shumin Guo
You can configure your Hadoop cluster to use S3 as the file system. Everything else should be the same as for HDFS. On Mon, Apr 21, 2014 at 7:21 AM, kishore alajangi wrote: > > Hi Experts, > > We are running four node cluster which is installed cdh4.5 with cm4.8, We > have large size files in zip
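
A minimal sketch of what that might look like with the s3n connector of that era's Hadoop; the bucket name and keys are placeholders, and in practice these properties go into core-site.xml rather than code.

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileStatus;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    public class S3AsDefaultFs {
      public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        conf.set("fs.defaultFS", "s3n://my-bucket");          // fs.default.name on older releases
        conf.set("fs.s3n.awsAccessKeyId", "YOUR_ACCESS_KEY");
        conf.set("fs.s3n.awsSecretAccessKey", "YOUR_SECRET_KEY");
        // After this, MapReduce input/output paths can point straight at the bucket.
        FileSystem fs = FileSystem.get(conf);
        for (FileStatus st : fs.listStatus(new Path("/"))) {
          System.out.println(st.getPath());
        }
      }
    }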

Re: All datanodes are bad. Aborting ...

2014-04-22 Thread Shumin Guo
Did you do fsck? And what's the result? On Sun, Apr 20, 2014 at 12:14 PM, Amit Kabra wrote: > 1) ulimit -a > > core file size (blocks, -c) 0 > data seg size (kbytes, -d) unlimited > scheduling priority (-e) 0 > file size (blocks, -f) unlimited > pendi

Re: Stuck Job - how should I troubleshoot?

2014-04-22 Thread Shumin Guo
Since the last map task is in the pending state, it is possible that some issue is occurring within your cluster, for example not enough memory, a deadlock, a data problem, etc. You can kill this map task manually and see whether the problem goes away. On Sun, Apr 20, 2014 at 9:46 AM, Serge Blazhievsky wrot

Re: Problem with jobtracker hadoop 1.2

2014-04-22 Thread Shumin Guo
It seems you are using the local FS rather than HDFS. You need to make sure your HDFS cluster is up and running. On Thu, Apr 17, 2014 at 6:42 PM, Shengjun Xin wrote: > Did you start datanode service? > > > On Thu, Apr 17, 2014 at 9:23 PM, Karim Awara wrote: > >> Hi, >> >> Whenever I start the h

Re: Task or job tracker seems not working?

2014-04-22 Thread Shumin Guo
The error message indicates that you are using the local FS rather than HDFS, so you need to make sure your HDFS cluster is up and running before submitting any MapReduce jobs. For example, you can use fsck or other HDFS commands to check whether the cluster is healthy. On Thu, Apr 17, 2014 at 8:51 AM, K
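
A quick way to confirm which file system a job actually picks up is to print what the Configuration resolves to; if it shows file:/// then jobs are running against the local FS, as described above. A small sketch:

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileSystem;

    public class WhichFileSystem {
      public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // Prints e.g. hdfs://namenode:8020 when core-site.xml is on the classpath
        // and points at HDFS, or file:/// when it falls back to the local FS.
        System.out.println(FileSystem.get(conf).getUri());
      }
    }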

Differences between HistoryServer and Yarn TimeLine server?

2014-04-22 Thread sam liu
Hi Experts, I am confused about these two concepts. Could you help explain the differences? Thanks!

Re: All datanodes are bad. Aborting ...

2014-04-22 Thread Amit Kabra
fsck showed the cluster to be healthy at that time. On Wed, Apr 23, 2014 at 8:25 AM, Shumin Guo wrote: > Did you do fsck? And what's the result? > > > On Sun, Apr 20, 2014 at 12:14 PM, Amit Kabra > wrote: >> >> 1) ulimit -a >> >> core file size (blocks, -c) 0 >> data seg size (kbytes

Re: Differences between HistoryServer and Yarn TimeLine server?

2014-04-22 Thread Zhijie Shen
In Hadoop 2.4, we have delivered the Timeline Server at a preview stage; it can serve generic YARN application history as well as framework-specific information. Due to development logistics, we have created two concepts: the History Server and the Timeline Server. To be simple
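
Operationally, the two are easiest to tell apart by their separate property families: the MapReduce JobHistoryServer is configured in mapred-site.xml, the YARN Timeline Server in yarn-site.xml. A sketch of the usual keys (host names are placeholders):

    import org.apache.hadoop.conf.Configuration;

    public class HistoryVsTimelineConf {
      public static void main(String[] args) {
        Configuration conf = new Configuration();
        // MapReduce JobHistoryServer: per-job MR history.
        conf.set("mapreduce.jobhistory.address", "history-host:10020");
        conf.set("mapreduce.jobhistory.webapp.address", "history-host:19888");
        // YARN Timeline Server (preview in Hadoop 2.4): generic YARN application
        // history plus framework-specific data.
        conf.setBoolean("yarn.timeline-service.enabled", true);
        conf.set("yarn.timeline-service.hostname", "timeline-host");
      }
    }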