Hadoop cluster setup problems

2013-12-31 Thread Karthik K
Hi, I am Karthik from India. We have been working on a temperature-aware YARN scheduler, where we want to stop the scheduler from awarding new jobs to a node if its temperature crosses a certain threshold. We figured out a simple way to do so, where we set the health checker script, and set it in yar
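YARN already supports this pattern through the NodeManager health checker: if the configured script prints a line starting with ERROR, the node is reported unhealthy and the scheduler stops assigning containers to it. A minimal sketch, where the thermal sysfs path and the 75 C threshold are assumptions for illustration:

```xml
<!-- yarn-site.xml: point the NodeManager at the health script -->
<property>
  <name>yarn.nodemanager.health-checker.script.path</name>
  <value>/etc/hadoop/temp-health-check.sh</value>
</property>
<property>
  <name>yarn.nodemanager.health-checker.interval-ms</name>
  <value>60000</value>
</property>
```

```shell
#!/bin/bash
# /etc/hadoop/temp-health-check.sh (hypothetical path)
# Reports the node unhealthy when CPU temperature exceeds a threshold.
THRESHOLD=75000   # millidegrees Celsius, i.e. 75 C (assumed)
TEMP_FILE=/sys/class/thermal/thermal_zone0/temp   # Linux sysfs, varies by hardware
if [ -r "$TEMP_FILE" ]; then
  TEMP=$(cat "$TEMP_FILE")
  if [ "$TEMP" -gt "$THRESHOLD" ]; then
    # Any output line beginning with ERROR marks this node unhealthy
    echo "ERROR node temperature $TEMP exceeds threshold $THRESHOLD"
    exit 0
  fi
fi
echo "OK"
```

The script must exit 0 in both cases; only the ERROR prefix on stdout signals an unhealthy node.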

Re: any suggestions on IIS log storage and analysis?

2013-12-31 Thread Peyman Mohajerian
You can run a series of map-reduce jobs on your data. If some log line is related to another line, e.g. based on sessionId, you can emit the sessionId as the key of your mapper output, with the value being the rows associated with the sessionId, so on the reducer side data from different blocks w
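The grouping effect Peyman describes (all lines with one sessionId meeting at one reducer) can be simulated outside Hadoop; a minimal Python sketch, where the log format and field positions are assumptions:

```python
from itertools import groupby
from operator import itemgetter

# Toy IIS-style log lines: sessionId as the first field (an assumption).
log_lines = [
    "sess1 2013-12-31T10:00 /index.html",
    "sess2 2013-12-31T10:01 /login",
    "sess1 2013-12-31T10:02 /cart",
]

# Map phase: emit (sessionId, line) pairs.
mapped = [(line.split(" ", 1)[0], line) for line in log_lines]

# Shuffle: sort by key, so lines from different blocks meet at one reducer.
mapped.sort(key=itemgetter(0))

# Reduce phase: all lines for one session arrive together.
sessions = {key: [v for _, v in group]
            for key, group in groupby(mapped, key=itemgetter(0))}

print(sessions["sess1"])  # both sess1 lines, regardless of original position
```

In a real job the sort/merge is done by the framework between map and reduce; the dictionary comprehension here stands in for the reducer.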

Re: Hadoop vs Ceph and GlusterFS

2013-12-31 Thread Chris Embree
Ceph and GlusterFS are NOT centralized file systems. GlusterFS can be used with Hadoop map-reduce, but it requires a special plug-in, and HDFS 2 can be HA, so it's probably not worth switching. YMMV. On Dec 31, 2013 4:01 PM, "Jiayu Ji" wrote: > I am not very familiar with Ceph and GlusterFS, b

Secondary name node error in checkpoint with Kerberos enabled

2013-12-31 Thread Manoj Samel
* Name node and secondary name node on different machines
* Kerberos was just enabled
* Cloudera CDH 4.5 on CentOS

*Secondary name node log (HOST2) shows following*
2013-12-31 22:00:11,728 ERROR org.apache.hadoop.security.UserGroupInformation: PriviledgedActionException as:hdfs/<2NN-host>@ (aut

Re: Map succeeds but reduce hangs

2013-12-31 Thread Hardik Pandya
As expected, it's failing during shuffle. It seems like HDFS could not resolve the DNS name for the slave nodes. Have you configured your slave host names correctly? 2013-12-31 14:27:54,207 INFO org.apache.hadoop.mapred.TaskInProgress: Error from attempt_201312311107_0003_r_00_0: Shuffle Error: E
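One common setup that avoids shuffle-time resolution failures is to give every node identical hostname mappings and list the slaves by those names. A sketch, where all hostnames and addresses are placeholders, not taken from the thread:

```shell
# /etc/hosts, identical on every node (addresses are illustrative)
192.168.1.10  master
192.168.1.11  slave1
192.168.1.12  slave2
192.168.1.13  slave3

# $HADOOP_HOME/conf/slaves on the master: one slave hostname per line
slave1
slave2
slave3
```

Each name here must resolve the same way from every node, since reducers fetch map output directly from the tasktrackers by hostname.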

Re: Map succeeds but reduce hangs

2013-12-31 Thread navaz
Hi, my hdfs-site is configured for 4 nodes (one master and 3 slaves), with dfs.replication set to 4. start-dfs.sh and stop-mapred.sh doesn't solve the problem. I also tried running the program after formatting the namenode (master), which also fails. My jobtracker logs on the master (name node) are give be

Re: block replication

2013-12-31 Thread Hardik Pandya
dfs.heartbeat.interval (value: 3) determines the datanode heartbeat interval in seconds, and maybe you are looking for dfs.namenode.stale.datanode.interval: the default time interval for marking a datanode as "stale", i.e., if the namenode has not received heartbeat msg fro
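Written out as hdfs-site.xml properties, the two settings look like this; the values shown are the usual defaults (3 seconds and 30000 ms respectively), stated here as an assumption rather than taken from the thread:

```xml
<!-- hdfs-site.xml (illustrative values) -->
<property>
  <name>dfs.heartbeat.interval</name>
  <value>3</value> <!-- datanode heartbeat interval, in seconds -->
</property>
<property>
  <name>dfs.namenode.stale.datanode.interval</name>
  <value>30000</value> <!-- ms without a heartbeat before the namenode marks a datanode stale -->
</property>
```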

Re: Hadoop vs Ceph and GlusterFS

2013-12-31 Thread Jiayu Ji
I am not very familiar with Ceph and GlusterFS, but I know they are centralized file systems. In these kinds of FS, the compute nodes and the storage nodes are separated. If the size of your data increases, the network may eventually become the bottleneck. Hadoop is a framework that includes storage (HDFS)

Re: Map succeeds but reduce hangs

2013-12-31 Thread Hardik Pandya
What does your job log say? Is your hdfs-site configured properly to find 3 data nodes? This could very well be getting stuck in the shuffle phase. Last thing to try: does stop-all and start-all help? Even worse, try formatting the namenode. On Tue, Dec 31, 2013 at 11:40 AM, navaz wrote: > Hi > > > I am

Map succeeds but reduce hangs

2013-12-31 Thread navaz
Hi, I am running a Hadoop cluster with 1 name node and 3 data nodes. My HDFS looks like this: hduser@nm:/usr/local/hadoop$ hadoop fs -ls /user/hduser/getty/gutenberg Warning: $HADOOP_HOME is deprecated. Found 7 items -rw-r--r-- 4 hduser supergroup 343691 2013-12-30 19:12 /user/hduser/getty/

Re: Write an object into hadoop hdfs issue

2013-12-31 Thread unmesha sreeveni
Is there a way to store the same object? On Mon, Dec 30, 2013 at 7:05 PM, Chris Mawata wrote: > Not unique to hdfs. The same thing would happen on your local file system > or anywhere and any way you store the state of the object outside of the > JVM. That is why singletons should not be serial
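Chris's point about object identity can be shown with any serializer: the state round-trips, but the restored value is a new object, never "the same object". A minimal Python sketch using pickle (the Config class is a made-up stand-in for whatever is being written to HDFS):

```python
import pickle

class Config:
    """Toy stand-in for the object being written out to HDFS."""
    def __init__(self, value):
        self.value = value

original = Config(42)
blob = pickle.dumps(original)   # what you would write to a file or HDFS
restored = pickle.loads(blob)   # what a later read gives back

print(restored.value)           # state survives: 42
print(restored is original)     # identity does not: False
```

This is exactly why serializing a singleton is problematic: every deserialization manufactures a fresh instance, breaking the "only one instance" guarantee.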

Re: Error: Java Heap space

2013-12-31 Thread Dieter De Witte
Is it happening in the map or reduce phase, and are you allocating anything in your mappers/reducers? For example, if you are holding a collection in one of them, this might be causing the heap error. Also, what are the specs of your nodes? How many concurrent map and reduce tasks per tasktracker,... We ca
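The two knobs Dieter is asking about are set in mapred-site.xml. A sketch using the Hadoop 1.x property names that match the tasktracker-era setup in this thread; the values are illustrative, not recommendations:

```xml
<!-- mapred-site.xml (Hadoop 1.x names; newer versions use mapreduce.* equivalents) -->
<property>
  <name>mapred.child.java.opts</name>
  <value>-Xmx1024m</value> <!-- max heap for each map/reduce task JVM -->
</property>
<property>
  <name>mapred.tasktracker.map.tasks.maximum</name>
  <value>2</value> <!-- concurrent map slots per tasktracker -->
</property>
<property>
  <name>mapred.tasktracker.reduce.tasks.maximum</name>
  <value>2</value> <!-- concurrent reduce slots per tasktracker -->
</property>
```

Total memory demand is roughly (map slots + reduce slots) times the child heap, so these settings have to be sized together against the node's physical RAM.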