That will break the consistency of the file system, but it doesn't hurt to
try.
On Jul 17, 2014 8:48 PM, Zesheng Wu wuzeshen...@gmail.com wrote:
How about writing a new block with a new checksum file, and replacing both the
old block file and the old checksum file?
2014-07-17 19:34 GMT+08:00 Wellington
If the datanode is dead, its block replicas will be re-replicated on other
datanodes automatically. So, after a while, when those blocks have been
re-replicated, you should be able to delete the data on the dead node and
restart it. Here, I am assuming your cluster has enough space on other
live
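Once re-replication has had time to run, a minimal sketch like the following
(the file path is a placeholder) lists which datanodes currently hold each
block, so you can confirm the replicas have moved off the dead node:

import java.util.Arrays;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.BlockLocation;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class BlockLocationCheck {
  public static void main(String[] args) throws Exception {
    FileSystem fs = FileSystem.get(new Configuration());
    // The path below is a placeholder; point it at one of your files.
    FileStatus status = fs.getFileStatus(new Path("/user/test/data.txt"));
    BlockLocation[] blocks = fs.getFileBlockLocations(status, 0, status.getLen());
    for (BlockLocation block : blocks) {
      // Prints the datanode hostnames holding replicas of this block.
      System.out.println(Arrays.toString(block.getHosts()));
    }
  }
}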
Can you list the file using hadoop commands? For example, hadoop fs -ls
...?
On Tue, Apr 22, 2014 at 10:32 AM, Natalia Connolly
natalia.v.conno...@gmail.com wrote:
Hi Jay,
I am really not sure how to answer this question. Here is the full
error:
14/04/22 11:31:02 INFO
You can configure your hadoop cluster to use s3 as the file system.
Everything else should be the same as for HDFS.
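For example, a rough sketch of a client pointed at S3 instead of HDFS (the
bucket name and credentials are placeholders, and it assumes the older s3n
connector is on the classpath):

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class S3AsDefaultFs {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    // Use an S3 bucket instead of HDFS as the default file system.
    conf.set("fs.defaultFS", "s3n://my-bucket");
    conf.set("fs.s3n.awsAccessKeyId", "YOUR_ACCESS_KEY");
    conf.set("fs.s3n.awsSecretAccessKey", "YOUR_SECRET_KEY");

    FileSystem fs = FileSystem.get(conf);
    // List the bucket root to confirm the file system is reachable.
    for (FileStatus status : fs.listStatus(new Path("/"))) {
      System.out.println(status.getPath());
    }
  }
}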
On Mon, Apr 21, 2014 at 7:21 AM, kishore alajangi alajangikish...@gmail.com
wrote:
Hi Experts,
We are running a four-node cluster installed with CDH 4.5 and CM 4.8. We
have
Did you run fsck? And what was the result?
On Sun, Apr 20, 2014 at 12:14 PM, Amit Kabra amitkabrai...@gmail.com wrote:
1) ulimit -a
core file size (blocks, -c) 0
data seg size (kbytes, -d) unlimited
scheduling priority (-e) 0
file size (blocks, -f)
Since the last map task is stuck in the pending state, it is possible that
some issue is happening within your cluster, for example, not enough memory,
a deadlock, a data problem, etc. You can kill this map task manually and see
whether the problem goes away.
On Sun, Apr 20, 2014 at 9:46 AM, Serge Blazhievsky
It seems you are using the local FS rather than HDFS. You need to make sure
your hdfs cluster is up and running.
On Thu, Apr 17, 2014 at 6:42 PM, Shengjun Xin s...@gopivotal.com wrote:
Did you start the datanode service?
On Thu, Apr 17, 2014 at 9:23 PM, Karim Awara
The error message tells you that you are using the local FS rather than HDFS.
So, you need to make sure your HDFS cluster is up and running before running
any mapreduce jobs. For example, you can use fsck or other hdfs commands to
test whether the HDFS cluster is running ok.
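You can also check it from code. A quick sketch (assuming the cluster's
configuration files are on the client's classpath) that prints which file
system the client actually resolves:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;

public class CheckDefaultFs {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    FileSystem fs = FileSystem.get(conf);
    // If this prints file:/// instead of hdfs://<namenode>:<port>, the job
    // is picking up the local file system, not your HDFS cluster.
    System.out.println("Default file system: " + fs.getUri());
  }
}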
On Thu, Apr 17, 2014 at 8:51 AM,
what is your input data like?
On Apr 2, 2014 10:16 AM, ei09072 ei09...@fe.up.pt wrote:
After installing Hadoop 2.3.0 on Windows 8, I tried to run the wordcount
example given. However, I get the following error:
c:\hadoop\bin\yarn jar share/hadoop/mapreduce/hadoop-
mapreduce-examples-2.3.0.ja
Dear subscribers,
My name is Shumin Guo. I am the author of the Hadoop management book Hadoop
Operations and Cluster Management Cookbook:
http://www.amazon.com/Hadoop-Operations-Cluster-Management-Cookbook/dp/1782165169/
If you want to do profiling on your hadoop cluster, the starfish project
might be interesting. You can find more info
http://www.cs.duke.edu/starfish/
Thanks,
Shumin
On Feb 25, 2014 3:31 PM, Thomas Bentsen t...@bentzn.com wrote:
Thanks a lot guys!
From Dieter's original reply I got TeraSort
The value should be hdfs://localhost:port
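For example, a small client sketch (the host, port, and file path below are
placeholders) that reads a file through an explicit hdfs:// URI of that form:

import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.net.URI;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class HdfsReadClient {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    // Same value you would put in fs.defaultFS / fs.default.name.
    FileSystem fs = FileSystem.get(URI.create("hdfs://localhost:9000"), conf);
    try (BufferedReader reader = new BufferedReader(
        new InputStreamReader(fs.open(new Path("/user/test/input.txt"))))) {
      String line;
      while ((line = reader.readLine()) != null) {
        System.out.println(line);
      }
    }
  }
}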
On Feb 24, 2014 6:37 AM, Chirag Dewan chirag.de...@ericsson.com wrote:
Hi All,
I am new to hadoop. I am using hadoop 2.2.0. I have a simple client code
which reads a file from HDFS on a single node cluster. Now when I run my
code using java -jar
You can extend FileInputFormat and make it non-splittable by overriding
isSplitable() to return false. More info is in the java doc:
https://hadoop.apache.org/docs/r2.2.0/api/org/apache/hadoop/mapred/FileInputFormat.html
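For example, a minimal sketch using the newer mapreduce API (the older mapred
API has an equivalent isSplitable method):

import org.apache.hadoop.fs.Path;
import org.apache.hadoop.mapreduce.JobContext;
import org.apache.hadoop.mapreduce.lib.input.TextInputFormat;

// Each input file becomes exactly one split, so one mapper sees the whole file.
public class NonSplittableTextInputFormat extends TextInputFormat {
  @Override
  protected boolean isSplitable(JobContext context, Path file) {
    return false;
  }
}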
Shumin
On Feb 25, 2014 10:56 AM, java8964 java8...@hotmail.com wrote:
See my reply for another email today for
I think the client-side configuration will take effect.
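For illustration, a rough sketch (the property name and file path are what I
would try, not something I have verified on your versions) that requests a
64MB block size on the client when writing a file:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class ClientBlockSize {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    // Ask for 64 MB blocks on the client side, whatever the namenode default is.
    conf.setLong("dfs.blocksize", 64L * 1024 * 1024);

    FileSystem fs = FileSystem.get(conf);
    try (FSDataOutputStream out = fs.create(new Path("/tmp/blocksize-test.txt"))) {
      out.writeUTF("written with a 64 MB client-side block size");
    }
    // Report the block size the file was actually created with.
    System.out.println("block size used: "
        + fs.getFileStatus(new Path("/tmp/blocksize-test.txt")).getBlockSize());
  }
}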
Shumin
On Jul 12, 2013 11:50 AM, Shalish VJ shalis...@yahoo.com wrote:
Hi,
Suppose the block size set in the configuration file on the client side is
64MB, the block size set in the configuration file on the namenode side is
128MB, and the block size set in
You also need to pay attention to the split boundary, because you don't want
to split one line across different mappers. Maybe you can think about a
multi-line input format.
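For example, a rough sketch using NLineInputFormat (the input path and line
count are made up), which splits on line boundaries so each mapper always
gets complete lines:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.NLineInputFormat;

public class MultiLineInputJob {
  public static void main(String[] args) throws Exception {
    Job job = Job.getInstance(new Configuration(), "multi-line input");
    job.setInputFormatClass(NLineInputFormat.class);
    // Each split gets 1000 complete lines, so no line is cut between mappers.
    NLineInputFormat.setNumLinesPerSplit(job, 1000);
    NLineInputFormat.addInputPath(job, new Path("/data/input"));
    // ... set the mapper, reducer, output path, and submit as usual ...
  }
}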
Simon.
On Jul 6, 2013 10:18 AM, Sanjay Subramanian
sanjay.subraman...@wizecommerce.com wrote:
More mappers will make it
Yes, I agree with Bertrand. Hadoop can take a whole file as input; you just
put your compression code into the map method and use the identity reduce
function, which simply writes your compressed data onto HDFS using the file
output format.
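A rough per-record sketch of that idea (the class and output key are made
up); paired with a non-splittable, whole-file input format so a mapper sees
an entire file, the default identity reducer just passes the compressed
bytes through to the output format:

import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.util.zip.GZIPOutputStream;
import org.apache.hadoop.io.BytesWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

public class CompressingMapper extends Mapper<Object, Text, Text, BytesWritable> {
  @Override
  protected void map(Object key, Text value, Context context)
      throws IOException, InterruptedException {
    // Gzip the incoming record's bytes in memory.
    ByteArrayOutputStream buffer = new ByteArrayOutputStream();
    try (GZIPOutputStream gzip = new GZIPOutputStream(buffer)) {
      gzip.write(value.getBytes(), 0, value.getLength());
    }
    // With no reducer class set, the default identity reducer writes these
    // compressed records straight to the job's file output format.
    context.write(new Text("compressed"), new BytesWritable(buffer.toByteArray()));
  }
}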
Thanks,
On Thu, Mar 7, 2013 at 7:35 AM, Bertrand
You can also try the following two commands:
1, hadoop job -status job-id
For example:
hadoop job -status job_201303021057_0004
I will get the following output:
Job: job_201303021057_0004
file:
hdfs://master:54310/user/ec2-user/.staging/job_201303021057_0004/job.xml
tracking URL:
Nitin is right. The Hadoop JobTracker will schedule a job based on the data
block locations and the computing power of the nodes.
Based on the number of data blocks, the JobTracker will split a job into
map tasks. Optimally, map tasks should be scheduled on nodes with local
data. And also because
I used to have a similar problem. It looks like there is a recursive folder
creation bug. How about you try removing the srcData from the dst, for
example using the following command:
*hadoop fs -cp s3n://acessKey:acesssec...@bucket.name/srcData /test/*
Or with distcp:
*hadoop distcp
To decommission a live datanode from the cluster, you can do the following
steps:
1, edit configuration file $HADOOP_HOME/conf/hdfs-site.xml, and add the
following property:
<property>
  <name>dfs.hosts.exclude</name>
  <value>$HADOOP_HOME/conf/dfs-exclude.txt</value>
</property>
2, put the host name of the
You can always print out the hadoop classpath before running the hadoop
command, for example by editing the $HADOOP_HOME/bin/hadoop file.
HTH.
On Wed, Mar 6, 2013 at 5:01 AM, shubhangi shubhangi.g...@oracle.com wrote:
Hi All,
I am writing an application in C++, which uses the API provided by
Oozie can be a good choice for mapreduce job flow management, but it might be
too heavyweight for your problem.
Based on your description, I am simply assuming that you are processing some
static data files, for example, files that will not change during
processing, and there are no