Hi Guys,
It took me a while to build Hadoop 2.4.0 and get it working on my Windows 8
machine, so I would like to share the steps.
See my blog post for details:
http://zutai.blogspot.com/2014/06/build-install-and-run-hadoop-24-240-on.html
Regards,
Zutai
You can get the Wikipedia data from its website; it's pretty big.
Regards,
*Stanley Shi,*
On Tue, Jul 8, 2014 at 1:35 PM, Du Lam delim123...@gmail.com wrote:
Configuration conf = getConf();
conf.setLong("mapreduce.input.fileinputformat.split.maxsize", 1000);
// you can set this to some
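If it helps to see the property in a fuller driver, here is a minimal sketch; the class name, job name, and the 1000-byte cap are illustrative, not from the original mail, and FileInputFormat.setMaxInputSplitSize is simply the typed setter for the same property:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;

public class SmallSplitDriver {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // Cap each input split at 1000 bytes so more map tasks are created.
        conf.setLong("mapreduce.input.fileinputformat.split.maxsize", 1000L);
        Job job = Job.getInstance(conf, "small-split-job");
        // Equivalent convenience setter for the same property:
        FileInputFormat.setMaxInputSplitSize(job, 1000L);
        // ... set mapper/reducer, input/output paths as usual ...
    }
}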
This is not recommended.
You only back up the fsimage file; the data blocks, which are stored on the
datanodes, are not backed up.
The risk is that if you removed some file after your fsimage backup,
the data blocks belonging to that file will be removed from all
datanodes; in this case, your namenode
There's a DistCp utility for this kind of purpose.
There's also Spring XD, but I am not sure if you want to use it.
Regards,
*Stanley Shi,*
On Mon, Jul 7, 2014 at 10:02 PM, Mohan Radhakrishnan
radhakrishnan.mo...@gmail.com wrote:
Hi,
We used a commercial FT and scheduler
Hi Tomek,
You have 9.26GB across 4 nodes, which is 2.315GB per node on average. What is your value
of yarn.nodemanager.resource.memory-mb?
You consume 1GB of RAM per container (8 containers running = 8GB of memory
used). My idea is that, after running 8 containers (1 AM + 7 map tasks),
you have only 315MB of
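For what it's worth, a small sketch of the arithmetic implied above; the 9.26GB, 4 nodes, and 1GB-per-container figures come from this thread, while the rounding is my own:

public class ContainerMath {
    public static void main(String[] args) {
        double clusterMemGb = 9.26;                 // total memory reported across the cluster
        int nodes = 4;
        double perNodeGb = clusterMemGb / nodes;    // ~2.315 GB per node
        int containerGb = 1;                        // memory requested per container
        int containersPerNode = (int) (perNodeGb / containerGb);          // 2
        double leftoverGb = perNodeGb - containersPerNode * containerGb;  // ~0.315 GB
        System.out.printf("containers per node = %d, cluster total = %d, leftover per node = %.3f GB%n",
                containersPerNode, containersPerNode * nodes, leftoverGb);
    }
}

The ~0.315GB left over per node is the ~315MB mentioned above, which is not enough to start another 1GB container.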
Hi,
I'm using ControlledJob and my code is:
ControlledJob doConcordance = new ControlledJob(
this.doParallelConcordance(), null);
....
control.addJob(doConcordance);
control.addJob(viableSubequenceMaxLength);
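In case a fuller context helps, here is a minimal sketch of how ControlledJob instances like these are typically driven with JobControl; the group name, polling loop, and thread handling are assumptions, not taken from the original post:

import java.util.List;
import org.apache.hadoop.mapreduce.lib.jobcontrol.ControlledJob;
import org.apache.hadoop.mapreduce.lib.jobcontrol.JobControl;

public static void runPipeline(ControlledJob doConcordance,
                               ControlledJob viableSubequenceMaxLength)
        throws InterruptedException {
    JobControl control = new JobControl("concordance-pipeline");
    control.addJob(doConcordance);
    control.addJob(viableSubequenceMaxLength);

    // JobControl implements Runnable; run it in its own thread and poll.
    Thread controlThread = new Thread(control);
    controlThread.setDaemon(true);
    controlThread.start();
    while (!control.allFinished()) {
        Thread.sleep(1000);
    }
    List<ControlledJob> failed = control.getFailedJobList();
    System.out.println("Failed jobs: " + failed);
    control.stop();
}

If the second job depends on the first one's output, ControlledJob.addDependingJob can also be used to chain them before adding them to the JobControl.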
You can try snakebite https://github.com/spotify/snakebite.
$ snakebite ls -R path
I just ran it to list 705K files and it went fine.
2014-05-30 20:42 GMT+02:00 Harsh J ha...@cloudera.com:
HADOOP_OPTS gets overridden by HADOOP_CLIENT_OPTS for the FsShell
utilities. The right way to extend
See http://hadoop.apache.org/mailing_lists.html#User
Cheers
2014-07-09 7:28 GMT-07:00 Kartashov, Andy andy.kartas...@mpac.ca:
Hi all,
I am trying to find documentation relevant to 'rhadoop' on CDH4. If there is
anyone in the group who has experience with 'rhadoop', can you provide me some
details, like 1) the installation procedure for rhadoop on CDH4.4?
regards,
raj
Looks like you may get a better answer from the CDH mailing list.
Cheers
On Jul 9, 2014, at 7:53 AM, Raj Hadoop hadoop...@yahoo.com wrote:
Hi all,
I am trying to find documentation relevant to 'rhadoop' on CDH4. If there is
anyone in the group who has experience with 'rhadoop', can you provide
Have you restarted your Job History Server?
2014-05-30 4:56 GMT+02:00 ch huang justlo...@gmail.com:
Hi, mailing list:
I want to remove job history logs, and I configured the
following in yarn-site.xml, but it seems to have no effect. Why? (I use CDH4.4
YARN; I configured it on each datanode, and
Hello Dear,
I made an estimate of the number of nodes for a cluster that is fed
720GB of data/day.
My estimate came to 367 datanodes after a year. I'm a bit afraid of that number
of datanodes.
The assumptions I used are the following:
- Daily supply (feed): 720GB
-
You might need to set yarn.application.classpath in yarn-site.xml:
<property>
  <name>yarn.application.classpath</name>
Hello,
if I follow your numbers I see one missing fact: what is the number of
HDDs per DataNode?
Let's assume you use machines with 6 x 3TB HDDs per box; you would need
about 60 DataNodes
per year (0.75 TB per day x 3 for replication x 1.3 for overhead / (nr of
HDDs per node x capacity per HDD
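To make the arithmetic explicit, here is a small sketch of that sizing formula; the 365-day year and the exact shape of the denominator are my reconstruction, since the end of the formula was cut off in the quote above:

public class ClusterSizing {
    public static void main(String[] args) {
        double dailyTb = 0.75;        // ~720 GB/day
        double replication = 3.0;     // HDFS replication factor
        double overhead = 1.3;        // headroom for temp space, non-HDFS usage, etc.
        int hddsPerNode = 6;
        double tbPerHdd = 3.0;

        double rawTbPerYear = dailyTb * 365 * replication * overhead;  // ~1068 TB
        double tbPerNode = hddsPerNode * tbPerHdd;                     // 18 TB per node
        double nodes = rawTbPerYear / tbPerNode;                       // ~59.3
        System.out.printf("~%.0f TB/year raw -> ~%.0f DataNodes%n",
                rawTbPerYear, Math.ceil(nodes));
    }
}

With those assumptions the result lands at roughly 60 DataNodes per year, matching the figure above.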
I am a beginner. But this seems to be similar to what I intend. The data
source will be external FTP or S3 storage.
Spark Streaming can read data from HDFS
(http://hadoop.apache.org/docs/stable/hadoop-project-dist/hadoop-hdfs/HdfsUserGuide.html),
Flume (http://flume.apache.org/), Kafka
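As a concrete starting point, here is a minimal Spark Streaming sketch (Java API) that watches an HDFS directory for new files; the directory path, app name, and 30-second batch interval are placeholders, not anything from this thread:

import org.apache.spark.SparkConf;
import org.apache.spark.streaming.Durations;
import org.apache.spark.streaming.api.java.JavaDStream;
import org.apache.spark.streaming.api.java.JavaStreamingContext;

public class HdfsStream {
    public static void main(String[] args) throws InterruptedException {
        SparkConf conf = new SparkConf().setAppName("hdfs-stream");
        JavaStreamingContext jssc = new JavaStreamingContext(conf, Durations.seconds(30));
        // Picks up new files as they appear in the monitored HDFS directory.
        JavaDStream<String> lines = jssc.textFileStream("hdfs:///incoming/");
        lines.print();
        jssc.start();
        jssc.awaitTermination();
    }
}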
Is your data already compressed? If it's not, you can safely assume a
compression ratio of 5.
Olivier
On 9 Jul 2014 17:10, Mirko Kämpf mirko.kae...@gmail.com wrote:
Hello,
if I follow your numbers I see one missing fact: what is the number of
HDDs per DataNode?
Let's assume you use
367 nodes sounded quite high for that amount of data per day. You might
need 367 disks, but do your nodes have more than one disk?
You may also take into account the compression factor that you are likely
to use for the data on the cluster.
Oner
On 9 Jul 2014 at 19:00, YIMEN YIMGA Gael
Hi Chris,
Actually I need this functionality for my research, basically for fault
tolerance. I can calculate a failure probability for some datanodes
after a certain unit of time, so I need to copy all the blocks residing on
these nodes to other nodes.
Thanks
Yehia
On 7 July 2014 20:45,
Hello,
Is there a way to query the Resource Manager for configuration properties from
an external client process other than using the web interface?
Our background: We run a YARN application by running a Client on an external
machine that may access one of many remote Hadoop clusters. The
Instead of using Resource-Manager-WebApp-Address/conf: if you have the application
ID and job ID, you can query the Resource Manager for the configuration of
this particular application. You can use HTTP or the Java API for that.
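As one possible illustration (not necessarily the exact approach meant above), here is a sketch that fetches a running MapReduce job's configuration over HTTP through the ResourceManager's web proxy; the host, port, application/job IDs, and even the endpoint path are assumptions that depend on your Hadoop version:

import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.net.HttpURLConnection;
import java.net.URL;

public class FetchJobConf {
    public static void main(String[] args) throws Exception {
        String rm = "http://resourcemanager.example.com:8088";      // placeholder RM web address
        String appId = "application_1404900000000_0001";            // placeholder
        String jobId = "job_1404900000000_0001";                    // placeholder
        URL url = new URL(rm + "/proxy/" + appId + "/ws/v1/mapreduce/jobs/" + jobId + "/conf");
        HttpURLConnection conn = (HttpURLConnection) url.openConnection();
        conn.setRequestProperty("Accept", "application/json");
        try (BufferedReader in = new BufferedReader(new InputStreamReader(conn.getInputStream()))) {
            String line;
            while ((line = in.readLine()) != null) {
                System.out.println(line);   // JSON list of property name/value pairs
            }
        }
    }
}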
2014-07-09 21:42 GMT+02:00 Geoff Thompson ge...@bearpeak.com:
Hello,
Is
Hi, I am using Pegasus. Can someone help me with this error?
When I run the list command in the UI after giving a demo command
(demo adds the graph catstar, but I get an error afterwards), I get the
following:
PEGASUS list
=== GRAPH LIST ===
14/07/09 14:45:22 WARN util.NativeCodeLoader: Unable to
You can get info about all the blocks stored on a particular datanode, i.e. the block
report. But you have to handle the move at the block level, not at the file or start/end
byte level.
On Thu, Jul 10, 2014 at 2:49 AM, Chris Mawata chris.maw...@gmail.com
wrote:
Haven't looked at the source but the thing you are
The balancer does something similar. It uses
DataTransferProtocol.replaceBlock.
On Wed, Jul 9, 2014 at 9:20 PM, sudhakara st sudhakara...@gmail.com wrote:
You can get info about all the blocks stored on a particular datanode, i.e. the
block report. But you have to handle the move at the block level, not at the file or
Hello,
I have a use case that spans multiple map tasks in a Hadoop environment. I
use Hadoop 1.2.1 with 6 task nodes. Each map task writes its output
into a file stored in HDFS. This file is shared across all the map tasks.
They all compute their output, but some of them are