Auto clean DistCache?

2013-03-25 Thread Jean-Marc Spaggiari
Hi, Each time my MR job is run, a directory is created on the TaskTracker under mapred/local/taskTracker/hadoop/distcache (based on my configuration). I looked at the directory today, and it's hosting thousands of directories and more than 8GB of data there. Is there a way to automatically delet

Re: Auto clean DistCache?

2013-03-26 Thread Jean-Marc Spaggiari
>> Hi JM, >> >> Actually these dirs need to be purged by a script that keeps the last 2 >> days worth of files; otherwise you may run into a "# of open files exceeded" >> error. >> >> Thanks >> >> >> On Mar 25, 2013, at 5:16 PM, J
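A minimal sketch of such a purge script, assuming the distcache path from the first message and a two-day retention window (both are assumptions to adapt to your own configuration):

    #!/bin/bash
    # Hypothetical path, taken from the configuration described above.
    CACHE_DIR=/mapred/local/taskTracker/hadoop/distcache
    # Remove top-level cache entries not modified for more than 2 days.
    find "$CACHE_DIR" -mindepth 1 -maxdepth 1 -mtime +2 -exec rm -rf {} +

Run from cron on each tasktracker, for example once a day.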

Re: Auto clean DistCache?

2013-03-27 Thread Jean-Marc Spaggiari
ld be inconsistency and you'll start seeing task initialization > failures due to no file found error. > > Koji > > > On Mar 26, 2013, at 9:00 PM, Jean-Marc Spaggiari wrote: > >> For the situation I faced I was really a disk space issue, not related >> to th

Re: Auto clean DistCache?

2013-03-28 Thread Jean-Marc Spaggiari
the > configuration property local.cache.size, which is measured in bytes. > """ > > And also, the maximum allowed dirs is also checked for automatically > today, to not violate the OS's limits. > > On Wed, Mar 27, 2013 at 7:07 PM, Jean-Marc Spaggiari > w
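For reference, the property quoted above would be set in the tasktracker configuration along these lines (the value is an illustrative cap of roughly 5 GB, expressed in bytes, not a recommendation):

    <property>
      <name>local.cache.size</name>
      <value>5368709120</value>  <!-- ~5 GB, example value only -->
    </property>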

Migrate from 1.0 to 1.1

2013-04-12 Thread Jean-Marc Spaggiari
Hi, I'm currently running my cluster under hadoop 1.0.3 and I'm looking to upgrade it to 1.1.2 (or 1.1.3 if it comes soon). Is there any specific documentation I should follow? So far I found this one: http://wiki.apache.org/hadoop/Hadoop_Upgrade Downtime is not an issue for me. Thanks, JM

Error writing file (Invalid argument)

2013-05-02 Thread Jean-Marc Spaggiari
Hi, I'm facing the issue below with Hadoop. Configuration: - 1 WAS node; - Replication factor set to 1; - Short Circuit activated. Exception: 2013-05-02 14:02:41,063 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: opWriteBlock BP-1179773663-10.238.38.193-1363960970263:blk_7082931589039

Re: get recent changed files in hadoop

2013-05-07 Thread Jean-Marc Spaggiari
You can still parse the hadoop ls output with bash and sort it (reverse, cut, sort, etc.), but that will read all the entries, not just the first x... 2013/5/7 Winston Lin : > look like we cannot even sort the output of ls by date with fs command? > > In *ux system, we can do ls -t ...to sort
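As an illustration, a bash pipeline along these lines (assuming the usual ls output layout, where columns 6 and 7 hold the modification date and time) shows the most recently changed entries first:

    hadoop fs -ls /user/hadoop | grep -v '^Found' | sort -k6,7 -r | head -n 10

It still lists the whole directory on the client side; it only trims what you see.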

Re: YCSB on hbase fails with java.io.EOFException

2013-05-15 Thread Jean-Marc Spaggiari
Hi Manoj, Are you able to do basic operations with HBase shell? Like scan, put, get, etc.? Is the WebUI showing tables correctly too? 2013/5/15 Manoj S > Yes, it is running fine. > > I am using version 0.94. > > Thanks, > Manoj > > On Wed, May 15, 2013 at 10:14 AM, Ted Yu wrote: > >> I assume region

Re: YCSB on hbase fails with java.io.EOFException

2013-05-15 Thread Jean-Marc Spaggiari
Regarding the version, which 0.94 are you using? 0.94.1? Or 0.94.7? I will install it locally and give it a try... 2013/5/15 S, Manoj > ** > Hi Jean, > > Yeah. All these operations work fine. > > Thanks, > Manoj > > > > -Original Message----- >

Re: YCSB on hbase fails with java.io.EOFException

2013-05-15 Thread Jean-Marc Spaggiari
Hi Manoj, Is it a production environment? Or a test environment? If it's a test one, can you try with a recent version like 0.94.7? Else I will try to install 0.94.1 locally... JM 2013/5/15 Manoj S > Hi Jean, > > I am using 0.94.1 > > -Manoj > > On May 15, 2013 8:0

Re: YCSB on hbase fails with java.io.EOFException

2013-05-15 Thread Jean-Marc Spaggiari
your help! > > -Manoj > > On May 16, 2013 8:46 AM, "Jean-Marc Spaggiari" > wrote: > > > > Hi Manoj, > > > > Is it a production environment? Or a test environment? If it's a test > one, can you try with a recent version like 0.94.7? > &

Re: Getting this exception "java.net.ConnectException: Call From ubuntu/127.0.0.1 to ubuntu:8020 failed on connection exception: java.net.ConnectException: Connection refused; For more details see: ht

2013-05-16 Thread Jean-Marc Spaggiari
ed"; > > > > Hi JM, > > Actually I am using Apache Hadoop 1.1.1 not the cloudera installer. > > When I run the file system check I get the same error. > > Thanks, > Bala. > > > On Thu, May 16, 2013 at 7:47 AM, Jean-Marc Spaggiari wrote: > >> Hi Bal

Re: YCSB on hbase fails with java.io.EOFException

2013-05-16 Thread Jean-Marc Spaggiari
Hi Manoj, I installed 0.94.1 and YCSB and everything is working fine. Which YCSB version are you using? Can you try to export the one from GitHub? JM 2013/5/15 Jean-Marc Spaggiari > No worries. I will prepare a VM and install 0.94.1 on it. > > I will let you know when it's

Re: YCSB on hbase fails with java.io.EOFException

2013-05-16 Thread Jean-Marc Spaggiari
he.hadoop.hbase.util.RegionSplitter usertable UniformSplit -c 200 > -f family" > 5) ycsb-0.1.4/bin/ycsb load hbase -P workloads/workloada -p > columnfamily=family -p recordcount=10 -threads 3 -s > data.tmp > > Step 5 threw me the error. Can you tell me if i have missed something

Re: YCSB on hbase fails with java.io.EOFException

2013-05-16 Thread Jean-Marc Spaggiari
ase-site.xml? > Thanks > On Fri, May 17, 2013 at 7:38 AM, Jean-Marc Spaggiari < > jean-m...@spaggiari.org> wrote: > >> Hum. >> >> I did slightly different steps. >> >> 1) I try on standalone. I will try on fully distributed soon. >> 2) I did n

Re: Viewing snappy compressed files

2013-05-21 Thread Jean-Marc Spaggiari
Hi Robert, What command are you using to extract your data from hadoop? JM Hey, there. My Google skills have failed me, and I hope someone here can point me in the right direction. We’re storing data on our Hadoop cluster in Snappy compressed format. When we pull a raw file down and

Re: Child Error

2013-05-24 Thread Jean-Marc Spaggiari
Hi Jim, Which JVM are you using? I don't think you have any memory issue. Otherwise you would have got an OOME... JM 2013/5/24 Jim Twensky > Hi again, in addition to my previous post, I was able to get some error > logs from the task tracker/data node this morning and looks like it might > be a j

Re: Child Error

2013-05-25 Thread Jean-Marc Spaggiari
two days but I'm almost clueless. > > Thanks, > Jim > > > > > On Fri, May 24, 2013 at 10:32 PM, Jean-Marc Spaggiari < > jean-m...@spaggiari.org> wrote: > >> Hi Jim, >> >> Which JVM are you using? >> >> I don't think you have

Re: Child Error

2013-05-28 Thread Jean-Marc Spaggiari
t; javax.security.auth.login.LoginContext.invokeCreatorPriv(LoginContext.java:703) > at javax.security.auth.login.LoginContext.login(LoginContext.java:575) > at > org.apache.hadoop.security.UserGroupInformation.getLoginUser(UserGroupInformation.java:482) > > > On Sat, May

Re: Install hadoop on multiple VMs in 1 laptop like a cluster

2013-05-31 Thread Jean-Marc Spaggiari
Hi Sai Sai, You can take a look at that also: http://goo.gl/iXzae I just did that yesterday for some other folks I'm working with. Maybe not the best way, but working like a charm. JM 2013/5/31 shashwat shriparv : > Try this > http://www.youtube.com/watch?v=gIRubPl20oo > there will be three vid

Re: Logs directory for HBASE and Task Tracker

2013-06-05 Thread Jean-Marc Spaggiari
Hi Rams, For HBase, the logs are under the logs directory. For the task tracker, it's under your hadoop logs directory. You should find a file like hbase-hbase-master-<hostname>.log where <hostname> is your master host name. JM 2013/6/5 Ramasubramanian : > Hi, > > Can you please help me in letting me know

Re: Logs directory for HBASE and Task Tracker

2013-06-05 Thread Jean-Marc Spaggiari
ams > On 05-Jun-2013, at 8:17 PM, Jean-Marc Spaggiari > wrote: > >> Hi Rams, >> >> For HBase, the logs are under the logs directory. >> >> For the task tracker, it's under your hadoop logs directory. >> >> You should find a file like hbase-h

Could not obtain block?

2013-11-08 Thread Jean-Marc Spaggiari
Hi, I have a situation here that I'm wondering where it's coming from. I know that I'm using a pretty old version... 1.0.3 When I fsck a file, I can see that there is one block, but when I try to get the file, I'm not able to retrieve this block. I thought it was because the file was opened, so I ki
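For anyone hitting the same symptom, the per-file check described above looks roughly like this in Hadoop 1.x (the path is a placeholder):

    hadoop fsck /path/to/file -files -blocks -locations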

Re: reducer not starting

2012-11-21 Thread Jean-Marc Spaggiari
Just FYI, you don't need to stop the job, update the host, and retry. Just update the host while the job is running and it should retry and restart. I had a similar issue with one of my nodes where the hosts file was not updated. After the update, it automatically resumed the work... JM 2012

MapReduce logs

2012-11-21 Thread Jean-Marc Spaggiari
Hi, When we run a MapReduce job, the logs are stored on all the tasktracker nodes. Is there an easy way to aggregate all those logs together and see them in a single place instead of going to the tasks one by one and opening the file? Thanks, JM

Re: MapReduce logs

2012-11-21 Thread Jean-Marc Spaggiari
For huge logs this becomes slow and time consuming. > > Hope this helps. > > Regards, > Dino Kečo > msn: xdi...@hotmail.com > mail: dino.k...@gmail.com > skype: dino.keco > phone: +387 61 507 851 > > > On Wed, Nov 21, 2012 at 7:55 PM, Jean-Marc Spaggiari < > jean

CheckPoint Node

2012-11-22 Thread Jean-Marc Spaggiari
Hi, I'm reading a bit about hadoop and I'm trying to increase the HA of my current cluster. Today I have 8 datanodes and one namenode. By reading here: http://www.aosabook.org/en/hdfs.html I can see that a Checkpoint node might be a good idea. So I'm trying to start a checkpoint node. I looked

Re: CheckPoint Node

2012-11-22 Thread Jean-Marc Spaggiari
over internet. JM 2012/11/22, Jean-Marc Spaggiari : > Hi, > > I'm reading a bit about hadoop and I'm trying to increase the HA of my > current cluster. > > Today I have 8 datanodes and one namenode. > > By reading here: http://www.aosabook.org/en/hdfs.html I can

Re: CheckPoint Node

2012-11-22 Thread Jean-Marc Spaggiari
abilities, and this is documented > at > http://hadoop.apache.org/docs/current/hadoop-yarn/hadoop-yarn-site/HDFSHighAvailability.html. > > On Thu, Nov 22, 2012 at 10:18 PM, Jean-Marc Spaggiari > wrote: >> Replying to myself ;) >> >> By digging a bit more I figured th

Re: CheckPoint Node

2012-11-22 Thread Jean-Marc Spaggiari
> journal log mount or quorum setup would automatically act as > safeguards for the FS metadata. > > On Thu, Nov 22, 2012 at 11:03 PM, Jean-Marc Spaggiari > wrote: >> Hi Harsh, >> >> Thanks for pointing me to this link. I will take a close look at it. >> >&

Re: CheckPoint Node

2012-11-22 Thread Jean-Marc Spaggiari
continue to run with > the lone remaining disk, but its not a good idea to let it run for too > long without fixing/replacing the disk, for you will be losing out on > redundancy. > > On Thu, Nov 22, 2012 at 11:59 PM, Jean-Marc Spaggiari > wrote: >> Hi Harsh, >> >

Diskspace usage

2012-11-22 Thread Jean-Marc Spaggiari
Hi, Quick question on the way hadoop is using the disk space. Let's say I have 8 nodes. 7 of them with a 2T disk, and one with a 256GB. Is hadoop going to use the 256GB until it's full, then continue with the other nodes only but keeping the 256GB live? Or will it bring the 256GB node down when

Re: CheckPoint Node

2012-11-30 Thread Jean-Marc Spaggiari
like bin/hadoop --showparameters Thanks, JM 2012/11/22, Jean-Marc Spaggiari : > Perfect. Thanks again for your time! > > I will first add another drive on the Namenode because this will take > 5 minutes. Then I will read about the migration from 1.0.3 to 2.0.x > and most probably wi

Re: CheckPoint Node

2012-11-30 Thread Jean-Marc Spaggiari
Sorry about that. My fault. I have put this on the core-site.xml file but should be on the hdfs-site.xml... I moved it and it's now working fine. Thanks. JM 2012/11/30, Jean-Marc Spaggiari : > Hi, > > Is there a way to ask Hadoop to display its parameters? > > I have upd

Re: M/R, Strange behavior with multiple Gzip files

2012-12-06 Thread Jean-Marc Spaggiari
Hi, Have you configured mapred-site.xml to tell where the job tracker is? If not, your job is running on the local jobtracker, running the tasks one by one. JM PS: I faced the same issue a few weeks ago and got the exact same behaviour. This (above) solved the issue. 2012/12/6, x6i4uybz labs :
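A minimal mapred-site.xml sketch pointing jobs at a real JobTracker rather than the local runner (host and port are placeholders):

    <property>
      <name>mapred.job.tracker</name>
      <value>jobtracker-host:9001</value>
    </property>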

Force task location on input split location?

2012-12-07 Thread Jean-Marc Spaggiari
Hi, Is there a way to force the tasks from a MR job to run ONLY on the tasktrackers where the input split location is? I mean, on the taskdetails UI, I can see all my tasks (25), and some of them have Machine == Input split Location. But some don't. So I'm wondering if there is a way to force ha

Re: Assigning reduce tasks to specific nodes

2012-12-07 Thread Jean-Marc Spaggiari
Hi Hiroyuki, Have you made any progress on that? I'm also looking at a way to assign specific Map tasks to specific nodes (I want the Map to run where the data is). JM 2012/12/1, Michael Segel : > I haven't thought about reducers but in terms of mappers you need to > override the data locality

How to build Hadoop 1.0.3?

2012-12-07 Thread Jean-Marc Spaggiari
Hi, I have successfully retrieved hadoop trunk and built it following the wiki instructions. I have now extracted the Hadoop 1.0.3 branch and would like to build it, but it seems the same build instructions are not working for 1.0.3. Can anyone point me to the steps to follow to build this release?

Re: How to build Hadoop 1.0.3?

2012-12-07 Thread Jean-Marc Spaggiari
Thanks a lot! It's working very well! 2012/12/7, Harsh J : > I have some simple steps up at > http://wiki.apache.org/hadoop/QwertyManiac/BuildingHadoopTrunk for > most of the branches that you can follow (you need the branch-1 > instructions). > > On Sat, Dec 8, 201

Re: Assigning reduce tasks to specific nodes

2012-12-08 Thread Jean-Marc Spaggiari
containers, but where to allocate MapTasks. > > If you have any question, please ask me. > > Thanks, > Tsuyoshi > > > On Sat, Dec 8, 2012 at 4:51 AM, Jean-Marc Spaggiari > > wrote: > >> Hi Hiroyuki, >> >> Have you made any progress on that? >> &g

Re: Force task location on input split location?

2012-12-08 Thread Jean-Marc Spaggiari
only 1? Thanks, JM 2012/12/8, Harsh J : > Answer depends on a couple of features to be present in your version > of Hadoop, and is inline. > > On Fri, Dec 7, 2012 at 11:38 PM, Jean-Marc Spaggiari > wrote: >> Hi, >> >> Is there a way for force the tasks from a MR

Re: Force task location on input split location?

2012-12-08 Thread Jean-Marc Spaggiari
ity (N options). > > On Sat, Dec 8, 2012 at 11:27 PM, Jean-Marc Spaggiari > wrote: >> Hi Harsh, >> >> Thanks for your help. >> >> mapred.fairscheduler.locality.delay seems to be working very well for >> me. I have set it with 60s and JobInProgress picked
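Assuming the fair scheduler property discussed above takes its value in milliseconds, the 60s setting mentioned would look like this in mapred-site.xml:

    <property>
      <name>mapred.fairscheduler.locality.delay</name>
      <value>60000</value>  <!-- 60s, assumed to be in milliseconds -->
    </property>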

Re: compile hadoop-1.1.1 on zLinux using apache maven

2012-12-13 Thread Jean-Marc Spaggiari
FYI, I compiled 1.0.3 successfully using ant last week, so the steps still seem to be good. JM On Dec 13, 2012, 05:28, "Nicolas Liochon" wrote: > branch1 does not use maven but ant. > There are some docs here: > http://wiki.apache.org/hadoop/BuildingHadoopFromSVN, not sure it's > totally up to da

Re: compile hadoop-1.1.1 on zLinux using apache maven

2012-12-13 Thread Jean-Marc Spaggiari
Hi, Take a look here: I think you should be using ant instead... http://wiki.apache.org/hadoop/QwertyManiac/BuildingHadoopTrunk#Building_branch-1 JM

Misconfiguration of hdfs-site.xml

2012-12-18 Thread Jean-Marc Spaggiari
Hi, For months now I'm using my hadoop cluster with absolutely nothing related to the drive directory in my hdfs-site.xml file. It seems that it's using the hadoop.tmp.dir directory to store data. My hadoop.tmp.dir is pointing to /home/hadoop/haddop_drive/${user.name} and on my /home/hadoop/haddo

Re: Misconfiguration of hdfs-site.xml

2012-12-18 Thread Jean-Marc Spaggiari
> If you wish to move the location somewhere else, you will need to mv > the {data,name} directories elsewhere and re-point down to that path > component again. > > On Wed, Dec 19, 2012 at 1:58 AM, Jean-Marc Spaggiari > wrote: >> Hi, >> >> For months now I'm

Hadoop harddrive space usage

2012-12-28 Thread Jean-Marc Spaggiari
Hi, Quick question regarding hard drive space usage. Hadoop will distribute the data evenly on the cluster. So all the nodes are going to receive almost the same quantity of data to store. Now, if on one node I have 2 directories configured, is hadoop going to assign twice the quantity on this n

Re: Hadoop harddrive space usage

2012-12-30 Thread Jean-Marc Spaggiari
s. Now if we try to store a file > that takes up 3 blocks, Hadoop will just place 1 block in each node. > > Hope that helps. > > Regards, > Robert > > On Fri, Dec 28, 2012 at 9:12 AM, Jean-Marc Spaggiari < > jean-m...@spaggiari.org> wrote: > >> Hi, >> >

Re: unsubscribe

2013-01-08 Thread Jean-Marc Spaggiari
https://www.google.ca/search?q=unsubscribe+hadoop&oq=unsubscribe+hadoop&sugexp=chrome,mod=5&sourceid=chrome&ie=UTF-8 First link... http://hadoop.apache.org/mailing_lists.html 2013/1/8, Melody Fleishauer : > > >

Re: How Hadoop decide the capacity of each node

2013-01-09 Thread Jean-Marc Spaggiari
Hi Dora, Hadoop is not deciding. It's "simply" pushing the same amount of data on each node. If a node is out of space, it's removed from the "write" list and is used only for reads. Hadoop is only using the space it needs. So if it uses only 50G, it's because it doesn't need the extra 50G yet. JM

Exit code 126?

2013-01-11 Thread Jean-Marc Spaggiari
Hi, I ran a very simple rowcount (HBase) MR today, and one of the tasks failed with the status 126. Seems that there is no logs at all on the server side. Any idea what this mean? Thanks, JM java.lang.Throwable: Child Error at org.apache.hadoop.mapred.TaskRunner.run(TaskRunner.java:271)

Re: Exit code 126?

2013-01-11 Thread Jean-Marc Spaggiari
s this issue. It has been applied to the apache distro, but I don't > think it is incorporated in any release of CDH yet. > > Dave Shine > Sr. Software Engineer > 321.939.5093 direct | 407.314.0122 mobile > CI Boost™ Clients Outperform Online™ www.ciboost.com > >

Re: Exit code 126?

2013-01-14 Thread Jean-Marc Spaggiari
™ Clients Outperform Online™ www.ciboost.com > > > -Original Message----- > From: Jean-Marc Spaggiari [mailto:jean-m...@spaggiari.org] > Sent: Friday, January 11, 2013 5:36 PM > To: user@hadoop.apache.org > Subject: Re: Exit code 126? > > Do you think it can be > h

Re: Hadoop NON DFS space

2013-01-16 Thread Jean-Marc Spaggiari
I think you can still run with the OS on another drive, or on a live USB drive, or even in memory only, loaded from the network while the server boots from the network drive, etc. No? JM 2013/1/16, Mohammad Tariq : > That would be really cool Chris. > +1 for that. > > Warm Regards, > Tar

Re: does "fs -put " create subdirectories?

2013-01-16 Thread Jean-Marc Spaggiari
Yes it does, you can just try ;) hadoop@node3:~/hadoop-1.0.3$ bin/hadoop fs -ls /user/ Found 1 items drwxr-xr-x - hbase supergroup 0 2013-01-03 09:54 /user/hbase hadoop@node3:~/hadoop-1.0.3$ bin/hadoop fs -put CHANGES.txt /user/test/CHANGES.txt hadoop@node3:~/hadoop-1.0.3$ bin/hadoop

Re: Hadoop NON DFS space

2013-01-17 Thread Jean-Marc Spaggiari
out > 13/01/17 08:44:27 WARN mapred.JobClient: Error reading task outputhttp:// > rdcesx12078.race.sas.com:50060/tasklog?plaintext=true&attemptid=attempt_201301170837_0004_m_09_0&filter=stderr > 13/01/17 08:44:28 INFO mapred.JobClient: map 82% reduce 25% > 13/01/17 08:44:31 INFO mapred.JobClient: map 83%

Re: Problems

2013-01-17 Thread Jean-Marc Spaggiari
Hi Sean, This is an issue with your JVM. Not related to hadoop. Which JVM are you using, and can you try with the latest from Sun? JM 2013/1/17, Sean Hudson : > Hi, > I have recently installed hadoop-1.0.4 on a linux machine. Whilst > working through the post-install instructions contained

Re: Problems

2013-01-17 Thread Jean-Marc Spaggiari
uild 1.6.0_25-b06) > Java HotSpot(TM) Client VM (build 20.0-b11, mixed mode, sharing) > > Would you advise obtaining a later Java version? > > Sean > > -Original Message- > From: Jean-Marc Spaggiari > Sent: Thursday, January 17, 2013 2:52 PM > To: user@hadoop.apache.

Re: unsubscribe

2013-01-17 Thread Jean-Marc Spaggiari
https://www.google.ca/search?q=unsubscribe+hadoop+mailing+list Just follow the first link... 2013/1/17, Ignacio Aranguren : > unsubscribe >

Re: Estimating disk space requirements

2013-01-18 Thread Jean-Marc Spaggiari
Hi Panshul, If you have 20 GB with a replication factor set to 3, you have only 6.6GB available, not 11GB. You need to divide the total space by the replication factor. Also, if you store your JSon into HBase, you need to add the key size to it. If your key is 4 bytes, or 1024 bytes, it makes a di

Re: Estimating disk space requirements

2013-01-18 Thread Jean-Marc Spaggiari
pace on > HDFS.. (combined free space of all the nodes connected to the cluster). So > I can connect 20 nodes having 40 GB of hdd on each node to my cluster. Will > this be enough for the storage? > Please confirm. > > Thanking You, > Regards, > Panshul. > > > On Fri, Jan 18,

Re: Estimating disk space requirements

2013-01-18 Thread Jean-Marc Spaggiari
80 GB HDD? > > they are connected on a gigabit LAN > > Thnx > > > On Fri, Jan 18, 2013 at 2:26 PM, Jean-Marc Spaggiari < > jean-m...@spaggiari.org> wrote: > >> 20 nodes with 40 GB will do the work. >> >> After that you will have to consider performanc
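As a rough rule of thumb from this thread: usable space = (nodes x disk per node) / replication factor, so 20 nodes x 40 GB / 3 gives roughly 266 GB of usable HDFS space, before subtracting non-DFS usage and temporary files.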

Re: Problems

2013-01-18 Thread Jean-Marc Spaggiari
t: Re: Problems > > Hi, > My Java version is > > java version "1.6.0_25" > Java(TM) SE Runtime Environment (build 1.6.0_25-b06) Java HotSpot(TM) Client > > VM (build 20.0-b11, mixed mode, sharing) > > Would you advise obtaining a later Java version? >

How to unsubcribe from the list (Re: unsubscribe)

2013-01-18 Thread Jean-Marc Spaggiari
Search on google and click on the first link ;) https://www.google.ca/search?q=unsubscribe+hadoop+mailing+list 2013/1/18, Cristian Cira : > Please unsubscribe me from this news feed > > Thank you > > Cristian Cira > Graduate Research Assistant > Parallel Architecture and System Laboratory(PASL) >
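(For the record, Apache lists are normally left by mailing the -unsubscribe alias of the list address, which for this list should be user-unsubscribe@hadoop.apache.org.)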

Re: Prolonged safemode

2013-01-19 Thread Jean-Marc Spaggiari
Hi Tariq, I often have to force HDFS to go out of safe mode manually when I restart my cluster (or after a power outage). I never thought about reporting that ;) I'm using hadoop-1.0.3. I think it was because of the MR files still not replicated on enough nodes. But not 100% sure. JM 2013/1/19
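The manual step referred to above is, in Hadoop 1.x:

    hadoop dfsadmin -safemode get    # check the current state
    hadoop dfsadmin -safemode leave  # force the namenode out of safe mode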

Re: Problems

2013-01-22 Thread Jean-Marc Spaggiari
a > way to back-out the distributed-filesystem. Will the existence of this > distributed-filesystem interfere with my Standalone tests? > > Regards, > > Sean Hudson > > -Original Message- > From: Jean-Marc Spaggiari > Sent: Friday, January 18, 2013 3:24 PM > T

Re: reg max map task config

2013-01-28 Thread Jean-Marc Spaggiari
Hi Harsh, Is there a recommended way to configure this value? I think I read something like 0.7 x cores but I'm not 100% sure... JM 2013/1/28, Manoj Babu : > Thank you Harsh. > > Cheers! > Manoj. > > > On Mon, Jan 28, 2013 at 10:50 AM, Harsh J wrote: > >> Hi, >> >> As noted on >> http://hadoop.
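Under that heuristic, an 8-core tasktracker would be configured along these lines (the value follows the unconfirmed 0.7 x cores rule, rounded down, and is not a tested recommendation):

    <property>
      <name>mapred.tasktracker.map.tasks.maximum</name>
      <value>5</value>  <!-- ~0.7 x 8 cores -->
    </property>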

Re: Maximum Storage size in a Single datanode

2013-01-30 Thread Jean-Marc Spaggiari
Hi, Also, think about the memory you will need in your DataNode to serve all this data... I'm not sure there is any server which can take that today. You need a certain amount of memory per block in the DN. With all this data, you will have so many blocks... Regarding RH vs Ubuntu, I think Ubu

Re: what will happen when HDFS restarts but with some dead nodes

2013-01-30 Thread Jean-Marc Spaggiari
Hi Nan, When the Namenode EXITs safemode, you can assume that all blocks ARE fully replicated. If the Namenode is still IN safemode, that means that all blocks are NOT fully replicated. JM 2013/1/29, Nan Zhu : > So, we can assume that all blocks are fully replicated at the start point

Hadoop + Fuse: Compilation errors

2013-02-08 Thread Jean-Marc Spaggiari
Hi, I'm trying to install FUSE with Hadoop 1.0.3 and I'm facing some issues. I'm following the steps I have there: http://wiki.apache.org/hadoop/MountableHDFS I have extracted 1.0.3 code using svn checkout http://svn.apache.org/repos/asf/hadoop/common/tags/release-X.Y.Z/ hadoop-common-X.Y.Z ant

Re: Hadoop + Fuse: Compilation errors

2013-02-08 Thread Jean-Marc Spaggiari
later on a compilation issue, but at least I made some progress... JM 2013/2/8, Jean-Marc Spaggiari : > Hi, > > I'm trying to install FUSE with Hadoop 1.0.3 and I'm facing some issues. > > I'm following the steps I have there: > http://wiki.apache.org/hadoop/Mountabl

How to install FUSE with Hadoop 1.0.x?

2013-02-08 Thread Jean-Marc Spaggiari
Hi, I'm wondering what's the best way to install FUSE with Hadoop 1.0.3? I'm trying to follow all the steps described here: http://wiki.apache.org/hadoop/MountableHDFS but it's failing on each one, taking hours to fix it and move to the next one. So I think I'm following the wrong path. There sh

Re: How to installl FUSE with Hadoop 1.0.x?

2013-02-08 Thread Jean-Marc Spaggiari
one and you should be able to get a successful result > if you have all the necessary dependencies. > > On Fri, Feb 8, 2013 at 10:00 PM, Jean-Marc Spaggiari > wrote: >> Hi, >> >> I'm wondering what's the best way to install FUSE with Hadoop 1.0.3? >> >

Mutiple dfs.data.dir vs RAID0

2013-02-10 Thread Jean-Marc Spaggiari
Hi, I have a quick question regarding RAID0 performances vs multiple dfs.data.dir entries. Let's say I have 2 x 2TB drives. I can configure them as 2 separate drives mounted on 2 folders and assigned to hadoop using dfs.data.dir. Or I can mount the 2 drives with RAID0 and assign them as a sing

Re: Mutiple dfs.data.dir vs RAID0

2013-02-10 Thread Jean-Marc Spaggiari
JBOD, you lost > one disk. > > -Michael > > On Feb 10, 2013, at 8:58 PM, Jean-Marc Spaggiari > wrote: > >> Hi, >> >> I have a quick question regarding RAID0 performances vs multiple >> dfs.data.dir entries. >> >> Let's say I have 2 x 2

Re: Mutiple dfs.data.dir vs RAID0

2013-02-10 Thread Jean-Marc Spaggiari
into the picture here ... LVM is on > the software layer and (hopefully) the RAID/JBOD stuff is at the > hardware layer (and in the case of HDFS, LVM will only add unneeded > overhead). > > -Michael > > On Feb 10, 2013, at 9:19 PM, Jean-Marc Spaggiari > wrote: > >> T

Re: Mutiple dfs.data.dir vs RAID0

2013-02-11 Thread Jean-Marc Spaggiari
expose >> each disk as its own RAID0 volume... >> >> Not sure why or where LVM comes into the picture here ... LVM is on >> the software layer and (hopefully) the RAID/JBOD stuff is at the >> hardware layer (and in the case of HDFS, LVM will only add unneeded >> ove

r0.20.2 documentation gone?

2013-02-13 Thread Jean-Marc Spaggiari
Hi, http://hadoop.apache.org/docs/r0.20.2/hdfs-default.html is not working anymore. Only 0.19 pages are still there. Is there any reason? Where can we find the documentation for 0.20.2? Thanks, JM

Re: Piping output of hadoop command

2013-02-18 Thread Jean-Marc Spaggiari
Hi Julian, I think it's not outputting on the standard output but on the error one. You might want to test that: hadoop fs -copyToLocal FILE_IN_HDFS 2>&1 | ssh REMOTE_HOST "dd of=FILE_ON REMOTE_HOST" Which will redirect the stderr to the stdout too. Not sure, but it might be your issue. JM 2013
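An alternative sketch that sidesteps the redirection question entirely, since hadoop fs -cat writes the file contents to stdout (host and paths are placeholders):

    hadoop fs -cat /path/in/hdfs | ssh REMOTE_HOST "dd of=/path/on/remote"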

Re: In Compatible clusterIDs

2013-02-20 Thread Jean-Marc Spaggiari
Hi Nagarjuna, Is it a test cluster? Do you have another cluster running close-by? Also, is it your first try? It seems there is some previous data in the dfs directory which is not in sync with the last installation. Maybe you can remove the content of /Users/nagarjunak/Documents/hadoop-install/

Re: copy chunk of hadoop output

2013-02-20 Thread Jean-Marc Spaggiari
But be careful. hadoop fs -cat will retrieve the entire file and will finish only when it has retrieved the last bytes you are looking for. If your file is many GB big, it will take a lot of time for this command to complete and will put some pressure on your network. JM 2013/2/19, jamal sasha :
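If only the beginning of the file is needed, a sketch like this keeps the local copy small; head should also close the pipe once it has read enough, cutting the transfer short (the size is an example):

    hadoop fs -cat /path/to/output | head -c 1048576 > first-1mb.part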

Re: copy chunk of hadoop output

2013-02-20 Thread Jean-Marc Spaggiari
ttrace: src: > /127.0.0.1:50010, dest: /127.0.0.1:58802, bytes: 198144, op: > HDFS_READ, cliID: DFSClient_NONMAPREDUCE_-1698829178_1, offset: 0, > srvID: DS-1092147940-192.168.2.1-50010-1349279636946, blockid: > BP-1461691939-192.168.2.1-1349279623549:blk_2568668834545125596_73870, > duration: 19207000 >

Re: About Hadoop Deb file

2013-02-21 Thread Jean-Marc Spaggiari
Hi Mayur, Where have you downloaded the DEB files? Are they Debian related? Or Ubuntu related? Ubuntu is not worse than CentOS. They are just different choices. Both should work. JM 2013/2/21 Harsh J > Try the debs from the Apache Bigtop project 0.3 release, its a bit of > an older 1.x

Re: About Hadoop Deb file

2013-02-21 Thread Jean-Marc Spaggiari
Mayur, Have you looked at that? http://www.michael-noll.com/tutorials/running-hadoop-on-ubuntu-linux-single-node-cluster/ I just created a VM, installed Debian 64 bits, downloaded the .deb file and installed it without any issue. Are you using Ubuntu 64 bits? Or 32 bits? JM 2013/2/21 Mayur Pati

Re: adding space on existing datanode ?

2013-02-22 Thread Jean-Marc Spaggiari
Hi Brice, To add disk space to your datanode you simply need to add another drive, then add it to the dfs.data.dir or dfs.datanode.data.dir entry. After a datanode restart, hadoop will start to use it. It will not balance the existing data between the directories. It will continue to add to the 2.
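A hedged hdfs-site.xml sketch with two data directories (paths are placeholders; dfs.data.dir is the Hadoop 1.x name, dfs.datanode.data.dir the newer one):

    <property>
      <name>dfs.data.dir</name>
      <value>/disk1/dfs/data,/disk2/dfs/data</value>
    </property>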

Re: About Hadoop Deb file

2013-02-22 Thread Jean-Marc Spaggiari
to take a look too. JM 2013/2/21 Mayur Patil > I am using 32 bits. I will look out for your link JM sir. > > > On Fri, Feb 22, 2013 at 8:17 AM, Jean-Marc Spaggiari < > jean-m...@spaggiari.org> wrote: > >> Mayur, >> >> Have you looked at that? >> >

Re: About Hadoop Deb file

2013-02-24 Thread Jean-Marc Spaggiari
;HOME"} in >> concatenation (.) or string at /usr/bin/lintian*) and found many >> reference to it. You might want to take a look too. > > > I will try to search for Google query. > >> >> >> JM >> >> >> 2013/2/21 Mayur Patil &

Re: childerror

2013-02-24 Thread Jean-Marc Spaggiari
Hi Fatih, Have you looked in the logs files? Anything there? JM 2013/2/24 Fatih Haltas > I am always getting the Child Error, I googled but I could not solve the > problem, did anyone encounter with same problem before? > > > [hadoop@ADUAE042-LAP-V conf]$ hadoop jar > /home/hadoop/project/hado

Re: adding space on existing datanode ?

2013-02-25 Thread Jean-Marc Spaggiari
d add another complete > datanode (incrementing dfs.replication by 1) whereas here, I'd like just to > "extend" an existing one. Am I wrong ? > > > Le 22/02/2013 19:56, Patai Sangbutsarakum a écrit : > > Just want to add up from JM. > > If you already have

Re: Hadoop advantages vs Traditional Relational DB

2013-02-25 Thread Jean-Marc Spaggiari
Hi Oleg, Have you asked google first? http://indico.cern.ch/conferenceDisplay.py?confId=162202 http://www.wikidifference.com/difference-between-hadoop-and-rdbms/ http://iablog.sybase.com/paulley/2008/11/hadoop-vs-relational-databases/ JM 2013/2/25 Oleg Ruchovets > Hi , >Can you please sha

Re: JobTracker security

2013-02-26 Thread Jean-Marc Spaggiari
Maybe restrict access to the hadoop file(s) to the user1? 2013/2/26 Serge Blazhievsky : > I am trying to not to use kerberos... > > Is there other option? > > Thanks > Serge > > > On Tue, Feb 26, 2013 at 3:31 PM, Patai Sangbutsarakum > wrote: >> >> Kerberos >> >> From: Serge Blazhievsky >> Reply

Re: JobTracker security

2013-02-26 Thread Jean-Marc Spaggiari
Serge Blazhievsky : > hi Jean, > > Do you mean input files for hadoop ? or hadoop directory? > > Serge > > > On Tue, Feb 26, 2013 at 4:38 PM, Jean-Marc Spaggiari > wrote: >> >> Maybe restrict access to the hadoop file(s) to the user1? >> >> 2013/2/26

Re: Datanodes shutdown and HBase's regionservers not working

2013-02-26 Thread Jean-Marc Spaggiari
Hi Davey, So were you able to find the issue? JM 2013/2/25 Davey Yan : > Hi Nicolas, > > I think i found what led to shutdown of all of the datanodes, but i am > not completely certain. > I will return to this mail list when my cluster returns to be stable. > > On Mon, Feb 25, 2013 at 8:01 PM, N

Re: Is there a way to keep all intermediate files there after the MapReduce Job run?

2013-03-01 Thread Jean-Marc Spaggiari
Ling, do you have Hadoop: The Definitive Guide close by? I think I remember it says something about keeping the intermediate files. Take a look at keep.task.files.pattern... It might help you to keep some of the files you are looking for? Maybe not all... Or even maybe not any. JM 2013/3/1 Mi
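The property mentioned above would be set along these lines (the pattern is purely illustrative, matching the intermediate files of one map task):

    <property>
      <name>keep.task.files.pattern</name>
      <value>.*_m_000001_.*</value>  <!-- example pattern only -->
    </property>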

Re: hadoop filesystem corrupt

2013-03-01 Thread Jean-Marc Spaggiari
Hi Mohit, Is your replication factor really set to 1? "Default replication factor:1" Also, can you look into your data directories and ensure you always have the right structure and all the related META files? JM 2013/3/1 Mohit Vadhera : > Hi, > > While moving the data my data folder didn'

Re: Unknown processes unable to terminate

2013-03-04 Thread Jean-Marc Spaggiari
Hi Sai, Are you fine with killing all those processes on this machine? If you need ALL those processes to be killed, and if they are all Java processes, you can use killall -9 java. That will kill ALL the java processes under this user. JM 2013/3/4 shashwat shriparv : > You can use kill -9 13082 > > Is the

Re: How to setup Cloudera Hadoop to run everything on a localhost?

2013-03-05 Thread Jean-Marc Spaggiari
Hi Anton, Can you try to add something like: your.local.ip.address yourhostname into your hosts file? Like: 192.168.1.2 masterserver 2013/3/5 anton ashanin : > I am trying to run all Hadoop servers on a single Ubuntu localhost. All > ports are open and my /etc/hosts file is > > 127.

Re: How to setup Cloudera Hadoop to run everything on a localhost?

2013-03-05 Thread Jean-Marc Spaggiari
different IP address from my WiFi modem /router. > Will it be ok to add static address from 192.168.*.* to /etc/hosts in this > case? > > > > On Tue, Mar 5, 2013 at 9:47 PM, Jean-Marc Spaggiari > wrote: >> >> Hi Anton, >> >> Can you try to add so

Re:

2013-03-06 Thread Jean-Marc Spaggiari
Hi Ashish, It's an operation you have to do on your side. Have you tried google? https://www.google.ca/search?q=unsubscribe+hadoop.apache.org&aq=f&oq=unsubscribe+hadoop.apache.org&aqs=chrome.0.57.2271&sourceid=chrome&ie=UTF-8 JM 2013/3/6 : > Unsubscribe me > How many more times, I have to

Re: good way to measure and confirm.

2013-03-07 Thread Jean-Marc Spaggiari
Hi Patai, You might think about installing Ganglia. That will give you all the details of the CPU usage: user vs io vs idle, etc. JM 2013/3/7 Patai Sangbutsarakum : > Good morning Hadoopers! > > I am trying to find a way to prove that our cluster is cpu bound but not > io bound. Diamond + Graphite i
