Re: How to change topology

2012-10-09 Thread Steve Loughran
On 8 October 2012 14:23, Shinichi Yamashita wrote:
> Hi,
> I know that DataNode and TaskTracker must restart to change topology.
No, it's the NameNode and JobTracker that need to be restarted; they are the bits that care where the boxes are.
> Is there the method to execute the topology c
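For context, rack mappings in this era of Hadoop come from an admin-supplied topology script (wired up via the topology.script.file.name property) that the NameNode and JobTracker invoke to resolve addresses to rack paths. A minimal sketch, with made-up subnets and rack names:

```shell
#!/bin/sh
# Hypothetical topology script: maps a host/IP to a rack path.
# Subnets and rack names below are examples only.
resolve_rack() {
  case "$1" in
    10.1.1.*) echo "/dc1/rack1" ;;
    10.1.2.*) echo "/dc1/rack2" ;;
    *)        echo "/default-rack" ;;   # fallback for unknown hosts
  esac
}

# Hadoop calls the script with one or more addresses and expects
# one rack path per line, in the same order.
for node in "$@"; do
  resolve_rack "$node"
done
```

Because resolved mappings are cached in the master daemons, editing this script alone is not enough; the NameNode and JobTracker must be restarted to pick up changed mappings.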

Re: sym Links in hadoop

2012-10-09 Thread Visioner Sadak
I am using hadoop 1.0.3 On Tue, Oct 9, 2012 at 12:22 PM, Visioner Sadak wrote: > Thanks Colin I tried using FileContext but the class is showing as > deprecated > > On Tue, Oct 9, 2012 at 12:02 AM, Colin McCabe wrote: > >> You can create an HDFS symlink by using the FileContext#createSymlink >> f

DFS respond very slow

2012-10-09 Thread Alexey
Hi, I have an issue with hadoop dfs. I have 3 servers (24Gb RAM on each). The servers are not overloaded; they just have hadoop installed. One has a datanode and the namenode, the second a datanode only, the third a datanode and the secondarynamenode. Hadoop datanodes have a max memory limit of 8Gb. Default replica

Re: stable release of hadoop

2012-10-09 Thread Bejoy KS
Hi Nisha The current stable version is the 1.0.x release line. This is well suited for production environments. The 0.23.x/2.x.x releases are of alpha quality and hence not recommended for production. Regards Bejoy KS Sent from handheld, please excuse typos. -Original Message- From: nisha

Re: stable release of hadoop

2012-10-09 Thread nisha
Thanks. It was not mentioned in the latest release announcement for 17 September, 2012: Release 0.23.3. For the rest of the releases it is mentioned that they are alpha. Any idea when the next stable release will come? On Tue, Oct 9, 2012 at 5:15 PM, Bejoy KS wrote: > ** > Hi Nisha > > The current stable version is the 1.0.x releases. Thi

Re: sym Links in hadoop

2012-10-09 Thread Dave Beech
A lot of code in Hadoop is marked "Deprecated". This doesn't mean you shouldn't use it ;) Cheers, Dave On 9 October 2012 09:28, Visioner Sadak wrote: > I am using hadoop 1.0.3 > > On Tue, Oct 9, 2012 at 12:22 PM, Visioner Sadak > wrote: >> >> Thanks Colin I tried using FileContext but the class

Re: sym Links in hadoop

2012-10-09 Thread Dave Beech
Actually, I don't think the FileContext class Colin mentioned is the one you are talking about. Hadoop 1.0.3 doesn't have org.apache.hadoop.fs.FileContext, which is the class you would need, but it does have org.apache.hadoop.metrics.file.FileContext which is something completely different. Symli

Re: sym Links in hadoop

2012-10-09 Thread Visioner Sadak
Yes Dave, you are right. I think symlink support is not available in the 1.x.x releases ... I did search a lot :) I can't move to older versions; actually I wanted webhdfs, and that's why I needed the 1.x.x versions. It's a complete deadlock for me now, I guess, because I need webhdfs plus symlinks :) On Tue, Oct 9, 20

Re: Acces HAR over http through browser

2012-10-09 Thread Visioner Sadak
I tried these 3 combinations, all in vain :) My har location is /user/testhar.har/test.jpg On Sat, Oct 6, 2012 at 3:08 PM, Visioner Sadak wrote: > > > Hello experts, any thoughts on how to access HAR files thru http? if its a > normal file then i can access it by usin
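For what it's worth, files inside an archive are normally addressed with the har:// filesystem scheme rather than a plain HTTP path. A sketch of the URI shape, using the archive path from the message above (the har:/// form resolves against the cluster's default filesystem):

```shell
# Build the har:// URI for a file inside the archive from the thread.
ARCHIVE="/user/testhar.har"
FILE_IN_ARCHIVE="test.jpg"
HAR_URI="har://${ARCHIVE}/${FILE_IN_ARCHIVE}"
echo "$HAR_URI"
# The URI can then be used with the fs shell, e.g.:
#   hadoop fs -cat "$HAR_URI"
```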

Re: File block size use

2012-10-09 Thread Anna Lahoud
Bejoy - I tried this technique a number of times, and was not able to get this to work. My files remain as they were on input. Is there a version I need (beyond 0.20.2) to make this work, or another setting that could prevent it from working? On Tue, Oct 2, 2012 at 12:23 AM, Bejoy KS wrote: > **

Re: File block size use

2012-10-09 Thread Anna Lahoud
Raj - I was not able to get this to work either. On Tue, Oct 2, 2012 at 10:52 AM, Raj Vishwanathan wrote: > I haven't tried it but this should also work > > hadoop fs -Ddfs.block.size= -cp src dest > > Raj > > -- > *From:* Anna Lahoud > *To:* user@hadoop.apache

Hive-Site XML changing any property.

2012-10-09 Thread Uddipan Mukherjee
Hi hadoop and hive gurus, I have a requirement to change the path of the scratch folder of Hive. Hence I have added the following property in hive-site.xml and changed its value as required: hive.exec.scratchdir ("Scratch space for Hive jobs"). But still it is not reflecting as required. Do
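The flattened snippet above would normally appear in hive-site.xml as a full property entry; a sketch (the value path is a made-up example):

```xml
<property>
  <name>hive.exec.scratchdir</name>
  <!-- hypothetical scratch path; substitute your own -->
  <value>/tmp/my-hive-scratch</value>
  <description>Scratch space for Hive jobs</description>
</property>
```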

Re: Hive-Site XML changing any property.

2012-10-09 Thread nagarjuna kanamarlapudi
Restarting the Hive server should solve your problem. Not sure if we have any other ways of starting the Hive server other than: 1. bin/hive --service hiveserver 2. HIVE_PORT= ./hive --service hiveserver Regards, Nagarjuna On Tue, Oct 9, 2012 at 8:10 PM, Uddipan Mukherjee < uddip

Re: How to change topology

2012-10-09 Thread Shinichi Yamashita
Hi Steve, Thank you for your reply. > no, it's the Namenode and JobTracker that needs to be restarted; > they are the bits that care where the boxes are. I confirmed it in my cluster, and I understood it as follows. First, the resolved node information is recorded in ConcurrentHashMap. Next same

Re: File block size use

2012-10-09 Thread Raj Vishwanathan
Anna I misunderstood your problem. I thought you wanted to change the block size of every file. I didn't realize that you were aggregating multiple small files into a different, albeit smaller, set of larger files with a bigger block size to improve performance. I think, as Chris suggested, you ne

Re: File block size use

2012-10-09 Thread Anna Lahoud
You are correct that I want to create a small number of large files from a large number of small files. The only solution that has worked, as you say, has been a custom M/R job. Thank you for the help and ideas. On Tue, Oct 9, 2012 at 12:09 PM, Raj Vishwanathan wrote: > Anna > > I misunderstood

using sequencefile generated by Sqoop in Mapreduce

2012-10-09 Thread Kartashov, Andy
Guys, I have trouble using a sequence file in MapReduce. The output I get is the very last record only. I am creating the sequence file while importing a MySQL table into Hadoop using: $ sqoop import .. --as-sequencefile I am then trying to read from this file into the mapper and create keys from obje

Re: How to change topology

2012-10-09 Thread Steve Loughran
On 9 October 2012 16:51, Shinichi Yamashita wrote: > Hi Steve, > > Thank you for your reply. > > > > no, it's the Namenode and JobTracker that needs to be restarted; > > they are the bits that care where the boxes are. > > I confirmed it in my cluster, and I understood it as follows. > First, the

Re: How to change topology

2012-10-09 Thread Ted Dunning
On Tue, Oct 9, 2012 at 12:17 PM, Steve Loughran wrote: > > > On 9 October 2012 16:51, Shinichi Yamashita wrote: > >> Hi Steve, >> >> Thank you for your reply. >> >> >> > no, it's the Namenode and JobTracker that needs to be restarted; >> > they are the bits that care where the boxes are. >> >> I

RE: using sequencefile generated by Sqoop in Mapreduce

2012-10-09 Thread Kartashov, Andy
Gents, please ignore my message below. Everything fits like a glove. conf.setInputFormat(SequenceFileInputFormat.class) indeed works well with the Sqoop-generated class. The reason why I was getting only the last line in my output is because I failed to notice that I was using fs.create() instead of fs.append(). *
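The create-versus-append mix-up above is easy to demonstrate outside Hadoop: shell redirection has the same two behaviours, with `>` truncating on every write the way a fresh fs.create() does, and `>>` appending the way fs.append() does. A small illustration (file names are arbitrary):

```shell
# '>' truncates the file on each write, so only the last record survives --
# the same symptom as re-creating the output for every record.
echo "record1" > out.txt
echo "record2" > out.txt    # overwrites record1

# '>>' appends, so both records are kept.
echo "record1" >> log.txt
echo "record2" >> log.txt
```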

Re: HDFS-347 and HDFS-2246 issues different

2012-10-09 Thread Colin McCabe
Would it make more sense to post these comments on the JIRA? This list is more for user issues. In general, I don't see why the DataNode being "dead" has anything to do with whether we are reading the wrong data by reading the files that it used to manage. Files in HDFS are append-only, and our c

issue with permissions of mapred.system.dir

2012-10-09 Thread Goldstone, Robin J.
I am bringing up a Hadoop cluster for the first time (but am an experienced sysadmin with lots of cluster experience) and running into an issue with permissions on mapred.system.dir. It has generally been a chore to figure out all the various directories that need to be created to get Hadoop w

Re: issue with permissions of mapred.system.dir

2012-10-09 Thread Marcos Ortiz
On 10/09/2012 07:44 PM, Goldstone, Robin J. wrote: I am bringing up a Hadoop cluster for the first time (but am an experienced sysadmin with lots of cluster experience) and running into an issue with permissions on mapred.system.dir. It has generally been a chore to figure out all the various

Re: copyFromLocal

2012-10-09 Thread Robert Molina
Here is the information for the WebHDFS REST call that should allow you to upload a file: http://hadoop.apache.org/docs/r1.0.3/webhdfs.html#CREATE HTH On Fri, Oct 5, 2012 at 1:16 AM, Visioner Sadak wrote: > Hey thanks bejoy and andy act my user just has a desktop web user(like we > browsing web) so
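Per that documentation, CREATE is a two-step operation: a first PUT to the NameNode returns a 307 redirect, and the file body is sent in a second PUT to the redirected DataNode address. A sketch of the first-step URL (host, port, and path are placeholder values):

```shell
# Hypothetical cluster endpoint and target path -- substitute your own.
NAMENODE="namenode.example.com:50070"
FILE_PATH="/user/hadoop/photo.jpg"

# Step 1: ask the NameNode where to write; it answers with a 307 redirect.
CREATE_URL="http://${NAMENODE}/webhdfs/v1${FILE_PATH}?op=CREATE"
echo "$CREATE_URL"

# Step 2 would then PUT the file body to the redirect target, e.g.:
#   curl -i -X PUT -T photo.jpg "<redirect location from step 1>"
```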

Re: issue with permissions of mapred.system.dir

2012-10-09 Thread Arpit Gupta
What is your "mapreduce.jobtracker.staging.root.dir" set to? This is a directory that needs to be writable by the user, and it is recommended to set it to "/user" so that jobs write into the appropriate user's home directory. -- Arpit Gupta Hortonworks Inc. http://hortonworks.com/ On Oct 9, 2012, at 4:44 PM,
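Arpit's suggestion would look roughly like this in mapred-site.xml (sketch only):

```xml
<property>
  <name>mapreduce.jobtracker.staging.root.dir</name>
  <!-- staging paths then resolve under each user's home directory -->
  <value>/user</value>
</property>
```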

Hadoop/Lucene + Solr architecture suggestions?

2012-10-09 Thread Mark Kerzner
Hi, if I create a Lucene index in each mapper, locally, then copy them under /jobid/mapid1, /jobid/mapid2, and then in the reducers copy them to some Solr machine (perhaps even merging), does such an architecture make sense for creating a searchable index with Hadoop? Are there links for similar a

Re: Hadoop/Lucene + Solr architecture suggestions?

2012-10-09 Thread Ivan Frain
Hi Mark, I don't know Lucene/Solr very well, but your question reminded me of the Lily project: http://www.lilyproject.org/lily/index.html. They use Hadoop/HBase and Solr to provide a searchable data management platform. Maybe you will find ideas in their documentation. BR, Ivan 2012/10/10 Mar

Re: DFS respond very slow

2012-10-09 Thread Alexey
Additional info: I also tried to use OpenJDK instead of Sun's JDK - the issue still persists On 10/09/12 03:12 AM, Alexey wrote: > Hi, > > I have an issues with hadoop dfs, I have 3 servers (24Gb RAM on each). > The servers are not overloaded, they just have hadoop installed. One > have datanode and name

Re: DFS respond very slow

2012-10-09 Thread Harsh J
Hey Alexey, Have you noticed this right from the start itself? Also, what exactly do you mean by "Limited replication bandwidth between datanodes - 5Mb." - Are you talking of dfs.balance.bandwidthPerSec property? On Wed, Oct 10, 2012 at 10:53 AM, Alexey wrote: > Additional info: I also tried to

Re: DFS respond very slow

2012-10-09 Thread Alexey
Hello Harsh, I noticed such issues from the start. Yes, I mean the dfs.balance.bandwidthPerSec property; I set this property to 500. On 10/09/12 11:50 PM, Harsh J wrote: > Hey Alexey, > > Have you noticed this right from the start itself? Also, what exactly > do you mean by "Limited replication
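One thing worth checking here: dfs.balance.bandwidthPerSec is specified in bytes per second (the default is 1048576, i.e. 1 MB/s), so a literal value of 500 would throttle balancing traffic to 500 bytes per second. A 5 MB/s limit would look like this in hdfs-site.xml:

```xml
<property>
  <name>dfs.balance.bandwidthPerSec</name>
  <!-- unit is bytes per second: 5 MB/s = 5 * 1024 * 1024 -->
  <value>5242880</value>
</property>
```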

Re: DFS respond very slow

2012-10-09 Thread Harsh J
Hi, OK, can you detail your network infrastructure used here, and also make sure your daemons are binding to the right interfaces as well (use netstat to check perhaps)? What rate of transfer do you get for simple file transfers (ftp, scp, etc.)? On Wed, Oct 10, 2012 at 12:24 PM, Alexey wrote: >