Re: hadoop knowledge gaining

2011-10-10 Thread Steve Loughran
On 07/10/11 15:25, Jignesh Patel wrote: Guys, I am able to deploy my first program, word count, using Hadoop. I am interested in exploring more about Hadoop and HBase and don't know the best way to grasp both of them. I have Hadoop in Action, but it uses the older API. Actually the API

Re: ways to expand hadoop.tmp.dir capacity?

2011-10-10 Thread Marcos Luis Ortiz Valmaseda
2011/10/9 Harsh J ha...@cloudera.com Hello Meng, On Wed, Oct 5, 2011 at 11:02 AM, Meng Mao meng...@gmail.com wrote: Currently, we've got defined: <property> <name>hadoop.tmp.dir</name> <value>/hadoop/hadoop-metadata/cache/</value> </property> In our experiments with SOLR, the

Developing MapReduce

2011-10-10 Thread Mohit Anchlia
I use Eclipse. Is this http://wiki.apache.org/hadoop/EclipsePlugIn still the best way to develop MapReduce programs in Hadoop? Just want to make sure before I go down this path. Or should I just add the Hadoop jars to my Eclipse classpath and create my own MapReduce programs? Thanks

Re: Developing MapReduce

2011-10-10 Thread Jignesh Patel
When you download Hadoop, there is a related plugin in its dist folder (I don't remember the exact name). Go and get it from there. On Oct 10, 2011, at 10:34 AM, Mohit Anchlia wrote: I use Eclipse. Is this http://wiki.apache.org/hadoop/EclipsePlugIn still the best way to develop MapReduce programs in

How to iterate over a hdfs folder with hadoop

2011-10-10 Thread Raimon Bosch
Hi, I'm wondering how I can browse an HDFS folder using the classes in the org.apache.hadoop.fs package. The operation that I'm looking for is 'hadoop dfs -ls'. The standard file system equivalent would be: File f = new File(outputPath); if (f.isDirectory()) { String files[] = f.list(); for (String

Re: How to iterate over a hdfs folder with hadoop

2011-10-10 Thread John Conwell
FileStatus[] files = fs.listStatus(new Path(path)); for (FileStatus fileStatus : files) { //...do stuff here } On Mon, Oct 10, 2011 at 8:03 AM, Raimon Bosch raimon.bo...@gmail.com wrote: Hi, I'm wondering how I can browse an HDFS folder using the classes in the org.apache.hadoop.fs package.

Re: hadoop input buffer size

2011-10-10 Thread Uma Maheswara Rao G 72686
I think below can give you more info about it. http://developer.yahoo.com/blogs/hadoop/posts/2009/08/the_anatomy_of_hadoop_io_pipel/ Nice explanation by Owen here. Regards, Uma - Original Message - From: Yang Xiaoliang yangxiaoliang2...@gmail.com Date: Wednesday, October 5, 2011 4:27 pm

Custom InputFormat for Multiline Input File Hive/Hadoop

2011-10-10 Thread Mike Sukmanowsky
Hi all, Sending this to core-u...@hadoop.apache.org and d...@hive.apache.org. Trying to process Omniture's data log files with Hadoop/Hive. The file format is tab delimited and while being pretty simple for the most part, they do allow you to have multiple new lines and tabs within a field that

Re: How to iterate over a hdfs folder with hadoop

2011-10-10 Thread Uma Maheswara Rao G 72686
Yes, the FileStatus class would be the equivalent of list(). FileStatus has the APIs isDir and getPath; both should satisfy your further usage. :-) I think one small difference is that FileStatus will ensure sorted order. Regards, Uma - Original Message - From: John Conwell

Re: How to iterate over a hdfs folder with hadoop

2011-10-10 Thread Raimon Bosch
Thanks John! Here is the complete solution: Configuration jc = new Configuration(); Object files[] = null; List files_in_hdfs = new ArrayList(); FileSystem fs = FileSystem.get(jc); FileStatus[] file_status = fs.listStatus(new Path(outputPath)); for (FileStatus fileStatus : file_status) {
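Stitching the two replies in this thread together, here is a hedged, self-contained sketch of the pattern. The class name and the listed path are illustrative, and it assumes the 0.20-era Hadoop jars on the classpath and a reachable cluster (so no runnable test is attached):

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

// Sketch only: lists the files in an HDFS directory, the HDFS analogue of
// File.list(). The path "/user/hadoop-user/output" is a placeholder.
public class ListHdfsDir {
    public static void main(String[] args) throws IOException {
        Configuration conf = new Configuration();   // picks up core-site.xml
        FileSystem fs = FileSystem.get(conf);       // HDFS if fs.default.name points there

        List<Path> filesInHdfs = new ArrayList<Path>();
        FileStatus[] statuses = fs.listStatus(new Path("/user/hadoop-user/output"));
        for (FileStatus status : statuses) {
            if (!status.isDir()) {                  // isDir() is the 0.20-era API
                filesInHdfs.add(status.getPath());
            }
        }
        for (Path p : filesInHdfs) {
            System.out.println(p);
        }
    }
}
```

The same loop works against the local file system too, since FileSystem.get() returns whatever implementation the configured default points at.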

Re: hdfs directory location

2011-10-10 Thread bejoy . hadoop
Jignesh You are creating a dir in HDFS with that command. The dir won't be in your local file system but in HDFS. Issue a command like hadoop fs -ls /user/hadoop-user/citation/ You can see the dir you created in HDFS. If you want to create a dir on the local Unix file system, use the simple Linux command mkdir

Re: hdfs directory location

2011-10-10 Thread Jignesh Patel
Bejoy, If I create a directory on the Unix box, then how can I link it with the HDFS directory structure? -Jignesh On Oct 10, 2011, at 2:59 PM, bejoy.had...@gmail.com wrote: Jignesh You are creating a dir in HDFS with that command. The dir won't be in your local file system but in HDFS. Issue a

Re: Developing MapReduce

2011-10-10 Thread bejoy . hadoop
Hi Mohit I'm really not sure how many MapReduce developers use the MapReduce Eclipse plugin. AFAIK the majority don't. As Jignesh mentioned, you can get it from the Hadoop distribution folder as soon as you unzip it. My suggested approach would be: if you are on Windows OS, you

Re: hdfs directory location

2011-10-10 Thread bejoy . hadoop
Jignesh Sorry, I didn't get your query: 'how I can link it with HDFS directory structure?' You mean putting your Unix dir contents into HDFS? If so, use hadoop fs -copyFromLocal src destn --Original Message-- From: Jignesh Patel To: common-user@hadoop.apache.org

Re: hdfs directory location

2011-10-10 Thread Jignesh Patel
Bejoy, copyToLocal makes sense; it worked. But I am still wondering: if HDFS has a directory created on the local box, it should exist physically somewhere, but I wasn't able to locate it. Is the HDFS directory structure a virtual structure that doesn't exist physically? -Jignesh On Oct 10, 2011, at 3:53 PM,

Re: hdfs directory location

2011-10-10 Thread bejoy . hadoop
Jignesh You are absolutely right. In HDFS a directory doesn't exist physically; it is just metadata on the NameNode. I don't think such a dir structure would be there in the NameNode's local file system either, as it is just metadata, and hence no physical dir structure is created. Regards Bejoy K S -Original

Re: hdfs directory location

2011-10-10 Thread Arko Provo Mukherjee
Hi, I guess what you want is to see your HDFS directory through normal file system commands like ls, or by browsing the directory structure. This is not possible, as none of your commands, nor Finder (on Mac), have the ability to read / write HDFS. So they don't have the capability to show

ssh setup stop working

2011-10-10 Thread Jignesh Patel
I have created a private key setup on the local box, and until this weekend everything was working great. But today when I tried jps I found none of the services running, and when I tried to ssh localhost it started asking for a password. When I tried ssh-keygen -t rsa, the message appeared

Re: ssh setup stop working

2011-10-10 Thread Jignesh Patel
Nope, it works. I have a Mac system. On Oct 10, 2011, at 4:40 PM, Ilker Ozkaymak wrote: Has your user account's password expired? Best regards, IO On Mon, Oct 10, 2011 at 3:35 PM, Jignesh Patel jign...@websoft.com wrote: I have created a private key setup on the local box and until this

Re: Secondary namenode fsimage concept

2011-10-10 Thread Shouguo Li
hey patrick i wanted to configure my cluster to write namenode metadata to multiple directories as well: <property> <name>dfs.name.dir</name> <value>/hadoop/var/name,/mnt/hadoop/var/name</value> </property> in my case, /hadoop/var/name is a local directory, /mnt/hadoop/var/name is an NFS volume. i

Re: ssh setup stop working

2011-10-10 Thread Jignesh Patel
In fact, I have created a passphraseless key again and still it asks me for a password. On Oct 10, 2011, at 4:51 PM, Jignesh Patel wrote: Nope, it works. I have a Mac system On Oct 10, 2011, at 4:40 PM, Ilker Ozkaymak wrote: Has your user account's password expired? Best regards, IO

Re: ssh setup stop working

2011-10-10 Thread Ilker Ozkaymak
Keys require specific permissions: 700 for the .ssh directory and 600 for the authorized_keys file; anything more permissive and it won't work. However, you said it worked before. I usually experience this problem when the password ages; the key also doesn't work until the password is reset. Anyhow, it might be a little different. Best
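The permission fix Ilker describes boils down to two chmod commands. The sketch below reproduces the required layout in a scratch directory so it is safe to run anywhere; on a real machine the target would be "$HOME/.ssh":

```shell
# Recreate the permission layout sshd requires for key-based auth.
# A throwaway directory stands in for "$HOME/.ssh" in this demo.
demo="$(mktemp -d)/dot-ssh"
mkdir -p "$demo"
touch "$demo/authorized_keys"

chmod 700 "$demo"                   # .ssh itself: owner-only rwx
chmod 600 "$demo/authorized_keys"   # authorized_keys: owner-only rw

stat -c '%a' "$demo" "$demo/authorized_keys"
```

With StrictModes enabled (the default), sshd silently ignores keys whose files are group- or world-accessible, which is why ssh falls back to asking for a password.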

Subscribe to list

2011-10-10 Thread Joan Figuerola hurtado
Hi, I want to know how to subscribe to this list. Many thanks :)

problem in running program

2011-10-10 Thread Jignesh Patel
I'm trying to run the attached program. My input is the file /user/hadoop-user/input/cite65_77.txt. But it doesn't do anything: it doesn't read the file and doesn't create the output directory.

Re: ssh setup stop working

2011-10-10 Thread Jignesh Patel
You are right, I had a problem with the access rights. Now it works. On Oct 10, 2011, at 5:36 PM, Ilker Ozkaymak wrote: Keys require specific permissions: 700 for the .ssh directory and 600 for the authorized_keys file; anything more and it won't work. However you said it worked before; I usually experience

Re: ways to expand hadoop.tmp.dir capacity?

2011-10-10 Thread Meng Mao
So the only way we can expand to multiple mapred.local.dir paths is to configure our site.xml and restart the DataNode? On Mon, Oct 10, 2011 at 9:36 AM, Marcos Luis Ortiz Valmaseda marcosluis2...@googlemail.com wrote: 2011/10/9 Harsh J ha...@cloudera.com Hello Meng, On Wed, Oct 5, 2011

Re: hdfs directory location

2011-10-10 Thread Harsh J
Jignesh, Can be done. Use the fuse-dfs feature of HDFS to have your DFS as a 'physical' mount point on Linux. Instructions may be found here: http://wiki.apache.org/hadoop/MountableHDFS and on other resources across the web (search around for fuse hdfs). On Tue, Oct 11, 2011 at 1:32 AM, Jignesh

Re: problem in running program

2011-10-10 Thread Harsh J
Jignesh, Please do not attach files to the mailing list; they are stripped away and the community will never receive them. Instead, if it's small enough, paste it along in the mail, or paste it at a service like pastebin.com and pass along the public link. On Tue, Oct 11, 2011 at 3:35 AM, Jignesh

Re: ways to expand hadoop.tmp.dir capacity?

2011-10-10 Thread Harsh J
Meng, Yes, configure the mapred-site.xml (mapred.local.dir) to add the property and roll-restart your TaskTrackers. If you'd like to expand your DataNode to multiple disks as well (helps HDFS I/O greatly), do the same with hdfs-site.xml (dfs.data.dir) and perform the same rolling restart of
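As a concrete sketch of what Harsh describes (the disk paths are placeholders; the property names are the 0.20-era ones used elsewhere in this thread):

```xml
<!-- mapred-site.xml: spread intermediate map output over several disks -->
<property>
  <name>mapred.local.dir</name>
  <value>/disk1/mapred/local,/disk2/mapred/local</value>
</property>

<!-- hdfs-site.xml: let the DataNode store blocks on several disks -->
<property>
  <name>dfs.data.dir</name>
  <value>/disk1/dfs/data,/disk2/dfs/data</value>
</property>
```

Comma-separated values are how Hadoop expresses multiple directories; each daemon must be restarted before it picks up the new list.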

Re: Secondary namenode fsimage concept

2011-10-10 Thread Uma Maheswara Rao G 72686
Hi, It looks to me that the problem is with your NFS: it is not supporting locks. Which version of NFS are you using? Please check your NFS locking support by writing a simple program for file locking. I think NFSv4 supports locking (I did not try it). http://nfs.sourceforge.net/ A6. What are the
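The "simple program for file locking" Uma suggests can be sketched with the standard java.nio FileLock API. The class name is illustrative; point the path argument at a file on the NFS mount to probe the mount (the no-argument default uses a local temp file so the sketch is self-contained):

```java
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.channels.FileChannel;
import java.nio.channels.FileLock;

// Minimal probe for file-lock support: try to take an exclusive lock on a
// file and report whether it succeeded. Run it against a file on the NFS
// volume to check whether the mount supports locking.
public class LockProbe {
    public static boolean canLock(File target) {
        try (RandomAccessFile raf = new RandomAccessFile(target, "rw");
             FileChannel channel = raf.getChannel()) {
            FileLock lock = channel.tryLock();   // null or IOException => no lock
            if (lock == null) {
                return false;
            }
            lock.release();
            return true;
        } catch (IOException e) {
            return false;
        }
    }

    public static void main(String[] args) throws IOException {
        File target = args.length > 0
                ? new File(args[0])
                : File.createTempFile("lockprobe", ".tmp");
        System.out.println(canLock(target) ? "locking OK" : "locking FAILED");
    }
}
```

If the probe fails only on the NFS path, the mount (or the rpc.lockd service Harsh mentions below in the thread) is the culprit rather than Hadoop.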

Re: Secondary namenode fsimage concept

2011-10-10 Thread Harsh J
Generally you just gotta ensure that your rpc.lockd service is up and running on both ends, to allow for locking over NFS. On Tue, Oct 11, 2011 at 8:16 AM, Uma Maheswara Rao G 72686 mahesw...@huawei.com wrote: Hi, It looks to me that, problem with your NFS. It is not supporting locks. Which

Re: Is it possible to run multiple MapReduce against the same HDFS?

2011-10-10 Thread Zhenhua (Gerald) Guo
Thanks, Robert. I will look into HOD. When the MapReduce framework accesses data stored in HDFS, which account is used: the account the MapReduce daemons (e.g. the JobTracker) run as, or the account of the user who submits the job? If HDFS and MapReduce clusters are run with different accounts, can

Re: hadoop input buffer size

2011-10-10 Thread Mark question
Thanks for the clarifications guys :) Mark On Mon, Oct 10, 2011 at 8:27 AM, Uma Maheswara Rao G 72686 mahesw...@huawei.com wrote: I think below can give you more info about it. http://developer.yahoo.com/blogs/hadoop/posts/2009/08/the_anatomy_of_hadoop_io_pipel/ Nice explanation by Owen

Re: Error using hadoop distcp

2011-10-10 Thread Uma Maheswara Rao G 72686
Distcp will run as a MapReduce job. Here the TaskTrackers require the hostname mappings to contact the other nodes. Please configure the mapping correctly on both machines and try again. Regards, Uma - Original Message - From: trang van anh anh...@vtc.vn Date: Wednesday, October 5, 2011 1:41 pm