Re: Questions about Hadoop

2008-09-26 Thread Paco NATHAN
Edward, Can you describe more about Hama, with respect to Hadoop? I've read through the Incubator proposal and your blog -- it's a great approach. Are there any benchmarks available? E.g., size of data sets used, kinds of operations performed, etc. Will this project be able to make use of

Re: Questions about Hadoop

2008-09-25 Thread Edward J. Yoon
The decision making system seems interesting to me. :) The question I want to ask is whether it is possible to perform statistical analysis on the data using Hadoop and MapReduce. I'm sure Hadoop could do it. FYI, The Hama project is an easy-to-use to matrix algebra and its uses in

Re: Questions about Hadoop

2008-09-24 Thread Enis Soztutar
Hi, Arijit Mukherjee wrote: Hi We've been thinking of using Hadoop for a decision making system which will analyze telecom-related data from various sources to take certain decisions. The data can be huge, of the order of terabytes, and can be stored as CSV files, which I understand will fit

RE: Questions about Hadoop

2008-09-24 Thread Arijit Mukherjee
: Wednesday, September 24, 2008 2:57 PM To: core-user@hadoop.apache.org Subject: Re: Questions about Hadoop Hi, Arijit Mukherjee wrote: Hi We've been thinking of using Hadoop for a decision making system which will analyze telecom-related data from various sources to take certain decisions

RE: Questions about Hadoop

2008-09-24 Thread Arijit Mukherjee
-----Original Message----- From: Enis Soztutar [mailto:[EMAIL PROTECTED] Sent: Wednesday, September 24, 2008 4:53 PM To: core-user@hadoop.apache.org Subject: Re: Questions about Hadoop Arijit Mukherjee wrote: Thanx Enis. By workflow, I was trying to mean something like a chain of MapReduce

Re: Questions about Hadoop

2008-09-24 Thread Paco NATHAN
Arijit, For workflow, check out http://cascading.org -- that works quite well and fits what you described. Greenplum and Aster Data have announced support for running MR within the context of their relational databases, e.g., http://www.greenplum.com/resources/mapreduce/ In terms of PIG, Hive,

RE: Questions about Hadoop

2008-09-24 Thread Arijit Mukherjee
Lake Kolkata 700 091, India Phone: +91 (0)33 23577531/32 x 107 http://www.connectivasystems.com -----Original Message----- From: Paco NATHAN [mailto:[EMAIL PROTECTED] Sent: Wednesday, September 24, 2008 6:10 PM To: core-user@hadoop.apache.org; [EMAIL PROTECTED] Subject: Re: Questions about Hadoop

Re: Questions about Hadoop

2008-09-24 Thread Paco NATHAN
://www.connectivasystems.com -----Original Message----- From: Paco NATHAN [mailto:[EMAIL PROTECTED] Sent: Wednesday, September 24, 2008 6:10 PM To: core-user@hadoop.apache.org; [EMAIL PROTECTED] Subject: Re: Questions about Hadoop Arijit, For workflow, check out http://cascading.org

Re: basic questions about Hadoop!

2008-09-01 Thread Victor Samoylov
Gerardo, Thanks for your information. I've had success with remote writing on HDFS using the following steps: 1. Installation of the latest stable version (hadoop 0.17.2.1) to data nodes and client machine. 2. Open ports 50010, 50070, 54310, 54311 on data nodes machines to access from client machine
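The port step in the message above can be checked from the client machine before attempting a remote write. The following is only a minimal sketch in Python, not part of Hadoop: the function names are hypothetical, the port list is copied from the message, and any host name you pass in is your own data node.

```python
import socket

# Ports listed in the message above. Which Hadoop service listens on
# each one depends on the cluster configuration.
HDFS_PORTS = (50010, 50070, 54310, 54311)


def port_is_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers "connection refused", timeouts, and unreachable hosts.
        return False


def check_ports(host, ports=HDFS_PORTS, timeout=2.0):
    """Map each port to True (reachable) or False (not reachable)."""
    return {port: port_is_open(host, port, timeout) for port in ports}
```

Calling check_ports with the name of one data node returns a dictionary keyed by port. A False entry would be consistent with the "Connection refused" error reported elsewhere in this thread, although a firewall is only one possible cause.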

Re: basic questions about Hadoop!

2008-09-01 Thread Mafish Liu
On Sat, Aug 30, 2008 at 10:12 AM, Gerardo Velez [EMAIL PROTECTED] wrote: Hi Victor! I got problem with remote writing as well, so I tried to go further on this and I would like to share what I did, maybe you have more luck than me 1) as I'm working with user gvelez in remote host I had to

Re: basic questions about Hadoop!

2008-08-29 Thread Victor Samoylov
Jeff, Thanks for detailed instructions, but on machine that is not hadoop server I got error: ~/hadoop-0.17.2$ ./bin/hadoop dfs -copyFromLocal NOTICE.txt test 08/08/29 19:33:07 INFO dfs.DFSClient: Exception in createBlockOutputStream java.net.ConnectException: Connection refused 08/08/29 19:33:07

Re: basic questions about Hadoop!

2008-08-29 Thread Gerardo Velez
Hi Victor! I got problem with remote writing as well, so I tried to go further on this and I would like to share what I did, maybe you have more luck than me 1) as I'm working with user gvelez in remote host I had to give write access to all, like this: bin/hadoop dfs -chmod -R a+w input

basic questions about Hadoop!

2008-08-28 Thread Gerardo Velez
Hi Everybody! I'm a newbie with Hadoop, I've installed it as a single node as a pseudo-distributed environment, but I would like to go further and configure a complete hadoop cluster. But I got the following questions. 1.- I understand that HDFS has a master/slave architecture. So master and the

Re: basic questions about Hadoop!

2008-08-28 Thread Jeff Payne
Gerardo: I can't really speak to all of your questions, but the master/slave issue is a common concern with hadoop. A cluster has a single namenode and therefore a single point of failure. There is also a secondary name node process which runs on the same machine as the name node in most

Re: basic questions about Hadoop!

2008-08-28 Thread Gerardo Velez
Hi Jeff, thank you for answering! What about remote writing on HDFS, let's suppose I got an application server on a linux server A and I got a Hadoop cluster on servers B (master), C (slave), D (slave) What I would like is to send some files from Server A to be processed by hadoop. So in order to do

Re: basic questions about Hadoop!

2008-08-28 Thread Jeff Payne
You can use the hadoop command line on machines that aren't hadoop servers. If you copy the hadoop configuration from one of your master servers or data node to the client machine and run the command line dfs tools, it will copy the files directly to the data node. Or, you could use one of the

Re: basic questions about Hadoop!

2008-08-28 Thread Gerardo Velez
Thanks Jeff and sorry for bothering you again! I got clear the remote writing into HDFS, but what about hadoop process? Once the file has been copied to HDFS, do I still need to run hadoop -jarfile input output every time? if I need to do it every time, should I do it from remote server as

Re: Two questions about hadoop

2008-07-16 Thread chaitanya krishna
Hi, Try setting number of map tasks in the program itself. For example, in the Wordcount example, you can set the number of maptasks in run method as conf.setNumMapTasks(no. of map tasks). I hope this answers your first query. Regards, V.V.Chaitanya Krishna IIIT,Hyderabad On Wed, Jul 16, 2008

Two questions about hadoop

2008-07-15 Thread Wei Jiang
Hi all, I am a new user with hadoop and have some questions about it. 1)about setting the number of maps/reduces: With running hadoop on a 8-node cluster, I set mapred.map.tasks to 64 and mapred.tasktracker.map.tasks.maximum to 8, but by examining the counter launched map tasks from the output,
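The numbers in the question above can be related with a little arithmetic. The sketch below is only illustrative and rests on two assumptions: one TaskTracker runs per node, and mapred.tasktracker.map.tasks.maximum is the number of map tasks a single TaskTracker runs at once. The function names are hypothetical. Note also that mapred.map.tasks is a hint to the framework, so the launched map task count can differ from it.

```python
import math


def map_slot_capacity(nodes, max_maps_per_tracker):
    """Map tasks the cluster can run at once, assuming one TaskTracker per node."""
    return nodes * max_maps_per_tracker


def map_waves(map_tasks, nodes, max_maps_per_tracker):
    """Rounds of execution needed to run map_tasks on the available slots."""
    return math.ceil(map_tasks / map_slot_capacity(nodes, max_maps_per_tracker))
```

For the 8-node cluster with a per-tracker maximum of 8, map_slot_capacity(8, 8) gives 64 slots, so 64 map tasks fit in a single wave under these assumptions.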