Edward,
Can you describe Hama in more detail, with respect to Hadoop?
I've read through the Incubator proposal and your blog -- it's a great approach.
Are there any benchmarks available? E.g., size of data sets used,
kinds of operations performed, etc.
Will this project be able to make use of
The decision making system seems interesting to me. :)
The question I want to ask is whether it is possible to perform statistical
analysis on the data using Hadoop and MapReduce.
I'm sure Hadoop could do it. FYI, the Hama project is an easy-to-use
approach to matrix algebra and its uses in
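As a rough illustration of the statistical-analysis question (independent of Hama, and not from this thread), here is a minimal sketch that computes the mean of one numeric CSV column with the classic MapReduce API; the class name and column index are hypothetical:

import java.io.IOException;
import java.util.Iterator;

import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.DoubleWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.*;

public class CsvMean {
  // Emits the value of one CSV column under a single key so that one
  // reducer can average them.
  public static class Map extends MapReduceBase
      implements Mapper<LongWritable, Text, Text, DoubleWritable> {
    private static final Text KEY = new Text("mean");
    public void map(LongWritable offset, Text line,
        OutputCollector<Text, DoubleWritable> out, Reporter reporter)
        throws IOException {
      String[] fields = line.toString().split(",");
      // Column 2 is assumed (hypothetically) to hold the measurement.
      out.collect(KEY, new DoubleWritable(Double.parseDouble(fields[2])));
    }
  }

  // Averages all values seen for the key. No combiner is set, since a
  // plain mean is not associative.
  public static class Reduce extends MapReduceBase
      implements Reducer<Text, DoubleWritable, Text, DoubleWritable> {
    public void reduce(Text key, Iterator<DoubleWritable> values,
        OutputCollector<Text, DoubleWritable> out, Reporter reporter)
        throws IOException {
      double sum = 0;
      long count = 0;
      while (values.hasNext()) {
        sum += values.next().get();
        count++;
      }
      out.collect(key, new DoubleWritable(sum / count));
    }
  }

  public static void main(String[] args) throws IOException {
    JobConf conf = new JobConf(CsvMean.class);
    conf.setJobName("csv-mean");
    conf.setOutputKeyClass(Text.class);
    conf.setOutputValueClass(DoubleWritable.class);
    conf.setMapperClass(Map.class);
    conf.setReducerClass(Reduce.class);
    FileInputFormat.setInputPaths(conf, new Path(args[0]));
    FileOutputFormat.setOutputPath(conf, new Path(args[1]));
    JobClient.runJob(conf);
  }
}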
Hi,
Arijit Mukherjee wrote:
Hi
We've been thinking of using Hadoop for a decision making system which
will analyze telecom-related data from various sources to take certain
decisions. The data can be huge, of the order of terabytes, and can be
stored as CSV files, which I understand will fit
-Original Message-
From: Enis Soztutar [mailto:[EMAIL PROTECTED]
Sent: Wednesday, September 24, 2008 4:53 PM
To: core-user@hadoop.apache.org
Subject: Re: Questions about Hadoop
Arijit Mukherjee wrote:
Thanks, Enis.
By workflow, I meant something like a chain of MapReduce jobs
Arijit,
For workflow, check out http://cascading.org -- that works quite well
and fits what you described.
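For reference, a chain like that can also be expressed with the plain Hadoop API by running the jobs back to back; this is only a minimal sketch, and the paths, job names, and elided mapper/reducer setup are hypothetical:

import org.apache.hadoop.fs.Path;
import org.apache.hadoop.mapred.FileInputFormat;
import org.apache.hadoop.mapred.FileOutputFormat;
import org.apache.hadoop.mapred.JobClient;
import org.apache.hadoop.mapred.JobConf;

public class TwoStepWorkflow {
  public static void main(String[] args) throws Exception {
    Path raw = new Path("/data/raw");         // hypothetical input
    Path cleaned = new Path("/data/cleaned"); // intermediate directory
    Path report = new Path("/data/report");   // final output

    // Step 1: clean the raw CSV records.
    JobConf step1 = new JobConf(TwoStepWorkflow.class);
    step1.setJobName("clean-csv");
    // step1.setMapperClass(...); step1.setReducerClass(...);
    FileInputFormat.setInputPaths(step1, raw);
    FileOutputFormat.setOutputPath(step1, cleaned);
    JobClient.runJob(step1); // blocks until step 1 completes

    // Step 2: aggregate, reading what step 1 wrote.
    JobConf step2 = new JobConf(TwoStepWorkflow.class);
    step2.setJobName("aggregate");
    // step2.setMapperClass(...); step2.setReducerClass(...);
    FileInputFormat.setInputPaths(step2, cleaned);
    FileOutputFormat.setOutputPath(step2, report);
    JobClient.runJob(step2);
  }
}

A workflow tool such as Cascading mainly saves you from hand-managing these intermediate paths and job dependencies as the chain grows.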
Greenplum and Aster Data have announced support for running MR within
the context of their relational databases, e.g.,
http://www.greenplum.com/resources/mapreduce/
In terms of PIG, Hive,
-Original Message-
From: Paco NATHAN [mailto:[EMAIL PROTECTED]
Sent: Wednesday, September 24, 2008 6:10 PM
To: core-user@hadoop.apache.org; [EMAIL PROTECTED]
Subject: Re: Questions about Hadoop
Arijit,
For workflow, check out http://cascading.org
Gerardo,
Thank you for the information.
I've had success with remote writing to HDFS using the following steps:
1. Install the latest stable version (Hadoop 0.17.2.1) on the data nodes
and the client machine.
2. Open ports 50010, 50070, 54310, and 54311 on the data node machines so they
can be accessed from the client machine (a quick check for this is sketched below).
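One hypothetical way to verify step 2 from the client machine is to try opening a TCP connection to each port (the hostnames here are placeholders):

import java.net.InetSocketAddress;
import java.net.Socket;

public class PortCheck {
  public static void main(String[] args) throws Exception {
    String[] hosts = { "namenode", "datanode1" }; // hypothetical hosts
    int[] ports = { 50010, 50070, 54310, 54311 };
    for (String host : hosts) {
      for (int port : ports) {
        Socket s = new Socket();
        try {
          // Two-second timeout: fail fast if a firewall drops packets.
          s.connect(new InetSocketAddress(host, port), 2000);
          System.out.println(host + ":" + port + " reachable");
        } catch (Exception e) {
          System.out.println(host + ":" + port + " blocked: " + e.getMessage());
        } finally {
          s.close();
        }
      }
    }
  }
}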
Jeff,
Thanks for the detailed instructions, but on a machine that is not a Hadoop
server I got this error:
~/hadoop-0.17.2$ ./bin/hadoop dfs -copyFromLocal NOTICE.txt test
08/08/29 19:33:07 INFO dfs.DFSClient: Exception in createBlockOutputStream
java.net.ConnectException: Connection refused
08/08/29 19:33:07
Hi Victor!
I had problems with remote writing as well, so I tried to dig further into
this, and I would like to share what I did; maybe you'll have more luck than me.
1) As I'm working as user gvelez on the remote host, I had to give write access
to all, like this:
bin/hadoop dfs -chmod -R a+w input
Hi Everybody!
I'm a newbie with Hadoop. I've installed it as a single-node,
pseudo-distributed environment, but I would like to go further and configure
a complete Hadoop cluster. I have the following questions.
1.- I understand that HDFS has a master/slave architecture. So the master and
the
Gerardo:
I can't really speak to all of your questions, but the master/slave issue is
a common concern with Hadoop. A cluster has a single namenode and therefore
a single point of failure. There is also a secondary namenode process,
which runs on the same machine as the namenode in most default setups.
Hi Jeff, thank you for answering!
What about remote writing to HDFS? Let's suppose I have an application server
on a Linux server A, and a Hadoop cluster on servers B (master), C (slave),
and D (slave).
What I would like is to send some files from server A to be processed by
Hadoop. So in order to do
You can use the Hadoop command line on machines that aren't Hadoop servers.
If you copy the Hadoop configuration from one of your master servers or data
nodes to the client machine and run the command-line DFS tools, they will copy
the files directly to the data nodes.
Or, you could use one of the
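If the truncated suggestion above refers to the programmatic route, a remote write from server A could look like this minimal sketch using the Java FileSystem API; the namenode URI and paths are hypothetical:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class RemoteWrite {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    // Point the client at the cluster's namenode; this matches the
    // fs.default.name value in the cluster's config (host hypothetical).
    conf.set("fs.default.name", "hdfs://master:54310");
    FileSystem fs = FileSystem.get(conf);
    // Copy a local file from server A into HDFS on the cluster.
    fs.copyFromLocalFile(new Path("/tmp/report.csv"),
                         new Path("/user/gvelez/input/report.csv"));
    fs.close();
  }
}

Either way, the client only needs the cluster configuration and network access to the namenode and datanodes, just like the command-line tools.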
Thanks, Jeff, and sorry for bothering you again!
I'm now clear on remote writing into HDFS, but what about the Hadoop
processing? Once the file has been copied to HDFS, do I still need to run
hadoop jar jarfile input output every time?
If I need to do it every time, should I do it from the remote server as
Hi,
Try setting the number of map tasks in the program itself. For example, in the
WordCount example, you can set the number of map tasks in the run method as
conf.setNumMapTasks(<no. of map tasks>).
I hope this answers your first query.
Regards,
V.V.Chaitanya Krishna
IIIT,Hyderabad
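To make that concrete, here is a minimal sketch in the style of the old WordCount driver; the class name is hypothetical and the mapper/reducer setup is elided:

import org.apache.hadoop.conf.Configured;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.mapred.FileInputFormat;
import org.apache.hadoop.mapred.FileOutputFormat;
import org.apache.hadoop.mapred.JobClient;
import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.util.Tool;
import org.apache.hadoop.util.ToolRunner;

public class WordCountWithMaps extends Configured implements Tool {
  public int run(String[] args) throws Exception {
    JobConf conf = new JobConf(getConf(), WordCountWithMaps.class);
    conf.setJobName("wordcount");
    // ... setMapperClass/setReducerClass etc. as in the stock example ...
    conf.setNumMapTasks(64); // requested number of map tasks
    FileInputFormat.setInputPaths(conf, new Path(args[0]));
    FileOutputFormat.setOutputPath(conf, new Path(args[1]));
    JobClient.runJob(conf);
    return 0;
  }

  public static void main(String[] args) throws Exception {
    System.exit(ToolRunner.run(new WordCountWithMaps(), args));
  }
}

Keep in mind that setNumMapTasks() is only a hint: the actual number of launched map tasks is driven by the number of input splits, which is why the counter can differ from the value you set.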
On Wed, Jul 16, 2008
Hi all,
I am a new Hadoop user and have some questions about it.
1) About setting the number of maps/reduces: running Hadoop on an 8-node
cluster, I set mapred.map.tasks to 64 and
mapred.tasktracker.map.tasks.maximum to 8, but examining the launched map
tasks counter in the output,