libhdfs and libjvm.so on distributed cache

2011-09-26 Thread Vivek K
Hi all, I have written a Hadoop Pipes program that uses libhdfs to read files from HDFS. The program runs fine in the pseudo-distributed setup on the Cloudera virtual machine, but when I tried to test it on a cluster, it failed. It turns out the cluster machines didn't have libhdfs installed. For t
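One common workaround for this situation is to ship the native libraries with the job via the distributed cache rather than installing them on every node. A hedged sketch, assuming 0.20-era Pipes with generic options; the library paths and the program name are placeholders for your setup:

```shell
# Hedged sketch: ship libhdfs/libjvm to the worker nodes through the
# distributed cache (-files) and point the task processes' loader at
# the local working directory where those files are materialized.
# All paths and my_pipes_prog are assumptions, not from the thread.
hadoop pipes \
  -D mapred.child.env="LD_LIBRARY_PATH=." \
  -files /usr/lib/libhdfs.so.0,/usr/java/jdk/jre/lib/amd64/server/libjvm.so \
  -program hdfs:///bin/my_pipes_prog \
  -input /in -output /out
```

This only sketches the mechanism; whether the loader picks up the cached copies also depends on how the Pipes binary was linked.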

Re: operation of DistributedCache following manual deletion of cached files?

2011-09-26 Thread Meng Mao
Let's frame the issue another way. I'll describe a sequence of Hadoop operations that I think should work, and then I'll get into what we did and how it failed. Normal sequence: 1. have files to be cached in HDFS; 2. run Job A, which specifies those files to be put into DistributedCache space; 3.
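For reference, steps 1-2 of that "normal sequence" map onto the old DistributedCache API roughly like this; a minimal sketch assuming the 0.20.x `org.apache.hadoop.filecache.DistributedCache` class, with placeholder paths:

```java
// Hedged sketch of registering an HDFS file in DistributedCache
// space before submitting Job A. Paths are made up for illustration.
import java.net.URI;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.filecache.DistributedCache;

public class CacheSetup {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // step 1: /cache/lookup.dat already sits in HDFS
        // step 2: register it for the job; with a symlink, each task
        // sees it as ./lookup.dat in its local working directory
        DistributedCache.addCacheFile(new URI("/cache/lookup.dat#lookup.dat"), conf);
        DistributedCache.createSymlink(conf);
        // ... build and submit the job with this conf ...
    }
}
```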

Re: NoSQL to NoSQL

2011-09-26 Thread Jignesh Patel
Marcos, rather than focusing on using Hadoop in the finance domain, I am more interested in knowing about the Mumps (Caché or GT.M) to Hadoop transformation. It would be better to ignore the finance domain and just focus on the technical aspect of how to do it. -Jignesh On Sep 26, 2011, at 2:58 PM, Marcos Luis O

MBean for mapred context

2011-09-26 Thread patrick sang
hi listers, I am a Hadoop newbie, currently working on Hadoop/MR monitoring. JMX looks good to me, but it seems that the MBeans don't expose the "mapred" context. I am running Cloudera CDH3u0. I wonder what Hadoop/MR version (if any) has an MBean that exposes the mapred context. Or, what work
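Independent of which version publishes a "mapred" context, the daemons have to be reachable over JMX before jconsole (or any JMX client) can browse what they do expose. A hedged hadoop-env.sh fragment; the port numbers are arbitrary examples, and this does not by itself create a mapred MBean if the running version doesn't register one:

```
# Assumed hadoop-env.sh additions to open remote JMX on the
# JobTracker and TaskTracker (insecure settings, for a test cluster).
export HADOOP_JOBTRACKER_OPTS="-Dcom.sun.management.jmxremote \
  -Dcom.sun.management.jmxremote.port=8006 \
  -Dcom.sun.management.jmxremote.authenticate=false \
  -Dcom.sun.management.jmxremote.ssl=false ${HADOOP_JOBTRACKER_OPTS}"
export HADOOP_TASKTRACKER_OPTS="-Dcom.sun.management.jmxremote \
  -Dcom.sun.management.jmxremote.port=8007 \
  -Dcom.sun.management.jmxremote.authenticate=false \
  -Dcom.sun.management.jmxremote.ssl=false ${HADOOP_TASKTRACKER_OPTS}"
```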

Re: NoSQL to NoSQL

2011-09-26 Thread Marcos Luis Ortiz Valmaseda
Regards, Jignesh. You can start your research here: 1235 Joe Cunningham (Visa), "Large scale transaction analysis"; Stu Hood (Rackspace), "Cross Data Center Log Processing"; Peter Krey and Sin Lee (JP Morgan Chase), "Data Processing for Financial Services"; http://atbrox.com/tag/finance/ Next, at Qu

RE: I need help talking to HDFS over a firewall

2011-09-26 Thread Dhodapkar, Chinmay
Have you tried using the host IP address instead of the hostname? This seems a little odd... If you are going to face firewall issues in the future, you may want to consider using Hoop to access HDFS through a REST API (http://www.cloudera.com/blog/2011/07/hoop-hadoop-hdfs-over-http/). -chinmay ---
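The appeal of Hoop for the firewall case is that only one HTTP port has to be opened, instead of the NameNode and every DataNode port. A hedged example of reading a file with curl, modeled on the URL style in the linked Cloudera post; the host, port (14000) and user name are assumptions about your deployment:

```shell
# Assumed Hoop endpoint: read an HDFS file over plain HTTP.
# hoop-server, 14000, and alice are placeholders.
curl -L "http://hoop-server:14000/user/alice/data.txt?op=open&user.name=alice"
```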

RE: Getting the cpu, memory usage of map/reduce tasks

2011-09-26 Thread Ralf Heyde
Hi Bikash, every map/reduce task runs - as far as I know - in its own JVM instance, which you can configure and/or run with JVM options. Maybe you can track these JVMs using some system tools. Regards, Ralf -Original Message- From: bikash sharma [mailto:sharmabiks...@gmail.com] Sent: Freita
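The "system tools" approach can be as simple as finding the child task JVMs on a worker node and sampling them with ps. A minimal sketch, assuming a 0.20-era TaskTracker whose forked task JVMs show up with a main class containing "Child" (check `jps -l` output on your own nodes first):

```shell
# Hedged sketch: list the forked task JVMs on a worker node and
# sample their CPU and memory usage. The "Child" match is an
# assumption about how the task main class appears in jps output.
for pid in $(jps -l | grep -i child | awk '{print $1}'); do
  ps -o pid,pcpu,pmem,rss,cmd -p "$pid"
done
```

Running this in a loop (or under `watch`) gives a crude per-task usage trace without touching the Hadoop code.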

NoSQL to NoSQL

2011-09-26 Thread Jignesh Patel
I am working on a finance application and we are thinking of using Hadoop HBase instead of old GT.M and Cache based NoSQL system. Has anybody done that kind of transformation? -Jignesh

RE: Too many fetch failures. Help!

2011-09-26 Thread Devaraj k
Hi Bharath, there are a few possible causes of this problem. I have listed some of them below, with solutions; this might help you solve it. If you post the logs, the problem can be pinned down. Reason 1: the mapping in the /etc/hosts file is not present. The DNS server is d

Re: Too many fetch failures. Help!

2011-09-26 Thread Uma Maheswara Rao G 72686
Hello Abdelrahman, are you able to ping from one machine to the other using the configured hostname? Configure both hostnames properly in the /etc/hosts file and try again. Regards, Uma - Original Message - From: Abdelrahman Kamel Date: Monday, September 26, 2011 8:47 pm Subject: Too many fetch fa

Re: Too many fetch failures. Help!

2011-09-26 Thread bharath vissapragada
Hey, try configuring your cluster with hostnames instead of IPs, add those entries to /etc/hosts, and sync the file across all the nodes in the cluster. You need to restart the cluster after making these changes. Hope this helps. On Mon, Sep 26, 2011 at 8:46 PM, Abdelrahman Kamel wrote: > Hi, > Thi
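Concretely, the advice in this thread amounts to giving every node an identical /etc/hosts. A hedged sketch for the two-box setup described below (one master that is also a slave, one pure slave); the names and addresses are placeholders:

```
# Assumed /etc/hosts entries, identical on both Ubuntu boxes.
# Addresses and hostnames are examples only.
192.168.1.10   master   # NameNode + JobTracker, also runs a slave
192.168.1.11   slave1   # DataNode + TaskTracker
```

The same hostnames then go in the masters/slaves files and in fs.default.name and mapred.job.tracker, so reduce tasks fetch map output by a name every node can resolve.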

Too many fetch failures. Help!

2011-09-26 Thread Abdelrahman Kamel
Hi, This is my first post here. I'm new to Hadoop. I've already installed Hadoop on 2 Ubuntu boxes (one is both master and slave and the other is only slave). When I run a Wordcount example on 5 small txt files, the process never completes and I get a "Too many fetch failures" error on my terminal.

Re: How to run java code using Mahout from commandline ?

2011-09-26 Thread Linden Hillenbrand
Praveenesh, are you saying you have written a traditional Java MR job that uses a library from Mahout to analyze the data set? In that case, I would compile it into a .jar and run it with "hadoop jar" on the command line; it should work fine. Best, Linden On Fri, Sep 23, 2011 at 8:44 AM, praveenesh kuma
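A hedged sketch of that compile-and-run cycle; the class name, jar names, and paths are all placeholders, and -libjars assumes the Mahout classes are not already bundled into your job jar:

```shell
# Assumed workflow: compile a Java MR job that uses a Mahout library,
# package it, and launch it through the hadoop driver.
mkdir -p classes
javac -classpath "$(hadoop classpath):mahout-core.jar" -d classes MyMahoutJob.java
jar cf myjob.jar -C classes .
hadoop jar myjob.jar MyMahoutJob -libjars mahout-core.jar /in /out
```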

Re: How to run Hadoop in standalone mode in Windows

2011-09-26 Thread Mark Kerzner
So I do need to build Hadoop, right? Thank you, Mark On Mon, Sep 26, 2011 at 1:04 AM, Uma Maheswara Rao G 72686 < mahesw...@huawei.com> wrote: > Java 6, Cygwin ( maven + tortoiseSVN are for building hadoop) should be > enough for running standalone mode in windows. > > Regards, > Uma > - Or

About export HADOOP_NAMENODE_OPTS in hadoop-env.sh

2011-09-26 Thread Ossi
hi, on page http://hadoop.apache.org/common/docs/r0.20.0/cluster_setup.html there are the following instructions: "For example, To configure Namenode to use parallelGC, the following statement should be added in hadoop-env.sh: export HADOOP_NAMENODE_OPTS="-XX:+UseParallelGC ${HADOOP_NAMENODE_OPTS}"
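The point of referencing ${HADOOP_NAMENODE_OPTS} on the right-hand side is that each export appends to whatever was set earlier instead of clobbering it. A minimal shell illustration (the jmxremote flag is just an example of a prior setting):

```shell
# Each export prepends a new flag while keeping the old value,
# so multiple additions to hadoop-env.sh accumulate.
export HADOOP_NAMENODE_OPTS="-Dcom.sun.management.jmxremote ${HADOOP_NAMENODE_OPTS}"
export HADOOP_NAMENODE_OPTS="-XX:+UseParallelGC ${HADOOP_NAMENODE_OPTS}"
echo "$HADOOP_NAMENODE_OPTS"   # both flags are present
```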

Re: Environment consideration for a research on scheduling

2011-09-26 Thread Steve Loughran
On 23/09/11 16:09, GOEKE, MATTHEW (AG/1000) wrote: If you are starting from scratch with no prior Hadoop install experience, I would configure stand-alone, migrate to pseudo-distributed, and then to fully distributed, verifying functionality at each step with a simple word-count run. Also, if