Hi all,
I have written a Hadoop Pipes program that uses libhdfs to read files from
HDFS. The program runs fine in the pseudo-distributed setup on the Cloudera
virtual machine, but when I tried to test it on a cluster, it failed.
It turns out the cluster machines didn't have libhdfs installed. For t
Let's frame the issue in another way. I'll describe a sequence of Hadoop
operations that I think should work, and then I'll get into what we did and
how it failed.
Normal sequence:
1. Have the files to be cached in HDFS
2. Run Job A, which specifies those files to be put into DistributedCache
space (see the sketch after this list)
3.
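For concreteness, a minimal sketch of step 2 above: registering an HDFS file
with the DistributedCache before submitting Job A. The path and class name
are made up; this uses the old org.apache.hadoop.filecache API.

    import java.net.URI;
    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.filecache.DistributedCache;

    public class CacheSetup {
        public static void main(String[] args) throws Exception {
            Configuration conf = new Configuration();
            // The #lookup fragment gives the cached file a symlink name in
            // the task's working directory; the HDFS path is hypothetical.
            DistributedCache.addCacheFile(
                    new URI("/user/me/lookup.dat#lookup"), conf);
            // ... configure the rest of Job A on this conf and submit it ...
        }
    }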
Marcos,
Rather than focusing on using Hadoop in the finance domain, I am more
interested in knowing about the MUMPS (Caché or GT.M) to Hadoop
transformation. It would be better to ignore the finance domain and just
focus on the technical aspects of how to do it.
-Jignesh
On Sep 26, 2011, at 2:58 PM, Marcos Luis O
Hi listers,
I am a Hadoop newbie. Currently I am working on Hadoop/MR monitoring.
JMX looks good to me, but it seems that the MBeans don't expose the "mapred"
context.
Currently, I am running Cloudera CDH3u0.
I wonder which Hadoop/MR version (if any) has an MBean that exposes the
mapred context.
Or, what work
Regards, Jignesh
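In case it helps while you look: below is a small sketch that connects to a
daemon's JMX port and prints every registered MBean name, so you can check
whether anything in the "mapred" context shows up. The host and port are
assumptions; point it at a daemon started with the
com.sun.management.jmxremote options enabled.

    import java.util.Set;
    import javax.management.MBeanServerConnection;
    import javax.management.ObjectName;
    import javax.management.remote.JMXConnector;
    import javax.management.remote.JMXConnectorFactory;
    import javax.management.remote.JMXServiceURL;

    public class ListMBeans {
        public static void main(String[] args) throws Exception {
            // Hypothetical endpoint; use the port your daemon's JMX agent
            // actually listens on.
            JMXServiceURL url = new JMXServiceURL(
                    "service:jmx:rmi:///jndi/rmi://localhost:8004/jmxrmi");
            JMXConnector jmxc = JMXConnectorFactory.connect(url);
            try {
                MBeanServerConnection mbsc = jmxc.getMBeanServerConnection();
                // Print every MBean name; grep the output for "mapred".
                Set<ObjectName> names = mbsc.queryNames(null, null);
                for (ObjectName name : names) {
                    System.out.println(name);
                }
            } finally {
                jmxc.close();
            }
        }
    }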
You can start your research here:
* 1235 Joe Cunningham – Visa – Large scale transaction analysis
* Cross Data Center Log Processing – Stu Hood, Rackspace
* Data Processing for Financial Services – Peter Krey and Sin Lee, JP Morgan
Chase
* http://atbrox.com/tag/finance/
Next, at Qu
Have you tried using the host IP address instead of the hostname? This seems
a little weird...
If you are going to face firewall issues in the future, you may want to
consider using Hoop to access HDFS over a REST API.
(http://www.cloudera.com/blog/2011/07/hoop-hadoop-hdfs-over-http/)
-chinmay
---
Hi Bikash,
Every map/reduce task is - as far as I know - a single JVM instance, which
you can configure and/or run with JVM options.
Maybe you can track these JVMs by using some system tools.
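As a sketch of the configuration side (the values are only illustrative),
the per-task JVM options live in mapred.child.java.opts on the old-API job
conf:

    import org.apache.hadoop.mapred.JobConf;

    public class ChildJvmOpts {
        public static void main(String[] args) {
            JobConf conf = new JobConf(ChildJvmOpts.class);
            // Each map/reduce task runs in a child JVM launched with these
            // options; 512 MB heap and GC logging are just example values.
            conf.set("mapred.child.java.opts", "-Xmx512m -verbose:gc");
            System.out.println(conf.get("mapred.child.java.opts"));
        }
    }

On the system-tools side, running jps on a slave node while a job is active
should list the child task JVMs.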
Regards,
Ralf
-Original Message-
From: bikash sharma [mailto:sharmabiks...@gmail.com]
Sent: Freita
I am working on a finance application, and we are thinking of using Hadoop's
HBase instead of our old GT.M- and Caché-based NoSQL system.
Has anybody done that kind of transformation?
-Jignesh
Hi Bharath,
There are a few possible causes for this problem. I have listed some of them
below, with solutions; this might help you solve it. If you post the logs,
the problem can be pinpointed.
Reason 1:
It could be that the mapping in the /etc/hosts file is not present.
The DNS server is d
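To rule that out, here is a small sketch: run it on each node, passing the
other nodes' hostnames as arguments, and check that the printed IPs match
what you expect from /etc/hosts or DNS.

    import java.net.InetAddress;

    public class ResolveCheck {
        public static void main(String[] args) throws Exception {
            for (String host : args) {
                // Resolves each name via the standard Java/platform resolver.
                InetAddress addr = InetAddress.getByName(host);
                System.out.println(host + " -> " + addr.getHostAddress());
            }
        }
    }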
Hello Abdelrahman,
Are you able to ping from one machine to the other using the configured
hostnames? Configure both hostnames properly in the /etc/hosts file and try
again.
Regards,
Uma
- Original Message -
From: Abdelrahman Kamel
Date: Monday, September 26, 2011 8:47 pm
Subject: Too many fetch fa
Hey,
Try configuring your cluster with hostnames instead of IPs, add those
entries to /etc/hosts, and sync the file across all the nodes in the
cluster. You need to restart the cluster after making these changes.
Hope this helps,
On Mon, Sep 26, 2011 at 8:46 PM, Abdelrahman Kamel wrote:
> Hi,
> Thi
Hi,
This is my first post here.
I'm new to Hadoop.
I've already installed Hadoop on 2 Ubuntu boxes (one is both master and
slave and the other is only a slave).
When I run the WordCount example on 5 small text files, the job never
completes and I get a "Too many fetch failures" error in my terminal.
Praveenesh,
Are you saying you have written a traditional Java MR job using a library
from Mahout to analyze the data set?
In that case, I would compile it into a .jar and run it with hadoop jar on
the command line; it should work fine.
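For reference, a skeleton of such a driver (class name, package, job name,
and paths are hypothetical, and the Mahout jars would need to be bundled in
or on the classpath), launched as:
hadoop jar myjob.jar com.example.MyDriver <input> <output>

    package com.example;

    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapred.FileInputFormat;
    import org.apache.hadoop.mapred.FileOutputFormat;
    import org.apache.hadoop.mapred.JobClient;
    import org.apache.hadoop.mapred.JobConf;

    public class MyDriver {
        public static void main(String[] args) throws Exception {
            // Old-API skeleton; your mapper/reducer classes would be set on
            // the conf before submission.
            JobConf conf = new JobConf(MyDriver.class);
            conf.setJobName("mahout-analysis");
            conf.setOutputKeyClass(Text.class);
            conf.setOutputValueClass(Text.class);
            FileInputFormat.setInputPaths(conf, new Path(args[0]));
            FileOutputFormat.setOutputPath(conf, new Path(args[1]));
            JobClient.runJob(conf);
        }
    }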
Best,
Linden
On Fri, Sep 23, 2011 at 8:44 AM, praveenesh kuma
So I do need to build Hadoop, right?
Thank you,
Mark
On Mon, Sep 26, 2011 at 1:04 AM, Uma Maheswara Rao G 72686 <
mahesw...@huawei.com> wrote:
> Java 6 and Cygwin (Maven + TortoiseSVN are only for building Hadoop) should
> be enough for running standalone mode on Windows.
>
> Regards,
> Uma
> - Or
Hi,
on the page http://hadoop.apache.org/common/docs/r0.20.0/cluster_setup.html
there are the following instructions:
"For example, To configure Namenode to use parallelGC, the following
statement should be added in hadoop-env.sh:
export HADOOP_NAMENODE_OPTS="-XX:+UseParallelGC ${HADOOP_NAMENODE_OPTS}"
On 23/09/11 16:09, GOEKE, MATTHEW (AG/1000) wrote:
If you are starting from scratch with no prior Hadoop install experience, I
would configure stand-alone, migrate to pseudo-distributed, and then to
fully distributed, verifying functionality at each step by doing a simple
word count run. Also, if