I have no Java implementation of my job, sorry.
Since it's all in the map side, IdentityMapper/IdentityReducer is
fine, as long as both the splits and the number of reduce tasks are
the same.
The data is a representation for loglines, and not exactly small,
e.g. the
stuff has already be
On Tuesday 03 June 2008 04:53:22 Chris Douglas wrote:
> Is anyone observing this outside of streaming?
>
> We've been able to reproduce this trace with a bad comparator that
> only returns negative values, but haven't found any uncontrived
> patterns in data that produce this, nor any comparators i
Hi
The Hama project has been accepted into the Apache Incubator, and work is under way.
Please subscribe to the mailing list. ([EMAIL PROTECTED])
Regards,
Edward.
> On Mon, Jun 2, 2008 at 9:11 PM, Hadoop <[EMAIL PROTECTED]> wrote:
>
> I downloaded the Matrix Multiplication code from:
> http://code.google.co
Hello Hadoopers:
I am running RandomWrite on an 8-node cluster. The default setting creates
1G/mapper with 10 mappers/host. With replication, that is essentially
30G/host. Each node in the cluster has at most 30G, so my cluster is full
and cannot execute further c
Hello Hadoopers:
I am running RandomWrite on an 8-node cluster. The default setting creates
1G/mapper with 10 mappers/host. With replication, that is essentially
30G/host. Each node in the cluster has at most 30G, so my cluster is full
and cannot execute further c
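The 30G/host figure in the message above can be checked with a short calculation. This is only a sketch: the per-mapper size and mapper count are taken from the message, while the replication factor of 3 is an assumption (it is the usual HDFS default, but the poster does not state it), and the variable names are illustrative.

```python
# Values quoted in the message.
gb_per_mapper = 1        # RandomWrite output per mapper, in GB
mappers_per_host = 10    # mappers scheduled on each host

# Assumption: replication factor of 3, not stated by the poster.
replication = 3

# Data written by the mappers on one host, before replication.
written_per_host_gb = gb_per_mapper * mappers_per_host

# Average stored per host once every block is replicated,
# assuming replicas are spread evenly across the cluster.
stored_per_host_gb = written_per_host_gb * replication

print(written_per_host_gb, stored_per_host_gb)
```

Under these assumptions the stored volume matches the 30G of disk each node has, which is consistent with the cluster filling up.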
> No, it is in a different storage file.
What is in a different storage file?
All data-nodes should have different configuration files, and each
configuration file should set a different storage directory property:
"dfs.data.dir". It is not a file; it is a directory with all data-blocks.
> the data-n
2008/6/3 Konstantin Shvachko <[EMAIL PROTECTED]>:
> Is it possible that your different data-nodes point to the same storage
> directory on
> the hard drive? If so one of the data-nodes will be shut down.
> In general this is impossible because storage directories are locked once
> one of the nod
Oops, missed the part where you already tried that.
On Mon, Jun 2, 2008 at 3:23 PM, Michael Di Domenico <[EMAIL PROTECTED]>
wrote:
> Depending on your Windows version, there is a DOS command called "subst"
> which you could use to virtualize a drive letter on your third machine
>
>
> On Fri, May
Depending on your Windows version, there is a DOS command called "subst"
which you could use to virtualize a drive letter on your third machine
On Fri, May 30, 2008 at 4:35 AM, Sridhar Raman <[EMAIL PROTECTED]>
wrote:
> Should the installation paths be the same in all the nodes? Most
> documenta
Hi, do you have a testcase that we can run to reproduce this? Thanks!
> -Original Message-
> From: jkupferman [mailto:[EMAIL PROTECTED]
> Sent: Monday, June 02, 2008 9:22 AM
> To: core-user@hadoop.apache.org
> Subject: Stack Overflow When Running Job
>
>
> Hi everyone,
> I have a job ru
Obviously this isn't the best solution if you need to let many
semi-trusted users browse your cluster.
Actually, it would be much more secure if the tunnel service ran on a
trusted server letting your users connect remotely via SOCKS and then
browse the cluster. These users wouldn't need
Is it possible that your different data-nodes point to the same storage
directory on
the hard drive? If so one of the data-nodes will be shut down.
In general this is impossible because storage directories are locked once one
of the nodes
claims them under its authority. But I don't know whether
If you use the new scripts in 0.17.0, just run
> hadoop-ec2 proxy
This starts an ssh tunnel to your cluster.
Installing FoxyProxy in Firefox gives you whole-cluster visibility.
Obviously this isn't the best solution if you need to let many
semi-trusted users browse your cluster.
On May 28, 20
Alejandro Abdelnur wrote:
Yes you would have to do it with classloaders (not 'hello world' but not
'rocket science' either).
That's where we differ.
I do actually think that classloaders are incredibly hard to get right,
and I say that as someone who has single stepped through the Axis2 code
In MetricsIntValue, incrMetrics() was being called in pushMetrics() instead of
setMetrics(). This used to cause the values to be incremented periodically.
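The effect described above can be illustrated with a toy model. This is a minimal sketch and not Hadoop's metrics code: ToyRecord, ToyIntValue and simulate are hypothetical stand-ins, written in Python for brevity, and the model ignores any change tracking the real classes perform. It only shows why pushing with an increment instead of a set makes a published value grow on every periodic push.

```python
# Toy model only: hypothetical stand-ins, not Hadoop's metrics API.
class ToyRecord:
    """Holds the value that a metrics context would publish."""

    def __init__(self):
        self.published = 0

    def set_metric(self, value):
        # Overwrite: repeated pushes of the same value are harmless.
        self.published = value

    def incr_metric(self, value):
        # Accumulate: repeated pushes of the same value keep adding up.
        self.published += value


class ToyIntValue:
    """An integer metric that pushes its current value into a record."""

    def __init__(self, use_incr):
        self.value = 0
        self.use_incr = use_incr

    def set(self, value):
        self.value = value

    def push(self, record):
        if self.use_incr:
            record.incr_metric(self.value)  # behaviour described as the bug
        else:
            record.set_metric(self.value)   # behaviour described as the fix


def simulate(use_incr, pushes=3, value=5):
    """Set the metric once, push it several times, return what is published."""
    record = ToyRecord()
    metric = ToyIntValue(use_incr)
    metric.set(value)
    for _ in range(pushes):
        metric.push(record)
    return record.published


print(simulate(use_incr=True), simulate(use_incr=False))
```

In this model, a metric set once to 5 and pushed three times is published as 15 when the push increments and as 5 when the push sets.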
Thanks,
Lohit
- Original Message
From: Ion Badita <[EMAIL PROTECTED]>
To: core-user@hadoop.apache.org
Sent: Saturday, May 31, 2008 4
You should file a Jira, make the change and submit a patch!
On Sun, Jun 1, 2008 at 11:19 PM, NOMURA Yoshihide <[EMAIL PROTECTED]>
wrote:
> Hello,
> I'm using Hadoop 0.17.0 to analyze a large number of CSV files.
>
> And I need to read such files in a character encoding other than UTF-8,
> bu
Yes you would have to do it with classloaders (not 'hello world' but not
'rocket science' either).
You'll be limited in using native libraries, even if you use classloaders
properly, as native libs can be loaded only once.
You will have to ensure you get rid of the task classloader once the task i
Hi Steve,
On Mon, Jun 2, 2008 at 12:23 PM, Steve Loughran <[EMAIL PROTECTED]> wrote:
> Christophe Taton wrote:
>
>> Actually Hadoop could be made more friendly to such realtime Map/Reduce
>> jobs.
>> For instance, we could consider running all tasks inside the task tracker
>> jvm as separate thre
I downloaded the Matrix Multiplication code from:
http://code.google.com/p/hama/source/browse/trunk/src/java/org/apache/hama/
but I do not know how to run it correctly.
Could you please give the steps to run the code?
--
View this message in context:
http://www.nabble.com/Matrix-Mult
Hi
I am simulating a 4-DataNode environment using VMware.
I found that some data nodes often stop by themselves after receiving a large
file (or block).
In fact it is not so large; it is just under 10MB.
These are the error messages:
2008-05-27 16:40:54,727 INFO org.apache.hadoop.dfs.DataNode: Received
Christophe Taton wrote:
Actually Hadoop could be made more friendly to such realtime Map/Reduce
jobs.
For instance, we could consider running all tasks inside the task tracker
jvm as separate threads, which could be implemented as another personality
of the TaskRunner.
I have been looking into th
Hi,
The javadoc for MetricsIntValue and MetricsLongValue says: "Each
time its value is set, it is published only *once* at the next update
call". Looking at those classes, that is right: they "push" the data into
the MetricsRecord only once, but digging deeper into the
AbstractMetricsContext
Hi Tom.
Ah... From reading (your?) article:
http://developer.amazonwebservices.com/connect/entry.jspa?externalID=873&categoryID=112
I got confused; it seems to suggest that distcp is used to move
ordinary S3 objects onto HDFS.
Thanks for the clarification.
Cheers,
Einar
On Sat, May 31, 200
I'm starting to use Hadoop as a simple "storage pool" to store backups of large
things (currently Oracle database backups). My Hadoop usage is at a pretty
primitive level so far and I am really only scratching the surface of what it
can do. I haven't used map/reduce at all--so far it's just be
24 matches