date:20080602

Re: Stackoverflow

2008-06-02 Thread Chris Douglas

I have no Java implementation of my job, sorry. Since it's all in the map side, IdentityMapper/IdentityReducer is fine, as long as both the splits and the number of reduce tasks are the same. The data is a representation for loglines, and not exactly small, e.g. the stuff has already be

Re: Stackoverflow

2008-06-02 Thread Andreas Kostyrka

On Tuesday 03 June 2008 04:53:22 Chris Douglas wrote: > Is anyone observing this outside of streaming? > > We've been able to reproduce this trace with a bad comparator that > only returns negative values, but haven't found any uncontrived > patterns in data that produce this, nor any comparators i
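As a side note on the "bad comparator that only returns negative values" mentioned in this message: sort routines generally assume the comparator is antisymmetric, and a comparator that is always negative claims both a < b and b < a for every pair. The following is a minimal Python sketch of that contract violation only; it does not reproduce the Hadoop trace, and `always_negative`, `well_behaved`, and `violates_antisymmetry` are hypothetical names introduced for illustration.

```python
def always_negative(a, b):
    """A deliberately broken comparator: claims a < b for every pair."""
    return -1


def well_behaved(a, b):
    """A conventional three-way comparator, shown for contrast."""
    return (a > b) - (a < b)


def sign(x):
    """Reduce any integer to -1, 0, or 1."""
    return (x > 0) - (x < 0)


def violates_antisymmetry(cmp, items):
    """Return True if some pair breaks sign(cmp(a, b)) == -sign(cmp(b, a))."""
    return any(
        sign(cmp(a, b)) != -sign(cmp(b, a))
        for a in items
        for b in items
    )
```

A check like `violates_antisymmetry(my_cmp, sample_keys)` on a small sample of keys is one cheap way to rule out this class of comparator bug before looking for patterns in the data.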

Re: Matrix Multiplication Problem

2008-06-02 Thread Edward J. Yoon

Hi, the Hama project has been accepted into the Apache Incubator, and work is under way. Please subscribe to the mailing list. ([EMAIL PROTECTED]) Regards, Edward. > On Mon, Jun 2, 2008 at 9:11 PM, Hadoop <[EMAIL PROTECTED]> wrote: > > I downloaded the Matrix Multiplication code from: > http://code.google.co

creating less than 10G data with RandomWriter

2008-06-02 Thread Richard Zhang

Hello Hadoopers: I am running RandomWriter on an 8-node cluster. The default setting creates 1G/mapper with 10 mappers/host; considering replication, it is essentially creating 30G/host. Each node in the cluster has at most 30G, so my cluster is full and cannot execute further c
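The 30G/host figure in this message can be checked with a short calculation. This is a rough sketch that assumes a replication factor of 3 (the HDFS default, and the value the message's numbers imply) and evenly spread replicas; the function names are hypothetical.

```python
# Figures quoted in the message, in gigabytes.
GB_PER_MAPPER = 1       # "1G/mapper"
MAPPERS_PER_HOST = 10   # "10mappers/host"
REPLICATION = 3         # assumption: HDFS default replication factor


def stored_per_host_gb(per_mapper_gb, mappers, replicas):
    """Average disk consumed per host once every block is replicated."""
    return per_mapper_gb * mappers * replicas


def per_mapper_budget_gb(target_per_host_gb, mappers, replicas):
    """Largest per-mapper output that keeps average usage under the target."""
    return target_per_host_gb / (mappers * replicas)
```

With the defaults above, `stored_per_host_gb(1, 10, 3)` gives the 30G/host from the message. Keeping stored data under 10G/host at the same mapper count and replication would mean roughly a third of a gigabyte per mapper, or fewer mappers per host.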

create less than 10G data/host with RandomWrite

2008-06-02 Thread Richard Zhang

Hello Hadoopers: I am running RandomWriter on an 8-node cluster. The default setting creates 1G/mapper with 10 mappers/host; considering replication, it is essentially creating 30G/host. Each node in the cluster has at most 30G, so my cluster is full and cannot execute further c

Re: DataNode often self-stopped

2008-06-02 Thread Konstantin Shvachko

> No , it is in different storage file. What is in different storage file? All data-nodes should have different configuration files, and each configuration file should set a different storage directory property: "dfs.data.dir" It is not a file, it is directory with all data-blocks. > the data-n

Re: DataNode often self-stopped

2008-06-02 Thread smallufo

2008/6/3 Konstantin Shvachko <[EMAIL PROTECTED]>: > Is it possible that your different data-nodes point to the same storage > directory on > the hard drive? If so one of the data-nodes will be shut down. > In general this is impossible because storage directories are locked once > one of the nod

Re: Hadoop installation folders in multiple nodes

2008-06-02 Thread Michael Di Domenico

Oops, missed the part where you already tried that. On Mon, Jun 2, 2008 at 3:23 PM, Michael Di Domenico <[EMAIL PROTECTED]> wrote: > Depending on your windows version, there is a dos command called "subst" > which you could use to virtualize a drive letter on your third machine > > > On Fri, May

Re: Hadoop installation folders in multiple nodes

2008-06-02 Thread Michael Di Domenico

Depending on your windows version, there is a dos command called "subst" which you could use to virtualize a drive letter on your third machine On Fri, May 30, 2008 at 4:35 AM, Sridhar Raman <[EMAIL PROTECTED]> wrote: > Should the installation paths be the same in all the nodes? Most > documenta

RE: Stack Overflow When Running Job

2008-06-02 Thread Devaraj Das

Hi, do you have a testcase that we can run to reproduce this? Thanks! > -Original Message- > From: jkupferman [mailto:[EMAIL PROTECTED] > Sent: Monday, June 02, 2008 9:22 AM > To: core-user@hadoop.apache.org > Subject: Stack Overflow When Running Job > > > Hi everyone, > I have a job ru

Re: hadoop on EC2

2008-06-02 Thread Chris K Wensel

obviously this isn't the best solution if you need to let many semi trusted users browse your cluster. Actually, it would be much more secure if the tunnel service ran on a trusted server letting your users connect remotely via SOCKS and then browse the cluster. These users wouldn't need

Re: DataNode often self-stopped

2008-06-02 Thread Konstantin Shvachko

Is it possible that your different data-nodes point to the same storage directory on the hard drive? If so one of the data-nodes will be shut down. In general this is impossible because storage directories are locked once one of the nodes claims them under its authority. But I don't know whether

Re: hadoop on EC2

2008-06-02 Thread Chris K Wensel

if you use the new scripts in 0.17.0, just run > hadoop-ec2 proxy this starts a ssh tunnel to your cluster. installing foxy proxy in FF gives you whole cluster visibility.. obviously this isn't the best solution if you need to let many semi trusted users browse your cluster. On May 28, 20

Re: Realtime Map Reduce = Supercomputing for the Masses?

2008-06-02 Thread Steve Loughran

Alejandro Abdelnur wrote: Yes you would have to do it with classloaders (not 'hello world' but not 'rocket science' either). That's where we differ. I do actually think that classloaders are incredibly hard to get right, and I say that as someone who has single stepped through the Axis2 code

Re: About Metrics update

2008-06-02 Thread lohit

In MetricsIntValue, incrMetrics() was being called on pushMetrics(), instead of setMetrics(). This used to cause the values to be incremented periodically. Thanks, Lohit - Original Message From: Ion Badita <[EMAIL PROTECTED]> To: core-user@hadoop.apache.org Sent: Saturday, May 31, 2008 4
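The effect described in this message, incrementing on every push instead of setting, can be illustrated with a toy model. This is a hedged Python sketch, not Hadoop's metrics code: `ToyRecord`, `push_with_incr`, and `push_with_set` are hypothetical stand-ins.

```python
class ToyRecord:
    """Hypothetical stand-in for a metrics record; not Hadoop's MetricsRecord."""

    def __init__(self):
        self.values = {}

    def set_metric(self, name, value):
        # Overwrite: repeated pushes of the same value leave it unchanged.
        self.values[name] = value

    def incr_metric(self, name, delta):
        # Accumulate: repeated pushes of the same value keep adding to it.
        self.values[name] = self.values.get(name, 0) + delta


def push_with_incr(record, name, current, pushes):
    """Simulate periodic pushes that increment by the current value."""
    for _ in range(pushes):
        record.incr_metric(name, current)
    return record.values[name]


def push_with_set(record, name, current, pushes):
    """Simulate periodic pushes that set the current value."""
    for _ in range(pushes):
        record.set_metric(name, current)
    return record.values[name]
```

Pushing an unchanged value of 5 three times yields 15 with the increment variant and 5 with the set variant, which matches the "incremented periodically" symptom reported here.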

Re: Text file character encoding

2008-06-02 Thread Ted Dunning

You should file a Jira, make the change and submit a patch! On Sun, Jun 1, 2008 at 11:19 PM, NOMURA Yoshihide <[EMAIL PROTECTED]> wrote: > Hello, > I'm using Hadoop 0.17.0 to analyze some large amount of CSV files. > > And I need to read such files in different character encoding from UTF-8, > bu

Re: Realtime Map Reduce = Supercomputing for the Masses?

2008-06-02 Thread Alejandro Abdelnur

Yes you would have to do it with classloaders (not 'hello world' but not 'rocket science' either). You'll be limited on using native libraries, even if you use classloaders properly as native libs can be loaded only once. You will have to ensure you get rid of the task classloader once the task i

Re: Realtime Map Reduce = Supercomputing for the Masses?

2008-06-02 Thread Christophe Taton

Hi Steve, On Mon, Jun 2, 2008 at 12:23 PM, Steve Loughran <[EMAIL PROTECTED]> wrote: > Christophe Taton wrote: > >> Actually Hadoop could be made more friendly to such realtime Map/Reduce >> jobs. >> For instance, we could consider running all tasks inside the task tracker >> jvm as separate thre

Matrix Multiplication Problem

2008-06-02 Thread Hadoop

I downloaded the Matrix Multiplication code from: http://code.google.com/p/hama/source/browse/trunk/src/java/org/apache/hama/ but I do not know how to run it correctly. Could you please give the steps to run the code? -- View this message in context: http://www.nabble.com/Matrix-Mult

DataNode often self-stopped

2008-06-02 Thread smallufo

Hi, I am simulating a 4-DataNode environment using VMware. I found that some data nodes often stop by themselves after receiving a large file (or block). In fact it is not so large, just under 10MB. These are the error messages: 2008-05-27 16:40:54,727 INFO org.apache.hadoop.dfs.DataNode: Received

Re: Realtime Map Reduce = Supercomputing for the Masses?

2008-06-02 Thread Steve Loughran

Christophe Taton wrote: Actually Hadoop could be made more friendly to such realtime Map/Reduce jobs. For instance, we could consider running all tasks inside the task tracker jvm as separate threads, which could be implemented as another personality of the TaskRunner. I have been looking into th

MetricsIntValue/MetricsLongValue publish once

2008-06-02 Thread Ion Badita

Hi, the javadoc for MetricsIntValue and MetricsLongValue says: "Each time its value is set, it is published only *once* at the next update call". Looking at those classes, that is right: they "push" the data into the MetricsRecord only once, but digging deeper into the AbstractMetricsContext

Re: distcp/ls fails on Hadoop-0.17.0 on ec2.

2008-06-02 Thread Einar Vollset

Hi Tom. Ah... From reading (your?) article: http://developer.amazonwebservices.com/connect/entry.jspa?externalID=873&categoryID=112 I got confused; it seems to suggest that distcp is used to move ordinary S3 objects onto HDFS.. Thanks for the clarification. Cheers, Einar On Sat, May 31, 200

Using hadoop to store large backups

2008-06-02 Thread Greg Connor

I'm starting to use Hadoop as a simple "storage pool" to store backups of large things (currently Oracle database backups). My Hadoop usage is at a pretty primitive level so far and I am really only scratching the surface of what it can do. I haven't used map/reduce at all--so far it's just be

24 matches
