Copying a file to specified nodes

2009-02-10 Thread Rasit OZDAS
Hi, We have thousands of files, each dedicated to a user. (Each user has access to other users' files, but uses it only rarely.) Each user runs map-reduce jobs on the cluster. So we should separate his/her files equally across the cluster, so that every machine can take part in the

Re: Re: Re: Re: Re: Regarding Hadoop multi cluster set-up

2009-02-10 Thread nitesh bhatia
in hadoop-site.xml change <value>master:54311</value> to <value>hdfs://master:54311</value> --nitesh On Tue, Feb 10, 2009 at 9:50 PM, shefali pawar shefal...@rediffmail.com wrote: I tried that, but it is not working either! Shefali On Sun, 08 Feb 2009 05:27:54 +0530 wrote I ran into this
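The fix above amounts to adding the hdfs:// scheme to fs.default.name. A minimal sketch of the relevant hadoop-site.xml fragment, using the master host and port from the thread:

```xml
<!-- hadoop-site.xml: fs.default.name needs the hdfs:// scheme -->
<!-- host "master" and port 54311 are taken from the thread above -->
<property>
  <name>fs.default.name</name>
  <value>hdfs://master:54311</value>
</property>
```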

Weird Results with Streaming

2009-02-10 Thread S D
[I'm starting a new thread because there is sufficiently new/weird info but I've attached an initial thread in case that might be useful.] I'm running Hadoop 0.19.0. I have an input file consisting of several lines that each contain a name and URL. The idea is pretty simple: download the contents

Loading native libraries

2009-02-10 Thread Mimi Sun
Hi, I'm new to Hadoop and I'm wondering what the recommended method is for using native libraries in mapred jobs. I've tried the following separately: 1. set LD_LIBRARY_PATH in .bashrc 2. set LD_LIBRARY_PATH and JAVA_LIBRARY_PATH in hadoop-env.sh 3. set -Djava.library.path=... for

Re: Loading native libraries

2009-02-10 Thread Arun C Murthy
On Feb 10, 2009, at 11:06 AM, Mimi Sun wrote: Hi, I'm new to Hadoop and I'm wondering what the recommended method is for using native libraries in mapred jobs. I've tried the following separately: 1. set LD_LIBRARY_PATH in .bashrc 2. set LD_LIBRARY_PATH and JAVA_LIBRARY_PATH in
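One era-appropriate approach, sketched under the assumption that the cluster runs 0.18/0.19: pass the library path to each task's child JVM via mapred.child.java.opts, since setting LD_LIBRARY_PATH in .bashrc or hadoop-env.sh affects the daemons' environment rather than the spawned task JVMs:

```xml
<!-- Sketch: point each task's child JVM at the native libraries. -->
<!-- /path/to/native/libs is a placeholder. -->
<property>
  <name>mapred.child.java.opts</name>
  <value>-Djava.library.path=/path/to/native/libs</value>
</property>
```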

anybody knows an apache-license-compatible impl of Integer.parseInt?

2009-02-10 Thread Zheng Shao
We need to implement a version of Integer.parseInt/atoi from byte[] instead of String to avoid the high cost of creating a String object. I wanted to take the open jdk code but the license is GPL: http://www.docjar.com/html/api/java/lang/Integer.java.html Does anybody know an implementation
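A clean-room sketch of such a parser (written from scratch here, so no GPL code is involved); it handles an optional sign and decimal digits but, for brevity, omits the overflow checks a production version would need:

```java
// Parse a signed decimal integer directly from a byte[] slice,
// avoiding the cost of materializing a String per parse.
public class ByteArrayParser {
    public static int parseInt(byte[] buf, int off, int len) {
        if (len == 0) throw new NumberFormatException("empty input");
        int i = off, end = off + len;
        boolean negative = buf[i] == '-';
        if (negative || buf[i] == '+') i++;
        if (i == end) throw new NumberFormatException("no digits");
        int result = 0;
        for (; i < end; i++) {
            int d = buf[i] - '0';
            if (d < 0 || d > 9) throw new NumberFormatException("bad digit");
            result = result * 10 + d;  // note: no overflow check in this sketch
        }
        return negative ? -result : result;
    }
}
```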

Best practices on splitting an input line?

2009-02-10 Thread Andy Sautins
I have a question. I've dabbled with different ways of tokenizing an input file line for processing. I've noticed in my somewhat limited tests that there seem to be some pretty reasonable performance differences between different tokenizing methods. For example, roughly it seems to split a
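Much of the difference usually comes down to String.split going through the regex machinery on every call. A minimal regex-free splitter on a single-character delimiter, as a sketch of the kind of alternative being compared:

```java
import java.util.ArrayList;
import java.util.List;

// Sketch: split on a single character with indexOf/substring,
// skipping the regex interpretation that String.split performs.
public class FastSplit {
    public static List<String> split(String line, char delim) {
        List<String> parts = new ArrayList<String>();
        int start = 0, idx;
        while ((idx = line.indexOf(delim, start)) >= 0) {
            parts.add(line.substring(start, idx));
            start = idx + 1;
        }
        parts.add(line.substring(start));  // trailing field
        return parts;
    }
}
```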

Re: Loading native libraries

2009-02-10 Thread Mimi Sun
I see UnsatisfiedLinkError. Also I'm calling System.getProperty("java.library.path") in the reducer and logging it. The only thing that prints out is ...hadoop-0.18.2/bin/../lib/native/Mac_OS_X-i386-32 I'm using Cascading, not sure if that affects anything. - Mimi On Feb 10, 2009, at

Re: Copying a file to specified nodes

2009-02-10 Thread Jeff Hammerbacher
Hey Rasit, I'm not sure I fully understand your description of the problem, but you might want to check out the JIRA ticket for making the replica placement algorithms in HDFS pluggable (https://issues.apache.org/jira/browse/HADOOP-3799) and add your use case there. Regards, Jeff On Tue, Feb

Re: what's going on :( ?

2009-02-10 Thread Jeff Hammerbacher
Hey Mark, In NameNode.java, the DEFAULT_PORT specified for NameNode RPC is 8020. From my understanding of the code, your fs.default.name setting should have overridden this port to be 9000. It appears your Hadoop installation has not picked up the configuration settings appropriately. You might

File Transfer Rates

2009-02-10 Thread Wasim Bari
Hi, Could someone help me find some real figures (transfer rates) for Hadoop file transfer from the local filesystem to HDFS, S3, etc., and among storage systems (HDFS to S3, etc.)? Thanks, Wasim

Re: File Transfer Rates

2009-02-10 Thread Brian Bockelman
On Feb 10, 2009, at 4:10 PM, Wasim Bari wrote: Hi, Could someone help me to find some real Figures (transfer rate) about Hadoop File transfer from local filesystem to HDFS, S3 etc and among Storage Systems (HDFS to S3 etc) Thanks, Wasim What are you looking for? Maximum possible

Re: File Transfer Rates

2009-02-10 Thread Mark Kerzner
Brian, I have a similar question: why does transfer from a local filesystem to a SequenceFile take so long (about 1 second per megabyte)? Thank you, Mark On Tue, Feb 10, 2009 at 4:46 PM, Brian Bockelman bbock...@cse.unl.edu wrote: On Feb 10, 2009, at 4:10 PM, Wasim Bari wrote: Hi, Could someone

Re: File Transfer Rates

2009-02-10 Thread Brian Bockelman
On Feb 10, 2009, at 4:53 PM, Mark Kerzner wrote: Brian, I have a similar question: why does transfer from a local filesystem to a SequenceFile take so long (about 1 second per megabyte)? Hey Mark, I saw your question about speed the other day ... unfortunately, I didn't have any specific

Re: File Transfer Rates

2009-02-10 Thread Amit Chandel
With my setup I have been able to get 10 MBps write speed and 40 MBps read speed while writing multiple files (ranging from a few bytes to 100 MB) into SequenceFiles, and reading them back. The cluster has a 1 Gbps backbone. On Tue, Feb 10, 2009 at 5:53 PM, Mark Kerzner markkerz...@gmail.com wrote: Brian,
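For anyone wanting to reproduce such numbers, a rough benchmark sketch against the 0.18/0.19-era SequenceFile API (the path, record size, and record count are arbitrary placeholders; this measures raw append throughput only):

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.BytesWritable;
import org.apache.hadoop.io.SequenceFile;
import org.apache.hadoop.io.Text;

// Sketch: time bulk appends of 1 MB records to a SequenceFile.
public class SeqFileWriteBench {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        FileSystem fs = FileSystem.get(conf);
        Path out = new Path("/tmp/bench.seq");  // placeholder path
        SequenceFile.Writer writer = SequenceFile.createWriter(
            fs, conf, out, Text.class, BytesWritable.class);
        byte[] payload = new byte[1 << 20];  // 1 MB per record
        BytesWritable value = new BytesWritable(payload);
        long start = System.currentTimeMillis();
        for (int i = 0; i < 100; i++) {
            writer.append(new Text("key-" + i), value);
        }
        writer.close();
        System.out.println("wrote ~100 MB in "
            + (System.currentTimeMillis() - start) + " ms");
    }
}
```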

Re: Reporter for Hadoop Streaming?

2009-02-10 Thread scruffy323
Do you know how to access those counters programmatically after the job has run? S D-5 wrote: This does it. Thanks! On Thu, Feb 5, 2009 at 9:14 PM, Arun C Murthy a...@yahoo-inc.com wrote: On Feb 5, 2009, at 1:40 PM, S D wrote: Is there a way to use the Reporter interface (or
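For the old JobClient API (0.18/0.19), one way is to hold on to the RunningJob handle, sketched here with hypothetical group and counter names:

```java
// Sketch: JobClient.runJob blocks until completion and returns a
// RunningJob whose counters can then be read. "MyApp" and
// "RECORDS_PROCESSED" are placeholder names.
RunningJob running = JobClient.runJob(conf);
Counters counters = running.getCounters();
long processed = counters.getGroup("MyApp").getCounter("RECORDS_PROCESSED");
System.out.println("records processed: " + processed);
```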

Is there a way to tell whether you're in a map task or a reduce task?

2009-02-10 Thread Matei Zaharia
I'd like to write a combiner that shares a lot of code with a reducer, except that the reducer updates an external database at the end. As far as I can tell, since both combiners and reducers must implement the Reducer interface, there is no way to have this be the same class. Is there a

Re: anybody knows an apache-license-compatible impl of Integer.parseInt?

2009-02-10 Thread Min Zhou
Hey Zheng, Maybe you can try Ragel, which can compile very efficient code for your FSM from a regex. The atoi function produced by Ragel can run faster than glibc's. It also targets Java. http://www.complang.org/ragel/ On Wed, Feb 11, 2009 at 4:18 AM, Zheng Shao zs...@facebook.com wrote: We

Testing with Distributed Cache

2009-02-10 Thread Nathan Marz
I have some unit tests which run MapReduce jobs and test the inputs/outputs in standalone mode. I recently started using DistributedCache in one of these jobs, but now my tests fail with errors such as: Caused by: java.io.IOException: Incomplete HDFS URI, no host: hdfs:///tmp/file.data at

Re: Is there a way to tell whether you're in a map task or a reduce task?

2009-02-10 Thread Owen O'Malley
On Feb 10, 2009, at 5:20 PM, Matei Zaharia wrote: I'd like to write a combiner that shares a lot of code with a reducer, except that the reducer updates an external database at the end. The right way to do this is to either do the update in the output format or do something like: class
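Besides the output-format route above, an old-API task can also inspect its own task attempt id, which Hadoop exposes to the task in the job configuration (the property name mapred.task.id is an assumption of this sketch). The fourth underscore-separated field of the id distinguishes map from reduce:

```java
// Sketch: infer task type from a task attempt id string such as
// attempt_200902101000_0001_m_000003_0, where the "m"/"r" field
// marks a map or reduce task.
public class TaskTypeCheck {
    public static boolean isMapTask(String taskAttemptId) {
        String[] parts = taskAttemptId.split("_");
        // parts: ["attempt", jobTimestamp, jobId, "m"|"r", taskId, attempt]
        return parts.length >= 4 && "m".equals(parts[3]);
    }
}
```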

stable version

2009-02-10 Thread Vadim Zaliva
Hi! Kind of a novice question, but I need to know which Hadoop version is considered stable. I was trying to run version 0.19, and I've seen numerous stability issues with it. Maybe version 0.18 is better suited for a production environment? Vadim

Re: Testing with Distributed Cache

2009-02-10 Thread Amareshwari Sriramadasu
Nathan Marz wrote: I have some unit tests which run MapReduce jobs and test the inputs/outputs in standalone mode. I recently started using DistributedCache in one of these jobs, but now my tests fail with errors such as: Caused by: java.io.IOException: Incomplete HDFS URI, no host:

could this be an error in hadoop documentation or a bug

2009-02-10 Thread Mark Kerzner
Hi, the Quick Start (http://hadoop.apache.org/core/docs/current/quickstart.html) has this sample configuration: <name>fs.default.name</name> <value>hdfs://localhost:9000</value> but it does not seem to work: even though the daemons do listen on 9000, the following command always uses 8020: hadoop fs

Re: File Transfer Rates

2009-02-10 Thread Mark Kerzner
Brian, large files using command-line hadoop go fast, so it is something about my computer or network. I won't worry about this now, especially in light of Amit reporting fast writes and reads. Mark On Tue, Feb 10, 2009 at 5:00 PM, Brian Bockelman bbock...@cse.unl.edu wrote: On Feb 10, 2009,

Re: File Transfer Rates

2009-02-10 Thread Brian Bockelman
On Feb 10, 2009, at 11:09 PM, Mark Kerzner wrote: Brian, large files using command-line hadoop go fast, so it is something about my computer or network. I won't worry about this now, especially in light of Amit reporting fast writes and reads. You're creating files using SequenceFile,

Re: File Transfer Rates

2009-02-10 Thread Mark Kerzner
Brian, I saw that Stuart here (http://stuartsierra.com/2008/04/24/a-million-little-files) mentions slow writes to SequenceFile. If so, I will either use his tar approach or try to parallelize it if I can. On Tue, Feb 10, 2009 at 11:14 PM, Brian Bockelman bbock...@cse.unl.edu wrote: On Feb 10, 2009,

Re: File Transfer Rates

2009-02-10 Thread Brian Bockelman
Just to toss out some numbers (and because our users are making interesting numbers right now) Here's our external network router: http://mrtg.unl.edu/~cricket/?target=%2Frouter-interfaces%2Fborder2%2Ftengigabitethernet2_2;view=Octets Here's the application-level transfer graph:

Re: File Transfer Rates

2009-02-10 Thread Mark Kerzner
I say, that's very interesting and useful. On Tue, Feb 10, 2009 at 11:37 PM, Brian Bockelman bbock...@cse.unl.edu wrote: Just to toss out some numbers (and because our users are making interesting numbers right now) Here's our external network router:

Re: java.io.IOException: Could not get block locations. Aborting...

2009-02-10 Thread Wu Wei
We got the same problem as you when using MultipleOutputFormat on both hadoop 0.18 and 0.19. On hadoop 0.18, increasing the xceivers count does not fix the problem. But we found many error messages complaining that xceiverCount exceeded the limit of concurrent xcievers in the datanode (running on