Re: cluster involvement trigger
Hi,

The number of mappers initialized depends largely on your input format (specifically, its getSplits method). Almost all input formats shipped with Hadoop derive from FileInputFormat, hence the notion of one mapper per file block (more precisely, one mapper per split). You say that you have too many small files: in general, each of these small files (< 64 MB) will be processed by a single mapper. I would suggest looking at CombineFileInputFormat, which packages many small files together based on data locality for better performance (task initialization time is a significant factor in Hadoop's performance). On the other hand, many small files will also hamper your NameNode, since file metadata is stored in memory, which limits its overall capacity with respect to the number of files.

Amogh

On 2/25/10 11:15 PM, "Michael Kintzer" wrote:

Hi,

We are using the streaming API. We are trying to understand what Hadoop uses as a threshold or trigger to involve more TaskTracker nodes in a given Map-Reduce execution. With default settings (64MB chunk size in HDFS), if the input file is less than 64MB, will the data processing only occur on a single TaskTracker node, even if our cluster size is greater than 1?

For example, we are trying to figure out whether Hadoop is more efficient at processing: (a) a single input file which is just an index file referring to a jar archive of 100K or 1M individual small files, where the jar file is passed as the "-archives" argument, or (b) a single input file containing all the raw data represented by the 100K or 1M small files. With (a), our input file is <64MB. With (b), our input file is very large.

Thanks for any insight,
-Michael
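The packing that CombineFileInputFormat performs can be sketched as a toy greedy packer. This is illustrative Python only, not Hadoop code: the real class also honors node and rack locality, and the 64 MB cap here simply mirrors the default block size discussed above.

```python
def combine_splits(file_sizes, max_split_bytes=64 * 1024 * 1024):
    """Greedily pack small files into splits of at most max_split_bytes,
    roughly what CombineFileInputFormat does (ignoring locality)."""
    splits, current, current_size = [], [], 0
    for name, size in file_sizes:
        # Flush the current split when the next file would overflow it.
        if current and current_size + size > max_split_bytes:
            splits.append(current)
            current, current_size = [], 0
        current.append(name)
        current_size += size
    if current:
        splits.append(current)
    return splits

# 1,000 files of 1 MB collapse from 1,000 map tasks into 16 splits:
files = [("f%04d" % i, 1 << 20) for i in range(1000)]
print(len(combine_splits(files)))  # → 16
```

Going from 1,000 task initializations to 16 is exactly where the startup-time savings come from.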
Re: On CDH2, (Cloudera EC2) No valid local directories in property: mapred.local.dir
Hello,

I fixed this by running >= 2 slaves. I was testing with 1 when this error occurred.

Regards,
Saptarshi

On Tue, Feb 23, 2010 at 2:57 PM, Todd Lipcon wrote:
> Hi Saptarshi,
>
> Can you please ssh into the JobTracker node and check that this
> directory is mounted, writable by the hadoop user, and not full?
>
> -Todd
>
> On Fri, Feb 19, 2010 at 2:13 PM, Saptarshi Guha wrote:
>> Hello,
>> Not sure if I should post this here or on Cloudera's message board,
>> but here goes.
>> When I run EC2 using the latest CDH2 and Hadoop 0.20 (by setting the
>> env variables via hadoop-ec2) and launch a job
>>
>> hadoop jar ...
>>
>> I get the following error:
>>
>> 10/02/19 17:04:55 WARN mapred.JobClient: Use GenericOptionsParser for
>> parsing the arguments. Applications should implement Tool for the same.
>> org.apache.hadoop.ipc.RemoteException: java.io.IOException: No valid
>> local directories in property: mapred.local.dir
>> at org.apache.hadoop.conf.Configuration.getLocalPath(Configuration.java:975)
>> at org.apache.hadoop.mapred.JobConf.getLocalPath(JobConf.java:279)
>> at org.apache.hadoop.mapred.JobInProgress.<init>(JobInProgress.java:256)
>> at org.apache.hadoop.mapred.JobInProgress.<init>(JobInProgress.java:240)
>> at org.apache.hadoop.mapred.JobTracker.submitJob(JobTracker.java:3026)
>> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>> at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>> at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>> at java.lang.reflect.Method.invoke(Method.java:597)
>> at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:508)
>> at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:966)
>> at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:962)
>> at java.security.AccessController.doPrivileged(Native Method)
>> at javax.security.auth.Subject.doAs(Subject.java:396)
>> at org.apache.hadoop.ipc.Server$Handler.run(Server.java:960)
>>
>> at org.apache.hadoop.ipc.Client.call(Client.java:740)
>> at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:220)
>> at org.apache.hadoop.mapred.$Proxy0.submitJob(Unknown Source)
>> at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:841)
>> at org.apache.hadoop.mapreduce.Job.submit(Job.java:432)
>>
>> at org.godhuli.f.RHMR.submitAndMonitorJob(RHMR.java:195)
>>
>> but the value of mapred.local.dir is "/mnt/hadoop/mapred/local"
>>
>> Any ideas?
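The checks Todd describes (mounted, writable by the hadoop user, not full) can be scripted. A minimal sketch in Python, to be run as the hadoop user against each directory listed in mapred.local.dir; the path below is the one from the error report:

```python
import os
import shutil

def check_local_dir(path):
    """Return a short status for one mapred.local.dir entry: it must
    exist, be writable by the current user, and not sit on a full disk."""
    if not os.path.isdir(path):
        return "missing"
    if not os.access(path, os.W_OK):
        return "not writable"
    if shutil.disk_usage(path).free == 0:
        return "full"
    return "ok"

for d in ["/mnt/hadoop/mapred/local"]:  # value from the report above
    print(d, check_local_dir(d))
```

Any result other than "ok" on the JobTracker node would explain the "No valid local directories" RemoteException.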
Hadoop freeze?
I ran into the following problem running a Hadoop job written in Pig. Please help me figure out what caused it. As far as I can tell, the job/task tracker failed for some reason but the name/data nodes are still functioning. The job simply makes no progress at all (no output, no log), but a couple of other Hadoop jobs ran successfully before this one. hadoop fs -ls can still list files, but when I ran "hadoop job -list", it took too long and then failed with the following error:

Exception in thread "main" java.io.IOException: Call to hostname/ip-address:50002 failed on local exception: Connection reset by peer
    at org.apache.hadoop.ipc.Client.call(Client.java:699)
    at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:216)
    at org.apache.hadoop.mapred.$Proxy0.getProtocolVersion(Unknown Source)
    at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:319)
    at org.apache.hadoop.mapred.JobClient.createRPCProxy(JobClient.java:435)
    at org.apache.hadoop.mapred.JobClient.init(JobClient.java:429)
    at org.apache.hadoop.mapred.JobClient.run(JobClient.java:1512)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:79)
    at org.apache.hadoop.mapred.JobClient.main(JobClient.java:1727)
Caused by: java.io.IOException: Connection reset by peer
    at sun.nio.ch.FileDispatcher.read0(Native Method)
    at sun.nio.ch.SocketDispatcher.read(SocketDispatcher.java:21)
    at sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:233)
    at sun.nio.ch.IOUtil.read(IOUtil.java:206)
    at sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:236)
    at org.apache.hadoop.net.SocketInputStream$Reader.performIO(SocketInputStream.java:55)
    at org.apache.hadoop.net.SocketIOWithTimeout.doIO(SocketIOWithTimeout.java:140)
    at org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:150)
    at org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:123)
    at java.io.FilterInputStream.read(FilterInputStream.java:116)
    at org.apache.hadoop.ipc.Client$Connection$PingInputStream.read(Client.java:271)
    at java.io.BufferedInputStream.fill(BufferedInputStream.java:218)
    at java.io.BufferedInputStream.read(BufferedInputStream.java:237)
    at java.io.DataInputStream.readInt(DataInputStream.java:370)
    at org.apache.hadoop.ipc.Client$Connection.receiveResponse(Client.java:493)
    at org.apache.hadoop.ipc.Client$Connection.run(Client.java:438)

The web interface to the job tracker on port 50030 gives no response at all. Checking netstat, sometimes it shows 50030 and sometimes not; connections and ports with the data nodes are shown there. Then, if I ran another Pig job, it failed with the following error:

Error before Pig is launched
ERROR 6009: Failed to create job client: Call to hostname/ip-address:50002 failed on local exception: Connection reset by peer
org.apache.pig.backend.executionengine.ExecException: ERROR 6009: Failed to create job client: Call to hostname/ip-address:50002 failed on local exception: Connection reset by peer
    at org.apache.pig.backend.hadoop.executionengine.HExecutionEngine.init(HExecutionEngine.java:217)
    at org.apache.pig.backend.hadoop.executionengine.HExecutionEngine.init(HExecutionEngine.java:137)
    at org.apache.pig.impl.PigContext.connect(PigContext.java:199)
    at org.apache.pig.PigServer.<init>(PigServer.java:169)
    at org.apache.pig.PigServer.<init>(PigServer.java:158)
    at org.apache.pig.tools.grunt.Grunt.<init>(Grunt.java:54)
    at org.apache.pig.Main.main(Main.java:395)
Caused by: java.io.IOException: Call to hostname/ip-address:50002 failed on local exception: Connection reset by peer
    at org.apache.hadoop.ipc.Client.call(Client.java:699)
    at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:216)
    at org.apache.hadoop.mapred.$Proxy1.getProtocolVersion(Unknown Source)
    at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:319)
    at org.apache.hadoop.mapred.JobClient.createRPCProxy(JobClient.java:435)
    at org.apache.hadoop.mapred.JobClient.init(JobClient.java:429)
    at org.apache.hadoop.mapred.JobClient.<init>(JobClient.java:398)
    at org.apache.pig.backend.hadoop.executionengine.HExecutionEngine.init(HExecutionEngine.java:212)
    ... 6 more
Caused by: java.io.IOException: Connection reset by peer
    at sun.nio.ch.FileDispatcher.read0(Native Method)
    at sun.nio.ch.SocketDispatcher.read(SocketDispatcher.java:21)
    at sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:233)
    at sun.nio.ch.IOUtil.read(IOUtil.java:206)
    at sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:236)
    at org.apache.hadoop.net.SocketInputStream$Reader.performIO(SocketInputStream.java:55)
    at org.apache.hadoop.net.SocketIOWithTimeout.doIO(SocketIOWithTimeout.java:140)
    at org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:150)
    at org.apache.hadoop.net.SocketInputStream.read(SocketInput
Re: CDH2 or Apache Hadoop - Official Debian packages
On 2/25/10 8:39 AM, "Thomas Koch" wrote:
>>> - no version namespace, everything is called just "hadoop", not
>>> "hadoop-0.18" or "hadoop-0.20" as in the cloudera package
>>
>> ... and thus making upgrades really hard and not suitable for anything
>> "real".
>
> Actually my hope is in the plan of hadoop to once establish a stable API (as
> planned) so that an upgrade will be backwards compatible.

History shows you are in for a long wait. It is also worth pointing out that API compat is only part of the issue. Without ABI compat, it is still a very rough road. [A point lost on way too many in the Hadoop community; too many devs, not enough ops.]
Use intermediate compression for Map output or not?
Hi Hadoop gurus,

Here's a question about intermediate compression. As I understand it, the point of compressing map output is to reduce the network traffic that occurs when feeding intermediate files from map tasks to reduce tasks that do not reside on the same boxes. So, depending on various factors (how the cluster is set up, the data size, the nature of the problem, the quality of the m/r program such as a Pig script, etc.), this reduction in network traffic may or may not compensate for the time spent on compression and decompression. In other words, intermediate compression may not reach its goal of reducing the overall time cost of an m/r job.

A blog post (http://blog.oskarsson.nu/2009/03/hadoop-feat-lzo-save-disk-space-and.html) gives compression/decompression ratios and speeds and reports a positive result from compressing the raw input to an m/r job, but no test or insight about intermediate compression. So I am wondering whether there is any case study or test results guiding when to use intermediate compression: pros and cons, settings, pitfalls and gains...

Thanks,
Michael
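For what it's worth, the knobs in 0.20 are mapred.compress.map.output=true plus mapred.map.output.compression.codec, and the break-even is roughly "codec time < transfer time saved". A toy back-of-the-envelope model in Python; every throughput number below is an illustrative assumption, not a measurement:

```python
def shuffle_time(out_bytes, net_Bps, ratio=1.0, comp_Bps=None, decomp_Bps=None):
    """Rough wall-clock cost of moving map output through the shuffle.
    ratio: compressed size / original size; *_Bps: throughput in bytes/sec."""
    t = 0.0
    if comp_Bps:
        t += out_bytes / comp_Bps            # compress on the map side
    t += out_bytes * ratio / net_Bps         # transfer (smaller when compressed)
    if decomp_Bps:
        t += out_bytes * ratio / decomp_Bps  # decompress on the reduce side
    return t

GB = 1 << 30
# Effective shuffle bandwidth is often far below link speed under contention.
plain = shuffle_time(10 * GB, net_Bps=30e6)
lzo = shuffle_time(10 * GB, net_Bps=30e6, ratio=0.5,
                   comp_Bps=250e6, decomp_Bps=500e6)
print(round(plain), round(lzo), lzo < plain)  # → 358 233 True
```

With a fast, uncontended network or a slow codec the inequality flips, which is exactly the "may or may not compensate" effect described above; the model also ignores that compression overlaps with map CPU and saves spill I/O, so it understates the win on I/O-bound jobs.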
Re: Sun JVM 1.6.0u18
On Thu, Feb 25, 2010 at 11:09 AM, Scott Carey wrote:
> On Feb 15, 2010, at 9:54 PM, Todd Lipcon wrote:
>> Hey all,
>>
>> Just a note that you should avoid upgrading your clusters to 1.6.0u18.
>> We've seen a lot of segfaults or bus errors on the DN when running
>> with this JVM - Stack found the same thing on one of his clusters as
>> well.
>
> Have you seen this for 32-bit, 64-bit, or both? If 64-bit, was it with
> -XX:+UseCompressedOops?

Just 64-bit, no compressed oops. But I haven't tested other variables.

> Any idea if there are Sun bugs open for the crashes?

I opened one, yes. I think Stack opened a separate one. Haven't heard back.

> I have found some notes that suggest that "-XX:-ReduceInitialCardMarks"
> will work around some known crash problems with 6u18, but that may be
> unrelated.

Yep, I think that is probably a likely workaround as well. For now I'm recommending a downgrade to our clients, rather than introducing cryptic XX flags :)

> Lastly, I assume that Java 6u17 should work the same as 6u16, since it is a
> minor patch over 6u16, where 6u18 includes a new version of HotSpot. Can
> anyone confirm that?

I haven't heard anything bad about u17 either. But since we know u16 to be very good and nothing important is new in u17, I like to recommend u16 still.

-Todd
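For anyone who does prefer the flag to a downgrade, a sketch of where it would go (assuming the stock conf/hadoop-env.sh layout; note this workaround is untested against the u18 crashes, per the thread above):

```shell
# conf/hadoop-env.sh: pass the suggested workaround flag to all Hadoop daemons
export HADOOP_OPTS="$HADOOP_OPTS -XX:-ReduceInitialCardMarks"
```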
Re: Sun JVM 1.6.0u18
On Feb 15, 2010, at 9:54 PM, Todd Lipcon wrote:
> Hey all,
>
> Just a note that you should avoid upgrading your clusters to 1.6.0u18.
> We've seen a lot of segfaults or bus errors on the DN when running
> with this JVM - Stack found the same thing on one of his clusters as
> well.

Have you seen this for 32-bit, 64-bit, or both? If 64-bit, was it with -XX:+UseCompressedOops?

Any idea if there are Sun bugs open for the crashes?

I have found some notes that suggest that "-XX:-ReduceInitialCardMarks" will work around some known crash problems with 6u18, but that may be unrelated.

Lastly, I assume that Java 6u17 should work the same as 6u16, since it is a minor patch over 6u16, where 6u18 includes a new version of HotSpot. Can anyone confirm that?

> We've found 1.6.0u16 to be very stable.
>
> -Todd
Re: CDH2 or Apache Hadoop - Official Debian packages
On Feb 25, 2010, at 10:20 AM, Allen Wittenauer wrote:
>> Actually my hope is in the plan of hadoop to once establish a stable API (as
>> planned) so that an upgrade will be backwards compatible.
>
> History shows you are in for a long wait.

I hope not, and I'm trying to make sure that isn't true. At this point, we have a lot of customers inside Yahoo who yell at our SVP when anyone breaks API compatibility with the previous release. My hope is to get to the point where we do one major release a year, and each major release is backwards compatible with the previous major release (as in: you don't need to recompile your code). Bonus points if we can get a minor release out at the half-year point. And of course bug-fix releases as needed...

-- Owen
Re: CDH2 or Apache Hadoop - Official Debian packages
Allen,

> For all intents and purposes, the Debian package sounds just like a
> re-packaging of the Apache distribution in .deb form.

You're perfectly right. Most Debian packages are "just" a re-packaging of the upstream projects, but with additional management information and logic to ease installation and make them work well on the platform and together with other programs. It's the beautiful world of package management:

apt-get install hadoop
less /usr/share/doc/hadoop/README
... Have fun with hadoop

>> - no version namespace, everything is called just "hadoop", not
>> "hadoop-0.18" or "hadoop-0.20" as in the cloudera package
>
> ... and thus making upgrades really hard and not suitable for anything
> "real".

Actually my hope is in Hadoop's plan to eventually establish a stable API (as planned) so that upgrades will be backwards compatible. As long as that isn't the case, the Debian package is intended for only three audiences:

- People who are willing to deal with any upgrade hassles for the benefit of an official Debian package
- People who'd like to try out and learn Hadoop with an easily installable package
- Me

That said, I'm going to use the Debian package on a tiny production cluster of 5 machines.

Thomas Koch, http://www.koch.ro
Re: Hadoop key mismatch
On Wed, Feb 24, 2010 at 3:30 PM, Larry Homes wrote:
> Hello,
>
> I am trying to sort some values by using a simple map and reduce
> without any processing, but I think I messed up my data types somehow.
>
> Rather than try to paste code in an email, I have described the
> problem and pasted all the code (nicely formatted) here:
> http://www.coderanch.com/t/484435/Distributed-Java/java/Hadoop-key-mismatch
>
> Thanks

I think the first problem you are having is that you changed the signature of the map method incorrectly:

public void map(Text key, Text value, Context context)

The type of the key should be LongWritable: with the default input format, the key is a long holding the byte offset at which the line starts, and the value is the entire line of text. Try:

public void map(LongWritable key, Text value, Context context)

Adjust accordingly and you should be OK (at least until the next problem :)
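The byte-offset keys can be seen outside Hadoop with a toy sketch of what the default TextInputFormat hands each map() call (plain Python mimicking the behavior, not Hadoop code):

```python
def text_input_records(data: bytes):
    """Yield (key, value) pairs the way TextInputFormat does:
    key = byte offset of the line start (a LongWritable in Hadoop),
    value = the line's text without the newline (a Text)."""
    offset = 0
    for line in data.splitlines(keepends=True):
        yield offset, line.rstrip(b"\r\n").decode()
        offset += len(line)  # advance by the full line, newline included

records = list(text_input_records(b"foo\nlonger line\nbar\n"))
print(records)  # → [(0, 'foo'), (4, 'longer line'), (16, 'bar')]
```

This is why a mapper declared with a Text key mismatches: the framework is handing it offsets, not text.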
Re: java.net.SocketException: Network is unreachable
Finally got this problem solved. Edit /etc/sysctl.d/bindv6only.conf and change net.ipv6.bindv6only=1 to net.ipv6.bindv6only=0; the error goes away.

neo anderson wrote:
>
> While running the example program ('hadoop jar *example*jar pi 2 2'), I
> encounter a 'Network is unreachable' problem (at
> $HADOOP_HOME/logs/userlogs/.../stderr), as below:
>
> Exception in thread "main" java.io.IOException: Call to /127.0.0.1:
> failed on local exception: java.net.SocketException: Network is unreachable
> at org.apache.hadoop.ipc.Client.wrapException(Client.java:774)
> at org.apache.hadoop.ipc.Client.call(Client.java:742)
> at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:220)
> at org.apache.hadoop.mapred.$Proxy0.getProtocolVersion(Unknown Source)
> at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:359)
> at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:346)
> at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:383)
> at org.apache.hadoop.mapred.Child.main(Child.java:64)
> Caused by: java.net.SocketException: Network is unreachable
> at sun.nio.ch.Net.connect(Native Method)
> at sun.nio.ch.SocketChannelImpl.connect(SocketChannelImpl.java:507)
> at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:192)
> at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:404)
> at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:304)
> at org.apache.hadoop.ipc.Client$Connection.access$1700(Client.java:176)
> at org.apache.hadoop.ipc.Client.getConnection(Client.java:859)
> at org.apache.hadoop.ipc.Client.call(Client.java:719)
> ... 6 more
>
> Initially it seemed to me to be a firewall issue, but after disabling
> iptables the example program still could not execute correctly.
>
> Commands for disabling iptables:
> iptables -P INPUT ACCEPT
> iptables -P FORWARD ACCEPT
> iptables -P OUTPUT ACCEPT
> iptables -X
> iptables -F
>
> When starting up the Hadoop cluster (start-dfs.sh and start-mapred.sh), it
> looks like the namenode was correctly started, because the namenode log
> contains:
>
> ... org.apache.hadoop.net.NetworkTopology: Adding a new node: /default-rack/111.222.333.5:10010
> ... org.apache.hadoop.net.NetworkTopology: Adding a new node: /default-rack/111.222.333.4:10010
> ... org.apache.hadoop.net.NetworkTopology: Adding a new node: /default-rack/111.222.333.3:10010
>
> Also, in the datanode log:
> ...
> INFO org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src: /111.222.333.4:34539, dest: /111.222.333.5:50010, bytes: 4, op: HDFS_WRITE, ...
> INFO org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src: /111.222.333.4:51610, dest: /111.222.333.3:50010, bytes: 118, op: HDFS_WRITE, cliID: ...
> ...
>
> The command 'hadoop fs -ls' can list the data uploaded to HDFS without a
> problem, and jps shows the necessary processes are running.
>
> name node:
> 7710 SecondaryNameNode
> 7594 NameNode
> 8038 JobTracker
>
> data nodes:
> 3181 TaskTracker
> 3000 DataNode
>
> Environment: Debian squeeze, hadoop 0.20.1, jdk 1.6.x
>
> I searched online and couldn't find the possible root cause. Is there
> anything that may cause such an issue? Or any place where I might be able
> to check for more detail?
>
> Thanks for help.

-- 
View this message in context: http://old.nabble.com/java.net.SocketException%3A-Network-is-unreachable-tp27714253p27714443.html
Sent from the Hadoop core-user mailing list archive at Nabble.com.
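The same fix expressed as commands (Debian-specific; the sed pattern assumes the default spelling of that line in the shipped file, with or without spaces around the equals sign):

```shell
# Let Java reach IPv4 addresses again on Debian squeeze:
# flip bindv6only off, then reload the sysctl settings.
sudo sed -i 's/net.ipv6.bindv6only *= *1/net.ipv6.bindv6only = 0/' \
    /etc/sysctl.d/bindv6only.conf
sudo invoke-rc.d procps restart
```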
Reduce step never starts, can't read output from mappers? (Too many fetch-failures)
(re-posted from the mapreduce-user list in case anyone here might have an answer)

Hello,

I have set up a cluster with one NameNode/JobTracker and three DataNode/TaskTrackers, and I am having issues with the reduce step being unable to start. Masters and slaves can ping and ssh to each other. I am attaching my conf files (same on all machines). Is there anything else I should be looking at? Here is the log output for the JobTracker and one of the TaskTrackers that seems suspicious.

JobTracker:

exj...@exjobb-1:~$ hadoop jar /opt/hadoop/hadoop-0.20.1-examples.jar wordcount input/sessions-20100205145800.txt output-wordcount
10/02/24 11:15:24 INFO input.FileInputFormat: Total input paths to process : 1
10/02/24 11:15:25 INFO mapred.JobClient: Running job: job_201002240852_0003
10/02/24 11:15:26 INFO mapred.JobClient: map 0% reduce 0%
10/02/24 11:15:49 INFO mapred.JobClient: map 1% reduce 0%
10/02/24 11:15:58 INFO mapred.JobClient: map 2% reduce 0%
10/02/24 11:16:06 INFO mapred.JobClient: map 3% reduce 0%
10/02/24 11:16:15 INFO mapred.JobClient: map 4% reduce 0%
10/02/24 11:16:23 INFO mapred.JobClient: map 5% reduce 0%
10/02/24 11:16:32 INFO mapred.JobClient: map 6% reduce 0%
10/02/24 11:16:40 INFO mapred.JobClient: map 7% reduce 0%
10/02/24 11:16:51 INFO mapred.JobClient: map 8% reduce 0%
10/02/24 11:16:59 INFO mapred.JobClient: map 9% reduce 0%
10/02/24 11:17:07 INFO mapred.JobClient: map 10% reduce 0%
10/02/24 11:17:31 INFO mapred.JobClient: map 11% reduce 0%
10/02/24 11:17:39 INFO mapred.JobClient: map 12% reduce 0%
10/02/24 11:17:49 INFO mapred.JobClient: map 13% reduce 0%
10/02/24 11:17:57 INFO mapred.JobClient: map 14% reduce 0%
10/02/24 11:18:05 INFO mapred.JobClient: map 15% reduce 0%
10/02/24 11:18:15 INFO mapred.JobClient: map 16% reduce 0%
10/02/24 11:18:23 INFO mapred.JobClient: map 17% reduce 0%
10/02/24 11:18:32 INFO mapred.JobClient: map 18% reduce 0%
10/02/24 11:18:42 INFO mapred.JobClient: map 19% reduce 0%
10/02/24 11:18:51 INFO mapred.JobClient: map 20% reduce 0%
10/02/24 11:19:11 INFO mapred.JobClient: map 21% reduce 0%
10/02/24 11:19:22 INFO mapred.JobClient: map 22% reduce 0%
10/02/24 11:19:32 INFO mapred.JobClient: map 23% reduce 0%
10/02/24 11:19:40 INFO mapred.JobClient: map 24% reduce 0%
10/02/24 11:19:49 INFO mapred.JobClient: map 25% reduce 0%
10/02/24 11:19:57 INFO mapred.JobClient: map 26% reduce 0%
10/02/24 11:20:05 INFO mapred.JobClient: map 27% reduce 0%
10/02/24 11:20:15 INFO mapred.JobClient: map 28% reduce 0%
10/02/24 11:20:24 INFO mapred.JobClient: map 29% reduce 0%
10/02/24 11:20:34 INFO mapred.JobClient: map 30% reduce 0%
10/02/24 11:20:52 INFO mapred.JobClient: map 31% reduce 0%
10/02/24 11:21:02 INFO mapred.JobClient: map 32% reduce 0%
10/02/24 11:21:12 INFO mapred.JobClient: map 33% reduce 0%
10/02/24 11:21:21 INFO mapred.JobClient: map 34% reduce 0%
10/02/24 11:21:31 INFO mapred.JobClient: map 35% reduce 0%
10/02/24 11:21:40 INFO mapred.JobClient: map 36% reduce 0%
10/02/24 11:21:49 INFO mapred.JobClient: map 37% reduce 0%
10/02/24 11:21:58 INFO mapred.JobClient: map 38% reduce 0%
10/02/24 11:22:07 INFO mapred.JobClient: map 39% reduce 0%
10/02/24 11:22:17 INFO mapred.JobClient: map 40% reduce 0%
10/02/24 11:22:35 INFO mapred.JobClient: map 41% reduce 0%
10/02/24 11:22:44 INFO mapred.JobClient: map 42% reduce 0%
10/02/24 11:22:53 INFO mapred.JobClient: map 43% reduce 0%
10/02/24 11:23:05 INFO mapred.JobClient: map 44% reduce 0%
10/02/24 11:23:14 INFO mapred.JobClient: map 45% reduce 0%
10/02/24 11:23:22 INFO mapred.JobClient: map 46% reduce 0%
10/02/24 11:23:32 INFO mapred.JobClient: map 47% reduce 0%
10/02/24 11:23:40 INFO mapred.JobClient: map 48% reduce 0%
10/02/24 11:23:50 INFO mapred.JobClient: map 49% reduce 0%
10/02/24 11:23:59 INFO mapred.JobClient: map 50% reduce 0%
10/02/24 11:24:17 INFO mapred.JobClient: map 51% reduce 0%
10/02/24 11:24:27 INFO mapred.JobClient: map 52% reduce 0%
10/02/24 11:24:34 INFO mapred.JobClient: map 53% reduce 0%
10/02/24 11:24:45 INFO mapred.JobClient: map 54% reduce 0%
10/02/24 11:24:57 INFO mapred.JobClient: map 55% reduce 0%
10/02/24 11:25:04 INFO mapred.JobClient: map 56% reduce 0%
10/02/24 11:25:15 INFO mapred.JobClient: map 57% reduce 0%
10/02/24 11:25:22 INFO mapred.JobClient: map 58% reduce 0%
10/02/24 11:25:32 INFO mapred.JobClient: map 59% reduce 0%
10/02/24 11:25:42 INFO mapred.JobClient: map 60% reduce 0%
10/02/24 11:25:57 INFO mapred.JobClient: map 61% reduce 0%
10/02/24 11:26:07 INFO mapred.JobClient: map 62% reduce 0%
10/02/24 11:26:16 INFO mapred.JobClient: map 63% reduce 0%
10/02/24 11:26:24 INFO mapred.JobClient: map 64% reduce 0%
10/02/24 11:26:34 INFO mapred.JobClient: map 65% reduce 0%
10/02/24 11:26:45 INFO mapred.JobClient: map 66% reduce 0%
10/02/24 11:26:56 INFO mapred.JobClient: map 67% reduce 0%
10/02/24 11:27:05 INFO mapred.JobClient: map 68% reduce 0%
10/02/24 11:27:13 INFO mapred.JobClient: map 69% reduce 0%
10/02/24 11:27:17 INFO mapred.JobClien
Re: Seattle Hadoop/Scalability/NoSQL Meetup Tonight!
Thanks for coming, everyone! We had around 25 people. A *huge* success, for Seattle. And a big thanks to 10gen for sending Richard. Can't wait to see you all next month.

On Wed, Feb 24, 2010 at 2:15 PM, Bradford Stephens wrote:
> The Seattle Hadoop/Scalability/NoSQL (yeah, we vary the title) meetup
> is tonight! We're going to have a guest speaker from MongoDB :)
>
> As always, it's at the University of Washington, Allen Computer
> Science building, Room 303 at 6:45pm. You can find a map here:
> http://www.washington.edu/home/maps/southcentral.html?cse
>
> If you can, please RSVP here (not required, but very nice):
> http://www.meetup.com/Seattle-Hadoop-HBase-NoSQL-Meetup/
>
> -- 
> http://www.drawntoscalehq.com -- The intuitive, cloud-scale data
> solution. Process, store, query, search, and serve all your data.
>
> http://www.roadtofailure.com -- The Fringes of Scalability, Social
> Media, and Computer Science

-- 
http://www.drawntoscalehq.com -- The intuitive, cloud-scale data solution. Process, store, query, search, and serve all your data.
http://www.roadtofailure.com -- The Fringes of Scalability, Social Media, and Computer Science