date:20080604

Re: Percent progress of map/reduce in JobClient

2008-06-04 Thread Tanton Gibbs

Doesn't seem to be, to me. Seems to be an indicator of records On Wed, Jun 4, 2008 at 5:53 PM, Daniel Blaisdell <[EMAIL PROTECTED]> wrote: > Is the map progress indicator computed as a percentage of maps completed? > > -Daniel > > On Wed, Jun 4, 2008 at 6:51 PM, Tanton Gibbs <[EMAIL PROTECTED]>

Re: Monthly user group meeting

2008-06-04 Thread Otis Gospodnetic

Hi, Any chance the videos will be taken *and* made available outside Yahoo? The videos from the Hadoop summit are still not available: http://developer.yahoo.com/blogs/hadoop/2008/04/hadoop_summit_slides_and_video.html And at this point it looks like they never will be available :( Thanks, Ot

Re: [core-user] Help deflating output files

2008-06-04 Thread Jim R. Wilson

Has someone already written a generic deflator program? It would be a great util to add to the core :) -- Jim On Wed, Jun 4, 2008 at 7:27 PM, Runping Qi <[EMAIL PROTECTED]> wrote: > > You can run another map-only job to read convert the deflated files and > write them out in the format you want.

RE: [core-user] Help deflating output files

2008-06-04 Thread Runping Qi

You can run another map-only job to read convert the deflated files and write them out in the format you want. Runping > -Original Message- > From: Jim R. Wilson [mailto:[EMAIL PROTECTED] > Sent: Wednesday, June 04, 2008 4:13 PM > To: core-user@hadoop.apache.org > Subject: [core-user] H

Re: compressed/encrypted file

2008-06-04 Thread Parand Darugar

- Original Message - From: [EMAIL PROTECTED] <[EMAIL PROTECTED]> To: core-user@hadoop.apache.org Sent: Wed Jun 04 15:06:42 2008 Subject: Re: compressed/encrypted file You can compress / decompress at many points: --prior to mapping --after mapping --after reducing (I've been experim

[core-user] Help deflating output files

2008-06-04 Thread Jim R. Wilson

Hi all, I'm using hadoop-streaming to execute Python jobs in an EC2 cluster. The output directory in HDFS has part-0.deflate files - how can I deflate them back into regular text? In my hadoop-site.xml, I unfortunately have: mapred.output.compress true mapred.output.compression.type
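For reference, a minimal sketch of inflating such a file outside Hadoop once it has been copied to local disk. It assumes the .deflate files are plain zlib streams, which is what the default compression codec is generally understood to write; the function name and the file names in the usage note are hypothetical and do not come from the thread.

```python
import zlib


def inflate_stream(src, dst, chunk_size=64 * 1024):
    """Copy a zlib-compressed binary stream from src to dst, decompressing in chunks.

    Assumption: the input is a single zlib stream (RFC 1950), not gzip or raw deflate.
    """
    decompressor = zlib.decompressobj()
    while True:
        chunk = src.read(chunk_size)
        if not chunk:
            break
        dst.write(decompressor.decompress(chunk))
    # Emit whatever the decompressor is still holding in its buffer.
    dst.write(decompressor.flush())
```

Usage would be along the lines of opening a local copy in binary mode, e.g. `inflate_stream(open("part.deflate", "rb"), open("part.txt", "wb"))`. Reading in chunks keeps memory use flat for large outputs.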

Re: Percent progress of map/reduce in JobClient

2008-06-04 Thread Daniel Blaisdell

Is the map progress indicator computed as a percentage of maps completed? -Daniel On Wed, Jun 4, 2008 at 6:51 PM, Tanton Gibbs <[EMAIL PROTECTED]> wrote: > From what I've read, there are three reduce phases 1. copy 2. sort 3. > reduce > From 0 - 33% is the copy phase. I guess if you don't need

Re: compressed/encrypted file

2008-06-04 Thread Arun C Murthy

Haijun, On Jun 4, 2008, at 3:45 PM, Haijun Cao wrote: Mile, Thanks. "If your inputs to maps are compressed, then you don't get any automatic assignment of mappers to your data: each gzipped file gets assigned a mapper." <--- this is the case I am talking about. With the current compres

Re: Percent progress of map/reduce in JobClient

2008-06-04 Thread Tanton Gibbs

From what I've read, there are three reduce phases 1. copy 2. sort 3. reduce From 0 - 33% is the copy phase. I guess if you don't need that phase it could skip this completely. After 33%, it waits until it is done sorting before outputting status again at 66%, then it updates regularly during th
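The three-phase description above can be sketched as a small calculation. This is only an illustration of the "each phase is one third" reading given in the message, not Hadoop's actual progress code; the names PHASES and reduce_progress are hypothetical.

```python
# Phases in the order described in the message; each is assumed to be one third.
PHASES = ("copy", "sort", "reduce")


def reduce_progress(phase, fraction_done):
    """Return overall reduce progress in [0, 1] for a phase and its own completion fraction."""
    if not 0.0 <= fraction_done <= 1.0:
        raise ValueError("fraction_done must be between 0 and 1")
    index = PHASES.index(phase)  # raises ValueError for an unknown phase
    return (index + fraction_done) / len(PHASES)
```

Under this reading, a reducer that has finished copying and sorting but has not yet started reducing would report about 66%, which matches the jump described in the message.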

RE: compressed/encrypted file

2008-06-04 Thread Haijun Cao

Mile, Thanks. "If your inputs to maps are compressed, then you don't get any automatic assignment of mappers to your data: each gzipped file gets assigned a mapper." <--- this is the case I am talking about. Haijun -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] O

Re: compressed/encrypted file

2008-06-04 Thread Miles Osborne

You can compress / decompress at many points: --prior to mapping --after mapping --after reducing (I've been experimenting with all these options; we have been crawling blogs every day since Feb and we store on DFS compressed sets of posts) If your inputs to maps are compressed, then you don't

Monthly user group meeting

2008-06-04 Thread Ajay Anand

The next user group meeting is scheduled for June 18th from 6-7:30 pm at the Yahoo! Mission College campus (2821 Mission College, Santa Clara). Registration, driving directions etc are at http://upcoming.yahoo.com/event/760573/ Agenda: 1) Hadoop at Facebook, Hive - Jeff Hammerbacher 2)

compressed/encrypted file

2008-06-04 Thread Haijun Cao

If a file is compressed and encrypted, then is it still possible to split it and run mappers in parallel? Do people compress their files stored in hadoop? If yes, how do you go about processing them in parallel? Thanks Haijun

Percent progress of map/reduce in JobClient

2008-06-04 Thread Stuart Sierra

How does Hadoop decide when to update the "percent complete" for map/reduce tasks? I've been running a small job (~150 MB) on a pseudo-distributed cluster. "bin/hadoop jar" prints: 08/06/04 17:02:16 INFO mapred.JobClient: map 0% reduce 0% 08/06/04 17:05:52 INFO mapred.JobClient: map 100% reduc

Re: confusing about decommission in HDFS

2008-06-04 Thread lohit

The 3 steps you mentioned, were they done while namenode was still running? I think (I might be wrong as well), that the config is read only once, when the namenode is started. So, you should have defined dfs.hosts.exclude file beforehand. When you want to refresh, you just updated the file alr

Re: Stackoverflow

2008-06-04 Thread Chris Douglas

The pivot selection is the median of the first, middle, and last elements; it should be the best choice for sorted data. It's still possible to pick bad pivots, but data that forces hundreds of consecutive bad pivot selections should be exceedingly rare. -C On Jun 4, 2008, at 9:24 AM, Doug

Re: Upgrade from 0.16.3 to 0.17.0

2008-06-04 Thread Tanton Gibbs

1) Yes, that is normal. You have to manually finalize the upgrade. 2) Probably, because (as I understand it), it keeps a backup of the pre-upgraded state. 3) you can use hadoop dfsadmin -finalizeUpgrade to finalize it. See here: http://wiki.apache.org/hadoop/Hadoop_Upgrade 4) I assume the finaliz

checking per-node health (jobs, tasks, failures)?

2008-06-04 Thread Meng Mao

I'm trying to implement Nagios health monitoring of a Hadoop grid. If anyone has general tips to share, those would be welcome, too. For those who don't know, Nagios is monitoring software that organizes and manages checking of services. As best as I know, the easiest, most decoupled way to monito

Re: Stackoverflow

2008-06-04 Thread Doug Cutting

Andreas Kostyrka wrote: java.lang.StackOverflowError at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.compare(MapTask.java:494) at org.apache.hadoop.util.QuickSort.fix(QuickSort.java:29) at org.apache.hadoop.util.QuickSort.sort(QuickSort.java:58) at org.apache.

confusing about decommission in HDFS

2008-06-04 Thread Xiangna Li

hi, I am trying to decommission a node with the following steps: (1) write the hostname of the node to be decommissioned in a file used as the exclude file. (2) specify the exclude file in the configuration parameter dfs.hosts.exclude. (3) run "bin/hadoop dfsadmin -refreshNodes". It

Re: hadoop on EC2

2008-06-04 Thread Chris K Wensel

These are the FoxyProxy wildcards I use *compute-1.amazonaws.com* *.ec2.internal* *.compute-1.internal* and w/ hadoop 0.17.0, just type (after booting your cluster) hadoop-ec2 proxy to start the tunnel for that cluster On Jun 3, 2008, at 11:26 PM, James Moore wrote: On Tue, Jun 3, 2008 at

RE: Stackoverflow

2008-06-04 Thread Devaraj Das

Hi Andreas, Here is what I did: bin/hadoop jar build/hadoop-0.18.0-dev-examples.jar randomtextwriter -Dtest.randomtextwrite.min_words_key=40 -Dtest.randomtextwrite.max_words_key=50 -Dtest.randomtextwrite.maps_per_host=1 textinput (this would generate 1GB of text data with pretty long sentences. R

Upgrade from 0.16.3 to 0.17.0

2008-06-04 Thread Iván de Prado

I have upgraded from 0.16.3 to 0.17.0 correctly. But after a few days the disk usage has increased. I have noticed that there are two folders in the data nodes: - current -> With version -13 - previous -> With version -11 And I have this message in the HDFS webapp: Upgrade for version -13 ha

Re: hadoop on EC2

2008-06-04 Thread Steve Loughran

Andreas Kostyrka wrote: Well, the basic "trouble" with EC2 is that clusters usually are not networks in the TCP/IP sense. This makes it painful to decide which URLs should be resolved where. Plus to make it even more painful, you cannot easily run it with one simple SOCKS server, because you

Re: Stackoverflow

2008-06-04 Thread Steve Loughran

Andreas Kostyrka wrote: Ok, a new dead job: ;( This time after 2.4GB/11,3M lines ;( Any idea what I could do to debug this? (No idea how to go at debugging a Java process that is distributed and does GBs of data. It's one of the big problems of distributed computing; distributed debugging How

RE: setrep

2008-06-04 Thread Haijun Cao

Lohit, Thanks for the explanation. If that's the case, then it is not slower than expected. Haijun -Original Message- From: lohit [mailto:[EMAIL PROTECTED] Sent: Wed 6/4/2008 2:11 AM To: core-user@hadoop.apache.org Subject: Re: setrep >It seems that setrep won't force replicatio


26 matches
