map tasks fail to report status

2008-11-03 Thread Sébastien Rainville
Hi,

I have a map task that works most of the time but fails on some data. I keep
getting these exceptions:

Task attempt_200811031947_0003_m_95_0 failed to report status for 600
seconds. Killing!


I noticed that the tasks that fail have a lot of these at the end of the
syslogs:

2008-11-03 21:05:52,745 INFO org.apache.hadoop.mapred.Merger: Merging 41
sorted segments
2008-11-03 21:05:52,746 INFO org.apache.hadoop.mapred.Merger: Merging 5
intermediate segments out of a total of 41
2008-11-03 21:05:53,016 INFO org.apache.hadoop.mapred.Merger: Merging 10
intermediate segments out of a total of 37
2008-11-03 21:05:53,147 INFO org.apache.hadoop.mapred.Merger: Merging 10
intermediate segments out of a total of 28
2008-11-03 21:05:53,329 INFO org.apache.hadoop.mapred.Merger: Merging 10
intermediate segments out of a total of 19
2008-11-03 21:05:53,525 INFO org.apache.hadoop.mapred.Merger: Down to the
last merge-pass, with 10 segments left of total size: 7866139 bytes
2008-11-03 21:05:53,848 INFO org.apache.hadoop.mapred.MapTask: Index:
(2465254733, 7866121, 7866121)
2008-11-03 21:05:53,900 INFO org.apache.hadoop.mapred.Merger: Merging 41
sorted segments
2008-11-03 21:05:53,900 INFO org.apache.hadoop.mapred.Merger: Merging 5
intermediate segments out of a total of 41
2008-11-03 21:05:53,963 INFO org.apache.hadoop.mapred.Merger: Merging 10
intermediate segments out of a total of 37
2008-11-03 21:05:53,976 INFO org.apache.hadoop.mapred.Merger: Merging 10
intermediate segments out of a total of 28
2008-11-03 21:05:53,996 INFO org.apache.hadoop.mapred.Merger: Merging 10
intermediate segments out of a total of 19
2008-11-03 21:05:54,013 INFO org.apache.hadoop.mapred.Merger: Down to the
last merge-pass, with 10 segments left of total size: 4290611 bytes
...


Sure, the ones that succeed have them too, but the number of segments is
always significantly lower:

2008-11-03 20:42:38,214 INFO org.apache.hadoop.mapred.MapTask: Index:
(125745724, 351203, 351203)
2008-11-03 20:42:38,221 INFO org.apache.hadoop.mapred.Merger: Merging 2
sorted segments
2008-11-03 20:42:38,221 INFO org.apache.hadoop.mapred.Merger: Down to the
last merge-pass, with 2 segments left of total size: 345895 bytes
2008-11-03 20:42:38,226 INFO org.apache.hadoop.mapred.MapTask: Index:
(126096927, 345893, 345893)
2008-11-03 20:42:38,232 INFO org.apache.hadoop.mapred.Merger: Merging 2
sorted segments
2008-11-03 20:42:38,232 INFO org.apache.hadoop.mapred.Merger: Down to the
last merge-pass, with 2 segments left of total size: 364718 bytes
2008-11-03 20:42:38,237 INFO org.apache.hadoop.mapred.MapTask: Index:
(126442820, 364716, 364716)
2008-11-03 20:42:38,241 INFO org.apache.hadoop.mapred.Merger: Merging 2
sorted segments
2008-11-03 20:42:38,241 INFO org.apache.hadoop.mapred.Merger: Down to the
last merge-pass, with 2 segments left of total size: 440435 bytes
2008-11-03 20:42:38,247 INFO org.apache.hadoop.mapred.MapTask: Index:
(126807536, 440433, 440433)


I don't get any exceptions besides the timeouts because the tasks don't
report their status. So, my questions are:
- what exactly is the Merger? Why is it only merging at the end of the
tasks? Why does it seem to merge the same data several times?
- Can it really be causing the problem, or should I look somewhere else
(there's no exception after all)? It's most probably in my code, but I don't
see any exception so it's kind of hard to tell what's happening.
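
One thing I have seen suggested for map tasks that legitimately go quiet for
more than 600 seconds is to report progress from inside map(); a rough sketch
with the classic mapred API (the generic workaround, not something specific to
my job):

import java.io.IOException;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.MapReduceBase;
import org.apache.hadoop.mapred.Mapper;
import org.apache.hadoop.mapred.OutputCollector;
import org.apache.hadoop.mapred.Reporter;

public class SlowMapper extends MapReduceBase
    implements Mapper<LongWritable, Text, Text, Text> {
  public void map(LongWritable key, Text value,
                  OutputCollector<Text, Text> output, Reporter reporter)
      throws IOException {
    // ... long-running per-record work goes here ...
    reporter.progress();                     // tell the TaskTracker we are still alive
    reporter.setStatus("processed " + key);  // optional human-readable status
  }
}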

Thanks in advance,
Sebastien


Interaction between JobTracker / TaskTracker

2008-11-03 Thread Ricky Ho
Hi,
As a relatively new user of Hadoop, I am trying to construct an architectural 
picture of how Hadoop processes interact when performing a Map/Reduce job.  
I haven't found much documentation on it.  Can someone suggest any links or 
documentation that describe this in detail?
Here I am making some guesses based on what I've seen from the API, the Admin 
interface, and what the simplest possible implementation could be.  I haven't 
looked at the source code, and I know my guess is probably wrong because I am 
assuming the simplest, unsophisticated implementation.  I am trying to lay the 
groundwork for an expert who is familiar with the underlying implementation to 
correct me.
So here is my guess ... I would appreciate it if anyone can correct me with 
actual implementation knowledge.

1.   To start the map/reduce cluster, run the "start-mapred.sh" script, 
which based on the "hadoop-site.xml" file, starts (via SSH) one "JobTracker" 
process and multiple "TaskTracker" processes across multiple machines.  (Of 
course, here I assume the HDFS is already started).

2.   Now the client program submits a Map/Reduce job by creating a "JobConf" 
object and invoking "JobClient.runJob(jobConf)"; this API uploads the 
Mapper and Reducer implementation classes, as well as the job config, to the 
JobTracker daemon (see the sketch at the end of this list).  At this moment, I 
expect the directory of the input path is frozen; in other words, no new files 
can be created in or removed from the input path.

3.   After assigning a unique Job id, the JobTracker looks at the 
"jobConf.inputPath" to determine how many TaskTrackers are needed for the 
mapper phase, based on the number of files in the input path, as well as how 
busy the existing "TaskTrackers" are.  Then it will select a number of 
"Map-Phase-TaskTrackers" that are relatively idle as well as physically close to 
the HDFS nodes that host a copy (e.g. on the same machine or the same rack).

4.   Based on some scheduling algorithm, the JobTracker determines multiple 
files (from the input path) to assign to each selected Map-Phase-TaskTracker.  
It sends to each TaskTracker the job id, the mapper implementation class 
bytecode, as well as the names of the assigned input files.

5.   Each Map-Phase-TaskTracker process will spawn multiple threads, one 
for each assigned input file (so there is a 1 to 1 correspondence between 
threads and files).  The TaskTracker also monitors the progress of each thread 
associated with this specific job id.  The map phase is now started ...

a.   Each thread will start to read the assigned input file "sequentially", 
one record at a time using the "InputFormat class" specified in the jobConf.

b.  For each record it reads, it invokes the uploaded Mapper implementation 
class's map() method.  Whenever the "output.collect(key, value)" method is 
called, a record is added to an in-memory mapResultHashtable.  This step is 
repeated until the whole input file is consumed (EOF is reached).

c.   Then the thread inspects the mapResultHashtable.  For each key, it 
invokes the uploaded combiner class's reduce() method.  Whenever the 
"output.collect(key, value)" method is called, a record is added to an 
in-memory combineResultHashtable.  Then the thread persists the 
combineResultHashtable into a local file (not HDFS).  Finally the thread quits.

6.   When all the threads associated with the Job ID quit, the 
Map-Phase-TaskTracker sends a "map_complete" notification to the JobTracker.

7.   When the JobTracker receives all the "map_complete" notifications, it 
knows the map phase is completed.  Now it is time to start the "reduce" phase.

8.   The JobTracker looks at the "jobConf.numReduceTask" to determine how 
many Reduce-Phase-TaskTrackers are needed.  It will also select (randomly) 
those Reduce-Phase-TaskTrackers.  For each of them, the JobTracker will send 
the job id as well as the reducer implementation class bytecode.

9.   Now, for each of the previous Map-Phase-TaskTrackers, the JobTracker 
sends a "reduce_ready" message, as well as an array of addresses of the 
Reduce-Phase-TaskTrackers.  Each Map-Phase-TaskTracker will start a thread.

10.   The thread iterates over the persisted combineResultHashtable.  For each key, 
it invokes the partitioner class's getPartition() function to determine the 
partition number and then the address of the Reduce-Phase-TaskTracker.  It then 
opens a network connection to the Reduce-Phase-TaskTracker and passes along the 
key and values.

11.   At each Reduce-Phase-TaskTracker, for each new key received, it will 
spawn a new thread.  This thread will invoke the Reducer.reduce() method (so 
there is a 1 to 1 correspondence between threads and unique keys).  The 
Reduce-Phase-TaskTracker also monitors the progress of each thread associated 
with this specific job id.

a.   Within the reduce() method, whenever the "output.collect(key, value)" 
method is called, a record is written using the "OutputFormat class" specified 
in the jobConf.

b. 
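
To make step 2 concrete, here is a minimal sketch of a job submission with the 
classic org.apache.hadoop.mapred API (MyMapper and MyReducer are placeholders 
for the job's own classes):

import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.FileInputFormat;
import org.apache.hadoop.mapred.FileOutputFormat;
import org.apache.hadoop.mapred.JobClient;
import org.apache.hadoop.mapred.JobConf;

public class SubmitSketch {
  public static void main(String[] args) throws Exception {
    JobConf conf = new JobConf(SubmitSketch.class);  // the containing jar is shipped to the cluster
    conf.setJobName("example");
    conf.setMapperClass(MyMapper.class);             // placeholder Mapper implementation
    conf.setReducerClass(MyReducer.class);           // placeholder Reducer implementation
    conf.setOutputKeyClass(Text.class);
    conf.setOutputValueClass(IntWritable.class);
    FileInputFormat.setInputPaths(conf, new Path(args[0]));
    FileOutputFormat.setOutputPath(conf, new Path(args[1]));
    JobClient.runJob(conf);                          // blocks until the job completes
  }
}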

Re: How to cache a folder in HDFS?

2008-11-03 Thread Karl Anderson


On 1-Nov-08, at 11:57 AM, lamfeeling wrote:


Hi all!

 I have a problem here. In my program, my code will read some config  
files in a folder, but it always fails on Hadoop and says "Can not  
find the file...".  I looked up the reference; it told me to cache  
the files in HDFS instead of reading them locally.

 Now I can cache a file, but I really don't know how to cache a folder.
 Do I need to cache the files one by one?



I am able to cache a directory the same way I cache files, I just give  
the HDFS directory name as I would a file name.  The link created  
points to the directory.
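
In code that looks roughly like this (a sketch from memory; the path and the  
"#confdir" link name are just examples):

import java.net.URI;
import org.apache.hadoop.filecache.DistributedCache;
import org.apache.hadoop.mapred.JobConf;

public class CacheDirSketch {
  public static void addConfigDir(JobConf conf) throws Exception {
    // Add the HDFS directory exactly as you would a single file; the symlink
    // created in each task's working directory then points at the directory.
    DistributedCache.addCacheFile(new URI("/user/me/config#confdir"), conf);
    DistributedCache.createSymlink(conf);  // tasks can then open "confdir/whatever.properties"
  }
}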


Karl Anderson
[EMAIL PROTECTED]
http://monkey.org/~kra





Re: Passing Constants from One Job to the Next

2008-11-03 Thread Aaron Kimball
The Mapper and Reducer interfaces both provide a method 'void
configure(JobConf conf) throws IOException'; if you extend MapReduceBase,
this will provide a dummy implementation of configure(). You can add your
own implementation; it will be called before the first call to map() or
reduce(). You can read your initialization data at this time.
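
For example, something along these lines (a sketch; the property name
"my.job.constant" is made up):

import java.io.IOException;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.mapred.MapReduceBase;
import org.apache.hadoop.mapred.Mapper;
import org.apache.hadoop.mapred.OutputCollector;
import org.apache.hadoop.mapred.Reporter;

public class ConstantMapper extends MapReduceBase
    implements Mapper<LongWritable, Text, Text, IntWritable> {
  private int myConstant;  // set by the driver via conf.setInt("my.job.constant", 42)

  public void configure(JobConf conf) {
    myConstant = conf.getInt("my.job.constant", 0);
  }

  public void map(LongWritable key, Text value,
                  OutputCollector<Text, IntWritable> output, Reporter reporter)
      throws IOException {
    output.collect(value, new IntWritable(myConstant));  // the constant is now usable in map()
  }
}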

- Aaron

On Thu, Oct 30, 2008 at 4:02 PM, Erik Holstad <[EMAIL PROTECTED]> wrote:

> Hi!
> Is there a way of using the value read in the configure() in the Map or
> Reduce phase?
>
> Erik
>
> On Thu, Oct 23, 2008 at 2:40 AM, Aaron Kimball <[EMAIL PROTECTED]> wrote:
>
> > See Configuration.setInt() in the API. (JobConf inherits from
> > Configuration). You can read it back in the configure() method of your
> > mappers/reducers
> > - Aaron
> >
> > On Wed, Oct 22, 2008 at 3:03 PM, Yih Sun Khoo <[EMAIL PROTECTED]> wrote:
> >
> > > Are you saying that I can pass, say, a single integer constant with
> > > either of these three: JobConf? A HDFS file? DistributedCache?
> > > Or are you asking if I can pass given the context of: JobConf? A HDFS
> > > file? DistributedCache?
> > > I'm thinking of how to pass a single int from one JobConf to the next
> > >
> > > On Wed, Oct 22, 2008 at 2:57 PM, Arun C Murthy <[EMAIL PROTECTED]>
> > > wrote:
> > >
> > > >
> > > > On Oct 22, 2008, at 2:52 PM, Yih Sun Khoo wrote:
> > > >
> > > >> I like to hear some good ways of passing constants from one job to
> > > >> the next.
> > > >>
> > > >
> > > > Unless I'm missing something: JobConf? A HDFS file? DistributedCache?
> > > >
> > > > Arun
> > > >
> > > >
> > > >
> > > >> These are some ways that I can think of:
> > > >> 1)  The obvious solution is to carry the constant as part of your
> > > >> value from one job to the next, but that would mean every value would
> > > >> hold that constant
> > > >> 2)  Use the reporter as a hack so that you can set the status message
> > > >> and then get the status message back when you need the constant
> > > >>
> > > >> Any other ideas?  (Also please do not include code)
> > > >>
> > > >
> > > >
> > >
> >
>


Re: Can anyone recommend me an inter-language data file format?

2008-11-03 Thread Chris Dyer
I've been using protocol buffers to serialize the data and then
encoding them in base64 so that I can then treat them like text.  This
obviously isn't optimal, but I'm assuming that this is only a short
term solution which won't be necessary when non-Java clients become
first class citizens of the Hadoop world.
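
The glue is pretty small; roughly like this (a sketch using Apache commons-codec
for the base64 step; on the protobuf side you call toByteArray()/parseFrom() on
your generated message class):

import org.apache.commons.codec.binary.Base64;

public class Base64Lines {
  // Turn serialized bytes (e.g. a protobuf message's toByteArray()) into a
  // single text-safe line, and back again.
  public static String encode(byte[] raw) throws Exception {
    return new String(Base64.encodeBase64(raw), "US-ASCII");
  }
  public static byte[] decode(String line) throws Exception {
    return Base64.decodeBase64(line.getBytes("US-ASCII"));
  }
}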

Chris

On Mon, Nov 3, 2008 at 2:24 PM, Pete Wyckoff <[EMAIL PROTECTED]> wrote:
>
> Protocol buffers, thrift?
>
>
> On 11/3/08 4:07 AM, "Steve Loughran" <[EMAIL PROTECTED]> wrote:
>
> Zhou, Yunqing wrote:
>> embedded database cannot handle large-scale data, not very efficient
>> I have about 1 billion records.
>> these records should be passed through some modules.
>> I mean a data exchange format similar to XML but more flexible and
>> efficient.
>
>
> JSON
> CSV
> erlang-style records (name,value,value,value)
> RDF-triples in non-XML representations
>
> For all of these, you need to test with data that includes things like
> high unicode characters, single and double quotes, to see how well they
> get handled.
>
> you can actually append with XML by not having opening/closing tags,
> just stream out the entries to the tail of the file
> ...
>
> To read this in an XML parser, include it inside another XML file:
>
> <?xml version="1.0"?>
> <!DOCTYPE report [
>   <!ENTITY log SYSTEM "log.xml">
> ]>
> <report>
> &log;
> </report>
>
> I've done this for very big files, as long as you aren't trying to load
> it in-memory to a DOM, things should work
>
> --
> Steve Loughran  http://www.1060.org/blogxter/publish/5
> Author: Ant in Action   http://antbook.org/
>
>
>


Re: hadoop 0.18.1 x-trace

2008-11-03 Thread George Porter

Hi Veiko,

Right now the patches represent an instrumentation API for the RPC  
layer (the X-Trace implementation is not currently part of the patch;  
I'm hoping to submit it as a contrib/ project).  I'll be talking at  
the Hadoop Camp later this week about X-Trace and Hadoop.  There is  
much to do in terms of building UIs, analysis tools, and trace  
storage/query interfaces.  So stay tuned (and if you would be willing  
to talk more about your anticipated uses, please let me know; I'd be  
very interested in talking with you).


Thanks,
George

On Nov 3, 2008, at 11:32 AM, Michael Bieniosek wrote:


Try applying the last one only.

Let us know if it works!

-Michael

On 11/3/08 6:23 AM, "Veiko Schnabel" <[EMAIL PROTECTED]> wrote:

Dear Hadoop Users and Developers,

I have a requirement to monitor the Hadoop cluster using X-Trace.


I found these patches on

http://issues.apache.org/jira/browse/HADOOP-4049


but when I try to integrate them with 0.18.1, I cannot build
Hadoop anymore.


First of all, the patch order is not clear to me.
Can anyone explain which patches I really need, and the order
in which to apply them?


thanks

Veiko




--
Veiko Schnabel
System-Administration
optivo GmbH
Stralauer Allee 2
10245 Berlin

Tel.: +49 30 41724241
Fax: +49 30 41724239
Email:   mailto:[EMAIL PROTECTED]
Website: http://www.optivo.de

Handelsregister: HRB Berlin 88738
Geschäftsführer: Peter Romianowski, Ulf Richter






Re: hadoop 0.18.1 x-trace

2008-11-03 Thread Michael Bieniosek
Try applying the last one only.

Let us know if it works!

-Michael

On 11/3/08 6:23 AM, "Veiko Schnabel" <[EMAIL PROTECTED]> wrote:

Dear Hadoop Users and Developers,

I have a requirement to monitor the Hadoop cluster using X-Trace.

I found these patches on

http://issues.apache.org/jira/browse/HADOOP-4049


but when I try to integrate them with 0.18.1, I cannot build Hadoop anymore.

First of all, the patch order is not clear to me.
Can anyone explain which patches I really need and the order in which to 
apply them?

thanks

Veiko




--
Veiko Schnabel
System-Administration
optivo GmbH
Stralauer Allee 2
10245 Berlin

Tel.: +49 30 41724241
Fax: +49 30 41724239
Email:   mailto:[EMAIL PROTECTED]
Website: http://www.optivo.de

Handelsregister: HRB Berlin 88738
Geschäftsführer: Peter Romianowski, Ulf Richter




Re: Can anyone recommend me an inter-language data file format?

2008-11-03 Thread Pete Wyckoff

Protocol buffers, thrift?


On 11/3/08 4:07 AM, "Steve Loughran" <[EMAIL PROTECTED]> wrote:

Zhou, Yunqing wrote:
> embedded database cannot handle large-scale data, not very efficient
> I have about 1 billion records.
> these records should be passed through some modules.
> I mean a data exchange format similar to XML but more flexible and
> efficient.


JSON
CSV
erlang-style records (name,value,value,value)
RDF-triples in non-XML representations

For all of these, you need to test with data that includes things like
high unicode characters, single and double quotes, to see how well they
get handled.

you can actually append with XML by not having opening/closing tags,
just stream out the entries to the tail of the file
...

To read this in an XML parser, include it inside another XML file:

<?xml version="1.0"?>
<!DOCTYPE report [
  <!ENTITY log SYSTEM "log.xml">
]>
<report>
&log;
</report>

I've done this for very big files, as long as you aren't trying to load
it in-memory to a DOM, things should work

--
Steve Loughran  http://www.1060.org/blogxter/publish/5
Author: Ant in Action   http://antbook.org/




Re: Status FUSE-Support of HDFS

2008-11-03 Thread Pete Wyckoff

+1, but note that while Hadoop itself deals well with such directories currently, 
fuse-dfs will basically lock up on them - this is because ls --color=blah causes 
a stat on every file in a directory.  There is a JIRA open for this, but it is a 
pretty rare case, although it has happened to me at Facebook.

-- pete


>It's good for a portable application to
keep the #of files/directory low by having two levels of directory for
storing files -just use a hash operation to determine which dir to store
a specific file in.


On 11/3/08 4:00 AM, "Steve Loughran" <[EMAIL PROTECTED]> wrote:

Pete Wyckoff wrote:
> It has come a long way since 0.18 and facebook keeps our (0.17) dfs mounted 
> via fuse and uses that for some operations.
>
> There have recently been some problems with fuse-dfs when used in a 
> multithreaded environment, but those have been fixed in 0.18.2 and 0.19. (do 
> not use 0.18 or 0.18.1)
>
> The current (known) issues are:

>   2. When directories have 10s of thousands of files, performance can be very 
> poor.

I've known other filesystems to top out at 64k-1 files per directory,
even if they don't slow down. It's good for a portable application to
keep the #of files/directory low by having two levels of directory for
storing files -just use a hash operation to determine which dir to store
a specific file in.




Re: SecondaryNameNode on separate machine

2008-11-03 Thread Konstantin Shvachko

You can either do what you just described with dfs.name.dir = dirX,
or you can start the name-node with the -importCheckpoint option.
This automates copying the image files from the secondary to the primary.

See here:
http://hadoop.apache.org/core/docs/current/commands_manual.html#namenode
http://hadoop.apache.org/core/docs/current/hdfs_user_guide.html#Secondary+NameNode
http://issues.apache.org/jira/browse/HADOOP-2585#action_12584755

--Konstantin

Tomislav Poljak wrote:

Hi,
Thank you all for your time and your answers!

Now SecondaryNameNode connects to the NameNode (after I configured
dfs.http.address to the NN's http server -> NN hostname on port 50070)
and creates (transfers) the edits and fsimage from the NameNode.

Can you explain to me a little bit more how NameNode failover should work
now? 


For example, SecondaryNameNode now stores fsimage and edits to (SNN's)
dirX and let's say NameNode goes down (disk becomes unreadable). Now I
create/dedicate a new machine for NameNode (also change DNS to point to
this new NameNode machine as nameNode host) and take the data dirX from
SNN and copy it to new NameNode. How do I configure new NameNode to use
data from dirX (do I configure dfs.name.dir to point to dirX and start
new NameNode)?

Thanks,
Tomislav



On Fri, 2008-10-31 at 11:38 -0700, Konstantin Shvachko wrote:

True, dfs.http.address is the NN Web UI address.
This is where the NN http server runs. Besides the Web UI, there is also
a servlet running on that server which is used to transfer the image
and edits from the NN to the secondary using http get.
So SNN uses both addresses fs.default.name and dfs.http.address.

When SNN finishes the checkpoint the primary needs to transfer the
resulting image back. This is done via the http server running on SNN.

Answering Tomislav's question:
The difference between fs.default.name and dfs.http.address is that
fs.default.name is the name-node's RPC address, where clients and
data-nodes connect to, while dfs.http.address is the NN's http server
address where our browsers connect to, but it is also used for
transferring image and edits files.

--Konstantin

Otis Gospodnetic wrote:

Konstantin & Co, please correct me if I'm wrong, but looking at 
hadoop-default.xml makes me think that dfs.http.address is only the URL for the NN 
*Web UI*.  In other words, this is where we people go look at the NN.

The secondary NN must then be using only the Primary NN URL specified in fs.default.name. 
 This URL looks like hdfs://name-node-hostname-here/.  Something in Hadoop then knows the 
exact port for the Primary NN based on the URI schema (e.g. "hdfs://") in this 
URL.

Is this correct?


Thanks,
Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch



- Original Message 

From: Tomislav Poljak <[EMAIL PROTECTED]>
To: core-user@hadoop.apache.org
Sent: Thursday, October 30, 2008 1:52:18 PM
Subject: Re: SecondaryNameNode on separate machine

Hi,
can you, please, explain the difference between fs.default.name and
dfs.http.address (like how and when is SecondaryNameNode using
fs.default.name and how/when dfs.http.address). I have set them both to
same (namenode's) hostname:port. Is this correct (or dfs.http.address
needs some other port)? 


Thanks,

Tomislav

On Wed, 2008-10-29 at 16:10 -0700, Konstantin Shvachko wrote:

SecondaryNameNode uses the http protocol to transfer the image and the edits
from the primary name-node and vice versa.
So the secondary does not access local files on the primary directly.
The primary NN should know the secondary's http address.
And the secondary NN needs to know both fs.default.name and dfs.http.address
of the primary.

In general we usually create one configuration file hadoop-site.xml
and copy it to all other machines. So you don't need to set up different
values for all servers.

Regards,
--Konstantin

Tomislav Poljak wrote:

Hi,
I'm not clear on how SecondaryNameNode communicates with the NameNode
(if deployed on a separate machine). Does SecondaryNameNode use a direct
connection (over some port and protocol), or is it enough for
SecondaryNameNode to have access to the data which the NameNode writes locally
on disk?

Tomislav

On Wed, 2008-10-29 at 09:08 -0400, Jean-Daniel Cryans wrote:

I think a lot of the confusion comes from this thread :
http://www.nabble.com/NameNode-failover-procedure-td11711842.html

Particularly because the wiki was updated with wrong information, not
maliciously I'm sure. This information is now gone for good.

Otis, your solution is pretty much like the one given by Dhruba Borthakur
and augmented by Konstantin Shvachko later in the thread but I never did it
myself.

One thing should be clear though, the NN is and will remain a SPOF (just
like HBase's Master) as long as a distributed manager service (like
Zookeeper) is not plugged into Hadoop to help with failover.

J-D

On Wed, Oct 29, 2008 at 2:12 AM, Otis Gospodnetic <
[EMAIL PROTECTED]> wrote:


Hi,
So what is the "recipe" for avoiding NN SPOF using only what comes with
H

Re: hadoop mapside joins

2008-11-03 Thread Chris Douglas

There is an example in src/examples/.../examples/Join.java
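
(From memory, the relevant setup looks roughly like the following; check
Join.java for the authoritative version, as the join expression syntax is
easy to get wrong. Paths here are placeholders.)

import org.apache.hadoop.fs.Path;
import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.mapred.SequenceFileInputFormat;
import org.apache.hadoop.mapred.join.CompositeInputFormat;

public class MapSideJoinSketch {
  public static void configureJoin(JobConf conf) {
    // Both inputs must be sorted and identically partitioned on the join key.
    conf.setInputFormat(CompositeInputFormat.class);
    conf.set("mapred.join.expr", CompositeInputFormat.compose(
        "inner", SequenceFileInputFormat.class,
        new Path("/data/left"), new Path("/data/right")));
    // The mapper then sees the join key plus a TupleWritable containing one
    // value from each of the joined sources.
  }
}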

-C

On Nov 3, 2008, at 3:13 AM, meda vijendharreddy wrote:



Hi,
As my dataset is quite large I wanted to use map-side joins.   
I have got the basic idea from http://issues.apache.org/jira/browse/HADOOP-2085


Can anybody please provide me with working code?

Thanks in Advance

Vijen







Re: Question on opening file info from namenode in DFSClient

2008-11-03 Thread Dhruba Borthakur
In the current code, details about block locations of a file are
cached on the client when the file is opened. This cache remains with
the client until the file is closed. If the same file is re-opened by
the same DFSClient, it re-contacts the namenode and refetches the
block locations. This works ok for most map-reduce apps because it is
rare that the same DFSClient re-opens the same file again.

Can you pl explain your use-case?

thanks,
dhruba


On Sun, Nov 2, 2008 at 10:57 PM, Taeho Kang <[EMAIL PROTECTED]> wrote:
> Dear Hadoop Users and Developers,
>
> I was wondering if there's a plan to add "file info cache" in DFSClient?
>
> It could eliminate network travelling cost for contacting Namenode and I
> think it would greatly improve the DFSClient's performance.
> The code I was looking at was this
>
> ---
> DFSClient.java
>
>/**
> * Grab the open-file info from namenode
> */
>synchronized void openInfo() throws IOException {
>  /* Maybe, we could add a file info cache here! */
>  LocatedBlocks newInfo = callGetBlockLocations(src, 0, prefetchSize);
>  if (newInfo == null) {
>throw new IOException("Cannot open filename " + src);
>  }
>  if (locatedBlocks != null) {
>Iterator<LocatedBlock> oldIter =
> locatedBlocks.getLocatedBlocks().iterator();
>Iterator<LocatedBlock> newIter =
> newInfo.getLocatedBlocks().iterator();
>while (oldIter.hasNext() && newIter.hasNext()) {
>  if (! oldIter.next().getBlock().equals(newIter.next().getBlock()))
> {
>throw new IOException("Blocklist for " + src + " has changed!");
>  }
>}
>  }
>  this.locatedBlocks = newInfo;
>  this.currentNode = null;
>}
> ---
>
> Does anybody have an opinion on this matter?
>
> Thank you in advance,
>
> Taeho
>


Re: Any Way to Skip Mapping?

2008-11-03 Thread Billy Pearson
I need the Reduce to Sort so I can merge the records and output in a sorted 
order.
I do not need to join any data, just merge rows together, so I do not think 
the join will be any help.


I am storing the data like >> with a 
sorted map as the value.
and on the merge I need to take all the rows that have the same key and 
merge all the sorted maps together and output one row that has all the data 
for that key

something like what hbase is doing but without the in-memory indexes

Maybe it will become an option later down the road to skip the maps and let 
the reduce shuffle directly from the InputSplits.


Billy




"Owen O'Malley" <[EMAIL PROTECTED]> wrote in 
message news:[EMAIL PROTECTED]
If you don't need a sort, which is what it sounds like, Hadoop  supports 
that by turning off the reduce. That is done by setting the  number of 
reduces to 0. This typically is much faster than if you need  the sort. It 
also sounds like you may need/want the library that does  map-side joins.

http://tinyurl.com/43j5pp

-- Owen
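
A minimal sketch of what Owen describes, using the classic JobConf API:

import org.apache.hadoop.mapred.JobClient;
import org.apache.hadoop.mapred.JobConf;

public class MapOnlySketch {
  public static void runMapOnly(JobConf conf) throws Exception {
    // With zero reduces there is no sort/shuffle at all; map output goes
    // straight to the configured OutputFormat.
    conf.setNumReduceTasks(0);
    JobClient.runJob(conf);
  }
}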






Re: Question regarding reduce tasks

2008-11-03 Thread Miles Osborne
I believe so, yes: but note that the individual reducer task needs to
finish, not just the processing of a given key/value pair
Miles
2008/11/3 Ryan LeCompte <[EMAIL PROTECTED]>:
> What happens when the reducer task gets invoked more than once? My
> guess is once a reducer task finishes writing the data for a
> particular key to HDFS, it won't somehow get re-executed again for the
> same key right?
>
>
> On Mon, Nov 3, 2008 at 11:28 AM, Miles Osborne <[EMAIL PROTECTED]> wrote:
>> you can't guarantee that a reducer (or mapper for that matter) will be
>> executed exactly once unless you turn off preemptive scheduling.  but,
>> a distinct key gets sent to a single reducer, so yes, only one reducer
>> will see a particular key + associated values
>>
>> Miles
>>
>> 2008/11/3 Ryan LeCompte <[EMAIL PROTECTED]>:
>>> Hello,
>>>
>>> Is it safe to assume that only one reduce task will ever operate on
>>> values for a particular key? Or is it possible that more than one
>>> reduce task can work on values for the same key? The reason I ask is
>>> because I want to ensure that a piece of code that I write at the end
>>> of my reducer method will only ever be executed once after all values
>>> for a particular key are aggregated/summed.
>>>
>>> Thanks,
>>> Ryan
>>>
>>
>>
>>
>> --
>> The University of Edinburgh is a charitable body, registered in
>> Scotland, with registration number SC005336.
>>
>



-- 
The University of Edinburgh is a charitable body, registered in
Scotland, with registration number SC005336.


Re: Status FUSE-Support of HDFS

2008-11-03 Thread Pete Wyckoff

Reads are 20-30% slower
Writes are 33% slower before https://issues.apache.org/jira/browse/HADOOP-3805 
- You need a kernel > 2.6.26-rc* to test 3805, which I don't have :(

These #s are with hadoop 0.17 and the 0.18.2 version of fuse-dfs.

-- pete


On 11/2/08 6:23 AM, "Robert Krüger" <[EMAIL PROTECTED]> wrote:



Hi Pete,

thanks for the info. That helps a lot. We will probably test it for our
use cases then. Did you benchmark throughput when reading writing files
through fuse-dfs and compared it to command line tool or API access? Is
there a notable difference?

Thanks again,

Robert



Pete Wyckoff wrote:
> It has come a long way since 0.18 and facebook keeps our (0.17) dfs mounted 
> via fuse and uses that for some operations.
>
> There have recently been some problems with fuse-dfs when used in a 
> multithreaded environment, but those have been fixed in 0.18.2 and 0.19. (do 
> not use 0.18 or 0.18.1)
>
> The current (known) issues are:
>   1. Wrong semantics when copying over an existing file - namely it does a 
> delete and then re-creates the file, so ownership/permissions may end up 
> wrong. There is a patch for this.
>   2. When directories have 10s of thousands of files, performance can be very 
> poor.
>   3. Posix truncate is supported only for truncating it to 0 size since hdfs 
> doesn't support truncate.
>   4. Appends are not supported - this is a libhdfs problem and there is a 
> patch for it.
>
> It is still a pre-1.0 product for sure, but it has been pretty stable for us.
>
>
> -- pete
>
>
> On 10/31/08 9:08 AM, "Robert Krüger" <[EMAIL PROTECTED]> wrote:
>
>
>
> Hi,
>
> could anyone tell me what the current Status of FUSE support for HDFS
> is? Is this something that can be expected to be usable in a few
> weeks/months in a production environment? We have been really
> happy/successful with HDFS in our production system. However, some
> software we use in our application simply requires an OS-Level file
> system which currently requires us to do a lot of copying between HDFS
> and a regular file system for processes which require that software and
> FUSE support would really eliminate that one disadvantage we have with
> HDFS. We wouldn't even require the performance of that to be outstanding
because just by eliminating the copy step, we would greatly increase
the throughput of those processes.
>
> Thanks for sharing any thoughts on this.
>
> Regards,
>
> Robert
>
>
>





Re: Question regarding reduce tasks

2008-11-03 Thread Ryan LeCompte
What happens when the reducer task gets invoked more than once? My
guess is once a reducer task finishes writing the data for a
particular key to HDFS, it won't somehow get re-executed again for the
same key right?


On Mon, Nov 3, 2008 at 11:28 AM, Miles Osborne <[EMAIL PROTECTED]> wrote:
> you can't guarantee that a reducer (or mapper for that matter) will be
> executed exactly once unless you turn off preemptive scheduling.  but,
> a distinct key gets sent to a single reducer, so yes, only one reducer
> will see a particular key + associated values
>
> Miles
>
> 2008/11/3 Ryan LeCompte <[EMAIL PROTECTED]>:
>> Hello,
>>
>> Is it safe to assume that only one reduce task will ever operate on
>> values for a particular key? Or is it possible that more than one
>> reduce task can work on values for the same key? The reason I ask is
>> because I want to ensure that a piece of code that I write at the end
>> of my reducer method will only ever be executed once after all values
>> for a particular key are aggregated/summed.
>>
>> Thanks,
>> Ryan
>>
>
>
>
> --
> The University of Edinburgh is a charitable body, registered in
> Scotland, with registration number SC005336.
>


Re: Question regarding reduce tasks

2008-11-03 Thread Miles Osborne
you can't guarantee that a reducer (or mapper for that matter) will be
executed exactly once unless you turn off preemptive scheduling.  but,
a distinct key gets sent to a single reducer, so yes, only one reducer
will see a particular key + associated values

Miles

2008/11/3 Ryan LeCompte <[EMAIL PROTECTED]>:
> Hello,
>
> Is it safe to assume that only one reduce task will ever operate on
> values for a particular key? Or is it possible that more than one
> reduce task can work on values for the same key? The reason I ask is
> because I want to ensure that a piece of code that I write at the end
> of my reducer method will only ever be executed once after all values
> for a particular key are aggregated/summed.
>
> Thanks,
> Ryan
>



-- 
The University of Edinburgh is a charitable body, registered in
Scotland, with registration number SC005336.


Question regarding reduce tasks

2008-11-03 Thread Ryan LeCompte
Hello,

Is it safe to assume that only one reduce task will ever operate on
values for a particular key? Or is it possible that more than one
reduce task can work on values for the same key? The reason I ask is
because I want to ensure that a piece of code that I write at the end
of my reducer method will only ever be executed once after all values
for a particular key are aggregated/summed.

Thanks,
Ryan


hadoop 0.18.1 x-trace

2008-11-03 Thread Veiko Schnabel
Dear Hadoop Users and Developers,

I have a requirement to monitor the Hadoop cluster using X-Trace.

I found these patches on 

http://issues.apache.org/jira/browse/HADOOP-4049


but when I try to integrate them with 0.18.1, I cannot build Hadoop anymore.

First of all, the patch order is not clear to me. 
Can anyone explain which patches I really need and the order in which to 
apply them?

thanks

Veiko




-- 
Veiko Schnabel
System-Administration
optivo GmbH
Stralauer Allee 2 
10245 Berlin

Tel.: +49 30 41724241
Fax: +49 30 41724239
Email:   mailto:[EMAIL PROTECTED]
Website: http://www.optivo.de

Handelsregister: HRB Berlin 88738
Geschäftsführer: Peter Romianowski, Ulf Richter



Nutch/Hadoop: Crawl is crashing

2008-11-03 Thread P.ILAYARAJA
Hi,

I started an internet crawl of 30 million pages in a single segment.
The crawl was crashing with the following exception:

java.lang.ArrayIndexOutOfBoundsException: 17
 at 
org.apache.hadoop.mapred.MapTask$MapOutputBuffer.mergeParts(MapTask.java:540)
 at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.flush(MapTask.java:607)
 at org.apache.hadoop.mapred.MapTask.run(MapTask.java:193)
 at org.apache.hadoop.mapred.TaskTracker$Child.main(TaskTracker.java:1760)


Any idea on why it is happening and what the solution would be?

... I am using Hadoop 0.15.3 and Nutch 1.0.

Regards,
Ilay

Re: SecondaryNameNode on separate machine

2008-11-03 Thread Tomislav Poljak
Hi,
Thank you all for your time and your answers!

Now SecondaryNameNode connects to the NameNode (after I configured
dfs.http.address to the NN's http server -> NN hostname on port 50070)
and creates (transfers) the edits and fsimage from the NameNode.

Can you explain to me a little bit more how NameNode failover should work
now? 

For example, SecondaryNameNode now stores fsimage and edits to (SNN's)
dirX and let's say NameNode goes down (disk becomes unreadable). Now I
create/dedicate a new machine for NameNode (also change DNS to point to
this new NameNode machine as nameNode host) and take the data dirX from
SNN and copy it to new NameNode. How do I configure new NameNode to use
data from dirX (do I configure dfs.name.dir to point to dirX and start
new NameNode)?

Thanks,
Tomislav



On Fri, 2008-10-31 at 11:38 -0700, Konstantin Shvachko wrote:
> True, dfs.http.address is the NN Web UI address.
> This is where the NN http server runs. Besides the Web UI, there is also
> a servlet running on that server which is used to transfer the image
> and edits from the NN to the secondary using http get.
> So SNN uses both addresses fs.default.name and dfs.http.address.
> 
> When SNN finishes the checkpoint the primary needs to transfer the
> resulting image back. This is done via the http server running on SNN.
> 
> Answering Tomislav's question:
> The difference between fs.default.name and dfs.http.address is that
> fs.default.name is the name-node's RPC address, where clients and
> data-nodes connect to, while dfs.http.address is the NN's http server
> address where our browsers connect to, but it is also used for
> transferring image and edits files.
> 
> --Konstantin
> 
> Otis Gospodnetic wrote:
> > Konstantin & Co, please correct me if I'm wrong, but looking at 
> > hadoop-default.xml makes me think that dfs.http.address is only the URL for 
> > the NN *Web UI*.  In other words, this is where we people go look at the NN.
> > 
> > The secondary NN must then be using only the Primary NN URL specified in 
> > fs.default.name.  This URL looks like hdfs://name-node-hostname-here/.  
> > Something in Hadoop then knows the exact port for the Primary NN based on 
> > the URI schema (e.g. "hdfs://") in this URL.
> > 
> > Is this correct?
> > 
> > 
> > Thanks,
> > Otis
> > --
> > Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch
> > 
> > 
> > 
> > - Original Message 
> >> From: Tomislav Poljak <[EMAIL PROTECTED]>
> >> To: core-user@hadoop.apache.org
> >> Sent: Thursday, October 30, 2008 1:52:18 PM
> >> Subject: Re: SecondaryNameNode on separate machine
> >>
> >> Hi,
> >> can you, please, explain the difference between fs.default.name and
> >> dfs.http.address (like how and when is SecondaryNameNode using
> >> fs.default.name and how/when dfs.http.address). I have set them both to
> >> same (namenode's) hostname:port. Is this correct (or dfs.http.address
> >> needs some other port)? 
> >>
> >> Thanks,
> >>
> >> Tomislav
> >>
> >> On Wed, 2008-10-29 at 16:10 -0700, Konstantin Shvachko wrote:
> >>> SecondaryNameNode uses the http protocol to transfer the image and the edits
> >>> from the primary name-node and vice versa.
> >>> So the secondary does not access local files on the primary directly.
> >>> The primary NN should know the secondary's http address.
> >>> And the secondary NN needs to know both fs.default.name and 
> >>> dfs.http.address of the primary.
> >>> In general we usually create one configuration file hadoop-site.xml
> >>> and copy it to all other machines. So you don't need to set up different
> >>> values for all servers.
> >>>
> >>> Regards,
> >>> --Konstantin
> >>>
> >>> Tomislav Poljak wrote:
>  Hi,
>  I'm not clear on how SecondaryNameNode communicates with the NameNode
>  (if deployed on a separate machine). Does SecondaryNameNode use a direct
>  connection (over some port and protocol), or is it enough for
>  SecondaryNameNode to have access to the data which the NameNode writes locally
>  on disk?
> 
>  Tomislav
> 
>  On Wed, 2008-10-29 at 09:08 -0400, Jean-Daniel Cryans wrote:
> > I think a lot of the confusion comes from this thread :
> > http://www.nabble.com/NameNode-failover-procedure-td11711842.html
> >
> > Particularly because the wiki was updated with wrong information, not
> > maliciously I'm sure. This information is now gone for good.
> >
> > Otis, your solution is pretty much like the one given by Dhruba 
> > Borthakur
> > and augmented by Konstantin Shvachko later in the thread but I never 
> > did it
> > myself.
> >
> > One thing should be clear though, the NN is and will remain a SPOF (just
> > like HBase's Master) as long as a distributed manager service (like
> > Zookeeper) is not plugged into Hadoop to help with failover.
> >
> > J-D
> >
> > On Wed, Oct 29, 2008 at 2:12 AM, Otis Gospodnetic <
> > [EMAIL PROTECTED]> wrote:
> >
> >> Hi,
> >> So what is the "r


Re: Question on opening file info from namenode in DFSClient

2008-11-03 Thread Mridul Muralidharan


Consider the case of a file getting removed & recreated with the same name, 
while there is cached info about the file (and the job is running) in the 
DFSClient of the mapper/reducer.


- Mridul


Taeho Kang wrote:

Dear Hadoop Users and Developers,

I was wondering if there's a plan to add "file info cache" in DFSClient?

It could eliminate network travelling cost for contacting Namenode and I
think it would greatly improve the DFSClient's performance.
The code I was looking at was this

---
DFSClient.java

/**
 * Grab the open-file info from namenode
 */
synchronized void openInfo() throws IOException {
  /* Maybe, we could add a file info cache here! */
  LocatedBlocks newInfo = callGetBlockLocations(src, 0, prefetchSize);
  if (newInfo == null) {
throw new IOException("Cannot open filename " + src);
  }
  if (locatedBlocks != null) {
Iterator<LocatedBlock> oldIter =
locatedBlocks.getLocatedBlocks().iterator();
Iterator<LocatedBlock> newIter =
newInfo.getLocatedBlocks().iterator();
while (oldIter.hasNext() && newIter.hasNext()) {
  if (! oldIter.next().getBlock().equals(newIter.next().getBlock()))
{
throw new IOException("Blocklist for " + src + " has changed!");
  }
}
  }
  this.locatedBlocks = newInfo;
  this.currentNode = null;
}
---

Does anybody have an opinion on this matter?

Thank you in advance,

Taeho





hadoop mapside joins

2008-11-03 Thread meda vijendharreddy

Hi,
 As my dataset is quite large I wanted to use map-side joins.  I have 
got the basic idea from http://issues.apache.org/jira/browse/HADOOP-2085

Can anybody please provide me with working code?

Thanks in Advance

Vijen





Re: Can anyone recommend me an inter-language data file format?

2008-11-03 Thread Steve Loughran

Zhou, Yunqing wrote:

embedded database cannot handle large-scale data, not very efficient
I have about 1 billion records.
these records should be passed through some modules.
I mean a data exchange format similar to XML but more flexible and
efficient.



JSON
CSV
erlang-style records (name,value,value,value)
RDF-triples in non-XML representations

For all of these, you need to test with data that includes things like 
high unicode characters, single and double quotes, to see how well they 
get handled.


you can actually append with XML by not having opening/closing tags, 
just stream out the entries to the tail of the file

...

To read this in an XML parser, include it inside another XML file:

<?xml version="1.0"?>
<!DOCTYPE report [
  <!ENTITY log SYSTEM "log.xml">
]>
<report>
&log;
</report>

I've done this for very big files, as long as you aren't trying to load 
it in-memory to a DOM, things should work


--
Steve Loughran  http://www.1060.org/blogxter/publish/5
Author: Ant in Action   http://antbook.org/


Re: Status FUSE-Support of HDFS

2008-11-03 Thread Steve Loughran

Pete Wyckoff wrote:

It has come a long way since 0.18 and facebook keeps our (0.17) dfs mounted via 
fuse and uses that for some operations.

There have recently been some problems with fuse-dfs when used in a 
multithreaded environment, but those have been fixed in 0.18.2 and 0.19. (do 
not use 0.18 or 0.18.1)

The current (known) issues are:



  2. When directories have 10s of thousands of files, performance can be very 
poor.


I've known other filesystems to top out at 64k-1 files per directory, 
even if they don't slow down. It's good for a portable application to 
keep the #of files/directory low by having two levels of directory for 
storing files -just use a hash operation to determine which dir to store 
a specific file in.
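
A sketch of the kind of hashed two-level layout I mean (bucket counts and 
naming are arbitrary):

import org.apache.hadoop.fs.Path;

public class HashedLayout {
  // Spread files across a two-level directory tree derived from the file name,
  // so no single directory grows too large.
  public static Path bucketFor(Path base, String fileName) {
    int h = fileName.hashCode();
    int top = (h >>> 8) & 0xff;  // first level: 256 buckets
    int sub = h & 0xff;          // second level: 256 buckets
    return new Path(base, String.format("%02x/%02x/%s", top, sub, fileName));
  }
}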