Is this architecture possible (Hadoop, HBase)?

2010-02-15 Thread Raghava Mutharaju
Hello all,

  I am relatively new to MapReduce and haven't used HBase at all. Is the
following architecture possible?

A distributed key-value store (HBase) is used, so along with the values there
would be a timestamp associated with them. Map & Reduce tasks are executed
iteratively. In each iteration, Map should take in the values that were added
to the store in the previous iteration (perhaps the ones with the latest
timestamp?). Reduce should take in Map's output as well as the key-value pairs
from the store whose keys match the keys that Reduce has to process in the
current iteration. The output of Reduce goes back to the store.

If this is possible, which classes (e.g. InputFormat, or the run() method of
Reduce) should be extended so that the above operation takes place instead of
the regular one? If this is not possible, are there any alternatives to
achieve the same?

Thank you.

PS: I have put the same question on the mapreduce-user Apache mailing list (but
haven't got any replies yet). I found many topics on MapReduce in this
mailing list as well, so I thought of posting it here too.

Regards,
Raghava.


Re: Is this architecture possible (Hadoop, HBase)?

2010-02-15 Thread Ankur C. Goel
Raghava,
   What you ask should be easily doable using the HBase-specific input
and output formats, and I believe they already exist in a form you can
use with minimal or no modification. For the specifics of the format classes,
check out HBase's API documentation. Also, you are better off posting HBase-specific
questions on the HBase user mailing list.

Regards
-...@nkur
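
For illustration, here is a minimal sketch of such a job with the HBase 0.20-era
org.apache.hadoop.hbase.mapreduce API, using a time-range scan so that each
iteration's map only sees cells written since the previous pass. The table name,
column family, driver/mapper/reducer classes and the timestamp bookkeeping are
hypothetical placeholders, not something taken from the posts above:

    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.client.Result;
    import org.apache.hadoop.hbase.client.Scan;
    import org.apache.hadoop.hbase.io.ImmutableBytesWritable;
    import org.apache.hadoop.hbase.mapreduce.TableMapReduceUtil;
    import org.apache.hadoop.hbase.util.Bytes;
    import org.apache.hadoop.mapreduce.Job;

    // One iteration: map over only the cells written since the previous pass,
    // reduce, and write the results back into the same table.
    Job buildIteration(int iteration, long previousPassTimestamp) throws Exception {
      HBaseConfiguration conf = new HBaseConfiguration();
      Job job = new Job(conf, "iterative-pass-" + iteration);
      job.setJarByClass(IterativeDriver.class);                  // hypothetical driver class

      Scan scan = new Scan();
      scan.addFamily(Bytes.toBytes("data"));                     // hypothetical column family
      scan.setTimeRange(previousPassTimestamp, Long.MAX_VALUE);  // only values added after the last pass

      // MyTableMapper extends TableMapper, MyTableReducer extends TableReducer (both hypothetical).
      TableMapReduceUtil.initTableMapperJob("mytable", scan, MyTableMapper.class,
          ImmutableBytesWritable.class, Result.class, job);
      TableMapReduceUtil.initTableReducerJob("mytable", MyTableReducer.class, job);
      return job;
    }

A driver could run this in a loop, recording the time each pass starts and feeding
that timestamp into the next pass.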





Re: Is this architecture possible (Hadoop, HBase)?

2010-02-15 Thread Raghava Mutharaju
Hi Ankur,

   Thank you for the reply. I will check out the HBase API. Since my
question involved a mix of MapReduce & HBase, I thought of posting it here.
If I have any HBase-specific questions, I will post them there.

Regards,
Raghava.





RE: sort at reduce side

2010-02-15 Thread Srigurunath Chakravarthi
So, after shuffle at reduce side,  are the spills actually stored as map
files?

Yes.

When a reducer has received map output from multiple maps worth fs.inmemorysize.mb
in size, it sorts and spills that data to disk.

If the number of map output files spilled to disk exceeds io.sort.factor,
additional merge step(s) are performed to reduce the file count to less than
that number.

When all map output data has been spilled to disk, the reduce task starts invoking
the reduce function, passing it key-value pairs in sorted order by reading them off
the above files.
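
For reference, these thresholds can be tuned per job. A minimal sketch with the
0.20-era property names (the numbers are only illustrative, not recommendations,
and MyJob is a hypothetical job class):

    import org.apache.hadoop.mapred.JobConf;

    JobConf conf = new JobConf(MyJob.class);
    // How many spill files are merged at once (applies to the reduce-side merges as well).
    conf.setInt("io.sort.factor", 20);
    // Fraction of the reduce task's heap used to buffer fetched map outputs.
    conf.setFloat("mapred.job.shuffle.input.buffer.percent", 0.70f);
    // Start an in-memory merge once the shuffle buffer is this full.
    conf.setFloat("mapred.job.shuffle.merge.percent", 0.66f);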

Regards,
Sriguru

-Original Message-
From: Gang Luo [mailto:lgpub...@yahoo.com.cn]
Sent: Thursday, February 04, 2010 1:58 AM
To: common-user@hadoop.apache.org
Subject: Re: sort at reduce side

Thanks for the reply, Sriguru.
So, after shuffle at reduce side,  are the spills actually stored as map
files?

The reason I ask these questions is the following observation. On a 16-node
cluster, a map-side join takes 3 and a half minutes. When I do a reduce-side
join on nearly the same amount of data, it takes 8 minutes before the map phase
completes. I am sure the computation (the map function) does not account for so
much of a difference; the extra 4 minutes can only be spent on sorting at the
map side for the reduce-side join. I also notice that the sort time at the
reduce side is only 30 seconds (I cannot access the online jobtracker; the 30
seconds is actually the time the reduce takes to go from 33% to 66%
completion). The number of reduce tasks is much smaller than the number of map
tasks, which means each reduce task sorts more data than each map task (I use
the hash partitioner and the data is uniformly distributed). The only reason I
can come up with for the big difference between the sort at the map side and at
the reduce side is that these two sorts behave differently.

Does anybody have ideas on why the map phase takes so much more time for the
reduce-side join compared to the map-side join, and why there is such a big
difference between the sort at the map side and at the reduce side?

P.S. I join a 7.5 GB file with a 100 MB file. The sort buffer at the reduce
side is slightly larger than the one at the map side.


-Gang



----- Original Message -----
From: Srigurunath Chakravarthi srig...@yahoo-inc.com
To: common-user@hadoop.apache.org common-user@hadoop.apache.org
Sent: 2010/2/3 (Wed) 12:50:08 AM
Subject: RE: sort at reduce side

Hi Gang,

kept in map file. If so, in order to efficiently sort the data, reducer
actually only read the index part of each spill (which is a map file) and
sort the keys, instead of reading whole records from disk and sort them.

AFAIK, no. Reducers always fetch map output data, not indexes (even if
the data is from the local node, where an index might be sufficient).

Regards,
Sriguru

-Original Message-
From: Gang Luo [mailto:lgpub...@yahoo.com.cn]
Sent: Wednesday, February 03, 2010 10:40 AM
To: common-user@hadoop.apache.org
Subject: sort at reduce side

Hi all,
I want to know some more details about the sorting at the reduce side.

The intermediate result generated at the map side is stored as a map file,
which actually consists of two sub-files, namely an index file and a data file.
The index file stores the keys and points to the corresponding records stored
in the data file. What I think is that when the intermediate result (even if
only part of it per mapper) is shuffled to the reducer, it is still kept as a
map file. If so, in order to sort the data efficiently, the reducer could read
only the index part of each spill (which is a map file) and sort the keys,
instead of reading whole records from disk and sorting them.

Does the reducer actually do what I expect?

-Gang




Re: datanode can not connect to the namenode

2010-02-15 Thread Steve Loughran

Marc Sturlese wrote:

Hey there, I have a Hadoop cluster built with 2 servers. One node (A) contains
the namenode, a datanode, the jobtracker and a tasktracker.
The other node (B) just has a datanode and a tasktracker.
I set up HDFS correctly with ./start-hdfs.sh

When I try to set up MapReduce with ./start-mapred.sh, the TaskTracker of node
(B) cannot connect to the namenode. The log keeps throwing:

INFO org.apache.hadoop.ipc.Client: Retrying connect to server:
mynamenode/192.168.0.13:8021. Already tried 8 time(s)

I think maybe something is missing in /etc/hosts file or this hadoop
property is not set correctly:
<property>
  <name>dfs.datanode.address</name>
  <value>0.0.0.0:50010</value>
  <description>
    The address where the datanode server will listen to.
    If the port is 0 then the server will start on a free port.
  </description>
</property>

I tried on the namenode:
telnet localhost 8021
telnet 192.168.0.10 8021
In both cases I get:
telnet: Unable to connect to remote host: Connection refused



That's the namenode that isn't there; it should be on 192.168.0.13 at
port 8021.


You can see which ports really are open on the namenode by identifying
the process (jps -v will do that) and then running netstat -a -p | grep PID,
where PID is the process ID of the namenode.


If you can telnet to port 8021 when logged in to the NN, but not
remotely, it's your firewall or routing that is interfering.
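
It is also worth double-checking which address node B has been told to connect
to. A sketch of the usual entries, assuming the standard fs.default.name /
mapred.job.tracker properties (the hostname and port 8021 come from the log
line above; the HDFS port is only an assumption):

<property>
  <name>fs.default.name</name>
  <value>hdfs://mynamenode:8020/</value>
</property>
<property>
  <name>mapred.job.tracker</name>
  <value>mynamenode:8021</value>
</property>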


Re: Problem with large .lzo files

2010-02-15 Thread Steve Kuo
On Sun, Feb 14, 2010 at 12:46 PM, Todd Lipcon t...@cloudera.com wrote:

 Hi Steve,

 I'm not sure here whether you mean that the DistributedLzoIndexer job
 is failing, or if the job on the resulting split file is failing. Could
 you clarify?


The DistributedLzoIndexer job did complete successfully. It was one of the jobs
on the resulting split file that always failed, while the other splits succeeded.

By the way, if all files have been indexed, DistributedLzoIndexer does not
detect that, and Hadoop throws an exception complaining that the input dir
(or file) does not exist. I work around this by catching the exception.


   - It's possible to sacrifice parallelism by having Hadoop work on each
   .lzo file without indexing. This worked well until the file size exceeded
   30 GB, when an array-indexing exception got thrown. Apparently the code
   processed the file in chunks and stored references to the chunks in an
   array. When the number of chunks grew beyond a certain number (around 256
   is my recollection), the exception was thrown.
   - My current workaround is to increase the number of reducers to keep
   the .lzo file size low.

  I would like to get advice on how people handle large .lzo files. Any
  pointers on the cause of the stack trace below and the best way to resolve it
  are greatly appreciated.
 

 Is this reproducible every time? If so, is it always at the same point
 in the LZO file that it occurs?

 It's at the same point.  Do you know how to print out the lzo index for the
task?  I only print out the input file now.


 Would it be possible to download that lzo file to your local box and
 use lzop -d to see if it decompresses successfully? That way we can
 isolate whether it's a compression bug or decompression.

 Both the Java LzoDecompressor and lzop -d were able to decompress the file
correctly. As a matter of fact, my job does not index .lzo files now but
processes each one as a whole, and it works.


Why is $JAVA_HOME/lib/tools.jar in the classpath?

2010-02-15 Thread Thomas Koch
Hi,

I'm working on the Debian package for Hadoop (the first version is already in
the NEW queue for Debian unstable).
Now I stumbled over $JAVA_HOME/lib/tools.jar in the classpath. Since Debian
supports different Java runtimes, it's not that easy to know which one the
user currently uses, and therefore it would make things easier if this jar
were not necessary.
From searching and inspecting the SVN history I got the impression that this
is an ancient legacy that's no longer necessary?

Best regards,

Thomas Koch, http://www.koch.ro


Trouble with secondary indexing?

2010-02-15 Thread Michael Segel

Hi, First time posting...

Got an interesting problem with setting up secondary indexing in HBase.

We're running Cloudera's distribution, and we have secondary indexing up
and running on our 'sandbox' machines.

I'm installing it on our development cloud, and I know I did something totally
'brain dead' but can't figure out what it is...
(Sort of like looking for your keys when they are in plain sight on the table in
front of you...)

We've got Hadoop up and running, and HBase up and running without the secondary
indexing.
We've added the two properties to hbase-site.xml, and then in our .kshrc
(yes, we use ksh :-)


I've added $HBASE_HOME/contrib/transactional to the CLASSPATH shell variable.
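
(For reference, the two hbase-site.xml properties in question usually look like
the following. This is a sketch based on the 0.20 transactional/indexed contrib
documentation, so verify the class names against the release in use:)

<property>
  <name>hbase.regionserver.class</name>
  <value>org.apache.hadoop.hbase.ipc.IndexedRegionInterface</value>
</property>
<property>
  <name>hbase.regionserver.impl</name>
  <value>org.apache.hadoop.hbase.regionserver.tableindexed.IndexedRegionServer</value>
</property>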

The error we get is:
2010-02-15 08:39:30,263 ERROR org.apache.hadoop.hbase.master.HMaster: Can not 
start master
java.lang.reflect.InvocationTargetException
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at 
sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:39)
at 
sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:27)
at java.lang.reflect.Constructor.newInstance(Constructor.java:513)
at org.apache.hadoop.hbase.master.HMaster.doMain(HMaster.java:1241)
at org.apache.hadoop.hbase.master.HMaster.main(HMaster.java:1282)
Caused by: java.lang.UnsupportedOperationException: Unable to find region 
server interface org.apache.hadoop.hbase.ipc.IndexedRegionInterface


When I look at the error, I'm thinking that I'm missing the path to the class;
however, a different cluster of machines running Cloudera's +152 release
(HBase 0.20.0) doesn't have this problem.

So what am I missing?

TIA!

-Mikey
  

Re: hive-0.4.0 build

2010-02-15 Thread Carl Steinbach
The MD5 checksum file for the 0.20.1 tarball contains an MD5 checksum along with
various SHA checksums and an RMD160 checksum:

hadoop-0.20.1.tar.gz:    MD5 = 71 9E 16 9B 77 60 C1 68  44 1B 49 F4 05 85 5B 72
hadoop-0.20.1.tar.gz:   SHA1 = 712A EE9C 279F 1031 1F83  657B 2B82 7ACA 0374 6613
hadoop-0.20.1.tar.gz: RMD160 = 4331 4350 27E9 E16D 055C  F23F FFEF 1564 E206 B144
hadoop-0.20.1.tar.gz: SHA224 = A5E4CBE9 EBBE5FE1 2020F3F1 BFBC6D3C C77A8E9B E6C6062C A6484BDB
...

The convention that most tools expect (including Ivy) is that a file named
xyz.tar.gz.md5 was generated by running the following command:

% md5sum xyz.tar.gz > xyz.tar.gz.md5

All of the other Hadoop releases on archive.apache.org follow this convention,
with the exception of hadoop-0.20.1.
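
For comparison, a conventional hadoop-0.20.1.tar.gz.md5 would contain a single
line of the form below (the checksum is the value Ivy reports as computed in
the error output quoted further down):

719e169b7760c168441b49f405855b72  hadoop-0.20.1.tar.gz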

Can someone with access to the Apache archive repository please fix the
0.20.1
md5 checksum file?

Thanks.

Carl

On Mon, Oct 19, 2009 at 9:02 AM, Schubert Zhang zson...@gmail.com wrote:

 1. When I build hive-0.4.0, Ivy tries to download Hadoop 0.17.2.1,
 0.18.3, 0.19.0 and 0.20.0, but it always fails for 0.17.2.1.
 2. Then I modified shims/ivy.xml and shims/build.xml to remove the dependencies
 on 0.17.2.1, 0.18.3 and 0.19.0. It works fine to download only hadoop-0.20.0.

 3. Now I want to depend on hadoop-0.20.1. I modified the above two XML files to
 add 0.20.1, but it failed on the md5 check; the errors follow.

  [ivy:retrieve]
 [ivy:retrieve] :: problems summary ::
 [ivy:retrieve]  WARNINGS
 [ivy:retrieve]  [FAILED ]
 hadoop#core;0.20.1!hadoop.tar.gz(source): invalid md5:
 expected=hadoop-0.20.1.tar.gz: computed=719e169b7760c168441b49f405855b72
 (138662ms)
 [ivy:retrieve]  [FAILED ]
 hadoop#core;0.20.1!hadoop.tar.gz(source): invalid md5:
 expected=hadoop-0.20.1.tar.gz: computed=719e169b7760c168441b49f405855b72
 (138662ms)
 [ivy:retrieve]   hadoop-resolver: tried
 [ivy:retrieve]

 http://archive.apache.org/dist/hadoop/core/hadoop-0.20.1/hadoop-0.20.1.tar.gz
 [ivy:retrieve]  ::
 [ivy:retrieve]  ::  FAILED DOWNLOADS::
 [ivy:retrieve]  :: ^ see resolution messages for details  ^ ::
 [ivy:retrieve]  ::
 [ivy:retrieve]  :: hadoop#core;0.20.1!hadoop.tar.gz(source)
 [ivy:retrieve]  ::
 [ivy:retrieve]
 [ivy:retrieve] :: USE VERBOSE OR DEBUG MESSAGE LEVEL FOR MORE DETAILS

 I checked
 http://archive.apache.org/dist/hadoop/core/hadoop-0.20.1/hadoop-0.20.1.tar.gz.md5
 and found that the format of this file is different from the other Hadoop
 releases.

 Could the Hadoop and Hive release managers please have a check?

 Schubert



Re: Problem with large .lzo files

2010-02-15 Thread Todd Lipcon
On Mon, Feb 15, 2010 at 8:07 AM, Steve Kuo kuosen...@gmail.com wrote:
 On Sun, Feb 14, 2010 at 12:46 PM, Todd Lipcon t...@cloudera.com wrote:

 By the way, if all files have been indexed, DistributedLzoIndexer does not
 detect that and hadoop throws an exception complaining that the input dir
 (or file) does not exist.  I work around this by catching the exception.


Just fixed that in my github repo. Thanks for the bug report.



 Is this reproducible every time? If so, is it always at the same point
 in the LZO file that it occurs?

 It's at the same point.  Do you know how to print out the lzo index for the
 task?  I only print out the input file now.


You should be able to downcast the InputSplit to FileSplit, if you're
using the new API. From there you can get the start and length of the
split.
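
A minimal sketch of that downcast inside a mapper's setup(), using the new
org.apache.hadoop.mapreduce API (the logging is only illustrative):

    import java.io.IOException;
    import org.apache.hadoop.mapreduce.lib.input.FileSplit;

    // Inside your Mapper subclass:
    @Override
    protected void setup(Context context) throws IOException, InterruptedException {
      FileSplit split = (FileSplit) context.getInputSplit();
      // Log which byte range of which .lzo file this task is reading.
      System.err.println("split file=" + split.getPath()
          + " start=" + split.getStart()
          + " length=" + split.getLength());
    }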


 Would it be possible to download that lzo file to your local box and
 use lzop -d to see if it decompresses successfully? That way we can
 isolate whether it's a compression bug or decompression.

 Both the Java LzoDecompressor and lzop -d were able to decompress the file
 correctly. As a matter of fact, my job does not index .lzo files now but
 processes each one as a whole, and it works.


Interesting. If you can somehow make a reproducible test case I'd be
happy to look into this.

Thanks
-Todd


Check out our new, easy, scalable, real-time Data Platform :) And vote!

2010-02-15 Thread Bradford Stephens
Hey guys!

As some of you know from my blog (and occasional posts here), Drawn to
Scale has been building a complete end-to-end platform that makes dealing
with data easy and scalable. You can Process, Query, Serve, Store, and
Search *all* your data.

We've made our big public announcement today:
http://www.roadtofailure.com/2010/02/15/announcing-the-drawn-to-scale-platform/

Part of our product (storage and processing) runs on Hadoop and Hbase.
Check it out, would love to get feedback.

We also have a video up describing our product. We're trying to get a
slot at the CloudConnect Launchpad. We would love your vote -- we'll
bribe you with beer.
Vote here:  http://bit.ly/9fgH6p

-- 
http://www.drawntoscalehq.com -- Big Data for all. The Big Data Platform.

http://www.roadtofailure.com -- The Fringes of Scalability, Social
Media, and Computer Science


Re: Problem with large .lzo files

2010-02-15 Thread Steve Kuo
You should be able to downcast the InputSplit to FileSplit, if you're
 using the new API. From there you can get the start and length of the
 split.

 Cool, let me give it a shot.


 Interesting. If you can somehow make a reproducible test case I'd be
 happy to look into this.

 This sounds great. As the input file is 1 GB, let me do some work on my side
to see if I can pinpoint it, so as not to have to transfer a 1 GB file around.

Thanks.


Sun JVM 1.6.0u18

2010-02-15 Thread Todd Lipcon
Hey all,

Just a note that you should avoid upgrading your clusters to 1.6.0u18.
We've seen a lot of segfaults or bus errors on the DN when running
with this JVM - Stack found the ame thing on one of his clusters as
well.

We've found 1.6.0u16 to be very stable.

-Todd