Re: instantiation of classes in MR

2011-12-30 Thread Anirudh
Where are you creating this new class? If it is in the map function, then a new object will be created for each record in the split. Also, you may need to see how the JVM reuse option works. I am not too sure of this and you may want to look at the code. If the option for JVM reuse is set, then m
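
To illustrate the point about per-record object creation, here is a minimal plain-Java sketch. It does not use any Hadoop classes: `PerRecordTask`, `PerTaskTask`, and `Helper` are hypothetical names, and the `setup`/`map` methods only imitate the shape of a mapper's lifecycle. It counts how many helper objects are constructed when the allocation sits inside the per-record method versus a one-time setup step.

```java
import java.util.Arrays;
import java.util.List;

// Plain-Java simulation only; none of these classes come from Hadoop.
class PerRecordVsPerTask {
    // Counts Helper constructions so the two styles can be compared.
    static int helpersCreated = 0;

    // Hypothetical helper object that a mapper might need.
    static class Helper {
        Helper() { helpersCreated++; }
        int length(String record) { return record.length(); }
    }

    // Allocates a Helper inside the per-record method.
    static class PerRecordTask {
        int map(String record) { return new Helper().length(record); }
    }

    // Allocates a Helper once, in a setup step that runs before any record.
    static class PerTaskTask {
        private Helper helper;
        void setup() { helper = new Helper(); }
        int map(String record) { return helper.length(record); }
    }

    // Feeds every record of the (simulated) split through a PerRecordTask.
    static int runPerRecord(List<String> split) {
        helpersCreated = 0;
        PerRecordTask task = new PerRecordTask();
        for (String record : split) task.map(record);
        return helpersCreated;
    }

    // Same records, but the Helper is created once in setup().
    static int runPerTask(List<String> split) {
        helpersCreated = 0;
        PerTaskTask task = new PerTaskTask();
        task.setup();
        for (String record : split) task.map(record);
        return helpersCreated;
    }

    public static void main(String[] args) {
        List<String> split = Arrays.asList("a", "bb", "ccc");
        System.out.println("created in map():   " + runPerRecord(split));
        System.out.println("created in setup(): " + runPerTask(split));
    }
}
```

With three records, the first style constructs three helpers and the second constructs one. How JVM reuse affects objects held across tasks is not modeled here.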

Re: Hadoop with Spring / Guice

2011-12-30 Thread Owen O'Malley
Using Guice in your MapReduce task is easy. Just create the injector in the configure/setup method and include the jars in the distributed cache. -- Owen On Dec 27, 2011, at 4:01 PM, Eyal Golan wrote: Thanks. I did see it before. Will check it a little bit more. Eyal Golan egola...@gmail.com

Re: instantiation of classes in MR

2011-12-30 Thread Eyal Golan
Great News !! Thanks for the info. So using reflection, I can inject different implementations of interfaces (services) for the mapper (or reducer). And this way I can test a mapper (or reducer). Just by reflecting a stub instead of a real implementation. Thanks, Eyal Golan egola...@gmail.com
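
The idea of swapping a stub for a real service via reflection can be sketched in plain Java as follows. This is an illustration under stated assumptions, not code from the thread: `LookupService`, `RealLookup`, `StubLookup`, `SketchMapper`, and the key `example.lookup.class` are all hypothetical, and a `Map` stands in for the job configuration. In an actual Hadoop job the class would more likely be read with `Configuration.getClass` and instantiated with `ReflectionUtils.newInstance`.

```java
import java.util.HashMap;
import java.util.Map;

// Plain-Java sketch; a Map stands in for the Hadoop job configuration.
class ReflectiveServiceSketch {
    // Hypothetical configuration key naming the implementation class.
    static final String SERVICE_KEY = "example.lookup.class";

    // Hypothetical service interface used by the mapper.
    interface LookupService { String lookup(String key); }

    public static class RealLookup implements LookupService {
        public String lookup(String key) { return "real:" + key; }
    }

    // Stub used in tests instead of the real implementation.
    public static class StubLookup implements LookupService {
        public String lookup(String key) { return "stub:" + key; }
    }

    // Instantiates whichever implementation the configuration names,
    // falling back to RealLookup when the key is absent.
    static LookupService createService(Map<String, String> conf) {
        String className = conf.getOrDefault(SERVICE_KEY, RealLookup.class.getName());
        try {
            Class<? extends LookupService> cls =
                Class.forName(className).asSubclass(LookupService.class);
            return cls.getDeclaredConstructor().newInstance();
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("Cannot create " + className, e);
        }
    }

    // Mapper-shaped class: resolves the service once in setup(), uses it per record.
    static class SketchMapper {
        private LookupService service;
        void setup(Map<String, String> conf) { service = createService(conf); }
        String map(String record) { return service.lookup(record); }
    }

    public static void main(String[] args) {
        Map<String, String> testConf = new HashMap<>();
        testConf.put(SERVICE_KEY, StubLookup.class.getName());

        SketchMapper mapper = new SketchMapper();
        mapper.setup(testConf);
        System.out.println(mapper.map("k")); // prints "stub:k"
    }
}
```

A test only has to set the configuration key to the stub's class name; the mapper code itself is unchanged between test and production runs.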

Re: I/O errors reading task output on 20.205.0

2011-12-30 Thread Markus Jelsma
Thanks, I'll look into it! > Yes, your .205 release should have it. It should fix your issue! > > On Fri, Dec 30, 2011 at 6:24 PM, Markus Jelsma > > wrote: > > Hi, (didn't reply to list before) > > > >> Does your DN log show up any form of errors when you run into this? > > > > Actually, I loo

Re: I/O errors reading task output on 20.205.0

2011-12-30 Thread Harsh J
Yes, your .205 release should have it. It should fix your issue! On Fri, Dec 30, 2011 at 6:24 PM, Markus Jelsma wrote: > Hi, (didn't reply to list before) > >> Does your DN log show up any form of errors when you run into this? > > Actually, I checked again to be sure and noticed errors tha

Re: I/O errors reading task output on 20.205.0

2011-12-30 Thread Markus Jelsma
Hi, (didn't reply to list before) > Does your DN log show up any form of errors when you run into this? Actually, I checked again to be sure and noticed errors that I didn't notice before: 2011-12-29 19:51:01,799 ERROR org.apache.hadoop.hdfs.server.datanode.DataNode: DatanodeRegistratio

Re: instantiation of classes in MR

2011-12-30 Thread Harsh J
Eyal, Yes, it is right to think of each Task attempt being one individual JVM running individually on any added Node. Multiple slots would mean multiple VMs in parallel as well. Yes, your use of reflection to build your objects will work just fine -- it's all user-side Java code that is executed

Re: Hadoop with Spring / Guice

2011-12-30 Thread Costin Leau
Hi, This tends to be a problem with any library that you use inside Hadoop. One of the easiest ways to distribute your dependencies (w/o managing) is to put them under the lib/ folder in your job and Hadoop will take care of the rest - if there are any better suggestions, I'd be interested in hearing

Re: HBase support in SHDP Was: Hadoop with Spring / Guice

2011-12-30 Thread Costin Leau
On 12/30/2011 12:43 PM, Ted Yu wrote: > Costin: > Please include u...@hbase.apache.org in > your reply. > My discussion would mostly be HBase-related. # included hbase though I'm not subscribed to that list so I don't think my messages will go through. > > I guess

instantiation of classes in MR

2011-12-30 Thread Eyal Golan
Hi, I want to understand a basic concept in MR. If a mapper creates an instance of some class (using the 'new' operator), then the created class exists ONCE in the VM of this node. For each node. Correct? Now, what if instead of using the 'new' operator, the class is created using reflection. Is

Re: Hadoop with Spring / Guice

2011-12-30 Thread Eyal Golan
Thank you Costin. I have worked with Spring before but never set up a project. So I guess I'll need to do a little bit more ramp-up to use Spring-Hadoop. Another issue that concerns me is the Hadoop cluster management. I don't have access to the cluster, besides adding my job and using some kind o

Re: HBase support in SHDP Was: Hadoop with Spring / Guice

2011-12-30 Thread Ted Yu
Costin: Please include u...@hbase.apache.org in your reply. My discussion would mostly be HBase-related. I guess you're dealing with HBase 0.90.4 where connection sharing provided by HBASE-3777 is absent. HBase 0.90.5, just released, contains HBASE-4508 which backports HBASE-3777 to the 0.90 codebase.

Re: HBase support in SHDP Was: Hadoop with Spring / Guice

2011-12-30 Thread Costin Leau
Hi, Thanks for spotting the typo. We're currently running against 0.90.x. As for the lack of connection, we use HConnectionManager in the background to clean connections up as it already provides tracking and a cleanup mechanism (based on the HBaseConfiguration object). Cheers and keep the feedb

HBase support in SHDP Was: Hadoop with Spring / Guice

2011-12-30 Thread Ted Yu
Hi, Costin: I work on HBase. I went over http://static.springsource.org/spring-hadoop/docs/current/reference/hbase.html but didn't have time to download the source code. Is there a typo: 'does more then easily' Should 'then' be 'than' ? For the following config: May I ask what would the proxies

Re: Hadoop with Spring / Guice

2011-12-30 Thread Costin Leau
Hi, My name is Costin Leau and I'm the lead of the Spring Hadoop (SHDP) project. SHDP provides DI support allowing basic POJOs to be used as mappers/reducers. This feature is currently developed on a dedicated branch [1] and we plan to merge it into master in the near future. In addition to the pojo/DI