Re: Hadoop 1.0.3: There is insufficient memory for the Java Runtime Environment to continue.

2012-10-08 Thread Attila Csordas
OpenJDK 64-Bit Server VM (build 19.0-b09, mixed mode). Might the official Oracle Java be better? Thanks, Attila. On Sun, Oct 7, 2012 at 8:37 PM, Arpit Gupta ar...@hortonworks.com wrote: are you using a 32-bit JDK for your task trackers? If so, reduce the memory setting in mapred.child.java.opts --

Re: Hadoop 1.0.3: There is insufficient memory for the Java Runtime Environment to continue.

2012-10-08 Thread Attila Csordas
ulimit was set to <property><name>mapred.child.ulimit</name><value>6291456</value></property> and -Xmx4096M stayed for the heap; still getting the very same error. Any other tips? Thanks, Attila. On Sun, Oct 7, 2012 at 6:34 AM, Harsh J ha...@cloudera.com wrote: Hi, What is your # of slots per TaskTracker?
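For reference, mapred.child.ulimit is expressed in kilobytes, so the value 6291456 above is a 6 GB per-process cap over a 4 GB (-Xmx4096M) heap; this error usually means slots x per-task memory exceeds the machine's physical RAM rather than a single-task limit being hit. A hedged sketch of a more conservative mapred-site.xml fragment (the values are illustrative only, not a recommendation for any particular cluster):

```xml
<!-- Illustrative mapred-site.xml fragment -->
<property>
  <name>mapred.child.java.opts</name>
  <!-- 1 GB heap per task; total RAM needed is roughly
       (map slots + reduce slots) x (heap + JVM overhead) -->
  <value>-Xmx1024m</value>
</property>
<property>
  <name>mapred.child.ulimit</name>
  <!-- ulimit is in kilobytes and bounds the whole process (heap plus
       permgen, thread stacks and native memory), so keep it comfortably
       above -Xmx: 2097152 KB = 2 GB for a 1 GB heap -->
  <value>2097152</value>
</property>
```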

Task Attempt Failed

2012-10-08 Thread Dave Shine
I'm starting to see the following error show up, but I don't know what it is trying to tell me. Does anyone know what this error means? 12/10/08 03:13:12 INFO mapred.JobClient: Task Id : attempt_201209281640_6895_m_007292_0, Status : FAILED java.lang.Throwable: Child Error at

Re: Hadoop 1.0.3: There is insufficient memory for the Java Runtime Environment to continue.

2012-10-08 Thread Arpit Gupta
I would recommend using the Oracle JDK. Also, I tried your configs on a single-node setup of 1.0.3 and the MR jobs went through, so I suspect this is something specific to your env. Also, from your email below you mention that mapred.child.java.opts and mapred.child.ulimit were added to try

Re: pojo writables

2012-10-08 Thread Dave Beech
Hi Jay, Hadoop supports this already using the JavaSerialization class. I'm not sure it is very performant, though, when compared to the usual Writable serialization. Alternatively you could use Avro (SpecificRecord objects auto-generated from a schema file). Cheers, Dave. On 8 October 2012 15:13, Jay
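For context, JavaSerialization only kicks in if it is registered in the io.serializations list, which lets plain Serializable POJOs be used as map/reduce keys and values. A hedged core-site.xml sketch (class names as in the 1.x line):

```xml
<!-- Illustrative core-site.xml fragment: register JavaSerialization
     alongside the default WritableSerialization. Java serialization is
     slower and bulkier on the wire than Writables, hence the caveat above. -->
<property>
  <name>io.serializations</name>
  <value>org.apache.hadoop.io.serializer.WritableSerialization,org.apache.hadoop.io.serializer.JavaSerialization</value>
</property>
```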

HDFS-347 and HDFS-2246: how do the issues differ?

2012-10-08 Thread jlei liu
The two issues both implement the same function: letting DFSClient directly open data blocks that happen to be on the same machine. What are the advantages of HDFS-347? Thanks, LiuLei
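Briefly: HDFS-2246 has the client open the DataNode's block files on local disk itself, which requires granting the client user read access to those files; HDFS-347 instead has the DataNode pass an open file descriptor over a Unix domain socket, so no extra filesystem permissions are needed and it is considered the safer design. A hedged hdfs-site.xml sketch of the HDFS-2246-style settings (property names as in the 1.x/CDH line; the user value is illustrative):

```xml
<property>
  <name>dfs.client.read.shortcircuit</name>
  <value>true</value>
</property>
<property>
  <!-- HDFS-2246 style: the listed users may read block files directly -->
  <name>dfs.block.local-path-access.user</name>
  <value>hbase</value>
</property>
```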

Re: Chaining Multiple Reducers: Reduce -> Reduce -> Reduce

2012-10-08 Thread Bertrand Dechoux
Have you looked at graph processing for Hadoop? Like Hama ( http://hama.apache.org/) or Giraph (http://incubator.apache.org/giraph/). I can't say for sure it would help you but it seems to be in the same problem domain. With regard to the chaining reducer issue this is indeed a general

Re: Chaining Multiple Reducers: Reduce -> Reduce -> Reduce

2012-10-08 Thread Fabio Pitzolu
Wouldn't using Cascading (http://www.cascading.org/) also be of some help? *Fabio Pitzolu* Consultant - BI Infrastructure Mob. +39 3356033776 Telefono 02 87157239 Fax. 02 93664786 *Gruppo Consulenza Innovazione - http://www.gr-ci.com* 2012/10/8 Bertrand Dechoux decho...@gmail.com Have you

Re: Chaining Multiple Reducers: Reduce -> Reduce -> Reduce

2012-10-08 Thread Bertrand Dechoux
The question is not how to sequence it all; Cascading could indeed help in that case. But how to skip the map phase and do the split/local sort directly at the end of the reduce, so that the next reduce needs only to do a merge on the sorted files obtained from the previous reduce? This is basically a
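The merge-of-sorted-runs step described above (the only work a follow-on reduce would need if each previous reduce emitted sorted output) can be sketched locally in a few lines; this is an illustrative simulation with made-up data, not Hadoop code:

```python
import heapq

# Each "previous reduce" produced a locally sorted run of (key, value) pairs.
run_a = [("apple", 1), ("cherry", 2), ("plum", 1)]
run_b = [("banana", 3), ("cherry", 5)]

# A follow-on reduce that receives sorted runs only needs an n-way merge,
# not a full shuffle/sort: heapq.merge streams the runs in key order.
merged = list(heapq.merge(run_a, run_b))
```

The point of the thread is that vanilla MapReduce forces an identity map plus a full sort between reduces, even though this cheap streaming merge would suffice.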

Re: Chaining Multiple Reducers: Reduce -> Reduce -> Reduce

2012-10-08 Thread Edward J. Yoon
call context.write() in my mapper class)? If not, are there any other MR platforms that can do this? I've been searching around and couldn't You can use Hama BSP[1] instead of Map/Reduce. No stable release yet, but I confirmed that a large graph with billions of nodes and edges can be crunched in

How to change topology

2012-10-08 Thread Shinichi Yamashita
Hi, I know that a DataNode and TaskTracker must be restarted to change the topology. Is there a way to apply a topology change without restarting the DataNode and TaskTracker? In other words, can I change the topology with a command? Thanks in advance! Shinichi
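For context, the NameNode and JobTracker resolve racks through the executable named by topology.script.file.name, so the mapping itself lives outside the daemons; the catch in the 1.x line is that resolved mappings are cached per node, which is why a restart is usually still needed for already-registered nodes. A minimal sketch of such a script, with a made-up host-to-rack table:

```python
#!/usr/bin/env python
# Minimal topology-script sketch: Hadoop invokes it with one or more
# host names/IPs as arguments and expects one rack path per argument.
import sys

# Hypothetical host-to-rack table; a real script might read a data file
# so racks can be edited without touching the script.
RACKS = {
    "10.0.1.11": "/dc1/rack1",
    "10.0.1.12": "/dc1/rack1",
    "10.0.2.21": "/dc1/rack2",
}

def resolve(host):
    # Unknown hosts fall back to the default rack.
    return RACKS.get(host, "/default-rack")

if __name__ == "__main__":
    print(" ".join(resolve(h) for h in sys.argv[1:]))
```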

Collecting error messages from Mappers

2012-10-08 Thread Terry Healy
Hi- Is there a simple way for error output / debug information generated by a Mapper to be collected in one place for a given M/R job run? I guess what I'm hoping for is sort of the reverse of a Distributed Cache function. Can anyone suggest an approach? Thanks, Terry

Re: sym Links in hadoop

2012-10-08 Thread Visioner Sadak
Actually, I have to access my archived Hadoop HAR files through HTTP (webhdfs). Normal files I am able to read through HTTP, but once my files are HAR-archived I am not able to read them ... that's why I am creating a symlink, so that the URL remains the same. On Mon, Oct 8, 2012 at 7:50 PM, Visioner Sadak
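For reference, files inside an archive are normally addressed through the har:// filesystem scheme rather than their original paths, which is why a plain URL stops working after archiving; whether webhdfs can follow that scheme is a separate question. An illustrative CLI sketch (the paths are made up):

```shell
# List the contents of an archive via the har filesystem
hadoop fs -ls har:///user/sadak/archive.har
# Read one file from inside the archive
hadoop fs -cat har:///user/sadak/archive.har/dir/part-00000
```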

Re: Collecting error messages from Mappers

2012-10-08 Thread Bertrand Dechoux
If it is only to get a coarse overview of the run, you can use counters, but you shouldn't overuse them. You can, for example, count exceptions by type (counting by message is not a good approach unless you are sure the message is constant). Regards Bertrand On Mon, Oct 8, 2012 at 4:13 PM, Terry
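Counters are not limited to the Java API: with Hadoop Streaming, a task can increment one by writing a specially formatted line to stderr, which collects per-mapper error tallies in one place (the job UI). A sketch, with the group and counter names invented for illustration:

```python
import sys

def count_exception(exc):
    # Hadoop Streaming parses "reporter:counter:<group>,<name>,<amount>"
    # lines on stderr and adds the amount to the job's counters.
    line = "reporter:counter:MapperErrors,%s,1" % type(exc).__name__
    sys.stderr.write(line + "\n")
    return line  # returned only so the example can show the format

# Simulate a record that fails to parse inside a streaming mapper.
try:
    int("not-a-number")
except ValueError as e:
    emitted = count_exception(e)
```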

Re: One file per mapper?

2012-10-08 Thread Terry Healy
Thanks Bejoy. ...Feeling a bit foolish, as Tom White's book was 2 feet away. On 10/08/2012 10:28 AM, Bejoy Ks wrote: Hi Terry, if you have files smaller than the HDFS block size and you are using the default TextInputFormat with the default split-size properties, there would be just
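In other words, with the default TextInputFormat each file smaller than a block becomes exactly one split, hence one mapper per file. To force the same behavior for somewhat larger files, one option is raising the minimum split size; a hedged mapred-site.xml sketch (the value is an example only):

```xml
<!-- Illustrative: any file smaller than the minimum split size yields
     a single split, and therefore a single mapper. -->
<property>
  <name>mapred.min.split.size</name>
  <value>1073741824</value> <!-- 1 GB -->
</property>
```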

Re: Chaining Multiple Reducers: Reduce -> Reduce -> Reduce

2012-10-08 Thread Jim Twensky
Thank you for the comments. Some similar frameworks I looked at include HaLoop, Twister, Hama, Giraph and Cascading. I am also doing large-scale graph processing, so I assumed one of them could serve the purpose. Here is a summary of what I found out about them that is relevant: 1) HaLoop and

Secure hadoop and group permission on HDFS

2012-10-08 Thread Koert Kuipers
With secure hadoop the user name is authenticated by the kerberos server. But what about the groups that the user is a member of? Are these simply the groups that the user is a member of on the namenode machine? Is it viable to manage access to files on HDFS using groups on a secure hadoop

Re: Chaining Multiple Reducers: Reduce -> Reduce -> Reduce

2012-10-08 Thread Edward J. Yoon
asking for. If anyone who used Hama can point a few articles about how the framework actually works and handles the messages passed between vertices, I'd really appreciate that. Hama Architecture: https://issues.apache.org/jira/secure/attachment/12528219/ApacheHamaDesign.pdf Hama BSP

Re: Chaining Multiple Reducers: Reduce -> Reduce -> Reduce

2012-10-08 Thread Michael Segel
Jim, you can use the combiner as a reducer, albeit you won't get down to a single reduce output file. But you don't need that: as long as the output from the combiner matches the input to the next reducer, you should be ok. Without knowing the specifics, all I can say is TANSTAAFL, that is to

Re: Chaining Multiple Reducers: Reduce -> Reduce -> Reduce

2012-10-08 Thread Edward J. Yoon
Mike, just FYI, it's my '08 approach[1]. 1. https://blogs.apache.org/hama/entry/how_will_hama_bsp_different On Tue, Oct 9, 2012 at 7:50 AM, Michael Segel michael_se...@hotmail.com wrote: Jim, you can use the combiner as a reducer, albeit you won't get down to a single reduce output file.

Re: Secure hadoop and group permission on HDFS

2012-10-08 Thread Harsh J
Koert, if you use the org.apache.hadoop.security.ShellBasedUnixGroupsMapping class (via hadoop.security.group.mapping), then yes, the NameNode's view of the user's local unix groups (and primary group) is the final say on what groups the user belongs to. This can be relied on, but note
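A hedged core-site.xml sketch of the setting named above (class name as in the 1.x line):

```xml
<!-- Illustrative core-site.xml fragment: resolve a user's groups by
     shelling out on the NameNode, i.e. the NameNode's local unix groups
     are authoritative regardless of where the client runs. -->
<property>
  <name>hadoop.security.group.mapping</name>
  <value>org.apache.hadoop.security.ShellBasedUnixGroupsMapping</value>
</property>
```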