Re: How to verify all my master/slave name/data nodes have been configured correctly?

2012-03-08 Thread madhu phatak
Hi, use the JobTracker web UI at master:50030 and the NameNode web UI at master:50070. On Fri, Feb 10, 2012 at 9:03 AM, Wq Az azq...@gmail.com wrote: Hi, Is there a quick way to check this? Thanks ahead, Will -- Join me at http://hadoopworkshop.eventbrite.com/
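
A quick scripted version of the check suggested above might look like the following. This is only a sketch: the class name `HadoopUiCheck`, the default host name `master`, and the 3-second timeouts are assumptions for illustration, and the ports are the ones named in the reply. It only tells you whether the two web UIs answer over HTTP; whether every slave has registered still has to be read off the pages themselves.

```java
import java.net.HttpURLConnection;
import java.net.URL;
import java.util.LinkedHashMap;
import java.util.Map;

public class HadoopUiCheck {
    // Builds the two web UI addresses from the reply:
    // JobTracker on port 50030, NameNode on port 50070.
    public static Map<String, String> uiUrls(String host) {
        Map<String, String> urls = new LinkedHashMap<>();
        urls.put("JobTracker", "http://" + host + ":50030/");
        urls.put("NameNode", "http://" + host + ":50070/");
        return urls;
    }

    // Returns true if the URL answers with an HTTP status below 400.
    // Any connection problem (refused, timeout, bad host) counts as "not responding".
    public static boolean responds(String url) {
        try {
            HttpURLConnection conn = (HttpURLConnection) new URL(url).openConnection();
            conn.setConnectTimeout(3000); // assumed timeout, in milliseconds
            conn.setReadTimeout(3000);
            conn.setRequestMethod("GET");
            int code = conn.getResponseCode();
            conn.disconnect();
            return code < 400;
        } catch (Exception e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // "master" is a placeholder host name; pass your own as the first argument.
        String host = args.length > 0 ? args[0] : "master";
        for (Map.Entry<String, String> e : uiUrls(host).entrySet()) {
            String state = responds(e.getValue()) ? "up" : "unreachable";
            System.out.println(e.getKey() + " " + e.getValue() + " -> " + state);
        }
    }
}
```

Run it from any machine that can reach the master, for example `java HadoopUiCheck master`.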

Re: Can I start a Hadoop job from an EJB?

2012-03-08 Thread madhu phatak
Yes, you can. Please make sure all the Hadoop jars and the conf directory are on the classpath. On Thu, Feb 9, 2012 at 7:02 AM, Sanjeev Verma sanjeev.x.ve...@gmail.com wrote: This is based on my understanding and no real life experience, so going to go out on a limb here :-)...assuming that you are planning

Re: Standalone operation - file permission, Pseudo-Distributed operation - no output

2012-03-08 Thread Jagat
Hello, can you please tell us which version of Hadoop you are using, and whether your error matches the message below? Failed to set permissions of path: file:/tmp/hadoop-jj/mapred/staging/jj-1931875024/.staging to 0700 Thanks Jagat On Thu, Mar 8, 2012 at 5:10 PM, madhu phatak

Convergence on File Format?

2012-03-08 Thread Michal Klos
Hi, It seems that Avro is poised to become the file format, is that still the case? We've looked at Text, RCFile and Avro. Text is nice, but we'd really need to extend it. RCFile is great for Hive, but it has been a challenge using it outside of Hive. Avro has a great feature set, but is

Re: Convergence on File Format?

2012-03-08 Thread Serge Blazhievsky
We started using Avro a few months ago and the results are great! Easy to use, reliable, feature rich, great integration with MapReduce. On 3/8/12 3:07 PM, Michal Klos mk...@compete.com wrote: Hi, It seems that Avro is poised to become the file format, is that still the case? We've looked at Text,

Re: Profiling Hadoop Job

2012-03-08 Thread Leonardo Urbina
Does anyone have any idea how to solve this problem? Regardless of whether I'm using plain HPROF or profiling through Starfish, I am getting the same error: Exception in thread main java.io.FileNotFoundException: attempt_201203071311_0004_m_ 00_0.profile (Permission denied) at

Re: Why is hadoop build I generated from a release branch different from release build?

2012-03-08 Thread Matt Foley
Hi Pawan, The complete way releases are built (for v0.20/v1.0) is documented at http://wiki.apache.org/hadoop/HowToRelease#Building However, that does a bunch of stuff you don't need, like generate the documentation and do a ton of cross-checks. The full set of ant build targets is defined

Re: Convergence on File Format?

2012-03-08 Thread Russell Jurney
Avro support in Pig will be fairly mature in 0.10. Russell Jurney twitter.com/rjurney russell.jur...@gmail.com datasyndrome.com On Mar 8, 2012, at 3:10 PM, Serge Blazhievsky serge.blazhiyevs...@nice.com wrote: We started using Avro few month ago and results are great! Easy to use, reliable,

Re: Profiling Hadoop Job

2012-03-08 Thread Mohit Anchlia
Can you check which user you are running this process as and compare it with the ownership on the directory? On Thu, Mar 8, 2012 at 3:13 PM, Leonardo Urbina lurb...@mit.edu wrote: Does anyone have any idea how to solve this problem? Regardless of whether I'm using plain HPROF or profiling
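
The comparison suggested above can be sketched in a few lines of Java. The class name `OwnerCheck` and the default path `.` are assumptions for illustration; pass the directory the task is trying to write its `.profile` file into. It prints the user the JVM runs as, the owner of the directory, and whether the directory is writable by the current process.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

public class OwnerCheck {
    // Name of the OS user the current JVM runs as.
    public static String processUser() {
        return System.getProperty("user.name");
    }

    // Owner of the given file or directory, as reported by the file system.
    public static String ownerOf(Path path) {
        try {
            return Files.getOwner(path).getName();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // "." is only a placeholder; use the directory from the error message.
        Path dir = Paths.get(args.length > 0 ? args[0] : ".");
        System.out.println("process user : " + processUser());
        System.out.println("owner of dir : " + ownerOf(dir));
        System.out.println("writable     : " + Files.isWritable(dir));
    }
}
```

If the two names differ and the directory is not writable, that would be consistent with the "Permission denied" in the stack trace. Note that this has to be run as the same user the Hadoop task runs as for the result to be meaningful.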

RE: Why is hadoop build I generated from a release branch different from release build?

2012-03-08 Thread Leo Leung
Hi Pawan, ant -p (not for 0.23+) will tell you the available build targets. Use mvn (maven) for 0.23 or newer -Original Message- From: Matt Foley [mailto:mfo...@hortonworks.com] Sent: Thursday, March 08, 2012 3:52 PM To: common-user@hadoop.apache.org Subject: Re: Why is hadoop

does hadoop always respect setNumReduceTasks?

2012-03-08 Thread Jane Wayne
i am wondering if hadoop always respects Job.setNumReduceTasks(int)? as i am emitting items from the mapper, i expect/desire only 1 reducer to get these items because i want to assign each key of the key-value input pair a unique integer id. if i had 1 reducer, i could just keep a local counter

Re: does hadoop always respect setNumReduceTasks?

2012-03-08 Thread Lance Norskog
Instead of String.hashCode() you can use the MD5 hashcode generator. It has not produced a duplicate in the wild. (It has been hacked, but that's not relevant here.) http://snippets.dzone.com/posts/show/3686 I think the Partitioner class guarantees that you will have multiple reducers. On Thu,
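
For reference, an MD5-based key id can be computed with the JDK alone. The sketch below is not the code behind the link above; the class name `Md5Key` and the helper `md5Hex` are made up for illustration. It also shows why String.hashCode() is a poor unique id: "Aa" and "BB" produce the same 32-bit hash code. Keep in mind that MD5 yields a 128-bit value, not a small sequential integer.

```java
import java.math.BigInteger;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

public class Md5Key {
    // Returns the MD5 digest of the key as 32 lowercase hex characters.
    public static String md5Hex(String key) {
        try {
            MessageDigest md = MessageDigest.getInstance("MD5");
            byte[] digest = md.digest(key.getBytes(StandardCharsets.UTF_8));
            // Interpret the 16 digest bytes as a non-negative number and
            // left-pad with zeros so the result is always 32 characters.
            return String.format("%032x", new BigInteger(1, digest));
        } catch (NoSuchAlgorithmException e) {
            // Every Java platform is required to provide MD5.
            throw new IllegalStateException("MD5 not available", e);
        }
    }

    public static void main(String[] args) {
        // Two different strings with the same String.hashCode() value.
        System.out.println("Aa".hashCode() == "BB".hashCode()); // true
        // Their MD5 digests differ.
        System.out.println(md5Hex("Aa"));
        System.out.println(md5Hex("BB"));
    }
}
```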

Best way for setting up a large cluster

2012-03-08 Thread Masoud
Hi all, I installed Hadoop on a pilot cluster of 3 machines and am now going to build our actual cluster with 32 nodes. As you know, setting up Hadoop separately on every node is time consuming and not a good approach. What's the best way or tool to set up a Hadoop cluster (except Cloudera)? Thanks,

Re: Best way for setting up a large cluster

2012-03-08 Thread Joey Echeverria
Something like Puppet is a good choice. There are example Puppet manifests available for most Hadoop-related projects in Apache BigTop, for example: https://svn.apache.org/repos/asf/incubator/bigtop/branches/branch-0.2/bigtop-deploy/puppet/ -Joey On Thu, Mar 8, 2012 at 9:42 PM, Masoud

Getting different results every time I run the same job on the cluster

2012-03-08 Thread Mark Kerzner
Hi, I have to admit, I am lost. My code http://frd.org/ is stable on a pseudo distributed cluster, but every time I run it on a 4-slave cluster, I get different results, ranging from 100 output lines to 4,000 output lines, whereas the real answer on my standalone is about 2000. I look at

Re: Why is hadoop build I generated from a release branch different from release build?

2012-03-08 Thread Pawan Agarwal
Thanks for all the replies. It turns out that the build generated by ant has the bin, conf, etc. folders one level above. I looked at the hadoop scripts, and apparently they look for the right jars in both the root directory and the root/build/ directory, so I think I am covered for now. Thanks again! On

Hadoop-Pig setup question

2012-03-08 Thread Atul Thapliyal
Hi Hadoop users, I am a new member, so please let me know if this is not the correct format for asking questions. I am trying to set up a small Hadoop cluster where I will run Pig queries. The Hadoop cluster is running fine, but when I run a Pig query it just hangs. Note - Pig runs fine in local mode So I

Hadoop node name problem

2012-03-08 Thread 韶隆吴
Hi All: I'm trying to use Hadoop, ZooKeeper and HBase to build a NoSQL database, but after getting Hadoop and ZooKeeper to work and going on to install HBase, it reports an exception: BindException: Problem binding to /202.106.199.37:60020: Cannot assign requested address My PC IPHost is 192.168.1.91

Re: Java Heap space error

2012-03-08 Thread hadoopman
I'm curious if you have been able to track down the cause of the error? We've seen similar problems with loading data and I've discovered if I presort my data before the load that things go a LOT smoother. When running queries against our data sometimes we've seen it where the jobtracker

state of HOD

2012-03-08 Thread Stijn De Weirdt
(my apologies to those who have received this already. i posted this mail a few days back on the common-dev list, as this is more a development related mail; but one of the original authors/maintainers suggested to also post this here) hi all, i am a system administrator/user support