Re: JVM option

2014-04-18 Thread Bjorn Jonsson
Hi Andy, This is a client-side property that you can set in hadoop-env.sh (/etc/hadoop/conf). The hadoop script (/bin/hadoop) itself, which is a bash shell script, contains code like the following:
  if [ "$HADOOP_HEAPSIZE" != "" ]; then
    #echo "run with heapsize $HADOOP_HEAPSIZE"
    JAVA_HEAP_MAX=
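
The quoted script is cut off in this listing, so the following is not its actual text. It is only a minimal sketch of the pattern being described, assuming HADOOP_HEAPSIZE is given in megabytes and is turned into a JVM -Xmx flag. The helper name heap_flag and the 1000 MB fallback are assumptions made for illustration.

```shell
# Hypothetical helper, not part of the hadoop script: build a JVM max-heap
# flag from a heap size in megabytes, falling back to an assumed default.
heap_flag() {
  if [ "$1" != "" ]; then
    printf '%s' "-Xmx${1}m"
  else
    printf '%s' "-Xmx1000m"   # assumed fallback when nothing is set
  fi
}

# In practice this value would be exported from hadoop-env.sh.
HADOOP_HEAPSIZE=2048
JAVA_HEAP_MAX=$(heap_flag "$HADOOP_HEAPSIZE")
echo "$JAVA_HEAP_MAX"
```

Setting the variable in hadoop-env.sh rather than on the command line keeps the choice in one place for every client command launched through the script.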

Re: Few noob MR questions

2013-04-13 Thread Bjorn Jonsson
Correct, you can use java -jar to submit a job...with the "driver" code in a plain static main method. I do it all the time. You can of course run a Job straight from your IDE Java code also. You can check out the .runJar() method in the Hadoop API Javadoc to see what the hadoop command does essent

Re: Distributed cache: how big is too big?

2013-04-09 Thread Bjorn Jonsson
e is hdfs distribution of the file over all nodes ? > > On Apr 9, 2013, at 6:49 AM, Bjorn Jonsson wrote: > > Put it once on hdfs with a replication factor equal to the number of DN. > No startup latency on job submission or max size and access it from > anywhere with fs since

Re: Distributed cache: how big is too big?

2013-04-09 Thread Bjorn Jonsson
Put it once on hdfs with a replication factor equal to the number of DN. No startup latency on job submission or max size and access it from anywhere with fs since it sticks around until you replace it? Just a thought. On Apr 8, 2013 9:59 PM, "John Meza" wrote: > I am researching a Hadoop soluti
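
The suggestion above could be sketched roughly as follows. This is a dry run that only prints the commands, assuming a cluster with 4 DataNodes. The local file name and HDFS path are hypothetical, and the -w flag (wait until the target replication is reached) is optional.

```shell
# Dry-run sketch: builds and prints the commands instead of running them.
NUM_DATANODES=4                 # assumption: number of DataNodes in the cluster
LOCAL_FILE=lookup.dat           # hypothetical local file
HDFS_PATH=/shared/lookup.dat    # hypothetical HDFS destination

# Upload the file once, then raise its replication factor to one copy per DataNode.
PUT_CMD="hadoop fs -put $LOCAL_FILE $HDFS_PATH"
SETREP_CMD="hadoop fs -setrep -w $NUM_DATANODES $HDFS_PATH"

echo "$PUT_CMD"
echo "$SETREP_CMD"
```

With one replica per DataNode, every task can read the file from a local block, which is the point of the suggestion.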

Re: Problem accessing HDFS from a remote machine

2013-04-08 Thread Bjorn Jonsson
Yes, the namenode port is not open for your cluster. I had this problem too. First, log into your namenode and do netstat -nap to see what ports are listening. You can do service --status-all to see if the namenode service is running. Basically you need Hadoop to bind to the correct ip (an external
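
As a rough illustration of the netstat check, the sketch below filters netstat -nap style output for sockets listening on one port. The helper name listening_on, the port 8020, and the sample line are assumptions for illustration, not output from a real cluster.

```shell
# Hypothetical helper: read `netstat -nap`-style lines on stdin and print the
# local address of every socket in LISTEN state on the given port.
listening_on() {
  awk -v port="$1" '$6 == "LISTEN" && $4 ~ (":" port "$") { print $4 }'
}

# Illustrative sample line; on a real namenode, pipe `netstat -nap` in instead.
SAMPLE='tcp 0 0 127.0.0.1:8020 0.0.0.0:* LISTEN 1234/java'
BOUND=$(printf '%s\n' "$SAMPLE" | listening_on 8020)
echo "$BOUND"
```

If the address printed is a loopback address such as 127.0.0.1, the port is only reachable from the namenode host itself, which would match the symptom of a remote client being unable to connect.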