Re: where are my logging output files going to?

2012-03-28 Thread Michel Segel
First, you really don't want to launch the job from the cluster itself but
from an edge node.

To answer your question, in a word, yes: you should keep as consistent a set
of configuration files across the nodes as possible, noting that over time
this may not be possible as hardware configs may change.
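
For example, a minimal conf/log4j.properties along these lines (an
illustrative sketch, not the stock Hadoop file) is the kind of thing that
should match on every node:

 # root logger and a console appender (values are examples only)
 hadoop.root.logger=INFO,console
 log4j.rootLogger=${hadoop.root.logger}
 log4j.appender.console=org.apache.log4j.ConsoleAppender
 log4j.appender.console.layout=org.apache.log4j.PatternLayout
 log4j.appender.console.layout.ConversionPattern=%d{ISO8601} %p %c: %m%n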


Sent from a remote device. Please excuse any typos...

Mike Segel

On Mar 27, 2012, at 8:42 PM, Jane Wayne  wrote:

> if i have a hadoop cluster of 10 nodes, do i have to modify the
> /hadoop/conf/log4j.properties files on ALL 10 nodes to be the same?
> 
> currently, i ssh into the master node to execute a job. this node is the
> only place where i have modified the log4j.properties file. i notice that
> although my log files are being created, nothing is being written to them.
> when i test on cygwin, the logging works, however, when i go to a live
> cluster (i.e. amazon elastic mapreduce), the logging output on the master
> node no longer works. i wonder if logging is happening at each slave/task
> node?
> 
> could someone explain logging or point me to the documentation discussing
> this issue?


Re: where are my logging output files going to?

2012-03-28 Thread Jane Wayne
what do you mean by an edge node? do you mean any node that is not the
master node (or NameNode or JobTracker node)?

On Wed, Mar 28, 2012 at 3:51 AM, Michel Segel wrote:

> First, you really don't want to launch the job from the cluster itself but
> from an edge node.
>
> To answer your question, in a word, yes: you should keep as consistent a set
> of configuration files across the nodes as possible, noting that over time
> this may not be possible as hardware configs may change.
>
>
> Sent from a remote device. Please excuse any typos...
>
> Mike Segel
>
> On Mar 27, 2012, at 8:42 PM, Jane Wayne  wrote:
>
> > if i have a hadoop cluster of 10 nodes, do i have to modify the
> > /hadoop/conf/log4j.properties files on ALL 10 nodes to be the same?
> >
> > currently, i ssh into the master node to execute a job. this node is the
> > only place where i have modified the log4j.properties file. i notice that
> > although my log files are being created, nothing is being written to
> them.
> > when i test on cygwin, the logging works, however, when i go to a live
> > cluster (i.e. amazon elastic mapreduce), the logging output on the master
> > node no longer works. i wonder if logging is happening at each slave/task
> > node?
> >
> > could someone explain logging or point me to the documentation discussing
> > this issue?
>


In cygwin, hadoop throws exception when running wordcount.

2012-03-28 Thread Tim.Wu
Env: Windows 7 + Cygwin 1.7.11-1 + JDK 1.6.0_31 + Hadoop 1.0.0

So far, standalone mode is OK. But in pseudo-distributed or cluster mode, the
wordcount example always throws errors.

HDFS works fine, but the tasktracker cannot spawn child JVMs for new
jobs. The directory /logs/userlogs/job-/attempt-/ is empty.

The tasktracker error log looks like:


==
12/03/28 14:35:13 INFO mapred.JvmManager: In JvmRunner constructed JVM ID:
jvm_201203280212_0005_m_-1386636958
12/03/28 14:35:13 INFO mapred.JvmManager: JVM Runner
jvm_201203280212_0005_m_-1386636958 spawned.
12/03/28 14:35:17 INFO mapred.JvmManager: JVM Not killed
jvm_201203280212_0005_m_-1386636958 but just removed
12/03/28 14:35:17 INFO mapred.JvmManager: JVM :
jvm_201203280212_0005_m_-1386636958 exited with exit code -1. Number of
tasks it ran: 0
12/03/28 14:35:17 WARN mapred.TaskRunner:
attempt_201203280212_0005_m_02_0 : Child Error
java.io.IOException: Task process exit with nonzero status of -1.
at org.apache.hadoop.mapred.TaskRunner.run(TaskRunner.java:258)
12/03/28 14:35:21 INFO mapred.TaskTracker: addFreeSlot : current free slots
: 2
12/03/28 14:35:24 INFO mapred.TaskTracker: LaunchTaskAction (registerTask):
attempt_201203280212_0005_m_02_1 task's state:UNASSIGNED
12/03/28 14:35:24 INFO mapred.TaskTracker: Trying to launch :
attempt_201203280212_0005_m_02_1 which needs 1 slots
12/03/28 14:35:24 INFO mapred.TaskTracker: In TaskLauncher, current free
slots : 2 and trying to launch attempt_201203280212_0005_m_02_1 which
needs 1 slots
12/03/28 14:35:24 WARN mapred.TaskLog: Failed to retrieve stdout log for
task: attempt_201203280212_0005_m_02_0
java.io.FileNotFoundException:
D:\cygwin\home\timwu\hadoop-1.0.0\logs\userlogs\job_201203280212_0005\attempt_201203280212_0005_m_02_0\log.index
(The system cannot find the path specified)
at java.io.FileInputStream.open(Native Method)
at java.io.FileInputStream.<init>(FileInputStream.java:120)
at
org.apache.hadoop.io.SecureIOUtils.openForRead(SecureIOUtils.java:102)
at
org.apache.hadoop.mapred.TaskLog.getAllLogsFileDetails(TaskLog.java:188)
at org.apache.hadoop.mapred.TaskLog$Reader.<init>(TaskLog.java:423)
at
org.apache.hadoop.mapred.TaskLogServlet.printTaskLog(TaskLogServlet.java:81)
at
org.apache.hadoop.mapred.TaskLogServlet.doGet(TaskLogServlet.java:296)
at javax.servlet.http.HttpServlet.service(HttpServlet.java:707)
at javax.servlet.http.HttpServlet.service(HttpServlet.java:820)
at
org.mortbay.jetty.servlet.ServletHolder.handle(ServletHolder.java:511)
at
org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1221)
at
org.apache.hadoop.http.HttpServer$QuotingInputFilter.doFilter(HttpServer.java:835)
at
org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1212)
at
org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:399)
at
org.mortbay.jetty.security.SecurityHandler.handle(SecurityHandler.java:216)
at
org.mortbay.jetty.servlet.SessionHandler.handle(SessionHandler.java:182)
at
org.mortbay.jetty.handler.ContextHandler.handle(ContextHandler.java:766)
at
org.mortbay.jetty.webapp.WebAppContext.handle(WebAppContext.java:450)
at
org.mortbay.jetty.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:230)
at
org.mortbay.jetty.handler.HandlerWrapper.handle(HandlerWrapper.java:152)
at org.mortbay.jetty.Server.handle(Server.java:326)
at
org.mortbay.jetty.HttpConnection.handleRequest(HttpConnection.java:542)
at
org.mortbay.jetty.HttpConnection$RequestHandler.headerComplete(HttpConnection.java:928)
at org.mortbay.jetty.HttpParser.parseNext(HttpParser.java:549)
at org.mortbay.jetty.HttpParser.parseAvailable(HttpParser.java:212)
at org.mortbay.jetty.HttpConnection.handle(HttpConnection.java:404)
at
org.mortbay.io.nio.SelectChannelEndPoint.run(SelectChannelEndPoint.java:410)
at
org.mortbay.thread.QueuedThreadPool$PoolThread.run(QueuedThreadPool.java:582)
12/03/28 14:35:24 WARN mapred.TaskLog: Failed to retrieve stderr log for
task: attempt_201203280212_0005_m_02_0
java.io.FileNotFoundException:
D:\cygwin\home\timwu\hadoop-1.0.0\logs\userlogs\job_201203280212_0005\attempt_201203280212_0005_m_02_0\log.index
(The system cannot find the path specified)
at java.io.FileInputStream.open(Native Method)
at java.io.FileInputStream.<init>(FileInputStream.java:120)
at
org.apache.hadoop.io.SecureIOUtils.openForRead(SecureIOUtils.java:102)
at
org.apache.hadoop.mapred.TaskLog.getAllLogsFileDetails(TaskLog.java:188)
at org.apache.hadoop.mapred.TaskLog$Reader.<init>(TaskLog.java:423)
at
org.apache.hadoop.mapred.TaskLogServlet.printTaskLog(TaskLogServlet.java:81)
at
org.apache.hadoop.mapred.TaskLogServlet.doGet(TaskLogServlet.java:296)
at java

Re: Cannot renew lease for DFSClient_977492582. Name node is in safe mode in AWS

2012-03-28 Thread madhu phatak
Hi Mohit,
 HDFS is in safe mode, which is read-only mode. Run the following command to
get out of safe mode:

 bin/hadoop dfsadmin -safemode leave
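
If the namenode stays in safe mode, it can help to check the state first; the
same dfsadmin tool supports these standard sub-commands:

 bin/hadoop dfsadmin -safemode get    # print current ON/OFF status
 bin/hadoop dfsadmin -safemode wait   # block until safe mode exits on its own
 bin/hadoop dfsadmin -safemode leave  # force the namenode out of safe mode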

On Thu, Mar 15, 2012 at 5:54 AM, Mohit Anchlia wrote:

>  When I run a client to create files in Amazon HDFS I get this error. Does
> anyone know what it really means and how to resolve it?
>
> ---
>
>
> 2012-03-14 23:16:21,414 INFO org.apache.hadoop.ipc.Server (IPC Server
> handler 46 on 9000): IPC Server handler 46 on 9000, call
> renewLease(DFSClient_977492582) from 10.70.150.119:47240: error:
> org.apache.hadoop.hdfs.server.namenode.SafeModeException: Cannot renew
> lease for DFSClient_977492582. Name node is in safe mode.
>
> The ratio of reported blocks 1. has reached the threshold 0.9990. Safe
> mode will be turned off automatically in 0 seconds.
>
> org.apache.hadoop.hdfs.server.namenode.SafeModeException: Cannot renew
> lease for DFSClient_977492582. Name node is in safe mode.
>
> The ratio of reported blocks 1. has reached the threshold 0.9990. Safe
> mode will be turned off automatically in 0 seconds.
>
> at
>
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.renewLease(FSNamesystem.java:2296)
>
> at
>
> org.apache.hadoop.hdfs.server.namenode.NameNode.renewLease(NameNode.java:814)
>
> at sun.reflect.GeneratedMethodAccessor5.invoke(Unknown Source)
>
> at
>
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>
> at java.lang.reflect.Method.invoke(Method.java:597)
>
> at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:563)
>
> at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1388)
>
> at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1384)
>
> at java.security.AccessController.doPrivileged(Native Method)
>
> at javax.security.auth.Subject.doAs(Subject.java:396)
>
> at
>
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1059)
>
> at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1382)
>



-- 
https://github.com/zinnia-phatak-dev/Nectar


Re: dynamic mapper?

2012-03-28 Thread madhu phatak
Hi,
 You can use Java APIs to compile custom Java code and create jars. For
example, look at this code from Sqoop:

/**
 * Licensed to Cloudera, Inc. under one
 * or more contributor license agreements.  See the NOTICE file
 * distributed with this work for additional information
 * regarding copyright ownership.  Cloudera, Inc. licenses this file
 * to you under the Apache License, Version 2.0 (the
 * "License"); you may not use this file except in compliance
 * with the License.  You may obtain a copy of the License at
 *
 * http://www.apache.org/licenses/LICENSE-2.0
 *
 * Unless required by applicable law or agreed to in writing, software
 * distributed under the License is distributed on an "AS IS" BASIS,
 * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 * See the License for the specific language governing permissions and
 * limitations under the License.
 */

package com.cloudera.sqoop.orm;

import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.util.ArrayList;
import java.util.List;
import java.util.jar.JarOutputStream;
import java.util.zip.ZipEntry;

import javax.tools.JavaCompiler;
import javax.tools.JavaFileObject;
import javax.tools.StandardJavaFileManager;
import javax.tools.ToolProvider;

import org.apache.commons.logging.Log;
import org.apache.commons.logging.LogFactory;
import org.apache.hadoop.mapred.JobConf;

import com.cloudera.sqoop.SqoopOptions;
import com.cloudera.sqoop.util.FileListing;

import com.cloudera.sqoop.util.Jars;

/**
 * Manages the compilation of a bunch of .java files into .class files
 * and eventually a jar.
 *
 * Also embeds this program's jar into the lib/ directory inside the
 * compiled jar to ensure that the job runs correctly.
 */
public class CompilationManager {

  /** If we cannot infer a jar name from a table name, etc., use this. */
  public static final String DEFAULT_CODEGEN_JAR_NAME =
      "sqoop-codegen-created.jar";

  public static final Log LOG = LogFactory.getLog(
      CompilationManager.class.getName());

  private SqoopOptions options;
  private List<String> sources;

  public CompilationManager(final SqoopOptions opts) {
    options = opts;
    sources = new ArrayList<String>();
  }

  public void addSourceFile(String sourceName) {
    sources.add(sourceName);
  }

  /**
   * locate the hadoop-*-core.jar in $HADOOP_HOME or --hadoop-home.
   * If that doesn't work, check our classpath.
   * @return the filename of the hadoop-*-core.jar file.
   */
  private String findHadoopCoreJar() {
    String hadoopHome = options.getHadoopHome();

    if (null == hadoopHome) {
      LOG.info("$HADOOP_HOME is not set");
      return Jars.getJarPathForClass(JobConf.class);
    }

    if (!hadoopHome.endsWith(File.separator)) {
      hadoopHome = hadoopHome + File.separator;
    }

    File hadoopHomeFile = new File(hadoopHome);
    LOG.info("HADOOP_HOME is " + hadoopHomeFile.getAbsolutePath());
    File [] entries = hadoopHomeFile.listFiles();

    if (null == entries) {
      LOG.warn("HADOOP_HOME appears empty or missing");
      return Jars.getJarPathForClass(JobConf.class);
    }

    for (File f : entries) {
      if (f.getName().startsWith("hadoop-")
          && f.getName().endsWith("-core.jar")) {
        LOG.info("Found hadoop core jar at: " + f.getAbsolutePath());
        return f.getAbsolutePath();
      }
    }

    return Jars.getJarPathForClass(JobConf.class);
  }

  /**
   * Compile the .java files into .class files via embedded javac call.
   * On success, move .java files to the code output dir.
   */
  public void compile() throws IOException {
    List<String> args = new ArrayList<String>();

    // ensure that the jar output dir exists.
    String jarOutDir = options.getJarOutputDir();
    File jarOutDirObj = new File(jarOutDir);
    if (!jarOutDirObj.exists()) {
      boolean mkdirSuccess = jarOutDirObj.mkdirs();
      if (!mkdirSuccess) {
        LOG.debug("Warning: Could not make directories for " + jarOutDir);
      }
    } else if (LOG.isDebugEnabled()) {
      LOG.debug("Found existing " + jarOutDir);
    }

    // Make sure jarOutDir ends with a '/'.
    if (!jarOutDir.endsWith(File.separator)) {
      jarOutDir = jarOutDir + File.separator;
    }

    // find hadoop-*-core.jar for classpath.
    String coreJar = findHadoopCoreJar();
    if (null == coreJar) {
      // Couldn't find a core jar to insert into the CP for compilation. If,
      // however, we're running this from a unit test, then the path to the
      // .class files might be set via the hadoop.alt.classpath property
      // instead. Check there first.
      String coreClassesPath = System.getProperty("hadoop.alt.classpath");
      if (null == coreClassesPath) {
        // no -- we're out of options. Fail.
        throw new IOException("Could not find hadoop core jar!");
      } else {
        coreJar = coreClassesPath;
      }
    }

    // find sqoop jar for compilation
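
Stripped of the Sqoop plumbing, the core technique is small. A minimal,
self-contained sketch using only the standard javax.tools and java.util.jar
APIs (file, class, and jar names are made up for illustration):

import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.util.jar.JarOutputStream;
import java.util.zip.ZipEntry;

import javax.tools.JavaCompiler;
import javax.tools.ToolProvider;

public class DynamicCompileDemo {
  public static void main(String[] args) throws IOException {
    // 1. Compile a generated .java file with the embedded compiler.
    //    Requires a JDK (on a bare JRE, getSystemJavaCompiler() returns null).
    JavaCompiler compiler = ToolProvider.getSystemJavaCompiler();
    int rc = compiler.run(null, null, null, "MyGeneratedMapper.java");
    if (rc != 0) {
      throw new IOException("javac failed with exit code " + rc);
    }

    // 2. Package the resulting .class file into a jar that can be shipped
    //    with the job.
    JarOutputStream jar =
        new JarOutputStream(new FileOutputStream("my-generated-job.jar"));
    jar.putNextEntry(new ZipEntry("MyGeneratedMapper.class"));
    FileInputStream in = new FileInputStream("MyGeneratedMapper.class");
    byte[] buf = new byte[4096];
    int len;
    while ((len = in.read(buf)) > 0) {
      jar.write(buf, 0, len);
    }
    in.close();
    jar.closeEntry();
    jar.close();
  }
}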

Why does just tasktracker run under cyg_server account in cygwin?

2012-03-28 Thread Tim.Wu
Hi all

I noticed that when I run bin/start-all.sh in cygwin, the namenode, datanode
and jobtracker all run under the login account, e.g. timwu in my
case. Only the tasktracker runs under cyg_server, the account created by
ssh-host-config when I set up sshd in cygwin.

As I installed cygwin in d:\cygwin, the namenode, datanode and jobtracker all
use d:\\tmp\\hadoop-timwu as their tmp folder, and only the tasktracker uses
d:\\tmp\\hadoop-cyg_server.

Have I configured Hadoop correctly?

Best
Tim


MapReduce on autocomplete

2012-03-28 Thread Tony Burton
I have a lot of small files on S3 that I need to consolidate, so I headed to
Google to see the best way to do it in a MapReduce job. Looks like someone's
got a different idea, according to Google's autocomplete:

[inline image attachment: image001.jpg]




Re: where are my logging output files going to?

2012-03-28 Thread Michael Segel
You don't want users actually running anything directly on the cluster.
You would set up a separate machine just to launch jobs from.
Essentially any sort of Linux machine where you can install the Hadoop
binaries and configs, but which doesn't run any of the cluster daemons...
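
For instance, from such a node a job is submitted with the standard launcher
(jar and class names below are made up):

 hadoop jar my-analysis.jar com.example.MyJob /input/path /output/path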

Sent from my iPhone

On Mar 28, 2012, at 3:30 AM, "Jane Wayne"  wrote:

> what do you mean by an edge node? do you mean any node that is not the
> master node (or NameNode or JobTracker node)?
> 
> On Wed, Mar 28, 2012 at 3:51 AM, Michel Segel 
> wrote:
> 
>> First, you really don't want to launch the job from the cluster itself but
>> from an edge node.
>> 
>> To answer your question, in a word, yes: you should keep as consistent a set
>> of configuration files across the nodes as possible, noting that over time
>> this may not be possible as hardware configs may change.
>> 
>> 
>> Sent from a remote device. Please excuse any typos...
>> 
>> Mike Segel
>> 
>> On Mar 27, 2012, at 8:42 PM, Jane Wayne  wrote:
>> 
>>> if i have a hadoop cluster of 10 nodes, do i have to modify the
>>> /hadoop/conf/log4j.properties files on ALL 10 nodes to be the same?
>>> 
>>> currently, i ssh into the master node to execute a job. this node is the
>>> only place where i have modified the log4j.properties file. i notice that
>>> although my log files are being created, nothing is being written to
>> them.
>>> when i test on cygwin, the logging works, however, when i go to a live
>>> cluster (i.e. amazon elastic mapreduce), the logging output on the master
>>> node no longer works. i wonder if logging is happening at each slave/task
>>> node?
>>> 
>>> could someone explain logging or point me to the documentation discussing
>>> this issue?
>> 


Re: MapReduce on autocomplete

2012-03-28 Thread Harsh J
Looks like your mail client (or the list) stripped away your image
attachment. Could you post the image as a link from imageshack/etc.
instead?

On Wed, Mar 28, 2012 at 10:10 PM, Tony Burton  wrote:
>
> I have a lot of small files on S3 that I need to consolidate, so I headed to 
> Google to see the best way to do it in a MapReduce job. Looks like someone's 
> got a different idea, according to Google's autocomplete:
>
>
>
>
>
>
>




--
Harsh J


Re: Hbase RegionServer stalls on initialization

2012-03-28 Thread N Keywal
It must be waiting for the master. Have you launched the master?

On Wed, Mar 28, 2012 at 7:40 PM, Nabib El-Rahman <
nabib.elrah...@tubemogul.com> wrote:

> Hi Guys,
>
> I'm starting up a region server and it stalls on initialization. I took
> a thread dump and found it hanging on this spot:
>
> "regionserver60020" prio=10 tid=0x7fa90c5c4000 nid=0x4b50 in 
> Object.wait() [0x7fa9101b4000]
>java.lang.Thread.State: TIMED_WAITING (on object monitor)
> at java.lang.Object.wait(Native Method)
> - waiting on <0xbc63b2b8> (a 
> org.apache.hadoop.hbase.MasterAddressTracker)
> at 
> org.apache.hadoop.hbase.zookeeper.ZooKeeperNodeTracker.blockUntilAvailable(ZooKeeperNodeTracker.java:122)
> - locked <0xbc63b2b8> (a 
> org.apache.hadoop.hbase.MasterAddressTracker)
> at 
> org.apache.hadoop.hbase.regionserver.HRegionServer.blockAndCheckIfStopped(HRegionServer.java:516)
> at 
> org.apache.hadoop.hbase.regionserver.HRegionServer.initializeZooKeeper(HRegionServer.java:493)
> at 
> org.apache.hadoop.hbase.regionserver.HRegionServer.initialize(HRegionServer.java:461)
> at 
> org.apache.hadoop.hbase.regionserver.HRegionServer.run(HRegionServer.java:560)
> at java.lang.Thread.run(Thread.java:662)
>
>
>
> Any idea on who or what it's being blocked on?
>
> *Nabib El-Rahman* | Senior Software Engineer
>
> *M:* 734.846.2529
> www.tubemogul.com | *twitter: @nabiber*
>
>  
>  
>
>


Re: Hbase RegionServer stalls on initialization

2012-03-28 Thread N Keywal
Then you should have an error in the master logs.
If not, it's worth checking that the master & the region servers speak to
the same ZK...
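
For example, hbase.zookeeper.quorum in hbase-site.xml should list the same
ensemble on the master and on every region server (hostnames below are made
up):

 <property>
   <name>hbase.zookeeper.quorum</name>
   <value>zk1.example.com,zk2.example.com,zk3.example.com</value>
 </property>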

As it's hbase related, I redirect the question to hbase user mailing list
(hadoop common is in bcc).

On Wed, Mar 28, 2012 at 8:03 PM, Nabib El-Rahman <
nabib.elrah...@tubemogul.com> wrote:

> The master is up. is it possible that zookeeper might not know about it?
>
>
>  *Nabib El-Rahman* | Senior Software Engineer
>
> *M:* 734.846.2529
> www.tubemogul.com | *twitter: @nabiber*
>
>  
>  
>
> On Mar 28, 2012, at 10:42 AM, N Keywal wrote:
>
> It must be waiting for the master. Have you launched the master?
>
> On Wed, Mar 28, 2012 at 7:40 PM, Nabib El-Rahman <
> nabib.elrah...@tubemogul.com> wrote:
>
>> Hi Guys,
>>
>> I'm starting up a region server and it stalls on initialization. I took
>> a thread dump and found it hanging on this spot:
>>
>> "regionserver60020" prio=10 tid=0x7fa90c5c4000 nid=0x4b50 in 
>> Object.wait() [0x7fa9101b4000]
>>
>>java.lang.Thread.State: TIMED_WAITING (on object monitor)
>>
>> at java.lang.Object.wait(Native Method)
>>
>> - waiting on <0xbc63b2b8> (a 
>> org.apache.hadoop.hbase.MasterAddressTracker)
>>
>>
>> at 
>> org.apache.hadoop.hbase.zookeeper.ZooKeeperNodeTracker.blockUntilAvailable(ZooKeeperNodeTracker.java:122)
>>
>>
>> - locked <0xbc63b2b8> (a 
>> org.apache.hadoop.hbase.MasterAddressTracker)
>>
>>
>> at 
>> org.apache.hadoop.hbase.regionserver.HRegionServer.blockAndCheckIfStopped(HRegionServer.java:516)
>>
>>
>> at 
>> org.apache.hadoop.hbase.regionserver.HRegionServer.initializeZooKeeper(HRegionServer.java:493)
>>
>>
>> at 
>> org.apache.hadoop.hbase.regionserver.HRegionServer.initialize(HRegionServer.java:461)
>>
>>
>> at 
>> org.apache.hadoop.hbase.regionserver.HRegionServer.run(HRegionServer.java:560)
>>
>>
>> at java.lang.Thread.run(Thread.java:662)
>>
>>
>>
>> Any idea on who or what it's being blocked on?
>>
>>  *Nabib El-Rahman* | Senior Software Engineer
>>
>> *M:* 734.846.2529
>> www.tubemogul.com | *twitter: @nabiber*
>>
>>  
>>  
>>
>>
>
>


Re: activity on IRC .

2012-03-28 Thread Todd Lipcon
Hey Jay,

That's the only one I know of. Not a lot of idle chatter, but when
people have questions, discussions do start up. Much more active
during PST working hours, of course :)

-Todd

On Wed, Mar 28, 2012 at 8:05 AM, Jay Vyas  wrote:
> Hi guys: I notice the IRC activity is a little low. Just wondering if
> there's a better chat channel for hadoop other than the official one
> (#hadoop on freenode)?
> In any case... I'm on there :) come say hi.
>
> --
> Jay Vyas
> MMSB/UCHC



-- 
Todd Lipcon
Software Engineer, Cloudera


Re: activity on IRC .

2012-03-28 Thread Russell Jurney
I get good answers on Twitter.

Russell Jurney
twitter.com/rjurney
russell.jur...@gmail.com
datasyndrome.com

On Mar 28, 2012, at 12:27 PM, Todd Lipcon  wrote:

> Hey Jay,
>
> That's the only one I know of. Not a lot of idle chatter, but when
> people have questions, discussions do start up. Much more active
> during PST working hours, of course :)
>
> -Todd
>
> On Wed, Mar 28, 2012 at 8:05 AM, Jay Vyas  wrote:
>> Hi guys: I notice the IRC activity is a little low. Just wondering if
>> there's a better chat channel for hadoop other than the official one
>> (#hadoop on freenode)?
>> In any case... I'm on there :) come say hi.
>>
>> --
>> Jay Vyas
>> MMSB/UCHC
>
>
>
> --
> Todd Lipcon
> Software Engineer, Cloudera


Hadoop Roadmap

2012-03-28 Thread Edmon Begoli
I have seen several articles, including a recent one in SD Times, on changes
that are about to happen in Hadoop and significant architectural upgrades.

Does anyone have a good, detailed resource to recommend where these changes
are outlined, including the long-term roadmap?

I am interested in this because I would like to see how data-parallel
algorithms other than M/R, and parallel file system approaches other than
HDFS, could be fitted into this vision.
My team might have resources and interest to contribute.

Regards,
Edmon


Re: Hadoop Roadmap

2012-03-28 Thread Arun C Murthy
Edmon,

 Here is a brief overview:
 http://hadoop.apache.org/common/docs/r0.23.1/

 Ping me if you want more collateral.

thanks,
Arun

On Mar 28, 2012, at 3:35 PM, Edmon Begoli wrote:

> I have seen several articles, including a recent one in SD Times, on changes
> that are about to happen in Hadoop and significant architectural upgrades.
> 
> Does anyone have a good, detailed resource to recommend where these changes
> are outlined, including the long-term roadmap?
> 
> I am interested in this because I would like to see how data-parallel
> algorithms other than M/R, and parallel file system approaches other than
> HDFS, could be fitted into this vision.
> My team might have resources and interest to contribute.
> 
> Regards,
> Edmon

--
Arun C. Murthy
Hortonworks Inc.
http://hortonworks.com/




Possible to poll JobTracker for information from any language?

2012-03-28 Thread Ryan Cole
Hello,

I'm interested in writing a library, to be used with Node.js, that can ask
the JobTracker for information about jobs. I see that this is possible
using the Java API, with the JobClient interface [1]. I also saw that on
the wiki, it mentions that clients can poll the JobTracker for information,
but does not go into detail [2]. Is it possible to get information about
jobs from the JobTracker using C, or C++, or Thrift, or something else?

Thanks,
Ryan


1.
http://hadoop.apache.org/common/docs/r1.0.1/mapred_tutorial.html#Job+Submission+and+Monitoring
2. http://wiki.apache.org/hadoop/JobTracker
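
For reference, the Java-side polling with the old org.apache.hadoop.mapred API
looks roughly like the sketch below (JobTracker host and port are made-up
examples). Anything outside the JVM typically wraps code like this or shells
out, since the JobTracker's RPC protocol is Java-specific.

import java.net.InetSocketAddress;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.mapred.JobClient;
import org.apache.hadoop.mapred.JobStatus;

public class JobTrackerPoller {
  public static void main(String[] args) throws Exception {
    // Connect directly to the JobTracker's RPC port.
    Configuration conf = new Configuration();
    JobClient client = new JobClient(
        new InetSocketAddress("jobtracker.example.com", 9001), conf);

    // Ask for the status of every job the JobTracker knows about.
    JobStatus[] jobs = client.getAllJobs();
    for (JobStatus status : jobs) {
      System.out.println(status.getJobID()
          + " state=" + status.getRunState()
          + " map=" + status.mapProgress()
          + " reduce=" + status.reduceProgress());
    }
  }
}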