Compile Hadoop 1.0.3 native library failed on mac 10.7.4
Hello, I am trying to compile the Hadoop native library on Mac OS X 10.7.4, with Hadoop 1.0.3. I installed zlib 1.2.7 and LZO 2.0.6 as follows:

./configure -shared --prefix=/usr/local/[zlib/lzo]
make
make install

I checked /usr/local/zlib-1.2.7 and /usr/local/lzo-2.0.6; the header files and libraries are there. I changed my .bash_profile as follows:

export C_INCLUDE_PATH=$C_INCLUDE_PATH:/usr/local/zlib-1.2.7/include:/usr/local/lzo-2.06/include
export LIBRARY_PATH=$LIBRARY_PATH:/usr/local/zlib-1.2.7/lib:/usr/local/lzo-2.06/lib
export CFLAGS="-arch x86_64"

Then I switched to the Hadoop folder and ran:

ant -Dcompile.native=true compile-native

The build fails with:

[exec] checking stddef.h usability... yes
[exec] checking stddef.h presence... yes
[exec] checking for stddef.h... yes
[exec] checking jni.h usability... yes
[exec] checking jni.h presence... yes
[exec] checking for jni.h... yes
[exec] checking zlib.h usability... yes
[exec] checking zlib.h presence... yes
[exec] checking for zlib.h... yes
[exec] checking Checking for the 'actual' dynamic-library for '-lz'...
[exec] configure: error: Can't find either 'objdump' or 'ldd' to compute the dynamic library for '-lz'

BUILD FAILED

Has anyone run into this issue before?

Best regards,
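The configure step that fails here probes for a tool that can list a library's dynamic dependencies; on Linux that is objdump or ldd, but OS X ships otool instead, which Hadoop 1.0.3's configure script does not look for. A minimal Python sketch of that kind of tool probe — note that falling back to otool is my assumption about a possible fix, not what the stock script does:

```python
import shutil

def find_dylib_tool(path=None):
    """Return the first available tool for listing a binary's dynamic
    library dependencies, or None if none is found.  `path` optionally
    overrides the PATH string searched (useful for testing)."""
    for tool in ("objdump", "ldd", "otool"):
        if shutil.which(tool, path=path) is not None:
            return tool
    return None

print(find_dylib_tool())
```

On a stock 10.7 machine only otool would be found, which matches the configure error above.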
memory usage tasks
Silly question, but I have our Hadoop slave boxes configured with 7 mappers each, yet I see 14 Java processes for user mapred on each box. Each process takes up about 2 GB, which equals my memory allocation (mapred.child.java.opts=-Xmx2048m), so it is using twice as much memory as I expected. Why is that?
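A likely explanation is that the per-node process count is the sum of map slots and reduce slots: a TaskTracker with 7 map slots usually also has reduce slots, and each occupied slot runs its own child JVM sized by mapred.child.java.opts. A hedged mapred-site.xml sketch of the relevant settings (the values are illustrative, chosen to match the numbers in the question):

```xml
<!-- mapred-site.xml: per-TaskTracker slot counts (illustrative values).
     7 map slots + 7 reduce slots = up to 14 concurrent child JVMs,
     each with the heap given by mapred.child.java.opts. -->
<property>
  <name>mapred.tasktracker.map.tasks.maximum</name>
  <value>7</value>
</property>
<property>
  <name>mapred.tasktracker.reduce.tasks.maximum</name>
  <value>7</value>
</property>
<property>
  <name>mapred.child.java.opts</name>
  <value>-Xmx2048m</value>
</property>
```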
Sync and Data Replication
I am wondering about the role of sync in replicating data to other nodes. Say a client writes a line to a file in Hadoop; at this point the file handle is open and sync has not been called. In this scenario, is the data also replicated to other nodes as defined by the replication factor? If a crash occurs at this point, do I still have the data on other nodes?
hbase client security (cluster is secure)
Hi all,

I have created a hadoop/hbase/zookeeper cluster that is secured and verified. As a simple test, I connect an HBase client (e.g., the shell) to see its behavior. I get the following message on the HBase master: AccessControlException: authentication is required. Looking at the code, it appears that the client passed the "simple" authentication byte in the RPC header. Why, I don't know. My client configuration is as follows:

hbase-site.xml:
hbase.security.authentication = kerberos
hbase.rpc.engine = org.apache.hadoop.hbase.ipc.SecureRpcEngine

hbase-env.sh:
export HBASE_OPTS="$HBASE_OPTS -Djava.security.auth.login.config=/usr/local/hadoop/hbase/conf/hbase.jaas"

hbase.jaas:
Client {
  com.sun.security.auth.module.Krb5LoginModule required
  useKeyTab=false
  useTicketCache=true
};

I issue kinit for the client I want to use, then invoke the HBase shell. I simply issue 'list' and see the error on the server. Any ideas what I am doing wrong? Thanks so much!

From: Tony Dean
Sent: Tuesday, June 05, 2012 5:41 PM
To: common-user@hadoop.apache.org
Subject: hadoop file permission 1.0.3 (security)

Can someone detail the options that are available to set file permissions at the Hadoop and OS levels? Here's what I have discovered thus far:

dfs.permissions = true|false (works as advertised)
dfs.supergroup = supergroup (works as advertised)
dfs.umaskmode = umask (I believe this should be used in lieu of dfs.umask) - it appears to set the permissions for files created in the Hadoop file system (minus the execute permission). Why was dfs.umask deprecated, and what is the difference between the two?
dfs.datanode.data.dir.perm = perm (not sure this is working at all?) - I thought it was supposed to set permissions on blocks at the OS level.

Are there any other file-permission configuration properties?

What I would really like to do is set data block file permissions at the OS level so that the blocks are locked down from all users except the superuser and supergroup, but can still be accessed through the Hadoop API as specified by HDFS permissions. Is this possible? Thanks.

Tony Dean
SAS Institute Inc.
Senior Software Developer
919-531-6704
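On the dfs.umaskmode point: a umask removes permission bits from a base mode, and files are created without the execute bit, so the effective mode is the base masked by the configured umask. A small sketch of the umask arithmetic itself (the 0666/0777 bases and the 022 sample umask are standard illustrations, not a claim about any particular HDFS version):

```python
def apply_umask(base_mode: int, umask: int) -> int:
    """Clear the umask bits from the base creation mode."""
    return base_mode & ~umask

# Files: base mode without execute bits.
print(oct(apply_umask(0o666, 0o022)))  # 0o644
# Directories: base mode with execute (search) bits.
print(oct(apply_umask(0o777, 0o022)))  # 0o755
```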
Re: decommissioning datanodes
Thanks, this seems to work now. Note that the parameter is 'dfs.hosts', not 'dfs.hosts.include'. (Also, the normal caveats apply, e.g. hostnames are case sensitive.)

-Chris

On Fri, Jun 8, 2012 at 12:19 PM, Serge Blazhiyevskyy <serge.blazhiyevs...@nice.com> wrote:
> Your config should be something like this:
>
> dfs.hosts.exclude
> /opt/hadoop/hadoop-1.0.0/conf/exclude
>
> dfs.hosts.include
> /opt/hadoop/hadoop-1.0.0/conf/include
>
> Add to the exclude file:
> host1
> host2
>
> Add to the include file:
> host1
> host2
> plus the rest of the nodes.
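Pulling the resolution of this thread together: the include-list parameter is dfs.hosts (not dfs.hosts.include), and a node being decommissioned must appear in both the include and exclude files before running 'hadoop dfsadmin -refreshNodes'. A sketch of the hdfs-site.xml entries, using the file paths from this thread:

```xml
<!-- hdfs-site.xml: host include/exclude lists for decommissioning.
     Note the include parameter is dfs.hosts, not dfs.hosts.include. -->
<property>
  <name>dfs.hosts</name>
  <value>/opt/hadoop/hadoop-1.0.0/conf/include</value>
</property>
<property>
  <name>dfs.hosts.exclude</name>
  <value>/opt/hadoop/hadoop-1.0.0/conf/exclude</value>
</property>
```

The include file lists every datanode (including the ones being retired); the exclude file lists only the nodes to decommission; then 'hadoop dfsadmin -refreshNodes' starts the decommission.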
Re: decommissioning datanodes
Your config should be something like this:

dfs.hosts.exclude
/opt/hadoop/hadoop-1.0.0/conf/exclude

dfs.hosts.include
/opt/hadoop/hadoop-1.0.0/conf/include

Add to the exclude file:
host1
host2

Add to the include file:
host1
host2
plus the rest of the nodes.

On 6/8/12 12:15 PM, "Chris Grier" wrote:
> Do you mean the file specified by the 'dfs.hosts' parameter? That is not
> currently set in my configuration (the hosts are only specified in the
> slaves file).
Re: decommissioning datanodes
Do you mean the file specified by the 'dfs.hosts' parameter? That is not currently set in my configuration (the hosts are only specified in the slaves file).

-Chris

On Fri, Jun 8, 2012 at 11:56 AM, Serge Blazhiyevskyy <serge.blazhiyevs...@nice.com> wrote:
> Your nodes need to be in the include and exclude files at the same time.
>
> Do you use both files?
Re: decommissioning datanodes
Your nodes need to be in the include and exclude files at the same time.

Do you use both files?

On 6/8/12 11:46 AM, "Chris Grier" wrote:
> Hello, I'm trying to figure out how to decommission data nodes. [...]
> What am I missing during the decommission process?
decommissioning datanodes
Hello,

I'm trying to figure out how to decommission data nodes. Here's what I do. In hdfs-site.xml I have:

dfs.hosts.exclude
/opt/hadoop/hadoop-1.0.0/conf/exclude

I add to the exclude file:

host1
host2

Then I run 'hadoop dfsadmin -refreshNodes'. On the web interface the two nodes now appear in both the 'Live Nodes' and 'Dead Nodes' lists (but there's nothing in the 'Decommissioning Nodes' list). If I look at the datanode logs on host1 or host2, I still see blocks being copied in, and it does not appear that any additional replication is happening.

What am I missing in the decommission process?

-Chris
Re: Hadoop-Git-Eclipse
Check out these threads:

http://comments.gmane.org/gmane.comp.jakarta.lucene.hadoop.user/22976
http://mail-archives.apache.org/mod_mbox/hadoop-common-user/201012.mbox/%3c4cff292d.3090...@corp.mail.ru%3E

On Fri, Jun 8, 2012 at 6:24 PM, Prajakta Kalmegh wrote:
> Yes, I did configure using the wiki link at
> http://wiki.apache.org/hadoop/EclipseEnvironment. I am facing a new
> problem while setting up Hadoop in pseudo-distributed mode on my laptop:
> the hdfs namenode/datanode and yarn commands all give me a "Hadoop Common
> not found." error. Any idea why? Do I have to add anything to
> hadoop-config.sh or hdfs-config.sh?

--
∞
Shashwat Shriparv
Re: Hadoop-Git-Eclipse
Hi,

Yes, I did configure using the wiki link at http://wiki.apache.org/hadoop/EclipseEnvironment. I am facing a new problem while setting up Hadoop in pseudo-distributed mode on my laptop. I am trying to execute the following commands for setting up Hadoop:

hdfs namenode -format
hdfs namenode
hdfs datanode
yarn resourcemanager
yarn nodemanager

All of them give me a "Hadoop Common not found." error. When I try to use "hadoop namenode -format" instead, it gives me a deprecated-command warning.

I am following the instructions for setting up Hadoop with Eclipse given in:
- http://wiki.apache.org/hadoop/HowToSetupYourDevelopmentEnvironment
- http://hadoop.apache.org/common/docs/r2.0.0-alpha/hadoop-yarn/hadoop-yarn-site/SingleCluster.html

This issue is discussed in JIRA (https://issues.apache.org/jira/browse/HDFS-2014) and is resolved, so I am not sure why I am getting the error.

My environment variables look something like:

HADOOP_COMMON_HOME=/home/Projects/hadoop-common/hadoop-common-project/hadoop-common/target/hadoop-common-3.0.0-SNAPSHOT
HADOOP_CONF_DIR=/home/Projects/hadoop-common/hadoop-common-project/hadoop-common/target/hadoop-common-3.0.0-SNAPSHOT/etc/hadoop
HADOOP_HDFS_HOME=/home/Projects/hadoop-common/hadoop-hdfs-project/hadoop-hdfs/target/hadoop-hdfs-3.0.0-SNAPSHOT
HADOOP_MAPRED_HOME=/home/Projects/hadoop-common/hadoop-mapreduce-project/target/hadoop-mapreduce-3.0.0-SNAPSHOT
YARN_HOME=/home/Projects/hadoop-common/hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-common/target/hadoop-yarn-common-3.0.0-SNAPSHOT
YARN_CONF_DIR=/home/Projects/hadoop-common/hadoop-mapreduce-project/hadoop-yarn/conf

I have included them in the PATH. I am trying to build and set up from the apache hadoop-common git repository (my own cloned fork). Any idea why the 'Hadoop Common not found' error is coming? Do I have to add anything to hadoop-config.sh or hdfs-config.sh?

Regards,
Prajakta

Deniz Demir wrote on 06/08/2012 05:35 PM:
> I did not find that screencast useful. This one worked for me:
> http://wiki.apache.org/hadoop/EclipseEnvironment
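One low-tech way to chase a "Hadoop Common not found" error is to confirm that each expected environment variable is set and points at an existing directory before digging into hadoop-config.sh. A generic sketch — the variable list mirrors the ones named above, and it only checks existence, not whether the right jars are inside:

```python
import os

REQUIRED_VARS = [
    "HADOOP_COMMON_HOME", "HADOOP_CONF_DIR", "HADOOP_HDFS_HOME",
    "HADOOP_MAPRED_HOME", "YARN_HOME", "YARN_CONF_DIR",
]

def check_env(env, required=REQUIRED_VARS):
    """Return (var, problem) pairs for variables that are unset
    or whose value is not an existing directory."""
    problems = []
    for var in required:
        value = env.get(var)
        if not value:
            problems.append((var, "not set"))
        elif not os.path.isdir(value):
            problems.append((var, "not a directory: " + value))
    return problems

for var, problem in check_env(os.environ):
    print("%s: %s" % (var, problem))
```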
Re: Hadoop-Git-Eclipse
I did not find that screencast useful. This one worked for me:

http://wiki.apache.org/hadoop/EclipseEnvironment

Best,
Deniz

On Jun 8, 2012, at 1:08 AM, shashwat shriparv wrote:
> Check out this link:
> http://www.cloudera.com/blog/2009/04/configuring-eclipse-for-hadoop-development-a-screencast/
RE: InvalidJobConfException
By default the job uses TextOutputFormat (a subclass of FileOutputFormat), which checks for an output path. You can use NullOutputFormat, or a custom output format that does nothing, for your job.

Thanks,
Devaraj

From: huanchen.zhang [huanchen.zh...@ipinyou.com]
Sent: Friday, June 08, 2012 4:16 PM
To: common-user
Subject: InvalidJobConfException

> Here I'm developing a MapReduce web crawler which reads url lists and
> writes html to MongoDB. There is no reduce and no output of map. So, how
> to set the output directory in this case?
Re: InvalidJobConfException
Hi Huanchen,

Just set your output format class to NullOutputFormat (http://hadoop.apache.org/common/docs/current/api/org/apache/hadoop/mapreduce/lib/output/NullOutputFormat.html) if you don't need any direct outputs to HDFS etc. from your M/R classes.

On Fri, Jun 8, 2012 at 4:16 PM, huanchen.zhang wrote:
> Here I'm developing a MapReduce web crawler which reads url lists and
> writes html to MongoDB. There is no reduce and no output of map. So, how
> to set the output directory in this case?

--
Harsh J
InvalidJobConfException
Hi,

I'm developing a MapReduce web crawler which reads URL lists and writes the HTML to MongoDB. Each map task reads one URL-list file, fetches the HTML, and inserts it into MongoDB. There is no reduce phase and no map output, so how do I set the output directory in this case? If I do not set the output directory, I get the following exception:

Exception in thread "main" org.apache.hadoop.mapred.InvalidJobConfException: Output directory not set.
    at org.apache.hadoop.mapreduce.lib.output.FileOutputFormat.checkOutputSpecs(FileOutputFormat.java:123)
    at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:872)
    at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:833)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:396)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1127)
    at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:833)
    at org.apache.hadoop.mapreduce.Job.submit(Job.java:476)
    at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:506)
    at com.ipinyou.data.preprocess.mapreduce.ExtractFeatureFromURLJob.main(ExtractFeatureFromURLJob.java:56)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at org.apache.hadoop.util.RunJar.main(RunJar.java:186)

Thank you!

Best,
Huanchen

2012-06-08
huanchen.zhang
Re: Hadoop command not found:hdfs and yarn
Hello,

Can you quickly review your Hadoop install against the page below? It may give you some hints:

http://jugnu-life.blogspot.in/2012/05/hadoop-20-install-tutorial-023x.html

The deprecation warning is expected, since the hadoop command's jobs have now been divided among separate commands.

Regards,
Jagat Singh

On Fri, Jun 8, 2012 at 2:56 PM, Prajakta Kalmegh wrote:
> I am trying to execute the following commands for setting up Hadoop:
> hdfs namenode -format, hdfs namenode, hdfs datanode, yarn resourcemanager,
> yarn nodemanager. It gives me a "Hadoop Command not found." error for all
> the commands.
AUTO: Prabhat Pandey is out of the office (returning 06/28/2012)
I am out of the office until 06/28/2012. For any issues please contact the dispatcher: dbqor...@us.ibm.com. Thanks.

Prabhat Pandey

Note: This is an automated response to your message "Nutch hadoop integration" sent on 06/08/2012 1:59:22. This is the only notification you will receive while this person is away.
Re: Nutch hadoop integration
http://wiki.apache.org/nutch/NutchHadoopTutorial

The above tutorial is not working for me. I am using Nutch 1.4. Can you give me the steps? What property do I have to set in nutch-site.xml?

On Fri, Jun 8, 2012 at 1:34 PM, shashwat shriparv wrote:
> Check out these links:
> http://wiki.apache.org/nutch/NutchHadoopTutorial
> http://wiki.apache.org/nutch/NutchTutorial
> http://joey.mazzarelli.com/2007/07/25/nutch-and-hadoop-as-user-with-nfs/
> http://stackoverflow.com/questions/5301883/run-nutch-on-existing-hadoop-cluster
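On the nutch-site.xml question: the one property Nutch 1.x will not crawl without is http.agent.name, which identifies your crawler to the sites it fetches. A minimal hedged sketch (the agent-name value is illustrative, and this is only the crawl-identification part, not a full Hadoop-integration config):

```xml
<!-- nutch-site.xml: minimal crawl identification.
     http.agent.name must be non-empty or the fetcher aborts. -->
<configuration>
  <property>
    <name>http.agent.name</name>
    <value>MyTestCrawler</value> <!-- illustrative value -->
  </property>
</configuration>
```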
Hadoop command not found:hdfs and yarn
Hi,

I am trying to execute the following commands for setting up Hadoop:

# Format the namenode
hdfs namenode -format
# Start the namenode
hdfs namenode
# Start a datanode
hdfs datanode

yarn resourcemanager
yarn nodemanager

It gives me a "Hadoop Command not found." error for all the commands. When I try to use "hadoop namenode -format" instead, it gives me a deprecated-command warning. Can someone please tell me if I am missing any env variables? I have included HADOOP_COMMON_HOME, HADOOP_HDFS_HOME, HADOOP_MAPRED_HOME, YARN_HOME, HADOOP_CONF_DIR, YARN_CONF_DIR, and HADOOP_PREFIX in my PATH (apart from Java etc.).

I am following the instructions for setting up Hadoop with Eclipse given in:
- http://wiki.apache.org/hadoop/HowToSetupYourDevelopmentEnvironment
- http://hadoop.apache.org/common/docs/r2.0.0-alpha/hadoop-yarn/hadoop-yarn-site/SingleCluster.html

Regards,
Prajakta
Re: Hadoop-Git-Eclipse
Check out this link:

http://www.cloudera.com/blog/2009/04/configuring-eclipse-for-hadoop-development-a-screencast/

Regards

∞
Shashwat Shriparv

On Fri, Jun 8, 2012 at 1:32 PM, Prajakta Kalmegh wrote:
> I have forked Hadoop from github (https://github.com/apache/hadoop-common)
> and need to configure it to work with Eclipse. Can someone please give me
> a link to the steps to be followed for getting Hadoop (latest from trunk)
> started in Eclipse?
Re: Nutch hadoop integration
Check out these links:

http://wiki.apache.org/nutch/NutchHadoopTutorial
http://wiki.apache.org/nutch/NutchTutorial
http://joey.mazzarelli.com/2007/07/25/nutch-and-hadoop-as-user-with-nfs/
http://stackoverflow.com/questions/5301883/run-nutch-on-existing-hadoop-cluster

Regards

∞
Shashwat Shriparv

On Fri, Jun 8, 2012 at 1:29 PM, abhishek tiwari <abhishektiwari.u...@gmail.com> wrote:
> how can i integrate hadood and nutch ..anyone please brief me .
Hadoop-Git-Eclipse
Hi,

I have done MapReduce programming using Eclipse before, but now I need to learn the Hadoop code internals for one of my projects.

I have forked Hadoop from github (https://github.com/apache/hadoop-common) and need to configure it to work with Eclipse. All the links I could find list steps for earlier versions of Hadoop. I am right now following the instructions given in these links:
- http://wiki.apache.org/hadoop/GitAndHadoop
- http://wiki.apache.org/hadoop/EclipseEnvironment
- http://wiki.apache.org/hadoop/HowToContribute

Can someone please give me a link to the steps to be followed for getting Hadoop (latest from trunk) started in Eclipse? I need to be able to commit changes to my forked repository on github.

Thanks in advance.
Regards,
Prajakta
Re: Nutch hadoop integration
> how can i integrate hadood and nutch ..anyone please brief me .

Just configure a Hadoop cluster, then configure Nutch to store the crawl index and crawl list in HDFS. That's it.

--
*Biju*
Re: Nutch hadoop integration
Maybe this will help you, if you have not already checked it:

http://wiki.apache.org/nutch/NutchHadoopTutorial

On Fri, Jun 8, 2012 at 1:29 PM, abhishek tiwari <abhishektiwari.u...@gmail.com> wrote:
> how can i integrate hadood and nutch ..anyone please brief me .

--
Nitin Pawar