Re: Realtime sensor's tcpip data to hadoop

2014-05-14 Thread Hardik Pandya
If I were you, I would ask the following question to get the answer: forget about it for a minute and ask yourself how the TCP/IP data are currently being stored - in a filesystem or an RDBMS? Hadoop is for offline batch processing - if you are looking for a real-time streaming solution, there is Storm (from Twitter)
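
A minimal local-mode Storm topology sketch (Storm 0.9.x, backtype.storm API) illustrating the streaming route; the spout below fabricates readings and is purely hypothetical - a real one would consume the TCP/IP sensor feed:

    import java.util.Map;

    import backtype.storm.Config;
    import backtype.storm.LocalCluster;
    import backtype.storm.spout.SpoutOutputCollector;
    import backtype.storm.task.TopologyContext;
    import backtype.storm.topology.BasicOutputCollector;
    import backtype.storm.topology.OutputFieldsDeclarer;
    import backtype.storm.topology.TopologyBuilder;
    import backtype.storm.topology.base.BaseBasicBolt;
    import backtype.storm.topology.base.BaseRichSpout;
    import backtype.storm.tuple.Fields;
    import backtype.storm.tuple.Tuple;
    import backtype.storm.tuple.Values;
    import backtype.storm.utils.Utils;

    public class SensorTopology {
      // Hypothetical spout: stands in for a component reading the TCP/IP feed.
      public static class SensorSpout extends BaseRichSpout {
        private SpoutOutputCollector collector;

        @Override
        public void open(Map conf, TopologyContext context, SpoutOutputCollector collector) {
          this.collector = collector;
        }

        @Override
        public void nextTuple() {
          Utils.sleep(100); // throttle the fake feed
          collector.emit(new Values(System.currentTimeMillis(), Math.random()));
        }

        @Override
        public void declareOutputFields(OutputFieldsDeclarer declarer) {
          declarer.declare(new Fields("timestamp", "reading"));
        }
      }

      // Terminal bolt: just prints each reading.
      public static class PrintBolt extends BaseBasicBolt {
        @Override
        public void execute(Tuple input, BasicOutputCollector collector) {
          System.out.println(input.getLong(0) + " -> " + input.getDouble(1));
        }

        @Override
        public void declareOutputFields(OutputFieldsDeclarer declarer) {
          // no downstream consumers
        }
      }

      public static void main(String[] args) {
        TopologyBuilder builder = new TopologyBuilder();
        builder.setSpout("sensors", new SensorSpout());
        builder.setBolt("printer", new PrintBolt()).shuffleGrouping("sensors");
        new LocalCluster().submitTopology("sensor-demo", new Config(), builder.createTopology());
      }
    }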

Re: Wordcount file cannot be located

2014-05-02 Thread Hardik Pandya
Please add the below to your config - for some reason the hadoop-common jar is being overwritten - please share your feedback, thanks: config.set("fs.hdfs.impl", org.apache.hadoop.hdfs.DistributedFileSystem.class.getName()); On Fri, May 2, 2014 at 12:08 AM, Alex Lee eliy...@hotmail.com wrote: I tried to
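
A self-contained sketch of that fix, assuming the symptom is a merged or shaded jar clobbering hadoop-common's FileSystem service entries (the NameNode URI and path below are hypothetical):

    import java.net.URI;

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    public class HdfsImplFix {
      public static void main(String[] args) throws Exception {
        Configuration config = new Configuration();
        // Pin the scheme-to-implementation mappings explicitly.
        config.set("fs.hdfs.impl", org.apache.hadoop.hdfs.DistributedFileSystem.class.getName());
        config.set("fs.file.impl", org.apache.hadoop.fs.LocalFileSystem.class.getName());
        FileSystem fs = FileSystem.get(URI.create("hdfs://localhost:9000/"), config);
        System.out.println(fs.exists(new Path("/tmp/in")));
      }
    }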

Re: HttpConfig API changes in hadoop 2.4.0

2014-05-01 Thread Hardik Pandya
You are hitting HDFS-5308 (http://hadoop.apache.org/docs/r2.4.0/hadoop-project-dist/hadoop-hdfs/CHANGES.txt): Replace HttpConfig#getSchemePrefix with implicit schemes in HDFS JSP. (Haohui Mai via jing9) On Wed, Apr 30, 2014 at 6:08 PM, Gaurav Gupta gaurav.gopi...@gmail.com wrote: I am trying

Re: HttpConfig API changes in hadoop 2.4.0

2014-05-01 Thread Hardik Pandya
https://issues.apache.org/jira/browse/HDFS-5308 On Thu, May 1, 2014 at 8:58 AM, Hardik Pandya smarty.ju...@gmail.com wrote: You are hitting HDFS-5308 (http://hadoop.apache.org/docs/r2.4.0/hadoop-project-dist/hadoop-hdfs/CHANGES.txt): Replace HttpConfig#getSchemePrefix with implicit

Re: Using Eclipse for Hadoop code

2014-05-01 Thread Hardik Pandya
I blogged about running a MapReduce application in Eclipse some time back: http://letsdobigdata.wordpress.com/2013/12/07/running-hadoop-mapreduce-application-from-eclipse-kepler/ On Wed, Apr 30, 2014 at 6:53 AM, unmesha sreeveni unmeshab...@gmail.com wrote: Are you asking about standalone mode

Re: Wordcount file cannot be located

2014-05-01 Thread Hardik Pandya
Hi Alex, your Hadoop program configuration is looking into a local filesystem directory. By default core-site.xml points to the local file system (fs.default.name = file:///), hence file:/tmp/in; if the file resides on HDFS, please point fs.default.name to an hdfs:// URI. Configuration conf = getConf();
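
A minimal sketch of the suggested change, with a hypothetical NameNode address:

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    public class FsDefaultCheck {
      public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        conf.set("fs.default.name", "hdfs://localhost:9000"); // deprecated alias of fs.defaultFS in Hadoop 2
        FileSystem fs = FileSystem.get(conf);
        // Paths now resolve against HDFS, not file:///.
        System.out.println(fs.makeQualified(new Path("/tmp/in")));
      }
    }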

Re: Why is HDFS_BYTES_WRITTEN is much larger than HDFS_BYTES_READ in this case?

2014-03-28 Thread Hardik Pandya
What is your compression format - gzip, LZO, or Snappy? For LZO final output: FileOutputFormat.setCompressOutput(conf, true); FileOutputFormat.setOutputCompressorClass(conf, LzoCodec.class); In addition, to make LZO splittable, you need to create an LZO index file. On Thu, Mar 27, 2014 at 8:57 PM,
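
A sketch of a full driver around those two calls (old mapred API, as in the snippet); LzoCodec and the indexer come from the separate hadoop-lzo library, so those class names are assumptions about that dependency:

    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.mapred.FileInputFormat;
    import org.apache.hadoop.mapred.FileOutputFormat;
    import org.apache.hadoop.mapred.JobClient;
    import org.apache.hadoop.mapred.JobConf;

    import com.hadoop.compression.lzo.LzoCodec;

    public class LzoOutputJob {
      public static void main(String[] args) throws Exception {
        JobConf conf = new JobConf(LzoOutputJob.class);
        conf.setJobName("lzo-output");
        FileInputFormat.setInputPaths(conf, new Path(args[0]));
        FileOutputFormat.setOutputPath(conf, new Path(args[1]));
        // Compress the final output with LZO.
        FileOutputFormat.setCompressOutput(conf, true);
        FileOutputFormat.setOutputCompressorClass(conf, LzoCodec.class);
        JobClient.runJob(conf);
        // To make the .lzo output splittable, index it afterwards, e.g.:
        //   hadoop jar hadoop-lzo.jar com.hadoop.compression.lzo.LzoIndexer <output-dir>
      }
    }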

Re: How to get locations of blocks programmatically?

2014-03-28 Thread Hardik Pandya
Have you looked into the FileSystem API? This is Hadoop v2.2.0: http://hadoop.apache.org/docs/r2.2.0/api/org/apache/hadoop/fs/FileSystem.html - it does not exist in http://hadoop.apache.org/docs/r1.2.0/api/org/apache/hadoop/fs/FileSystem.html
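
For reference, a minimal sketch using that API to print block locations (the input path is hypothetical):

    import java.util.Arrays;

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.BlockLocation;
    import org.apache.hadoop.fs.FileStatus;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    public class BlockLocations {
      public static void main(String[] args) throws Exception {
        FileSystem fs = FileSystem.get(new Configuration());
        FileStatus status = fs.getFileStatus(new Path("/tmp/in/data.txt"));
        // One BlockLocation per block, covering the whole file.
        BlockLocation[] blocks = fs.getFileBlockLocations(status, 0, status.getLen());
        for (BlockLocation b : blocks) {
          System.out.println("offset=" + b.getOffset() + " length=" + b.getLength()
              + " hosts=" + Arrays.toString(b.getHosts()));
        }
      }
    }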

Re: reducing HDFS FS connection timeouts

2014-03-28 Thread Hardik Pandya
How about setting ipc.client.connect.max.retries.on.timeouts = 2 (the default is 45)? It indicates the number of retries a client will make on socket timeout to establish a server connection. Does that help? On Thu, Mar 27, 2014 at 4:23 PM, John Lilley john.lil...@redpoint.net wrote: It seems to take a
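
A minimal client-side sketch of that tuning (the values are illustrative, not recommendations):

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileSystem;

    public class FastFailClient {
      public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        conf.setInt("ipc.client.connect.max.retries.on.timeouts", 2); // default 45
        conf.setInt("ipc.client.connect.timeout", 5000);              // ms per attempt, default 20000
        FileSystem fs = FileSystem.get(conf); // now gives up after ~2 timed-out attempts
        System.out.println(fs.getUri());
      }
    }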

Re: Hadoop documentation: control flow and FSM diagrams

2014-03-28 Thread Hardik Pandya
Very helpful indeed, Emilio - thanks! On Fri, Mar 28, 2014 at 12:58 PM, Emilio Coppa erco...@gmail.com wrote: Hi All, I have created a wiki on github: https://github.com/ercoppa/HadoopDiagrams/wiki This is an effort to provide updated documentation of how the internals of Hadoop work.

Re: when it's safe to read map-reduce result?

2014-03-28 Thread Hardik Pandya
If the job completes without any failures, the exitCode should be 0 and it is safe to read the result. public class MyApp extends Configured implements Tool { public int run(String[] args) throws Exception { // Configuration processed by ToolRunner Configuration conf = getConf();
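
One way to complete that skeleton (Hadoop 2 API; the job setup itself is elided, and the class name is taken from the snippet above):

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.conf.Configured;
    import org.apache.hadoop.mapreduce.Job;
    import org.apache.hadoop.util.Tool;
    import org.apache.hadoop.util.ToolRunner;

    public class MyApp extends Configured implements Tool {
      @Override
      public int run(String[] args) throws Exception {
        Configuration conf = getConf(); // configuration processed by ToolRunner
        Job job = Job.getInstance(conf, "my-app");
        // ... set jar, mapper, reducer, input/output paths here ...
        return job.waitForCompletion(true) ? 0 : 1;
      }

      public static void main(String[] args) throws Exception {
        int exitCode = ToolRunner.run(new MyApp(), args);
        System.exit(exitCode); // 0 means the job succeeded and the output is safe to read
      }
    }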

Re: /home/r9r/hadoop-2.2.0/bin/hadoop: line 133: /usr/java/default/bin/java: No such file or directory

2014-01-08 Thread Hardik Pandya
Your JAVA_HOME is not set correctly - it's still looking under /usr/java/default/bin/java. In your hadoop-env.sh, JAVA_HOME should be /usr/lib/jvm/java-1.7.0/jre/. Does your $PATH include the correct ${JAVA_HOME}? On Wed, Jan 8, 2014 at 3:12 PM, Allen, Ronald L. allen...@ornl.gov wrote: Hello

Re: Content of FSImage

2014-01-07 Thread Hardik Pandya
Yes - The entire file system namespace, including the mapping of blocks to files and file system properties, is stored in a file called the FsImage. The FsImage is stored as a file in the NameNode’s local file system too. When the NameNode starts up, it reads the FsImage and EditLog from disk,

Re: Error: Could not find or load main class hdfs

2014-01-07 Thread Hardik Pandya
Does the input directory exist in HDFS? You can check with hadoop fs -ls. On Tue, Jan 7, 2014 at 11:16 AM, Allen, Ronald L. allen...@ornl.gov wrote: Hello, I am trying to run the WordCount example using Hadoop 2.2.0 on a single node. I tried to follow the directions from

Re: Understanding MapReduce source code : Flush operations

2014-01-06 Thread Hardik Pandya
Please do not tell me that for the last 2.5 years you have not used a virtual Hadoop environment to debug your MapReduce application before deploying to a production environment. No one can stop you from looking at the code - Hadoop and its ecosystem are open source. On Mon, Jan 6, 2014 at 9:35 AM, nagarjuna

Re: Fine tunning

2014-01-06 Thread Hardik Pandya
Can you please share how you are doing the lookup? On Mon, Jan 6, 2014 at 4:23 AM, Ranjini Rathinam ranjinibe...@gmail.com wrote: Hi, I have an input file with 16 fields in it. Using MapReduce code I need to load the HBase tables. The first eight have to go into one table in HBase and the last

Re: Spill Failed Caused by ArrayIndexOutOfBoundsException

2014-01-06 Thread Hardik Pandya
The error is happening during the sort-and-spill phase (org.apache.hadoop.mapred.MapTask$MapOutputBuffer.sortAndSpill). It seems like you are trying to compare two int values and it fails during compare: Caused by: java.lang.ArrayIndexOutOfBoundsException: 99614720 at
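
If a custom raw comparator is involved, here is a sketch of a bounds-safe one for IntWritable keys - a comparator that reads past the offsets it is handed is a classic cause of this exception inside sortAndSpill (the class is hypothetical, for illustration):

    import org.apache.hadoop.io.IntWritable;
    import org.apache.hadoop.io.WritableComparator;

    public class SafeIntComparator extends WritableComparator {
      public SafeIntComparator() {
        super(IntWritable.class);
      }

      @Override
      public int compare(byte[] b1, int s1, int l1, byte[] b2, int s2, int l2) {
        // IntWritable serializes as exactly 4 bytes starting at the given offset;
        // read only those bytes instead of scanning beyond (s, l).
        int v1 = readInt(b1, s1);
        int v2 = readInt(b2, s2);
        return Integer.compare(v1, v2);
      }
    }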

Re: How to remove slave nodes?

2014-01-04 Thread Hardik Pandya
You can start/stop a Hadoop daemon manually on a machine via bin/hadoop-daemon.sh start/stop [namenode | secondarynamenode | datanode | jobtracker | tasktracker]. On Fri, Jan 3, 2014 at 11:47 AM, navaz navaz@gmail.com wrote: How to remove one of the slave nodes? I have a namenode (

Re: How to remove slave nodes?

2014-01-04 Thread Hardik Pandya
Also, you can exclude the data nodes via conf/hdfs-site.xml: dfs.hosts / dfs.hosts.exclude hold the list of permitted/excluded DataNodes. If necessary, use these files to control the list of allowable datanodes. On Sat, Jan 4, 2014 at 12:37 PM, Hardik Pandya smarty.ju...@gmail.com wrote: You can start

Re: LocalResource size/time limits

2014-01-04 Thread Hardik Pandya
Maybe this would clarify some aspects of your questions: Resource Localization in YARN Deep Dive (http://hortonworks.com/blog/resource-localization-in-yarn-deep-dive/). The threshold for local files is dictated by the configuration property yarn.nodemanager.localizer.cache.target-size-mb, described

Re: Map succeeds but reduce hangs

2014-01-01 Thread Hardik Pandya
org.apache.hadoop.mapred.JobInProgress: Task 'attempt_201401010908_0001_m_03_0' has completed task_201401010908_0001_m_03 successfully. On Tue, Dec 31, 2013 at 4:56 PM, Hardik Pandya smarty.ju...@gmail.com wrote: as expected, it's failing during shuffle; it seems like hdfs could not resolve

Re: Map succeeds but reduce hangs

2013-12-31 Thread Hardik Pandya
What does your job log say? Is your hdfs-site configured properly to find the 3 data nodes? This could very well be getting stuck in the shuffle phase. Last thing to try: do stop-all and start-all help? Even worse, try formatting the namenode. On Tue, Dec 31, 2013 at 11:40 AM, navaz navaz@gmail.com

Re: block replication

2013-12-31 Thread Hardik Pandya
<property> <name>dfs.heartbeat.interval</name> <value>3</value> <description>Determines datanode heartbeat interval in seconds.</description> </property> and maybe you are looking for <property> <name>dfs.namenode.stale.datanode.interval</name> <value>3</value> <description>Default time interval

Re: Request for a pointer to a MapReduce Program tutorial

2013-12-27 Thread Hardik Pandya
I recently blogged about it - hope it helps http://letsdobigdata.wordpress.com/2013/12/07/running-hadoop-mapreduce-application-from-eclipse-kepler/ Regards, Hardik On Fri, Dec 27, 2013 at 6:53 AM, Sitaraman Vilayannur vrsitaramanietfli...@gmail.com wrote: Hi, Would much appreciate a

Re: Error starting hadoop-2.2.0

2013-12-12 Thread Hardik Pandya
Do you have multiple or mixed-version SLF4J jars in your classpath? How about downgrading your SLF4J to 1.5.5 or 1.5.6? Please let me know how it works out for you, thanks. From the warning, the slf4j-api version does not match that of the binding. An SLF4J binding designates an artifact such as

Re: Versioninfo and platformName issue.

2013-12-11 Thread Hardik Pandya
It's a classpath issue; also make sure your PATH is correct: export HIVE_HOME=/home/username/yourhivedir and export PATH=$HIVE_HOME/bin:$PATH. On Wed, Dec 11, 2013 at 9:37 AM, Manish manishbh...@rocketmail.com wrote: Adam, here is what I get when I run $ hadoop version: Hadoop 2.0.0-cdh4.4.0

Re: multiusers in hadoop through LDAP

2013-12-10 Thread Hardik Pandya
Have you looked at hadoop.security.group.mapping.ldap.* in hadoop-common/core-default.xml (http://hadoop.apache.org/docs/current2/hadoop-project-dist/hadoop-common/core-default.xml)? Additional
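
The properties in question, shown via the Configuration API purely for illustration (in practice they belong in core-site.xml on the server side; the LDAP URL and DNs below are hypothetical):

    import org.apache.hadoop.conf.Configuration;

    public class LdapGroupMappingConfig {
      public static void main(String[] args) {
        Configuration conf = new Configuration();
        conf.set("hadoop.security.group.mapping",
            "org.apache.hadoop.security.LdapGroupsMapping");
        conf.set("hadoop.security.group.mapping.ldap.url", "ldap://ldap.example.com:389");
        conf.set("hadoop.security.group.mapping.ldap.bind.user",
            "cn=hadoop,ou=services,dc=example,dc=com");
        conf.set("hadoop.security.group.mapping.ldap.base", "dc=example,dc=com");
        System.out.println(conf.get("hadoop.security.group.mapping"));
      }
    }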