Problem killing jobs with Java

2012-06-27 Thread hadoop
Hi Folks, I'm using a Java client to run queries on Hive. Please suggest a way to kill a query whenever I need to, or tell me how I can find the job ID to kill it. Regards Vikas Srivastava
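
For reference, a minimal sketch of doing this from Java with the old mapred API, assuming the client can reach the JobTracker; the address, class name and job ID below are illustrative:

    import org.apache.hadoop.mapred.JobClient;
    import org.apache.hadoop.mapred.JobConf;
    import org.apache.hadoop.mapred.JobID;
    import org.apache.hadoop.mapred.JobStatus;
    import org.apache.hadoop.mapred.RunningJob;

    public class KillHiveJob {                          // illustrative helper
      public static void main(String[] args) throws Exception {
        JobConf conf = new JobConf();
        conf.set("mapred.job.tracker", "jt-host:9001"); // illustrative address
        JobClient client = new JobClient(conf);
        for (JobStatus status : client.getAllJobs()) {  // list IDs to find the query's job
          System.out.println(status.getJobID());
        }
        RunningJob job = client.getJob(JobID.forName(args[0])); // e.g. job_201206270001_0042
        if (job != null) job.killJob();                 // same effect as: hadoop job -kill <id>
      }
    }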

RE: Error while configuring Hadoop 1.3.0 on Windows XP using Cygwin

2012-06-27 Thread hadoop
Dude, check where you fired the Hadoop command from: $HADOOP_HOME/bin/hadoop namenode -format Regards Vikas Srivastava From: Ambreen Mahvash [mailto:ambreen.mahv...@appsassociates.com] Sent: Wednesday, June 27, 2012 3:02 PM To: common-user@hadoop.apache.org Subject: Error while

Oozie workflow file for teragen and terasort

2012-06-23 Thread Hadoop James
I want to be able to submit my teragen and terasort jobs via Oozie. I have tried different things in workflow.xml to no avail. Has anybody had any success doing so? Can you share your workflow.xml file? Many thanks -James
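
A hedged sketch of one way to express this, using Oozie java actions to invoke the example classes; this assumes the hadoop-examples jar sits in the workflow's lib/ directory, and the class names, paths and row count are illustrative:

    <workflow-app name="tera-wf" xmlns="uri:oozie:workflow:0.2">
      <start to="teragen"/>
      <action name="teragen">
        <java>
          <job-tracker>${jobTracker}</job-tracker>
          <name-node>${nameNode}</name-node>
          <main-class>org.apache.hadoop.examples.terasort.TeraGen</main-class>
          <arg>1000000</arg>
          <arg>/user/james/teragen-out</arg>
        </java>
        <ok to="terasort"/>
        <error to="fail"/>
      </action>
      <action name="terasort">
        <java>
          <job-tracker>${jobTracker}</job-tracker>
          <name-node>${nameNode}</name-node>
          <main-class>org.apache.hadoop.examples.terasort.TeraSort</main-class>
          <arg>/user/james/teragen-out</arg>
          <arg>/user/james/terasort-out</arg>
        </java>
        <ok to="end"/>
        <error to="fail"/>
      </action>
      <kill name="fail">
        <message>tera workflow failed</message>
      </kill>
      <end name="end"/>
    </workflow-app>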

Re: Increasing number of Reducers

2012-03-20 Thread bejoy . hadoop
Hi Masoud One reducer definitely emits exactly one output file. If you are looking for just one file as your final result on the local file system, then once the MR job is done use hadoop fs -getmerge <hdfs-dir> <local-file>. Sent from BlackBerry® on Airtel -Original Message- From: Masoud Date: Tue, 20 Mar 2012 19
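
If you'd rather do the merge from Java than from the shell, a minimal sketch using FileUtil.copyMerge; the paths are illustrative:

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.FileUtil;
    import org.apache.hadoop.fs.Path;

    Configuration conf = new Configuration();
    FileSystem hdfs = FileSystem.get(conf);
    FileSystem local = FileSystem.getLocal(conf);
    // Concatenates all part-* files under the job output dir into one local file
    FileUtil.copyMerge(hdfs, new Path("/user/masoud/job-out"),
                       local, new Path("/tmp/result.txt"),
                       false /* keep the source files */, conf, null);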

Re: Increasing number of Reducers

2012-03-20 Thread bejoy . hadoop
Hi Masoud Set -D mapred.reduce.tasks=n, i.e. to any higher value. Sent from BlackBerry® on Airtel -Original Message- From: Masoud Date: Tue, 20 Mar 2012 17:52:58 To: Reply-To: common-user@hadoop.apache.org Subject: Increasing number of Reducers Hi all, we have a cluster with 32 machin
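
The equivalent from a driver class, for reference (old mapred API; the driver class name is hypothetical):

    JobConf conf = new JobConf(MyDriver.class); // MyDriver is hypothetical
    conf.setNumReduceTasks(8);                  // same as -D mapred.reduce.tasks=8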

Fwd: Hive with JDBC

2012-03-16 Thread hadoop hive
-- Forwarded message -- From: hadoop hive Date: Fri, Mar 16, 2012 at 2:04 PM Subject: Hive with JDBC To: u...@hive.apache.org Hi folks, I'm facing a problem: when I fire a query through Java code, it returns around half a million records, which makes the result set hang
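
For context, the usual shape of a Hive JDBC client of that era (HiveServer1), handling rows as they stream from the ResultSet instead of collecting them all; the host and query are illustrative:

    import java.sql.*;

    public class HiveQuery {                        // illustrative client
      public static void main(String[] args) throws Exception {
        Class.forName("org.apache.hadoop.hive.jdbc.HiveDriver");
        Connection con = DriverManager.getConnection(
            "jdbc:hive://hive-host:10000/default", "", "");
        Statement stmt = con.createStatement();
        ResultSet rs = stmt.executeQuery("SELECT * FROM big_table");
        while (rs.next()) {
          // handle each row as it arrives rather than buffering all of them
          System.out.println(rs.getString(1));
        }
        rs.close(); stmt.close(); con.close();
      }
    }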

Re: mapred.tasktracker.map.tasks.maximum not working

2012-03-09 Thread bejoy . hadoop
your job means nothing, because the Hadoop MapReduce platform only checks this parameter when it starts. This is a system configuration: you need to set it in your conf/mapred-site.xml file and restart your Hadoop MapReduce daemons. On Fri, Mar 9, 2012 at 7:32 PM, Mohit Anch
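
For reference, the shape of that entry in conf/mapred-site.xml on each TaskTracker; the value is illustrative, and the TaskTracker must be restarted for it to take effect:

    <property>
      <name>mapred.tasktracker.map.tasks.maximum</name>
      <value>4</value>
    </property>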

Re: mapred.map.tasks vs mapred.tasktracker.map.tasks.maximum

2012-03-09 Thread bejoy . hadoop
Mohit It is a job-level config parameter. For plain map reduce jobs you can set it through the CLI as hadoop jar ... -D mapred.map.tasks=n You should be able to do it in Pig as well. However, the number of map tasks for a job is governed by the input splits and the InputFormat you are

Re: help for snappy

2012-02-26 Thread hadoop hive
OK.. can I build it with Ant? On Mon, Feb 27, 2012 at 12:49 PM, alo alt wrote: > hive? > You are then on the wrong list; for hive related questions refer to: > u...@hive.apache.org > > -- > Alexander Lorenz > http://mapredit.blogspot.com > > On Feb 27, 2012, at

Re: help for snappy

2012-02-26 Thread hadoop hive
> For storing snappy compressed files in HDFS you should use Pig or Flume. > > -- > Alexander Lorenz > http://mapredit.blogspot.com > > On Feb 27, 2012, at 7:28 AM, hadoop hive wrote: > > > thanks Alex, > > > > I'm using Apache hadoop, steps I followed

Re: help for snappy

2012-02-26 Thread hadoop hive
Thanks Alex, I'm using Apache Hadoop. Steps I followed: 1. untar snappy 2. add the entry in mapred-site; can this be used like deflate only (i.e. only on overwriting a file)? On Mon, Feb 27, 2012 at 11:50 AM, alo alt wrote: > Hi, > > > https://ccp.cloudera.com/display/CDHDOC/Snappy+
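
For reference, the mapred-site.xml entries usually involved in compressing intermediate map output with Snappy. Hedged: SnappyCodec is not part of a stock 0.20.2 build, so this assumes a Hadoop build that ships the Snappy codec and native libraries:

    <property>
      <name>mapred.compress.map.output</name>
      <value>true</value>
    </property>
    <property>
      <name>mapred.map.output.compression.codec</name>
      <value>org.apache.hadoop.io.compress.SnappyCodec</value>
    </property>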

help for snappy

2012-02-26 Thread hadoop hive
Hey folks, I'm using Hadoop 0.20.2 (r911707). Please tell me how to install Snappy and how to use it for compression and decompression. Regards Vikas Srivastava

Re: Splitting files on new line using hadoop fs

2012-02-22 Thread bejoy . hadoop
with loading of small XML files that are stored efficiently. Regards Bejoy K S From handheld, Please excuse typos. -Original Message- From: Mohit Anchlia Date: Wed, 22 Feb 2012 12:29:26 To: ; Subject: Re: Splitting files on new line using hadoop fs On Wed, Feb 22, 2012 at 12:2

Re: Splitting files on new line using hadoop fs

2012-02-22 Thread bejoy . hadoop
Hi Mohit AFAIK there is no default mechanism available for this in Hadoop. A file is split into blocks purely based on the configured block size during the HDFS copy. When processing the file with MapReduce, the record reader takes care of the new lines even if a line spans multiple

Changing the replication factor

2012-02-21 Thread hadoop hive
Hi Folks, right now I have replication factor 2, but I want to make it three for some tables. How can I do that for specific tables, so that whenever data is loaded into those tables it is automatically replicated onto three nodes? Or do I need to change replication for all the tables? And
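
For reference: HDFS replication is a per-file setting, so the directories backing specific tables can be raised with hadoop fs -setrep -R 3 <path> without touching anything else. A minimal sketch of the same from Java; the path is illustrative, and since setReplication applies to individual files you would recurse over a directory yourself:

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    Configuration conf = new Configuration();
    FileSystem fs = FileSystem.get(conf);
    // Raise replication to 3 for one file under a table's directory
    fs.setReplication(new Path("/user/hive/warehouse/my_table/part-00000"),
                      (short) 3);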

Re: How do I synchronize Hadoop jobs?

2012-02-15 Thread bejoy . hadoop
Hi McNeill Have a look at Oozie. It is meant for workflow management in Hadoop and can serve your purpose. --Original Message-- From: W.P. McNeill To: Hadoop Mailing List ReplyTo: common-user@hadoop.apache.org Subject: How do I synchronize Hadoop jobs? Sent: Feb 16, 2012 00

Re: Combining MultithreadedMapper threadpool size & map.tasks.maximum

2012-02-10 Thread bejoy . hadoop
count. The <key, value> given to the map is <0, "foo bar beta">. The canonical Hadoop program would tokenize this line of text >> and output <"foo",1> and so on. How would the MultithreadedMapper know >> how to further divide this line of text into, say, <"foo"> and <"bar beta"> for 2 threads

Re: Combining MultithreadedMapper threadpool size & map.tasks.maximum

2012-02-10 Thread bejoy . hadoop
last hour or so http://kickstarthadoop.blogspot.com/2012/02/enable-multiple-threads-in-mapper-aka.html -- Rob On 10 February 2012 14:20, Harsh J wrote: > Hello again, > > On Fri, Feb 10, 2012 at 7:31 PM, Rob Stewart wrote: >> OK, take word count. The <key, value> given to the map is <0, "foo bar beta"

Re: How to setup Hive on a single node ?

2012-02-09 Thread hadoop hive
wrote: > Thanks for your reply ! > > I think I installed Hadoop correctly because when I run the wordcount example I have > correct output. But I didn't know how to install Hive, so I installed Hive > via https://cwiki.apache.org/confluence/display/Hive/GettingStarted which > include installed Hado

Re: is it possible to specify an empty key-value separator for TextOutputFormat?

2012-02-09 Thread hadoop hive
Hey Luca, you can use conf.set("mapred.textoutputformat.separator", " "); hope it works fine regards Vikas Srivastava On Thu, Feb 9, 2012 at 3:57 PM, Luca Pireddu wrote: > Hello list, > > I'm trying to specify from the command line an empty string as the > key-value separator for TextOutpu
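
For the truly empty separator the original question asked about, the same property takes an empty string; a one-line sketch against the same conf object:

    // key and value are then written back to back with nothing between them
    conf.set("mapred.textoutputformat.separator", "");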

Re: HELP - Problem in setting up Hadoop - Multi-Node Cluster

2012-02-09 Thread hadoop hive
machine? > Please also paste the output of the "kingul2" namenode logs. > > Regards, > > Robin > > > On 02/08/12 13:06, Guruprasad B wrote: > > Hi, > > I am Guruprasad from Bangalore (India). I need help in setting up the hadoop > platform. I am very much new

Re: Processing compressed files in Hadoop

2012-02-08 Thread bejoy . hadoop
Reply-To: common-user@hadoop.apache.org Subject: Re: Processing compressed files in Hadoop Hi Bejoy, Thanks for your response. I know how to index LZO files; however, I am curious whether I can still use my custom InputFormats to process the compressed LZO files or if I have to implement new

Re: Preferred ways to specify input and output directories to Hadoop jobs

2012-02-08 Thread bejoy . hadoop
your custom driver classes for each job. Regards Bejoy K S From handheld, Please excuse typos. -Original Message- From: "W.P. McNeill" Date: Wed, 8 Feb 2012 10:17:53 To: ; Subject: Re: Preferred ways to specify input and output directories to Hadoop jobs Right. There is

Re: Preferred ways to specify input and output directories to Hadoop jobs

2012-02-08 Thread bejoy . hadoop
excuse typos. -Original Message- From: "W.P. McNeill" Date: Wed, 8 Feb 2012 10:00:55 To: Hadoop Mailing List Reply-To: common-user@hadoop.apache.org Subject: Preferred ways to specify input and output directories to Hadoop jobs How do you like to specify input and output dire
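
For reference, the conventional pattern in an old-API driver class, taking the directories from the command line; the driver class name is hypothetical:

    JobConf conf = new JobConf(MyDriver.class);             // MyDriver is hypothetical
    FileInputFormat.setInputPaths(conf, new Path(args[0]));
    FileOutputFormat.setOutputPath(conf, new Path(args[1]));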

Re: Processing compressed files in Hadoop

2012-02-08 Thread bejoy . hadoop
Please excuse typos. -Original Message- From: Leonardo Urbina Sender: flechadeor...@gmail.com Date: Wed, 8 Feb 2012 12:39:54 To: Reply-To: common-user@hadoop.apache.org Subject: Processing compressed files in Hadoop Hello everyone, I run a daily job that takes files in a variety of diff

Re: Sorting text data

2012-02-08 Thread bejoy . hadoop
bin/hadoop jar hadoop-0.20.2-examples.jar sort -inFormat org.apache.hadoop.mapred.TextInputFormat /user/sangroya/test1 outtest16 Running on 1 nodes to sort from hdfs://localhost:54310/user/sangroya/test1 into hdfs://localhost:54310/user/sangroya/outtest16 with 1 reduces. Job started: Wed Feb 08 14:53:14

Re: Why does it take so long for all the datanodes to show as live in the JobTracker UI

2012-02-07 Thread hadoop hive
also adding by their IPs Regards Vikas Srivastava On Wed, Feb 8, 2012 at 11:28 AM, Harsh J wrote: > Hi, > > Can you provide your tasktracker startup log as a pastebin.com link? > Also your JT log grepped for "Adding a new node"? > > On Wed, Feb 8, 2012 at 11:

Why does it take so long for all the datanodes to show as live in the JobTracker UI

2012-02-07 Thread hadoop hive
Hi Folks, I added a node to the cluster and restarted the cluster, but it is taking a long time for all the servers to come live in the JobTracker UI; it is only showing the newly added server in the cluster. Is there any specific reason for this? Thanks Vikas Srivastava

Re: What's the best practice of loading logs into hdfs while using hive to do log analytics?

2012-02-07 Thread bejoy . hadoop
> a first start with flume: > http://mapredit.blogspot.com/2011/10/centralized-logfile-management-across.html > > Facebook's scribe could also work for you. > > - Alex > > -- > Alexander Lorenz > http://mapredit.blogspot.com > > On Feb 7, 2012, at 11:03 AM

Re: What's the best practice of loading logs into hdfs while using hive to do log analytics?

2012-02-07 Thread bejoy . hadoop
Subject: What's the best practice of loading logs into hdfs while using hive to do log analytics? Hi all, sorry if it is not appropriate to send one thread to two mailing lists. I'm trying to use Hadoop and Hive to do some log analysis jobs. Our system generates lots of logs every day, for e

Re: Can I write to a compressed file which is located in hdfs?

2012-02-06 Thread bejoy . hadoop
codec='theCodecYouPrefer'. You'd get the blocks compressed in the output dir. You can use the API to read from standard input like: get the Hadoop conf, register the required compression codec, and write to a CompressionOutputStream. You should get a well-detailed explanation of the same from the book '
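
A minimal sketch of that sequence, writing standard input into a compressed HDFS file; gzip stands in for whatever codec you prefer, and the output path is illustrative:

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.io.IOUtils;
    import org.apache.hadoop.io.compress.CompressionCodec;
    import org.apache.hadoop.io.compress.CompressionOutputStream;
    import org.apache.hadoop.io.compress.GzipCodec;
    import org.apache.hadoop.util.ReflectionUtils;

    public class CompressedHdfsWrite {                  // illustrative
      public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();       // 1. get the Hadoop conf
        FileSystem fs = FileSystem.get(conf);
        CompressionCodec codec =                        // 2. instantiate the codec
            ReflectionUtils.newInstance(GzipCodec.class, conf);
        Path out = new Path("/logs/2012-02-06.gz");     // illustrative path
        CompressionOutputStream cout =                  // 3. write through the codec
            codec.createOutputStream(fs.create(out));
        IOUtils.copyBytes(System.in, cout, 4096, true); // closes the stream when done
      }
    }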

Re: How to Set the Value of hadoop.tmp.dir?

2012-02-06 Thread bejoy . hadoop
@hadoop.apache.org ReplyTo: bing...@asu.edu Subject: How to Set the Value of hadoop.tmp.dir? Sent: Feb 7, 2012 02:04 Dear all, I am a new Hadoop learner. The version I use is 1.0.0. I tried to set a new value for the parameter hadoop.tmp.dir, instead of /tmp, in core-site.xml, hdfs-site.xml and mapred
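
For reference, the usual entry in conf/core-site.xml; the directory is illustrative, and the daemons must be restarted afterwards:

    <property>
      <name>hadoop.tmp.dir</name>
      <value>/app/hadoop/tmp</value>
    </property>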

Re: Can I write to a compressed file which is located in hdfs?

2012-02-06 Thread bejoy . hadoop
Reply-To: common-user@hadoop.apache.org > > Subject: Re: Can I write to a compressed file which is located in hdfs? > > > > sorry, this sentence is wrong, > > > > I can't compress these logs every hour and then put them into hdfs. > > > > it should be

Re: Can I write to a compressed file which is located in hdfs?

2012-02-06 Thread bejoy . hadoop
, I can't compress these logs every hour and then put them into hdfs. it should be: I can compress these logs every hour and then put them into hdfs. 2012/2/6 Xiaobin She > > hi all, > > I'm testing hadoop and hive, and I want to use them in log analysis. > > H

Problem in reduce phase (critical)

2012-02-03 Thread hadoop hive
On Fri, Feb 3, 2012 at 4:56 PM, hadoop hive wrote: > hey folks, > > I'm getting this error while running mapreduce, and it comes up in the > reduce phase: > > > 2012-02-03 16:41:19,780 WARN org.apache.hadoop.mapred.ReduceTask: > attempt_201201271626_528

Problem in reduce phase

2012-02-03 Thread hadoop hive
hey folks, I'm getting this error while running mapreduce, and it comes up in the reduce phase: 2012-02-03 16:41:19,780 WARN org.apache.hadoop.mapred.ReduceTask: attempt_201201271626_5282_r_00_0 copy failed: attempt_201201271626_5282_m_07_2 from hadoopdata3 2012-02-03 16:41:19,954 WARN or

Problem when starting datanode

2012-02-02 Thread hadoop hive
hey folks, I'm getting an error when starting my datanode. Does anyone have an idea what this error is about? 2012-02-03 11:57:02,947 ERROR org.apache.hadoop.hdfs.server.datanode.DataNode: DatanodeRegistration( 10.0.3.31:50010, storageID=DS-1677953808-10.0.3.31-50010-1318330317888, infoPort=50075, ipcPo

Re: Reduce > copy at 0.00 MB/s

2012-02-01 Thread hadoop hive
Hey, can anyone help me with this? I have increased the reduce slowstart to .25 but it still hangs after copy. Tell me what else I can change to make it work fine. regards Vikas Srivastava On Wed, Jan 25, 2012 at 7:45 PM, praveenesh kumar wrote: > Yeah, I am doing it, currently its on 2
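
For reference, the property under discussion as it would appear in mapred-site.xml; a value of 0.25 means reducers are launched once 25% of the maps have completed:

    <property>
      <name>mapred.reduce.slowstart.completed.maps</name>
      <value>0.25</value>
    </property>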

Re: ClassNotFound just started with custom mapper

2012-01-30 Thread hadoop hive
> I am facing issues while trying to run a job from Windows (through > eclipse) on my hadoop cluster on my RHEL VMs. When I run it as "run on > hadoop" it works fine, but when I run it as a java application, it throws > a ClassNotFound exception > > INFO: Task

Re: jobtracker URL (critical)

2012-01-29 Thread hadoop hive
couple of days. > > On Friday, January 27, 2012, hadoop hive wrote: > > Hey Harsh, > > > > but after some time they become available one by one in the jobtracker URL. > > > > any idea how they get added so slowly? > > > > regards > > Vikas > > > >

Re: jobtracker URL (critical)

2012-01-27 Thread hadoop hive
communication errors in their logs? Did you > perhaps bring up a firewall accidentally, that was not present before? > > On Fri, Jan 27, 2012 at 4:47 PM, hadoop hive wrote: > > Hey folks, > > > > I'm facing a problem with the job Tracker URL, actually I added a node to

jobtracker URL (critical)

2012-01-27 Thread hadoop hive
Hey folks, I'm facing a problem with the JobTracker URL. I added a node to the cluster and after some time I restarted the cluster; then I found that my JobTracker is showing only the recently added node in *nodes*, and the rest of the nodes are not available, not even in *blacklist*. Does anyone have any i

Re: NoSuchElementException while Reduce step

2012-01-27 Thread hadoop hive
hey, there must be some problem with the key or value; the reducer didn't find the expected value. On Fri, Jan 27, 2012 at 1:23 AM, Rajesh Sai T wrote: > Hi, > > I'm new to Hadoop. I'm trying to write my custom data types for Writable > types, so that the Map class will produce my
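
Since the original question concerns custom Writable types, a minimal sketch of the contract that most often produces "unexpected value" errors when broken: write() and readFields() must handle the fields in exactly the same order. The class and fields are illustrative:

    import java.io.DataInput;
    import java.io.DataOutput;
    import java.io.IOException;
    import org.apache.hadoop.io.Writable;

    public class PairWritable implements Writable {   // illustrative type
      private long id;
      private String name;

      public void write(DataOutput out) throws IOException {
        out.writeLong(id);        // field order here...
        out.writeUTF(name);
      }

      public void readFields(DataInput in) throws IOException {
        id = in.readLong();       // ...must match the order here exactly
        name = in.readUTF();
      }
    }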

Re: Reduce > copy at 0.00 MB/s

2012-01-25 Thread hadoop hive
this problem arose after adding a node, so I started the balancer to rebalance it. On Wed, Jan 25, 2012 at 4:38 PM, praveenesh kumar wrote: > @hadoophive > > Can you explain more by "balance the cluster"? > > Thanks, > Praveenesh > > On Wed, Jan 25, 2

Re: Reduce > copy at 0.00 MB/s

2012-01-25 Thread hadoop hive
I faced the same issue, but after some time when I balanced the cluster the jobs started running fine. On Wed, Jan 25, 2012 at 3:34 PM, praveenesh kumar wrote: > Hey, > > Can anyone explain to me what the reduce > copy phase in the reducer section is? > The (K,List(V)) is passed to the reducer. Is reduce

Re: JobTracker webUI stopped showing suddenly

2012-01-11 Thread hadoop hive
Your JobTracker is not running. On Wed, Jan 11, 2012 at 7:08 PM, praveenesh kumar wrote: > The Jobtracker webUI suddenly stopped showing. It was working fine before. > What could be the issue? Can anyone guide me how I can recover my WebUI? > > Thanks, > Praveenesh >

Re: has bzip2 compression been deprecated?

2012-01-09 Thread bejoy . hadoop
generally, I wonder what the smallest desirable compressed record size is in the hadoop universe. - Tim. From: Tony Burton [tbur...@sportingindex.com] Sent: Monday, January 09, 2012 10:02 AM To: common-user@hadoop.apache.org Subject: RE: has bzip2

How the JobTracker stores TaskTracker information

2011-12-13 Thread hadoop anis
Can anyone please tell me this: I want to know from where the JobTracker sends tasks (task IDs) to a TaskTracker for scheduling, i.e. where it creates the taskid and tasktracker pairs. Thanks & Regards, Mohmmadanis Moulavi Student, MTech (Computer Sci. & Engg.) Walchand college of Engg. Sangli (M.S.)

Re: Matrix multiplication in Hadoop

2011-11-19 Thread bejoy . hadoop
and then the matrix multiplication takes place in there. Regards Bejoy K S -Original Message- From: Mike Spreitzer Date: Fri, 18 Nov 2011 14:52:05 To: Reply-To: common-user@hadoop.apache.org Subject: RE: Matrix multiplication in Hadoop Well, this mismatch may tell me something

Re: Announcing Bangalore Hadoop Meetup Group

2011-11-17 Thread bejoy . hadoop
Original Message- From: Sharad Agarwal Date: Thu, 17 Nov 2011 18:31:08 To: ; ; Reply-To: common-user@hadoop.apache.org Subject: Announcing Bangalore Hadoop Meetup Group Hi Bangalore Area Hadoop Developers and Users, There is a lot of interest in Hadoop and Big Data space in Bangalore. Many folks

Re: Input split for a streaming job!

2011-11-11 Thread bejoy . hadoop
patch. I will try and use 0.21 Raj > >From: Tim Broberg >To: "common-user@hadoop.apache.org" ; Raj V >; Joey Echeverria >Sent: Friday, November 11, 2011 10:25 AM >Subject: RE: Input split for a streaming job! > > > >What

Re: Hadoop 20.2 release compatibility

2011-11-03 Thread bejoy . hadoop
current hadoop 0.20.20x releases. Both the old and new Map Reduce APIs work fine. Hope it helps!.. Regards Bejoy K S -Original Message- From: Amir Sanjar Date: Thu, 3 Nov 2011 12:42:06 To: Reply-To: common-user@hadoop.apache.org Subject: Hadoop 20.2 release compatibility is ther

Re: Bangalore meetups

2011-10-30 Thread bejoy . hadoop
some blog posts, hadoop articles, upcoming in hadoop, new developments in the big data technology stack etc. I'd be happy to contribute something. But I believe hug was created by the Yahoo B'lor team before the last Hadoop India Summit and they'd be having some good amount of hadoop materia

Re: Bangalore meetups

2011-10-30 Thread bejoy . hadoop
Hey There is a hadoop user group in B'lor. But I think it is not really active. You can subscribe to it at hug-blr-subscr...@yahoogroups.com --Original Message-- From: real great.. To: common-user ReplyTo: common-user@hadoop.apache.org Subject: Bangalore meetups Sent: Oct 30,

Re: mapreduce linear chaining: ClassCastException

2011-10-15 Thread bejoy . hadoop
Oct 2011 17:31:27 > To: ; > Reply-To: common-user@hadoop.apache.org > Subject: mapreduce linear chaining: ClassCastException > > Hi all, > I am trying a simple extension of WordCount example in Hadoop. I want to > get a frequency of wordcounts in descending order. To that

Re: mapreduce linear chaining: ClassCastException

2011-10-15 Thread bejoy . hadoop
Message- From: "Periya.Data" Date: Fri, 14 Oct 2011 17:31:27 To: ; Reply-To: common-user@hadoop.apache.org Subject: mapreduce linear chaining: ClassCastException Hi all, I am trying a simple extension of WordCount example in Hadoop. I want to get a frequency of wordcounts in descen

Re: mapreduce linear chaining: ClassCastException

2011-10-15 Thread bejoy . hadoop
Date: Fri, 14 Oct 2011 17:31:27 To: ; Reply-To: common-user@hadoop.apache.org Subject: mapreduce linear chaining: ClassCastException Hi all, I am trying a simple extension of WordCount example in Hadoop. I want to get a frequency of wordcounts in descending order. To that I employ a linear chain

Re: Web crawler in hadoop - unresponsive after a while

2011-10-13 Thread bejoy . hadoop
to leverage the parallel processing power of hadoop. You need to have a mini cluster at least for performance benchmarking and processing relatively large volumes of data. Hope it helps!.. --Original Message-- From: Aishwarya Venkataraman Sender: avenk...@eng.ucsd.edu To: common-user@hadoop.

Re: hdfs directory location

2011-10-10 Thread bejoy . hadoop
> > You mean putting your unix dir contents into hdfs? If so use hadoop fs > -copyFromLocal src destn > --Original Message-- > From: Jignesh Patel > To: common-user@hadoop.apache.org > To: bejoy.had...@gmail.com > Subject: Re: hdfs directory location > Sent: Oc

Re: hdfs directory location

2011-10-10 Thread bejoy . hadoop
Jignesh Sorry I didn't get your query, 'how I can link it with HDFS directory structure? ' You mean putting your unix dir contents into hdfs? If so use hadoop fs -copyFromLocal src destn --Original Message-- From: Jignesh Patel To: common-user@hadoo

Re: Developing MapReduce

2011-10-10 Thread bejoy . hadoop
Hi Mohit I'm really not sure how many of the map reduce developers use the map reduce eclipse plugin. AFAIK majority don't. As Jignesh mentioned you can get it from the hadoop distribution folder as soon as you unzip the same. My suggested approach would be,If you are on Windo

Re: hdfs directory location

2011-10-10 Thread bejoy . hadoop
Jignesh You are creating a dir in hdfs with that command. The dir won't be in your local file system but in hdfs. Issue a command like hadoop fs -ls /user/hadoop-user/citation/ You can see the dir you created in hdfs If you want to create a dir on local unix use a simple linux command

Re: hadoop knowledge gaining

2011-10-07 Thread bejoy . hadoop
you to take the following steps -understand hadoop, hdfs and mapreduce -understand the word count example code you already ran -understand each and every parameter mentioned in the Driver Class along with the Map and Reduce class, get a grip on the parameters used in the map and reduce methods and why. -

Re: SafeModeException: Cannot delete . Name node is in safe mode.

2011-10-05 Thread bejoy . hadoop
Hi Abdelrahman Kamel Your Name Node is in safe mode now. Either wait till it automatically comes out of safe mode, or you can manually make it exit safe mode with the following command: hadoop dfsadmin -safemode leave If your cluster was not put in safe mode manually and it happened

Re: Error using hadoop distcp

2011-10-05 Thread bejoy . hadoop
Message- From: trang van anh Date: Wed, 05 Oct 2011 14:06:11 To: Reply-To: common-user@hadoop.apache.org Subject: Re: Error using hadoop distcp which host runs the task that throws the exception? ensure that each data node knows the other data nodes in the hadoop cluster-> add "ub16" entry

Re: incremental loads into hadoop

2011-10-02 Thread bejoy . hadoop
Sam Your understanding is right, hadoop definitely works great with large volumes of data. But not necessarily every file should be in the range of giga-, tera- or petabytes. Mostly, when it is said that hadoop processes terabytes of data, it is the total data processed by a map reduce job (rather jobs

Re: Maintaining map reduce job logs - The best practices

2011-09-23 Thread bejoy . hadoop
dir, right? Second question: once a job is executed, do the logs from all task trackers get dumped to the HADOOP_HOME/logs/history dir on the name node/job tracker? Third question is how do I enable DEBUG mode for the logger? Or is it enabled by default? If not, what is the logger mode enabled by default

Re: Out of heap space errors on TTs

2011-09-19 Thread bejoy . hadoop
Subject: Out of heap space errors on TTs > To: common-user@hadoop.apache.org > > > Hey guys, > > > > I am running hive and I am trying to join two tables (2.2GB and > > 136MB) on a > > cluster of 9 nodes (replication = 3) > > > > Hadoop version - 0.20.2 > >

Use of Transaction Log

2011-05-04 Thread hadoop maniac
Hello, Can anyone please explain the use of the Transaction Log on the NameNode? AFAIK, it logs the details of files created and deleted in the Hadoop cluster. Thanks.

[HDFS] Size of Metadata

2011-05-03 Thread hadoop maniac
Hello, What is the size of the metadata information if a file occupies a single block (of 64MB) on HDFS? Is there any specific formula with which we can calculate the size of the metadata?

SequenceFile.Sorter Performance

2011-04-23 Thread hadoop
Hi guys, I'm trying to sort a 2.5 GB sequence file in one mapper using its implemented sort function, but it's taking so long that the map is killed for not reporting. I would increase the default time to get reports from the mapper, but I'll do this only if sorting using SequenceFile.So
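
For context, the two knobs usually involved: report progress while the long sort runs, or raise the task timeout. A hedged sketch against the old mapred API; the heartbeat interval and thread shape are illustrative:

    // Option 1: keep the task alive by reporting progress from a side thread
    // while the long SequenceFile.Sorter.sort(...) call runs.
    final Reporter rep = reporter;          // the Reporter passed to map()
    Thread heartbeat = new Thread(new Runnable() {
      public void run() {
        while (!Thread.currentThread().isInterrupted()) {
          rep.progress();                   // resets the task's timeout clock
          try { Thread.sleep(30000); } catch (InterruptedException e) { return; }
        }
      }
    });
    heartbeat.setDaemon(true);
    heartbeat.start();

    // Option 2: raise the timeout (default 600000 ms) in the job conf
    conf.setLong("mapred.task.timeout", 1800000L); // 30 minutes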

File doesn't exist in mapred.local.dir

2011-04-14 Thread hadoop
Hi guys, This is a program which used to work, but I have probably changed something and now it is taking me a lot of time to figure out how the problem is caused. In the mapper: map function: -- ... tempSeq = new Path(job.get("mapred.local.dir")+"/
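
One thing worth checking here: mapred.local.dir may hold a comma-separated list of directories, so concatenating the raw property value into a Path breaks as soon as more than one directory is configured. A minimal sketch of the usual way around that, assuming the task's JobConf is available as job:

    import org.apache.hadoop.fs.LocalDirAllocator;
    import org.apache.hadoop.fs.Path;

    // Picks one usable directory out of the (possibly comma-separated)
    // mapred.local.dir list instead of concatenating the raw value.
    LocalDirAllocator alloc = new LocalDirAllocator("mapred.local.dir");
    Path tempSeq = alloc.getLocalPathForWrite("tempSeq", job);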

Re: Hadoop Version

2011-01-28 Thread hadoop user
Redirecting to common-user. You can check the hadoop version using any of the following methods. CLI: use the hadoop version command: bin/hadoop version Web interface: check the NameNode or JobTracker web interface; it will show the version number. - Ravi On Fri, Jan 28, 2011 at 11:24 AM, wrote

Why do I get SocketTimeoutException?

2011-01-28 Thread hadoop user
What are possible causes due to which I might get SocketTimeoutException ? 11/01/28 19:01:36 INFO hdfs.DFSClient: Exception in createBlockOutputStream java.net.SocketTimeoutException: 69000 millis timeout while waiting for channel to be ready for connect. ch : java.nio.channels.SocketChannel[conn

Re: Benchmarks

2009-07-22 Thread JQ Hadoop
several such efforts. >>> >>> Pig has PigMix >>> >>> Hadoop has terasort and likely some others as well. >>> >> >> Hadoop has the terasort, and grid mix. There is even a new version of the >> grid mix coming out. Look at: >> >> h

Re: Using chainmapper and chain reducer in hadoop

2009-07-18 Thread jason hadoop
would view output only from > Map1 and nothing from Map2 in the logs. > > Regards, > Raakhi > > On Fri, Jul 17, 2009 at 6:59 PM, jason hadoop >wrote: > > > There are some examples for chain mapping in the example bundle for > chapter > > 8 of Pro Hadoop. >

Re: datanode auto down

2009-07-17 Thread jason hadoop
wrote: > Thank you for your reply, I am novice, I do not quite understand how to > check dfs.data.dir more than eight minutes? > My settings are dfs.data.dir > >dfs.data.dir >/ b, / c, / d, / e, / f, / g, / h, / i, / j, / k, / l > > > Each directory has a disk

Re: Using chainmapper and chain reducer in hadoop

2009-07-17 Thread jason hadoop
There are some examples for chain mapping in the example bundle for chapter 8 of Pro Hadoop. One thing that may not be clear is that the chain of mappers executes within the single task jvm assigned for each map task, and the mappers chained to the reduce execute in the jvm assigned to the reduce
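
For concreteness, the usual shape of the old-API chaining calls; the mapper classes here are hypothetical, and each addMapper call's input types must match the previous mapper's output types:

    JobConf job = new JobConf(ChainJob.class);        // ChainJob is hypothetical
    JobConf m1 = new JobConf(false);
    ChainMapper.addMapper(job, TokenizerMapper.class, // hypothetical mapper
        LongWritable.class, Text.class, Text.class, IntWritable.class,
        true, m1);
    JobConf m2 = new JobConf(false);
    ChainMapper.addMapper(job, FilterMapper.class,    // hypothetical mapper
        Text.class, IntWritable.class, Text.class, IntWritable.class,
        true, m2);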

Re: datanode auto down

2009-07-17 Thread jason hadoop
, mingyang wrote: > i using hadoop storage my media files,, but when the number of documents > when more than one million, > Hadoop start about 10-20 minutes, my datanode automatically down, > namenode log shows that the loss of heart, but I see my normal datanode, > port 50010 can be a nor

Re: help with two column sort

2009-07-16 Thread jason hadoop
that let you log what is going on in the field comparator or field partitioner. On Thu, Jul 16, 2009 at 11:05 PM, jason hadoop wrote: > In the example code for Pro Hadoop there are some shims for the > fieldcomparator classes, that let you log what is going on in the > partitioner. &g

Re: help with two column sort

2009-07-16 Thread jason hadoop
In the example code for Pro Hadoop there are some shims for the fieldcomparator classes, that let you log what is going on in the partitioner. Also it is very useful if cumbersome to step through that in the debugger. On Thu, Jul 16, 2009 at 10:59 PM, David_ca wrote: > Hi, > > I am

Re: map side Vs. Reduce side join

2009-07-16 Thread jason hadoop
> > > Map-side join is almost always more efficient, but only handles some > cases. > > Reduce side joins always work, but require a complete map/reduce job to > get > > the join. > > > > -- Owen > > > -- Pro Hadoop, a book to guide you from beginner to

Re: Datanode Cannot Connect To The Server

2009-07-16 Thread jason hadoop
, Boyu Zhang wrote: > Thank you for your suggestion. > > I have done that plenty of times, and every time I delete the pids and the > files /tmp/hadoop-name that the namenode format generated. But I got the same > error over and over. > > > I found out that after I start-dfs.sh,

Re: Compression issues!!

2009-07-15 Thread jason hadoop
e files) then the big file can be processed in parallel by multiple mappers (each processing a split of the file). However if the compression codec that you use does not support file splitting, then the whole file will be processed by one mapper and you won't achieve parallel

Re: more than one reducer in standalone mode

2009-07-14 Thread jason hadoop
Thanks Tom. The single reducer is greatly limiting in local mode. On Tue, Jul 14, 2009 at 3:15 AM, Tom White wrote: > There's a Jira to fix this here: > https://issues.apache.org/jira/browse/MAPREDUCE-434 > > Tom > > On Mon, Jul 13, 2009 at 12:34 AM, jason hadoop > wr

Re: Relation between number of map reduce tasks per node and capacity of namenode

2009-07-14 Thread jason hadoop
-- Pro Hadoop, a book to guide you from beginner to hadoop mastery, http://www.

Re: more than one reducer in standalone mode

2009-07-12 Thread jason hadoop
conf.setNumReduceTasks(4); > > before starting the job and it seems that Hadoop overrides the > variable, as it says: > > 09/07/12 12:07:40 INFO mapred.MapTask: numReduceTasks: 1 > > Thanks! > Rares > -- Pro Hadoop, a book to guide you from beginner to hadoop mastery, http:/

Re: how to compress..!

2009-07-11 Thread jason hadoop
: > A few comments before I answer: > 1) Each time you send an email, we receive two emails. Is your mail client > misconfigured? > 2) You already asked this question in another thread :). See my response > there. > > Short answer: < > > http://hadoop.apache.org/co

Re: .tar.gz as input files

2009-07-11 Thread jason hadoop
this issue and solved it? :) > > > -- > Andraz Tori, CTO > Zemanta Ltd, New York, London, Ljubljana > www.zemanta.com > mail: and...@zemanta.com > tel: +386 41 515 767 > twitter: andraz, skype: minmax_test > > > > -- Pro Hadoop, a book to guide you from begi

Re: Accessing static variables in map function

2009-07-10 Thread jason hadoop
There will be no updates :) On Fri, Jul 10, 2009 at 11:37 AM, Ted Dunning wrote: > And NEVER expect updates to these variables to work like you think. > > On Thu, Jul 9, 2009 at 8:24 PM, jason hadoop > wrote: > > > Then use them as needed in the map method of your mappe

Re: Accessing static variables in map function

2009-07-09 Thread jason hadoop
; > > > Hey Ram. > > The problem is i initialize these variables in the run function after > > receiving the cmd line arguments . > > I want to access the same vars in the mpa function. > > Is there a diff way other than passing the variables through a Conf > ob

Re: Sort by value

2009-07-09 Thread jason hadoop
for the ten most visited pages on a certain site and so on. > > Wondering if that is even possible with hadoop or if I need to process the > file outside of hadoop. > > Cheers > > /Marcus > > -- > Marcus Herou CTO and co-founder Tailsweep AB > +46702561312 >

Re: Merging many output files from reducer

2009-07-08 Thread jason hadoop
In the example code from Pro Hadoop there is a sample map reduce job that uses a map-side join to merge the files into a single output. It is part of the chapter 9 examples. On Wed, Jul 8, 2009 at 4:55 PM, Ted Dunning wrote: > On Wed, Jul 8, 2009 at 3:38 PM, Owen O'Malley wrote: >

Re: permission denied on additional binaries

2009-07-08 Thread jason hadoop
Just out of curiosity, what happens when you run your script by hand? On Wed, Jul 8, 2009 at 8:09 AM, Rares Vernica wrote: > On Tue, Jul 7, 2009 at 10:26 PM, jason hadoop > wrote: > > > > The mapper has no control at the point where your mymapper.sh script is > > runni

Re: Reporter - incrCounter

2009-07-08 Thread jason hadoop
d? > I log the counter call incrCounter. > The log statement counters are fine, but the Reporter counters are not matching. > > Am I missing something? > > -Sagar > > > > > -- Pro Hadoop, a book to guide you from beginner to hadoop mastery, http://www.amazon.com/dp/1430219424?t
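
For reference, the old-API counter idiom being discussed; the enum is illustrative. One common source of such mismatches: log statements appear once per task attempt, including speculative and failed attempts, while the final job counters aggregate only the successful attempts.

    enum MyCounters { PARSED_RECORDS }                    // illustrative group
    ...
    reporter.incrCounter(MyCounters.PARSED_RECORDS, 1);   // inside map()/reduce()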

Re: permission denied on additional binaries

2009-07-07 Thread jason hadoop
would end up in your job's userlog. You could redirect it to a temporary file to make it available in /tmp. Chapter 8 of Pro Hadoop covers some fine details of streaming jobs. It may be that there is something going on in the environment that is resulting in your permission denied error. On

Re: Copy files https -> HDFS

2009-07-07 Thread jason hadoop
to do it in parallel. > > I would guess this has been done before? Is there example code > anywhere? I can imagine creating a mapper-only job with a list of files > as input, but how do I easily write to HDFS from a mapper? > > Thanks, > \EF > -- Pro Hadoop, a boo
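
On the "write to HDFS from a mapper" part, a minimal old-API sketch; the output path, fileName and httpsStream variables are hypothetical stand-ins for the fetched content:

    // Inside map(): stream fetched bytes straight into an HDFS side file.
    FileSystem fs = FileSystem.get(job);              // job: the task's JobConf
    FSDataOutputStream out = fs.create(new Path("/staging/" + fileName));
    IOUtils.copyBytes(httpsStream, out, 4096, true);  // copies and closes both streams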

Re: Parallel maps

2009-07-07 Thread jason hadoop
conf.setSpeculativeExecution(false); > > or > > conf.setMapSpeculativeExecution(false); > conf.setReduceSpeculativeExecution(false); > > Thibaut > > > marcusherou wrote: > > > > Hi. > > > > I've noticed that hadoop spawns parallel copies of the same task on > > different hosts. I've underst

Re: zip files as input

2009-07-07 Thread jason hadoop
lot of smaller zip files (not gzip) that need to be > processed. I put these into a SequenceFile outside of hadoop and then > upload to hdfs. Once in hdfs, I have the mapper read the SequenceFile with > each record being a zip file, then read it in as bytes that get > decompressed, an
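
A sketch of that mapper's inner loop, treating each SequenceFile value as the raw bytes of one zip archive; the key/value types and method body are illustrative:

    import java.io.ByteArrayInputStream;
    import java.util.zip.ZipEntry;
    import java.util.zip.ZipInputStream;

    public void map(Text key, BytesWritable value, OutputCollector<Text, Text> out,
                    Reporter reporter) throws IOException {
      ZipInputStream zin = new ZipInputStream(
          new ByteArrayInputStream(value.getBytes(), 0, value.getLength()));
      ZipEntry entry;
      while ((entry = zin.getNextEntry()) != null) {
        // read entry bytes from zin and emit records for each file in the archive
      }
      zin.close();
    }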

Re: JobTracker hangs after 400-500 jobs

2009-07-06 Thread jason hadoop
; > > > On 7/6/09 12:23 PM, "Songting Chen" wrote: > > > > No response from the Hadoop cluster then - stop/start map/reduce would > solve the problem. > > Note: HDFS has no such issue. > Is it a common problem (we use v.19)? > > Thanks, > -Songti
