A general doubt about what the Hadoop project offers.

2016-09-26 Thread Manuel Enrique Puebla Martínez
Hello: I'm getting to know Hadoop and I have the following question. It is a general doubt about what the Hadoop project offers. "The Apache Hadoop software library is a framework that allows for the distributed processing of large data sets across clusters of computers using a simple progra

Re: Doubt in DoubleWritable

2015-11-23 Thread unmesha sreeveni
Please try this:

  for (DoubleArrayWritable avalue : values) {
    Writable[] value = avalue.get();
    // DoubleWritable[] value = new DoubleWritable[6];
    // for (int k = 0; k < 6; k++) {
    //   value[k] = new DoubleWritable(wvalue[k]);
    // }
    // parse accordingly
    if (Double.parseDouble(value[1].toString()) != 0) {
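For context, a minimal self-contained sketch of the pattern being suggested here (the DoubleArrayWritable subclass is the standard ArrayWritable idiom; the reducer signature and the field index 1 are assumptions taken from the snippet above, not code from the thread):

  import java.io.IOException;
  import org.apache.hadoop.io.ArrayWritable;
  import org.apache.hadoop.io.DoubleWritable;
  import org.apache.hadoop.io.Text;
  import org.apache.hadoop.io.Writable;
  import org.apache.hadoop.mapreduce.Reducer;

  // ArrayWritable must be subclassed to fix the element type; without
  // this the array cannot be deserialized on the reduce side.
  public class DoubleArrayWritable extends ArrayWritable {
    public DoubleArrayWritable() {
      super(DoubleWritable.class);
    }

    // A reducer that unpacks each array value and tests one field,
    // as in the snippet above (field index 1 is an assumption).
    public static class FilterReducer
        extends Reducer<Text, DoubleArrayWritable, Text, DoubleWritable> {
      @Override
      protected void reduce(Text key, Iterable<DoubleArrayWritable> values, Context ctx)
          throws IOException, InterruptedException {
        for (DoubleArrayWritable avalue : values) {
          Writable[] fields = avalue.get();
          double field1 = ((DoubleWritable) fields[1]).get();
          if (field1 != 0) {
            ctx.write(key, new DoubleWritable(field1)); // process the record
          }
        }
      }
    }
  }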

Doubt Regarding QJM protocol - example 2.10.6 of Quorum-Journal Design document

2014-09-28 Thread Giridhar Addepalli
Hi All, I am going through the Quorum Journal Design document. It is mentioned in Section 2.8, in the Accept Recovery RPC section: If the current on-disk log is missing, or a *different length* than the proposed recovery, the JN downloads the log from the provided URI, replacing any current copy of the

Re: Doubt Regarding QJM protocol - example 2.10.6 of Quorum-Journal Design document

2014-09-28 Thread Ulul
Hi, a developer should answer that, but a quick look at an edits file with od suggests that records are not fixed length. So maybe the likelihood of the situation you suggest is so low that there is no need to check more than the file size. Ulul On 28/09/2014 11:17, Giridhar Addepalli wrote: Hi

RE: Doubt regarding Binary Compatibility\Source Compatibility with old *mapred* APIs and new *mapreduce* APIs in Hadoop

2014-04-15 Thread Radhe Radhe
:37:56 -0400 Subject: Re: Doubt regarding Binary Compatibility\Source Compatibility with old *mapred* APIs and new *mapreduce* APIs in Hadoop From: john.meag...@gmail.com To: user@hadoop.apache.org Also, Source Compatibility means ONLY a recompile is needed. No code changes should

Re: Doubt regarding Binary Compatibility\Source Compatibility with old *mapred* APIs and new *mapreduce* APIs in Hadoop

2014-04-15 Thread Zhijie Shen
:56 -0400 Subject: Re: Doubt regarding Binary Compatibility\Source Compatibility with old *mapred* APIs and new *mapreduce* APIs in Hadoop From: john.meag...@gmail.com To: user@hadoop.apache.org Also, Source Compatibility means ONLY a recompile is needed. No code changes should

RE: Doubt regarding Binary Compatibility\Source Compatibility with old *mapred* APIs and new *mapreduce* APIs in Hadoop

2014-04-15 Thread Radhe Radhe
2014 13:03:53 -0700 Subject: Re: Doubt regarding Binary Compatibility\Source Compatibility with old *mapred* APIs and new *mapreduce* APIs in Hadoop From: zs...@hortonworks.com To: user@hadoop.apache.org 1. If you have the binaries that were compiled against MRv1 mapred libs, it should just work

Re: Doubt regarding Binary Compatibility\Source Compatibility with old *mapred* APIs and new *mapreduce* APIs in Hadoop

2014-04-15 Thread Zhijie Shen
file is execute it. -RR -- Date: Tue, 15 Apr 2014 13:03:53 -0700 Subject: Re: Doubt regarding Binary Compatibility\Source Compatibility with old *mapred* APIs and new *mapreduce* APIs in Hadoop From: zs...@hortonworks.com To: user@hadoop.apache.org 1. If you

Doubt regarding Binary Compatibility\Source Compatibility with old *mapred* APIs and new *mapreduce* APIs in Hadoop

2014-04-14 Thread Radhe Radhe
Hello People, As per the Apache site http://hadoop.apache.org/docs/r2.3.0/hadoop-mapreduce-client/hadoop-mapreduce-client-core/MapReduce_Compatibility_Hadoop1_Hadoop2.html Binary Compatibility: First, we ensure binary compatibility to the applications that use old mapred

Re: Doubt regarding Binary Compatibility\Source Compatibility with old *mapred* APIs and new *mapreduce* APIs in Hadoop

2014-04-14 Thread John Meagher
Also, Source Compatibility means ONLY a recompile is needed. No code changes should be needed. On Mon, Apr 14, 2014 at 10:37 AM, John Meagher john.meag...@gmail.com wrote: Source Compatibility = you need to recompile and use the new version as part of the compilation Binary Compatibility
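To make the two kinds of compatibility concrete, a minimal sketch of the same mapper written against each API (package and class names are from the public Hadoop javadocs; the bodies are placeholders):

  import java.io.IOException;
  import org.apache.hadoop.io.LongWritable;
  import org.apache.hadoop.io.Text;

  // Old "mapred" API: jars compiled against this are expected to run
  // unchanged on MRv2 (binary compatibility).
  class OldApiMapper extends org.apache.hadoop.mapred.MapReduceBase
      implements org.apache.hadoop.mapred.Mapper<LongWritable, Text, Text, LongWritable> {
    public void map(LongWritable key, Text value,
                    org.apache.hadoop.mapred.OutputCollector<Text, LongWritable> out,
                    org.apache.hadoop.mapred.Reporter reporter) throws IOException {
      out.collect(value, key);
    }
  }

  // New "mapreduce" API: source compatibility, i.e. this code may need a
  // recompile against the MRv2 jars, but no code changes.
  class NewApiMapper
      extends org.apache.hadoop.mapreduce.Mapper<LongWritable, Text, Text, LongWritable> {
    @Override
    protected void map(LongWritable key, Text value, Context context)
        throws IOException, InterruptedException {
      context.write(value, key);
    }
  }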

Doubt

2014-03-19 Thread sri harsha
Hi all, is it possible to install MongoDB on the same VM that contains Hadoop? -- amiable harsha

Re: Doubt

2014-03-19 Thread Jay Vyas
Certainly it is, and quite common, especially if you have some high-performance machines: they can run as mapreduce slaves and also double as mongo hosts. The problem would of course be that when running mapreduce jobs you might have very slow network bandwidth at times, and if your front end

Re: Doubt

2014-03-19 Thread sri harsha
Thanks Jay and Praveen, I want to use both separately; I don't want to use MongoDB in place of HBase. On Wed, Mar 19, 2014 at 9:25 PM, Jay Vyas jayunit...@gmail.com wrote: Certainly it is, and quite common, especially if you have some high-performance machines: they can run as mapreduce

Re: Doubt

2014-03-19 Thread praveenesh kumar
Why not? It's just a matter of installing 2 different packages. Depending on what you want to use it for, you may need to take care of a few things, but as far as installation is concerned, it should be easily doable. Regards Prav On Wed, Mar 19, 2014 at 3:41 PM, sri harsha rsharsh...@gmail.com

Re: doubt

2014-01-19 Thread Justin Black
I've installed a hadoop single node cluster on a VirtualBox machine running ubuntu 12.04LTS (64-bit) with 512MB RAM and 8GB HD. I haven't seen any errors in my testing yet. Is 1GB RAM required? Will I run into issues when I expand the cluster? On Sat, Jan 18, 2014 at 11:24 PM, Alexander

doubt

2014-01-18 Thread sri harsha
Hi, I want to install a 4-node cluster on 64-bit Linux. Is 4GB RAM and a 500GB HD enough, or will I need to expand? Please advise. Thanks -- amiable harsha

Re: doubt

2014-01-18 Thread Alexander Pivovarov
It's enough. Hadoop uses only 1GB RAM by default. On Sat, Jan 18, 2014 at 10:11 PM, sri harsha rsharsh...@gmail.com wrote: Hi, I want to install a 4-node cluster on 64-bit Linux. Is 4GB RAM and a 500GB HD enough, or will I need to expand? Please advise. Thanks -- amiable
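For reference, the 1GB default mentioned here is the per-daemon heap from conf/hadoop-env.sh in the Hadoop 1.x of this thread (a sketch of the relevant line; 1000 MB is the documented default when the variable is left unset):

  # conf/hadoop-env.sh
  # Maximum heap size for each Hadoop daemon, in MB (defaults to 1000).
  export HADOOP_HEAPSIZE=1000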

Re: Basic Doubt in Hadoop

2013-04-17 Thread Ramesh R Nair
KS Sent from remote device, Please excuse typos -- *From: * Raj Hadoop hadoop...@yahoo.com *Date: *Tue, 16 Apr 2013 21:49:34 -0700 (PDT) *To: *user@hadoop.apache.org *ReplyTo: * user@hadoop.apache.org *Subject: *Basic Doubt in Hadoop Hi

Re: Basic Doubt in Hadoop

2013-04-17 Thread bejoy . hadoop
: Basic Doubt in Hadoop Hi Bejoy, Regarding the output of the Map phase, does Hadoop store it in the local fs or in HDFS? I believe it is the former. Correct me if I am wrong. Regards Ramesh On Wed, Apr 17, 2013 at 10:30 AM, bejoy.had...@gmail.com wrote: The data is in HDFS in case of WordCount

Basic Doubt in Hadoop

2013-04-16 Thread Raj Hadoop
Hi, I am new to Hadoop. I started reading the standard Wordcount program. I got this basic doubt in Hadoop. After the Map - Reduce is done, where is the output generated? Does the reducer output sit on individual DataNodes? Please advise. Thanks, Raj

Re: Basic Doubt in Hadoop

2013-04-16 Thread bejoy . hadoop
: Basic Doubt in Hadoop Hi, I am new to Hadoop. I started reading the standard Wordcount program. I got this basic doubt in Hadoop. After the Map - Reduce is done, where is the output generated? Does the reducer output sit on individual DataNodes? Please advise. Thanks, Raj

Re: fundamental doubt

2012-11-21 Thread Mohammad Tariq
Hello Jamal, For efficient processing, all the values associated with the same key get sorted and go to the same reducer. As a result the reducer gets a key and a list of values as its input. To me your assumption seems correct. Regards, Mohammad Tariq On Thu, Nov 22, 2012 at 1:20 AM,
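A minimal sketch of that contract in the new API (a hypothetical summing reducer, shown only to illustrate the key-plus-values input; it is not code from this thread):

  import java.io.IOException;
  import org.apache.hadoop.io.IntWritable;
  import org.apache.hadoop.io.Text;
  import org.apache.hadoop.mapreduce.Reducer;

  // The framework groups and sorts map output by key, so reduce() is
  // called once per key with an Iterable over all values for that key.
  public class SumReducer extends Reducer<Text, IntWritable, Text, IntWritable> {
    @Override
    protected void reduce(Text key, Iterable<IntWritable> values, Context context)
        throws IOException, InterruptedException {
      int sum = 0;
      for (IntWritable v : values) {
        sum += v.get();
      }
      context.write(key, new IntWritable(sum));
    }
  }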

Re: fundamental doubt

2012-11-21 Thread Bejoy KS
it. Regards Bejoy KS Sent from handheld, please excuse typos. -Original Message- From: jamal sasha jamalsha...@gmail.com Date: Wed, 21 Nov 2012 14:50:51 To: user@hadoop.apache.org Reply-To: user@hadoop.apache.org Subject: fundamental doubt Hi.. I guess I am asking a lot

Re: fundamental doubt

2012-11-21 Thread jamal sasha
@hadoop.apache.org *ReplyTo: * user@hadoop.apache.org *Subject: *fundamental doubt Hi.. I guess I am asking a lot of fundamental questions but I thank you guys for taking out time to explain my doubts. So I am able to write map reduce jobs but here is my doubt. As of now

Re: Doubt on Input and Output Mapper - Key value pairs

2012-11-07 Thread Harsh J
The answer (a) is correct, in general. On Wed, Nov 7, 2012 at 6:09 PM, Ramasubramanian Narayanan ramasubramanian.naraya...@gmail.com wrote: Hi, Which of the following is correct w.r.t. the mapper? (a) It accepts a single key-value pair as input and can emit any number of key-value pairs as

Re: Doubt on Input and Output Mapper - Key value pairs

2012-11-07 Thread Mahesh Balija
Hi Rams, A mapper will accept a single key-value pair as input and can emit 0 or more key-value pairs based on what you want to do in the mapper function (I mean based on your business logic in the mapper function). But the framework will actually aggregate the list of values
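A small sketch of answer (a) in practice (a hypothetical tokenizing mapper: one input pair in, zero or more pairs out, depending on the line):

  import java.io.IOException;
  import java.util.StringTokenizer;
  import org.apache.hadoop.io.IntWritable;
  import org.apache.hadoop.io.LongWritable;
  import org.apache.hadoop.io.Text;
  import org.apache.hadoop.mapreduce.Mapper;

  // map() is invoked once per input key-value pair; an empty line emits
  // no pairs, a line with n tokens emits n pairs.
  public class TokenMapper extends Mapper<LongWritable, Text, Text, IntWritable> {
    private static final IntWritable ONE = new IntWritable(1);

    @Override
    protected void map(LongWritable offset, Text line, Context context)
        throws IOException, InterruptedException {
      StringTokenizer tok = new StringTokenizer(line.toString());
      while (tok.hasMoreTokens()) {
        context.write(new Text(tok.nextToken()), ONE);
      }
    }
  }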

Re: Amateur doubt about Terasort

2012-09-26 Thread Harsh J
Please do not mail general@ with user/dev questions. Use the user@ alias for it in future. The IdentityMapper and IdentityReducer are what TeraSort uses (it is not needed; Hadoop sorts by default, using the default mapper/reducer). On Wed, Sep 26, 2012 at 10:08 PM, Nitin Khandelwal

Re: doubt about reduce tasks and block writes

2012-08-26 Thread Raj Vishwanathan
Message - From: Harsh J ha...@cloudera.com To: common-user@hadoop.apache.org; Raj Vishwanathan rajv...@yahoo.com Cc: Sent: Saturday, August 25, 2012 4:02 AM Subject: Re: doubt about reduce tasks and block writes Raj's almost right. In times of high load or space fillup on a local DN

Re: doubt about reduce tasks and block writes

2012-08-25 Thread Marc Sturlese
Thanks, Raj you got exactly my point. I wanted to confirm this assumption as I was guessing if a shared HDFS cluster with MR and Hbase like this would make sense: http://old.nabble.com/HBase-User-f34655.html

Re: doubt about reduce tasks and block writes

2012-08-25 Thread Harsh J
sure about this, but there could be corner cases in case of node failure and suchlike! I need to look into the code. Raj From: Marc Sturlese marc.sturl...@gmail.com To: hadoop-u...@lucene.apache.org Sent: Friday, August 24, 2012 1:09 PM Subject: doubt about

doubt about reduce tasks and block writes

2012-08-24 Thread Marc Sturlese
Hey there, I have a doubt about reduce tasks and block writes. Does a reduce task always write first to HDFS on the node where it is placed? (and then these blocks would be replicated to other nodes) In case yes, if I have a cluster of 5 nodes, 4 of them run DN and TT and one (node A) just run

Re: doubt about reduce tasks and block writes

2012-08-24 Thread Minh Duc Nguyen
Marc, see my inline comments. On Fri, Aug 24, 2012 at 4:09 PM, Marc Sturlese marc.sturl...@gmail.com wrote: Hey there, I have a doubt about reduce tasks and block writes. Does a reduce task always write first to HDFS on the node where it is placed? (and then these blocks would

Re: doubt about reduce tasks and block writes

2012-08-24 Thread Bertrand Dechoux
on one node without a replica of the data then your node A is as likely as any other to be chosen as a source. Regards Bertrand On Fri, Aug 24, 2012 at 10:09 PM, Marc Sturlese marc.sturl...@gmail.com wrote: Hey there, I have a doubt about reduce tasks and block writes. Does a reduce task always

Re: doubt about reduce tasks and block writes

2012-08-24 Thread Raj Vishwanathan
, August 24, 2012 1:09 PM Subject: doubt about reduce tasks and block writes Hey there, I have a doubt about reduce tasks and block writes. Does a reduce task always write first to HDFS on the node where it is placed? (and then these blocks would be replicated to other nodes) In case yes, if I

Re: doubt on Hadoop job submission process

2012-08-13 Thread Harsh J
for the DistributedCache of the job, if necessary. Copying the job's jar and configuration to the map-reduce system directory on the distributed file-system. Submitting the job to the JobTracker and optionally monitoring its status. I have a doubt about the 4th point of the job execution flow; could any

Re: doubt on Hadoop job submission process

2012-08-13 Thread Manoj Babu
monitoring its status. I have a doubt about the 4th point of the job execution flow; could any of you explain it? What is the job's jar? The job.jar is the jar you supply via hadoop jar <jar>. Technically though, it is the jar pointed to by JobConf.getJar() (set via setJar or setJarByClass calls
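A minimal sketch of the two calls being described (the jar path is a placeholder; both methods are from the old-API JobConf):

  import org.apache.hadoop.mapred.JobConf;

  public class JarConfigDemo {
    public static void main(String[] args) {
      JobConf conf = new JobConf();
      // Either name the jar file explicitly...
      conf.setJar("build/myjob.jar");          // placeholder path
      // ...or let Hadoop locate the jar that contains a given class;
      // JobConf.getJar() then returns whichever was set.
      conf.setJarByClass(JarConfigDemo.class);
    }
  }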

Re: doubt on Hadoop job submission process

2012-08-13 Thread Harsh J
the requisite accounting information for the DistributedCache of the job, if necessary. Copying the job's jar and configuration to the map-reduce system directory on the distributed file-system. Submitting the job to the JobTracker and optionally monitoring its status. I have a doubt

Re: Doubt from the book Definitive Guide

2012-04-05 Thread Mohit Anchlia
On Wed, Apr 4, 2012 at 10:02 PM, Prashant Kommireddi prash1...@gmail.com wrote: Hi Mohit, What would be the advantage? Reducers in most cases read data from all the mappers. In the case where mappers were to write to HDFS, a reducer would still have to read data from other datanodes across

Re: Doubt from the book Definitive Guide

2012-04-05 Thread Jean-Daniel Cryans
On Thu, Apr 5, 2012 at 7:03 AM, Mohit Anchlia mohitanch...@gmail.com wrote: Only advantage I was thinking of was that in some cases reducers might be able to take advantage of data locality and avoid multiple HTTP calls, no? Data is anyways written, so last merged file could go on HDFS instead

Doubt from the book Definitive Guide

2012-04-04 Thread Mohit Anchlia
I am going through the chapter "How MapReduce Works" and have some confusion: 1) The description of the Mapper below says that reducers get the output file using an HTTP call. But the description under "The Reduce Side" doesn't specifically say if it's copied using HTTP. So, first confusion: is the output copied

Re: Doubt from the book Definitive Guide

2012-04-04 Thread Prashant Kommireddi
Answers inline. On Wed, Apr 4, 2012 at 4:56 PM, Mohit Anchlia mohitanch...@gmail.com wrote: I am going through the chapter "How MapReduce Works" and have some confusion: 1) The description of the Mapper below says that reducers get the output file using an HTTP call. But the description under The Reduce

Re: Doubt from the book Definitive Guide

2012-04-04 Thread Harsh J
Hi Mohit, On Thu, Apr 5, 2012 at 5:26 AM, Mohit Anchlia mohitanch...@gmail.com wrote: I am going through the chapter "How MapReduce Works" and have some confusion: 1) The description of the Mapper below says that reducers get the output file using an HTTP call. But the description under "The Reduce Side"

Re: Doubt from the book Definitive Guide

2012-04-04 Thread Mohit Anchlia
On Wed, Apr 4, 2012 at 8:42 PM, Harsh J ha...@cloudera.com wrote: Hi Mohit, On Thu, Apr 5, 2012 at 5:26 AM, Mohit Anchlia mohitanch...@gmail.com wrote: I am going through the chapter "How MapReduce Works" and have some confusion: 1) The description of the Mapper below says that reducers get the

Re: Doubt from the book Definitive Guide

2012-04-04 Thread Prashant Kommireddi
Hi Mohit, What would be the advantage? Reducers in most cases read data from all the mappers. In the case where mappers were to write to HDFS, a reducer would still have to read data from other datanodes across the cluster. Prashant On Apr 4, 2012, at 9:55 PM, Mohit Anchlia

A doubt about integration with tools like Pig, Hive or HBase

2012-03-14 Thread Luiz Antonio Falaguasta Barbosa
Hi people, Please, I would like to ask something a bit more high-level than programming for Hadoop. I will have some students working with Hive, Pig or HBase (I don't know which of them yet) and I would like to know if somebody here has already used Hadoop on Amazon EC2 integrated with one of these other

Re: basic doubt on number of reduce tasks

2012-03-02 Thread Bejoy Ks
is, according to the size of data to be processed on a particular node, a proportionate number of reduce tasks will be run on different nodes. Please can somebody clarify this basic doubt: which is correct? If neither, what is the actual process that takes place -- *Regards* * Vamshi Krishna *
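For the record, the reduce-task count is not derived from per-node data size at all; it is whatever the job asks for (a minimal sketch using the standard API; the count of 4 is arbitrary):

  import org.apache.hadoop.conf.Configuration;
  import org.apache.hadoop.mapreduce.Job;

  public class ReduceCountDemo {
    public static void main(String[] args) throws Exception {
      Job job = new Job(new Configuration(), "example");
      // Set explicitly by the job; it does not scale with input size.
      job.setNumReduceTasks(4);
    }
  }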

Re: Hadoop fair scheduler doubt: allocate jobs to pool

2012-03-01 Thread Merto Mertek
From the fairscheduler docs I assume the following should work: <property> <name>mapred.fairscheduler.poolnameproperty</name> <value>pool.name</value> </property> <property> <name>pool.name</name> <value>${mapreduce.job.group.name}</value> </property> which means that the default pool will be the group of

RE: Hadoop fair scheduler doubt: allocate jobs to pool

2012-03-01 Thread Dave Shine
: Thursday, March 01, 2012 9:33 AM To: common-user@hadoop.apache.org Subject: Re: Hadoop fair scheduler doubt: allocate jobs to pool From the fairscheduler docs I assume the following should work: <property> <name>mapred.fairscheduler.poolnameproperty</name> <value>pool.name</value> </property> <property>

Re: Hadoop fair scheduler doubt: allocate jobs to pool

2012-03-01 Thread Austin Chungath
property on the Job Conf to the name of the pool you want the job to use. Dave -Original Message- From: Merto Mertek [mailto:masmer...@gmail.com] Sent: Thursday, March 01, 2012 9:33 AM To: common-user@hadoop.apache.org Subject: Re: Hadoop fair scheduler doubt: allocate jobs to pool

Re: Hadoop fair scheduler doubt: allocate jobs to pool

2012-03-01 Thread Austin Chungath
Hi, I tried what you had said. I added the following to mapred-site.xml: <property> <name>mapred.fairscheduler.poolnameproperty</name> <value>pool.name</value> </property> <property> <name>pool.name</name> <value>${mapreduce.job.group.name}</value> </property> Funny enough it created a pool with the name

Hadoop fair scheduler doubt: allocate jobs to pool

2012-02-29 Thread Austin Chungath
How can I set the fair scheduler such that all jobs submitted from a particular user group go to a pool with the group name? I have set up the fair scheduler and I have two users: A and B (belonging to the user group hadoop). When these users submit hadoop jobs, the jobs from A go to a pool named A
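Restated as a clean snippet, the configuration the replies in this thread experiment with (classic MR1 fair scheduler; whether ${mapreduce.job.group.name} actually resolves to the submitting user's group was exactly the open question above):

  <!-- mapred-site.xml -->
  <property>
    <name>mapred.fairscheduler.poolnameproperty</name>
    <value>pool.name</value>
  </property>
  <property>
    <name>pool.name</name>
    <value>${mapreduce.job.group.name}</value>
  </property>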

Doubt in Avatarnode?

2011-08-26 Thread shanmuganathan.r
Hi All, I have a doubt about the avatarnode setup. I configured the avatarnode using the patch https://issues.apache.org/jira/browse/HDFS-976 Do I need to configure the NFS filer to share the FSImage file between the active and standby avatarnodes? What are the other configurations needed

Re: [Doubt]: Submission of Mapreduce from outside Hadoop Cluster

2011-07-01 Thread Harsh J
Narayanan, On Fri, Jul 1, 2011 at 11:28 AM, Narayanan K knarayana...@gmail.com wrote: Hi all, We are basically working on a research project and I require some help regarding this. Always glad to see research work being done! What're you working on? :) How do I submit a mapreduce job from

Re: [Doubt]: Submission of Mapreduce from outside Hadoop Cluster

2011-07-01 Thread Harsh J
Narayanan, On Fri, Jul 1, 2011 at 12:57 PM, Narayanan K knarayana...@gmail.com wrote: So the report will be run from a different machine outside the cluster. So we need a way to pass on the parameters to the hadoop cluster (master) and initiate a mapreduce job dynamically. Similarly the output

Re: [Doubt]: Submission of Mapreduce from outside Hadoop Cluster

2011-07-01 Thread Yaozhen Pan
Narayanan, Regarding the client installation, you should make sure that the client and server use the same version of hadoop for submitting jobs and transferring data. If you use a different user on the client than the one that runs the hadoop job, configure the hadoop ugi property (sorry, I forget the exact name). On 2011-7-1

[Doubt]: Submission of Mapreduce from outside Hadoop Cluster

2011-06-30 Thread Narayanan K
Hi all, We are basically working on a research project and I require some help regarding this. I had a few basic doubts regarding submission of Map-Reduce jobs in Hadoop. 1. How do I submit a mapreduce job from outside the cluster, i.e. from a different machine outside the Hadoop
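A minimal client-side sketch of what the replies describe, for the Hadoop 1.x era of this thread (hostnames and paths are placeholders; fs.default.name and mapred.job.tracker are the standard 1.x keys; the identity mapper/reducer defaults are used for brevity):

  import org.apache.hadoop.conf.Configuration;
  import org.apache.hadoop.fs.Path;
  import org.apache.hadoop.io.LongWritable;
  import org.apache.hadoop.io.Text;
  import org.apache.hadoop.mapreduce.Job;
  import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
  import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

  public class RemoteSubmit {
    public static void main(String[] args) throws Exception {
      Configuration conf = new Configuration();
      // Point the client at the remote cluster instead of localhost.
      conf.set("fs.default.name", "hdfs://master.example.com:9000"); // placeholder
      conf.set("mapred.job.tracker", "master.example.com:9001");     // placeholder

      Job job = new Job(conf, "remote-report");
      job.setJarByClass(RemoteSubmit.class); // this jar is shipped to the cluster
      job.setOutputKeyClass(LongWritable.class);
      job.setOutputValueClass(Text.class);
      FileInputFormat.addInputPath(job, new Path("/data/in"));    // placeholder
      FileOutputFormat.setOutputPath(job, new Path("/data/out")); // placeholder

      // Submits to the JobTracker and polls status until completion.
      System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
  }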

Doubt: Regarding running Hadoop on a cluster with shared disk.

2010-05-05 Thread Udaya Lakshmi
Hi, I have an account on a cluster which has a file system similar to NFS. If I create a file on one machine it is shown on all the machines in the cluster. But hadoop works on a cluster of machines wherein each machine has a disk of its own. Can someone please help me use

RE: Doubt: Regarding running Hadoop on a cluster with shared disk.

2010-05-05 Thread Michael Segel
Subject: Doubt: Regarding running Hadoop on a cluster with shared disk. From: udaya...@gmail.com To: common-user@hadoop.apache.org Hi, I have an account on a cluster which has a file system similar to NFS. If I create a file on one machine it is shown on all the machines

Doubt: Using PBS to run mapreduce jobs.

2010-05-04 Thread Udaya Lakshmi
Hi, I am given an account on a cluster which uses OpenPBS as the cluster management software. The only way I can run a job is by submitting it to OpenPBS. How can I run mapreduce programs on it? Is there any possible workaround? Thanks, Udaya.

Re: Doubt: Using PBS to run mapreduce jobs.

2010-05-04 Thread Craig Macdonald
HOD supports a PBS environment, namely Torque. Torque is the vastly improved fork of OpenPBS. You may be able to get HOD working on OpenPBS, or better still persuade your cluster admins to upgrade to a more recent version of Torque (e.g. at least 2.1.x) Craig On 22/07/28164 20:59, Udaya

Re: Doubt: Using PBS to run mapreduce jobs.

2010-05-04 Thread Udaya Lakshmi
Thank you Craig. My cluster has got Torque. Can you please point me to something which has a detailed explanation of using HOD on Torque? On Tue, May 4, 2010 at 10:17 PM, Craig Macdonald cra...@dcs.gla.ac.uk wrote: HOD supports a PBS environment, namely Torque. Torque is the vastly improved

Re: Doubt: Using PBS to run mapreduce jobs.

2010-05-04 Thread Peeyush Bishnoi
Udaya, The following link will help you with HOD on Torque. http://hadoop.apache.org/common/docs/r0.20.0/hod_user_guide.html Thanks, --- Peeyush On Tue, 2010-05-04 at 22:49 +0530, Udaya Lakshmi wrote: Thank you Craig. My cluster has got Torque. Can you please point me to something which has

Re: Doubt: Using PBS to run mapreduce jobs.

2010-05-04 Thread Allen Wittenauer
On May 4, 2010, at 7:46 AM, Udaya Lakshmi wrote: Hi, I am given an account on a cluster which uses OpenPBS as the cluster management software. The only way I can run a job is by submitting it to OpenPBS. How to run mapreduce programs on it? Is there any possible work around? Take a look

Re: Doubt: Using PBS to run mapreduce jobs.

2010-05-04 Thread Udaya Lakshmi
Thank you. Udaya. On Wed, May 5, 2010 at 12:23 AM, Allen Wittenauer awittena...@linkedin.com wrote: On May 4, 2010, at 7:46 AM, Udaya Lakshmi wrote: Hi, I am given an account on a cluster which uses OpenPBS as the cluster management software. The only way I can run a job is by

Doubt about SequenceFile.Writer

2010-02-07 Thread Andiana Squazo Ringa
Hi, I have written to a side-effect file using SequenceFile.Writer. But when I cat the file, it prints some unreadable characters. I did not use any compression codec. Why is this so? Thanks, Ringa.

Re: Doubt about SequenceFile.Writer

2010-02-07 Thread Jeff Zhang
The SequenceFile is not a text file, so you cannot see the content by invoking the Unix command cat. But you can get the text content by using the hadoop command: hadoop fs -text <src> On Sun, Feb 7, 2010 at 8:51 AM, Andiana Squazo Ringa andriana.ri...@gmail.com wrote: Hi, I have written to a
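The Java equivalent of hadoop fs -text, for completeness (a sketch using the SequenceFile.Reader constructor of that era; the path is a placeholder):

  import org.apache.hadoop.conf.Configuration;
  import org.apache.hadoop.fs.FileSystem;
  import org.apache.hadoop.fs.Path;
  import org.apache.hadoop.io.SequenceFile;
  import org.apache.hadoop.io.Writable;
  import org.apache.hadoop.util.ReflectionUtils;

  public class ReadSeqFile {
    public static void main(String[] args) throws Exception {
      Configuration conf = new Configuration();
      FileSystem fs = FileSystem.get(conf);
      Path path = new Path("/tmp/part-00000"); // placeholder path
      SequenceFile.Reader reader = new SequenceFile.Reader(fs, path, conf);
      try {
        // Key and value classes are recorded in the file header.
        Writable key = (Writable) ReflectionUtils.newInstance(reader.getKeyClass(), conf);
        Writable value = (Writable) ReflectionUtils.newInstance(reader.getValueClass(), conf);
        while (reader.next(key, value)) {
          System.out.println(key + "\t" + value);
        }
      } finally {
        reader.close();
      }
    }
  }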

Re: Doubt about SequenceFile.Writer

2010-02-07 Thread Ravi
Thanks a lot Jeff. Ringa. On Sun, Feb 7, 2010 at 10:30 PM, Jeff Zhang zjf...@gmail.com wrote: The SequenceFile is not a text file, so you cannot see the content by invoking the Unix command cat. But you can get the text content by using the hadoop command: hadoop fs -text <src> On Sun, Feb 7,

Input file format doubt

2010-01-28 Thread Udaya Lakshmi
Hi all.. I have searched the documentation but could not find an input file format which will give the line number as the key and the line as the value. Did I miss something? Can someone give me a clue about how to implement such an input file format? Thanks, Udaya.

Re: Input file format doubt

2010-01-28 Thread Amogh Vasekar
Hi, For global line numbers, you would need to know the ordering within each split generated from the input file. The standard input formats provide offsets in splits, so if the records are of equal length you can compute some kind of numbering. I remember someone had implemented sequential
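For the fixed-record-length case described here, the arithmetic is simple enough to sketch in a mapper (RECORD_LEN is an assumption; with TextInputFormat the key is the byte offset of the line, which only converts to a global line number when every record has the same length):

  import java.io.IOException;
  import org.apache.hadoop.io.LongWritable;
  import org.apache.hadoop.io.Text;
  import org.apache.hadoop.mapreduce.Mapper;

  public class LineNumberMapper extends Mapper<LongWritable, Text, LongWritable, Text> {
    // Assumed fixed length of every record, including its newline.
    private static final long RECORD_LEN = 80;

    @Override
    protected void map(LongWritable offset, Text line, Context context)
        throws IOException, InterruptedException {
      long lineNumber = offset.get() / RECORD_LEN + 1; // 1-based
      context.write(new LongWritable(lineNumber), line);
    }
  }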

Re: Input file format doubt

2010-01-28 Thread Ravi
Thank you Amogh. On Thu, Jan 28, 2010 at 3:44 PM, Amogh Vasekar am...@yahoo-inc.com wrote: Hi, For global line numbers, you would need to know the ordering within each split generated from the input file. The standard input formats provide offsets in splits, so if the records are of equal

Re: Input file format doubt

2010-01-28 Thread Ravi
I too had this doubt but could not find the clue. However, please post the code if you can find it. On Thu, Jan 28, 2010 at 4:03 PM, Ravi ravindra.babu.rav...@gmail.com wrote: Thank you Amogh. On Thu, Jan 28, 2010 at 3:44 PM, Amogh Vasekar am...@yahoo-inc.com wrote: Hi, For global line numbers

Re: Input file format doubt

2010-01-28 Thread Amogh Vasekar
Hi, Here's the relevant thread with Gordon, the author of the solution: I am in the process of learning Hadoop (and I think I've made a lot of progress). I have described the specific problem and solution on my blog

Re: Input file format doubt

2010-01-28 Thread Ravi
Thank you Amogh. Ravi. On 1/28/10, Amogh Vasekar am...@yahoo-inc.com wrote: Hi, Here's the relevant thread with Gordon, the author of the solution: I am in the process of learning Hadoop (and I think I've made a lot of progress). I have described the specific problem and solution on my blog

Re: Input file format doubt

2010-01-28 Thread Udaya Lakshmi
Thank you Amogh. I will go through the link. Udaya. On 1/28/10, Ravi ravindra.babu.rav...@gmail.com wrote: Thank you Amogh Ravi. On 1/28/10, Amogh Vasekar am...@yahoo-inc.com wrote: Hi, Here's the relevant thread with Gordon, the author of the solution: I am in the process of learning

Re: Small doubt in MR

2010-01-04 Thread Mridul Muralidharan
) { flag=false; /* section of code */ } } I am running this code in pseudo-distributed mode and it's working fine. I doubt whether this runs correctly in distributed mode because mappers on other systems have to be notified of the changed flag.. Any Comments

Small doubt in MR

2010-01-02 Thread bharath v
*/ } } I am running this code in pseudo-distributed mode and it's working fine. I doubt whether this runs correctly in distributed mode because mappers on other systems have to be notified of the changed flag.. Any Comments? If this is wrong, any suggestions on what method I must

Re: Small doubt in MR

2010-01-02 Thread Mark Kerzner
; /* section of code */ } } I am running this code in pseudo-distributed mode and it's working fine. I doubt whether this runs correctly in distributed mode because mappers on other systems have to be notified of the changed flag.. Any Comments? If this is wrong, any suggestions

Re: Small doubt in MR

2010-01-02 Thread Matei Zaharia
{ if(flag) { flag=false; /* section of code */ } } I am running this code in pseudo-distributed mode and it's working fine. I doubt whether this runs correctly in distributed mode because mappers on other systems have

Re: Small doubt in MR

2010-01-02 Thread brien colwell
. Main-Class { public boolean flag = true; Map-Class { if(flag) { flag=false; /* section of code */ } } I am running this code in pseudo-distributed mode and it's working fine. I doubt whether this runs
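The per-task way to get run-once behavior, which the replies point toward, is the Mapper's setup hook (a sketch of the standard pattern, not code from the thread; a flag field cannot signal across tasks because each task runs in its own JVM, usually on another machine):

  import java.io.IOException;
  import org.apache.hadoop.io.LongWritable;
  import org.apache.hadoop.io.Text;
  import org.apache.hadoop.mapreduce.Mapper;

  public class RunOncePerTaskMapper extends Mapper<LongWritable, Text, Text, LongWritable> {
    // setup() runs exactly once per map task, before any map() call.
    @Override
    protected void setup(Context context) throws IOException, InterruptedException {
      // one-time, per-task initialization goes here
    }

    @Override
    protected void map(LongWritable key, Text value, Context context)
        throws IOException, InterruptedException {
      context.write(value, key);
    }
  }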

Re: Re: Re: Re: Doubt in Hadoop

2009-11-29 Thread aa225
reduce job in command line or IDE? in map reduce mode, you should put the jar containing the map and reduce class in your classpath Jeff Zhang On Fri, Nov 27, 2009 at 2:19 PM, wrote: Hello Everybody, I have a doubt in Hadoop and was wondering

Re: Re: Doubt in Hadoop

2009-11-27 Thread Aaron Kimball
, you should put the jar containing the map and reduce class in your classpath Jeff Zhang On Fri, Nov 27, 2009 at 2:19 PM, wrote: Hello Everybody, I have a doubt in Hadoop and was wondering if anybody has faced a similar problem. I have a package called test. Inside

Re: Doubt in Hadoop

2009-11-26 Thread Jeff Zhang
Do you run the map reduce job in command line or IDE? in map reduce mode, you should put the jar containing the map and reduce class in your classpath Jeff Zhang On Fri, Nov 27, 2009 at 2:19 PM, aa...@buffalo.edu wrote: Hello Everybody, I have a doubt in Hadoop

Re: Re: Doubt in Hadoop

2009-11-26 Thread aa225
in command line or IDE? in map reduce mode, you should put the jar containing the map and reduce class in your classpath Jeff Zhang On Fri, Nov 27, 2009 at 2:19 PM, wrote: Hello Everybody, I have a doubt in Hadoop and was wondering if anybody has faced a similar problem. I

Map Reduce code doubt

2009-10-15 Thread shwitzu

Re: Map Reduce code doubt

2009-10-15 Thread Last-chance Architect
Shwitzu, why can't you just query through a Filesystem object and find the file you want? Lajos shwitzu wrote: Hello All, I was wondering if our map reduce code can just return the location of the file? Or place the actual file in a given output directory by searching based on a keyword.
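A minimal sketch of the FileSystem lookup Lajos suggests (directory and keyword are placeholders):

  import org.apache.hadoop.conf.Configuration;
  import org.apache.hadoop.fs.FileStatus;
  import org.apache.hadoop.fs.FileSystem;
  import org.apache.hadoop.fs.Path;

  public class FindFile {
    public static void main(String[] args) throws Exception {
      FileSystem fs = FileSystem.get(new Configuration());
      // List a directory and match file names against a keyword.
      for (FileStatus status : fs.listStatus(new Path("/data"))) { // placeholder dir
        if (status.getPath().getName().contains("keyword")) {      // placeholder keyword
          System.out.println(status.getPath()); // the file's location
        }
      }
    }
  }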

Doubt in reducer

2009-08-27 Thread Rakhi Khatwani
Hi, I am running a map reduce program which reads data from a file, processes it and writes the output into another file. I run 4 maps and 4 reduces, and my output is as follows: 09/08/27 17:34:37 INFO mapred.JobClient: Running job: job_200908271142_0026 09/08/27 17:34:38 INFO

Re: Doubt in reducer

2009-08-27 Thread Vladimir Klimontovich
But the reducer can do some preparation during the map process. It can distribute map output across the nodes that will work as reducers. Copying and sorting map output is also a time-consuming process (maybe more time-consuming than the reduce itself). For example, a piece of a job run log on a 40-node cluster could be

Re: Doubt in HBase

2009-08-21 Thread Ryan Rawson
processes may go to waste? Is that the case? Also I have one more doubt.. I have 5 values for a corresponding key on one region and the other 2 values on 2 different region servers. Does Hadoop MapReduce take care of moving these 2 diff values to the region with 5 values instead of moving those 5

Re: Doubt in HBase

2009-08-21 Thread Ryan Rawson
doubt.. I have 5 values for a corresponding key on one region and the other 2 values on 2 different region servers. Does Hadoop MapReduce take care of moving these 2 diff values to the region with 5 values instead of moving those 5 values to the other system to minimize the dataflow? Is this what

Re: Doubt in HBase

2009-08-21 Thread bharath vissapragada
the number of reduce tasks is more than the number of distinct map output keys, some of the reduce processes may go to waste? Is that the case? Also I have one more doubt.. I have 5 values for a corresponding key on one region and the other 2 values on 2 different region servers. Does Hadoop Map

Re: Doubt in HBase

2009-08-21 Thread Jonathan Gray
? Is that the case? Also I have one more doubt.. I have 5 values for a corresponding key on one region and the other 2 values on 2 different region servers. Does Hadoop MapReduce take care of moving these 2 diff values to the region with 5 values instead of moving those 5 values to the other system

Re: Doubt in HBase

2009-08-21 Thread bharath vissapragada
? Is that the case? Also I have one more doubt.. I have 5 values for a corresponding key on one region and the other 2 values on 2 different region servers. Does Hadoop MapReduce take care of moving these 2 diff values to the region with 5 values instead of moving those 5 values to the other

Re: Doubt in HBase

2009-08-21 Thread Jonathan Gray
of distinct map output keys, some of the reduce processes may go to waste? Is that the case? Also I have one more doubt.. I have 5 values for a corresponding key on one region and the other 2 values on 2 different region servers. Does Hadoop MapReduce take care of moving these 2 diff values

Re: Doubt in HBase

2009-08-21 Thread bharath vissapragada
very useful. You said to increase the number of reduce tasks. Suppose the number of reduce tasks is more than the number of distinct map output keys; some of the reduce processes may go to waste? Is that the case? Also I have one more doubt.. I have 5 values for a corresponding key

Doubt in HBase

2009-08-20 Thread john smith
Hi all, I have one small doubt. Kindly answer it even if it sounds silly. I am using MapReduce in HBase in distributed mode. I have a table which spans across 5 region servers. I am using TableInputFormat to read the data from the tables in the map. When I run the program, by default how

Re: Doubt in HBase

2009-08-20 Thread Amandeep Khurana
On Thu, Aug 20, 2009 at 9:42 AM, john smith js1987.sm...@gmail.com wrote: Hi all, I have one small doubt. Kindly answer it even if it sounds silly. No questions are silly.. Don't worry I am using MapReduce in HBase in distributed mode. I have a table which spans across 5 region

Re: Doubt in HBase

2009-08-20 Thread Jonathan Gray
have one small doubt. Kindly answer it even if it sounds silly. No questions are silly.. Don't worry I am using MapReduce in HBase in distributed mode. I have a table which spans across 5 region servers. I am using TableInputFormat to read the data from the tables in the map. When I run
