Hello:
I'm new to Hadoop and I have the following question.
It is a general question about what the Hadoop project offers.
"The Apache Hadoop software library is a framework That Allows for the
distributed processing of large data sets across clusters of computers using
a simple progra
Please try this
for (DoubleArrayWritable avalue : values) {
    Writable[] value = avalue.get();
    // If you need typed doubles, convert explicitly, e.g.:
    // DoubleWritable[] value = new DoubleWritable[6];
    // for (int k = 0; k < 6; k++) {
    //     value[k] = new DoubleWritable(wvalue[k]);
    // }
    // parse accordingly
    if (Double.parseDouble(value[1].toString()) != 0) {
Hi All,
I am going through the Quorum Journal design document.
In Section 2.8, the Accept Recovery RPC section, it is mentioned:
If the current on-disk log is missing, or a *different length* than the
proposed recovery, the JN downloads the log from the provided URI,
replacing any current copy of the
Hi
A developer should answer that, but a quick look at an edits file with od
suggests that records are not fixed length. So maybe the likelihood of
the situation you suggest is so low that there is no need to check more
than the file size.
Ulul
On 28/09/2014 11:17, Giridhar Addepalli wrote:
Hi
:37:56 -0400
Subject: Re: Doubt regarding Binary Compatibility\Source Compatibility with
old *mapred* APIs and new *mapreduce* APIs in Hadoop
From: john.meag...@gmail.com
To: user@hadoop.apache.org
Also, Source Compatibility means ONLY a recompile is needed.
No code changes should
2014 13:03:53 -0700
Subject: Re: Doubt regarding Binary Compatibility\Source Compatibility with old
*mapred* APIs and new *mapreduce* APIs in Hadoop
From: zs...@hortonworks.com
To: user@hadoop.apache.org
1. If you have the binaries that were compiled against MRv1 mapred libs, it
should just work
file is execute it.
-RR
--
Hello People,
As per the Apache site
http://hadoop.apache.org/docs/r2.3.0/hadoop-mapreduce-client/hadoop-mapreduce-client-core/MapReduce_Compatibility_Hadoop1_Hadoop2.html
Binary Compatibility: First, we ensure binary compatibility
to the applications that use old mapred
Also, Source Compatibility means ONLY a recompile is needed.
No code changes should be needed.
On Mon, Apr 14, 2014 at 10:37 AM, John Meagher john.meag...@gmail.com wrote:
Source Compatibility = you need to recompile and use the new version
as part of the compilation
Binary Compatibility
Hi all,
is it possible to install MongoDB on the same VM that hosts Hadoop?
--
amiable harsha
Certainly it is, and quite common, especially if you have some high
performance machines: they can run as mapreduce slaves and also double as
mongo hosts. The problem would of course be that when running mapreduce
jobs you might have very slow network bandwidth at times, and if your front
end
Thanks Jay and Praveen,
I want to use both separately; I don't want to use MongoDB in place of
HBase.
On Wed, Mar 19, 2014 at 9:25 PM, Jay Vyas jayunit...@gmail.com wrote:
Certainly it is, and quite common, especially if you have some high
performance machines: they can run as mapreduce
Why not? It's just a matter of installing two different packages.
It depends on what you want to use it for; you need to take care of a few
things, but as far as installation is concerned, it should be easily doable.
Regards
Prav
On Wed, Mar 19, 2014 at 3:41 PM, sri harsha rsharsh...@gmail.com
I've installed a Hadoop single-node cluster on a VirtualBox machine running
Ubuntu 12.04 LTS (64-bit) with 512MB RAM and 8GB HD. I haven't seen any
errors in my testing yet. Is 1GB RAM required? Will I run into issues when
I expand the cluster?
On Sat, Jan 18, 2014 at 11:24 PM, Alexander
Hi,
I want to install a 4-node cluster on 64-bit Linux. Is 4GB RAM and a 500GB HD
enough for this, or shall I need to expand?
Please advise.
Thanks
--
amiable harsha
It's enough. Hadoop uses only 1GB of RAM by default.
On Sat, Jan 18, 2014 at 10:11 PM, sri harsha rsharsh...@gmail.com wrote:
Hi,
I want to install a 4-node cluster on 64-bit Linux. Is 4GB RAM and a 500GB HD
enough for this, or shall I need to expand?
Please advise.
Thanks
--
amiable
Bejoy KS
Sent from remote device, Please excuse typos
--
From: Raj Hadoop hadoop...@yahoo.com
Date: Tue, 16 Apr 2013 21:49:34 -0700 (PDT)
To: user@hadoop.apache.org
Reply-To: user@hadoop.apache.org
Subject: Basic Doubt in Hadoop
Hi
Subject: Re: Basic Doubt in Hadoop
Hi Bejoy,
Regarding the output of the Map phase, does Hadoop store it in the local fs or
in HDFS?
I believe it is the former. Correct me if I am wrong.
Regards
Ramesh
On Wed, Apr 17, 2013 at 10:30 AM, bejoy.had...@gmail.com wrote:
The data is in HDFS in case of WordCount
Hi,
I am new to Hadoop. I started reading the standard WordCount program. I got
this basic doubt in Hadoop:
After the Map-Reduce is done, where is the output generated? Does the
reducer output sit on individual DataNodes? Please advise.
Thanks,
Raj
Hello Jamal,
For efficient processing, all the values associated with the same key
get sorted and go to the same reducer. As a result the reducer gets a key and a
list of values as its input. To me your assumption seems correct.
Regards,
Mohammad Tariq
On Thu, Nov 22, 2012 at 1:20 AM,
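As a concrete illustration of that grouping, here is a minimal new-API reducer that receives a key and the Iterable of its values (an untested sketch; the Text/IntWritable types and the class name are just examples, not from this thread):

import java.io.IOException;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Reducer;

public class SumReducer extends Reducer<Text, IntWritable, Text, IntWritable> {
    @Override
    protected void reduce(Text key, Iterable<IntWritable> values, Context context)
            throws IOException, InterruptedException {
        // The framework groups all values sharing this key into one call.
        int sum = 0;
        for (IntWritable v : values) {
            sum += v.get();
        }
        context.write(key, new IntWritable(sum));
    }
}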
it.
Regards
Bejoy KS
Sent from handheld, please excuse typos.
-Original Message-
From: jamal sasha jamalsha...@gmail.com
Date: Wed, 21 Nov 2012 14:50:51
To: user@hadoop.apache.org
Reply-To: user@hadoop.apache.org
Subject: fundamental doubt
Hi,
I guess I am asking a lot
To: user@hadoop.apache.org
Reply-To: user@hadoop.apache.org
Subject: fundamental doubt
Hi,
I guess I am asking a lot of fundamental questions, but I thank you guys for
taking the time to explain my doubts.
So I am able to write map reduce jobs, but here is my doubt:
As of now
The answer (a) is correct, in general.
On Wed, Nov 7, 2012 at 6:09 PM, Ramasubramanian Narayanan
ramasubramanian.naraya...@gmail.com wrote:
Hi,
Which of the following is correct w.r.t mapper.
(a) It accepts a single key-value pair as input and can emit any number of
key-value pairs as
Hi Rams,
A mapper will accept a single key-value pair as input and can emit
0 or more key-value pairs, based on what you want to do in the mapper function
(I mean, based on your business logic in the mapper function).
But the framework will actually aggregate the list of values
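To make (a) concrete, here is a word-count-style mapper that emits zero or more output pairs per input pair (an untested sketch; the types and class name are illustrative only):

import java.io.IOException;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

public class TokenMapper extends Mapper<LongWritable, Text, Text, IntWritable> {
    private static final IntWritable ONE = new IntWritable(1);

    @Override
    protected void map(LongWritable offset, Text line, Context context)
            throws IOException, InterruptedException {
        // One input pair in; zero or more output pairs out.
        for (String token : line.toString().split("\\s+")) {
            if (!token.isEmpty()) {        // a blank line emits nothing
                context.write(new Text(token), ONE);
            }
        }
    }
}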
Please do not mail general@ with user/dev questions. Use the user@
alias for it in future.
The IdentityMapper and IdentityReducer are what TeraSort uses (they are
not strictly needed; Hadoop sorts by default, using the default
mapper/reducer).
On Wed, Sep 26, 2012 at 10:08 PM, Nitin Khandelwal
-Original Message-
From: Harsh J ha...@cloudera.com
To: common-user@hadoop.apache.org; Raj Vishwanathan rajv...@yahoo.com
Cc:
Sent: Saturday, August 25, 2012 4:02 AM
Subject: Re: doubt about reduce tasks and block writes
Raj's almost right. In times of high load or space fillup on a local
DN
Thanks, Raj, you got exactly my point. I wanted to confirm this assumption as
I was wondering whether a shared HDFS cluster with MR and HBase like this would
make sense:
http://old.nabble.com/HBase-User-f34655.html
sure about this, but there could be corner cases in case of
node failure and suchlike! I need to look into the code.
Raj
From: Marc Sturlese marc.sturl...@gmail.com
To: hadoop-u...@lucene.apache.org
Sent: Friday, August 24, 2012 1:09 PM
Subject: doubt about
Hey there,
I have a doubt about reduce tasks and block writes. Does a reduce task always
first write to HDFS on the node where it is placed? (And then these
blocks would be replicated to other nodes.)
If yes: if I have a cluster of 5 nodes, 4 of them run DN and TT and one
(node A) just runs
Marc, see my inline comments.
On Fri, Aug 24, 2012 at 4:09 PM, Marc Sturlese marc.sturl...@gmail.com wrote:
Hey there,
I have a doubt about reduce tasks and block writes. Does a reduce task always
first write to HDFS on the node where it is placed? (And then these
blocks would
on one node without a replica of the data then your node A is as
likely as any other to be chosen as a source.
Regards
Bertrand
On Fri, Aug 24, 2012 at 10:09 PM, Marc Sturlese marc.sturl...@gmail.com wrote:
Hey there,
I have a doubt about reduce tasks and block writes. Does a reduce task always
, August 24, 2012 1:09 PM
Subject: doubt about reduce tasks and block writes
Hey there,
I have a doubt about reduce tasks and block writes. Does a reduce task always
first write to HDFS on the node where it is placed? (And then these
blocks would be replicated to other nodes.)
If yes: if I
for the DistributedCache of the
job, if necessary.
Copying the job's jar and configuration to the map-reduce system directory
on the distributed file-system.
Submitting the job to the JobTracker and optionally monitoring its status.
I have a doubt about the 4th point of the job execution flow; could any
monitoring its status.
I have a doubt about the 4th point of the job execution flow; could any of you
explain it?
What is the job's jar?
The job.jar is the jar you supply via the 'hadoop jar' command. Technically,
though, it is the jar pointed to by JobConf.getJar() (set via setJar or
setJarByClass calls
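To make that concrete, here is a minimal old-API sketch of setting the job jar; the class name is illustrative and the mapper/reducer/path setup is elided, so it will not run as-is:

import org.apache.hadoop.mapred.JobClient;
import org.apache.hadoop.mapred.JobConf;

public class SubmitExample {
    public static void main(String[] args) throws Exception {
        JobConf conf = new JobConf();
        // Point the framework at the jar holding the job's classes;
        // this is what ends up as job.jar in the system directory.
        conf.setJarByClass(SubmitExample.class);
        // ... setMapperClass/setReducerClass and input/output paths go here ...
        JobClient.runJob(conf);
    }
}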
On Wed, Apr 4, 2012 at 10:02 PM, Prashant Kommireddi prash1...@gmail.com wrote:
Hi Mohit,
What would be the advantage? Reducers in most cases read data from all
the mappers. In the case where mappers were to write to HDFS, a
reducer would still need to read data from other datanodes across
On Thu, Apr 5, 2012 at 7:03 AM, Mohit Anchlia mohitanch...@gmail.com wrote:
The only advantage I was thinking of was that in some cases reducers might be
able to take advantage of data locality and avoid multiple HTTP calls, no?
The data is written anyway, so the last merged file could go on HDFS instead
I am going through the chapter 'How MapReduce Works' and have some
confusion:
1) The description of the Mapper says that reducers get the output file using
an HTTP call. But the description under 'The Reduce Side' doesn't specifically
say if it's copied using HTTP. So, first confusion: is the output copied
Answers inline.
On Wed, Apr 4, 2012 at 4:56 PM, Mohit Anchlia mohitanch...@gmail.com wrote:
I am going through the chapter 'How MapReduce Works' and have some
confusion:
1) The description of the Mapper says that reducers get the output file using
an HTTP call. But the description under 'The Reduce
Hi Mohit,
On Thu, Apr 5, 2012 at 5:26 AM, Mohit Anchlia mohitanch...@gmail.com wrote:
I am going through the chapter 'How MapReduce Works' and have some
confusion:
1) The description of the Mapper says that reducers get the output file using
an HTTP call. But the description under 'The Reduce Side'
On Wed, Apr 4, 2012 at 8:42 PM, Harsh J ha...@cloudera.com wrote:
Hi Mohit,
On Thu, Apr 5, 2012 at 5:26 AM, Mohit Anchlia mohitanch...@gmail.com
wrote:
I am going through the chapter 'How MapReduce Works' and have some
confusion:
1) The description of the Mapper says that reducers get the
Hi Mohit,
What would be the advantage? Reducers in most cases read data from all
the mappers. In the case where mappers were to write to HDFS, a
reducer would still need to read data from other datanodes across
the cluster.
Prashant
On Apr 4, 2012, at 9:55 PM, Mohit Anchlia
Hi people,
Please, I would like to ask something a bit more high-level than programming
for Hadoop.
I will have some students working with Hive, Pig or HBase (I don't know which
of them yet) and I would like to know if somebody here has already used
Hadoop on Amazon EC2 integrated with one of these other
is, according to the size of the data to be
processed on a particular node, a proportionate number of reduce tasks will
be run on different nodes.
Please can somebody clarify this basic doubt: which is correct? If none,
what is the actual process that takes place?
--
Regards,
Vamshi Krishna
From the fairscheduler docs I assume the following should work:
<property>
  <name>mapred.fairscheduler.poolnameproperty</name>
  <value>pool.name</value>
</property>
<property>
  <name>pool.name</name>
  <value>${mapreduce.job.group.name}</value>
</property>
which means that the default pool will be the group of
Sent: Thursday, March 01, 2012 9:33 AM
To: common-user@hadoop.apache.org
Subject: Re: Hadoop fair scheduler doubt: allocate jobs to pool
From the fairscheduler docs I assume the following should work:
<property>
  <name>mapred.fairscheduler.poolnameproperty</name>
  <value>pool.name</value>
</property>
<property>
property on the
JobConf to the name of the pool you want the job to use.
Dave
-Original Message-
From: Merto Mertek [mailto:masmer...@gmail.com]
Sent: Thursday, March 01, 2012 9:33 AM
To: common-user@hadoop.apache.org
Subject: Re: Hadoop fair scheduler doubt: allocate jobs to pool
Hi,
I tried what you had said. I added the following to mapred-site.xml:
<property>
  <name>mapred.fairscheduler.poolnameproperty</name>
  <value>pool.name</value>
</property>
<property>
  <name>pool.name</name>
  <value>${mapreduce.job.group.name}</value>
</property>
Funny enough it created a pool with the name
How can I set the fair scheduler such that all jobs submitted from a
particular user group go to a pool with the group name?
I have set up the fair scheduler and I have two users: A and B (belonging to the
user group hadoop).
When these users submit hadoop jobs, the jobs from A go to a pool named A
Hi All,
I have a doubt about the AvatarNode setup.
I configured the AvatarNode using the patch
https://issues.apache.org/jira/browse/HDFS-976
Do I need to configure the NFS filer to share the FSImage file between the active
and standby AvatarNodes?
What other configuration is needed
Narayanan,
On Fri, Jul 1, 2011 at 11:28 AM, Narayanan K knarayana...@gmail.com wrote:
Hi all,
We are basically working on a research project and I require some help
regarding this.
Always glad to see research work being done! What're you working on? :)
How do I submit a mapreduce job from
Narayanan,
On Fri, Jul 1, 2011 at 12:57 PM, Narayanan K knarayana...@gmail.com wrote:
So the report will be run from a different machine outside the cluster, so
we need a way to pass the parameters to the hadoop cluster (master) and
initiate a mapreduce job dynamically. Similarly, the output
Narayanan,
Regarding the client installation, you should make sure that the client and
server use the same Hadoop version for submitting jobs and transferring data.
If you use a different user on the client than the one that runs the hadoop job,
configure the hadoop ugi property (sorry, I forget the exact name).
On 1 Jul 2011
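To illustrate submitting from a machine outside the cluster, a minimal client-side sketch using the old MRv1 property names; the host names, ports, and class name are hypothetical, and mapper/reducer/path setup is elided:

import org.apache.hadoop.mapred.JobClient;
import org.apache.hadoop.mapred.JobConf;

public class RemoteSubmit {
    public static void main(String[] args) throws Exception {
        JobConf conf = new JobConf();
        // Hypothetical addresses: point the client at the remote cluster.
        conf.set("fs.default.name", "hdfs://master:9000");
        conf.set("mapred.job.tracker", "master:9001");
        conf.setJarByClass(RemoteSubmit.class);
        // ... mapper/reducer/input/output paths as usual ...
        JobClient.runJob(conf);
    }
}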
Hi all,
We are basically working on a research project and I require some help
regarding this.
I had a few basic doubts regarding submission of Map-Reduce jobs in Hadoop.
1. How do I submit a mapreduce job from outside the cluster, i.e. from a
different machine outside the Hadoop
Hi,
I have an account on a cluster which has a file system similar to
NFS. If I create a file on one machine, it is shown on all the machines
in the cluster. But hadoop works on a cluster of machines wherein
each machine has a disk of its own. Can someone please help me use
Hi,
I am given an account on a cluster which uses OpenPBS as the cluster
management software. The only way I can run a job is by submitting it to
OpenPBS. How can I run mapreduce programs on it? Is there any possible
workaround?
Thanks,
Udaya.
HOD supports a PBS environment, namely Torque. Torque is the vastly
improved fork of OpenPBS. You may be able to get HOD working on OpenPBS,
or better still persuade your cluster admins to upgrade to a more recent
version of Torque (e.g. at least 2.1.x)
Craig
On 04/05/2010, Udaya
Thank you Craig. My cluster has got Torque. Can you please point me to
something which has a detailed explanation of using HOD on Torque?
On Tue, May 4, 2010 at 10:17 PM, Craig Macdonald cra...@dcs.gla.ac.uk wrote:
HOD supports a PBS environment, namely Torque. Torque is the vastly
improved
Udaya,
The following link will help you with HOD on Torque.
http://hadoop.apache.org/common/docs/r0.20.0/hod_user_guide.html
Thanks,
---
Peeyush
On Tue, 2010-05-04 at 22:49 +0530, Udaya Lakshmi wrote:
Thank you Craig. My cluster has got Torque. Can you please point me to
something which has
On May 4, 2010, at 7:46 AM, Udaya Lakshmi wrote:
Hi,
I am given an account on a cluster which uses OpenPBS as the cluster
management software. The only way I can run a job is by submitting it to
OpenPBS. How can I run mapreduce programs on it? Is there any possible
workaround?
Take a look
Thank you.
Udaya.
On Wed, May 5, 2010 at 12:23 AM, Allen Wittenauer
awittena...@linkedin.com wrote:
On May 4, 2010, at 7:46 AM, Udaya Lakshmi wrote:
Hi,
I am given an account on a cluster which uses OpenPBS as the cluster
management software. The only way I can run a job is by
Hi,
I have written to a side-effect file using SequenceFile.Writer. But when
I cat the file, it prints some unreadable characters. I did not use
any compression codec.
Why is this so?
Thanks,
Ringa.
A SequenceFile is not a text file, so you cannot see the content by
invoking the unix command cat.
But you can get the text content by using the hadoop command: hadoop fs -text
<src>
On Sun, Feb 7, 2010 at 8:51 AM, Andiana Squazo Ringa
andriana.ri...@gmail.com wrote:
Hi,
I have written to a
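For completeness, the same dump can also be done programmatically with SequenceFile.Reader (an untested sketch; it assumes the file's key and value classes are Writables, and the class name is illustrative):

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.SequenceFile;
import org.apache.hadoop.io.Writable;
import org.apache.hadoop.util.ReflectionUtils;

public class DumpSequenceFile {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        FileSystem fs = FileSystem.get(conf);
        SequenceFile.Reader reader = new SequenceFile.Reader(fs, new Path(args[0]), conf);
        try {
            Writable key = (Writable) ReflectionUtils.newInstance(reader.getKeyClass(), conf);
            Writable value = (Writable) ReflectionUtils.newInstance(reader.getValueClass(), conf);
            while (reader.next(key, value)) {
                // Same idea as 'hadoop fs -text': print readable key/value pairs.
                System.out.println(key + "\t" + value);
            }
        } finally {
            reader.close();
        }
    }
}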
Thanks a lot Jeff.
Ringa.
On Sun, Feb 7, 2010 at 10:30 PM, Jeff Zhang zjf...@gmail.com wrote:
A SequenceFile is not a text file, so you cannot see the content by
invoking the unix command cat.
But you can get the text content by using the hadoop command: hadoop fs -text
<src>
On Sun, Feb 7,
Hi all,
I have searched the documentation but could not find an input file
format which will give the line number as the key and the line as the value.
Did I miss something? Can someone give me a clue about how to implement
such an input file format?
Thanks,
Udaya.
Hi,
For global line numbers, you would need to know the ordering within each split
generated from the input file. The standard input formats provide offsets in
splits, so if the records are of equal length you can compute some kind of
numbering.
I remember someone had implemented sequential
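To make the equal-length-records case concrete, a sketch of deriving a global line number from the byte offset that TextInputFormat already supplies as the key (untested; RECORD_LEN is a hypothetical fixed record length in bytes, including the line terminator):

import java.io.IOException;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

public class LineNumberMapper extends Mapper<LongWritable, Text, LongWritable, Text> {
    private static final long RECORD_LEN = 80;  // hypothetical fixed record length

    @Override
    protected void map(LongWritable offset, Text line, Context context)
            throws IOException, InterruptedException {
        // With fixed-length records, the byte offset determines the line number.
        long lineNumber = offset.get() / RECORD_LEN + 1;
        context.write(new LongWritable(lineNumber), line);
    }
}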
Thank you Amogh.
On Thu, Jan 28, 2010 at 3:44 PM, Amogh Vasekar am...@yahoo-inc.com wrote:
Hi,
For global line numbers, you would need to know the ordering within each
split generated from the input file. The standard input formats provide
offsets in splits, so if the records are of equal
I too had this doubt but could not find the clue. However, please post the
code if you can find it.
On Thu, Jan 28, 2010 at 4:03 PM, Ravi ravindra.babu.rav...@gmail.com wrote:
Thank you Amogh.
On Thu, Jan 28, 2010 at 3:44 PM, Amogh Vasekar am...@yahoo-inc.com wrote:
Hi,
For global line numbers
Hi,
Here's the relevant thread with Gordon, the author of the solution:
I am in the process of learning Hadoop (and I think I've made a lot of
progress). I have described the specific problem and solution on my blog
Thank you Amogh
Ravi.
On 1/28/10, Amogh Vasekar am...@yahoo-inc.com wrote:
Hi,
Here's the relevant thread with Gordon, the author of the solution:
I am in the process of learning Hadoop (and I think I've made a lot of
progress). I have described the specific problem and solution on my blog
Thank you Amogh. I will go through the link.
Udaya.
On 1/28/10, Ravi ravindra.babu.rav...@gmail.com wrote:
Thank you Amogh
Ravi.
On 1/28/10, Amogh Vasekar am...@yahoo-inc.com wrote:
Hi,
Here's the relevant thread with Gordon, the author of the solution:
I am in the process of learning
Main-Class
{
    public boolean flag = true;
}
Map-Class
{
    if (flag)
    {
        flag = false;
        /* section of code */
    }
}
I am running this code in pseudo-distributed mode and it's working fine.
I doubt whether this runs correctly in distributed mode because mappers on
other systems have to be notified of the changed flag. Any comments? If
this is wrong, any suggestions on what method I must
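One caveat worth illustrating: each mapper task runs in its own JVM, so a field flipped in one mapper is not visible to mappers on other nodes. If the intent is once-per-mapper (rather than once-per-job) work, the setup() hook does that; a sketch, with illustrative types and class name, not from this thread:

import java.io.IOException;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

public class OneTimeMapper extends Mapper<LongWritable, Text, Text, Text> {
    private boolean initialized = false;

    @Override
    protected void setup(Context context) {
        // Runs once per mapper task (per JVM), not once per job.
        initialized = true;
    }

    @Override
    protected void map(LongWritable key, Text value, Context context)
            throws IOException, InterruptedException {
        if (initialized) {
            /* section of code */
        }
    }
}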
Do you run the map reduce job from the command line or an IDE? In map reduce
mode, you should put the jar containing the map and reduce classes in
your classpath.
Jeff Zhang
On Fri, Nov 27, 2009 at 2:19 PM, aa...@buffalo.edu wrote:
Hello Everybody,
I have a doubt in Hadoop and was wondering if anybody has faced a
similar problem. I have a package called test. Inside
Shwitzu,
Why can't you just query through a FileSystem object and find the file
you want?
Lajos
shwitzu wrote:
Hello All,
I was wondering if our map reduce code could just return the location of the
file, or place the actual file in a given output directory by searching
based on a keyword.
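Along the lines Lajos suggests, a sketch of searching with a FileSystem object (untested; matching the keyword against file names is just one interpretation, and the class name is illustrative):

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class FindFile {
    public static void main(String[] args) throws Exception {
        FileSystem fs = FileSystem.get(new Configuration());
        String keyword = args[0];                  // keyword to look for in file names
        for (FileStatus status : fs.listStatus(new Path(args[1]))) {
            if (status.getPath().getName().contains(keyword)) {
                System.out.println(status.getPath());  // return the file's location
            }
        }
    }
}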
Hi,
I am running a map reduce program which reads data from a file,
processes it and writes the output into another file.
I run 4 maps and 4 reduces, and my output is as follows:
09/08/27 17:34:37 INFO mapred.JobClient: Running job: job_200908271142_0026
09/08/27 17:34:38 INFO
But the reducer can do some preparation during the map process. It can
distribute map output across the nodes that will work as reducers.
Copying and sorting map output is also a time-consuming process (maybe
more consuming than the reduce itself). For example, a piece of a job run
log on a 40-node cluster could be
very useful.
You said to increase the number of reduce tasks. Suppose the number of
reduce tasks is more than the number of distinct map output keys; would some of
the reduce processes go to waste? Is that the case?
Also I have one more doubt: I have 5 values for a corresponding key on one
region and the other 2 values on 2 different region servers.
Does Hadoop Map Reduce take care of moving these 2 different values to the
region with the 5 values, instead of moving those 5 values to the other system,
to minimize the dataflow? Is this what
Hi all,
I have one small doubt. Kindly answer it even if it sounds silly.
I am using Map Reduce with HBase in distributed mode. I have a table which
spans across 5 region servers. I am using TableInputFormat to read the data
from the tables in the map. When I run the program, by default how
On Thu, Aug 20, 2009 at 9:42 AM, john smith js1987.sm...@gmail.com wrote:
Hi all,
I have one small doubt. Kindly answer it even if it sounds silly.
No questions are silly. Don't worry.
I am using Map Reduce with HBase in distributed mode. I have a table which
spans across 5 region
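For reference, TableInputFormat creates one input split per table region by default, so the job gets one mapper per region. A minimal sketch of such a job using the org.apache.hadoop.hbase.mapreduce API ('mytable', the job name, and the class names are hypothetical):

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.io.ImmutableBytesWritable;
import org.apache.hadoop.hbase.mapreduce.TableMapReduceUtil;
import org.apache.hadoop.hbase.mapreduce.TableMapper;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;

public class ScanTableJob {
    static class MyMapper extends TableMapper<Text, IntWritable> {
        @Override
        protected void map(ImmutableBytesWritable row, Result columns, Context context) {
            // One map() call per row; one mapper task per region by default.
        }
    }

    public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();
        Job job = new Job(conf, "scan-table");  // hypothetical job name
        job.setJarByClass(ScanTableJob.class);
        TableMapReduceUtil.initTableMapperJob("mytable",  // hypothetical table name
                new Scan(), MyMapper.class, Text.class, IntWritable.class, job);
        job.setNumReduceTasks(0);
        job.waitForCompletion(true);
    }
}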