On Tue, Jul 29, 2014 at 1:36 AM, Jane Wayne jane.wayne2...@gmail.com
wrote:
where can i get the old hadoop documentation (e.g. cluster setup, xml
configuration params) for hadoop v0.22.0 and below? i downloaded the
source
and binary files but could not find
where can i get the old hadoop documentation (e.g. cluster setup, xml
configuration params) for hadoop v0.22.0 and below? i downloaded the source
and binary files but could not find the documentation as part of the
archive file.
on the home page at http://hadoop.apache.org/, i only see
i am following the instructions to set up a multi-node cluster at
http://hadoop.apache.org/docs/r2.4.1/hadoop-project-dist/hadoop-common/ClusterSetup.html
.
my problem is that when i run the script to start up the slave datanodes,
no slave datanode is started (more on this later).
i have two
this is the correct way to start slave datanode daemons (NOTICE THE PLURAL
DAEMONS).
$HADOOP_PREFIX/sbin/hadoop-daemons.sh --config $HADOOP_CONF_DIR --script
hdfs start datanode
On Sun, Jul 27, 2014 at 3:11 AM, Jane Wayne jane.wayne2...@gmail.com
wrote:
i am following the instructions to set up
hi,
i have hadoop v2.3.0 installed on CentOS 6.5 64-bit. OpenJDK 64-bit
v1.7 is my java version.
when i attempt to start hadoop, i keep seeing this message below.
OpenJDK 64-Bit Server VM warning: You have loaded library
/usr/local/hadoop-2.3.0/lib/native/libhadoop.so.1.0.0 which might
i recently made the switch from hadoop 0.20.x to hadoop 2.3.0 (yes, big
leap). i was wondering if there is a way to view my jobs now via a web UI?
i used to be able to do this by accessing the following URL
http://hadoop-cluster:50030/jobtracker.jsp
however, there is no more job tracker
after it is done with its work. Once it is done, you
can go look at it in the MapReduce specific JobHistoryServer.
+Vinod
On Mar 6, 2014, at 1:11 PM, Jane Wayne jane.wayne2...@gmail.com wrote:
i recently made the switch from hadoop 0.20.x to hadoop 2.3.0 (yes, big
leap). i was wondering
ok, the reason hadoop jobs were not showing up was that i did not
enable mapreduce to be run as a yarn application.
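for reference, the usual way to enable that is a mapred-site.xml entry like the one below (standard hadoop 2.x property; nothing here is cluster-specific):

<property>
  <name>mapreduce.framework.name</name>
  <value>yarn</value>
</property>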
On Thu, Mar 6, 2014 at 11:45 PM, Jane Wayne jane.wayne2...@gmail.com wrote:
when i go to the job history server
http://hadoop-cluster:19888/jobhistory
i see no map
i am using hadoop v2.3.0.
in my hdfs-site.xml, i have the following property set.
<property>
  <name>dfs.permissions.enabled</name>
  <value>false</value>
</property>
however, when i try to run a hadoop job, i see the following
AccessControlException.
, Harsh J ha...@cloudera.com wrote:
I don't think you ought to be using HADOOP_HOME anymore.
Try unset HADOOP_HOME and then export HADOOP_PREFIX=/opt/hadoop
and retry the NN command.
On Sun, Aug 11, 2013 at 8:50 AM, Jane Wayne jane.wayne2...@gmail.com
wrote:
hi,
i have downloaded
hi,
i have downloaded and untarred hadoop v0.23.9. i am trying to set up a
single node instance to learn this version of hadoop. also, i am following,
as best as i can, the instructions at
http://hadoop.apache.org/docs/r0.23.9/hadoop-project-dist/hadoop-common/SingleCluster.html
.
when i attempt
vote would
be the Name node. ;-)
HTH
-Mike
On May 16, 2013, at 10:34 AM, Niels Basjes ni...@basjes.nl wrote:
If you make sure that everything uses NTP then this becomes an irrelevant
distinction.
On Thu, May 16, 2013 at 4:01 PM, Jane Wayne jane.wayne2...@gmail.com
wrote:
yes
You are searching for a solution in the Hadoop API (where this does not
exist)
thanks, that's all i needed to know.
cheers.
On Fri, May 17, 2013 at 9:17 AM, Niels Basjes ni...@basjes.nl wrote:
Hi,
i have another computer (which i have referred to as a server, since it
is
running
if NTP is correctly used
that's the key statement. in several of our clusters, NTP setup is kludgy.
note that the professionals administering the cluster are different from
us, the engineers. so, there's a lot of red tape to go through to get
something fixed, trivial or not. we have noticed that
and please remember, i stated that although the hadoop cluster uses NTP,
the server (the machine that is not a part of the hadoop cluster) cannot
be assumed to be using NTP (and in fact, isn't).
On Fri, May 17, 2013 at 10:10 AM, Jane Wayne jane.wayne2...@gmail.com wrote:
if NTP is correctly used
information:
http://stackoverflow.com/questions/833768/java-code-for-getting-current-time
If you have a client that is not under NTP then that should be the way to
fix your issue.
Once you have that, getting the current time is easy.
Niels Basjes
On Tue, May 14, 2013 at 5:46 PM, Jane Wayne
hi all,
is there a way to get the current time of a hadoop cluster via the
api? in particular, getting the time from the namenode or jobtracker
would suffice.
i looked at JobClient but didn't see anything helpful.
with the server).
On Tue, May 14, 2013 at 11:38 AM, Niels Basjes ni...@basjes.nl wrote:
If you have all nodes using NTP then you can simply use the native Java API
to get the current system time.
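for example (plain JDK call, nothing hadoop-specific, assuming the local clock is NTP-synced):

long nowMillis = System.currentTimeMillis();        // epoch milliseconds
java.util.Date now = new java.util.Date(nowMillis); // the same instant as a Date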
On Tue, May 14, 2013 at 4:41 PM, Jane Wayne jane.wayne2...@gmail.com wrote:
hi all,
is there a way to get
AM, Jane Wayne jane.wayne2...@gmail.com wrote:
hi,
i need to know how to resolve conflicts with jar dependencies.
* first, my job requires Jackson JSON-processor v1.9.11.
* second, the hadoop cluster has Jackson JSON-processor v1.5.2. the
jars are installed in $HADOOP_HOME/lib.
according
i'm on windows using AWS EMR/EC2. i use the ruby client to manipulate AWS EMR.
1. spawn an EMR cluster. this should return a jobflow id (jobflow-id).
ruby elastic-mapreduce --create --name j-med --alive --num-instances
10 --instance-type c1.medium
2. run a job. you need to describe the job
there's probably a million ways to do it, but it seems like it can be done,
per your question. off the top of my head, you'd probably want to do
the cumulative sum in the reducer. if you're savvy, maybe even make the
reducer reusable as a combiner (looks like this problem might have an
associative
will have to come up with a composite key). when the data comes
into the reducer, just keep a running count and emit each time.
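a rough sketch of what that reducer could look like (hypothetical key/value types; assumes everything that belongs to one running total arrives under the same key, in the order you want, e.g. via a composite key and secondary sort):

import java.io.IOException;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Reducer;

public class CumulativeSumReducer
    extends Reducer<Text, LongWritable, Text, LongWritable> {
  @Override
  protected void reduce(Text key, Iterable<LongWritable> values, Context context)
      throws IOException, InterruptedException {
    long runningTotal = 0L;
    for (LongWritable value : values) {
      runningTotal += value.get();                          // keep a running count
      context.write(key, new LongWritable(runningTotal));   // emit each time
    }
  }
}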
On Fri, Oct 5, 2012 at 11:21 AM, Jane Wayne jane.wayne2...@gmail.com wrote:
there's probably a million ways to do it, but it seems like it can be
done, per your question
it but whether it is relevant or not to you will
depend on your context.
Regards
Bertrand
On Wed, Sep 26, 2012 at 5:36 PM, Jane Wayne jane.wayne2...@gmail.com wrote:
hi,
i know that some algorithms cannot be parallelized and adapted to the
mapreduce paradigm. however, i have noticed that in most cases
://research.google.com/pubs/pub36632.html (dremel) is from 2010.
Regards
Bertrand
On Wed, Sep 26, 2012 at 8:18 PM, Jane Wayne jane.wayne2...@gmail.com wrote:
jay,
thanks. i just needed a sanity check. i hope and expect that one day,
hadoop will mature towards supporting a shared-something
jay,
thanks. i just needed a sanity check. i hope and expect that one day,
hadoop will mature towards supporting a shared-something approach.
the web service call is not a bad idea at all. that way, we can
abstract what that ultimate data store really is.
i'm just a little surprised that we are
Sandeep,
How are you guys moving 100 TB into the AWS cloud? Are you using S3 or
EBS? If you are using S3, it does not work like HDFS. Although data is
replicated (I believe within an availability zone) in S3, it is not
the same as HDFS replication. You lose the data locality optimization
feature
i am currently testing my map reduce job on Windows + Cygwin + Hadoop
v0.20.205. for some strange reason, the list of values (i.e.
Iterable<T> values) going into the reducer looks all wrong. i have
tracked the map reduce process with logging statements (i.e. logged
the input to the map, logged the
Iterator<Value> it = values.iterator();
Value a = it.next();
Value b = it.next();
}
the variables, a and b of type Value, will be the same object
instance! i suppose this behavior of the iterator is to optimize
iterating so as to avoid the new operator.
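if you need to hold on to more than one value at a time, one workaround (a sketch inside a reduce() method, so values and context come from the reducer; Value stands for whatever Writable class you actually use) is to copy each element out of the reused instance, e.g. with WritableUtils.clone:

Iterator<Value> it = values.iterator();
Value a = WritableUtils.clone(it.next(), context.getConfiguration());
Value b = WritableUtils.clone(it.next(), context.getConfiguration());
// a and b are now independent copies instead of the same reused instance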
On Thu, Apr 5, 2012 at 4:55 PM, Jane Wayne jane.wayne2
serge, i specify 15 instances, but only 14 end up being data/task
nodes. 1 instance is reserved as the name node (job tracker).
On Wed, Apr 4, 2012 at 1:17 PM, Serge Blazhievsky
serge.blazhiyevs...@nice.com wrote:
How many datanodes do you use for your job?
On 4/3/12 8:11 PM, Jane Wayne
i have a map reduce job that is generating a lot of intermediate key-value
pairs. for example, when i am 1/3 complete with my map phase, i may have
generated over 130,000,000 output records (which is about 9 gigabytes). to
get to the 1/3 complete mark is very fast (less than 10 minutes), but at
to 512MB
- increase map task heap size to 2GB.
If the task still stalls, try providing less input to each mapper.
Regards
Bejoy KS
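one way to apply that kind of tuning per job (1.x-era property names, assuming the 512MB refers to the map-side sort buffer and that the driver parses generic options via ToolRunner; the jar, class and path names are placeholders):

hadoop jar myjob.jar my.pkg.MyJob \
  -Dio.sort.mb=512 \
  -Dmapred.child.java.opts=-Xmx2048m \
  input output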
On Tue, Apr 3, 2012 at 2:08 PM, Jane Wayne jane.wayne2...@gmail.com wrote:
i have a map reduce job that is generating a lot of intermediate key-value
pairs
the vint bytes to get the length of the
following byte array.
So when you call the compareBytes method you need to pass in where the
actual bytes start (s1 + vIntLen) and how many bytes to compare (vint)
On Mar 31, 2012 12:38 AM, Jane Wayne jane.wayne2...@gmail.com wrote:
in tom white's book
a reference to
a byte array, not making a copy of the array.
Chris
On Sat, Mar 31, 2012 at 12:23 AM, Jane Wayne jane.wayne2...@gmail.com
wrote:
i have a RawComparator that i would like to unit test (using mockito and
mrunit testing packages). i want to test the method,
public int
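one way to exercise a raw compare(byte[], int, int, byte[], int, int) method without mocks is to serialize real keys into byte buffers first. a hedged JUnit sketch, using Text keys and hadoop's built-in Text comparator as stand-ins for your own key class and comparator:

import static org.junit.Assert.assertTrue;
import java.io.IOException;
import org.apache.hadoop.io.DataOutputBuffer;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.io.WritableComparator;
import org.junit.Test;

public class RawComparatorTest {
  @Test
  public void comparesSerializedKeys() throws IOException {
    DataOutputBuffer left = new DataOutputBuffer();
    new Text("apple").write(left);
    DataOutputBuffer right = new DataOutputBuffer();
    new Text("banana").write(right);

    WritableComparator comparator = WritableComparator.get(Text.class);
    int result = comparator.compare(left.getData(), 0, left.getLength(),
                                    right.getData(), 0, right.getLength());
    assertTrue("apple should sort before banana", result < 0);
  }
}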
in tom white's book, Hadoop, The Definitive Guide, in the second edition,
on page 99, he shows how to compare the raw bytes of a key with Text
fields. he shows an example like the following.
int firstL1 = WritableUtils.decodeVIntSize(b1[s1]) + readVInt(b1, s1);
int firstL2 = WritableUtils.decodeVIntSize(b2[s2]) + readVInt(b2, s2);
i have a matrix that i am performing operations on. it is 10,000 rows by
5,000 columns. the total size of the file is just under 30 MB. my HDFS
block size is set to 64 MB. from what i understand, the number of mappers
is roughly equal to the number of HDFS blocks used in the input. i.e. if my
, Anil Gupta anilgupt...@gmail.com wrote:
Have a look at the NLineInputFormat class in Hadoop. That class will serve
your purpose.
Best Regards,
Anil
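a minimal driver fragment showing how that is typically wired up (new-API class from org.apache.hadoop.mapreduce.lib.input; the 1000-lines-per-split figure is just an example, and older releases expose the same knob via the mapred.line.input.format.linespermap property):

job.setInputFormatClass(NLineInputFormat.class);
// each mapper now gets at most this many input lines, regardless of HDFS block size
NLineInputFormat.setNumLinesPerSplit(job, 1000);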
On Mar 20, 2012, at 11:07 PM, Jane Wayne jane.wayne2...@gmail.com wrote:
i have a matrix that i am performing operations on. it is 10,000 rows
int n1 = WritableUtils.decodeVIntSize(b1[s1]);
int n2 = WritableUtils.decodeVIntSize(b2[s2]);
return compareBytes(b1, s1+n1, l1-n1, b2, s2+n2, l2-n2);
}
}
static {
// register this comparator
WritableComparator.define(Text.class, new Comparator());
}
Chris
On Tue, Mar 20, 2012 at 2:47 AM, Jane Wayne
On Mar 14, 2012 2:31 AM, Jane Wayne jane.wayne2...@gmail.com wrote:
i am using the new org.apache.hadoop.mapreduce.Partitioner class.
however,
i need to pass it some background information. how can i do this?
in the old, org.apache.hadoop.mapred.Partitioner class (now deprecated
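one approach that works with the new API (a hedged sketch: the framework creates the partitioner through ReflectionUtils, which passes the job Configuration to any class implementing Configurable, so job-level conf values can carry the background information; the background.info property name is made up for illustration):

import org.apache.hadoop.conf.Configurable;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Partitioner;

public class MyPartitioner extends Partitioner<Text, Text> implements Configurable {
  private Configuration conf;
  private String backgroundInfo;

  @Override
  public void setConf(Configuration conf) {
    this.conf = conf;
    // read whatever was stashed in the job configuration by the driver
    this.backgroundInfo = conf.get("background.info");
  }

  @Override
  public Configuration getConf() {
    return conf;
  }

  @Override
  public int getPartition(Text key, Text value, int numPartitions) {
    // use backgroundInfo as needed; plain hash partitioning shown as a fallback
    return (key.hashCode() & Integer.MAX_VALUE) % numPartitions;
  }
}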
/3686
I think the Partitioner class guarantees that you will have multiple
reducers.
On Thu, Mar 8, 2012 at 6:30 PM, Jane Wayne jane.wayne2...@gmail.com
wrote:
i am wondering if hadoop always respects Job.setNumReduceTasks(int)?
as i am emitting items from the mapper, i expect/desire only 1
i am wondering if hadoop always respects Job.setNumReduceTasks(int)?
as i am emitting items from the mapper, i expect/desire only 1 reducer to
get these items because i want to assign each key of the key-value input
pair a unique integer id. if i had 1 reducer, i can just keep a local
counter
the reducer at all.
Yeah, the map output will go to HDFS directly. It's called a map-only job.
Jie
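for reference, the driver side of a map-only job is just something like this (sketch; no reducer class is set at all):

job.setNumReduceTasks(0);            // map output goes straight to the output format on HDFS
job.setOutputKeyClass(Text.class);   // in a map-only job these describe the map output types
job.setOutputValueClass(Text.class);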
On Thursday, March 8, 2012, Jane Wayne wrote:
Jie,
so if i set the number of reduce tasks to 0, do i need to specify the
reducer (or should i set it null)? if i don't specify the reducer
by you.
Fix the input path line to use a different config:
Path input = new Path(conf.get("input.path"));
And run job as:
hadoop jar dummy-0.1.jar dummy.MyJob -Dinput.path=data/dummy.txt
-Dmapred.output.dir=result
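note that -D options are only picked up when the driver goes through the generic options machinery; a minimal sketch of such a driver (class, job and property names are placeholders):

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.conf.Configured;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.util.Tool;
import org.apache.hadoop.util.ToolRunner;

public class MyJob extends Configured implements Tool {
  @Override
  public int run(String[] args) throws Exception {
    Configuration conf = getConf();                 // -Dinput.path etc. are already applied here
    Job job = new Job(conf, "dummy");
    FileInputFormat.addInputPath(job, new Path(conf.get("input.path")));
    // ... set mapper, reducer, output path, key/value classes ...
    return job.waitForCompletion(true) ? 0 : 1;
  }

  public static void main(String[] args) throws Exception {
    System.exit(ToolRunner.run(new MyJob(), args)); // parses -Dkey=value into the configuration
  }
}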
On Tue, Mar 6, 2012 at 9:03 AM, Jane Wayne jane.wayne2...@gmail.com
wrote:
of the
file).
-Joey
On Tue, Mar 6, 2012 at 9:53 AM, Jane Wayne jane.wayne2...@gmail.com
wrote:
hi,
i am writing a little util class to recurse into a directory and add all
*.txt files into a sequence file (key is the file name, value is the
content of the corresponding text file). as i am
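a rough sketch of that kind of util (hedged: error handling is minimal, each *.txt file is assumed to fit in memory, and the older createWriter signature is used):

import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataInputStream;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IOUtils;
import org.apache.hadoop.io.SequenceFile;
import org.apache.hadoop.io.Text;

public class TextFilesToSequenceFile {
  public static void main(String[] args) throws IOException {
    Configuration conf = new Configuration();
    FileSystem fs = FileSystem.get(conf);
    SequenceFile.Writer writer =
        SequenceFile.createWriter(fs, conf, new Path(args[1]), Text.class, Text.class);
    try {
      append(fs, new Path(args[0]), writer);
    } finally {
      IOUtils.closeStream(writer);
    }
  }

  // recurse into directories and append every *.txt file as (file name, file contents)
  private static void append(FileSystem fs, Path path, SequenceFile.Writer writer)
      throws IOException {
    for (FileStatus status : fs.listStatus(path)) {
      if (status.isDir()) {
        append(fs, status.getPath(), writer);
      } else if (status.getPath().getName().endsWith(".txt")) {
        byte[] bytes = new byte[(int) status.getLen()];
        FSDataInputStream in = fs.open(status.getPath());
        try {
          in.readFully(bytes);
        } finally {
          IOUtils.closeStream(in);
        }
        writer.append(new Text(status.getPath().getName()), new Text(new String(bytes, "UTF-8")));
      }
    }
  }
}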