Hi Alex,
Can you please attach your code and the sample input data?
Best,
Mahesh Balija,
Calsoft Labs.
On Tue, Apr 30, 2013 at 2:29 AM, wrote:
>
> Hello,
>
> I am trying to write a MapReduce program in Hadoop 1.0.4 using the mapred libs. I have
> a map function which gets
> keys and cr
"It" is '/'?
On Apr 29, 2013, at 5:09 PM, Mohammad Tariq wrote:
> make it 755.
>
> Warm Regards,
> Tariq
> https://mtariq.jux.com/
> cloudfront.blogspot.com
>
>
> On Tue, Apr 30, 2013 at 3:30 AM, Kevin Burton
> wrote:
>> Thank you. The HDFS system seems to be up. Now I am having a problem wi
make it 755.
Warm Regards,
Tariq
https://mtariq.jux.com/
cloudfront.blogspot.com
On Tue, Apr 30, 2013 at 3:30 AM, Kevin Burton wrote:
> Thank you. The HDFS system seems to be up. Now I am having a problem with
> getting the JobTracker and TaskTracker up. According to the logs on the
> JobTracker
Thank you. The HDFS system seems to be up. Now I am having a problem with
getting the JobTracker and TaskTracker up. According to the logs on the
JobTracker, mapred doesn't have write permission to /. I am not clear on what
the permissions should be.
Anyway, thank you.
On Apr 29, 2013, at 4:30
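(For anyone finding this thread later: a minimal sketch, assuming you want to apply the suggested 755 mode programmatically rather than from the shell, of doing the same thing through the HDFS Java API. The path and mode simply mirror the "make it 755" advice above; it must run as the HDFS superuser or it will hit the same AccessControlException.)

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.fs.permission.FsPermission;

public class ChmodRoot {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();   // picks up core-site.xml/hdfs-site.xml from the classpath
    FileSystem fs = FileSystem.get(conf);

    Path root = new Path("/");
    System.out.println("current: " + fs.getFileStatus(root).getPermission());

    // Equivalent of "hadoop fs -chmod 755 /" -- run as the HDFS superuser.
    fs.setPermission(root, new FsPermission((short) 0755));
    System.out.println("now:     " + fs.getFileStatus(root).getPermission());
  }
}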
Hello Kevin,
Have you reformatted the NN (unsuccessfully)? Was your NN serving
some other cluster earlier, or were your DNs part of some other
cluster? Datanodes bind themselves to the namenode through the namespaceID, and in
your case the IDs of the DNs and NN seem to be different. As a workaround you
c
If I were to start with a 5 node cluster, I would do this:
Machine 1: NN+JT
32GB RAM, 2 x Quad Core Proc, 500GB SATA HDD along with a NAS (to make sure
metadata is safe)
Machine 2: SNN
32GB RAM, 2 x Quad Core Proc, 500GB SATA HDD
Machines 3, 4, 5: DN+TT
16GB RAM, 2 x Quad Core Proc, 5 x 200GB
I am trying to start up a cluster and in the datanode log on the
NameNode server I get the error:
2013-04-29 15:50:20,988 INFO
org.apache.hadoop.hdfs.server.common.Storage: Lock on
/data/hadoop/dfs/data/in_use.lock acquired by nodename 1406@devUbuntu05
2013-04-29 15:50:20,990 FATAL
org.apac
Hello,
I am trying to write a MapReduce program in Hadoop 1.0.4 using the mapred libs. I have a
map function which gets
keys and creates a different object with a few attributes like id, etc., and
passes it to the reducer function using
output.collect(key, value);
The reducer gets the keys, but the values are empt
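Since the code isn't attached, here is only a minimal sketch of the pattern described above, using the Hadoop 1.x mapred API; IdRecord and IdMapper are hypothetical names. One common cause of the reducer seeing empty values is a custom value class whose write()/readFields() methods don't serialize every field, so the sketch spells both out.

import java.io.DataInput;
import java.io.DataOutput;
import java.io.IOException;

import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.io.Writable;
import org.apache.hadoop.mapred.MapReduceBase;
import org.apache.hadoop.mapred.Mapper;
import org.apache.hadoop.mapred.OutputCollector;
import org.apache.hadoop.mapred.Reporter;

// Hypothetical value class: every field must be written in write() and read
// back in readFields(), otherwise the reducer sees "empty" objects.
class IdRecord implements Writable {
  private long id;
  private String name;

  public IdRecord() {}  // no-arg constructor required by Hadoop's serialization

  public IdRecord(long id, String name) { this.id = id; this.name = name; }

  @Override
  public void write(DataOutput out) throws IOException {
    out.writeLong(id);
    out.writeUTF(name);
  }

  @Override
  public void readFields(DataInput in) throws IOException {
    id = in.readLong();
    name = in.readUTF();
  }
}

// Map function that builds an IdRecord and hands it to the reducer
// via output.collect(key, value), as described above.
class IdMapper extends MapReduceBase
    implements Mapper<LongWritable, Text, Text, IdRecord> {
  @Override
  public void map(LongWritable offset, Text line,
                  OutputCollector<Text, IdRecord> output, Reporter reporter)
      throws IOException {
    String[] fields = line.toString().split(",");
    output.collect(new Text(fields[0]),
                   new IdRecord(Long.parseLong(fields[0]), fields[1]));
  }
}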
I look in the name node log and I get the following errors:
2013-04-29 15:25:11,646 ERROR
org.apache.hadoop.security.UserGroupInformation:
PriviledgedActionException as:mapred (auth:SIMPLE)
cause:org.apache.hadoop.security.AccessControlException: Permission
denied: user=mapred, access=WRITE,
Hi,
In the 5 node cluster - you mean
Name Node, Job Tracker, Secondary Name Node all on 1?
64 GB RAM (Processor - 2 x Quad core Intel, Storage - ?)
Data Nodes and Task Trackers - on 4 machines - each of
32 GB RAM (Processor - 2 x Quad core Intel, Storage - ?)
NIC?
The env-var is auto-created by the "hadoop" script for you when you
invoke "hadoop jar". You do not necessarily have to manually set it,
nor do you have to compile the native libs if what you're using is
pre-built for your OS.
On Tue, Apr 30, 2013 at 12:52 AM, wrote:
> I don't have this environm
I see a startup error in the
/var/log/hadoop-hdfs/hadoop-hdfs-namenode-.log
2013-04-29 14:12:36,095 FATAL
org.apache.hadoop.hdfs.server.namenode.NameNode: Exception in namenode
join
java.io.IOException: There appears to be a gap in the edit log. We
expected txid 1, but got txid 2103.
Hi,
did you check in your Ubuntu installation for the "libhadoop" binary? It is
present in my Ubuntu installation at a relative path of (I used the Apache
installation)
"hadoop-common-project/hadoop-common/target/native/target/usr/local/lib"
If present, add it to your LD_LIBRARY_PATH.
If not present, then
I think that having more than 6 drives is better.
More memory never hurts. If you have too little, you may have to run with
fewer slots than optimal.
10GbE networking is good. If not, having more than two 1GbE ports is good, at
least on distributions that can deal with them properly.
On Mon, Apr
2 x Quad cores Intel
2-3 TB x 6 SATA
64GB mem
2 NICs teaming
my 2 cents
On Apr 29, 2013, at 9:24 AM, Raj Hadoop <hadoop...@yahoo.com> wrote:
Hi,
I have to propose some hardware requirements in my company for a Proof of
Concept with Hadoop. I was reading Hadoop Operations and also saw
If it doesn't work, what are my options? Is there source that I can download and
compile?
On Apr 29, 2013, at 10:31 AM, Ted Xu wrote:
> Hi Kevin,
>
> Native libraries are those implemented using C/C++, which only provide code
> level portability (instead of binary level portability, as Java do
To validate if your jobs are running locally, look for the classname
"LocalJobRunner" in the runtime output.
Configs are sourced either from the classpath (if a dir or jar on the
classpath has the XMLs at their root, they're read), or via the code
(conf.set("mapred.job.tracker", "foo:349");) or al
This is one of the reasons we set up edge nodes in the cluster. This is a node
where Hadoop is loaded yet none of the Hadoop services are running. This
allows jobs to automatically pick up the right Hadoop configuration from the
node and point to the right cluster.
The edge nodes are used for
They can be different if maps read HDFS files directly instead of, or on top of,
getting key-value pairs via the map interface.
HDFS_BYTES_READ will always be greater than or equal to map-input-bytes.
Thanks,
+Vinod
On Apr 29, 2013, at 1:50 AM, Pralabh Kumar wrote:
> Hi
>
> What's the relationsh
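If it helps, a sketch of reading both counters from a finished job with the old mapred API; the group and counter names below are the ones Hadoop 1.x prints in the job summary and may differ in other versions.

import org.apache.hadoop.mapred.Counters;
import org.apache.hadoop.mapred.JobClient;
import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.mapred.RunningJob;

public class CounterComparison {
  public static void main(String[] args) throws Exception {
    JobConf conf = new JobConf(CounterComparison.class);
    // ... set input/output paths, mapper, reducer, etc. ...

    // Run the job and read its counters once it completes.
    RunningJob job = JobClient.runJob(conf);
    Counters counters = job.getCounters();

    // Group/counter names as they appear in Hadoop 1.x job output.
    long hdfsBytesRead = counters
        .getGroup("FileSystemCounters")
        .getCounter("HDFS_BYTES_READ");
    long mapInputBytes = counters
        .getGroup("org.apache.hadoop.mapred.Task$Counter")
        .getCounter("MAP_INPUT_BYTES");

    // As noted above, hdfsBytesRead should always be >= mapInputBytes.
    System.out.println("HDFS_BYTES_READ = " + hdfsBytesRead);
    System.out.println("Map input bytes = " + mapInputBytes);
  }
}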
Regards, Raj. Knowing the data that you want to process with Hadoop is
critical for this, at least an approximation of the data. I think that
Hadoop Operations is an invaluable resource for this:
- Hadoop uses RAM heavily, so the first resource that you have to consider
is to use all available RA
Hi,
I have to propose some hardware requirements in my company for a Proof of
Concept with Hadoop. I was reading Hadoop Operations and also saw the Cloudera
website. But I just wanted to know from the group - what are the requirements if I
have to plan for a 5 node cluster? I don't know at this time, t
What's the best way to add a number of jars to the workers' classpath? Preferably
by only adding something to one of the main configuration files (core-site.xml,
mapred-site.xml), since we don't really want to mess with any of the startup
scripts.
Thanks
Hi Han,
I think your point is valid. In fact you can change the progress report
logic by manually calling the Reporter API, but by default it is quite
straightforward. Reducer progress is divided into 3 phases, namely the copy
phase, the merge/sort phase, and the reduce phase, each worth ~33%. In your case it
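A minimal sketch of what "manually calling the Reporter API" can look like inside an old-API reducer (the key/value types and the status message are just illustrative):

import java.io.IOException;
import java.util.Iterator;

import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.MapReduceBase;
import org.apache.hadoop.mapred.OutputCollector;
import org.apache.hadoop.mapred.Reducer;
import org.apache.hadoop.mapred.Reporter;

public class ProgressReportingReducer extends MapReduceBase
    implements Reducer<Text, IntWritable, Text, IntWritable> {

  @Override
  public void reduce(Text key, Iterator<IntWritable> values,
                     OutputCollector<Text, IntWritable> output, Reporter reporter)
      throws IOException {
    int sum = 0;
    long seen = 0;
    while (values.hasNext()) {
      sum += values.next().get();
      // Tell the framework we are alive and update the visible status;
      // this feeds into the reduce-phase portion of the progress report.
      if (++seen % 100000 == 0) {
        reporter.progress();
        reporter.setStatus("processed " + seen + " values for key " + key);
      }
    }
    output.collect(key, new IntWritable(sum));
  }
}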
Hi Kevin,
Native libraries are those implemented using C/C++, which only provide code
level portability (instead of binary level portability, as Java does). That
is to say, the binaries provided by the CDH4 distribution will in most cases be
broken in your environment.
To check if your native libraries
Yes, this indeed seems to be the case. After running java -version and
seeing 1.5, it rang a bell because all our servers (as far as I knew) were
1.6 or above. So I never thought that this would be an issue!! But boy, I
was wrong and it indeed turned out to be something so obvious. Thanks guys
for yo
Thanks Ted and .. Ted ..
I've been looking at the progress when the job is executing.
In fact, I think it's not a skewed partition problem. I've looked at the
mapper output files; all are of the same size, and the reducers each take a
single group.
What I want to know is how the Hadoop M/R framewor
Well… Bingo! :)
We don't write our projects for 1.5 JVMs, and especially not the GCJ
(1.5 didn't have annotations either IIRC? We depend on that here). Try
with a Sun/Oracle/OpenJDK 1.6 or higher and your problem is solved.
On Mon, Apr 29, 2013 at 8:24 PM, Shahab Yunus wrote:
> The output of "ja
The output of "java -version" is:
java -version
java version "1.5.0"
gij (GNU libgcj) version 4.4.6 20120305 (Red Hat 4.4.6-4)
Copyright (C) 2007 Free Software Foundation, Inc.
This is free software; see the source for copying conditions. There is NO
warranty; not even for MERCHANTABILITY or FIT
This is rather odd and I am unable to reproduce this across several
versions. It may even be something to do with all the static loading
done in the VersionInfo class, but I am unsure at the moment.
What does "java -version" print for you?
On Mon, Apr 29, 2013 at 8:12 PM, Shahab Yunus wrote:
> Okay,
Okay, I think I know what you mean. Those were back ticks!
So I tried the following:
java -cp `hbase classpath` org.apache.hadoop.hbase.util.VersionInfo
and I still get:
13/04/29 09:40:31 INFO util.VersionInfo: HBase Unknown
13/04/29 09:40:31 INFO util.VersionInfo: Subversion Unknown -r Unknow
Ted, sorry I didn't understand. What do you mean exactly by "specifying
`hbase classpath`"? You mean declare an environment variable
'HBASE_CLASSPATH'?
Regards,
Shahab
On Mon, Apr 29, 2013 at 10:31 AM, Ted Yu wrote:
> bq. 'java -cp /usr/lib/hbase/hbase...
>
> Instead of hard coding class path
bq. 'java -cp /usr/lib/hbase/hbase...
Instead of hard-coding the classpath, can you try specifying `hbase classpath`?
Cheers
On Mon, Apr 29, 2013 at 5:52 AM, Shahab Yunus wrote:
> Hello,
>
> This might be something very obvious that I am missing but this has been
> bugging me and I am unable to
Hello,
This might be something very obvious that I am missing, but this has been
bugging me and I am unable to find what I am missing.
I have Hadoop and HBase installed on a Linux machine, versions 2.0.0-cdh4.1.2
and 0.92.1-cdh4.1.2 respectively. They are working and I can invoke the hbase
shell and hado
Dear Chandrash3khar K0tekar,
Using the run() method implies implementing Tool and using ToolRunner. This
gives the additional benefit that some "standard" Hadoop command line
options are available. See here:
http://grepcode.com/file/repository.cloudera.com/content/repositories/releases/com.cloudera
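For reference, a bare-bones sketch of the Tool/ToolRunner pattern being described (the class name is just an example). ToolRunner runs GenericOptionsParser first, which is what makes the standard options such as -D, -files and -libjars work, e.g. "hadoop jar myjob.jar MyJobDriver -D mapred.reduce.tasks=10 input output".

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.conf.Configured;
import org.apache.hadoop.util.Tool;
import org.apache.hadoop.util.ToolRunner;

public class MyJobDriver extends Configured implements Tool {

  @Override
  public int run(String[] args) throws Exception {
    // getConf() already reflects any -D key=value options passed on the
    // command line, because ToolRunner ran GenericOptionsParser first.
    Configuration conf = getConf();
    System.out.println("mapred.job.tracker = " + conf.get("mapred.job.tracker"));
    // ... build and submit the job here ...
    return 0;
  }

  public static void main(String[] args) throws Exception {
    int exitCode = ToolRunner.run(new Configuration(), new MyJobDriver(), args);
    System.exit(exitCode);
  }
}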
Hi
What's the relationship between the HDFS_BYTES_READ and Map input bytes counters?
Why can they be different for a particular MR job?
Thanks and Regards
Pralabh Kumar