Hello everyone,
we're happy to announce that we have just released Pydoop 0.5.0
(http://pydoop.sourceforge.net).
The main changes with respect to the previous version are:
* Pydoop now works with Hadoop 1.0.0.
* Support for multiple Hadoop versions with the same Pydoop installation
* Easy
Dear all,
Today I am trying to configure hadoop-0.20.205.0 on a 4-node cluster.
When I start the cluster, all daemons start except the tasktracker;
I don't know why the task tracker fails, with the following error logs.
The cluster is on a private network. My /etc/hosts file contains all IP
hostname
Seven,
Yes, that strategy changed a while ago, but the doc on it was
only recently updated: https://issues.apache.org/jira/browse/HDFS-1454
(and some more improvements followed later, IIRC)
2012/2/21 seven garfee garfee.se...@gmail.com:
hi, all
As this Page(
awesome, guys!
-Alex
sent via my mobile device
On Feb 20, 2012, at 11:59 PM, Luca Pireddu pire...@crs4.it wrote:
Hello everyone,
we're happy to announce that we have just released Pydoop 0.5.0
(http://pydoop.sourceforge.net).
The main changes with respect to the previous version are:
Hi,
I want to access an HBase table from Hadoop MapReduce. I am using Windows XP
and Cygwin, with hadoop-0.20.2 and hbase-0.92.0.
The Hadoop cluster is working fine; I am able to run the MapReduce wordcount
successfully on 3 PCs.
HBase is also working; I can create a table from the shell.
I have tried
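For context, the usual route on hbase-0.92.0 is TableMapReduceUtil from the org.apache.hadoop.hbase.mapreduce package. A sketch, assuming HBase jars are on the classpath; the table name "mytable" and the do-nothing mapper are illustrative:

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.mapreduce.TableMapReduceUtil;
import org.apache.hadoop.hbase.mapreduce.TableMapper;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;

public class HBaseRead {
  // A placeholder mapper; TableMapper fixes the input types to
  // (ImmutableBytesWritable row key, Result row contents).
  static class MyMapper extends TableMapper<Text, Text> { }

  public static Job makeJob() throws Exception {
    Configuration conf = HBaseConfiguration.create(); // picks up hbase-site.xml
    Job job = new Job(conf, "read-mytable");
    job.setJarByClass(HBaseRead.class);
    // Wires TableInputFormat into the job so it is fed rows from "mytable".
    TableMapReduceUtil.initTableMapperJob(
        "mytable", new Scan(), MyMapper.class, Text.class, Text.class, job);
    return job;
  }
}
```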
Dheeraj,
In most homogeneous cluster environments, people do keep the configs
synced. However, that isn't necessary.
It is alright to have different *-site.xml contents on each slave,
tailored to its available resources. For instance, if you have 3 slaves
with 3 disks and 1 slave with 2, you can
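A per-slave override of this sort might look like the following hypothetical mapred-site.xml fragment on the 2-disk slave (the directory names are made up; dfs.data.dir in hdfs-site.xml can be varied per slave the same way):

```xml
<configuration>
  <property>
    <name>mapred.local.dir</name>
    <!-- only two spill/local dirs here; the 3-disk slaves list three -->
    <value>/disk1/mapred/local,/disk2/mapred/local</value>
  </property>
</configuration>
```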
It sounds to me like you just need to include your HBase jars in your
compiler's classpath, like so:
javac -classpath $HADOOP_HOME Example.java
where $HADOOP_HOME expands to a classpath containing all your base Hadoop
jars as well as your HBase jars.
Then you would want to put the resulting Example.class file into
Have you had any NameNode failures lately? I had them every couple of days
and found that files were being left in HDFS under
/log/hadoop/tmp/mapred/staging/... when communication with the NameNode
was lost. Not sure why they never got replicated correctly (maybe because
they are in /log?)
I
I'd recommend making a SequenceFile[1] to store each XML file as a value.
-Joey
[1]
http://hadoop.apache.org/common/docs/r1.0.0/api/org/apache/hadoop/io/SequenceFile.html
On Tue, Feb 21, 2012 at 12:15 PM, Mohit Anchlia mohitanch...@gmail.com wrote:
We have small xml files. Currently I am
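A minimal sketch of what Joey describes, assuming the Hadoop 1.x API; the output path and the choice of the file name as key are illustrative, not prescribed:

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.SequenceFile;
import org.apache.hadoop.io.Text;

public class PackXml {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    FileSystem fs = FileSystem.get(conf);
    Path out = new Path(args[0]);          // e.g. /user/me/xml.seq
    // key = source file name, value = the whole XML document
    SequenceFile.Writer writer = SequenceFile.createWriter(
        fs, conf, out, Text.class, Text.class);
    try {
      writer.append(new Text("doc1.xml"), new Text("<root>...</root>"));
    } finally {
      writer.close();
    }
  }
}
```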
Mohit
Rather than just appending the content to a normal text file, you
can create a sequence file with the individual smaller files' contents
as values.
Regards
Bejoy.K.S
On Tue, Feb 21, 2012 at 10:45 PM, Mohit Anchlia mohitanch...@gmail.com wrote:
We have small xml files.
On Tue, Feb 21, 2012 at 9:25 AM, Bejoy Ks bejoy.had...@gmail.com wrote:
Mohit
Rather than just appending the content to a normal text file, you
can create a sequence file with the individual smaller files' contents
as values.
Thanks. I was planning to use pig's
Hi,
I am working on a project which requires a setup as follows:
One master with four slaves. However, when a map-only program is run, the
master dynamically selects the slave to run the map. For example, when the
program is run for the first time, slave 2 is selected to run the map and
reduce
You might want to check out File Crusher:
http://www.jointhegrid.com/hadoop_filecrush/index.jsp
I've never used it, but it sounds like it could be helpful.
On Tue, Feb 21, 2012 at 10:25 AM, Bejoy Ks bejoy.had...@gmail.com wrote:
Hi Mohit
AFAIK, XMLLoader in Pig won't be suited for
I am trying to find examples that demonstrate using sequence files,
including writing to one and then running mapred on it, but I am unable to
find one. Could you please point me to some examples of sequence files?
On Tue, Feb 21, 2012 at 10:25 AM, Bejoy Ks bejoy.had...@gmail.com wrote:
Hi Mohit
Hi,
Let's say all the smaller files are in the same directory.
Then you can do:
// fs (a FileSystem), input_path, and output_path are assumed already set up
BufferedWriter output = new BufferedWriter(
    new OutputStreamWriter(fs.create(output_path, true)));      // output file
FileStatus[] input_files = fs.listStatus(new Path(input_path)); // input dir
for (int i = 0; i < input_files.length; i++) {
  BufferedReader in = new BufferedReader(
      new InputStreamReader(fs.open(input_files[i].getPath())));
  String line;
  while ((line = in.readLine()) != null) { output.write(line); output.newLine(); }
  in.close();
}
output.close();
Thanks How does mapreduce work on sequence file? Is there an example I can
look at?
On Tue, Feb 21, 2012 at 11:34 AM, Arko Provo Mukherjee
arkoprovomukher...@gmail.com wrote:
Hi,
Let's say all the smaller files are in the same directory.
Then you can do:
*BufferedWriter output = new
Hi,
I think the following link will help:
http://hadoop.apache.org/common/docs/current/mapred_tutorial.html
Cheers
Arko
On Tue, Feb 21, 2012 at 2:04 PM, Mohit Anchlia mohitanch...@gmail.com wrote:
Sorry, maybe it's something obvious, but I was wondering when map or reduce
gets called what
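As a sketch of what the tutorial Arko links to covers: with the old mapred API (which matches the Hadoop versions in this thread), a map-only pass over a Text/Text sequence file can be a plain Mapper whose input types are exactly what was written to the file. The class name and the emitted length-per-file logic are illustrative only:

```java
import java.io.IOException;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.MapReduceBase;
import org.apache.hadoop.mapred.Mapper;
import org.apache.hadoop.mapred.OutputCollector;
import org.apache.hadoop.mapred.Reporter;

// With SequenceFileInputFormat, keys and values arrive exactly as the
// types that were appended to the sequence file (here Text/Text).
public class XmlSizeMapper extends MapReduceBase
    implements Mapper<Text, Text, Text, Text> {
  public void map(Text fileName, Text xml,
                  OutputCollector<Text, Text> out, Reporter reporter)
      throws IOException {
    // emit: file name -> byte length of its XML payload
    out.collect(fileName, new Text(Integer.toString(xml.getLength())));
  }
}
```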
Hello,
I'm a market analyst researching the Hadoop space, and I have
a quick question. I was wondering if, and what type of, requirements there
may be for WAN-based high availability in Hadoop configurations,
e.g. for disaster recovery, and what type of solutions may be available
for such
I am past this error. It looks like I needed to use the CDH libraries. I
changed my Maven repo. Now I am stuck at
org.apache.hadoop.security.AccessControlException, since I am not writing
as the user that owns the file. Looking online for solutions
On Tue, Feb 21, 2012 at 12:48 PM, Mohit Anchlia
For High Availability?
The issue is the NameNode. Going forward there is a federated NameNode
environment, but I haven't used it and I'm not sure if it's an
active-active NameNode environment or just a sharded environment.
DR/BR is always an issue when you have petabytes of data across
I think the job configuration does not allow such a setup; however, maybe
I missed something.
I would probably tackle this problem from the scheduler source. The
default one is JobQueueTaskScheduler, which maintains a FIFO-based queue.
When a tasktracker (your slave) tells the jobtracker that
Yeah, I'm not sure how you could actually do it, as I haven't done it
before, but from a logical perspective you'd probably have to make a lot
of configuration changes and maybe even write some complicated M/R
code and coordination/rules-engine logic, and change how the heartbeat
scheduler operates to
Need some more help. I wrote a sequence file using the code below, but now
when I run the mapreduce job I get java.lang.ClassCastException:
org.apache.hadoop.io.LongWritable cannot be cast to
org.apache.hadoop.io.Text, even though I didn't use LongWritable when I
originally wrote to the sequence
It looks like in the mapper the values are coming in as binary instead of
Text. Is this expected from a sequence file? I initially wrote the
SequenceFile with Text values.
On Tue, Feb 21, 2012 at 4:13 PM, Mohit Anchlia mohitanch...@gmail.com wrote:
Need some more help. I wrote sequence file using below code but
On Tue, Feb 21, 2012 at 7:50 PM, Mohit Anchlia mohitanch...@gmail.com wrote:
It looks like in the mapper the values are coming in as binary instead of
Text. Is this expected from a sequence file? I initially wrote the
SequenceFile with Text values.
On Tue, Feb 21, 2012 at 4:13 PM, Mohit Anchlia
thanks a lot.
2012/2/21 Harsh J ha...@cloudera.com
Seven,
Yes, that strategy changed a while ago, but the doc on it was
only recently updated: https://issues.apache.org/jira/browse/HDFS-1454
(and some more improvements followed later, IIRC)
2012/2/21 seven garfee
Finally figured it out. I needed to use SequenceFileAsTextInputFormat.
It is just the lack of examples that makes it difficult when you start.
On Tue, Feb 21, 2012 at 4:50 PM, Mohit Anchlia mohitanch...@gmail.com wrote:
It looks like in mapper values are coming as binary instead of Text. Is
this
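For reference, a sketch of the fix Mohit describes, using the old mapred API of that era (class and path names are illustrative): SequenceFileAsTextInputFormat hands the mapper both key and value as Text, whereas TextInputFormat supplies LongWritable byte-offset keys, which is where the ClassCastException came from.

```java
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.FileInputFormat;
import org.apache.hadoop.mapred.FileOutputFormat;
import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.mapred.SequenceFileAsTextInputFormat;

public class SeqJob {
  public static JobConf configure(String in, String out) {
    JobConf job = new JobConf(SeqJob.class);
    // Keys and values reach the mapper converted to Text, regardless of
    // the Writable types used when the sequence file was written.
    job.setInputFormat(SequenceFileAsTextInputFormat.class);
    job.setOutputKeyClass(Text.class);
    job.setOutputValueClass(Text.class);
    FileInputFormat.setInputPaths(job, new Path(in));
    FileOutputFormat.setOutputPath(job, new Path(out));
    return job;
  }
}
```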
1. It is important to ensure your clients are on the same major-version
jars as your server.
2. You are probably looking for the hadoop fs -chown and hadoop fs
-chmod tools to modify ownership and permissions.
On Wed, Feb 22, 2012 at 3:15 AM, Mohit Anchlia mohitanch...@gmail.com wrote:
I am past this error. Looks
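The same change can also be made programmatically through the FileSystem API; a sketch, where the path argument and the 644 mode are illustrative (and chown, commented out, requires superuser rights):

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.fs.permission.FsPermission;

public class Chmod {
  public static void main(String[] args) throws Exception {
    FileSystem fs = FileSystem.get(new Configuration());
    Path p = new Path(args[0]);
    // equivalent of `hadoop fs -chmod 644 <path>`
    fs.setPermission(p, new FsPermission((short) 0644));
    // equivalent of `hadoop fs -chown mohit:mohit <path>` (superuser only)
    // fs.setOwner(p, "mohit", "mohit");
  }
}
```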
Hi folks,
Right now I have a replication factor of 2, but I want to make it three
for some tables. How can I do that for specific tables, so that whenever
data is loaded into those tables it is automatically replicated
across three nodes?
Or do I need to change replication for all the tables?
and