Hi,
How can I convert .pdf and .doc files into text files using MapReduce in Java?
Please help me with sample code or any site to refer to.
Thanks in advance
Ranjini
Just wondering what the difference is between File_bytes_read and hdfs_bytes_read,
which get displayed in the output of a job.
Thanks
Sai
It's simple:
bytes read from local file system: File_bytes_read
bytes read from HDFS file system: hdfs_bytes_read
Regards,
Vinayakumar B
From: Sai Sai [mailto:saigr...@yahoo.in]
Sent: 14 March 2014 14:51
To: user@hadoop.apache.org
Subject: File_bytes_read vs hdfs_bytes_read
Just wondering what
Hi,
I understand that the mapper produces 1 partition per reducer. How does the
reducer know which partition to copy? Let's say there are 2 nodes running the
mapper for a word count program and there are 2 reducers configured. If each
map node produces 2 partitions, with the possibility of partitions in
Can someone please help:
How do I unzip a .tar.bz2 file that is in Hadoop/HDFS?
Thanks
Sai
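One way to do this without copying the archive to local disk first is to stream it out of HDFS with `hdfs dfs -cat` and pipe it straight into tar. The HDFS path below is a placeholder; the pipe itself is ordinary tar, demonstrated here with a local archive:

```shell
# Stream the archive out of HDFS and untar it on the pipe (placeholder paths):
#   hdfs dfs -cat /data/archive.tar.bz2 | tar -xjf - -C /tmp/extracted
# The same pipe, exercised locally so it is easy to verify:
mkdir -p /tmp/tardemo/src /tmp/tardemo/out
echo "hello" > /tmp/tardemo/src/sample.txt
tar -cjf /tmp/tardemo/demo.tar.bz2 -C /tmp/tardemo/src sample.txt
# tar reads the bzip2 archive from stdin and extracts into the target dir:
cat /tmp/tardemo/demo.tar.bz2 | tar -xjf - -C /tmp/tardemo/out
cat /tmp/tardemo/out/sample.txt
```

Note that tar cannot write back into HDFS directly; to store the extracted files in HDFS you would `-put` them afterwards.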
With the default partitioner, each key is hashed to a partition number, and
each reducer collects the partitions carrying its own number from every map
output: reducer 0 fetches all partitions numbered 0, and so on. But yes, you
can mess with the partitioner logic for good or evil, if that's your question.
With great power comes great responsibility.
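For reference, Hadoop's default HashPartitioner computes `(key.hashCode() & Integer.MAX_VALUE) % numReduceTasks` in Java. The sketch below is not Hadoop's actual hash; it substitutes `cksum` just to make the invariant visible from a shell: a given key always maps to the same partition number, so every map task sends that key to the same reducer.

```shell
# Mimic hash partitioning: hash the key, then take it modulo the number
# of reducers. cksum stands in for Java's hashCode here (illustration only).
num_reducers=2
partition_for() {
  printf '%s' "$1" | cksum | awk -v n="$num_reducers" '{print $1 % n}'
}
p1=$(partition_for "hadoop")
p2=$(partition_for "hadoop")
echo "$p1 $p2"   # the two numbers are always equal: same key, same partition
```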
There is some explanation here as well (in case you haven't checked that out
yet):
http://stackoverflow.com/questions/16634294/understanding-the-hadoop-file-system-counters
Regards,
Shahab
On Fri, Mar 14, 2014 at 5:32 AM, Vinayakumar B vinayakuma...@huawei.com wrote:
Thanks Rohith, I restarted the datanodes and all is well.
From: Rohith Sharma K S [mailto:rohithsharm...@huawei.com]
Sent: Thursday, March 13, 2014 10:56 PM
To: user@hadoop.apache.org
Subject: RE: NodeManager health Question
Hi,
For troubleshooting, there are a few things you can verify:
1. Check
Hi John
Would you mind filing a JIRA with more details? The RM going down just because
a host was not resolvable or DNS timed out is something that should be
addressed.
thanks
-- Hitesh
On Mar 13, 2014, at 2:29 PM, John Lilley wrote:
Never mind… we figured out its DNS entry was going
James Neofotistos
Hey Clay,
How have you loaded 6 TB of data into HDP? I am in a similar situation and
wanted to understand your use case.
On Thu, Mar 13, 2014 at 3:59 PM, Clay McDonald
stuart.mcdon...@bateswhite.com wrote:
Hello all, I have laid out my POC in a project plan and have HDP 2.0
installed. HDFS is
What do you want to know? Here is how it goes:
1. We receive 6TB from an outside client and need to analyze the data
quickly and report on our findings. I'm using an analysis that was done in our
current environment with the same data.
2. Upload the data to HDFS with -put
3.
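For step 2, the upload itself is a couple of FileSystem shell commands. The paths below are placeholders, not the ones Clay used:

```shell
# Create a target directory in HDFS and upload the local dataset into it.
hdfs dfs -mkdir -p /data/client
hdfs dfs -put /local/dataset /data/client/
# Verify the upload landed:
hdfs dfs -ls /data/client
```

These require a running cluster and are shown only to make the step concrete.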
Also, I too created all my processes and SQL in Hortonworks' sandbox with small
sample data. Then, we created 7 VMs and attached enough storage to handle the
full dataset test. I installed and configured CentOS and installed Hortonworks
HDP 2.0 using Ambari. The cluster is 4 datanodes and 3
Thank you Clay,
So you loaded a small set of data into the Sandbox first, then used CentOS
machines and installed full-blown HDP 2.0 on them using Ambari.
When you installed HDP 2.0 using Ambari, how did you configure your
master namenode and slave datanodes?
Also, where did you install Hive? On
Hi Andrew,
I do most of my Hadoop development on MacOS and I've always wondered about that
message. I tried your fix and it works.
Thanks!
Geoff
On Mar 12, 2014, at 9:40 AM, Andrew Pennebaker apenneba...@42six.com wrote:
In recent versions of Mac OS X, a default Hadoop configuration such as
To run Hadoop 2.0 on Windows you need to build winutils.exe and hadoop.dll.
I am having problems building these, and given that virtually ALL Windows
work is on 64-bit Windows, I see little reason why users cannot download these.
Does anyone have these built and in a spot where they can be downloaded?
Hi Steve,
I've filed the problem as HADOOP-10051.
https://issues.apache.org/jira/browse/HADOOP-10051
Can someone help with this problem?
Thanks,
Tsuyoshi
On Fri, Mar 14, 2014 at 2:53 PM, Steve Lewis lordjoe2...@gmail.com wrote:
I was doing some testing with HA NN today. I set up two NN with active
failover (ZKFC) using sshfence. I tested that it's working on both NNs by
doing 'kill -9 pid' on the active NN. When I did this on the active node,
the standby would become the active and everything seemed to work. Next, I
logged
Could you have also prevented the standby from communicating with
Zookeeper?
Chris
On Mar 14, 2014 8:22 PM, dlmarion dlmar...@hotmail.com wrote:
Hi Dave,
How many ZooKeeper servers do you have, and where are they?
Juan Carlos Fernández Rodríguez
On 15/03/2014, at 01:21, dlmarion dlmar...@hotmail.com wrote:
I don't think so. NN1 and ZKFC1 are on physically separate machines from
NN2 and ZKFC2.
From: Chris Mawata [mailto:chris.maw...@gmail.com]
Sent: Friday, March 14, 2014 9:05 PM
To: user@hadoop.apache.org
Subject: Re: HA NN Failover question
Server 1: NN1 and ZKFC1
Server 2: NN2 and ZKFC2
Server 3: Journal1 and ZK1
Server 4: Journal2 and ZK2
Server 5: Journal3 and ZK3
Server 6+: Datanode
All in the same rack. I would expect the ZKFC from the active name node server
to lose its lock and the other ZKFC to tell the standby
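For a test like this, each NameNode's HA state can be queried from the shell while the experiment runs. `nn1` and `nn2` below are assumed service IDs from `dfs.ha.namenodes.<nameservice>` and may differ in your config:

```shell
# Ask each NameNode for its current HA state; after killing the active NN,
# the surviving one should report "active".
hdfs haadmin -getServiceState nn1
hdfs haadmin -getServiceState nn2
# A manual failover can also be driven directly, bypassing the ZKFCs:
#   hdfs haadmin -failover nn1 nn2
```

These commands need a running HA cluster; they are shown only as a sketch of how to observe the failover.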
Which Hadoop version did you use?
Sent from my iPhone5s
On 15 March 2014, at 9:29, dlmarion dlmar...@hotmail.com wrote:
Apache Hadoop 2.3.0
Sent via the Samsung GALAXY S®4, an AT&T 4G LTE smartphone
Original message
From: Azuryy azury...@gmail.com
Date: 03/14/2014 10:45 PM (GMT-05:00)
To: user@hadoop.apache.org
Subject: Re: HA NN Failover question
I suppose NN2 is the standby; please check that ZKFC2 is alive before stopping the network on NN1.
Sent from my iPhone5s
On 15 March 2014, at 10:53, dlmarion dlmar...@hotmail.com wrote: