You can change the block size of existing files with a command like:
hadoop distcp -Ddfs.block.size=$[256*1024*1024] /path/to/inputdata /path/to/inputdata-with-largeblocks
After this command completes, you can remove the original data.
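As a minimal follow-up sketch (the part-file name below is hypothetical; adjust it to whatever distcp actually produced):
# check the block size of one of the copied files; -stat %o prints the block size in bytes
hadoop fs -stat %o /path/to/inputdata-with-largeblocks/part-00000
# once the copy looks good, remove the original data
hadoop fs -rmr /path/to/inputdata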
From: kun yan
Hi,
I am having some difficulty copying data between 2 HDFS filesystems in
Amazon EC2. I want to try the distcp2 command to see if I can.
- Where is the distcp2 command in YARN?
- Is it possible to copy data between HDFS filesystems using SSL?
- Has anyone copied data between HDFS filesystems in 2
Hi,
After hosting an insecure Hadoop environment for early testing, I'm
transitioning to something more secure that would (hopefully) more or less
mirror what a production environment might look like. I've integrated our
Hadoop cluster into our Kerberos realm and everything is working OK except
Yes. The Protobuf 2.5 jars want all Protobuf code in the JVM to be
generated and compiled with 2.5; they do not support old compiled code.
Even though there will not be any compilation issues with 2.4-generated
code, an exception will be thrown at runtime.
So upgrade all your code to 2.5 and
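A minimal sketch of the regeneration step, assuming a Maven-style layout and a hypothetical myservice.proto; your paths will differ:
# confirm the 2.5 compiler is the one on the PATH
protoc --version                  # should print: libprotoc 2.5.0
# regenerate the Java sources so they match the protobuf-java 2.5 jar on the classpath
protoc --java_out=src/main/java src/main/proto/myservice.proto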
You can change it to any size that is a multiple of 512 bytes, which is the
default bytesPerChecksum.
But setting it to smaller values leads to heavy load on the cluster, and setting
it to a very high value will not distribute the data. So 64MB (or 128MB in
the latest trunk) is recommended as optimal. It's up to you to
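As a quick arithmetic check (plain shell, nothing Hadoop-specific assumed):
# 128MB expressed in bytes
echo $((128 * 1024 * 1024))       # 134217728
# a block size must be a multiple of bytesPerChecksum (512); a remainder of 0 means it qualifies
echo $((134217728 % 512))         # 0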
Hi Hadoop users,
I have been trying to concatenate multiple sequence files into one.
Since the total size of the sequence files is quite big (1TB), I won't use
MapReduce because it would require 1TB on the reducer host to hold the temporary
data.
I ended up doing what has been suggested in this
Jerry,
It might not help with this particular file, but you might consider the
approach used at Blackberry when dealing with your data. They block-
compressed into small Avro files and then concatenated them into large Avro
files without decompressing. Check out the boom file format here:
Here's a great tool for exactly what you're looking for
https://github.com/edwardcapriolo/filecrush
On Tue, Sep 10, 2013 at 11:07 AM, Jerry Lam chiling...@gmail.com wrote:
Hi Hadoop users,
I have been trying to concatenate multiple sequence files into one.
Since the total size of the sequence
IIRC, sequence files can be concatenated as-is and read as one large file,
but maybe I'm forgetting something.
Thank you for the clarification, Adam.
On Tue, Sep 10, 2013 at 12:34 PM, Adam Muise amu...@hortonworks.com wrote:
Harsh is giving you a best practice for JVMs using IPv4 in general. What
I am suggesting is IPv4-only connections to the Hadoop daemons and
clients on the cluster and gateway,
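A minimal sketch of the JVM side of that, assuming the flag is passed through HADOOP_OPTS in hadoop-env.sh on the cluster and gateway nodes:
# make the Hadoop JVMs prefer the IPv4 stack
export HADOOP_OPTS="$HADOOP_OPTS -Djava.net.preferIPv4Stack=true"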
Hello Shahab,
Thanks for the reply. Typically, to invoke the HDFS client, I will use
bin/hadoop dfs. But the command that you used, hadoop fs, makes
me wonder whether this is the Hadoop 2.* client command. Could you clarify
for me whether -D fs.local.block.size is supported in Hadoop 1.1, or whether it
can be set at the time I load the file into HDFS (that is, whether it is a
client-side setting)?
I don't think you can do this while reading. This is done at the time of
writing.
You can do it like this (the example is for the CLI, as is evident):
hadoop fs -D fs.local.block.size=134217728 -put
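A fuller sketch of the same command with hypothetical source and destination paths (134217728 bytes = 128MB):
hadoop fs -D fs.local.block.size=134217728 -put /tmp/bigfile.dat /user/hadoop/bigfile.dat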
OK, it seems there is a JIRA for this issue:
https://issues.apache.org/jira/browse/YARN-800
On Mon, Sep 9, 2013 at 3:39 PM, Jian Fang jian.fang.subscr...@gmail.com wrote:
Hi,
I need to use the web services in the application master, for example,
curl
HDFS-347 introduced this feature, and it is currently only available
from 2.1.x onwards.
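If the feature in question is the short-circuit local read support that HDFS-347 added, a minimal hdfs-site.xml sketch would look like this (the socket path is only an example):
<property>
  <name>dfs.client.read.shortcircuit</name>
  <value>true</value>
</property>
<property>
  <name>dfs.domain.socket.path</name>
  <value>/var/run/hadoop-hdfs/dn._PORT</value>
</property>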
On Wed, Sep 11, 2013 at 12:00 AM, Jun Li jltz922...@gmail.com wrote:
Hi,
In the link,
http://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-hdfs/hdfs-default.xml,
the explanation:
Hi All,
I'm facing an issue displaying Hadoop metrics in Ganglia. Though I
have installed Ganglia on my master/slave nodes and I'm able to see
all the default metrics in the Ganglia UI from all the nodes, I'm not
able to see the Hadoop metrics in the metrics section.
Versions:
Hadoop 1.1.1
Ganglia
Hi Guys,
How do I import a MySQL table into an HBase table? I am using Sqoop2; when I try to import
a table, it does not show HBase as a storage option.
sqoop:000> create job --xid 12 --type import
Schema name:
...
Boundary query:
Output configuration
Storage type:
  0 : HDFS
Choose:
Please guide me on how to do this.
So for the first *wrong* /etc/hosts file, the lookup sequence would be:
find hdfs://master:54310
find master -> 192.168.6.10 (*but it has already got the IP here*)
find 192.168.6.10 -> localhost
find localhost -> 127.0.0.1
The other thing: when I 'ping master', I get a reply from '192.168.6.10'
instead of
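A hypothetical /etc/hosts that would produce exactly that lookup sequence (an assumed reconstruction, not the poster's actual file):
127.0.0.1      localhost
192.168.6.10   localhost master
# reverse lookup of 192.168.6.10 returns "localhost" (the first name on its line),
# which then resolves forward to 127.0.0.1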