Re: dfs.write.packet.size set to 2G

2011-11-08 Thread Harsh J
Block sizes are per-file, not set permanently across HDFS. So create your files with a sufficiently large block size (2G is OK if it fits your use case well). This way you won't have block splits, as you desire. For example, to upload a file via the shell with a tweaked blocksize, I'd do: hadoop
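
A sketch of such an upload (the command above is cut off; the property name is the 0.20-era one, and the file name and destination path are placeholders):

  # upload with a 2 GB per-file block size; file name and target path are assumed
  hadoop fs -D dfs.block.size=2147483648 -put bigfile.dat /user/donal/bigfile.dat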

Re: dfs.write.packet.size set to 2G

2011-11-08 Thread donal0412
Thanks! That's exactly what I want. And Ted, what do you mean by snapshots and mirrors? On 2011/11/8 16:21, Harsh J wrote: Block sizes are per-file, not permanently set on the HDFS. So create your files with a sufficiently large block size (2G is OK if it fits your usecase well). This way you

Re: dfs.write.packet.size set to 2G

2011-11-08 Thread Uma Maheswara Rao G 72686
- Original Message - From: donal0412 donal0...@gmail.com Date: Tuesday, November 8, 2011 1:04 pm Subject: dfs.write.packet.size set to 2G To: hdfs-user@hadoop.apache.org Hi, I want to store lots of files in HDFS, the file size is <= 2G. I don't want the file to split into blocks, because

Problem in using Hadoop Eclipse Plugin

2011-11-08 Thread Stuti Awasthi
Hi all, I was trying to configure Eclipse SDK Version 3.7.1 with remote Hadoop 0.20.2. I am able to configure Eclipse such that I can browse HDFS from it. When I try to execute a MapReduce job using Run on Hadoop, nothing happens and I get the error in the Eclipse logs: Message: Plug-in

Re: dfs.write.packet.size set to 2G

2011-11-08 Thread Ted Dunning
By snapshots, I mean that you can freeze a copy of a portion of the file system for later use as a backup or reference. By mirror, I mean that a snapshot can be transported to another location in the same cluster or to another cluster, and the mirrored image will be updated atomically to the

RE: Problem in using Hadoop Eclipse Plugin

2011-11-08 Thread Stuti Awasthi
Hi, I resolved the issue by doing the following: 1. Get hadoop-eclipse-plugin-0.20.203.0. 2. Rename it to hadoop-0.20.2-eclipse-plugin. 3. Place this plugin jar in the Eclipse plugins directory. 4. Restart Eclipse. This worked for me, and I am now able to execute MR jobs through Eclipse. Thanks
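
Those steps as shell commands (a sketch only; the jar file names and the Eclipse install path are assumptions):

  # rename the plugin jar and drop it into Eclipse's plugins directory, then restart Eclipse
  cp hadoop-eclipse-plugin-0.20.203.0.jar hadoop-0.20.2-eclipse-plugin.jar
  mv hadoop-0.20.2-eclipse-plugin.jar /opt/eclipse/plugins/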

Re: Sizing help

2011-11-08 Thread Rita
That's a good point. What if HDFS is used as an archive? We don't really use it for MapReduce, more for archival purposes. On Mon, Nov 7, 2011 at 7:53 PM, Ted Dunning tdunn...@maprtech.com wrote: 3x replication has two effects. One is reliability. This is probably more important in large

Re: Sizing help

2011-11-08 Thread Ted Dunning
For archival purposes, you don't need speed (mostly). That eliminates one argument for 3x replication. If you have RAID-5 or RAID-6 on your storage nodes, then you eliminate most of your disk failure costs at the cluster level. This gives you something like 2.2x replication cost. You can also
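
One way to read the 2.2x figure (the exact RAID layout is not stated, so the numbers below are an assumption): with RAID-5 absorbing single-disk failures on each node, HDFS replication can drop from 3x to 2x, and the parity overhead then multiplies that:

  2 HDFS replicas x ~1.1 RAID-5 parity overhead (e.g. 10 data disks + 1 parity) = ~2.2x raw storage per logical byte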

Could not obtain block

2011-11-08 Thread Steve Lewis
Just recently my Hadoop jobs started failing with Could not obtain block. I recently restarted the cluster, but this error has killed 3 jobs. I need some hints on how to diagnose and fix the problem. Caused by: java.io.IOException: Could not obtain block: blk_8697778223665513111_1917303
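
Not from the original thread, but a standard first check for missing-block errors is to run HDFS fsck against the affected path (the path below is a placeholder):

  hadoop fsck /path/to/job/input -files -blocks -locations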

error building libhdfs

2011-11-08 Thread Inder Pall
facing the following: if /bin/sh ./libtool --mode=compile --tag=CC gcc -DPACKAGE_NAME=\"libhdfs\" -DPACKAGE_TARNAME=\"libhdfs\" -DPACKAGE_VERSION=\"0.1.0\" -DPACKAGE_STRING=\"libhdfs 0.1.0\" -DPACKAGE_BUGREPORT=\"omal...@apache.org\" -DPACKAGE=\"libhdfs\" -DVERSION=\"0.1.0\" -DSTDC_HEADERS=1