Re: HDFS Append Problem

2015-03-05 Thread Suresh Srinivas
Please take this up on the CDH mailing list. From: Molnár Bálint molnarcsi...@gmail.com Sent: Thursday, March 05, 2015 4:53 AM To: user@hadoop.apache.org Subject: HDFS Append Problem Hi Everyone! I'm experiencing an annoying problem. My scenario is: I want to store

Re: Error while executing command on CDH5

2015-03-04 Thread Suresh Srinivas
Can you please use the CDH mailing list for this question? From: SP sajid...@gmail.com Sent: Wednesday, March 04, 2015 11:00 AM To: user@hadoop.apache.org Subject: Error while executing command on CDH5 Hello All, Why am I getting this error every time I execute

Re: DFS Used V/S Non DFS Used

2014-10-10 Thread Suresh Srinivas
Here is the information from - https://issues.apache.org/jira/browse/HADOOP-4430?focusedCommentId=12640259&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-12640259 Here are the definitions of the data reported on the Web UI: Configured Capacity: Disk space corresponding to

Re: Significance of PID files

2014-07-07 Thread Suresh Srinivas
When a daemon process is started, its process ID is captured in a pid file. The file is used for the following purposes: - During daemon startup, the existence of the pid file is used to determine that the process is already running. - When a daemon is stopped, the hadoop scripts send a kill TERM
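The two checks described above can be sketched in a few lines of shell. This is a simplified illustration modeled on what hadoop-daemon.sh does, not the actual script; the file path and function names are made up for the example:

```shell
#!/bin/sh
# Sketch of pid-file handling, loosely modeled on hadoop-daemon.sh.
# PIDFILE and the function names are illustrative only.
PIDFILE=/tmp/example-daemon.pid

start_daemon() {
  # If a pid file exists and that pid is still alive, refuse a second start.
  if [ -f "$PIDFILE" ] && kill -0 "$(cat "$PIDFILE")" 2>/dev/null; then
    echo "daemon already running as pid $(cat "$PIDFILE")"
    return 1
  fi
  sleep 60 &                # stand-in for the real daemon process
  echo $! > "$PIDFILE"      # capture the daemon's process ID in the pid file
}

stop_daemon() {
  # Send SIGTERM to the recorded pid, then clean up the pid file.
  kill -TERM "$(cat "$PIDFILE")" 2>/dev/null
  rm -f "$PIDFILE"
}

start_daemon
start_daemon || echo "second start refused"   # detects the live pid file
stop_daemon
```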

Re: hadoop 2.2.0 HA: standby namenode generate a long list of loading edits

2014-06-11 Thread Suresh Srinivas
Henry, I suspect this is what is happening. On the active namenode, once the existing set of editlogs is loaded during startup, it becomes active and from then on has no need to load any more edits. It only generates edits. The standby namenode, on the other hand, not only loads the edits during startup,

Re: hadoop 2.2.0 HA: standby namenode generate a long list of loading edits

2014-06-11 Thread Suresh Srinivas
. Best regards, Henry *From:* Suresh Srinivas [mailto:sur...@hortonworks.com] *Sent:* Thursday, June 12, 2014 11:23 AM *To:* hdfs-u...@hadoop.apache.org *Subject:* Re: hadoop 2.2.0 HA: standby namenode generate a long list of loading edits Henry, I suspect this is what

Re: how can i monitor Decommission progress?

2014-06-05 Thread Suresh Srinivas
The namenode web UI provides that information. On the main web UI, click the link associated with decommissioned nodes. Sent from phone On Jun 5, 2014, at 10:36 AM, Raj K Singh rajkrrsi...@gmail.com wrote: use $hadoop dfsadmin -report Raj K

Re: listing a 530k files directory

2014-05-30 Thread Suresh Srinivas
Listing such a directory should not be a big problem. Can you cut and paste the command output? Which release are you using? Sent from phone On May 30, 2014, at 5:49 AM, Guido Serra z...@fsfe.org wrote: already tried, didn't work (24 cores at 100% and a-lot-memory, still ... GC overhead

Re: any optimize suggestion for high concurrent write into hdfs?

2014-02-20 Thread Suresh Srinivas
Another alternative is to write block-sized chunks into multiple HDFS files concurrently, followed by a concat of all of them into a single file. Sent from phone On Feb 20, 2014, at 8:15 PM, Chen Wang chen.apache.s...@gmail.com wrote: Ch, you may consider using flume as it already has a flume

Re: HDFS Federation address performance issue

2014-01-28 Thread Suresh Srinivas
Response inline... On Tue, Jan 28, 2014 at 10:04 AM, Anfernee Xu anfernee...@gmail.com wrote: Hi, Based on http://hadoop.apache.org/docs/stable/hadoop-project-dist/hadoop-hdfs/Federation.html#Key_Benefits, the overall performance can be improved by federation, but I'm not sure federation

Re: compatibility between new client and old server

2013-12-18 Thread Suresh Srinivas
2.x is a new major release. 1.x and 2.x are not compatible. In 1.x, the RPC wire protocol used java serialization. In 2.x, the RPC wire protocol uses protobuf. A client must be compiled against 2.x and should use appropriate jars from 2.x to work with 2.x. On Wed, Dec 18, 2013 at 10:45 AM, Ken

Re: HDP 2.0 GA?

2013-11-05 Thread Suresh Srinivas
Please send the questions related to a vendor specific distro to vendor mailing list. In this case - http://hortonworks.com/community/forums/. On Tue, Nov 5, 2013 at 10:49 AM, Jim Falgout jim.falg...@actian.com wrote: HDP 2.0.6 is the GA version that matches Apache Hadoop 2.2.

Re: HDFS / Federated HDFS - Doubts

2013-10-16 Thread Suresh Srinivas
On Wed, Oct 16, 2013 at 9:22 AM, Steve Edison sediso...@gmail.com wrote: I have couple of questions about HDFS federation: Can I state different block store directories for each namespace on a datanode ? No. The main idea of federation was not to physically partition the storage across

Re: HDFS federation Configuration

2013-09-23 Thread Suresh Srinivas
I'm not able to follow the page completely. Can you please help me with some clear step-by-step instructions or a bit more detail on the configuration side? Have you set up a non-federated cluster before? If you have, the page should be easy to follow. If you have not set up a non-federated cluster

Re: HDFS federation Configuration

2013-09-19 Thread Suresh Srinivas
Have you looked at - http://hadoop.apache.org/docs/r2.1.0-beta/hadoop-project-dist/hadoop-hdfs/Federation.html Let me know if the document is not clear or needs improvements. Regards, Suresh On Thu, Sep 19, 2013 at 12:01 PM, Manickam P manicka...@outlook.com wrote: Guys, I need some

Re: Name node High Availability in Cloudera 4.1.1

2013-09-19 Thread Suresh Srinivas
Please do not cross-post these emails to hdfs-user. The relevant email list is only cdh-user. On Thu, Sep 19, 2013 at 1:44 AM, Pavan Kumar Polineni smartsunny...@gmail.com wrote: Hi all, *Name Node High Availability Job tracker high availability* is there in Cloudera 4.1.1 ? If not,

Re: Cloudera Vs Hortonworks Vs MapR

2013-09-13 Thread Suresh Srinivas
Shahab, I agree with your arguments. Really well put. The only thing I would add is - we do not want sales/marketing folks getting involved in these kinds of threads, polluting them with sales pitches and unsubstantiated claims and making them a forum for marketing pitches. This can also have community

Re: Cloudera Vs Hortonworks Vs MapR

2013-09-12 Thread Suresh Srinivas
Raj, You can also use Apache Hadoop releases. Bigtop does a fine job as well of putting together a consumable Hadoop stack. As regards vendor solutions, this is not the right forum. There are other forums for this. Please refrain from this type of discussion on the Apache forum. Regards, Suresh On

Re: Symbolic Link in Hadoop 1.0.4

2013-09-05 Thread Suresh Srinivas
FileContext APIs and symlink functionality are not available in 1.0. They are only available in the 0.23 and 2.x releases. On Thu, Sep 5, 2013 at 8:06 AM, Gobilliard, Olivier olivier.gobilli...@cartesian.com wrote: Hi, I am using Hadoop 1.0.4 and need to create a symbolic link in HDFS. This

Re: Documentation for Hadoop's RPC mechanism

2013-08-20 Thread Suresh Srinivas
Create a Jira and post your patch to get it into the hadoop documentation. I can help you with the review and commit. Sent from phone On Aug 20, 2013, at 10:40 AM, Elazar Leibovich elaz...@gmail.com wrote: Hi, I've written some documentation for Hadoop's RPC mechanism internals:

Re: Maven Cloudera Configuration problem

2013-08-13 Thread Suresh Srinivas
Folks, can you please take this thread to CDH related mailing list? On Tue, Aug 13, 2013 at 3:07 PM, Brad Cox bradj...@gmail.com wrote: That link got my hopes up. But Cloudera Manager (what I'm running; on CDH4) does not offer an Export Client Config option. What am I missing? On Aug 13,

Re:

2013-07-12 Thread Suresh Srinivas
Please use the CDH mailing list. This is the Apache Hadoop mailing list. Sent from phone On Jul 12, 2013, at 7:51 PM, Anit Alexander anitama...@gmail.com wrote: Hello, I am encountering a problem in cdh4 environment. I can successfully run the map reduce job in the hadoop cluster. But when i

Re: Cloudera links and Document

2013-07-11 Thread Suresh Srinivas
Sathish, this mailing list is for Apache Hadoop related questions. Please post questions related to other distributions to the appropriate vendor's mailing list. On Thu, Jul 11, 2013 at 6:28 AM, Sathish Kumar sa848...@gmail.com wrote: Hi All, Can anyone help me with a link or document that explains

Re: data loss after cluster wide power loss

2013-07-03 Thread Suresh Srinivas
On Wed, Jul 3, 2013 at 8:12 AM, Colin McCabe cmcc...@alumni.cmu.edu wrote: On Mon, Jul 1, 2013 at 8:48 PM, Suresh Srinivas sur...@hortonworks.com wrote: Dave, Thanks for the detailed email. Sorry I did not read all the details you had sent earlier completely (on my phone). As you said

Re: HDFS file section rewrite

2013-07-02 Thread Suresh Srinivas
HDFS only supports regular writes and append. Random writes are not supported. I do not know of any feature/jira that is underway to support this. On Tue, Jul 2, 2013 at 9:01 AM, John Lilley john.lil...@redpoint.net wrote: I'm sure this has been asked a zillion times, so please just

Re: data loss after cluster wide power loss

2013-07-01 Thread Suresh Srinivas
available? Dave On Mon, Jul 1, 2013 at 3:16 PM, Suresh Srinivas sur...@hortonworks.com wrote: Yes this is a known issue. The HDFS part of this was addressed in https://issues.apache.org/jira/browse/HDFS-744 for 2.0.2-alpha and is not available in 1.x release. I think HBase does not use

Re: Please explain FSNamesystemState TotalLoad

2013-06-07 Thread Suresh Srinivas
, this is useful. Knowing what it represent, you would find many other uses as well. From: Suresh Srinivas sur...@hortonworks.com Reply-To: user@hadoop.apache.org user@hadoop.apache.org Date: Thursday, June 6, 2013 4:14 PM To: hdfs-u...@hadoop.apache.org user@hadoop.apache.org Subject: Re: Please

Re: Please explain FSNamesystemState TotalLoad

2013-06-06 Thread Suresh Srinivas
It is the total number of transceivers (readers and writers) reported by all the datanodes. Each datanode reports this count in its periodic heartbeat to the namenode. On Thu, Jun 6, 2013 at 1:48 PM, Nick Niemeyer nnieme...@riotgames.com wrote: Can someone please explain what TotalLoad represents

Re: How to test the performance of NN?

2013-06-05 Thread Suresh Srinivas
What do you mean by "it is not telling me anything about performance"? Also I do not understand the part "only about potential failures". Can you add more details? nnbench is the best microbenchmark for an nn performance test. On Wed, Jun 5, 2013 at 3:17 PM, Mark Kerzner

Re: cloudera4.2 source code ant

2013-05-17 Thread Suresh Srinivas
Folks, this is Apache Hadoop mailing list. For vendor distro related questions, please use the appropriate vendor mailing list. Sent from a mobile device On May 17, 2013, at 2:06 AM, Kun Ling lkun.e...@gmail.com wrote: Hi dylan, I have not build CDH source code using ant, However I

Re: CDH4 installation along with MRv1 from tarball

2013-03-20 Thread Suresh Srinivas
Can you guys please take this thread to CDH mailing list? Sent from phone On Mar 20, 2013, at 2:48 PM, rohit sarewar rohitsare...@gmail.com wrote: Hi Jens These are not complete version of Hadoop. 1) hadoop-0.20-mapreduce-0.20.2+1341 (has only MRv1) 2) hadoop-2.0.0+922 (has HDFS+ Yarn)

Re: Regarding: Merging two hadoop clusters

2013-03-14 Thread Suresh Srinivas
I have two different hadoop clusters in production. One cluster is used as backing for HBase and the other for other things. Both hadoop clusters are using the same version 1.0 and I want to merge them and make them one. I know, one possible solution is to copy the data across, but the data is

Re: Hadoop cluster hangs on big hive job

2013-03-11 Thread Suresh Srinivas
, 2013 at 1:32 PM, Daning Wang dan...@netseer.com wrote: [hive@mr3-033 ~]$ hadoop version Hadoop 1.0.4 Subversion https://svn.apache.org/repos/asf/hadoop/common/branches/branch-1.0 -r 1393290 Compiled by hortonfo on Wed Oct 3 05:13:58 UTC 2012 On Sun, Mar 10, 2013 at 8:16 AM, Suresh

Re: Hadoop cluster hangs on big hive job

2013-03-10 Thread Suresh Srinivas
What is the version of hadoop? Sent from phone On Mar 7, 2013, at 11:53 AM, Daning Wang dan...@netseer.com wrote: We have hive query processing zipped csv files. the query was scanning for 10 days(partitioned by date). data for each day around 130G. The problem is not consistent since if

Re: [jira] [Commented] (HDFS-4533) start-dfs.sh ignored additional parameters besides -upgrade

2013-03-08 Thread Suresh Srinivas
Please followup on Jenkins failures. Looks like the patch is generated at the wrong directory. On Thu, Feb 28, 2013 at 1:34 AM, Azuryy Yu azury...@gmail.com wrote: Who can review this JIRA(https://issues.apache.org/jira/browse/HDFS-4533), which is very simple. -- Forwarded message

Re: How to setup Cloudera Hadoop to run everything on a localhost?

2013-03-05 Thread Suresh Srinivas
Can you please take this to the Cloudera mailing list? On Tue, Mar 5, 2013 at 10:33 AM, anton ashanin anton.asha...@gmail.com wrote: I am trying to run all Hadoop servers on a single Ubuntu localhost. All ports are open and my /etc/hosts file is 127.0.0.1 frigate frigate.domain.local localhost

Re: QJM HA and ClusterID

2013-02-26 Thread Suresh Srinivas
It looks like start-dfs.sh has a bug. It only takes the -upgrade option and ignores clusterId. Consider running the command directly (which is what start-dfs.sh calls): bin/hdfs namenode -upgrade -clusterId <your cluster ID> Please file a bug, if you can, for the start-dfs.sh issue of ignoring additional parameters.

Re: Hive Metastore DB Issue ( Cloudera CDH4.1.2 MRv1 with hive-0.9.0-cdh4.1.2)

2013-02-07 Thread Suresh Srinivas
Please only use CDH mailing list and do not copy this to hdfs-user. On Thu, Feb 7, 2013 at 7:20 AM, samir das mohapatra samir.help...@gmail.com wrote: Any Suggestion... On Thu, Feb 7, 2013 at 4:17 PM, samir das mohapatra samir.help...@gmail.com wrote: Hi All, I could

Re: Application of Cloudera Hadoop for Dataset analysis

2013-02-05 Thread Suresh Srinivas
Please take this thread to CDH mailing list. On Tue, Feb 5, 2013 at 2:43 AM, Sharath Chandra Guntuku sharathchandr...@gmail.com wrote: Hi, I am Sharath Chandra, an undergraduate student at BITS-Pilani, India. I would like to get the following clarifications regarding cloudera hadoop

Re: Advice on post mortem of data loss (v 1.0.3)

2013-02-05 Thread Suresh Srinivas
Sorry to hear you are having issues. A few questions and comments inline. On Fri, Feb 1, 2013 at 8:40 AM, Peter Sheridan psheri...@millennialmedia.com wrote: Yesterday, I bounced my DFS cluster. We realized that ulimit -u was, in extreme cases, preventing the name node from creating threads.

Re: ClientProtocol Version mismatch. (client = 69, server = 1)

2013-01-29 Thread Suresh Srinivas
Please take this up on the CDH mailing list. Most likely you are using a client that is not from the 2.0 release of Hadoop. On Tue, Jan 29, 2013 at 12:33 PM, Kim Chew kchew...@gmail.com wrote: I am using CDH4 (2.0.0-mr1-cdh4.1.2) vm running on my mbp. I was trying to invoke a remote method in the

Re: Using distcp with Hadoop HA

2013-01-29 Thread Suresh Srinivas
Currently, as you have pointed out, client side configuration based failover is used in HA setup. The configuration must define namenode addresses for the nameservices of both the clusters. Are the datanodes belonging to the two clusters running on the same set of nodes? Can you share the

Re: Cohesion of Hadoop team?

2013-01-18 Thread Suresh Srinivas
On Fri, Jan 18, 2013 at 6:48 AM, Glen Mazza gma...@talend.com wrote: Hi, looking at the derivation of the 0.23.x 2.0.x branches on one hand, and the 1.x branches on the other, as described here:

Re: NN Memory Jumps every 1 1/2 hours

2012-12-27 Thread Suresh Srinivas
a slow death. On Sun, Dec 23, 2012 at 9:40 PM, Suresh Srinivas sur...@hortonworks.com wrote: Do not have access to my computer. Based on reading the previous email, I do not see any thing suspicious on the list of objects in the histo live dump. I would like to hear from you about

Re: NN Memory Jumps every 1 1/2 hours

2012-12-27 Thread Suresh Srinivas
we really mean is the name node data must fit in memory 3x On Thu, Dec 27, 2012 at 5:08 PM, Suresh Srinivas sur...@hortonworks.com wrote: You did free up lot of old generation with reducing young generation, right? The extra 5G of RAM for the old generation should have helped. Based

Re: NN Memory Jumps every 1 1/2 hours

2012-12-27 Thread Suresh Srinivas
. But these optimizations affect only the fsimage and not the memory consumed on the namenode. Will I be saving 400,000,000 bytes of memory if I do? On Thu, Dec 27, 2012 at 5:41 PM, Suresh Srinivas sur...@hortonworks.com wrote: I do not follow what you mean here. Even when I forced a GC it cleared 0

Re: NN Memory Jumps every 1 1/2 hours

2012-12-23 Thread Suresh Srinivas
at 10:23 PM, Suresh Srinivas sur...@hortonworks.com wrote: -XX:NewSize=1G -XX:MaxNewSize=1G

Re: NN Memory Jumps every 1 1/2 hours

2012-12-22 Thread Suresh Srinivas
This looks to me is because of larger default young generation size in newer java releases - see http://docs.oracle.com/javase/6/docs/technotes/guides/vm/cms-6.html#heap_size. I can see looking at your GC logs, around 6G space being used for young generation (though I do not see logs related to
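One way to pin the young generation size on the namenode, as discussed above, is via HADOOP_NAMENODE_OPTS in hadoop-env.sh. A sketch; the 1G values mirror the flags quoted elsewhere in this thread and are illustrative, not a recommendation for every heap:

```shell
# hadoop-env.sh fragment: cap the namenode's young generation (sizes illustrative)
export HADOOP_NAMENODE_OPTS="-XX:NewSize=1G -XX:MaxNewSize=1G ${HADOOP_NAMENODE_OPTS}"
echo "$HADOOP_NAMENODE_OPTS"
```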

Re: Is there an additional overhead when storing data in HDFS?

2012-11-20 Thread Suresh Srinivas
For a 4GB file, HDFS uses 4GB for the file plus checksum data. The default is to store 4 bytes of checksum for every 512 bytes of data; in this case an additional 32MB. On Tue, Nov 20, 2012 at 11:00 PM, WangRamon ramon_w...@hotmail.com wrote: Hi All I'm wondering if there is an additional overhead when
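The arithmetic behind that 32MB figure, using the default of one 4-byte checksum per 512 bytes of data (variable names are just for illustration):

```shell
# checksum overhead = file_size / bytes_per_checksum * 4
file_bytes=$((4 * 1024 * 1024 * 1024))   # 4 GB file
bytes_per_checksum=512                   # default: one CRC per 512 data bytes
checksum_bytes=$((file_bytes / bytes_per_checksum * 4))
echo "checksum overhead: $((checksum_bytes / 1024 / 1024)) MB"   # prints "checksum overhead: 32 MB"
```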

Re: High Availability - second namenode (master2) issue: Incompatible namespaceIDs

2012-11-16 Thread Suresh Srinivas
Vinay, if the Hadoop docs are not clear in this regard, can you please create a jira to add these details? On Fri, Nov 16, 2012 at 12:31 AM, Vinayakumar B vinayakuma...@huawei.comwrote: Hi, ** ** If you are moving from NonHA (single master) to HA, then follow the below steps.

Re: could only be replicated to 0 nodes, instead of 1

2012-09-04 Thread Suresh Srinivas
- A datanode typically keeps free space of up to 5 blocks (HDFS block size). - Disk space is also used by mapreduce jobs to store temporary shuffle spills. This is what dfs.datanode.du.reserved is used to configure. The configuration is available in hdfs-site.xml. If you have not
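A reservation is set per volume in hdfs-site.xml; a sketch (the 10 GB value is just an example, not a recommendation):

```xml
<!-- hdfs-site.xml: bytes per volume reserved for non-DFS use -->
<property>
  <name>dfs.datanode.du.reserved</name>
  <value>10737418240</value> <!-- 10 GB, illustrative -->
</property>
```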

Re: could only be replicated to 0 nodes, instead of 1

2012-09-04 Thread Suresh Srinivas
wrote: On Sep 4, 2012, at 10:05 , Suresh Srinivas wrote: When these errors are thrown, please send the namenode web UI information. It has storage related information in the cluster summary. That will help debug. Sure thing. Thanks. Here's what I currently see. It looks like the problem

Re: Hadoop WebUI

2012-08-01 Thread Suresh Srinivas
Clement, To get the details related to how to contribute - see http://wiki.apache.org/hadoop/HowToContribute. UI is simple because it serves the purpose. More sophisticated UI for management and monitoring is being done in Ambari, see - http://incubator.apache.org/ambari/. The core hadoop UIs

Re: Namenode and Jobtracker dont start

2012-07-18 Thread Suresh Srinivas
Can you share information on the java version that you are using? - Is it as obvious as some previous processes still running, so that new processes cannot bind to the port? - Another pointer -

Re: can HADOOP-6546: BloomMapFile can return false negatives get backported to branch-1?

2012-05-08 Thread Suresh Srinivas
This change is merged into branch-1 and will be available in release 1.1. On Mon, May 7, 2012 at 6:40 PM, Jim Donofrio donofrio...@gmail.com wrote: Can someone backport HADOOP-6546: BloomMapFile can return false negatives to branch-1 for the next 1+ release? Without this fix BloomMapFile is

Re: can HADOOP-6546: BloomMapFile can return false negatives get backported to branch-1?

2012-05-07 Thread Suresh Srinivas
I have marked it for 1.1. I will follow up on promoting the patch. Regards, Suresh On May 7, 2012, at 6:40 PM, Jim Donofrio donofrio...@gmail.com wrote: Can someone backport HADOOP-6546: BloomMapFile can return false negatives to branch-1 for the next 1+ release? Without this fix

Re: Best practice to migrate HDFS from 0.20.205 to CDH3u3

2012-05-03 Thread Suresh Srinivas
This probably is a more relevant question in CDH mailing lists. That said, what Edward is suggesting seems reasonable. Reduce replication factor, decommission some of the nodes and create a new cluster with those nodes and do distcp. Could you share with us the reasons you want to migrate from

Re: hadoop permission guideline

2012-03-22 Thread Suresh Srinivas
Can you please take this discussion to the CDH mailing list? On Mar 22, 2012, at 7:51 AM, Michael Wang michael.w...@meredith.com wrote: I have installed Cloudera hadoop (CDH). I used its Cloudera Manager to install all needed packages. When it was installed, the root is used. I found the

Re: Issue when starting services on CDH3

2012-03-15 Thread Suresh Srinivas
Guys, can you please take this up on the CDH mailing lists. On Thu, Mar 15, 2012 at 10:01 AM, Manu S manupk...@gmail.com wrote: Because for large clusters we have to run the namenode on a single node and datanodes on the other nodes So we can start namenode and jobtracker in master node and

Re: Questions about HDFS’s placement policy

2012-03-14 Thread Suresh Srinivas
See my comments inline: On Wed, Mar 14, 2012 at 9:24 AM, Giovanni Marzulli giovanni.marzu...@ba.infn.it wrote: Hello, I'm trying HDFS on a small test cluster and I need to clarify some doubts about hadoop behaviour. Some details of my cluster: Hadoop version: 0.20.2 I have two racks

Re: What is the NEW api?

2012-03-11 Thread Suresh Srinivas
there are many people talking about the NEW API This might be related to releases 0.21 or later, where append and related functionality is re-implemented. 1.0 comes from 0.20.205 and has the same API as 0.20-append. Sent from phone On Mar 11, 2012, at 6:27 PM, WangRamon ramon_w...@hotmail.com

Re: Backupnode in 1.0.0?

2012-02-23 Thread Suresh Srinivas
On Thu, Feb 23, 2012 at 12:41 AM, Jeremy Hansen jer...@skidrow.la wrote: Thanks. Could you clarify what BackupNode does? -jeremy Namenode currently keeps the entire file system namespace in memory. It logs the write operations (create, delete file etc.) into a journal file called editlog.

Re: Backupnode in 1.0.0?

2012-02-22 Thread Suresh Srinivas
my iPhone On Feb 22, 2012, at 14:56, Jeremy Hansen jer...@skidrow.la wrote: Any possibility of getting spec files to create packages for 0.22? Thanks -jeremy On Feb 22, 2012, at 11:50 AM, Suresh Srinivas wrote: BackupNode is major functionality with change in required in RPC

Re: Setting up Federated HDFS

2012-02-07 Thread Suresh Srinivas
On Tue, Feb 7, 2012 at 4:51 PM, Chandrasekar chandruseka...@gmail.com wrote: In which file should i specify all this information about nameservices and the list of namenodes? hdfs-site.xml is the appropriate place, since it is hdfs-specific configuration. If there are multiple
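For example, a two-nameservice federation could be declared in hdfs-site.xml like this (host names and ports are illustrative):

```xml
<!-- hdfs-site.xml: federation with two nameservices (values illustrative) -->
<property>
  <name>dfs.nameservices</name>
  <value>ns1,ns2</value>
</property>
<property>
  <name>dfs.namenode.rpc-address.ns1</name>
  <value>nn-host1:8020</value>
</property>
<property>
  <name>dfs.namenode.rpc-address.ns2</name>
  <value>nn-host2:8020</value>
</property>
```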

Re: HDFS Federation Exception

2012-01-11 Thread Suresh Srinivas
Thanks for figuring that. Could you create an HDFS Jira for this issue? On Wednesday, January 11, 2012, Praveen Sripati praveensrip...@gmail.com wrote: Hi, The documentation (1) suggested to set the `dfs.namenode.rpc-address.ns1` property to `hdfs://nn-host1:rpc-port` in the example. Changing

Re: datanode failing to start

2012-01-09 Thread Suresh Srinivas
Can you please send your notes on what info is out of date, or better still create a jira so that it can be addressed. On Fri, Jan 6, 2012 at 3:11 PM, Dave Kelsey da...@gamehouse.com wrote: gave up and installed version 1. it installed correctly and worked, though the instructions for setup

Re: HDFS load balancing for non-local reads

2012-01-05 Thread Suresh Srinivas
Currently it sorts the block locations as: # local node # local rack node # random order of remote nodes See DatanodeManager#sortLocatedBlock(...) and NetworkTopology#pseudoSortByDistance(...). You can play around with other policies by plugging in a different NetworkTopology. On Thu, Jan 5, 2012

Re: HDFS Backup nodes

2011-12-13 Thread Suresh Srinivas
Srivas, As you may know already, NFS is just being used in the first prototype for HA. Two options for the editlog store are: 1. Using BookKeeper. Work has already completed on trunk towards this. This will replace the need for NFS to store the editlogs and is highly available. This solution will also

Re: Difference between DFS Used and Non-DFS Used

2011-07-08 Thread Suresh Srinivas
Non-DFS storage is not a requirement; it is shown only as information about how the storage is being used. The available storage on the disks is used for both DFS and non-DFS data (mapreduce shuffle output and any other files that could be on the disks). See if you have unnecessary files or shuffle

Re: Rapid growth in Non DFS Used disk space

2011-05-14 Thread suresh srinivas
dfs.data.dir/current is used by datanodes to store blocks. This directory should only have files starting with blk_. Things to check: - Are there other files that are not blk related? - Did you manually copy the contents of one storage dir to another? (some folks did this when they added new

Re: CDH and Hadoop

2011-03-24 Thread suresh srinivas
On Thu, Mar 24, 2011 at 7:04 PM, Rita rmorgan...@gmail.com wrote: Oh! Thats for the heads up on that... I guess I will go with the cloudera source then On Thu, Mar 24, 2011 at 8:41 PM, David Rosenstrauch dar...@darose.net wrote: They do, but IIRC, they recently announced that they're

Re: hadoop fs -du hbase table size

2011-03-15 Thread suresh srinivas
When you brought down the DN, the blocks on it were replicated to the remaining DNs. When the DN was added back, its blocks were over-replicated, resulting in deletion of the extra replicas. On Mon, Mar 14, 2011 at 7:34 AM, Alex Baranau alex.barano...@gmail.com wrote: Hello, As far as I

Re: copy a file from hdfs to local file system with java

2011-02-25 Thread suresh srinivas
For an example of how it is done, look at FsShell#copyToLocal() and its internal implementation. It uses the FileUtil#copy() method to do the copying. On Fri, Feb 25, 2011 at 5:08 AM, Alessandro Binhara binh...@gmail.com wrote: How to copy a file from HDFS to the local file system with the JAVA API?

Re: corrupt blocks after restart

2011-02-19 Thread suresh srinivas
The problem is that replicas for 3609 blocks are not reported to namenode. Do you have datanodes in exclude file? What is the number of registered nodes before start compared to what it is now? Removing all the datanodes from exclude file (if there are any) and restarting the cluster should fix

Re: Data Nodes do not start

2011-02-09 Thread suresh srinivas
On Tue, Feb 8, 2011 at 11:05 PM, rahul patodi patodira...@gmail.com wrote: I think you should copy the namespaceID of your master which is in name/current/VERSION file to all the slaves This is a sure recipe for disaster. The VERSION file is a file system meta data file not to be messed