Support multiple block placement policies

2014-09-15 Thread Zesheng Wu
Hi there, According to the code, the current implementation of HDFS supports only one specific type of block placement policy, which is BlockPlacementPolicyDefault by default. The default policy is sufficient for most circumstances, but under some special circumstances it does not work so well. For
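For illustration, a minimal sketch of how a custom policy could be wired in, assuming a hypothetical class com.example.MyPlacementPolicy that extends HDFS's BlockPlacementPolicy (the dfs.block.replicator.classname key is how the NameNode selects the policy class):

    // Minimal sketch; com.example.MyPlacementPolicy is hypothetical and would
    // need to extend org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicy.
    import org.apache.hadoop.conf.Configuration;

    public class PlacementPolicyConfig {
        public static void main(String[] args) {
            Configuration conf = new Configuration();
            // The NameNode reads this key at startup to instantiate the policy.
            conf.set("dfs.block.replicator.classname",
                     "com.example.MyPlacementPolicy");
            System.out.println(conf.get("dfs.block.replicator.classname"));
        }
    }

In practice the same key would be set in hdfs-site.xml on the NameNode rather than programmatically.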

Re: HDFS: Couldn't obtain the locations of the last block

2014-09-10 Thread Zesheng Wu
only in the case where the NN reported the last block size as non-zero because it was synced (see more in HDFS-4516). Regards, Yi Liu *From:* Zesheng Wu [mailto:wuzeshen...@gmail.com] *Sent:* Tuesday, September 09, 2014 6:16 PM *To:* user@hadoop.apache.org *Subject:* HDFS: Couldn't obtain the locations

Re: HDFS: Couldn't obtain the locations of the last block

2014-09-10 Thread Zesheng Wu
Hi Yi, I went through HDFS-4516, and it really solves our problem, thanks very much! 2014-09-10 16:39 GMT+08:00 Zesheng Wu wuzeshen...@gmail.com: Thanks Yi, I will look into HDFS-4516. 2014-09-10 15:03 GMT+08:00 Liu, Yi A yi.a@intel.com: Hi Zesheng, I got from an offline email

HDFS: Couldn't obtain the locations of the last block

2014-09-09 Thread Zesheng Wu
Hi, These days we encountered a critical bug in HDFS which can prevent HBase from starting normally. The scenario is as follows: 1. rs1 writes data to HDFS file f1, and the first block is written successfully 2. rs1 applies to create the second block successfully; at this time, nn1 (the active NN) is
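A hedged sketch of one recovery approach for a file stuck in this state: asking the NameNode to recover the lease so the last block can be finalized. The URI and path below are hypothetical, and this is not the thread's actual resolution (HDFS-4516 describes the fix):

    import java.net.URI;
    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.hdfs.DistributedFileSystem;

    public class RecoverLastBlock {
        public static void main(String[] args) throws Exception {
            Configuration conf = new Configuration();
            DistributedFileSystem dfs = (DistributedFileSystem)
                FileSystem.get(URI.create("hdfs://nn1:8020"), conf);
            // Ask the NN to recover the lease and finalize the last block.
            boolean closed = dfs.recoverLease(new Path("/hbase/WALs/f1"));
            System.out.println("file closed: " + closed);
        }
    }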

Re: Replace a block with a new one

2014-07-21 Thread Zesheng Wu
HDFS on top of RAID? I am not sure I understand any of these use cases. HDFS handles replication and error detection for you. Wouldn't fine-tuning the cluster be the easier solution? Bertrand Dechoux On Mon, Jul 21, 2014 at 7:25 AM, Zesheng Wu wuzeshen...@gmail.com wrote: Thanks for the reply

Re: Replace a block with a new one

2014-07-21 Thread Zesheng Wu
is that the fewer replicas there are, the less 'efficient/fast' reads of the data for processing will be. Reducing the number of replicas also diminishes the number of node failures that can be tolerated. I wouldn't say it's an easy ride. Bertrand Dechoux On Mon, Jul 21, 2014 at 1:29 PM, Zesheng Wu wuzeshen

Re: Replace a block with a new one

2014-07-21 Thread Zesheng Wu
-07-21 20:30 GMT+08:00 Zesheng Wu wuzeshen...@gmail.com: Thanks Bertrand, my reply comments are inline below. So you know that a block is corrupted thanks to an external process, which in this case is checking the parity blocks. If a block is corrupted but hasn't been detected by HDFS, you could

Re: Replace a block with a new one

2014-07-21 Thread Zesheng Wu
for such a tool on a Hadoop cluster. Bertrand Dechoux On Mon, Jul 21, 2014 at 2:35 PM, Zesheng Wu wuzeshen...@gmail.com wrote: If a block is corrupted but hasn't been detected by HDFS, you could delete the block from the local filesystem (it's only a file) and then HDFS will replicate the good
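As an alternative to deleting block files by hand, a hedged sketch of reporting a block as corrupt through the client API so the NameNode re-replicates it from the remaining good replicas. DFSClient is private-audience HDFS internals and the path is hypothetical, so treat this as a sketch only:

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.hdfs.DFSClient;
    import org.apache.hadoop.hdfs.DistributedFileSystem;
    import org.apache.hadoop.hdfs.protocol.LocatedBlock;

    public class ReportBadBlock {
        public static void main(String[] args) throws Exception {
            Configuration conf = new Configuration();
            // Assumes fs.defaultFS points at an HDFS cluster.
            DistributedFileSystem dfs =
                (DistributedFileSystem) FileSystem.get(conf);
            Path file = new Path("/data/f1");        // hypothetical path
            long len = dfs.getFileStatus(file).getLen();
            DFSClient client = dfs.getClient();
            // Suppose an external parity check flagged the first block as bad.
            LocatedBlock bad = client.getLocatedBlocks(file.toString(), 0, len)
                                     .getLocatedBlocks().get(0);
            client.reportBadBlocks(new LocatedBlock[] { bad });
        }
    }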

Re: Replace a block with a new one

2014-07-21 Thread Zesheng Wu
issues: https://issues.apache.org/jira/browse/HDFS-503 https://issues.apache.org/jira/browse/HDFS-600 Thanks again Bertrand, I will check through the Facebook branch to find more information. 2014-07-22 9:31 GMT+08:00 Zesheng Wu wuzeshen...@gmail.com: Thanks Bertrand, I've checked this information

Re: Replace a block with a new one

2014-07-20 Thread Zesheng Wu
PM, Zesheng Wu wuzeshen...@gmail.com wrote: How about writing a new block with a new checksum file, and replacing both the old block file and checksum file? 2014-07-17 19:34 GMT+08:00 Wellington Chevreuil wellington.chevre...@gmail.com: Hi, there's no way to do that, as HDFS does not provide

Replace a block with a new one

2014-07-17 Thread Zesheng Wu
Hi guys, I recently encountered a scenario which requires replacing an existing block with a newly written block. The most straightforward way to do this may be as follows: Suppose the original file is A, and we write a new file B which is composed of the new data blocks, then we merge A and B into C which
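A hedged sketch of the merge step described above, using DistributedFileSystem.concat() to splice B's blocks onto A. concat() has restrictions (for example, all but the last source block must be full), so this is illustrative rather than a drop-in fix; the paths are hypothetical:

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.hdfs.DistributedFileSystem;

    public class MergeFiles {
        public static void main(String[] args) throws Exception {
            Configuration conf = new Configuration();
            DistributedFileSystem dfs =
                (DistributedFileSystem) FileSystem.get(conf);
            Path a = new Path("/data/A");  // file keeping the leading blocks
            Path b = new Path("/data/B");  // file holding the newly written blocks
            dfs.concat(a, new Path[] { b });    // B's blocks are appended to A; B disappears
            dfs.rename(a, new Path("/data/C")); // the merged file becomes C
        }
    }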

Re: Replace a block with a new one

2014-07-17 Thread Zesheng Wu
. Regards, Wellington. On 17 Jul 2014, at 10:50, Zesheng Wu wuzeshen...@gmail.com wrote: Hi guys, I recently encountered a scenario which requires replacing an existing block with a newly written block. The most straightforward way to do this may be as follows: Suppose the original file is A, and we

Re: HDFS File Writes Reads

2014-06-17 Thread Zesheng Wu
1. HDFS doesn't allow parallel writes. 2. HDFS uses a pipeline to write multiple replicas, so it doesn't take three times longer than a traditional file write. 3. HDFS allows parallel reads. 2014-06-17 19:17 GMT+08:00 Vijaya Narayana Reddy Bhoomi Reddy vijay.bhoomire...@gmail.com: Hi, I have a
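A minimal sketch of point 3, assuming a hypothetical file path: positional reads on a single FSDataInputStream are thread-safe, so multiple threads can read disjoint ranges of the same file in parallel:

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FSDataInputStream;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    public class ParallelRead {
        public static void main(String[] args) throws Exception {
            FileSystem fs = FileSystem.get(new Configuration());
            Path file = new Path("/data/f1");          // hypothetical path
            long len = fs.getFileStatus(file).getLen();
            long half = len / 2;
            try (FSDataInputStream in = fs.open(file)) {
                // Each thread preads its own half of the file.
                Thread t1 = new Thread(() -> readRange(in, 0, half));
                Thread t2 = new Thread(() -> readRange(in, half, len - half));
                t1.start(); t2.start();
                t1.join(); t2.join();
            }
        }

        static void readRange(FSDataInputStream in, long pos, long count) {
            byte[] buf = new byte[64 * 1024];
            try {
                long remaining = count;
                while (remaining > 0) {
                    // Positional read: does not move the stream's file pointer.
                    int n = in.read(pos, buf, 0,
                                    (int) Math.min(buf.length, remaining));
                    if (n < 0) break;
                    pos += n;
                    remaining -= n;
                }
            } catch (Exception e) {
                throw new RuntimeException(e);
            }
        }
    }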

Re: Programmatic Kerberos login with password to a secure cluster

2014-06-16 Thread Zesheng Wu
Perhaps you can use LDAP (or any other suitable mechanism) to do the authentication on the WebServer, and then let the WebServer act as an authenticated proxy user that proxies the real users' requests. 2014-06-17 4:11 GMT+08:00 Geoff Thompson ge...@bearpeak.com: Greetings, We are developing a YARN application
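A hedged sketch of the suggested pattern, with hypothetical principal, keytab, and username: the web server logs in once from its own keytab, then impersonates each real user via a proxy UGI (the cluster must whitelist the server through the hadoop.proxyuser.* settings):

    import java.security.PrivilegedExceptionAction;
    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.security.UserGroupInformation;

    public class ProxyAsUser {
        public static void main(String[] args) throws Exception {
            Configuration conf = new Configuration();
            UserGroupInformation.setConfiguration(conf);
            // Service principal and keytab path are hypothetical.
            UserGroupInformation.loginUserFromKeytab(
                "webserver/host@EXAMPLE.COM", "/etc/security/webserver.keytab");
            UserGroupInformation proxy = UserGroupInformation.createProxyUser(
                "alice", UserGroupInformation.getLoginUser());
            // All filesystem calls inside doAs run as the proxied user.
            proxy.doAs((PrivilegedExceptionAction<Void>) () -> {
                FileSystem fs = FileSystem.get(conf);
                System.out.println(fs.exists(new Path("/user/alice")));
                return null;
            });
        }
    }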

Re: Upgrade to 2.4

2014-05-30 Thread Zesheng Wu
Hi Ian, -rollingUpgrade has been available since Hadoop 2.4, so you can't use -rollingUpgrade to upgrade a 2.3 cluster to 2.4. Because HDFS 2.4 introduces a protobuf-formatted fsimage, you must stop the whole cluster and then upgrade it. I tried to upgrade a 2.0 HDFS cluster to 2.4 several days ago,
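For reference, a small sketch of querying rolling-upgrade status programmatically through DFSAdmin, the Java entry point behind `hdfs dfsadmin -rollingUpgrade`; this only works against a 2.4+ NameNode, since the option does not exist in 2.3:

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.hdfs.tools.DFSAdmin;
    import org.apache.hadoop.util.ToolRunner;

    public class RollingUpgradeQuery {
        public static void main(String[] args) throws Exception {
            // Equivalent to running: hdfs dfsadmin -rollingUpgrade query
            int rc = ToolRunner.run(new Configuration(), new DFSAdmin(),
                                    new String[] {"-rollingUpgrade", "query"});
            System.exit(rc);
        }
    }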

Re: HDFS undo Overwriting

2014-05-30 Thread Zesheng Wu
I am afraid this cannot be undone; in HDFS, only data that is deleted by the DFS client and goes into the trash can be restored. 2014-05-30 18:18 GMT+08:00 Amjad ALSHABANI ashshab...@gmail.com: Hello Everybody, I've made a mistake when writing to HDFS. I created a new database in Hive, giving the
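A minimal sketch of the trash behavior mentioned above, with a hypothetical path: deletes routed through the Trash API land in /user/<name>/.Trash and can be moved back out, provided fs.trash.interval is greater than 0 on the cluster:

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.fs.Trash;

    public class SafeDelete {
        public static void main(String[] args) throws Exception {
            Configuration conf = new Configuration();
            FileSystem fs = FileSystem.get(conf);
            Path victim = new Path("/user/hive/warehouse/mydb"); // hypothetical
            // Moves the path into the current user's .Trash instead of deleting it.
            boolean moved = Trash.moveToAppropriateTrash(fs, victim, conf);
            System.out.println("moved to trash: " + moved);
        }
    }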

Re: issue about how to decommission a datanode from hadoop cluster

2014-05-30 Thread Zesheng Wu
I think you just need to set up an exclude file on the NN; that's enough. 2014-05-30 14:09 GMT+08:00 ch huang justlo...@gmail.com: hi, maillist: I use a CDH 4.4 YARN/HDFS cluster, and I want to decommission a datanode. Should I modify hdfs-site.xml and mapred-site.xml on all nodes in the cluster to
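A hedged sketch of that workflow: after adding the datanode's hostname to the file referenced by dfs.hosts.exclude on the NameNode, tell the NN to re-read it, which is the programmatic equivalent of `hdfs dfsadmin -refreshNodes` (the exclude-file path in the comment is hypothetical):

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.hdfs.tools.DFSAdmin;
    import org.apache.hadoop.util.ToolRunner;

    public class DecommissionNode {
        public static void main(String[] args) throws Exception {
            Configuration conf = new Configuration();
            // hdfs-site.xml on the NN should carry something like:
            //   dfs.hosts.exclude = /etc/hadoop/conf/dfs.exclude
            // with the hostname to decommission listed inside that file.
            System.out.println("exclude file: " + conf.get("dfs.hosts.exclude"));
            int rc = ToolRunner.run(conf, new DFSAdmin(),
                                    new String[] {"-refreshNodes"});
            System.exit(rc);
        }
    }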

Re:

2014-01-10 Thread Zesheng Wu
Of course you can; you can think of this as an independent runnable program. 2014/1/11 Andrea Barbato and.barb...@gmail.com Hi, I have a simple question. I have this example code: class WordCountMapper : public HadoopPipes::Mapper { public: // constructor: does nothing WordCountMapper(