Java/WebHDFS - upload with Snappy compression on the fly
Hello, I need to upload a large file using WebHDFS (from local disk into HDFS; WebHDFS is my only option, I don't have direct access). Because in my case the network connection is the bottleneck, I decided to compress the file with Snappy before sending. I am using a Java application compiled against the "org.apache.hadoop:hadoop-client:2.4.0" library. So far my code looks as below:

private void uploadFile(Path hdfsPath, FileSystem fileSystem) throws IOException {
    // Input file reader
    BufferedReader bufferedReader = new BufferedReader(new FileReader(localFile), INPUT_STREAM_BUFFER_SIZE);
    // Output file writer
    FSDataOutputStream hdfsDataOutputStream = fileSystem.create(hdfsPath, false, OUTPUT_STREAM_BUFFER_SIZE);
    SnappyOutputStream snappyOutputStream = new SnappyOutputStream(hdfsDataOutputStream, OUTPUT_STREAM_BUFFER_SIZE);
    BufferedWriter bufferedWriter = new BufferedWriter(new OutputStreamWriter(snappyOutputStream, "UTF-8"));
    String line;
    while ((line = bufferedReader.readLine()) != null) {
        bufferedWriter.write(line);
    }
    bufferedReader.close();
    bufferedWriter.close();
}

Basically it works: the Snappy-compressed file is uploaded to HDFS, yet there seem to be problems with the Snappy format itself. The file is not recognized as Snappy-compressed by Hadoop. I compared my compressed file with another one compressed by Hadoop: the main compressed stream seems to be the same in both files, but the headers are different. What am I doing wrong? Would you be so kind as to suggest a solution for my issue?

Best Regards,
Michal Michalak
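The header mismatch described above is most likely a container-format issue rather than a compression bug: snappy-java's SnappyOutputStream writes its own stream framing, while Hadoop expects the block framing produced by its own org.apache.hadoop.io.compress.SnappyCodec, so files written by one are generally not recognized by the other. Below is a hedged, untested sketch of writing through Hadoop's codec instead; it assumes the Hadoop native snappy library is available on the client, and localFile is the same hypothetical field used in the question:

```java
import java.io.BufferedInputStream;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IOUtils;
import org.apache.hadoop.io.compress.CompressionCodec;
import org.apache.hadoop.io.compress.SnappyCodec;
import org.apache.hadoop.util.ReflectionUtils;

// Sketch: upload localFile to hdfsPath, compressed with Hadoop's own Snappy framing.
private void uploadFile(Path hdfsPath, FileSystem fileSystem, Configuration conf)
        throws IOException {
    CompressionCodec codec = ReflectionUtils.newInstance(SnappyCodec.class, conf);
    try (InputStream in = new BufferedInputStream(new FileInputStream(localFile));
         // createOutputStream wraps the HDFS stream with Hadoop's block-compression framing
         OutputStream out = codec.createOutputStream(fileSystem.create(hdfsPath, false))) {
        // Copy raw bytes instead of readLine()/write(line); this also preserves
        // line terminators, which the original loop silently drops.
        IOUtils.copyBytes(in, out, conf, false);
    }
}
```

Note the byte-level copy also sidesteps a second problem in the original snippet: BufferedReader.readLine() strips the newline characters and they are never written back.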
Default 8 mappers per host?
Hi all,

I configured a Hadoop cluster with 9 hosts, each with 2 vCPUs and 4 GB RAM. I noticed that when I run the example pi program, all hosts become busy only when I configure it with at least 8*9=72 mappers. Does that mean there is a default of 8 mappers per host? How is this value decided, and where can I change it?

Thanks very much!

Sisu

--
*Sisu Xi, PhD Candidate*
http://www.cse.wustl.edu/~xis/
Department of Computer Science and Engineering
Campus Box 1045
Washington University in St. Louis
One Brookings Drive
St. Louis, MO 63130
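Assuming this cluster runs Hadoop 2 on YARN (the question doesn't say), the per-host map count is not a dedicated "slots" setting; it falls out of memory accounting: each NodeManager offers yarn.nodemanager.resource.memory-mb (default 8192 MB) to containers, and each map container requests mapreduce.map.memory.mb (default 1024 MB), so 8192 / 1024 = 8 concurrent map containers per host, which matches the observed 8. A hedged sketch of the two knobs involved (the values are illustrative, not recommendations):

```xml
<!-- yarn-site.xml: memory each NodeManager offers to containers -->
<property>
  <name>yarn.nodemanager.resource.memory-mb</name>
  <value>3072</value> <!-- illustrative: leave ~1 GB for the OS on a 4 GB host -->
</property>

<!-- mapred-site.xml: memory each map container requests -->
<property>
  <name>mapreduce.map.memory.mb</name>
  <value>1024</value>
</property>
```

With these illustrative values a host would run 3072 / 1024 = 3 maps at a time. (On classic MR1 the equivalent knob would be mapred.tasktracker.map.tasks.maximum.)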
Re: Not able to place enough replicas
Maybe the user 'test' doesn't have write permission. You can check for an ERROR log line like:

org.apache.hadoop.security.UserGroupInformation: PriviledgedActionException as:test (auth:SIMPLE)

2014-07-15 2:07 GMT+08:00 Bogdan Raducanu :
> I'm getting this error while writing many files.
> org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicy: Not
> able to place enough replicas, still in need of 4 to reach 4
>
> I've set logging to DEBUG but still there is no reason printed. There
> should've been a reason after this line but instead there's just an empty
> line.
> Has anyone seen something like this before? It is seen on a 4 node cluster
> running hadoop 2.2
>
> org.apache.hadoop.hdfs.StateChange: *DIR* NameNode.create: file /file_1002
> for DFSClient_NONMAPREDUCE_839626346_1 at 192.168.180.1
> org.apache.hadoop.hdfs.StateChange: DIR* NameSystem.startFile:
> src=/file_1002, holder=DFSClient_NONMAPREDUCE_839626346_1,
> clientMachine=192.168.180.1, createParent=true, replication=4,
> createFlag=[CREATE, OVERWRITE]
> org.apache.hadoop.hdfs.StateChange: DIR* addFile: /file_1002 is added
> org.apache.hadoop.hdfs.StateChange: DIR* NameSystem.startFile: add
> /file_1002 to namespace for DFSClient_NONMAPREDUCE_839
> << ... many other operations ... >>
> 8 seconds later:
> org.apache.hadoop.hdfs.StateChange: *BLOCK* NameNode.addBlock: file
> /file_1002 fileId=189252 for DFSClient_NONMAPREDUCE_839626346_1
> org.apache.hadoop.hdfs.StateChange: BLOCK* NameSystem.getAdditionalBlock:
> file /file_1002 for DFSClient_NONMAPREDUCE_839626346_1
> org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicy: Not
> able to place enough replicas, still in need of 4 to reach 4
> << EMPTY LINE >>
> org.apache.hadoop.security.UserGroupInformation:
> PriviledgedActionException as:test (auth:SIMPLE) cause:java.io.IOException:
> File /file_1002 could only be replicated to 0 nodes instead of
> minReplication (=1). There are 4 datanode(s) running and no node(s) are
> excluded in this operation.
> org.apache.hadoop.ipc.Server: IPC Server handler 9 on 8020, call
> org.apache.hadoop.hdfs.protocol.ClientProtocol.addBlock from
> 192.168.180.1:49592 Call#1321 Retry#0: error: java.io.IOException: File
> /file_1002 could only be replicated to 0 nodes instead of minReplication
> (=1). There are 4 datanode(s) running and no node(s) are excluded in this
> operation.
> java.io.IOException: File /file_1002 could only be replicated to 0 nodes
> instead of minReplication (=1). There are 4 datanode(s) running and no
> node(s) are excluded in this operation.
> at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.chooseTarget(BlockManager.java:1384)
> at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:2477)
> at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.addBlock(NameNodeRpcServer.java:555)
> at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.addBlock(ClientNamenodeProtocolServerSideTranslatorPB.java:387)
> at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java:59582)
> at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:585)
> at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:928)
> at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2048)
> at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2044)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:415)
> at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1491)
> at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2042):0
Re: clarification on HBASE functionality
Right. HBase is different from Cassandra in this regard.

On Mon, Jul 14, 2014 at 2:57 PM, Adaryl "Bob" Wakefield, MBA wrote:
> Now this is different from Cassandra, which does NOT use HDFS, correct?
> (Sorry. Don't know why that needed two emails.)
>
> B.
>
> From: Ted Yu
> Sent: Monday, July 14, 2014 4:53 PM
> To: user@hadoop.apache.org
> Subject: Re: clarification on HBASE functionality
>
> Yes. See http://hbase.apache.org/book.html#arch.hdfs
>
> On Mon, Jul 14, 2014 at 2:52 PM, Adaryl "Bob" Wakefield, MBA wrote:
>> HBase uses HDFS to store its data, correct?
>>
>> B.
Re: clarification on HBASE functionality
Now this is different from Cassandra, which does NOT use HDFS, correct? (Sorry. Don't know why that needed two emails.)

B.

From: Ted Yu
Sent: Monday, July 14, 2014 4:53 PM
To: user@hadoop.apache.org
Subject: Re: clarification on HBASE functionality

Yes. See http://hbase.apache.org/book.html#arch.hdfs

On Mon, Jul 14, 2014 at 2:52 PM, Adaryl "Bob" Wakefield, MBA wrote:
> HBase uses HDFS to store its data, correct?
>
> B.
Re: clarification on HBASE functionality
Yes. See http://hbase.apache.org/book.html#arch.hdfs

On Mon, Jul 14, 2014 at 2:52 PM, Adaryl "Bob" Wakefield, MBA wrote:
> HBase uses HDFS to store its data, correct?
>
> B.
clarification on HBASE functionality
HBase uses HDFS to store its data, correct?

B.
Re: OIV Compatibility
There shouldn't be any - it basically streams over the existing local fsimage file. On Tue, Jul 15, 2014 at 12:21 AM, Ashish Dobhal wrote: > Sir I tried it it works. Are there any issues in downloading the gsimage > using wget. > > > On Tue, Jul 15, 2014 at 12:17 AM, Harsh J wrote: >> >> Sure, you could try that. I've not tested that mix though, and OIV >> relies on some known formats support, but should hopefully work. >> >> On Mon, Jul 14, 2014 at 11:56 PM, Ashish Dobhal >> wrote: >> > Could I download the fsimage of a hadoop 1.0 using wget and then >> > interpret >> > it in offline mode using the tool in the hadoop 1.2 or higher >> > distributions.I guess the structure of fsimage would be same for both >> > the >> > distributions. >> > >> > >> > On Mon, Jul 14, 2014 at 11:53 PM, Ashish Dobhal >> > >> > wrote: >> >> >> >> Harsh thanks >> >> >> >> >> >> On Mon, Jul 14, 2014 at 11:39 PM, Harsh J wrote: >> >>> >> >>> The OIV for 1.x series is available in release 1.2.0 and higher. You >> >>> can use it from the 'hadoop oiv' command. >> >>> >> >>> It is not available in 1.0.x. >> >>> >> >>> On Mon, Jul 14, 2014 at 9:49 PM, Ashish Dobhal >> >>> wrote: >> >>> > Hey everyone ; >> >>> > Could anyone tell me how to use the OIV tool in hadoop 1.0 as there >> >>> > is >> >>> > no >> >>> > hdfs.sh file there. >> >>> > Thanks. >> >>> >> >>> >> >>> >> >>> -- >> >>> Harsh J >> >> >> >> >> > >> >> >> >> -- >> Harsh J > > -- Harsh J
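For reference, the wget-plus-OIV workflow discussed in this thread can be sketched concretely. This is a hedged, untested outline: "namenode" and port 50070 are placeholders for your NameNode's HTTP address, and the getimage servlet query and OIV processor name should be double-checked against your exact Hadoop 1.x version:

```shell
# Fetch the current fsimage from the NameNode's web interface (Hadoop 1.x servlet).
wget -O fsimage.bin "http://namenode:50070/getimage?getimage=1"

# Interpret it offline with the image viewer (available from release 1.2.0 as 'hadoop oiv').
hadoop oiv -i fsimage.bin -o fsimage.txt -p Indented
```

Since OIV only streams over a local copy of the file, downloading via wget first and pointing -i at the result is exactly the pattern the reply above describes.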
Re: changing split size in Hadoop configuration
For what it's worth, mapreduce.jobtracker.split.metainfo.maxsize is related to the size of the file containing the information describing the input splits. It is not related directly to the volume of data but to the number of splits which might explode when using too many (small) files. It's basically a safeguard. Alternatively, you might want to reduce the number of splits ; raising the block size is one way to do it. Bertrand Dechoux On Mon, Jul 14, 2014 at 7:50 PM, Adam Kawa wrote: > It sounds like JobTracker setting, so the restart looks to be required. > > You verify it in pseudo-distributed mode by setting it to a very low > value, restarting JT and seeing if you get the exception that prints this > new value. > > Sent from my iPhone > > On 14 jul 2014, at 16:03, Jan Warchoł wrote: > > Hello, > > I recently got "Split metadata size exceeded 1000" error when running > Cascading jobs with very big joins. I found that I should change > mapreduce.jobtracker.split.metainfo.maxsize property in hadoop > configuration by adding this to the mapred-site.xml file: > > > > mapreduce.jobtracker.split.metainfo.maxsize > 10 > > > but it didn't seem to have any effect - I'm probably doing something wrong. > > Where should I add this change so that is has the desired effect? Do I > understand correctly that jobtracker restart is required after making the > change? The cluster I'm working on has Hadoop 1.0.4. > > thanks for any help, > -- > *Jan Warchoł* > *Software Engineer* > > > - > M: +48 509 078 203 > E: jan.warc...@codilime.com > - > CodiLime Sp. z o.o. - Ltd. company with its registered office in Poland, > 01-167 Warsaw, ul. Zawiszy 14/97. Registered by The District Court for the > Capital City of Warsaw, XII Commercial Department of the National Court > Register. Entered into National Court Register under No. KRS 388871. > Tax identification number (NIP) 5272657478. Statistical number > (REGON) 142974628. > >
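To make the advice above concrete: on Hadoop 1.x this is a JobTracker-side property, so it goes in mapred-site.xml on the JobTracker host and takes effect only after a JobTracker restart. A hedged sketch of the raised-limit variant (the 100000000-byte value is purely illustrative; the alternative of reducing the number of splits, e.g. by raising the block size, avoids touching this safeguard at all):

```xml
<!-- mapred-site.xml on the JobTracker node; restart the JobTracker afterwards -->
<property>
  <name>mapreduce.jobtracker.split.metainfo.maxsize</name>
  <value>100000000</value> <!-- bytes; illustrative value, not a recommendation -->
</property>
```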
Re: OIV Compatibility
Sir, I tried it and it works. Are there any issues in downloading the fsimage using wget?

On Tue, Jul 15, 2014 at 12:17 AM, Harsh J wrote:
> Sure, you could try that. I've not tested that mix though, and OIV
> relies on some known formats support, but should hopefully work.
>
> On Mon, Jul 14, 2014 at 11:56 PM, Ashish Dobhal wrote:
> > Could I download the fsimage of a hadoop 1.0 using wget and then interpret
> > it in offline mode using the tool in the hadoop 1.2 or higher
> > distributions? I guess the structure of fsimage would be same for both the
> > distributions.
> >
> > On Mon, Jul 14, 2014 at 11:53 PM, Ashish Dobhal wrote:
> >> Harsh thanks
> >>
> >> On Mon, Jul 14, 2014 at 11:39 PM, Harsh J wrote:
> >>> The OIV for 1.x series is available in release 1.2.0 and higher. You
> >>> can use it from the 'hadoop oiv' command.
> >>>
> >>> It is not available in 1.0.x.
> >>>
> >>> On Mon, Jul 14, 2014 at 9:49 PM, Ashish Dobhal wrote:
> >>> > Hey everyone ;
> >>> > Could anyone tell me how to use the OIV tool in hadoop 1.0 as there is no
> >>> > hdfs.sh file there.
> >>> > Thanks.
> >>>
> >>> --
> >>> Harsh J
Re: OIV Compatibility
Sure, you could try that. I've not tested that mix though, and OIV relies on some known formats support, but should hopefully work. On Mon, Jul 14, 2014 at 11:56 PM, Ashish Dobhal wrote: > Could I download the fsimage of a hadoop 1.0 using wget and then interpret > it in offline mode using the tool in the hadoop 1.2 or higher > distributions.I guess the structure of fsimage would be same for both the > distributions. > > > On Mon, Jul 14, 2014 at 11:53 PM, Ashish Dobhal > wrote: >> >> Harsh thanks >> >> >> On Mon, Jul 14, 2014 at 11:39 PM, Harsh J wrote: >>> >>> The OIV for 1.x series is available in release 1.2.0 and higher. You >>> can use it from the 'hadoop oiv' command. >>> >>> It is not available in 1.0.x. >>> >>> On Mon, Jul 14, 2014 at 9:49 PM, Ashish Dobhal >>> wrote: >>> > Hey everyone ; >>> > Could anyone tell me how to use the OIV tool in hadoop 1.0 as there is >>> > no >>> > hdfs.sh file there. >>> > Thanks. >>> >>> >>> >>> -- >>> Harsh J >> >> > -- Harsh J
Re: OIV Compatibility
Could I download the fsimage of a hadoop 1.0 using wget and then interpret it in offline mode using the tool in the hadoop 1.2 or higher distributions? I guess the structure of the fsimage would be the same for both distributions.

On Mon, Jul 14, 2014 at 11:53 PM, Ashish Dobhal wrote:
> Harsh thanks
>
> On Mon, Jul 14, 2014 at 11:39 PM, Harsh J wrote:
>> The OIV for 1.x series is available in release 1.2.0 and higher. You
>> can use it from the 'hadoop oiv' command.
>>
>> It is not available in 1.0.x.
>>
>> On Mon, Jul 14, 2014 at 9:49 PM, Ashish Dobhal wrote:
>> > Hey everyone ;
>> > Could anyone tell me how to use the OIV tool in hadoop 1.0 as there is no
>> > hdfs.sh file there.
>> > Thanks.
>>
>> --
>> Harsh J
Re: OIV Compatibility
Harsh thanks On Mon, Jul 14, 2014 at 11:39 PM, Harsh J wrote: > The OIV for 1.x series is available in release 1.2.0 and higher. You > can use it from the 'hadoop oiv' command. > > It is not available in 1.0.x. > > On Mon, Jul 14, 2014 at 9:49 PM, Ashish Dobhal > wrote: > > Hey everyone ; > > Could anyone tell me how to use the OIV tool in hadoop 1.0 as there is no > > hdfs.sh file there. > > Thanks. > > > > -- > Harsh J >
Re: OIV Compatibility
The OIV for 1.x series is available in release 1.2.0 and higher. You can use it from the 'hadoop oiv' command.

It is not available in 1.0.x.

On Mon, Jul 14, 2014 at 9:49 PM, Ashish Dobhal wrote:
> Hey everyone ;
> Could anyone tell me how to use the OIV tool in hadoop 1.0 as there is no
> hdfs.sh file there.
> Thanks.

--
Harsh J
Not able to place enough replicas
I'm getting this error while writing many files:

org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicy: Not able to place enough replicas, still in need of 4 to reach 4

I've set logging to DEBUG but still there is no reason printed. There should've been a reason after this line but instead there's just an empty line. Has anyone seen something like this before? It is seen on a 4 node cluster running hadoop 2.2

org.apache.hadoop.hdfs.StateChange: *DIR* NameNode.create: file /file_1002 for DFSClient_NONMAPREDUCE_839626346_1 at 192.168.180.1
org.apache.hadoop.hdfs.StateChange: DIR* NameSystem.startFile: src=/file_1002, holder=DFSClient_NONMAPREDUCE_839626346_1, clientMachine=192.168.180.1, createParent=true, replication=4, createFlag=[CREATE, OVERWRITE]
org.apache.hadoop.hdfs.StateChange: DIR* addFile: /file_1002 is added
org.apache.hadoop.hdfs.StateChange: DIR* NameSystem.startFile: add /file_1002 to namespace for DFSClient_NONMAPREDUCE_839
<< ... many other operations ... >>
8 seconds later:
org.apache.hadoop.hdfs.StateChange: *BLOCK* NameNode.addBlock: file /file_1002 fileId=189252 for DFSClient_NONMAPREDUCE_839626346_1
org.apache.hadoop.hdfs.StateChange: BLOCK* NameSystem.getAdditionalBlock: file /file_1002 for DFSClient_NONMAPREDUCE_839626346_1
org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicy: Not able to place enough replicas, still in need of 4 to reach 4
<< EMPTY LINE >>
org.apache.hadoop.security.UserGroupInformation: PriviledgedActionException as:test (auth:SIMPLE) cause:java.io.IOException: File /file_1002 could only be replicated to 0 nodes instead of minReplication (=1). There are 4 datanode(s) running and no node(s) are excluded in this operation.
org.apache.hadoop.ipc.Server: IPC Server handler 9 on 8020, call org.apache.hadoop.hdfs.protocol.ClientProtocol.addBlock from 192.168.180.1:49592 Call#1321 Retry#0: error: java.io.IOException: File /file_1002 could only be replicated to 0 nodes instead of minReplication (=1). There are 4 datanode(s) running and no node(s) are excluded in this operation.
java.io.IOException: File /file_1002 could only be replicated to 0 nodes instead of minReplication (=1). There are 4 datanode(s) running and no node(s) are excluded in this operation.
    at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.chooseTarget(BlockManager.java:1384)
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:2477)
    at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.addBlock(NameNodeRpcServer.java:555)
    at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.addBlock(ClientNamenodeProtocolServerSideTranslatorPB.java:387)
    at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java:59582)
    at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:585)
    at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:928)
    at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2048)
    at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2044)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1491)
    at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2042):0
Re: changing split size in Hadoop configuration
It sounds like a JobTracker setting, so a restart looks to be required.

You can verify it in pseudo-distributed mode by setting it to a very low value, restarting the JT and seeing if you get the exception that prints this new value.

Sent from my iPhone

> On 14 jul 2014, at 16:03, Jan Warchoł wrote:
>
> Hello,
>
> I recently got "Split metadata size exceeded 1000" error when running
> Cascading jobs with very big joins. I found that I should change the
> mapreduce.jobtracker.split.metainfo.maxsize property in the hadoop configuration
> by adding this to the mapred-site.xml file:
>
> <property>
>   <name>mapreduce.jobtracker.split.metainfo.maxsize</name>
>   <value>10</value>
> </property>
>
> but it didn't seem to have any effect - I'm probably doing something wrong.
>
> Where should I add this change so that it has the desired effect? Do I
> understand correctly that a jobtracker restart is required after making the
> change? The cluster I'm working on has Hadoop 1.0.4.
>
> thanks for any help,
> --
> Jan Warchoł
> Software Engineer
>
> M: +48 509 078 203
> E: jan.warc...@codilime.com
OIV Compatibility
Hey everyone, could anyone tell me how to use the OIV tool in Hadoop 1.0, as there is no hdfs.sh file there? Thanks.
changing split size in Hadoop configuration
Hello,

I recently got "Split metadata size exceeded 1000" error when running Cascading jobs with very big joins. I found that I should change the mapreduce.jobtracker.split.metainfo.maxsize property in the Hadoop configuration by adding this to the mapred-site.xml file:

<property>
  <name>mapreduce.jobtracker.split.metainfo.maxsize</name>
  <value>10</value>
</property>

but it didn't seem to have any effect - I'm probably doing something wrong.

Where should I add this change so that it has the desired effect? Do I understand correctly that a jobtracker restart is required after making the change? The cluster I'm working on has Hadoop 1.0.4.

thanks for any help,
--
*Jan Warchoł*
*Software Engineer*

M: +48 509 078 203
E: jan.warc...@codilime.com
Re: Block should be additionally replicated on 1 more rack(s)
Hi, I didn't try the Hadoop rebalancer, because I remember the rebalancer only considers disk load and doesn't take into account which rack data blocks are on. I can try it. Thank you for your reply.

-- Original Message --
From: "Yehia Elshater";
Sent: Monday, July 14, 2014, 4:52 PM
To: "user";
Subject: Re: Block should be additionally replicated on 1 more rack(s)

Hi,

Did you try the Hadoop rebalancer? http://hadoop.apache.org/docs/r1.0.4/hdfs_user_guide.html#Rebalancer

On 14 July 2014 04:10, 风雨无阻 <232341...@qq.com> wrote:

HI all:

After configuring rack awareness on the cluster, running "hadoop fsck /" produced a lot of the following errors:
Replica placement policy is violated for blk_-1267324897180563985_11130670. Block should be additionally replicated on 1 more rack(s).

Online sources say the reason is that all three copies are on the same rack. The workaround for now is:
hadoop dfs -setrep 4 /user/hive/warehouse/tbl_add_av_errorlog_android/dt=2013-08-24/04_0
sleep N
hadoop dfs -setrep 3 /user/hive/warehouse/tbl_add_av_errorlog_android/dt=2013-08-24/04_0
But this is very slow.

What is a better way to make HDFS healthy again?

Thanks, Ma Jian
Re: Block should be additionally replicated on 1 more rack(s)
Hi,

Did you try the Hadoop rebalancer? http://hadoop.apache.org/docs/r1.0.4/hdfs_user_guide.html#Rebalancer

On 14 July 2014 04:10, 风雨无阻 <232341...@qq.com> wrote:
> HI all:
>
> After the cluster configuration rack awareness, run "hadoop fsck /".
> A lot of the following error occurred:
> Replica placement policy is violated for blk_-1267324897180563985_11130670.
> Block should be additionally replicated on 1 more rack(s).
>
> Online said "The reason is that three copies on the same rack".
> The solution is now:
> hadoop dfs -setrep 4 /user/hive/warehouse/tbl_add_av_errorlog_android/dt=2013-08-24/04_0
> sleep N
> hadoop dfs -setrep 3 /user/hive/warehouse/tbl_add_av_errorlog_android/dt=2013-08-24/04_0
> But the speed is very slow.
>
> What is a better good way to make HDFS become healthy?
>
> Thanks,
> Ma Jian
Block should be additionally replicated on 1 more rack(s)
HI all:

After configuring rack awareness on the cluster, running "hadoop fsck /" produced a lot of the following errors:
Replica placement policy is violated for blk_-1267324897180563985_11130670. Block should be additionally replicated on 1 more rack(s).

Online sources say the reason is that all three copies are on the same rack. The workaround for now is:
hadoop dfs -setrep 4 /user/hive/warehouse/tbl_add_av_errorlog_android/dt=2013-08-24/04_0
sleep N
hadoop dfs -setrep 3 /user/hive/warehouse/tbl_add_av_errorlog_android/dt=2013-08-24/04_0
But this is very slow.

What is a better way to make HDFS healthy again?

Thanks, Ma Jian
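The setrep bump-and-restore trick from the message above can at least be scripted over every file fsck flags, instead of run file by file. This is a hedged, untested sketch: it assumes each offending fsck line starts with the file path followed by a colon (verify against your fsck output before running), and -w makes setrep wait for replication to complete, replacing the arbitrary "sleep N":

```shell
# Hypothetical sketch: re-replicate every file whose block placement
# violates the rack policy, by temporarily raising the replication factor.
hadoop fsck / | grep "Replica placement policy is violated" \
  | awk -F: '{print $1}' | sort -u \
  | while read -r f; do
      hadoop dfs -setrep -w 4 "$f"   # force an extra replica (likely on another rack)
      hadoop dfs -setrep -w 3 "$f"   # drop back to the target replication factor
    done
```

This is still slow per file, but it removes the guesswork of "sleep N" and covers all violating files in one pass.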
Re: Does Hadoop (version 2.4.1) support symbolic links?
Hadoop 2.4.1 doesn't support symbolic links.

(2014/07/14 11:34), cho ju il wrote:
My hadoop version is 2.4.1. Does HDFS (version 2.4.1) support symbolic links? How do I create symbolic links?
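For completeness: the client API does expose a symlink call via FileContext (it appeared around the 2.1 releases), but symlink support was subsequently disabled in the 2.x line, so on 2.4.1 the call is expected to fail at runtime rather than create a link. A hedged, untested sketch of what the API shape looks like (paths are placeholders):

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileContext;
import org.apache.hadoop.fs.Path;

// Untested sketch: the createSymlink API exists, but symlinks are disabled
// on 2.4.1, so this is expected to throw rather than create a link.
public class SymlinkSketch {
    public static void main(String[] args) throws Exception {
        FileContext fc = FileContext.getFileContext(new Configuration());
        fc.createSymlink(new Path("/data/target"), new Path("/data/link"),
                         false /* createParent */);
    }
}
```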