building from subversion repository

2013-02-17 Thread George R Goffe
Hi, I'm trying to build hadoop from a current checkout of the repository and am receiving the following messages. Can someone enlighten me as to what I'm doing wrong, please? Thanks, George... [INFO] BUILD FAILURE [INFO]

RE: Can I perform a MR on my local filesystem

2013-02-17 Thread Agarwal, Nikhil
Hi, Thank you Niels and thank you Nitin for your reply. Actually, I want to run MR on a cloud store, which is open source. So I thought of implementing a file system for the same and plugging it into Hadoop, just like S3/KFS are there. This would enable a hadoop client to talk to My cloud

Re: building from subversion repository

2013-02-17 Thread Harsh J
Hi George, The error below is your issue: [ERROR] Could not find goal 'protoc' in plugin org.apache.hadoop:hadoop-maven-plugins:3.0.0-SNAPSHOT among available goals - [Help 1] To build trunk, a protocol buffers (protobuf) compiler installation of at least version 2.4 is required, because we
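As a quick way to check the prerequisite named in this reply, here is a minimal sketch that reads the output of `protoc --version` and compares it against 2.4. It assumes the compiler prints a line of the form `libprotoc X.Y.Z`; the function names and the `MIN_PROTOC` constant are made up for this illustration and are not part of the Hadoop build.

```python
import re
import subprocess

# Minimum protobuf compiler version quoted in the reply above.
MIN_PROTOC = (2, 4)


def parse_protoc_version(output):
    """Extract a version tuple from text such as 'libprotoc 2.4.1'."""
    match = re.search(r"libprotoc\s+(\d+(?:\.\d+)*)", output)
    if match is None:
        return None
    return tuple(int(part) for part in match.group(1).split("."))


def protoc_is_recent_enough(output, minimum=MIN_PROTOC):
    """True if the reported version is at least `minimum`."""
    version = parse_protoc_version(output)
    return version is not None and version >= minimum


def installed_protoc_output():
    """Run 'protoc --version'; return '' if protoc cannot be started."""
    try:
        result = subprocess.run(
            ["protoc", "--version"], capture_output=True, text=True
        )
    except OSError:
        return ""
    return result.stdout


if __name__ == "__main__":
    output = installed_protoc_output()
    if protoc_is_recent_enough(output):
        print("protoc looks recent enough:", output.strip())
    else:
        print("protoc missing or older than 2.4:", output.strip() or "(not found)")
```

This only checks that a suitable protoc is on the PATH; it does not run or configure the Maven build itself.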

Re: QJM deployment

2013-02-17 Thread Harsh J
Hi Azuryy, Thanks for your feedback on the docs! I've filed https://issues.apache.org/jira/browse/HDFS-4508 on your behalf to address them. Feel free to file JIRAs for documentation complaints, with patches attached, to get them improved yourself :) On Sun, Feb 17, 2013 at 2:25 PM, Azuryy Yu

Re: executing hadoop commands from python?

2013-02-17 Thread anuj maurice
I was stuck with a similar issue before and couldn't come up with a more viable alternative than this. So, if the output of the hadoop command is not that big, you can take it into your Python script and process it. I use the following code snippet to clean the output of ls and store it into a py
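The poster's snippet is cut off in this archive. The following is a separate minimal sketch of the same idea, not the poster's code: it runs `hadoop fs -ls` through `subprocess` and keeps only the path column. It assumes the usual eight-column listing (permissions, replication, owner, group, size, date, time, path) preceded by a `Found N items` header; `parse_ls_output` and `hadoop_ls` are hypothetical helper names.

```python
import subprocess


def parse_ls_output(text):
    """Return the path column from 'hadoop fs -ls' style output."""
    paths = []
    for line in text.splitlines():
        # Split into at most 8 fields so that spaces inside the path survive.
        fields = line.split(None, 7)
        # Skip the 'Found N items' header and any other short line.
        if len(fields) < 8:
            continue
        paths.append(fields[7])
    return paths


def hadoop_ls(path):
    """List `path` via the hadoop CLI (assumes 'hadoop' is on the PATH)."""
    result = subprocess.run(
        ["hadoop", "fs", "-ls", path],
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_ls_output(result.stdout)
```

Calling `hadoop_ls("/user/someone")` would return a Python list of paths. As the post notes, this only makes sense when the command output is small enough to hold in memory.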

Re: executing hadoop commands from python?

2013-02-17 Thread Harsh J
Instead of 'scraping' this way, consider using a library such as Pydoop (http://pydoop.sourceforge.net) which provides pythonic ways and APIs to interact with Hadoop components. There are also other libraries covered at http://blog.cloudera.com/blog/2013/01/a-guide-to-python-frameworks-for-hadoop/

Re: Namenode failures

2013-02-17 Thread Robert Dyer
It just happened again. This was after a fresh format of HDFS/HBase and I am attempting to re-import the (backed up) data. http://pastebin.com/3fsWCNQY So now if I restart the namenode, I will lose data from the past 3 hours. What is causing this? How can I avoid it in the future? Is there

Re: Namenode failures

2013-02-17 Thread Mohammad Tariq
Hello Robert, It seems that your edit logs and fsimage have got corrupted somehow. It looks somewhat similar to this one https://issues.apache.org/jira/browse/HDFS-686 Have you made any changes to the 'dfs.name.dir' directory lately? Do you have enough space where metadata is getting

Re: Namenode failures

2013-02-17 Thread Robert Dyer
On Sun, Feb 17, 2013 at 4:41 PM, Mohammad Tariq donta...@gmail.com wrote: Hello Robert, It seems that your edit logs and fsimage have got corrupted somehow. It looks somewhat similar to this one https://issues.apache.org/jira/browse/HDFS-686 Similar, but the trace is different.

Re: Namenode failures

2013-02-17 Thread Robert Dyer
On Sun, Feb 17, 2013 at 4:41 PM, Mohammad Tariq donta...@gmail.com wrote: You can make use of offline image viewer to diagnose the fsimage file. Is this not included in the 1.0.x branch? All of the documentation I find for it says to run 'bin/hdfs oev' but I do not have a 'bin/hdfs'. Warm

Re: Namenode failures

2013-02-17 Thread Harsh J
Hi Robert, Are you by any chance adding files carrying unusual encoding? If it's possible, can we be sent a bundle of the corrupted log set (all of the dfs.name.dir contents) to inspect what seems to be causing the corruption? The only identified (but rarely occurring) bug around this part in

Re: Namenode failures

2013-02-17 Thread Robert Dyer
On Sun, Feb 17, 2013 at 5:08 PM, Harsh J ha...@cloudera.com wrote: Hi Robert, Are you by any chance adding files carrying unusual encoding? I don't believe so. The only files I push to HDFS are SequenceFiles (with protobuf objects in them) and HBase's regions, which again is just protobuf

RE: why my test result on dfs short circuit read is slower?

2013-02-17 Thread Liu, Raymond
I have tried tuning io.file.buffer.size to 128K instead of 4K. Short-circuit read performance is still worse than reading through the datanode. I am starting to wonder: does short-circuit read really help under Hadoop 1.1.1? Googling, I found a few people mentioning that they got a 2x gain or so on CDH etc.

Re: product recommendations engine

2013-02-17 Thread Ted Dunning
Yeah... you can make this work. First, if your setup is relatively small, then you won't need Hadoop. Second, having lots of kinds of actions is a very reasonable thing to have. My own suggestion is that you analyze these each for their predictive power independently and then combine them at

RE: why my test result on dfs short circuit read is slower?

2013-02-17 Thread Liu, Raymond
Alright, I think that in my sequential read scenario it is possible that short-circuit read is actually slower than reading through the datanode. When reading through the datanode, the FS read operation is done by the datanode daemon, while data processing is done by the client. Thus, when the client is processing the data,

Re: some ideas for QJM and NFS

2013-02-17 Thread Azuryy Yu
Oh, yes, you are right, George. I'll probably do it in the next few days. On Mon, Feb 18, 2013 at 2:47 PM, George Datskos george.dats...@jp.fujitsu.com wrote: Hi Azuryy, So you have measurements for hadoop-1.0.4 and hadoop-2.0.3+QJM, but I think you should also measure hadoop-2.0.3 _without_

RE: why my test result on dfs short circuit read is slower?

2013-02-17 Thread 谢良
Probably readahead played a key role in the first scenario (scan-only job)? The default LONG_READ_THRESHOLD_BYTES (BlockSender.java) is 256k in the current codebase, and ReadaheadPool takes effect on the normal read path. Regards, Liang From: Liu, Raymond

Re: some ideas for QJM and NFS

2013-02-17 Thread Azuryy Yu
Hi, I did it on hadoop-2.0.3-alpha without HA as follows: [root@webdm test]# date +%Y-%m-%d_%H:%M:%S; hdfs dfs -put testspeed.tar.gz / ; date +%Y-%m-%d_%H:%M:%S 2013-02-18_15:20:01 13/02/18 15:20:02 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using
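The measurement above brackets the upload with two `date` calls, which gives one-second resolution. A minimal sketch of an alternative, assuming nothing beyond the Python standard library, is to time the command with a monotonic clock; `time_command` is a hypothetical helper name, and the `hdfs dfs -put` arguments in the usage note are simply the ones from the post.

```python
import subprocess
import sys
import time


def time_command(argv):
    """Run `argv` and return (exit_code, elapsed_seconds)."""
    start = time.monotonic()
    completed = subprocess.run(argv)
    elapsed = time.monotonic() - start
    return completed.returncode, elapsed


def report(argv):
    """Run `argv` and print a one-line timing summary to stderr."""
    code, seconds = time_command(argv)
    print("exit code %d, elapsed %.2f s" % (code, seconds), file=sys.stderr)
    return code
```

For example, `report(["hdfs", "dfs", "-put", "testspeed.tar.gz", "/"])` would time the same upload. A single run still includes JVM start-up time, so repeating the measurement a few times gives a fairer comparison between configurations.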

RE: some ideas for QJM and NFS

2013-02-17 Thread 谢良
Hi Azuryy, I just want to confirm one thing: your JNs are not deployed on the same machines as the DNs, right? Regards, Liang From: Azuryy Yu [azury...@gmail.com] Sent: 2013-02-18 15:22 To: user@hadoop.apache.org Subject: Re: some ideas for QJM and NFS Hi, I did it on

Re: RE: some ideas for QJM and NFS

2013-02-17 Thread Azuryy Yu
All JNs are deployed on the same nodes as the DNs. On Mon, Feb 18, 2013 at 3:35 PM, 谢良 xieli...@xiaomi.com wrote: Hi Azuryy, just want to confirm one thing, your JN did not deploy on the same machines within DN, right ? Regards, Liang -- *From:* Azuryy Yu