Re: hadoop1.2.1 speedup model

2013-09-09 Thread Robert Evans
How many times did you run the experiment at each setting? What is the standard deviation for each of these settings? It could be that you are simply running into the error bounds of Hadoop. Hadoop is far from consistent in its performance. For our benchmarking we typically will run the test 5
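
A minimal sketch of the repeated-run check this message asks about: collect several wall-clock times per setting, then compare the means against the spread. The run times and the function names `summarize_runs` and `differ_beyond_noise` are hypothetical, and the "combined standard deviation" rule is only a crude heuristic, not a method taken from the thread.

```python
import statistics

def summarize_runs(seconds):
    """Return (mean, sample standard deviation) for repeated run times."""
    if len(seconds) < 2:
        raise ValueError("need at least two runs to estimate spread")
    return statistics.mean(seconds), statistics.stdev(seconds)

def differ_beyond_noise(runs_a, runs_b):
    """Crude check: do the two means differ by more than the combined spread?"""
    mean_a, sd_a = summarize_runs(runs_a)
    mean_b, sd_b = summarize_runs(runs_b)
    return abs(mean_a - mean_b) > (sd_a + sd_b)

# Hypothetical wall-clock times in seconds, not measurements from the thread.
four_nodes = [612.0, 598.0, 640.0, 605.0, 621.0]
eight_nodes = [590.0, 633.0, 601.0, 615.0, 608.0]
print(summarize_runs(four_nodes))
print(differ_beyond_noise(four_nodes, eight_nodes))
```

If the second call reports no difference beyond noise, the two settings cannot be told apart from this data, which is the situation the message warns about.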

Re: [VOTE] Release Apache Hadoop 0.23.9

2013-07-02 Thread Robert Evans
+1 downloaded the release. Ran a couple of simple jobs and everything worked. On 7/1/13 12:20 PM, "Thomas Graves" wrote: >I've created a release candidate (RC0) for hadoop-0.23.9 that I would like >to release. > >The RC is available at: >http://people.apache.org/~tgraves/hadoop-0.23.9-candidate

Re: [VOTE] Release Apache Hadoop 0.23.8

2013-05-30 Thread Robert Evans
+1 Downloaded the release and ran a few basic tests. --Bobby On 5/28/13 11:00 AM, "Thomas Graves" wrote: > >I've created a release candidate (RC0) for hadoop-0.23.8 that I would like >to release. > >This release is a sustaining release with several important bug fixes in >it. The most critica

Re: [VOTE] Plan to create release candidate for 0.23.8

2013-05-20 Thread Robert Evans
+1 On 5/17/13 4:10 PM, "Thomas Graves" wrote: >Hello all, > >We've had a few critical issues come up in 0.23.7 that I think warrants a >0.23.8 release. The main one is MAPREDUCE-5211. There are a couple of >other issues that I want finished up and get in before we spin it. Those >include HDFS-

Re: [VOTE] - Release 2.0.5-beta

2013-05-16 Thread Robert Evans
-0 (Binding) I have made my opinion known in the previous thread/vote, but I have spent enough time discussing this and need to get back to my day job. If the community is able to get snapshots and everything else in this list merged and stable without breaking the stack above it in two weeks it w

Re: Heads up - 2.0.5-beta

2013-05-03 Thread Robert Evans
I agree that "destructive" is not the correct word to describe features like snapshots and windows support. However, I also agree with Konstantin that any large feature will have a destabilizing effect on the code base, even if it is done on a branch and thoroughly tested before being merged in. H

Re: mrv1 vs YARN

2013-04-22 Thread Robert Evans
Like with most major releases of Hadoop the releases are API compatible, but not necessarily binary compatible. That means a job for 1.0 can be recompiled against 2.0 and it should compile and run similarly to 1.0. If it does not, feel free to file a JIRA on the incompatibility. There have been a

Re: [VOTE] Release Apache Hadoop 2.0.4-alpha

2013-04-17 Thread Robert Evans
+1 (binding) Downloaded the tar ball and ran some simple jobs. --Bobby Evans On 4/17/13 2:01 PM, "Siddharth Seth" wrote: >+1 (binding) >Verified checksums and signatures. >Built from the source tar, deployed a single node cluster and tested a >couple of simple MR jobs. > >- Sid > > >On Fri, Ap

Re: [VOTE] Release Apache Hadoop 0.23.7

2013-04-16 Thread Robert Evans
+1 (binding) I downloaded the release and ran a few sanity tests on it. --Bobby On 4/11/13 2:55 PM, "Thomas Graves" wrote: >I've created a release candidate (RC0) for hadoop-0.23.7 that I would like >to release. > >This release is a sustaining release with several important bug fixes in >it. >

Re: Hadoop Source Code

2013-03-18 Thread Robert Evans
Look at http://wiki.apache.org/hadoop/HowToContribute It gives step by step instructions. --Bobby On 3/18/13 6:43 AM, "Mustaqeem" <3m.mustaq...@gmail.com> wrote: >I am also working in same direction. >As I am new, First of all, I want to know that what have you done to >enhance the >hadoop p

Re: [VOTE] Plan to create release candidate Monday 3/18

2013-03-15 Thread Robert Evans
+1 On 3/10/13 10:38 PM, "Matt Foley" wrote: >Hi all, >I have created branch-1.2 from branch-1, and propose to cut the first >release candidate for 1.2.0 on Monday 3/18 (a week from tomorrow), or as >soon thereafter as I can achieve a stable build. > >Between 1.1.2 and the current 1.2.0, there ar

Re: [VOTE] Plan to create release candidate for 0.23.7

2013-03-15 Thread Robert Evans
+1 On 3/13/13 11:31 AM, "Thomas Graves" wrote: >Hello all, > >I think enough critical bug fixes have went in to branch-0.23 that >warrant another release. I plan on creating a 0.23.7 release by the end >March. > >Please vote '+1' to approve this plan. Voting will close on Wednesday >3/20 at 10:

Re: testing

2013-03-05 Thread Robert Evans
I personally would start off with a bug in an area that you are interested in. https://issues.apache.org/jira/issues/?jql=project%20in%20%28HADOOP%2C%20MAPREDUCE%2C%20HDFS%2C%20YARN%29%20AND%20status%20%3D%20Open%20AND%20type%20%3D%20Bug%20AND%20assignee%20is%20EMPTY%20ORDER%20BY%20priority%20AS

Re: [Vote] Merge branch-trunk-win to trunk

2013-02-28 Thread Robert Evans
out is an interesting one >>-- >> ie the idea that we would not merge windows support to trunk, but rather >> treat is as a "parallel code line" which lives in the ASF and has its >>own >> builds and releases. The windows team would periodically merge >>tru

Re: [Vote] Merge branch-trunk-win to trunk

2013-02-27 Thread Robert Evans
After this is merged in is Windows still going to be a second class citizen but happens to work for more than just development or is it a fully supported platform where if something breaks it can block a release? How do we as a community intend to keep Windows support from breaking? We don't have

timeout is now requested to be on all tests

2013-02-20 Thread Robert Evans
Sorry about cross posting, but this will impact all developers and I wanted to give you all a heads-up. HADOOP-9112 was just checked in. This means that the pre commit build will now give a -1 for any patch with junit tests that do not include

Re: [VOTE] Hadoop 1.1.2-rc5 release candidate vote

2013-02-08 Thread Robert Evans
Sorry about that +1 (binding) I downloaded the binary tar started everything up and ran a few simple jobs. --Bobby On 2/8/13 12:04 AM, "Matt Foley" wrote: >Wow, total apathy! We only got one vote besides mine, and that was >non-binding. >I'll try again. Please vote on this release candidate f

Re: [VOTE] Release hadoop-2.0.3-alpha

2013-02-07 Thread Robert Evans
I downloaded the binary package and ran a few example jobs on a 3 node cluster. Everything seems to be working OK on it, I did see WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable For every shell command, but just l

Re: More information regarding the Project suggestions given on the Hadoop website

2013-02-07 Thread Robert Evans
13 at 9:42 AM, Varsha Raveendran < >varsha.raveend...@gmail.com> wrote: > >> Thank you! I will check with the Mahout team and also go through Commons >> Math site. >> >> Thanks & Regards, >> Varsha >> >> >> On Sat, Jan 19, 2013 at 12:16 AM, Ro

Re: More information regarding the Project suggestions given on the Hadoop website

2013-01-18 Thread Robert Evans
I'm not sure I am exactly the right person for this, but I assume that you are familiar with genetic algorithms. The Mahout Project is probably a good place to start http://mahout.apache.org/ they have a number of machine learning algorithms that run on top of Hadoop. I did a search and it looks

Re: Problem creating patch for HADOOP-9184

2013-01-10 Thread Robert Evans
or the 0.20 branch. I tried running the test-contrib target but I was >having tests fail because of timeouts. > >Is there documentation somewhere about what I should post to the jira for >an older commit? > > >On Tue, Jan 8, 2013 at 10:53 AM, Jeremy Karn wrote: > >&g

Re: Problem creating patch for HADOOP-9184

2013-01-08 Thread Robert Evans
This is because your patch is against the 0.20 branch, not against trunk. Jenkins pre commit only works for trunk right now. If the issue also exists on trunk then please provide a patch for trunk too, if it is a 1.0/0.20 specific issue then you can run the pre commit tests yourself and just post

Re: [VOTE] introduce Python as build-time and run-time dependency for Hadoop and throughout Hadoop stack

2012-11-26 Thread Robert Evans
+1, +1, 0 On 11/24/12 2:13 PM, "Matt Foley" wrote: >For discussion, please see previous thread "[PROPOSAL] introduce Python as >build-time and run-time dependency for Hadoop and throughout Hadoop >stack". > >This vote consists of three separate items: > >1. Contributors shall be allowed to use P

Re: [PROPOSAL] 1.1.1 and 1.2.0 scheduling

2012-11-09 Thread Robert Evans
+1 On 11/9/12 12:27 PM, "Steve Loughran" wrote: >On 9 November 2012 17:52, Matt Foley wrote: > >> Hi all, >> Hadoop 1.1.0 came out on Oct 12. I think there's enough interest to do >>a >> maintenance release with some important patches. I propose to code >>freeze >> branch-1.1 a week from toda

RE: [DISCUSS] remove packaging

2012-10-15 Thread Robert Evans
Eli answered my question I am a +1 too. -Original Message- From: Alejandro Abdelnur [mailto:t...@cloudera.com] Sent: Monday, October 15, 2012 11:02 AM To: common-dev@hadoop.apache.org Subject: Re: [DISCUSS] remove packaging +1 Alejandro On Oct 15, 2012, at 10:52 AM, Robert Evans

RE: [DISCUSS] remove packaging

2012-10-15 Thread Robert Evans
Eli, By packaging I assume that you mean the RPM/Deb packages and not the tar.gz. If that is the case I have no problem with them being removed because as you said in the JIRA BigTop is already providing a working alternative. If someone else wants to step up to maintain them I also don't hav

Re: [VOTE] Hadoop-1.0.4-rc0

2012-10-10 Thread Robert Evans
I had to also update id.apache.org because the maven repo started to complain that it didn't know who I was. I didn't have any issues checking in my updated keys to the svn repo though. --Bobby On 10/9/12 5:38 PM, "Eli Collins" wrote: >On Tue, Oct 9, 2012 at 1:02 PM, Matt Foley wrote: >> Hi El

Re: Fix versions for commits branch-0.23

2012-10-09 Thread Robert Evans
I don't see much of a reason to have the same JIRA listed under both 0.23 and 2.0. I can see some advantage of being able to see what went into 0.23.X by looking at a 2.0.X CHANGES.txt, but unless the two are released at exactly the same time they will be out of date with each other in the best ca

Re: Commits breaking compilation of MR 'classic' tests

2012-09-26 Thread Robert Evans
MR 'classic' tests Fair, however there are still tests which need to be ported over. We can remove them after the port. On Sep 26, 2012, at 9:54 AM, Robert Evans wrote: As per my comment on the bug. I thought we were going to remove them. MAPREDUCE-4266 only needs a little bit more wo

Re: Commits breaking compilation of MR 'classic' tests

2012-09-26 Thread Robert Evans
As per my comment on the bug. I thought we were going to remove them. MAPREDUCE-4266 only needs a little bit more work, change a patch to a script, before they disappear entirely. I would much rather see dead code die than be maintained for a few tests that are mostly testing the dead code itself

Re: About ant hadoop

2012-09-19 Thread Robert Evans
It would help if you could explain a bit more about what you changed. It is hard to debug something simply by saying it compiles but does not run correctly. You probably want to check the logs/UI for the JT and try to trace down the path this job is taking. --Bobby On 9/19/12 2:01 AM, "Li Sheng

Re: Hadoop fs -ls behaviour compared to native ls

2012-09-11 Thread Robert Evans
I think most of the rationale is for backwards compatibility, but I could be wrong. If you want to change it file a JIRA about it and we can discuss on the JIRA the merits of the change. --Bobby On 9/11/12 6:28 AM, "Hemanth Yamijala" wrote: >Hi, > >hadoop fs -ls dirname > >lists entries like >

Re: [DISCUSS] release branching scheme under maven was [VOTE] 0.23.3 release

2012-09-10 Thread Robert Evans
I forked the thread, because it is not really about the release vote any more, although we both seem to be on the same page so this may be overkill :) On 9/10/12 3:50 PM, "Owen O'Malley" wrote: >On Mon, Sep 10, 2012 at 12:19 PM, Robert Evans >wrote: >> Thanks for th

Re: [VOTE] 0.23.3 release

2012-09-10 Thread Robert Evans
Thanks for the info Owen I was not aware of that, I can see at the beginning of the twiki http://wiki.apache.org/hadoop/HowToReleasePostMavenization that it is kind of implied by the skip this section comment. But, I was just confused because to do an official release I need to change the version

Re: Branch 2 release names

2012-09-05 Thread Robert Evans
ed. Once that happens, we can create a branch-2.1 off branch-2. Does that sound okay? Thanks, +Vinod Kumar Vavilapalli Hortonworks Inc. http://hortonworks.com/ On Sep 4, 2012, at 3:05 PM, Robert Evans wrote: I am fine with that too, but it is going to be a fairly large amount of work to pull in

Re: Branch 2 release names

2012-09-04 Thread Robert Evans
I am fine with that too, but it is going to be a fairly large amount of work to pull in all of the bug fixes into 2.0 that have gone into 0.23. There was already a lot of discussion about just rebasing 2.1 instead of trying to merge everything back into it and 2.1 is a lot further along than 2.0 is

Re: Unused API in LocalDirAllocator

2012-09-04 Thread Robert Evans
I don't think it really matters that much. The API is limited Private and unstable, so I would say just remove it, but fixing it is fine too. Either way file a JIRA on it. --Bobby On 9/4/12 6:34 AM, "Hemanth Yamijala" wrote: >Hi, > >Stumbled on the fact that LocalDirAllocator.ifExists() is not

Re: How to get TaskId from ContainerId or ApplicationId or Request in Hadoop 0.23??

2012-08-23 Thread Robert Evans
There really is no way. The RM also has no knowledge of map tasks vs reduce tasks nor should it know. --Bobby On 8/22/12 8:23 PM, "Shekhar Gupta" wrote: >In ResourceManager, is there any way to findout if the assigned container >is going to execute a mapping task or a reduce task? I can access

Re: MultithreadedMapper

2012-07-26 Thread Robert Evans
In general multithreaded does not get you much in traditional Map/Reduce. If you want the mappers to run faster you can drop the split size and get a similar result, because you get more parallelism. This is the use case that we have typically concentrated on. About the only time that MultiThread

Re: Shifting to Java 7 . Is it good choice?

2012-07-17 Thread Robert Evans
Oracle is dropping java 6 support by the end of the year. So there is likely to be a big shift to java 7 before then. Currently Hadoop officially supports java 6 so unless there is an official change of position you cannot use Java 7 specific APIs if you want to check your code into Hadoop. Hadoo

Re: New JIRA version field for branch-2's next release?

2012-07-16 Thread Robert Evans
Thanks for catching that and fixing it Harsh and Arun. On 7/15/12 10:26 PM, "Harsh J" wrote: >Ah looks like you've covered that edge too, many thanks! > >On Mon, Jul 16, 2012 at 8:40 AM, Harsh J wrote: >> Thanks Arun! I will now diff both branches and fix any places the JIRA >> fix version need

Re: Jetty fixes for Hadoop

2012-07-11 Thread Robert Evans
I am +1 on this also, although I think we need to look at moving to Jetty-7 or possibly dropping Jetty completely and look at Netty or even Tomcat long term. Jetty has just been way too unstable at Hadoop scale and that has not really changed with newer versions of Jetty. Sticking with an old for

Re: No mapred-site.xml in the hadoop-0.23.3 distribution

2012-07-09 Thread Robert Evans
ike mapred-site.xml. Am I >correct? > >On Mon, Jul 9, 2012 at 12:36 PM, Robert Evans wrote: > >> On 2.0 core-site, yarn-site, hdfs-site, and mapred-site are all kind of >> needed. The exact configs that you need to set may vary a lot based off >> of what you are trying t

Re: No mapred-site.xml in the hadoop-0.23.3 distribution

2012-07-09 Thread Robert Evans
t all configuration files are mandatory for the >hadoop-0.23.3 to work. >I am tuning a few but still not able to set it up completely.Thanks > >On Fri, Jul 6, 2012 at 10:24 AM, Robert Evans wrote: > >> Sorry I don't know of a good source for that right now. Perhaps others

Re: No mapred-site.xml in the hadoop-0.23.3 distribution

2012-07-06 Thread Robert Evans
generation Hadoop ? >All I find is 1st generation Hadoop setup. > >On Fri, Jul 6, 2012 at 7:13 AM, Robert Evans wrote: > >> That may be something that we missed, as I have been providing my own >> marped-site.xml for quite a while now. Have you tried it with branch-2 >

Re: No mapred-site.xml in the hadoop-0.23.3 distribution

2012-07-06 Thread Robert Evans
That may be something that we missed, as I have been providing my own mapred-site.xml for quite a while now. Have you tried it with branch-2 or trunk to see if they are providing it? In either case it is just going to be a template for you to fill in, but it would be nice to package that template

Re: JobTracker/TaskTraker heartbeats communication mechanism?

2012-06-29 Thread Robert Evans
Daniel and Joao, The RPC classes in Hadoop handle this. Essentially a proxy object is created on the client side for the interface, then when a method is called in the proxy object the parameters are serialized and sent to the configured server along with the method name, where they are deserialized a
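
The proxy pattern described in this message can be illustrated with a toy sketch. This is not Hadoop's RPC code: JSON stands in for Hadoop's own serialization, the transport is a plain function call instead of a socket, and `EchoService`, `serve`, and `Proxy` are hypothetical names made up for the example.

```python
import json

class EchoService:
    """Stand-in server-side implementation (hypothetical, not a Hadoop class)."""
    def heartbeat(self, tracker, free_slots):
        return {"tracker": tracker, "assigned": min(free_slots, 2)}

def serve(impl, wire_request):
    """Server side: deserialize the call, invoke it, serialize the result."""
    call = json.loads(wire_request)
    method = getattr(impl, call["method"])
    return json.dumps(method(*call["params"]))

class Proxy:
    """Client side: turns attribute access into serialized remote calls."""
    def __init__(self, transport):
        self._transport = transport  # callable: request string -> response string

    def __getattr__(self, name):
        # Only reached for names not found normally, i.e. the remote methods.
        def remote_call(*params):
            request = json.dumps({"method": name, "params": list(params)})
            return json.loads(self._transport(request))
        return remote_call

server = EchoService()
client = Proxy(lambda request: serve(server, request))
result = client.heartbeat("tracker-1", 4)
print(result)
```

The caller only ever sees `client.heartbeat(...)`; the method name and parameters travel as data, which is the essence of what the message describes.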

Re: Resolving find bug issue

2012-06-26 Thread Robert Evans
The issue you are running into is because you made the HOST variable public, when it was package previously. Findbugs thinks that you want HOST to be a constant because it is ALL CAPS and is only set once and read all other times. By making it public it is now difficult to ensure that it is ne

Re: Cyclic dependency in JobControl job DAG

2012-06-25 Thread Robert Evans
I personally think it is useful. I would say contribute it. (Moved common-dev to bcc, we try not to cross post on these lists) --Bobby Evans On 6/25/12 3:37 AM, "madhu phatak" wrote: Hi, In current implementation of JobControl, whenever there is a cyclic dependency between the jobs it throws

Re: contributing as student

2012-05-29 Thread Robert Evans
Also be aware that a lot of us are very busy, so sadly you may need to send mail to the appropriate mailing list if your patch is not reviewed quickly. --Bobby Evans On 5/29/12 4:32 AM, "Devaraj k" wrote: Good to hear Hasan. Welcome to the Hadoop community. You can find more details in the be

Re: Need Urgent Help on Architecture

2012-05-21 Thread Robert Evans
All attachments are stripped when sent to the mailing list. You will need to use another service if you want us to see the diagram. On 5/18/12 12:50 PM, "samir das mohapatra" wrote: Hi harsh, I wanted to implement one Workflow within the MAPPER. I am Sharing my concept through the Archit

Re: Sailfish

2012-05-11 Thread Robert Evans
That makes perfect sense to me. Especially because it really is a new implementation of shuffle that is optimized for very large jobs. I am happy to see anything go in that is going to improve the performance of hadoop, and I look forward to running some benchmarks on the changes. I am not su

Re: Hadoop: Trunk vs branch src code

2012-04-10 Thread Robert Evans
That depends on where you want your code to go in. If it is a new feature then it needs to go into trunk at a minimum. Trunk and branch-2 are very similar right now so if you want it to go into the next release with MRV2 you may want to target branch-2 as well. It should be minimal effort to

Re: Help with error

2012-04-09 Thread Robert Evans
What do you mean by relocated some supporting files to HDFS? How do you relocate them? What API do you use? --Bobby Evans On 4/9/12 11:10 AM, "Ralph Castain" wrote: Hi folks I'm trying to develop an AM for the 0.23 branch and running into a problem that I'm having difficulty debugging. My

Re: Requirements for patch review

2012-04-04 Thread Robert Evans
I personally like the clarification and it is in line with how I understood the original bylaw when I read it. I don't really want this to turn into a legal document but as this is getting more explicit with clarification it would be nice to put in a small exception for release managers when th

Re: Proto files

2012-03-26 Thread Robert Evans
I responded in the JIRA for this. Because we wrap proto in Hadoop RPC right now those .proto files are not going to do very many people a lot of good, unless they have a client that can also communicate over a simple form of Hadoop RPC. I think it would be good to move to a pure PB RPC impleme

Re: Question about Hadoop-8192 and rackToBlocks ordering

2012-03-22 Thread Robert Evans
If it really is the ordering of the hash map I would say no it should not, and the code should be updated. If ordering matters we need to use a map that guarantees a given order, and hash map is not one of them. --Bobby Evans On 3/22/12 7:24 AM, "Kumar Ravi" wrote: Hello, We have been look

Re: Compressor tweaks corresponding to HDFS-2834, 3051?

2012-03-07 Thread Robert Evans
I am a +1 on opening a new JIRA for a first stab at reducing the amount of data that gets copied around. --Bobby Evans On 3/7/12 1:26 AM, "Tim Broberg" wrote: In https://issues.apache.org/jira/browse/HDFS-2834, Todd says, " This is also useful whenever a native decompression codec is being

Re: Hadoop on non x86 systems

2012-03-06 Thread Robert Evans
There are kind of two ways to submit a proposal. 1 - send an e-mail here with the proposal. 2 - file a JIRA and attach your proposal to it. Usually it is a combination of the two. You start a conversation on the mailing list, and then at some point file a JIRA to track the work and capture the

Re: Execute a Map/Reduce Job Jar from Another Java Program.

2012-02-03 Thread Robert Evans
elegatingMethodAccessorImpl.java:25) at java.lang.reflect.Method.invoke(Method.java:597) at org.apache.hadoop.util.RunJar.main(RunJar.java:186) at com.amd.wrapper.main.ParserWrapper.main(ParserWrapper.java:31) Thanks, Abees On 2 February 2012 23:02, Robert Evans wrote: What happens?

Re: Execute a Map/Reduce Job Jar from Another Java Program.

2012-02-02 Thread Robert Evans
What happens? Is there an exception, does nothing happen? I am curious. Also how did you launch your other job that is trying to run this one. The hadoop script sets up a lot of environment variables classpath etc to make hadoop work properly, and some of that may not be set up correctly to

Re: Moving TB of data from NFS to HDFS

2012-01-25 Thread Robert Evans
aveen Sripati" wrote: > If it is divided up into several files and you can mount your NFS directory on each of the datanodes. Just curious, how will this help. Praveen On Wed, Jan 25, 2012 at 12:39 AM, Robert Evans wrote: > If it is divided up into several files and you can mount your

Re: Moving TB of data from NFS to HDFS

2012-01-24 Thread Robert Evans
If it is divided up into several files and you can mount your NFS directory on each of the datanodes, you could possibly use distcp to do it. I have never tried using distcp for this, but it should work. Or you can write your own streaming Map/Reduce script that does more or less the same thin

Re: Security in 0.23

2012-01-04 Thread Robert Evans
It should have all of the same security. Some of it has been renamed, and some of the token work is still ongoing. The LinuxTaskController has been renamed because "Tasks" are map reduce specific. It is now the LinuxContainerExecutor. I don't remember all of the updated config names, off t

Re: How Jobtracler stores tasktracker's information

2011-12-13 Thread Robert Evans
I am not completely sure what you mean by this. In Hadoop the TaskTracker will heartbeat into the JobTracker to report its status and get new tasks to launch. The Scheduler, which is pluggable, then matches up requests for tasks with the TaskTracker. If you want to see where the matching up o

Re: Automatically Documenting Apache Hadoop Configuration

2011-12-05 Thread Robert Evans
From my work on yarn trying to document the configs there and to standardize them, writing anything that is going to automatically detect config values through static analysis is going to be very difficult. This is because most of the configs in yarn are now built up using static string con

Re: Hadoop - non disk based sorting?

2011-12-01 Thread Robert Evans
Mingxi, My understanding was that, just like with the maps, when a reducer's in memory buffer fills up it too will spill to disk as part of the sort. In fact I think it uses the exact same code for doing the sort as the map does. There may be an issue where your sort buffer is somehow too

Re: Which branch for my patch?

2011-11-30 Thread Robert Evans
Niels, I think that the branch you put it on depends mostly on where you and others want to see this feature (splittable Gzip) go in. At a minimum you should target trunk. If you want to see it go into 1.* then you probably also want to port it to that line (branch-1). Once they are in porti

Re: Parallel mapred jobs in Yarn

2011-11-09 Thread Robert Evans
The configuration options are somewhat different for yarn than they are for MRV1. You probably want to generate the documentation for yarn mvn site And then read through it about how to set up your cluster ./hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-site/target/site/index.html There i

Re: Viewing hadoop mapper output

2011-10-07 Thread Robert Evans
one I ran. I am not running in local mode. Is there some way by which I can get intermediate mapper outputs ? I would like to see for which site the mapper is getting stalled. Thanks, Aishwarya On Thu, Oct 6, 2011 at 1:41 PM, Robert Evans wrote: > Alshwarya, > > Are you running i

Re: Viewing hadoop mapper output

2011-10-06 Thread Robert Evans
t 12:37 PM, Robert Evans wrote: > A streaming job's stderr is logged for the task, but its stdout is what is > sent to the reducer. The simplest way to get it is to turn off the > reducers, and then look at the output in HDFS. > > --Bobby Evans > > On 10/6/11 1:16 PM, "Aishw

Re: Viewing hadoop mapper output

2011-10-06 Thread Robert Evans
A streaming job's stderr is logged for the task, but its stdout is what is sent to the reducer. The simplest way to get it is to turn off the reducers, and then look at the output in HDFS. --Bobby Evans On 10/6/11 1:16 PM, "Aishwarya Venkataraman" wrote: Hello, I want to view the mapper outp
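
A small sketch of the stdout/stderr split this message describes, written as a streaming-style mapper that can be dry-run locally. The record format (URL, tab, URL length) and the name `run_mapper` are assumptions for illustration only; the original poster's mapper is not shown in the thread.

```python
import io
import sys

def run_mapper(lines, out=sys.stdout, err=sys.stderr):
    """Emit one tab-separated (url, length) record per non-empty input line."""
    for line in lines:
        url = line.strip()
        if not url:
            continue
        err.write("processing %s\n" % url)       # diagnostics: ends up in the task log
        out.write("%s\t%d\n" % (url, len(url)))  # record: what the reducer would receive

# Local dry run with in-memory streams instead of a cluster.
out, err = io.StringIO(), io.StringIO()
run_mapper(["http://a.example\n", "\n", "http://bb.example\n"], out, err)
print(out.getvalue(), end="")
```

Run as a real streaming mapper, the function would be called with `sys.stdin`; with the reducers turned off, the lines written to `out` are what would land in HDFS, while the `err` lines help show which input the mapper was working on when it stalled.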

Re: problem of lost name-node

2011-09-28 Thread Robert Evans
There is also some work underway to add in HA and failover to the namenode. You might get more success if you send your note to hdfs-dev instead of common-dev. One other thing that can sometimes get a discussion going is to just file a JIRA for it. People interested in it are likely to start

Re: Maven eclipse plugin issue

2011-09-20 Thread Robert Evans
to me. --Bobby Evans On 9/20/11 9:14 AM, "Alejandro Abdelnur" wrote: Bobby, What is the POM change you are referring to? Thanks. Alejandro On Tue, Sep 20, 2011 at 7:00 AM, Robert Evans wrote: > This is a known issue with the eclipse maven mojo > > http://jira.codehaus.

Re: Maven eclipse plugin issue

2011-09-20 Thread Robert Evans
This is a known issue with the eclipse maven mojo http://jira.codehaus.org/browse/MECLIPSE-37 The JIRA also describes a workaround, add the generated tests directory in the eclipse config with a pom change, which I think would be better than trying to move the phase where test code is generate

Re: Platform MapReduce - Enterprise Features

2011-09-12 Thread Robert Evans
Chi, Most of these features are things that Hadoop is working on. There is an HA branch in progress that should go into trunk relatively soon. As far as the batch system integration is concerned if what you care about is scheduling of jobs, which jobs run when and on which machines, you can wr

Re: JIRA attachments order

2011-09-09 Thread Robert Evans
Can I ask, though, that we do add branch information in the patches? Too often a patch is intended to apply to some branch other than trunk, and there is no easy way to tell what branch it was intended for. --Bobby Evans On 9/9/11 10:52 AM, "Mattmann, Chris A (388J)" wrote: Wow, I didn't kn

Re: ERROR building latest trunk for Hadoop project

2011-08-31 Thread Robert Evans
I ran into the same error with mvn compile. There are some issues with dependency resolution in mvn and you need to run mvn test -DskipTests To compile the code. --Bobby On 8/30/11 7:21 AM, "Praveen Sripati" wrote: Rerun the build with the below options and see if you can get more informat

Re: Question about invoking an executable from Hadoop mapper

2011-08-25 Thread Robert Evans
I think this is a java issue. I don't think that it is launching a shell to run your command. I think it is just splitting on white space and then passing all the args to hadoop. What you want to do is to run sh -c 'hadoop dfs -cat file| myExec' Or with streaming write a small shell script t
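
The difference between whitespace splitting and handing the line to a shell can be shown without running anything. The sketch below uses Python rather than the Java of the original question; `myExec` and `file` are the placeholder names from the message, and nothing here is executed against a cluster.

```python
import shlex

command = "hadoop dfs -cat file | myExec"

# Naive whitespace splitting: the pipe is just another argument string,
# so the first program would receive "|" and "myExec" as arguments.
naive_argv = command.split()

# Handing the whole line to a shell: sh parses the pipe itself.
shell_argv = ["sh", "-c", command]

print(naive_argv)
print(shell_argv)
print(shlex.quote(command))  # the command quoted as one shell word
```

Passing `shell_argv` to `subprocess.run` would actually launch the pipeline, which requires `hadoop` and `myExec` to be on the PATH.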

Re: how to pass a hdfs file to a c++ process

2011-08-23 Thread Robert Evans
Hadoop streaming is the simplest way to do this, if your program is set up to take stdin as its input, write to stdout for the output, and each record "file" in your case is a single line of text. You need to be able to have it work with the following shell script hadoop fs -cat | head -1 | ./m