David,

I'm afraid that I didn't do much to get it to work.
I added another build profile to the root pom and added dependencies on hadoop-common, hadoop-mapreduce-client-core, and hadoop-mapreduce-client-common from the Cloudera Maven repo.
I built with: mvn -Pcdh4.1.0 clean verify package
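
Roughly, the profile looked something like the sketch below. Treat it as a starting point rather than an exact copy of what I have: the repository URL and the cdh version strings are from memory, so check them against Cloudera's repo before relying on them.

    <profile>
      <id>cdh4.1.0</id>
      <repositories>
        <repository>
          <id>cloudera</id>
          <!-- URL from memory; verify against Cloudera's documentation -->
          <url>https://repository.cloudera.com/artifactory/cloudera-repos/</url>
        </repository>
      </repositories>
      <dependencies>
        <!-- Version strings are my best recollection for CDH 4.1.0 -->
        <dependency>
          <groupId>org.apache.hadoop</groupId>
          <artifactId>hadoop-common</artifactId>
          <version>2.0.0-cdh4.1.0</version>
        </dependency>
        <dependency>
          <groupId>org.apache.hadoop</groupId>
          <artifactId>hadoop-mapreduce-client-core</artifactId>
          <version>2.0.0-cdh4.1.0</version>
        </dependency>
        <dependency>
          <groupId>org.apache.hadoop</groupId>
          <artifactId>hadoop-mapreduce-client-common</artifactId>
          <version>2.0.0-cdh4.1.0</version>
        </dependency>
      </dependencies>
    </profile>

The -Pcdh4.1.0 flag in the command above just activates that profile.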

Nate

Date: Thu, 21 Feb 2013 11:11:40 -0500
From: db...@data-tactics-corp.com
To: user@giraph.apache.org
Subject: Re: Waiting for times required to be 19 (currently 18)

Nate:

I am fighting to get Giraph built against CDH 4.1.1. Can you provide the process (mvn flags) and the changes you used to get your build to work? I can get core to build, but the examples tests fail out. Still, I can get the page rank example in the core jar to work in some cases.

I get a lot of timeouts, with the master worker constantly waiting on data from the other workers.

On 2/21/2013 11:06 AM, Nate wrote:

I recently upgraded older Giraph code built against CDH3 to a git checkout from a few days ago that builds against CDH4.1.0 (MRv1) libraries. All of the Giraph tests pass.

When running my Giraph job with 20 workers, I usually get the above error in 19 map processes:

    org.apache.giraph.utils.ExpectedBarrier: waitForRequiredPermits: Waiting for times required to be 19 (currently 18)

One map worker always shows something like:

    org.apache.giraph.comm.netty.NettyClient: waitSomeRequests: Waiting interval of 15000 msecs, 1 open requests, waiting for it to be <= 0, and some metrics ....
    org.apache.giraph.comm.netty.NettyClient: waitSomeRequests: Waiting for request (destTask=17, reqId=5032) - (reqId=5326, destAddr=host1:30017, elapsedNanos=..., started=..., writeDone=true, writeSuccess=true)
    repeats...

I say this happens usually because the same Giraph job does complete, but only rarely. I have a timeout of 100 minutes set, and the job is killed after that much time has elapsed.

Also, the "started" field in the above output in this past run reads "Wed Jan 21 14:21:31 EST 1970". All machines are synchronized by a single time server and currently read accurate times. I don't think it affected the execution, but it still seems erroneous.

I also don't see Hadoop maps having status messages set on them. I see the GraphMapper giving the Context object to the GraphTaskManager instance, and I can see it calling "context.setStatus(...)", but those messages never show up in the map status column on the job tracker page.
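
To isolate that part of the problem, a bare map-only job along these lines is what I'd use to confirm that setStatus messages reach the job tracker at all, independent of Giraph (the class and job names here are just illustrative, not anything from the Giraph code):

  import java.io.IOException;

  import org.apache.hadoop.conf.Configuration;
  import org.apache.hadoop.fs.Path;
  import org.apache.hadoop.io.LongWritable;
  import org.apache.hadoop.io.NullWritable;
  import org.apache.hadoop.io.Text;
  import org.apache.hadoop.mapreduce.Job;
  import org.apache.hadoop.mapreduce.Mapper;
  import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
  import org.apache.hadoop.mapreduce.lib.input.TextInputFormat;
  import org.apache.hadoop.mapreduce.lib.output.NullOutputFormat;

  /** Trivial map-only job whose only purpose is to set a task status message. */
  public class StatusCheck {

    public static class StatusMapper
        extends Mapper<LongWritable, Text, NullWritable, NullWritable> {
      @Override
      protected void map(LongWritable key, Text value, Context context)
          throws IOException, InterruptedException {
        // Same kind of call GraphTaskManager makes; it should show up in the
        // task's status column on the job tracker page.
        context.setStatus("processed offset " + key.get());
      }
    }

    public static void main(String[] args) throws Exception {
      Job job = new Job(new Configuration(), "status-check");
      job.setJarByClass(StatusCheck.class);
      job.setMapperClass(StatusMapper.class);
      job.setNumReduceTasks(0);
      job.setInputFormatClass(TextInputFormat.class);
      job.setOutputFormatClass(NullOutputFormat.class);
      FileInputFormat.addInputPath(job, new Path(args[0]));
      System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
  }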

Is there something I've missed while upgrading the old code?

    -- 
========= mailto:db...@data-tactics.com ============
David W. Boyd                     
Director, Engineering, Research and Development       
Data Tactics Corporation    
7901 Jones Branch, Suite 240   
Mclean, VA 22102         
office:   +1-703-506-3735, ext 308    
fax:     +1-703-506-6703    
cell:     +1-703-402-7908
============== http://www.data-tactics.com/ ============
 
