Re: issue regarding akka, protobuf and Hadoop version

2013-11-06 Thread Reynold Xin
That is correct. However, there is no guarantee right now that Akka 2.3
will work correctly for us. We haven't tested it enough yet (or rather, we
haven't tested it at all). E.g., see:
https://github.com/apache/incubator-spark/pull/131

We want to base Spark 0.9.0 on Scala 2.10, but we have also been
discussing a Scala 2.10 build of Spark 0.8.x so that users can move to
Scala 2.10 earlier if they want.


On Wed, Nov 6, 2013 at 12:29 AM, Sandy Ryza sandy.r...@cloudera.com wrote:

 For my own understanding, is this summary correct?
 Spark will move to Scala 2.10, which means it can support Akka 2.3-M1,
 which supports protobuf 2.5, which will allow Spark to run on Hadoop 2.2.

 What will be the first Spark version with these changes?  Are the Akka
 features that Spark relies on stable in 2.3-M1?

 thanks,
 Sandy



 On Tue, Nov 5, 2013 at 12:12 AM, Liu, Raymond raymond@intel.com
 wrote:

  I just pushed a pull request for Hadoop 2.2.0, based on the scala-2.10
  branch.
  Yarn-standalone mode works, but it still needs some fine-tuning.
  It is not really meant to be merged yet; it is a placeholder, and is
  there for anyone who wants to take a look.
 
  Best Regards,
  Raymond Liu
 
 
  -Original Message-
  From: Reynold Xin [mailto:r...@apache.org]
  Sent: Tuesday, November 05, 2013 10:07 AM
  To: dev@spark.incubator.apache.org
  Subject: Re: issue regarding akka, protobuf and Hadoop version
 
  I think we are near the end of Scala 2.9.3 development, and will merge the
  Scala 2.10 branch into master and make it the future very soon (maybe next
  week).  This problem will go away.
 
  Meantime, we are relying on periodically merging the master into the Scala
  2.10 branch.
 
 
  On Mon, Nov 4, 2013 at 5:53 PM, Liu, Raymond raymond@intel.com
  wrote:
 
   I plan to do the work on the scala-2.10 branch, which has already moved
   to Akka 2.2.3. I hope that moving to Akka 2.3-M1 (which supports
   protobuf 2.5.x) will not cause many problems; it can serve as a test to
   see whether there are further issues, and then we can wait for the
   formal release of Akka 2.3.x.
  
   The problem is that many commits on the master branch have not been
   merged into the scala-2.10 branch yet. The latest merge seems to have
   happened on Oct. 11, and, as I mentioned in the dev branch merge/sync
   thread, many earlier commits seem not to be included, which will surely
   mean extra work for future merging/rebasing. So again, what is the code
   sync strategy, and what is the plan for merging back into master?
  
   Best Regards,
   Raymond Liu
  
  
   -Original Message-
   From: Reynold Xin [mailto:r...@apache.org]
   Sent: Tuesday, November 05, 2013 8:34 AM
   To: dev@spark.incubator.apache.org
   Subject: Re: issue regarding akka, protobuf and Hadoop version
  
   I chatted with Matt Massie about this, and here are some options:
  
   1. Use dependency injection in google-guice to make Akka use one
   version of protobuf, and YARN use the other version.
  
   2. Look into OSGi to accomplish the same goal.
  
   3. Rewrite the messaging part of Spark to use a simple, custom RPC
   library instead of Akka. We are really only using a very simple subset
   of Akka features, and we can probably implement a simple RPC library
   tailored for Spark quickly. We should only do this as the last resort.
  
   4. Talk to Akka guys and hope they can make a maintenance release of
   Akka that supports protobuf 2.5.
  
  
   None of these are ideal, but we'd have to pick one. It would be great
   if you have other suggestions.
  
  
   On Sun, Nov 3, 2013 at 11:46 PM, Liu, Raymond raymond@intel.com
   wrote:
  
    Hi,
   
    I am working on porting Spark to Hadoop 2.2.0. With some renaming and
    the calls into the new YARN API done, I can bring up the Spark master.
    However, the Executor actor cannot connect to the Driver actor.
   
    After some investigation, I found that the root cause is that
    akka-remote does not support protobuf 2.5.0 before 2.3, and Hadoop
    has moved to protobuf 2.5.0 as of 2.1-beta.
   
    The issue is that if I exclude the akka dependency from Hadoop and
    force the protobuf dependency to 2.4.1, compiling/packaging fails
    because the Hadoop common jar requires a new interface from protobuf
    2.5.0.
   
    So, any suggestions on this?
   
    Best Regards,
    Raymond Liu
   
  
 



Re: issue regarding akka, protobuf and Hadoop version

2013-11-04 Thread Reynold Xin
Adding in a few guys so they can chime in.


On Mon, Nov 4, 2013 at 4:33 PM, Reynold Xin r...@apache.org wrote:

 I chatted with Matt Massie about this, and here are some options:

 1. Use dependency injection in google-guice to make Akka use one version
 of protobuf, and YARN use the other version.

 2. Look into OSGi to accomplish the same goal.

 3. Rewrite the messaging part of Spark to use a simple, custom RPC library
 instead of Akka. We are really only using a very simple subset of Akka
 features, and we can probably implement a simple RPC library tailored for
 Spark quickly. We should only do this as the last resort.

 4. Talk to Akka guys and hope they can make a maintenance release of Akka
 that supports protobuf 2.5.


 None of these are ideal, but we'd have to pick one. It would be great if
 you have other suggestions.


 On Sun, Nov 3, 2013 at 11:46 PM, Liu, Raymond raymond@intel.com wrote:

 Hi,

 I am working on porting Spark to Hadoop 2.2.0. With some renaming and the
 calls into the new YARN API done, I can bring up the Spark master.
 However, the Executor actor cannot connect to the Driver actor.

 After some investigation, I found that the root cause is that akka-remote
 does not support protobuf 2.5.0 before 2.3, and Hadoop has moved to
 protobuf 2.5.0 as of 2.1-beta.

 The issue is that if I exclude the akka dependency from Hadoop and force
 the protobuf dependency to 2.4.1, compiling/packaging fails because the
 Hadoop common jar requires a new interface from protobuf 2.5.0.

 So, any suggestions on this?

 Best Regards,
 Raymond Liu
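
Illustration for option 3 in the list quoted above (not part of the thread): a minimal sketch of what a "simple, custom RPC library" might look like. It is written in Python for brevity rather than in Scala, and it is not Spark or Akka code. The length-prefixed JSON wire format, the handler names (register_executor, add), and all helper functions are hypothetical choices made for this example only.

```python
import json
import socket
import socketserver
import struct
import threading


def send_msg(sock, obj):
    """Send one JSON message prefixed with its 4-byte big-endian length."""
    payload = json.dumps(obj).encode("utf-8")
    sock.sendall(struct.pack(">I", len(payload)) + payload)


def recv_exact(sock, n):
    """Read exactly n bytes, raising if the peer closes the connection early."""
    buf = b""
    while len(buf) < n:
        chunk = sock.recv(n - len(buf))
        if not chunk:
            raise ConnectionError("peer closed connection")
        buf += chunk
    return buf


def recv_msg(sock):
    """Receive one length-prefixed JSON message."""
    (length,) = struct.unpack(">I", recv_exact(sock, 4))
    return json.loads(recv_exact(sock, length).decode("utf-8"))


# Hypothetical endpoints standing in for driver-side message handlers.
HANDLERS = {
    "register_executor": lambda executor_id: {"registered": executor_id},
    "add": lambda a, b: a + b,
}


class RpcHandler(socketserver.BaseRequestHandler):
    """Serve exactly one request per connection."""

    def handle(self):
        request = recv_msg(self.request)
        try:
            result = HANDLERS[request["method"]](*request["args"])
            send_msg(self.request, {"ok": True, "result": result})
        except Exception as exc:  # report any failure back to the caller
            send_msg(self.request, {"ok": False, "error": repr(exc)})


def start_server():
    """Start the server on an ephemeral loopback port in a daemon thread."""
    server = socketserver.ThreadingTCPServer(("127.0.0.1", 0), RpcHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


def call(address, method, *args):
    """Open a connection, send one request, and return the decoded result."""
    with socket.create_connection(address) as sock:
        send_msg(sock, {"method": method, "args": list(args)})
        reply = recv_msg(sock)
    if not reply["ok"]:
        raise RuntimeError(reply["error"])
    return reply["result"]
```

With a server started via start_server(), a caller would invoke call(server.server_address, "register_executor", "exec-1"). The sketch leaves out everything that makes a real RPC layer hard (connection reuse, timeouts, retries, failure detection, typed messages), which may be part of why the thread treats this option as a last resort.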





RE: issue regarding akka, protobuf and Hadoop version

2013-11-04 Thread Liu, Raymond
I plan to do the work on the scala-2.10 branch, which has already moved to
Akka 2.2.3. I hope that moving to Akka 2.3-M1 (which supports protobuf 2.5.x)
will not cause many problems; it can serve as a test to see whether there are
further issues, and then we can wait for the formal release of Akka 2.3.x.

The problem is that many commits on the master branch have not been merged
into the scala-2.10 branch yet. The latest merge seems to have happened on
Oct. 11, and, as I mentioned in the dev branch merge/sync thread, many earlier
commits seem not to be included, which will surely mean extra work for future
merging/rebasing. So again, what is the code sync strategy, and what is the
plan for merging back into master?

Best Regards,
Raymond Liu


-Original Message-
From: Reynold Xin [mailto:r...@apache.org] 
Sent: Tuesday, November 05, 2013 8:34 AM
To: dev@spark.incubator.apache.org
Subject: Re: issue regarding akka, protobuf and Hadoop version

I chatted with Matt Massie about this, and here are some options:

1. Use dependency injection in google-guice to make Akka use one version of 
protobuf, and YARN use the other version.

2. Look into OSGi to accomplish the same goal.

3. Rewrite the messaging part of Spark to use a simple, custom RPC library 
instead of Akka. We are really only using a very simple subset of Akka 
features, and we can probably implement a simple RPC library tailored for Spark 
quickly. We should only do this as the last resort.

4. Talk to Akka guys and hope they can make a maintenance release of Akka that 
supports protobuf 2.5.


None of these are ideal, but we'd have to pick one. It would be great if you 
have other suggestions.


On Sun, Nov 3, 2013 at 11:46 PM, Liu, Raymond raymond@intel.com wrote:

 Hi,

 I am working on porting Spark to Hadoop 2.2.0. With some renaming and the
 calls into the new YARN API done, I can bring up the Spark master.
 However, the Executor actor cannot connect to the Driver actor.

 After some investigation, I found that the root cause is that akka-remote
 does not support protobuf 2.5.0 before 2.3, and Hadoop has moved to
 protobuf 2.5.0 as of 2.1-beta.

 The issue is that if I exclude the akka dependency from Hadoop and force
 the protobuf dependency to 2.4.1, compiling/packaging fails because the
 Hadoop common jar requires a new interface from protobuf 2.5.0.

 So, any suggestions on this?

 Best Regards,
 Raymond Liu
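
Illustration of the version conflict described above (not part of the thread): a small Python sketch that models why no single protobuf version works for Akka 2.2.3 together with Hadoop 2.2.0, and why moving to Akka 2.3-M1 is expected to resolve it. The compatibility sets only encode what the messages claim; they are assumptions for the example, not an authoritative compatibility matrix, and the function names are hypothetical.

```python
# Protobuf versions each component is said to work with, per this thread.
SUPPORTED_PROTOBUF = {
    # Assumed from "akka-remote does not support protobuf 2.5.0 before 2.3".
    "akka-remote 2.2.3": {"2.4.1"},
    # Per the thread: Akka 2.3-M1 supports protobuf 2.5.x.
    "akka-remote 2.3-M1": {"2.5.0"},
    # Per the thread: the Hadoop common jar needs an interface new in 2.5.0.
    "hadoop-common 2.2.0": {"2.5.0"},
}


def shared_protobuf_versions(*components):
    """Return the protobuf versions that every listed component accepts."""
    return set.intersection(*(SUPPORTED_PROTOBUF[name] for name in components))


def describe(*components):
    """Summarize whether one protobuf version can serve all components."""
    versions = shared_protobuf_versions(*components)
    if not versions:
        return "no single protobuf version satisfies: " + ", ".join(components)
    return "use protobuf " + ", ".join(sorted(versions))


print(describe("akka-remote 2.2.3", "hadoop-common 2.2.0"))
print(describe("akka-remote 2.3-M1", "hadoop-common 2.2.0"))
```

Under these assumptions the first combination has an empty intersection, which matches the failure reported when forcing protobuf 2.4.1, while the second combination agrees on 2.5.0. Options 1 and 2 in the thread (Guice, OSGi) take a different route: they try to let both versions coexist instead of finding one shared version.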