Have you looked at ikvm?

From: Kenneth Tran<mailto:o...@kentran.net>
Sent: 12/16/2013 7:43 PM
To: user<mailto:user@spark.incubator.apache.org>
Subject: Re: Best ways to use Spark with .NET code

Hi Matei,

1. If I understand pipe correctly, I don't think that it can solve the problem 
if the algorithm is iterative and requires a reduction step in each iteration. 
Consider this simple linear regression example:

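            // Example: Batch-gradient-descent linear regression, ignoring 
            for (int i = 0; i < NIter; i++) {
                // Reduction over the whole data set ("w dot p.x" is pseudocode for the inner product)
                var gradient = data.Sum(p => (w dot p.x - p.y) * p.x);
                // The update needs the fully reduced gradient
                w -= rate * gradient;
            }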

In order to use pipe as you said, one needs to move the for loop to the calling 
code (in Java), which may not be simple when dealing with more complex code and 
would still require (major) re-factoring of the ML libraries. Furthermore, 
there will be I/O at each iteration, which makes Spark no different from 
Hadoop MapReduce.
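
To make the per-iteration reduction concrete, here is a minimal, self-contained 
sketch of the same loop in plain Python (no Spark, no .NET). The names 
batch_gradient_descent, dot, data, rate and n_iter are illustrative assumptions, 
not part of any library. It only shows why each weight update has to wait for a 
sum over the entire data set.

```python
def dot(a, b):
    """Inner product of two equal-length lists of floats."""
    return sum(x * y for x, y in zip(a, b))


def batch_gradient_descent(data, w, rate, n_iter):
    """Least-squares batch gradient descent.

    data: list of (x, y) pairs, where x is a list of floats and y a float.
    w:    initial weights, same length as each x.
    """
    for _ in range(n_iter):
        # Reduction step: sum the per-point gradients over the whole data set.
        gradient = [0.0] * len(w)
        for x, y in data:
            err = dot(w, x) - y
            gradient = [g + err * xi for g, xi in zip(gradient, x)]
        # Update step: uses the fully reduced gradient, so the next
        # iteration cannot start before the reduction has finished.
        w = [wi - rate * gi for wi, gi in zip(w, gradient)]
    return w
```

In a distributed setting the inner loop over data is the part that would run on 
the cluster, while the update of w happens once per iteration on the driver.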

2. Before asking this, I also looked at jni4net. Besides its usage 
complexity, jni4net has a few red flags:

 *   It hasn't been developed since 2011, even though its latest status is still alpha.
 *   Its license terms (and code integrity) may not pass our legal department.
 *   Its robustness and efficiency are dubious.

Anyway, I'm looking at some other alternatives (e.g. JNBridge).


On Mon, Dec 16, 2013 at 12:04 PM, Matei Zaharia 
<matei.zaha...@gmail.com<mailto:matei.zaha...@gmail.com>> wrote:
Hi Kenneth,

Try using the RDD.pipe() operator in Spark, which lets you call out to an 
external process by passing data to it through standard in/out. This will let 
you call programs written in C# (e.g. that use your ML libraries) from a Spark 

I believe there are other projects enabling communication from Java to .NET, 
e.g. http://jni4net.sourceforge.net, but I’m not sure how easy they’ll be to 

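As a rough illustration of the stdin/stdout contract that pipe() relies on: 
each RDD element is written to the external process as one line of text, and 
each line the process prints becomes one element of the resulting RDD. The 
sketch below is written in Python only for brevity. A C# console program would 
follow the same pattern. The record format (comma-separated numbers) and the 
names transform and run are assumptions made for the example.

```python
import sys


def transform(line):
    """Hypothetical per-record work: parse comma-separated floats, emit their sum."""
    values = [float(v) for v in line.split(",")]
    return str(sum(values))


def run(in_stream, out_stream):
    """Read one record per line, write one result per line."""
    for line in in_stream:
        line = line.strip()
        if line:  # skip blank lines
            out_stream.write(transform(line) + "\n")


# As a standalone script, the entry point would be: run(sys.stdin, sys.stdout)
```

On the Spark side, the call would look like rdd.pipe(command), where command 
is whatever launches the external program. The result is an RDD of strings.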

On Dec 16, 2013, at 10:54 AM, Kenneth Tran 
<o...@kentran.net<mailto:o...@kentran.net>> wrote:


We have a large ML code base in .NET. Spark seems cool and we want to leverage 
it. What would be the best strategies to bridge our .NET code and Spark?

 1.  Initiate a Spark .NET project
 2.  A lightweight bridge between .NET and Java

While (1) sounds too daunting, it's not clear to me how to do (2) easily and 

I'm willing to contribute to (1) if there's already an existing effort.
