Re: Tungsten in a mixed endian environment

2016-01-12 Thread Reynold Xin
How big of a deal is this use case in a heterogeneous endianness environment? If we do want to fix it, we should do it right before Spark shuffles data to minimize the performance penalty, i.e. turn big-endian encoded data into little-endian encoded data before it goes on the wire. This is a
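
To make the byte-order idea concrete, here is a minimal Python sketch of normalizing big-endian data to little-endian before it is sent. It is an illustration only, not Spark or Tungsten code: the helper name `to_little_endian` and the assumption that the buffer consists solely of fixed 8-byte words are hypothetical, and a real row format with mixed field widths would need more care.

```python
import struct


def to_little_endian(raw: bytes, word_size: int = 8) -> bytes:
    """Reverse every fixed-width word so big-endian words become little-endian.

    Hypothetical helper: assumes the buffer holds only word_size-byte values.
    """
    if len(raw) % word_size != 0:
        raise ValueError("buffer length must be a multiple of word_size")
    return b"".join(
        raw[i:i + word_size][::-1] for i in range(0, len(raw), word_size)
    )


values = [1, 258, 2**40 + 7]

# Data as a big-endian sender would encode it (three signed 64-bit integers).
big = struct.pack(">3q", *values)

# Normalize once on the sending side, before the bytes go on the wire.
wire = to_little_endian(big)

# A little-endian receiver now decodes the original values.
decoded = list(struct.unpack("<3q", wire))

# Without the conversion, the same receiver misreads every word.
misread = list(struct.unpack("<3q", big))
```

Converting once at the sender keeps the receiving side free of per-read byte swaps, which is the performance argument made above.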

Re: Write access to wiki

2016-01-12 Thread shane knapp
> Ok, sounds good. I think it would be great, if you could add installing the > 'docker-engine' package and starting the 'docker' service in there too. I > was planning to update the playbook if there were one in the apache/spark > repo but I didn't see one, hence my question. > we currently have

Re: Eigenvalue solver

2016-01-12 Thread David Hall
(I don't know anything spark specific, so I'm going to treat it like a Breeze question...) As I understand it, Spark uses ARPACK via Breeze for SVD, and presumably the same approach can be used for EVD. Basically, you make a function that multiplies your "matrix" (which might be represented
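
The "function that multiplies your matrix by a vector" idea can be sketched in plain Python. Note the assumptions: this uses power iteration, a much simpler method than the one ARPACK implements, and it finds only the dominant eigenpair; the names `power_iteration` and `rank_one_matvec` are hypothetical and nothing here is Breeze or Spark API. The point is only that the solver never sees the matrix itself, just the multiplication function.

```python
import math


def power_iteration(matvec, n, iterations=200, tol=1e-10):
    """Estimate the dominant eigenpair using only matrix-vector products."""
    v = [0.0] * n
    v[0] = 1.0  # simple start vector; must not be orthogonal to the answer
    eigenvalue = 0.0
    for _ in range(iterations):
        w = matvec(v)
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            raise ValueError("matvec returned the zero vector")
        w = [x / norm for x in w]
        # Rayleigh quotient of the normalized iterate.
        estimate = sum(a * b for a, b in zip(w, matvec(w)))
        converged = abs(estimate - eigenvalue) < tol
        v, eigenvalue = w, estimate
        if converged:
            break
    return eigenvalue, v


def rank_one_matvec(v):
    """Multiply by A = I + 1 1^T without ever storing the n x n matrix."""
    total = sum(v)
    return [x + total for x in v]


# For n = 1000 the dominant eigenvalue of A is n + 1 = 1001.
eigenvalue, vector = power_iteration(rank_one_matvec, 1000)
```

In a distributed setting the body of the multiplication function is where the parallel work would go, while the iteration itself stays on the driver.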

Re: Write access to wiki

2016-01-12 Thread Nick Pentreath
I'd also like to get Wiki write access - at the least it allows a few of us to amend the "Powered By" and similar pages when those requests come through (Sean has been doing a lot of that recently :) On Mon, Jan 11, 2016 at 11:01 PM, Sean Owen wrote: > ... I forget who can

Re: Dependency on TestingUtils in a Spark package

2016-01-12 Thread Reynold Xin
If you need it, just copy it over to your own package. That's probably the safest option. On Tue, Jan 12, 2016 at 12:50 PM, Ted Yu wrote: > There is no annotation in TestingUtils class indicating whether it is > suitable for consumption by external projects. > > You

Dependency on TestingUtils in a Spark package

2016-01-12 Thread Robert Dodier
Hi, I'm putting together a Spark package (in the spark-packages.org sense) and I'd like to make use of the class org.apache.spark.mllib.util.TestingUtils which appears in mllib/src/test. Can I declare a dependency in my build.sbt to pull in a suitable jar? I have searched around but I have not

Re: Dependency on TestingUtils in a Spark package

2016-01-12 Thread Ted Yu
There is no annotation in TestingUtils class indicating whether it is suitable for consumption by external projects. You should assume the class is not public since its methods may change in future Spark releases. Cheers On Tue, Jan 12, 2016 at 12:36 PM, Robert Dodier

ROSE: Spark + R on the JVM.

2016-01-12 Thread David
Hi all, I'd like to share news of the recent release of a new Spark package, [ROSE](http://spark-packages.org/package/onetapbeyond/opencpu-spark-executor). ROSE is a Scala library offering access to the full scientific computing power of the R programming language to Apache Spark batch and

Re: Tungsten in a mixed endian environment

2016-01-12 Thread Ted Yu
I logged SPARK-12778; endian awareness in Platform.java should help in a mixed-endian setup. There could be other parts of the code base which are related. Cheers On Tue, Jan 12, 2016 at 7:01 AM, Adam Roberts wrote: > Hi all, I've been experimenting with DataFrame

Tungsten in a mixed endian environment

2016-01-12 Thread Adam Roberts
Hi all, I've been experimenting with DataFrame operations in a mixed-endian environment: a big-endian master with little-endian workers. With Tungsten enabled I'm encountering data corruption issues. For example, with this simple test code: import org.apache.spark.SparkContext import

Eigenvalue solver

2016-01-12 Thread Lydia Ickler
Hi, I wanted to know if there are any implementations yet, within the Machine Learning Library or generally, that can efficiently solve eigenvalue problems? Or, if not, do you have suggestions on how to approach a parallel execution, maybe with BLAS or Breeze? Thanks in advance! Lydia From my

Re: ROSE: Spark + R on the JVM.

2016-01-12 Thread Corey Nolet
David, Thank you very much for announcing this! It looks like it could be very useful. Would you mind providing a link to the github? On Tue, Jan 12, 2016 at 10:03 AM, David wrote: > Hi all, > > I'd like to share news of the recent release of a new Spark

Re: ROSE: Spark + R on the JVM.

2016-01-12 Thread David Russell
Hi Corey, > Would you mind providing a link to the github? Sure, here is the github link you're looking for: https://github.com/onetapbeyond/opencpu-spark-executor David "All that is gold does not glitter, Not all those who wander are lost." Original Message Subject: Re: