Re: Mahout-1539-computation of gaussian kernel between 2 arrays of shapes
On Thu, Sep 25, 2014 at 8:50 AM, Dmitriy Lyubimov wrote:
> As for pure scala backend, it already exists and it is called Breeze
> project (something MLib uses internally), supported by David Hall (among
> others). It also includes a lot more common non-distributed math than just
> algebra. By my estimate, it is one of the most well-rounded and
> comprehensive math libraries in existence today ...
> for JVM.
Re: Mahout-1539-computation of gaussian kernel between 2 arrays of shapes
To be absolutely frank, if I could easily divorce from Colt, I would have divorced the entire Scala code from Mahout. Unfortunately, that is not very realistic for me at the moment. More hopefully, we could patch Colt's major problems and add new backends there.

As for a pure Scala backend, it already exists and is called the Breeze project (something MLlib uses internally), supported by David Hall (among others). It also includes a lot more common non-distributed math than just algebra. By my estimate, it is one of the most well-rounded and comprehensive math libraries in existence today. In the past, however, it had significant difficulties with sparse/dense operation optimizations, as well as with modelling; I am not sure how it stands at this very moment. At some point Colt was marginally better at typing sparse in-memory idioms.

On Thu, Sep 25, 2014 at 5:32 AM, Saikat Kanjilal wrote:
> From a big picture perspective do we intend to keep colt around or write
> scala implementations for functions like the aggregate, if so then I can
> add scala code to do the aggregation and call it from the DSL for the norm.
Re: Mahout-1539-computation of gaussian kernel between 2 arrays of shapes
Scala function literals (or any function literals) derive from a particular set of traits. It may be that Java classes are able to implement these traits (nobody I know has attempted that), and then maybe they would become supported as Scala function types. But I think even that is a big if, since the Scala compiler tinkers with bytecode a lot, and compatibility at the bytecode level is not guaranteed between Scala major releases. Bottom line: even if it is possible to write Scala functions in Java, it is definitely not a publicly documented feature.

On the other hand, it is possible to use "function-like" Colt classes such as DoubleDoubleFunction just like plain old reference-type objects from either Scala or Java, which is exactly what happens in the example given in the original question.

On Thu, Sep 25, 2014 at 12:24 AM, Ted Dunning wrote:
> On Wed, Sep 24, 2014 at 11:09 PM, Dmitriy Lyubimov wrote:
>
> > Aggregate is Colt's thing. Colt (aka Mahout-math) establish java-side
> > concept of different function types which are unfortunately not compatible
> > with Scala literals.
>
> Dmitriy,
>
> Is this because we have other methods that describe the characteristics of
> the function?
>
> What would be the Scala friendly idiom? Additional traits?
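As one possible answer to the "Scala friendly idiom" question above, a thin adapter can wrap a Scala literal in a Java-style function object. The sketch below is self-contained and only illustrative: BinaryFn is a hypothetical stand-in for a Colt-style class such as DoubleDoubleFunction, and the implicit conversion and the fold method are assumptions about how a DSL might bridge the two, not existing Mahout code.

```scala
import scala.language.implicitConversions

object FunctionBridgeSketch {
  /** Hypothetical stand-in for a Java-side binary function type. */
  abstract class BinaryFn {
    def apply(a: Double, b: Double): Double
  }

  /** Wraps a Scala function literal in the Java-style function object. */
  implicit def literalToBinaryFn(f: (Double, Double) => Double): BinaryFn =
    new BinaryFn {
      def apply(a: Double, b: Double): Double = f(a, b)
    }

  /** A Java-style API that only accepts BinaryFn, as an aggregate-like method would.
    * Throws on an empty array, because reduce has no seed value. */
  def fold(xs: Array[Double], combine: BinaryFn): Double =
    xs.reduce((a: Double, b: Double) => combine(a, b))

  // With the conversion available, a plain literal can be passed directly.
  val total: Double = fold(Array(1.0, 2.0, 3.0), (a: Double, b: Double) => a + b)
}
```

Scala 2.12 and later can also convert a function literal to a single-abstract-method type directly, which may make such an adapter unnecessary on newer compilers.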
Re: Mahout-1539-computation of gaussian kernel between 2 arrays of shapes
From a big-picture perspective, do we intend to keep Colt around, or to write Scala implementations of functions like aggregate? If the latter, I can add Scala code to do the aggregation and call it from the DSL for the norm.

> On Sep 25, 2014, at 12:25 AM, Ted Dunning wrote:
>
> On Wed, Sep 24, 2014 at 11:09 PM, Dmitriy Lyubimov wrote:
>
>> Aggregate is Colt's thing. Colt (aka Mahout-math) establish java-side
>> concept of different function types which are unfortunately not compatible
>> with Scala literals.
>
> Dmitriy,
>
> Is this because we have other methods that describe the characteristics of
> the function?
>
> What would be the Scala friendly idiom? Additional traits?
Re: Mahout-1539-computation of gaussian kernel between 2 arrays of shapes
On Wed, Sep 24, 2014 at 11:09 PM, Dmitriy Lyubimov wrote:
> Aggregate is Colt's thing. Colt (aka Mahout-math) establish java-side
> concept of different function types which are unfortunately not compatible
> with Scala literals.

Dmitriy,

Is this because we have other methods that describe the characteristics of the function?

What would be the Scala friendly idiom? Additional traits?
Re: Mahout-1539-computation of gaussian kernel between 2 arrays of shapes
On Wed, Sep 24, 2014 at 9:15 PM, Saikat Kanjilal wrote:
> Shannon/Dmitry,Quick question, I'm wanting to calculate the scala
> equivalent of the frobenius norm per this API spec in python (
> http://docs.scipy.org/doc/numpy/reference/generated/numpy.linalg.norm.html),
> I dug into the mahout-math-scala project and found the following API to
> calculate the norm:
>
> def norm = sqrt(m.aggregate(Functions.PLUS, Functions.SQUARE))
>
> I believe the above is also calculating the frobenius norm, however I am
> curious why we are calling a Java API from scala, the type of m above is a
> java interface called Matrix, I'm guessing the implementation of aggregate
> is happening in the math-math-scala somewhere, is that assumption correct?

We are calling Colt (i.e. Java) for pretty much everything. As far as the Scala bindings are concerned, they are but a DSL wrapper around Colt (unlike the distributed algebra, which is much more).

Aggregate is Colt's thing. Colt (aka Mahout-math) establishes a Java-side concept of different function types, which unfortunately are not compatible with Scala literals.

> Thanks in advance.
Re: Mahout-1539-computation of gaussian kernel between 2 arrays of shapes
Yes, that code computes the Frobenius norm. I can't answer the context question about Scala calling Java, however.

On Wed, Sep 24, 2014 at 9:15 PM, Saikat Kanjilal wrote:
> Shannon/Dmitry,Quick question, I'm wanting to calculate the scala
> equivalent of the frobenius norm per this API spec in python (
> http://docs.scipy.org/doc/numpy/reference/generated/numpy.linalg.norm.html),
> I dug into the mahout-math-scala project and found the following API to
> calculate the norm:
>
> def norm = sqrt(m.aggregate(Functions.PLUS, Functions.SQUARE))
>
> I believe the above is also calculating the frobenius norm, however I am
> curious why we are calling a Java API from scala, the type of m above is a
> java interface called Matrix, I'm guessing the implementation of aggregate
> is happening in the math-math-scala somewhere, is that assumption correct?
> Thanks in advance.
RE: Mahout-1539-computation of gaussian kernel between 2 arrays of shapes
Shannon/Dmitriy,

Quick question: I want to calculate the Scala equivalent of the Frobenius norm per this Python API spec (http://docs.scipy.org/doc/numpy/reference/generated/numpy.linalg.norm.html). I dug into the mahout-math-scala project and found the following API to calculate the norm:

    def norm = sqrt(m.aggregate(Functions.PLUS, Functions.SQUARE))

I believe the above also calculates the Frobenius norm. However, I am curious why we are calling a Java API from Scala: the type of m above is a Java interface called Matrix. I'm guessing the implementation of aggregate lives somewhere in mahout-math-scala; is that assumption correct?

Thanks in advance.

> From: sxk1...@hotmail.com
> To: dev@mahout.apache.org
> Subject: RE: Mahout-1539-computation of gaussian kernel between 2 arrays of shapes
> Date: Thu, 18 Sep 2014 12:51:36 -0700
>
> Ok great I'll use the cartesian spark API call, so what I'd still like some
> thoughts on where the code that calls the cartesian should live in our
> directory structure.
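The one-liner in the message above reads as "transform each element with SQUARE, combine the results with PLUS, then take the square root", i.e. the Frobenius norm sqrt(sum over i, j of a_ij^2). A plain-Scala sketch of that aggregate pattern follows. It assumes a dense Array[Array[Double]] in place of Mahout's Matrix, and aggregate and frobeniusNorm here are illustrative stand-ins, not the Colt implementation.

```scala
object FrobeniusSketch {
  /** Applies `transform` to every element and folds the results with `combine`.
    * Returning 0.0 for an empty matrix is a choice made for this sketch. */
  def aggregate(m: Array[Array[Double]])(combine: (Double, Double) => Double,
                                         transform: Double => Double): Double = {
    val mapped = m.iterator.flatMap(_.iterator).map(transform)
    if (mapped.hasNext) mapped.reduce(combine) else 0.0
  }

  /** Frobenius norm: square root of the sum of squared elements. */
  def frobeniusNorm(m: Array[Array[Double]]): Double =
    math.sqrt(aggregate(m)(_ + _, x => x * x))
}
```

For example, FrobeniusSketch.frobeniusNorm(Array(Array(3.0, 0.0), Array(0.0, 4.0))) sums 9 + 16 and returns 5.0.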
Re: Mahout-1539-computation of gaussian kernel between 2 arrays of shapes
You want a REALLY-REALLY big matrix? As in a distributed matrix?

On Thu, Sep 18, 2014 at 12:28 PM, Saikat Kanjilal wrote:
> http://scikit-learn.org/stable/modules/generated/sklearn.metrics.pairwise.rbf_kernel.html
> I need to implement the above in the scala world and expose a DSL API to
> call the computation when computing the affinity matrix.
RE: Mahout-1539-computation of gaussian kernel between 2 arrays of shapes
OK, great, I'll use the Spark cartesian API call. I'd still like some thoughts on where the code that calls cartesian should live in our directory structure.

> Date: Thu, 18 Sep 2014 15:33:59 -0400
> From: squ...@gatech.edu
> To: dev@mahout.apache.org
> Subject: Re: Mahout-1539-computation of gaussian kernel between 2 arrays of shapes
>
> Saikat,
>
> Spark has the cartesian() method that will align all pairs of points;
> that's the nontrivial part of determining an RBF kernel. After that it's
> a simple matter of performing the equation that's given on the
> scikit-learn doc page.
>
> However, like you said it'll also have to be implemented using the
> Mahout DSL. I can envision that users would like to compute pairwise
> metrics for a lot more than just RBF kernels (pairwise Euclidean
> distance, etc), so my guess would be a DSL implementation of cartesian()
> is what you're looking for. You can build the other methods on top of that.
>
> Correct me if I'm wrong.
>
> Shannon
Re: Mahout-1539-computation of gaussian kernel between 2 arrays of shapes
Saikat,

Spark has the cartesian() method that will align all pairs of points; that's the nontrivial part of determining an RBF kernel. After that it's a simple matter of performing the equation that's given on the scikit-learn doc page.

However, like you said it'll also have to be implemented using the Mahout DSL. I can envision that users would like to compute pairwise metrics for a lot more than just RBF kernels (pairwise Euclidean distance, etc), so my guess would be a DSL implementation of cartesian() is what you're looking for. You can build the other methods on top of that.

Correct me if I'm wrong.

Shannon

On 9/18/14, 3:28 PM, Saikat Kanjilal wrote:
> http://scikit-learn.org/stable/modules/generated/sklearn.metrics.pairwise.rbf_kernel.html
> I need to implement the above in the scala world and expose a DSL API to
> call the computation when computing the affinity matrix.
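The idea in the message above, pairing every point with every other point and then applying a metric to each pair, can be sketched on plain Scala collections. This is only an in-memory stand-in for Spark's cartesian(); the object PairwiseSketch and its helpers are hypothetical names chosen for illustration, not an existing Mahout DSL API.

```scala
object PairwiseSketch {
  type Vec = Array[Double]

  /** All (x, y) pairs in row-major order: an in-memory analogue of a cartesian product. */
  def cartesian[A, B](xs: Seq[A], ys: Seq[B]): Seq[(A, B)] =
    for (x <- xs; y <- ys) yield (x, y)

  /** Applies `metric` to every pair and lays the results out as an
    * xs.size by ys.size table (one inner Seq per element of xs). */
  def pairwise(xs: Seq[Vec], ys: Seq[Vec])(metric: (Vec, Vec) => Double): Seq[Seq[Double]] =
    if (ys.isEmpty) xs.map(_ => Seq.empty[Double])
    else cartesian(xs, ys).map { case (x, y) => metric(x, y) }.grouped(ys.size).toList

  /** Euclidean distance, one example of a metric that can be plugged in. */
  def euclidean(x: Vec, y: Vec): Double = {
    require(x.length == y.length, "vectors must have the same length")
    math.sqrt(x.zip(y).map { case (a, b) => (a - b) * (a - b) }.sum)
  }
}
```

Any other pairwise metric, an RBF kernel included, can be passed as the metric argument, which is the "build the other methods on top of that" design the message suggests.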
RE: Mahout-1539-computation of gaussian kernel between 2 arrays of shapes
http://scikit-learn.org/stable/modules/generated/sklearn.metrics.pairwise.rbf_kernel.html

I need to implement the above in the Scala world and expose a DSL API to call the computation when computing the affinity matrix.

> From: ted.dunn...@gmail.com
> Date: Thu, 18 Sep 2014 10:04:34 -0700
> Subject: Re: Mahout-1539-computation of gaussian kernel between 2 arrays of shapes
> To: dev@mahout.apache.org
>
> There are number of non-traditional linear algebra operations like this
> that are important to implement.
>
> Can you describe what you intend to do so that we can discuss the shape of
> the API and computation?
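For reference, the scikit-learn function linked in the message above computes K(x, y) = exp(-gamma * ||x - y||^2) for every row x of one array and every row y of the other, with gamma defaulting to 1 / n_features when it is not given. A minimal in-memory sketch in plain Scala follows. It assumes dense rows stored as Array[Double]; RbfKernelSketch, sqDist and rbfKernel are hypothetical names and not part of mahout-math-scala.

```scala
object RbfKernelSketch {
  /** Squared Euclidean distance between two equally sized vectors. */
  def sqDist(x: Array[Double], y: Array[Double]): Double = {
    require(x.length == y.length, "vectors must have the same length")
    var s = 0.0
    var i = 0
    while (i < x.length) {
      val d = x(i) - y(i)
      s += d * d
      i += 1
    }
    s
  }

  /** Kernel matrix with K(i)(j) = exp(-gamma * ||xs(i) - ys(j)||^2). */
  def rbfKernel(xs: Array[Array[Double]],
                ys: Array[Array[Double]],
                gamma: Double): Array[Array[Double]] =
    xs.map(x => ys.map(y => math.exp(-gamma * sqDist(x, y))))
}
```

The result has one row per row of xs and one column per row of ys, so an affinity matrix for a single data set would come from passing the same array as both arguments.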
Re: Mahout-1539-computation of gaussian kernel between 2 arrays of shapes
There are a number of non-traditional linear algebra operations like this that are important to implement.

Can you describe what you intend to do so that we can discuss the shape of the API and computation?

On Wed, Sep 17, 2014 at 9:28 PM, Saikat Kanjilal wrote:
> Dmitry et al,As part of the above JIRA I need to calculate the gaussian
> kernel between 2 shapes, I looked through mahout-math-scala and didnt see
> anything to do this, any objections to me adding some code under
> scalabindings to do this?
> Thanks in advance.
Mahout-1539-computation of gaussian kernel between 2 arrays of shapes
Dmitriy et al,

As part of the above JIRA, I need to calculate the Gaussian kernel between two shapes. I looked through mahout-math-scala and didn't see anything that does this. Any objections to me adding some code under scalabindings to do this?

Thanks in advance.