[jira] [Commented] (SPARK-3403) NaiveBayes crashes with blas/lapack native libraries for breeze (netlib-java)

2014-09-19 Thread Sam Halliday (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-3403?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14140319#comment-14140319
 ] 

Sam Halliday commented on SPARK-3403:
-

Thanks guys. This looks like it's even further upstream of me. It would be good 
if you can submit it to OpenBLAS.

I've never seen great gains from OpenBLAS over ATLAS, and certainly the 
AMD/Intel versions are far superior, so I recommend them if performance is 
really critical.

> NaiveBayes crashes with blas/lapack native libraries for breeze (netlib-java)
> -
>
> Key: SPARK-3403
> URL: https://issues.apache.org/jira/browse/SPARK-3403
> Project: Spark
> Issue Type: Bug
> Components: MLlib
> Affects Versions: 1.0.2
> Environment: Setup: Windows 7, x64 libraries for netlib-java (as 
> described on https://github.com/fommil/netlib-java). I used OpenBLAS x64 and 
> MinGW64 precompiled DLLs.
> Reporter: Alexander Ulanov
> Fix For: 1.2.0
>
> Attachments: NativeNN.scala
>
>
> Code:
> val model = NaiveBayes.train(train)
> val predictionAndLabels = test.map { point =>
>   val score = model.predict(point.features)
>   (score, point.label)
> }
> predictionAndLabels.foreach(println)
> Result: 
> program crashes with: "Process finished with exit code -1073741819 
> (0xC0000005)" after displaying the first prediction
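
For reference, the hex form of that exit code is just the signed value 
reinterpreted as an unsigned 32-bit integer, which gives 0xC0000005, the 
Windows status code for an access violation. A minimal sketch of the 
conversion follows; ExitCodes and toHex are names made up for illustration.

```java
final class ExitCodes {
    /** Formats a signed 32-bit process exit code as zero-padded unsigned hex. */
    static String toHex(int exitCode) {
        // %X renders a negative int as its two's-complement bit pattern.
        return String.format("0x%08X", exitCode);
    }

    public static void main(String[] args) {
        System.out.println(toHex(-1073741819));
    }
}
```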



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Created] (SPARK-8815) illegal java package names in jar

2015-07-03 Thread Sam Halliday (JIRA)
Sam Halliday created SPARK-8815:
---

Summary: illegal java package names in jar
Key: SPARK-8815
URL: https://issues.apache.org/jira/browse/SPARK-8815
Project: Spark
Issue Type: Bug
Reporter: Sam Halliday


In ENSIME we were unable to index the Spark jars, so we investigated further... 
you have classes that look like this:

org.spark-project.guava.annotations.VisibleForTesting

Hyphens are not legal in package names according to the Java Language 
Specification, so I'm amazed that this can actually be read at runtime... 
certainly no compiler I know would allow it.

What I suspect is happening is that you're using a build plugin that 
internalises some of your dependencies and it is using your groupId but not 
validating it... and then blindly using that name in the ASM manipulation.

You might want to report this upstream with your build plugin.

For your next release, I recommend using an explicit name that is not your 
groupId, e.g. one that converts hyphens to underscores as Gosling recommends.
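
A minimal sketch of the check and the renaming suggested above, assuming the 
JDK's javax.lang.model.SourceVersion helper is an acceptable stand-in for a 
compiler's validation. PackageNames, isLegal and sanitize are hypothetical 
names, not part of Spark or of any build plugin. The renaming rules follow the 
package naming conventions in the Java Language Specification: replace illegal 
characters with underscores, prefix an underscore when a segment cannot start 
an identifier, and append an underscore to keywords.

```java
import javax.lang.model.SourceVersion;

final class PackageNames {
    /** True if every dot-separated segment is a legal, non-keyword Java identifier. */
    static boolean isLegal(String packageName) {
        return SourceVersion.isName(packageName);
    }

    /** Rewrites each segment so that the result is a legal Java package name. */
    static String sanitize(String packageName) {
        StringBuilder out = new StringBuilder();
        for (String part : packageName.split("\\.")) {
            StringBuilder seg = new StringBuilder();
            for (int i = 0; i < part.length(); i++) {
                char c = part.charAt(i);
                // Hyphens and other illegal characters become underscores.
                seg.append(Character.isJavaIdentifierPart(c) ? c : '_');
            }
            // A segment must not be empty or start with e.g. a digit.
            if (seg.length() == 0 || !Character.isJavaIdentifierStart(seg.charAt(0))) {
                seg.insert(0, '_');
            }
            // Keywords such as "int" cannot be used as a segment.
            if (SourceVersion.isKeyword(seg.toString())) {
                seg.append('_');
            }
            if (out.length() > 0) {
                out.append('.');
            }
            out.append(seg);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        String shaded = "org.spark-project.guava.annotations";
        System.out.println(isLegal(shaded));
        System.out.println(sanitize(shaded));
    }
}
```

A build could run a check like isLegal on the relocation prefix before 
rewriting any class names, and fail fast instead of emitting the jar.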







[jira] [Commented] (SPARK-8815) illegal java package names in jar

2015-07-03 Thread Sam Halliday (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-8815?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14613534#comment-14613534
 ] 

Sam Halliday commented on SPARK-8815:
-

I wouldn't feel too bad about it: the Sun/Oracle J2SE implementation breaks the 
rules in at least one place as well :-) (no capital letters in package names)

See Section 4.3 of http://docs.oracle.com/javase/specs/jvms/se7/html/jvms-4.html

and also

http://www.oracle.com/technetwork/java/codeconventions-135099.html

We have a workaround already, but this might break other dev tooling: 
https://github.com/ensime/ensime-server/blob/master/core/src/test/scala/org/ensime/indexer/DescriptorParserSpec.scala#L69
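
To illustrate why the JVM can still load such classes: at the class-file level 
(JVM spec section 4.2.2) a name segment may contain any character except '.', 
';', '[' and '/', so a hyphen is fine there even though the Java language 
rejects it. Below is a minimal sketch of a lenient parser for object-type field 
descriptors under that rule. It is not the ENSIME workaround linked above, and 
LenientDescriptor and its methods are hypothetical names.

```java
final class LenientDescriptor {
    /** True if s is a class-file unqualified name: non-empty, without . ; [ or / */
    static boolean isUnqualifiedName(String s) {
        if (s.isEmpty()) {
            return false;
        }
        for (int i = 0; i < s.length(); i++) {
            if (".;[/".indexOf(s.charAt(i)) >= 0) {
                return false;
            }
        }
        return true;
    }

    /** True if d has the form "L<segment>/<segment>/...;" with valid segments. */
    static boolean isObjectDescriptor(String d) {
        if (d.length() < 3 || d.charAt(0) != 'L' || d.charAt(d.length() - 1) != ';') {
            return false;
        }
        String internal = d.substring(1, d.length() - 1);
        // Limit -1 keeps empty segments so that "a//b" is rejected.
        for (String segment : internal.split("/", -1)) {
            if (!isUnqualifiedName(segment)) {
                return false;
            }
        }
        return true;
    }

    /** Converts an object-type descriptor to a dotted class name. */
    static String className(String d) {
        if (!isObjectDescriptor(d)) {
            throw new IllegalArgumentException("not an object descriptor: " + d);
        }
        return d.substring(1, d.length() - 1).replace('/', '.');
    }

    public static void main(String[] args) {
        System.out.println(
            className("Lorg/spark-project/guava/annotations/VisibleForTesting;"));
    }
}
```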







[jira] [Commented] (SPARK-3785) Support off-loading computations to a GPU

2015-02-13 Thread Sam Halliday (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-3785?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14319777#comment-14319777
 ] 

Sam Halliday commented on SPARK-3785:
-

Hi all, just joining the thread :-)

I'm the author of netlib-java. I recommend watching my ScalaX talk 
http://fommil.github.io/scalax14/#/ for anybody who hasn't seen it yet. I talk 
about beyond-CPU acceleration in the last few slides (just after the Breeze 
examples).

In my decade of industrial experience with these things, the GPU is a *lot* 
faster than the CPU for large matrix operations, but slower for small ones 
(1000 elements or fewer). Typically, operations that are highly parallelisable, 
such as matrix multiplication, have a constant time cost on the GPU rather than 
one that is linear in the number of elements.

However, the big problem with GPUs is memory management. If you have a problem 
that you're happy to solve entirely on the GPU, you're going to get great 
performance at the cost of less portability... a major consideration for a JVM 
based application. The trick is minimising how much data you need to transmit 
between the traditional CPU memory space and the GPU memory space. And further 
optimisations can be obtained by using the GPU profilers that come with the 
card.
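
The trade-off in the last two paragraphs can be sketched as a toy cost model 
for an n x n double-precision matrix multiply. Everything here is an 
assumption for illustration: OffloadModel is a hypothetical name, the model 
ignores caching and overlap of transfer with compute, and the throughput and 
overhead figures in main are invented rather than measured.

```java
final class OffloadModel {
    final double cpuFlopsPerSec;    // sustained CPU throughput (assumed)
    final double gpuFlopsPerSec;    // sustained GPU throughput (assumed)
    final double busBytesPerSec;    // host <-> device bandwidth (assumed)
    final double launchOverheadSec; // fixed cost per offloaded call (assumed)

    OffloadModel(double cpuFlopsPerSec, double gpuFlopsPerSec,
                 double busBytesPerSec, double launchOverheadSec) {
        this.cpuFlopsPerSec = cpuFlopsPerSec;
        this.gpuFlopsPerSec = gpuFlopsPerSec;
        this.busBytesPerSec = busBytesPerSec;
        this.launchOverheadSec = launchOverheadSec;
    }

    /** Floating point operations in an n x n by n x n multiply. */
    static double flops(int n) {
        return 2.0 * n * n * n;
    }

    /** Bytes copied: two input matrices to the device, one result back. */
    static double bytesMoved(int n) {
        return 3.0 * n * n * Double.BYTES;
    }

    double cpuSeconds(int n) {
        return flops(n) / cpuFlopsPerSec;
    }

    double gpuSeconds(int n) {
        return launchOverheadSec
            + bytesMoved(n) / busBytesPerSec
            + flops(n) / gpuFlopsPerSec;
    }

    boolean worthOffloading(int n) {
        return gpuSeconds(n) < cpuSeconds(n);
    }

    public static void main(String[] args) {
        // Invented figures: 50 GFLOP/s CPU, 1 TFLOP/s GPU, 8 GB/s bus, 20 us launch.
        OffloadModel m = new OffloadModel(50e9, 1e12, 8e9, 20e-6);
        System.out.println(m.worthOffloading(16));
        System.out.println(m.worthOffloading(2000));
    }
}
```

In this model the fixed launch and transfer terms dominate for small inputs, 
while the compute term dominates for large ones. Passing 
Double.POSITIVE_INFINITY as busBytesPerSec removes the transfer term, which is 
one way to model a memory region shared between CPU and GPU.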

It is for this reason that GPU-backed implementations of BLAS/LAPACK can only 
match, but not surpass, the performance of Intel MKL. There exist BLAS-like and 
LAPACK-like implementations for GPUs (e.g. cuBLAS, clBLAS), but they can only 
be used when you hold pointers to the GPU memory regions, and they are not good 
for use from Java/Scala (unless you are using macros/code generators to really 
generate native code).

I have links with FPGA companies and I'd love to see a full BLAS implementation 
using that custom hardware... but it's such a mammoth task the FPGA 
implementors (not me) would need to be funded to do it.

I am very hopeful about the cutting edge commodity tech coming from Intel/AMD 
(e.g. APUs) which allow CPU and GPU to share the memory region. I would love to 
buy one of these machines and write a minimal BLAS implementation to do some 
benchmarks and see if we can get GPU performance without the memory transfer 
overhead. My project https://github.com/fommil/multiblas (which was abandoned 
until the tech caught up) would be a perfect place to do this and would involve 
only runtime changes for Spark users to benefit. But, to be honest, I'd 
probably need funding to turn my attention to this because I've got a few other 
personal priorities at the moment.

I've heard the Raspberry Pi has such a shared region. It might be interesting 
to use it as a cheapo dev environment.

> Support off-loading computations to a GPU
> -
>
> Key: SPARK-3785
> URL: https://issues.apache.org/jira/browse/SPARK-3785
> Project: Spark
> Issue Type: Brainstorming
> Components: MLlib
> Reporter: Thomas Darimont
> Priority: Minor
>
> Are there any plans to add support for off-loading computations to the 
> GPU, e.g. via an OpenCL binding? 
> http://www.jocl.org/
> https://code.google.com/p/javacl/
> http://lwjgl.org/wiki/index.php?title=OpenCL_in_LWJGL






[jira] [Commented] (SPARK-3785) Support off-loading computations to a GPU

2015-02-13 Thread Sam Halliday (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-3785?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14319778#comment-14319778
 ] 

Sam Halliday commented on SPARK-3785:
-

One other thing: because somebody mentioned operations on collections, you 
might want to look at my friend's project 
https://github.com/JulesGosnell/clumatra which offloads Clojure collection 
operations to an APU.







[jira] [Commented] (SPARK-3785) Support off-loading computations to a GPU

2016-01-04 Thread Sam Halliday (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-3785?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15081751#comment-15081751
 ] 

Sam Halliday commented on SPARK-3785:
-

There are almost 40 people involved in this ticket. I'd recommend opening a new 
one to keep the chatter down. I'm unsubscribing; if anybody needs me, please 
email :-)







[jira] [Commented] (SPARK-8815) illegal java package names in jar

2015-07-10 Thread Sam Halliday (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-8815?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14622211#comment-14622211
 ] 

Sam Halliday commented on SPARK-8815:
-

Interesting. BTW, I see you're at ScalaX in December. I'll see you there! I 
gave a talk last year about high performance mathematics (i.e. netlib-java), 
but this year I'll be talking about generic programming.

> illegal java package names in jar
> -
>
> Key: SPARK-8815
> URL: https://issues.apache.org/jira/browse/SPARK-8815
> Project: Spark
> Issue Type: Bug
> Components: Spark Core
> Reporter: Sam Halliday
> Priority: Minor


