date:20150208

Re: Temporary jenkins issue

2015-02-08 Thread Josh Rosen

It looks like this may be fixed soon in Jenkins: https://issues.jenkins-ci.org/browse/JENKINS-25446 https://github.com/jenkinsci/flaky-test-handler-plugin/pull/1 On February 2, 2015 at 7:38:19 PM, Patrick Wendell (pwend...@gmail.com) wrote: Hey All, I made a change to the Jenkins

Re: Data source API | sizeInBytes should be to *Scan

2015-02-08 Thread Aniket Bhatnagar

Thanks for looking into this. If this is true, isn't this an issue today? The default implementation of sizeInBytes is 1 + broadcast threshold. So, if Catalyst's cardinality estimation estimates even a small filter selectivity, it will result in broadcasting the relation. Therefore, shouldn't the
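
To make the concern above concrete, here is a minimal sketch in plain Python (not Spark code). It assumes a broadcast threshold of 10 MB, and a hypothetical estimator that simply scales a relation's size by the filter selectivity; every function name here is illustrative and not part of any Spark API.

```python
# Illustrative arithmetic only; none of these names are Spark APIs.
BROADCAST_THRESHOLD = 10 * 1024 * 1024  # assumed threshold, in bytes


def default_size_in_bytes(threshold=BROADCAST_THRESHOLD):
    """Default estimate described in the thread: one byte above the threshold."""
    return threshold + 1


def estimated_size_after_filter(size_in_bytes, selectivity):
    """Hypothetical estimator: scale the size by the filter selectivity."""
    return int(size_in_bytes * selectivity)


def would_broadcast(size_in_bytes, threshold=BROADCAST_THRESHOLD):
    """Assume a relation is broadcast when its estimate fits in the threshold."""
    return size_in_bytes <= threshold


unfiltered = default_size_in_bytes()
filtered = estimated_size_after_filter(unfiltered, selectivity=0.9)
print(would_broadcast(unfiltered), would_broadcast(filtered))
```

Under these assumptions the unfiltered default stays just above the threshold, while any selectivity below 1.0 drops the estimate under it, which is the behavior the message is asking about.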

Re: Improving metadata in Spark JIRA

2015-02-08 Thread Patrick Wendell

I think we already have a YARN component. https://issues.apache.org/jira/issues/?jql=project%20%3D%20SPARK%20AND%20component%20%3D%20YARN I don't think JIRA allows it to be mandatory, but if it does, that would be useful. On Sat, Feb 7, 2015 at 5:08 PM, Nicholas Chammas

Re: Improving metadata in Spark JIRA

2015-02-08 Thread Nicholas Chammas

Oh derp, missed the YARN component. JIRA does allow admins to make fields mandatory: https://confluence.atlassian.com/display/JIRA/Specifying+Field+Behavior#SpecifyingFieldBehavior-Makingafieldrequiredoroptional Nick On Sat Feb 07 2015 at 5:23:10 PM Patrick Wendell pwend...@gmail.com wrote:

Re: Using CUDA within Spark / boosting linear algebra

2015-02-08 Thread Evan R. Sparks

I would build OpenBLAS yourself, since good BLAS performance comes from getting cache sizes, etc. set up correctly for your particular hardware - this is often a very tricky process (see, e.g. ATLAS), but we found that on relatively modern Xeon chips, OpenBLAS builds quickly and yields performance

[RESULT] [VOTE] Release Apache Spark 1.2.1 (RC3)

2015-02-08 Thread Patrick Wendell

This vote passes with 5 +1 votes (3 binding) and no 0 or -1 votes. +1 Votes: Krishna Sankar Sean Owen* Chip Senkbeil Matei Zaharia* Patrick Wendell* 0 Votes: (none) -1 Votes: (none) On Fri, Feb 6, 2015 at 5:12 PM, Patrick Wendell pwend...@gmail.com wrote: I'll add a +1 as well. On Fri, Feb

Re: Data source API | sizeInBytes should be to *Scan

2015-02-08 Thread Reynold Xin

We thought about this today after seeing this email. I actually built a patch for this (adding filter/column to data source stat estimation), but ultimately dropped it due to the potential problems the change would cause. The main problem I see is that column pruning/predicate pushdowns are

Re: Improving metadata in Spark JIRA

2015-02-08 Thread Nicholas Chammas

By the way, isn't it possible to make the Component field mandatory when people open new issues? Shouldn't we do that? Btw Patrick, don't we need a YARN component? I think our JIRA components should roughly match the components on the PR dashboard https://spark-prs.appspot.com/. Nick On Fri Feb

Re: [VOTE] Release Apache Spark 1.2.1 (RC3)

2015-02-08 Thread Patrick Wendell

I'll add a +1 as well. On Fri, Feb 6, 2015 at 2:38 PM, Matei Zaharia matei.zaha...@gmail.com wrote: +1 Tested on Mac OS X. Matei On Feb 2, 2015, at 8:57 PM, Patrick Wendell pwend...@gmail.com wrote: Please vote on releasing the following candidate as Apache Spark version 1.2.1! The

Re: Using CUDA within Spark / boosting linear algebra

2015-02-08 Thread Evan R. Sparks

Getting Breeze to pick up the right BLAS library is critical for performance. I recommend using OpenBLAS (or MKL, if you already have it). It might make sense to force BIDMat to use the same underlying BLAS library as well. On Fri, Feb 6, 2015 at 4:42 PM, Ulanov, Alexander alexander.ula...@hp.com

Spark SQL Window Functions

2015-02-08 Thread Evan R. Sparks

Currently there's no standard way of handling time series data in Spark. We were kicking around some ideas in the lab today and one thing that came up was SQL Window Functions as a way to support them and query over time series (do things like moving average, etc.) These don't seem to be
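
As a rough illustration of the moving-average use case mentioned above, the following plain-Python sketch computes what a standard SQL window expression of the shape AVG(value) OVER (ORDER BY ts ROWS BETWEEN n PRECEDING AND CURRENT ROW) would return for an already time-ordered series. It is not Spark code, and the function name and parameters are hypothetical.

```python
from collections import deque


def moving_average(values, window):
    """Trailing average over the current row and up to window - 1 preceding rows.

    `values` is assumed to be sorted by time already.
    """
    if window < 1:
        raise ValueError("window must be at least 1")
    frame = deque(maxlen=window)  # rows currently inside the window frame
    total = 0.0
    averages = []
    for value in values:
        if len(frame) == window:
            total -= frame[0]  # oldest row is about to leave the frame
        frame.append(value)
        total += value
        averages.append(total / len(frame))
    return averages


print(moving_average([1, 2, 3, 4], window=2))
```

The first rows average over a shorter frame, matching how a ROWS BETWEEN ... PRECEDING frame behaves at the start of a partition.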

Pull Requests on github

2015-02-08 Thread fommil

Hi all, I'm the author of netlib-java and I noticed that the documentation in MLlib was out of date and misleading, so I submitted a pull request on github which will hopefully make things easier for everybody to understand the benefits of system optimised natives and how to use them :-)

RE: Using CUDA within Spark / boosting linear algebra

2015-02-08 Thread Ulanov, Alexander

Evan, could you elaborate on how to force BIDMat and netlib-java to load the right BLAS? For netlib, there are a few JVM flags, such as -Dcom.github.fommil.netlib.BLAS=com.github.fommil.netlib.F2jBLAS, so I can force it to use the Java implementation. Not sure I understand how to force use
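
For reference, the JVM flag quoted above could be passed to both the driver and the executors through Spark's extraJavaOptions settings. The sketch below only builds the spark-submit arguments as strings; the helper names are hypothetical, and whether this is the right way to select the implementation in a given deployment is an assumption, not something the thread confirms.

```python
# Property and class names are taken from the flag quoted in the message.
NETLIB_BLAS_PROPERTY = "com.github.fommil.netlib.BLAS"
F2J_BLAS = "com.github.fommil.netlib.F2jBLAS"  # pure-Java implementation


def netlib_blas_flag(implementation=F2J_BLAS):
    """Build the -D system property that selects a netlib-java BLAS class."""
    return "-D{}={}".format(NETLIB_BLAS_PROPERTY, implementation)


def spark_submit_conf_args(implementation=F2J_BLAS):
    """Hypothetical helper: spark-submit arguments that apply the flag to
    the driver and the executors via their extraJavaOptions settings."""
    flag = netlib_blas_flag(implementation)
    return [
        "--conf", "spark.driver.extraJavaOptions=" + flag,
        "--conf", "spark.executor.extraJavaOptions=" + flag,
    ]


print(" ".join(spark_submit_conf_args()))
```

This covers only the netlib-java side of the question; it says nothing about how BIDMat chooses its BLAS library.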

Re: [VOTE] Release Apache Spark 1.2.1 (RC3)

2015-02-08 Thread Matei Zaharia

+1 Tested on Mac OS X. Matei On Feb 2, 2015, at 8:57 PM, Patrick Wendell pwend...@gmail.com wrote: Please vote on releasing the following candidate as Apache Spark version 1.2.1! The tag to be voted on is v1.2.1-rc3 (commit b6eaf77):

Re: [VOTE] Release Apache Spark 1.2.1 (RC3)

2015-02-08 Thread WangTaoTheTonic

Should we merge this commit into branch1.2 too? https://github.com/apache/spark/commit/2483c1efb6429a7d8a20c96d18ce2fec93a1aff9 -- View this message in context: http://apache-spark-developers-list.1001551.n3.nabble.com/VOTE-Release-Apache-Spark-1-2-1-RC3-tp10405p10503.html Sent from the

Unit tests

2015-02-08 Thread Patrick Wendell

Hey All, The tests are in a not-amazing state right now due to a few compounding factors: 1. We've merged a large volume of patches recently. 2. The load on jenkins has been relatively high, exposing races and other behavior not seen at lower load. For those not familiar, the main issue is

RE: Using CUDA within Spark / boosting linear algebra

2015-02-08 Thread Ulanov, Alexander

Hi Evan, Joseph, I ran a few matrix multiplication tests and BIDMat seems to be ~10x faster than netlib-java+breeze (sorry for the weird table formatting): |A*B size | BIDMat MKL | Breeze+Netlib-java native_system_linux_x86-64| Breeze+Netlib-java f2jblas |

Re: Using CUDA within Spark / boosting linear algebra

2015-02-08 Thread Nicholas Chammas

Lemme butt in randomly here and say there is an interesting discussion on this Spark PR https://github.com/apache/spark/pull/4448 about netlib-java, JBLAS, Breeze, and other things I know nothing of, that y'all may find interesting. Among the participants is the author of netlib-java. On Sun Feb

Re: Pull Requests on github

2015-02-08 Thread Akhil Das

You can open a JIRA issue pointing to this PR to get it processed faster. :) Thanks Best Regards On Sat, Feb 7, 2015 at 7:07 AM, fommil sam.halli...@gmail.com wrote: Hi all, I'm the author of netlib-java and I noticed that the documentation in MLlib was out of date and misleading, so I

Re: Spark SQL Window Functions

2015-02-08 Thread Reynold Xin

This is the original ticket: https://issues.apache.org/jira/browse/SPARK-1442 I believe it will happen, one way or another :) On Fri, Feb 6, 2015 at 5:29 PM, Evan R. Sparks evan.spa...@gmail.com wrote: Currently there's no standard way of handling time series data in Spark. We were kicking

Re: Welcoming three new committers

2015-02-08 Thread Likun (Jacky)

Congratulations guys! Keep helping this awesome community. BR, Jacky Li - Sent from Smartisan T1 - On February 4, 2015, at 6:36 AM, Matei Zaharia matei.zaha...@gmail.com wrote: Hi all, The PMC recently voted to add three new committers: Cheng Lian, Joseph Bradley and Sean Owen. All three have been major contributors

21 matches
