Re: Performance Question

2020-06-19 Thread Matthias Boehm
Thanks for the question and the detailed inputs - this is an effect of simplification rewrites that only apply in one of the cases. Specifically, (t(N)%*%t(M))[1,1] is rewritten to t(N)[1,] %*% t(M)[,1], which is a form of selection pushdown. You can do the following, for benchmarking*: introdu
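The selection pushdown described in this message can be sketched in plain Python. This is only an illustration, not SystemDS code: the helper names and the sample matrices N and M are assumptions made for the example, and DML's 1-based indices are mapped to Python's 0-based ones.

```python
# Illustrative only: plain-Python stand-ins for the DML expressions above.

def transpose(a):
    """Return the transpose of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*a)]

def matmul(a, b):
    """Naive matrix product of two list-of-rows matrices."""
    bt = transpose(b)
    return [[sum(x * y for x, y in zip(row, col)) for col in bt] for row in a]

def full_then_index(n, m):
    """(t(N) %*% t(M))[1,1]: build the whole product, then read one cell."""
    return matmul(transpose(n), transpose(m))[0][0]

def index_then_multiply(n, m):
    """t(N)[1,] %*% t(M)[,1]: one row times one column, no full product."""
    first_row = transpose(n)[0]                   # t(N)[1,]
    first_col = [row[0] for row in transpose(m)]  # t(M)[,1]
    return sum(x * y for x, y in zip(first_row, first_col))

# Hypothetical sample inputs: N is 3 x 2, M is 2 x 3.
N = [[1, 2], [3, 4], [5, 6]]
M = [[7, 8, 9], [1, 0, 2]]

# Both forms yield the same scalar; the second does far less work.
assert full_then_index(N, M) == index_then_multiply(N, M)
```

Because the rewritten form touches a single row and a single column, timing the two expressions measures very different amounts of work, which is consistent with the effect the message describes.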

Re: [DISCUSSION] Website stack `systemml.apache.org`. Thanks.

2020-06-18 Thread Matthias Boehm
ght, while trying to merge something else. That said nothing bad happened except you did not get a chance to look through the PR, and it is progress in the direction discussed in this thread. Best regards Sebastian ____ From: Matthias Boehm Sent: Saturday, June 6, 20

Re: [DISCUSSION] Website stack `systemml.apache.org`. Thanks.

2020-06-06 Thread Matthias Boehm
from my perspective it would be very important to have all builtin functions in a single markdown file to allow users to search for things and users don't care how a builtin function is internally implemented. So we might want to use this opportunity to consolidate the already documented builti

Re: Fwd: Help me improve my documentation search :)

2020-06-01 Thread Matthias Boehm
Janardhan, thanks for the initiative but please refrain from sending such advertisements to our dev mailing list and making any promises in the name of SystemDS/SystemML on potential inclusion of such dependencies into future releases. Right now we did not yet decide what our target documentat

Re: [DISCUSSION] Documentation dev & user along with builtins.

2020-05-23 Thread Matthias Boehm
thanks for the initiative and moving this discussion to the dev list. I think this would be very valuable. Regarding how to start, I would recommend to focus on external functionality first by documenting all dml-bodied and native builtin functions (see org.apache.sysds.common.Builtins) and ope

Re: Docker Container Organisation

2020-05-17 Thread Matthias Boehm
thanks for following-up on this. @Berthold: do I remember correctly that you looked into a similar setup a while ago? Regards, Matthias On 5/15/2020 7:46 PM, Baunsgaard, Sebastian wrote: Hi SystemDS developers. Currently we have some docker containers that are associated with my personal d

Re: [#904] Can the ONNX-SystemDS implementation be reviewed. Thanks.

2020-05-12 Thread Matthias Boehm
yes this is the classic blocking issue - we only held back because you commented you want to review it in detail. We'll take care of it now. Regards, Matthias On 5/12/2020 8:05 AM, Janardhan wrote: Hi, @lukas-jkl have implemented ONNX support for SystemDS, also well documented (in code and in

[DISCUSS] Draft Report Apache SystemML - May 2020

2020-05-10 Thread Matthias Boehm
Hi all, given the recent merge and upcoming name change, I'd like to share a draft of our board report (due May 13). If you have any concerns, let's discuss them. ## Description: SystemML is a declarative large-scale machine learning (ML) system that allows the flexible specification of ML a

Welcome 4 New Committers

2020-05-01 Thread Matthias Boehm
The Project Management Committee (PMC) for Apache SystemML (SystemDS) has invited Arnab Phani, Mark Dokter, Shafaq Siddiqi, and Kevin Innerebner to become committers, and we are pleased to announce that all four have accepted. The new committers cover many important areas of current and future

Re: Roadmap Merge and Rename SystemDS

2020-04-10 Thread Matthias Boehm
k you, Janardhan On Tue, Mar 24, 2020 at 6:28 PM Matthias Boehm wrote: that's a good point Henry. Yes, with SystemDS 0.1.0, we removed the MapReduce compiler and runtime backend, the pydml parser and language support, the Java-UDF framework, and the script-level debugger. We are concentratin

Re: Roadmap Merge and Rename SystemDS

2020-03-24 Thread Matthias Boehm
when SystemDS being merged to SystemML repository? - Henry On Sat, Mar 21, 2020 at 2:47 PM Matthias Boehm wrote: just FYI, we created a ticket for the suitable name search, and shared the related results [1]. So from my perspective, it really boils down to the question if we accept the closene

Re: Roadmap Merge and Rename SystemDS

2020-03-21 Thread Matthias Boehm
ine because of the very different objectives and because SystemDS reflects both the origin from SystemML and its new focus on data science pipelines. [1] https://issues.apache.org/jira/projects/PODLINGNAMESEARCH/issues/PODLINGNAMESEARCH-179?filter=allissues Regards, Matthias On 3/9/2020 6:

Roadmap Merge and Rename SystemDS

2020-03-09 Thread Matthias Boehm
Hi all, as you're probably aware, development activities of Apache SystemML significantly slowed down and were virtually non-existent in the last year for various reasons. Part of that was that my team and I [1] decided to start SystemDS [2,3] as a fork of SystemML in 09/2018 with a new visio

Re: DML scripts under scripts/staging

2019-10-31 Thread Matthias Boehm
Hi Remy, generally these scripts are not much different from the ones found in scripts/algorithms. However, the staging scripts were either in development, or did not receive enough testing to move to algorithms yet. If you have a specific algorithm in mind, let us know and we help determine

Re: SYSTEMML PyPi Statistics over the last six months.

2019-08-10 Thread Matthias Boehm
great - thank you so much for the summary. It would be awesome to get it for a longer history as well and aggregate it with the release/maven downloads. Regards, Matthias On 10/08/2019 16:23, Janardhan wrote: Hi, The following are the queried SystemML download statistics to understand the pr

Re: upgrading hadoop version

2019-02-20 Thread Matthias Boehm
Raising the minimum version to 2.7.x is a good idea and fine by me. I would suggest updating the pom to version 2.7.7 but only mention 2.7 as a minimum requirement in the README because as far as I know no APIs changed that wouldn't allow running SystemML on older versions as well. Regards, Ma

Re: Autoencoder codegen testing with R

2019-01-15 Thread Matthias Boehm
yes, you're absolutely right - in this form, results would always differ. Even if we feed a seed to both R and DML scripts, the implementation of our rand is very different as we need to ensure that, given a seed, we generate the same data in local and distributed operations. Accordingly, we de

New committer: Guobao Li

2018-09-04 Thread Matthias Boehm
The Project Management Committee (PMC) for Apache SystemML has invited Guobao Li to become a committer and we are pleased to announce that he has accepted. Guobao was instrumental in creating language and runtime support for parameter servers in SystemML. This includes local and distributed runtim

Fwd: [ComDev] High resolution project logos wanted!

2018-08-23 Thread Matthias Boehm
Could someone with access to our logos please commit them into the mentioned repo? @Deron: I remember you gave me once the archive with all versions of our logos. Thanks. Regards, Matthias -- Forwarded message -- From: Daniel Gruno Date: Thu, Aug 23, 2018 at 1:52 PM Subject: [Co

Re: [VOTE] Apache SystemML 1.2.0 (RC1)

2018-08-19 Thread Matthias Boehm
+1 I ran the perftest suite multiple times up to 80GB with and without codegen. After fixing all the issues and regressions, the entire suite ran successfully against Spark 2.2 and 2.3 and all use cases showed equal or better performance compared to SystemML 1.1. Regards, Matthias On Fri, Aug 17

Re: Release Planning SystemML 1.2

2018-08-11 Thread Matthias Boehm
>> Subject:Re: Release Planning SystemML 1.2 >> >> >> >> One more thing, I am already building a docker image. But, which image do >> you prefer >> >> 1. CentOS 7 or >> 2. Ubuntu - Later extensible to GPU very easily. >> >> This d

Re: GSoC Project Presentation Guobao (Parameter Server)

2018-08-09 Thread Matthias Boehm
https://hangouts.google.com/call/Wwq1uz89KHlqgkLCkginAAEE Regards, Matthias On Wed, Aug 8, 2018 at 1:46 PM, Matthias Boehm wrote: > just as a reminder: tomorrow at 10am PST, Guobao will give us an > overview of his work on parameter servers this summer. I'll post the > hangout li

Re: GSoC Project Presentation Guobao (Parameter Server)

2018-08-08 Thread Matthias Boehm
just as a reminder: tomorrow at 10am PST, Guobao will give us an overview of his work on parameter servers this summer. I'll post the hangout link here and at our ASF slack channel 10min before the presentation. Regards, Matthias On Thu, Jul 19, 2018 at 12:16 AM, Matthias Boehm wrote: >

Re: [DISCUSS] Adding SystemML to OSS Fuzz

2018-07-19 Thread Matthias Boehm
action items that we can discuss here. If I don't hear back in 3 days, I'll recommend to close the issue at google/oss-fuzz. Regards, Matthias On Mon, May 21, 2018 at 5:29 PM, Matthias Boehm wrote: > Well, in general this can be interesting. Apart from our default > testsuite, we o

GSoC Project Presentation Guobao (Parameter Server)

2018-07-19 Thread Matthias Boehm
Hi all, please mark your calendars. Guobao will present the results of his GSoC project on local and distributed parameter servers in SystemML on Thu, August 9, 10am PST. Everyone interested is welcome to join. We'll use Google Hangouts for the presentation and demo. The details will be posted sho

Re: Release Planning SystemML 1.2

2018-06-24 Thread Matthias Boehm
given the current status of open tasks and the delay with regard to QA, I think we need to push this release out by a couple of weeks. Does mid to end July sound good to everyone? Regards, Matthias On Wed, Jun 6, 2018 at 11:28 PM, Matthias Boehm wrote: > thanks Berthold - that sounds g

Re: Release Planning SystemML 1.2

2018-06-06 Thread Matthias Boehm
> > > From: Krishna Kalyan > To: dev@systemml.apache.org > Date: 06/05/2018 10:09 PM > Subject:Re: Release Planning SystemML 1.2 > > > > +1 > > I am completely available to help with the QA cycle and help with > switching > to new perf test su

Release Planning SystemML 1.2

2018-06-05 Thread Matthias Boehm
Hi all, given our current release cadence of about 3 months, we should start talking about our SystemML 1.2 release. There have been many new features, improvements over all backends, and various critical fixes. I know there are still some tasks that should go into 1.2, and for this QA cycle we'd l

Re: [DISCUSS] Adding SystemML to OSS Fuzz

2018-05-21 Thread Matthias Boehm
Well, in general this can be interesting. Apart from our default testsuite, we occasionally ran static code analysis tools. Having additional tests for partially valid scripts and inputs can help to find more issues. That being said, I don't think we currently qualify as a project with "significan

Re: Website header unclickable

2018-05-20 Thread Matthias Boehm
Thanks for catching this Daiki. I was able to reproduce this in my env as well. So far there is no JIRA for it yet, but I just gave you permissions. So, feel free to create one. Regards, Matthias On Sun, May 20, 2018 at 5:05 PM, Daiki Matsunaga wrote: > Hi, > I noticed while looking at the syste

Re: SYSTEMML-447

2018-05-10 Thread Matthias Boehm
This particular JIRA is only partially related. Niketan and Nakul worked out the details - the only reason I show up as the reporter is that, if I remember correctly, we split a larger scoped JIRA for low-level optimizations (GPU, codegen, compression) into individual JIRAs and created the detailed

Re: Questions about MNIST LeNet example

2018-05-10 Thread Matthias Boehm
entries via l1[7] or l2['g'] accordingly. We're still working on additional features to make the integration with IPA, functions, and size/type propagation smoother, but the basic functionality is already available. Regards, Matthias On Sun, May 6, 2018 at 1:08 PM, Matthias Boehm wro

Re: Questions about MNIST LeNet example

2018-05-06 Thread Matthias Boehm
Hi Guobao, that sounds very good. In general, the "model" refers to the collection of all weights and bias matrices of a given architecture. Similar to a classic regression model, we can view the weights as the "slope", i.e., multiplicative terms, while the biases are the "intercept", i.e., additi

GSoC 2018 Student Guobao Li

2018-05-01 Thread Matthias Boehm
Hi all, please join me in welcoming Guobao Li as a GSoC 2018 student, who will be working on SYSTEMML-2083 (Language and runtime for parameter servers) this summer. We're currently in the community bonding phase, but the project will start May 14. Krishna already kindly volunteered (on the dev li

Re: distributed cholesky on systemml

2018-04-23 Thread Matthias Boehm
generate a Spark task converting things > into RDD operators. > Thanks so much for the patience and detailed instructions. I have a much > better understanding of the system now. > > On Sun, Apr 22, 2018 at 7:47 PM, Matthias Boehm wrote: >> >> well, SystemML decides the execut

Re: distributed cholesky on systemml

2018-04-22 Thread Matthias Boehm
;> >>> Cheers, J >>> >>> >>> Jerome Nilmeier, PhD >>> Data Scientist and Engineer >>> IBM Spark Technology Center >>> http://www.spark.tc/ >>> >>> >>> >>> - Original message - >>> From: Ma

Re: distributed cholesky on systemml

2018-04-22 Thread Matthias Boehm
sible parallelization points. > > Cheers, J > Jerome Nilmeier, PhD > Data Scientist and Engineer > IBM Spark Technology Center > http://www.spark.tc/ > > > > - Original message - > From: Matthias Boehm > To: dev@systemml.apache.org > Cc: Qifan Pu > Subje

Re: distributed cholesky on systemml

2018-04-22 Thread Matthias Boehm
Pu wrote: > Matthias, > > Thanks so much for taking time to fix. Really appreciated it. > Does the same reasoning apply to the cholesky script? The recursive approach > also looks inherently sequential. > > Best, > Qifan > > On Sat, Apr 21, 2018 at 11:39 PM, Matthias

Re: distributed cholesky on systemml

2018-04-21 Thread Matthias Boehm
should look into a bottom-up, breadth-first approach to parallelize over the blocks in each level, which could be done via parfor at script level. Regards, Matthias On Sat, Apr 21, 2018 at 6:59 PM, Matthias Boehm wrote: > thanks for catching this - I just ran a toy example and this seems to >

Re: distributed cholesky on systemml

2018-04-21 Thread Matthias Boehm
imeException: Invalid values for > matrix indexing: [1667:,1:1666] must be within matrix dimensions > [1000,1000] > > > Am I missing some configuration here? > > > [1] > https://github.com/apache/systemml/blob/master/scripts/staging/scalable_linalg/test/test_triangul

Re: distributed cholesky on systemml

2018-04-21 Thread Matthias Boehm
Hi Qifan, thanks for your feedback. You're right, the builtin functions cholesky, inverse, eigen, solve, svd, qr, and lu are currently only supported as single-node operations because they're still implemented via Apache commons.math. However, there is an experimental script for distributed chole

Re: Kind request for presentation. Thanks.

2018-04-07 Thread Matthias Boehm
Hi Janardhan, thanks again for organizing this. As mentioned in an offline discussion already, I'll not be able to attend tonight on such short notice, but maybe someone else can jump in. Regards, Matthias On Sat, Apr 7, 2018 at 12:28 AM, Janardhan Pulivarthi wrote: > Hi sir, > > we (apache + N

Fwd: Fw: Request for a beginner JIRA

2018-04-03 Thread Matthias Boehm
Thanks for your interest Daiki. I created two JIRAs SYSTEMML-2233 and SYSTEMML-2232 that might be a good starting point. I would recommend to begin with 2233 as a basic cleanup task, which is meant to get you comfortable. The other task is then a bit more involved but would improve our function nam

Re: Contribution to SystemML

2018-04-02 Thread Matthias Boehm
e some tips to > select one? > regards, > Govinda > > > On Thu, Mar 29, 2018 at 10:06 AM Matthias Boehm wrote: > >> ------ Forwarded message -- >> From: Matthias Boehm >> Date: Wed, Mar 28, 2018 at 9:34 PM >> Subject: Re: Contribution to SystemML &g

Re: Draft Release Notes 1.1.0

2018-03-28 Thread Matthias Boehm
thanks for the initial draft and extensions - I would remove internals #2/#3 because they are still open, move the other internals to performance, and include (or extend) the following: * codegen extensions (operation support, extended optimizer, see SYSTEMML-2065) * new accumulator operator += (n

Fwd: Contribution to SystemML

2018-03-28 Thread Matthias Boehm
-- Forwarded message -- From: Matthias Boehm Date: Wed, Mar 28, 2018 at 9:34 PM Subject: Re: Contribution to SystemML To: Govinda Malavipathirana well, first of all sorry that you wasted one of your proposals because it would have been better to combine them into a single

Re: [SYSTEMML-2084][SYSTEMML-2085] Language and Compiler Extension, Basic Runtime Primitives.

2018-03-25 Thread Matthias Boehm
Well, SYSTEMML-2084 aims to integrate the new paramserv builtin function - as described in the JIRA - into SystemML's semantic validation and compiler. This would entail to first finalize the design of this builtin function (e.g., function signature and semantics) and then integrate it into the HOP

Re: [VOTE] Apache SystemML 1.1.0 (RC2)

2018-03-24 Thread Matthias Boehm
+1, I reran all tests mentioned before for RC2 as well without any issues. Regards, Matthias On Fri, Mar 23, 2018 at 5:26 PM, Berthold Reinwald wrote: > Please vote on releasing the following candidate as Apache SystemML > version 1.1.0 > > The vote is open for at least 72 hours and passes if a

Re: Apache SystemML 1.1.0 : Performance Test

2018-03-22 Thread Matthias Boehm
awesome - thanks for sharing Krishna. Regards, Matthias On Wed, Mar 21, 2018 at 2:10 AM, Krishna Kalyan wrote: > Hello All, > Sharing a small Shiny App to visualize the runtime performance for System > ML algorithms. This is work still in progress. Any feedback would be really > appreciated :).

Re: [VOTE] Apache SystemML 1.1.0 (RC1)

2018-03-21 Thread Matthias Boehm
-1, sorry but I have to change my vote due to SYSTEMML-2201. The change will be in master tomorrow - once this is done, we can cut another RC. Regards, Matthias On Wed, Mar 21, 2018 at 4:12 PM, Matthias Boehm wrote: > +1 > > I ran the perftest suite up to 80GB with and without codeg

Re: [VOTE] Apache SystemML 1.1.0 (RC1)

2018-03-21 Thread Matthias Boehm
+1 I ran the perftest suite up to 80GB with and without codegen and it ran just fine without errors or performance issues. There are two additional performance improvements that are not included in RC1 but it should be fine to postpone these till the next release. Regards, Matthias On Mon, Mar 1

Fwd: Sub projects in Language and run time for parameter servers [SYSTEMML-2083]

2018-03-17 Thread Matthias Boehm
-- Forwarded message -- From: Matthias Boehm Date: Sat, Mar 17, 2018 at 5:41 PM Subject: Re: Sub projects in Language and run time for parameter servers [SYSTEMML-2083] To: Chamath Abeysinghe great to see that you're making progress on your proposal. However and as a ge

Re: [SYSTEMML-2089] Extended Caffe2DML and Keras2DML script generators.

2018-03-17 Thread Matthias Boehm
.watson.ibm.com/researcher/view.php?person=us-npansar >> >> [image: Inactive hide details for Govinda Malavipathirana ---03/15/2018 >> 12:38:42 PM---Hi' I'm currently working on the GSoC proposal an]Govinda >> Malavipathirana ---03/15/2018 12:38:42 PM---Hi

Re: Release Planning

2018-03-16 Thread Matthias Boehm
Thanks for the patience. Meanwhile, all known issues have been resolved and we're ready to cut an RC. Until this is done (or we have a 1.1 branch), I would recommend to limit all commits to critical fixes. Regards, Matthias On Tue, Mar 13, 2018 at 12:18 AM, Matthias Boehm wrote: > just

Fwd: Extending Codegen algorithm tests for heuristics

2018-03-13 Thread Matthias Boehm
-- Forwarded message -- From: Matthias Boehm Date: Tue, Mar 13, 2018 at 1:00 PM Subject: Re: Extending Codegen algorithm tests for heuristics To: Chamath Abeysinghe without debugging it's hard to tell, but usually something like this happens if blocks are incorrectly al

Re: Release Planning

2018-03-13 Thread Matthias Boehm
avoid unnecessary release efforts. Regards, Matthias On Sun, Mar 11, 2018 at 5:43 PM, Matthias Boehm wrote: > well, after trying to run our perftest suite with Spark 2.3 and Spark 2.2 > this seems to be more complicated. Although the version update from 4.5.3 > to 4.7.1 solved the problem

Re: Release Planning

2018-03-11 Thread Matthias Boehm
> > +1 on upgrading > > Original message From: Matthias Boehm < > mboe...@gmail.com> Date: 3/8/18 5:19 PM (GMT-08:00) To: > dev@systemml.apache.org Subject: Re: Release Planning > > related to Spark 2.3, we might want to update our ANTLR version because > >

Re: [DISCUSS] integrated testing for MLContext, SPARK, codegen.

2018-03-10 Thread Matthias Boehm
Hi Janardhan, in general, we prefer to compare against R because it helps detecting issues that are common across different optimizers and execution modes. So for small scripts like PCA, I would recommend to simply create an R script, which should be very similar to the dml script. However, for m

Re: Sub projects in Language and run time for parameter servers [SYSTEMML-2083]

2018-03-09 Thread Matthias Boehm
Hi Chamath, ad 1: Yes, this is absolutely correct. However, it is important to realize that within the workers, we want to run dml functions, and for these we'll reuse our existing compiler, runtime, operations, and data structures. ad 2: Yes, this is also correct. Indeed we can use an existing p

Re: Release Planning

2018-03-08 Thread Matthias Boehm
; Regards, Matthias On Thu, Mar 1, 2018 at 5:22 PM, Matthias Boehm wrote: > Hi all, > > I'm sure you've seen that Spark 2.3 just got released. This lines up > beautifully with our own SystemML 1.1 release. Accordingly, I would > recommend to use Spark 2.3 for our due

Fwd: Extending Codegen algorithm tests for heuristics

2018-03-06 Thread Matthias Boehm
-- Forwarded message -- From: Matthias Boehm Date: Tue, Mar 6, 2018 at 10:14 PM Subject: Re: Extending Codegen algorithm tests for heuristics To: Chamath Abeysinghe Hi Chamath, great thanks for your contribution - I left a couple of comments but we should be ready to merge

Re: Release Planning

2018-03-01 Thread Matthias Boehm
ime to run over reasonably large data and fix all related issues. Regards, Matthias On Tue, Feb 6, 2018 at 12:51 PM, Matthias Boehm wrote: > yes, absolutely. Here is a list of new features and improvements - please > feel free to extend as needed: > > 1) Extended Caffe2DML and Keras2DML

Re: Extending Codegen algorithm tests for heuristics

2018-03-01 Thread Matthias Boehm
egards, > Chamath > > [1] https://github.com/apache/systemml/compare/master... > chamathabeysinghe:SYSTEMML-2159?diff=split&name=SYSTEMML-2159 > > > On Tue, Feb 27, 2018 at 1:54 AM, Matthias Boehm wrote: > >> -- Forwarded message -- >> From

Fwd: Extending Codegen algorithm tests for heuristics

2018-02-26 Thread Matthias Boehm
-- Forwarded message -- From: Matthias Boehm Date: Mon, Feb 26, 2018 at 11:59 AM Subject: Re: Extending Codegen algorithm tests for heuristics To: Chamath Abeysinghe great - thanks for taking this over Chamath. In general, I would recommend to use this task to explore SystemML

GSoC Project Proposals

2018-02-22 Thread Matthias Boehm
Hi all, first of all, thanks again for your interest. It's great to see that you are excited about our GSoC project idea on language and runtime support for parameter servers in SystemML. I'd like to give a couple of pointers to clarify potential questions some of you might have. 1) GSoC Guidelin

Matrices with zero rows and/or columns

2018-02-07 Thread Matthias Boehm
Hi all, just a heads-up: tomorrow I'm going to push a deep-cutting change that will allow matrices with zero rows and/or columns to simplify common use cases and remove quirks such as returning an empty row after removeEmpty on an empty matrix. Since this touches hundreds of conditions in the compi

Re: Release Planning

2018-02-06 Thread Matthias Boehm
IBM Almaden Research Center > > office: (408) 927 2208; T/L: 457 2208 > > e-mail: reinw...@us.ibm.com > > > > > > > > From: Matthias Boehm > > To: dev@systemml.apache.org > > Date: 02/05/2018 11:05 PM > > Subject:Release Planning > > >

Release Planning

2018-02-05 Thread Matthias Boehm
Hi all, since our 1.0 release in Dec, we already got a number of enhancements and new features in, so I think it would be good to discuss the timeline for our next SystemML 1.1 release. How about, we target mid March for a first RC? Also, Berthold would you be willing to serve again as the release ma

Re: [Discuss] GSoC 2018

2018-01-27 Thread Matthias Boehm
gards, Matthias On Sat, Jan 27, 2018 at 1:57 PM, Nakul Jindal wrote: > This is awesome! > I am guessing the goal is to have this epic be a summer worth of > mini-projects for a single GSoC student, is that correct? > > > > On Fri, Jan 26, 2018 at 7:46 PM, Matthias Boehm w

Re: [Discuss] GSoC 2018

2018-01-26 Thread Matthias Boehm
just FYI: I've created https://issues.apache.org/jira/browse/SYSTEMML-2083 with the gsoc2018 label. If you have additional project ideas, please file the respective JIRAs. Thanks. Regards, Matthias On Mon, Jan 22, 2018 at 12:13 AM, Matthias Boehm wrote: > yes, that is a good idea and w

Re: [Discuss] GSoC 2018

2018-01-22 Thread Matthias Boehm
yes, that is a good idea and we should leverage this opportunity. I'm happy to mentor a project as well, specifically on parameter server architectures for distributed deep learning in SystemML. Right now we can emulate synchronous parameter servers with parfor, but there are other architectur
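The idea of emulating a synchronous parameter server with a parallel loop can be sketched as follows. This is a minimal Python illustration under stated assumptions, not SystemML's parfor or any planned API: the one-parameter model y = w * x, the data batches, the learning rate, and the function names are all hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor

def gradient(w, batch):
    """Gradient of the mean squared error of y = w * x on one data batch."""
    return sum(2 * (w * x - y) * x for x, y in batch) / len(batch)

def synchronous_step(w, batches, lr):
    """One synchronous round: all workers finish before the model is updated."""
    # Each worker computes a gradient on its own batch (the parallel loop body).
    with ThreadPoolExecutor() as pool:
        grads = list(pool.map(lambda batch: gradient(w, batch), batches))
    # The aggregation after the loop plays the role of the parameter server.
    return w - lr * sum(grads) / len(grads)

# Hypothetical data generated from y = 2 * x, split across two workers.
batches = [[(1.0, 2.0), (2.0, 4.0)], [(3.0, 6.0), (4.0, 8.0)]]

w = 0.0
for _ in range(50):
    w = synchronous_step(w, batches, lr=0.05)
```

The barrier at the end of each round is what makes the scheme synchronous: no worker sees an updated model until every gradient of the current round has been aggregated.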

Re: Passing a CoordinateMatrix to SystemML

2018-01-10 Thread Matthias Boehm
parse vectors to a DML script without issue. Sorry for the > slow confirmation on this - I've been out of the office for the last couple > weeks. Thanks for your help debugging this! > > Best, > > Anthony > > On Mon, Dec 25, 2017 at 5:35 AM, Matthias Boehm wrote: > >

Re: [review request] for the bayesian optimization support

2018-01-08 Thread Matthias Boehm
Great to hear that Janardhan. Could you please ping individual people directly from the PRs that are ready for review or need discussion? I'll take care of #715, and #716 but I'm happy to help on the other components as well. Regards, Matthias On Mon, Jan 8, 2018 at 10:16 AM, Janardhan Pulivarthi

Re: Can BITWISE_XOR be added. Thanks.

2018-01-01 Thread Matthias Boehm
Hi Janardhan, sure - adding such bitwise operations is a nice addition. There is still an open task (SYSTEMML-1931) to generalize the existing NOT, AND, OR, and XOR to matrix arguments which should be straightforward as it seamlessly fits into the existing binary operator. In a separate task, we

Re: Passing a CoordinateMatrix to SystemML

2017-12-25 Thread Matthias Boehm
n(ThreadPoolExecutor.java:624) at java.lang.Thread.run(Thread.java:748) Best, Anthony On Sun, Dec 24, 2017 at 3:14 AM, Matthias Boehm wrote: Thanks again for catching this issue Anthony - this IJV reblock issue with large ultra-sparse matrices is now fixed in master. It likely did no

Build server down

2017-12-24 Thread Matthias Boehm
Hi all, could somebody at the STC please find out what's wrong with our build server? Thanks and Happy Holidays. Regards, Matthias

Re: Passing a CoordinateMatrix to SystemML

2017-12-24 Thread Matthias Boehm
issue that I could not reproduce yet, so it would be very helpful if you could give it another try. Thanks. Regards, Matthias On 12/24/2017 9:57 AM, Matthias Boehm wrote: Hi Anthony, thanks for helping to debug this issue. There are no limits other than the dimensions and number of non-zeros

Re: Passing a CoordinateMatrix to SystemML

2017-12-24 Thread Matthias Boehm
readPoolExecutor.java:624) at java.lang.Thread.run(Thread.java:748) Anthony On Sat, Dec 23, 2017 at 4:27 AM, Matthias Boehm wrote: Given the line numbers from the stacktrace, it seems that you use a rather old version of SystemML. Hence, I would recommend to upgrade to SystemML 1.0 or

Re: Passing a CoordinateMatrix to SystemML

2017-12-23 Thread Matthias Boehm
SCALAR·STRING°24 at org.apache.sysml.runtime.controlprogram.Program. execute(Program.java:130) at org.apache.sysml.api.mlcontext.ScriptExecutor.executeRuntimeProgram( ScriptExecutor.java:388) ... 16 more ... On Fri, Dec 22, 2017 at 5:48 AM, Matthias Boehm wrote: well, let's do the following to figure this out: 1) If the schema is ind

Re: Passing a CoordinateMatrix to SystemML

2017-12-22 Thread Matthias Boehm
well, let's do the following to figure this out: 1) If the schema is indeed [label: Integer, features: SparseVector], please change the third line to val y = input_data.select("label"). 2) For debugging, I would recommend to use a simple script like "print(sum(X));" and try converting X and y

Re: Jenkins build became unstable: SystemML-DailyTest #1424

2017-12-17 Thread Matthias Boehm
yes, that's indeed the case. The original PR was fine but I introduced this issue during some final cleanups while merging this PR. Regards, Matthias On 12/17/2017 11:49 PM, Ted Yu wrote: XorTest failure seems to be related to: [SYSTEMML-1883] New xor builtin functions over scalars On Sun,

Re: [VOTE] Apache SystemML 1.0.0 (RC2)

2017-12-12 Thread Matthias Boehm
view.php? > person=us-npansar> > > > > > > "Glenn Weidner" ---12/11/2017 09:49:48 AM---+1 I ran Linear Regression, > > > Logistic Regression, SVM, Naive Bayes Python tests > > > > > > From: "Glenn Weidner" > > > To: dev@s

Re: [VOTE] Apache SystemML 1.0.0 (RC2)

2017-12-09 Thread Matthias Boehm
+1 I ran the perftest suite with the artifact on Spark 2.2 up to 80GB without any failures or performance issues. On earlier versions, I also ran the perftest suite with Spark 2.1 and 2.2, w/ and w/o codegen, and w/ auto compression up to 800GB without remaining issues. As a minor nitpick (to be

[DISCUSS] Roadmap SystemML 1.1 and beyond

2017-12-08 Thread Matthias Boehm
Hi all, with our SystemML 1.0 release around the corner, I think we should start the discussion on the roadmap for SystemML 1.1 and beyond. Below is an initial list as a starting point, but please help to add relevant items, especially for algorithms and APIs, which are barely covered so far. 1)

Re: [VOTE] Apache SystemML 1.0.0 (RC1)

2017-12-07 Thread Matthias Boehm
-1 due to the issue mentioned by Niketan, as well as additional correctness and performance issues fixed in the last couple of days. Regards, Matthias On Tue, Dec 5, 2017 at 6:25 PM, Niketan Pansare wrote: > Soft -1 as GPU backend is in experimental mode. GPU matrix multiplication > tests are f

Re: add. of matrices of different dim. (the bias term)

2017-12-06 Thread Matthias Boehm
Hi Janardhan, this is a good suggestion - so far we only support matrix-vector but no vector-matrix binary operations. However, there is already an open issue (SYSTEMML-1434) for generalizing our binary operations accordingly. Regards, Matthias On Wed, Dec 6, 2017 at 8:26 AM, Janardhan Pulivarth
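A matrix-vector binary operation of the kind discussed here, adding a bias row vector to every row of a matrix, can be sketched in plain Python. The function name and sample values are hypothetical, and the row-wise broadcasting shown is an assumption about the intended semantics rather than a statement of SystemML's exact rules.

```python
def add_row_vector(x, b):
    """Add the row vector b to every row of the matrix x (list of rows)."""
    if any(len(row) != len(b) for row in x):
        raise ValueError("vector length must match the number of columns")
    return [[value + bias for value, bias in zip(row, b)] for row in x]

# Hypothetical inputs: a 2 x 2 matrix and a 1 x 2 bias vector.
X = [[1, 2], [3, 4]]
b = [10, 20]
result = add_row_vector(X, b)
```

The matrix is the left operand and the vector the right one, which corresponds to the matrix-vector case the message calls supported; the vector-matrix order is the generalization tracked in the open issue.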

Re: SystemML 1.0 release timeline

2017-12-02 Thread Matthias Boehm
Meanwhile, the remaining correctness and performance issues of sparse maxpooling backward operations have been fixed (SYSTEMML-2034, 2035). So I think we're in good shape to cut an RC1 now. Regards, Matthias On Fri, Dec 1, 2017 at 12:45 PM, Matthias Boehm wrote: > After multiple run

Re: dev environment on windows

2017-12-02 Thread Matthias Boehm
I'm using eclipse on win and simply defined two maven run configurations for (1) the default build (base=${workspace_loc:/systemml}, goals=package), and (2) the distribution build (base=${workspace_loc:/systemml}, goals=package, profiles=distribution). I'm sure intellij provides a similar builti

Re: SystemML 1.0 release timeline

2017-12-01 Thread Matthias Boehm
would recommend to defer the RC1 for a couple of days until these issues are fixed as well. Regards, Matthias On Tue, Nov 14, 2017 at 1:31 PM, Krishna Kalyan wrote: > +1 > > Regards, > Krishna > > On Sun, Nov 12, 2017 at 5:53 AM, Matthias Boehm wrote: > > > just FYI: I&

Re: Distribution functions such as gamma, weibull etc.

2017-11-13 Thread Matthias Boehm
unfortunately, our cdf and invcdf currently only support the distributions normal, exp, chisq, f, and t and scalar inputs. So you would have to emulate this at script level. Extending the list of distribution functions and adding matrix support would be a good addition though. Regards, Matthias O

Re: SystemML 1.0 release timeline

2017-11-11 Thread Matthias Boehm
-+1. Thanks, > > From: "Niketan Pansare" > To: dev@systemml.apache.org > Date: 11/08/2017 06:51 AM > Subject: Re: SystemML 1.0 release timeline > -- > > > > > +1. > > Thanks, > > Niketan. > > > On Nov 7,

SystemML 1.0 release timeline

2017-11-07 Thread Matthias Boehm
Hi all, we made some good progress regarding deep learning support, code generation, and low-latency scoring - so, I'm looking forward to our upcoming 1.0 release. Since it's our first stable release, I think it would be a good idea to allocate some extra time for QA. How about we shoot for a rele

Re: My life made easier, now!

2017-10-30 Thread Matthias Boehm
Janardhan, could you please elaborate a little what issues you faced? SystemML itself does not require any specific installation. Also to be productive, you might want to setup a dev environment, where you can run tests locally directly from your IDE. Regards, Matthias On Mon, Oct 30, 2017 at 4:5

Re: Get plans before and after rewrites

2017-10-25 Thread Matthias Boehm
erties file, but I > cannot find anything relevant to 'org.apache.sysml.hops.rewrite'. Is > there > another file I should check? > > Thanks again, > Nantia > > 2017-10-13 23:29 GMT+03:00 Matthias Boehm : > > > Hi Nantia, > > > > in optimiza

Re: [DISCUSS] Support for lower precision in SystemML

2017-10-24 Thread Matthias Boehm
+1 this is really great. I like the semantics of a best-effort use of single-precision (when configured we're free to use single precision but can fall back to double precision if certain operations or backends don't support it yet). This allows us to add single precision support incrementally one

Re: Jenkins build is still unstable: SystemML-DailyTest #1297

2017-10-15 Thread Matthias Boehm
thanks Ted - this was my fault. A recent change toward a more aggressive use of CSR revealed a number of hidden issues, which I already fixed last night - so, the next build should be fine. Regards, Matthias On Sun, Oct 15, 2017 at 6:36 AM, Ted Yu wrote: > *There seems to be some regression in

Re: Get plans before and after rewrites

2017-10-13 Thread Matthias Boehm
Hi Nantia, in optimization level 1, we disable the following rewrites and the explain hops or runtime output will show the resulting plan: * Disable common-subexpression elimination * Disable algebraic simplifications (static and dynamic) * Disable inter-procedural analysis * Disable branch remova

Re: Minor script changes for SVM with `MLContext`, `spark_submit` etc.

2017-10-04 Thread Matthias Boehm
as mentioned on PR-673, I'm probably not the right person to comment on algorithm or API-related changes, but I'll try to have a look tomorrow. Regards, Matthias On Tue, Oct 3, 2017 at 6:52 AM, Janardhan Pulivarthi < janardhan.pulivar...@gmail.com> wrote: > Hi Matthias, > > Based on your commen

Re: file location for `import org.apache.sysml.parser.dml.DmlParser.WhileStatementContext`

2017-09-22 Thread Matthias Boehm
I guess this is just a delayed message, right? Ted was right, it's a generated class, and hence not in the repo. However, when you build SystemML, the generated source ends up in the src directory of the org.apache.sysml.parser.dml package as well. Note that for new builtin functions such as xor
