[jira] [Created] (SYSTEMML-1908) Non-deterministic number of fused operators on GLM

2017-09-14 Thread Matthias Boehm (JIRA)
Matthias Boehm created SYSTEMML-1908:


 Summary: Non-deterministic number of fused operators on GLM
 Key: SYSTEMML-1908
 URL: https://issues.apache.org/jira/browse/SYSTEMML-1908
 Project: SystemML
 Issue Type: Bug
 Reporter: Matthias Boehm


Experiments on the codegen compilation overhead of GLM revealed a 
non-deterministic number of fused operators, which should not happen given 
constant inputs and a deterministic algorithm. The issue can be reproduced, 
for instance, with the codegen unit test {{GLM Binomial Dense w/ rewrite 
dense}}. Two runs produce the following output, which differs in the JC count 
(72 vs. 69) and in the plan cache hits (31 vs. 34):

{code}
Codegen compile (DAG,CP,JC):    974/134/72.
Codegen enum (ALLt/p,EVALt/p):  264380/262794/8971/8858.
Codegen compile times (DAG,JC): 1.045/0.547 sec.
Codegen plan cache hits:        31/103.
{code}

{code}
Codegen compile (DAG,CP,JC):    974/134/69.
Codegen enum (ALLt/p,EVALt/p):  264380/262794/8971/8858.
Codegen compile times (DAG,JC): 1.034/0.509 sec.
Codegen plan cache hits:        34/103.
{code}

After debugging, it turns out that the issue was caused by a 
non-deterministic ordering of scalars and dense inputs (by the number of 
non-zeros). Since ordering the inputs (per category of matrices, vectors, and 
scalars) is only beneficial for sparse inputs, this task aims to harden these 
ordering conditions. Overall, fixing this issue leads to fewer fused operators 
(more reuse) and thus less Java compilation and JIT compilation overhead.
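
To illustrate what a hardened ordering condition could look like, here is a minimal Python sketch. It is not SystemML's implementation: the `Input` class, the category ranks, the ascending order by non-zeros, and the use of a name as the final tie-breaker are all assumptions made for illustration. The idea is that only sparse inputs are ordered by their number of non-zeros, while dense inputs and scalars fall back to a stable identifier, so the result no longer depends on the order in which the inputs arrive.

```python
from dataclasses import dataclass
from itertools import permutations

# Hypothetical category ranks: matrices first, then vectors, then scalars.
CATEGORY_RANK = {"matrix": 0, "vector": 1, "scalar": 2}


@dataclass(frozen=True)
class Input:
    name: str      # stable identifier, used as the final tie-breaker (assumption)
    category: str  # "matrix", "vector", or "scalar"
    nnz: int       # number of non-zeros; -1 stands for "unknown" here
    sparse: bool


def order_key(inp):
    # Only sparse inputs are ordered by nnz. Dense inputs and scalars all get
    # the same nnz component, so nnz cannot influence their relative order.
    nnz_part = inp.nnz if inp.sparse else 0
    return (CATEGORY_RANK[inp.category], nnz_part, inp.name)


def order_inputs(inputs):
    return sorted(inputs, key=order_key)


inputs = [
    Input("X", "matrix", 1000, True),
    Input("Y", "matrix", 250, True),
    Input("D", "matrix", -1, False),
    Input("v", "vector", -1, False),
    Input("s1", "scalar", -1, False),
    Input("s2", "scalar", -1, False),
]

# Every arrival order of the same inputs yields the same final ordering.
orderings = {
    tuple(i.name for i in order_inputs(p)) for p in permutations(inputs)
}
print(orderings)
```

Because the sort key is a total order over the inputs, the same set of inputs always yields the same operator signature, which is what allows previously compiled operators to be reused.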



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Resolved] (SYSTEMML-1743) Investigate creation of lightweight artifact

2017-09-14 Thread Deron Eriksson (JIRA)

 [ 
https://issues.apache.org/jira/browse/SYSTEMML-1743?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Deron Eriksson resolved SYSTEMML-1743.
--
   Resolution: Fixed
Fix Version/s: SystemML 1.0

Fixed by [PR578|https://github.com/apache/systemml/pull/578].

> Investigate creation of lightweight artifact
>
> Key: SYSTEMML-1743
> URL: https://issues.apache.org/jira/browse/SYSTEMML-1743
> Project: SystemML
> Issue Type: Improvement
> Components: Build
> Reporter: Deron Eriksson
> Assignee: Deron Eriksson
> Fix For: SystemML 1.0
>
> SystemML currently has a large number of required dependencies. For certain 
> tasks, such as running algorithms using JMLC, only a subset of these 
> dependencies is actually needed. Therefore, it should be possible to 
> selectively build a lightweight artifact that holds only the "required" 
> classes.
> The required classes can potentially be determined by querying the 
> classloader for the classes that have been loaded after executing the 
> operations that the lightweight artifact should support. In addition, 
> other techniques, such as looking at the imports of the project's Java files, 
> can be used to determine additional required classes.
> This information can be fed to a Maven assembly to build a lightweight jar 
> file.
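
The import-based technique mentioned in the description could be sketched roughly as follows. This is only an illustration in Python, not the approach taken in PR578: the regular expression, the function name `imported_names`, and the sample source with its `org.example` classes are all hypothetical. The sketch collects the names from plain `import` statements and skips `import static` lines, which name members rather than classes.

```python
import re

# Matches "import a.b.C;" and "import a.b.*;" at the start of a line.
# "import static ..." lines do not match and are therefore skipped.
IMPORT_RE = re.compile(r"^\s*import\s+([\w.]+)(?:\.\*)?\s*;", re.MULTILINE)


def imported_names(java_source):
    """Return the sorted, de-duplicated names imported by one Java source file."""
    return sorted(set(IMPORT_RE.findall(java_source)))


# Hypothetical Java source used only to exercise the function.
sample = """
package org.example;

import java.util.List;
import java.util.Map;
import java.io.*;
import org.example.util.Helper;
import static org.example.util.Helper.check;

public class Demo { }
"""

names = imported_names(sample)
print(names)
```

Imports only give a static upper bound per file; classes loaded reflectively would still have to come from the classloader-based approach.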





[jira] [Closed] (SYSTEMML-1743) Investigate creation of lightweight artifact

2017-09-14 Thread Deron Eriksson (JIRA)

 [ 
https://issues.apache.org/jira/browse/SYSTEMML-1743?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Deron Eriksson closed SYSTEMML-1743.







[jira] [Commented] (SYSTEMML-1907) Rename python package from systemml-*-python.tgz to systemml-*-python.tar.gz

2017-09-14 Thread Deron Eriksson (JIRA)

[ 
https://issues.apache.org/jira/browse/SYSTEMML-1907?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16166860#comment-16166860
 ] 

Deron Eriksson commented on SYSTEMML-1907:
--

+1 for name change.

For consistency, the other .tgz artifacts should probably also be changed to 
.tar.gz.


> Rename python package from systemml-*-python.tgz to systemml-*-python.tar.gz
>
> Key: SYSTEMML-1907
> URL: https://issues.apache.org/jira/browse/SYSTEMML-1907
> Project: SystemML
> Issue Type: Improvement
> Components: Build
> Reporter: Niketan Pansare
> Assignee: Glenn Weidner
>
> I encountered this issue because PyPI has migrated to a new process: 
> https://packaging.python.org/guides/migrating-to-pypi-org/#uploading
> As noted in the above document, the recommended way to upload Python packages 
> to PyPI is now via `twine`. However, if we use `twine` with our current 
> package naming scheme (i.e. tgz), it complains `ValueError: Unknown 
> distribution format: 'systemml-0.15.0-python.tgz'`. Hence, I would recommend 
> using the suffix `tar.gz` in our subsequent releases. This also makes us 
> compatible with the default package naming convention of PyPI: `tar.gz`.
> [~acs_s] [~gweidner] [~deron] [~dusenberrymw] [~reinwald] Suggestions? Any 
> takers?
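
Since `.tgz` is just a short form of `.tar.gz`, the archive content does not need to change, only the file name. As a rough illustration, the following Python sketch renames existing `.tgz` artifacts in a directory. The helper names are hypothetical, and a real fix would presumably change the artifact name in the build configuration instead of renaming files afterwards.

```python
import os
import tempfile


def tar_gz_name(filename):
    """Return filename with a trailing .tgz replaced by .tar.gz."""
    if filename.endswith(".tgz"):
        return filename[: -len(".tgz")] + ".tar.gz"
    return filename


def rename_tgz_artifacts(directory):
    """Rename every *.tgz file in directory and return the new names, sorted."""
    renamed = []
    for name in sorted(os.listdir(directory)):
        new_name = tar_gz_name(name)
        if new_name != name:
            os.rename(
                os.path.join(directory, name),
                os.path.join(directory, new_name),
            )
            renamed.append(new_name)
    return renamed


# Demo with an empty placeholder file standing in for the real artifact.
with tempfile.TemporaryDirectory() as tmp:
    open(os.path.join(tmp, "systemml-0.15.0-python.tgz"), "w").close()
    renamed = rename_tgz_artifacts(tmp)
    print(renamed)
```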





[jira] [Commented] (SYSTEMML-1907) Rename python package from systemml-*-python.tgz to systemml-*-python.tar.gz

2017-09-14 Thread Niketan Pansare (JIRA)

[ 
https://issues.apache.org/jira/browse/SYSTEMML-1907?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16166772#comment-16166772
 ] 

Niketan Pansare commented on SYSTEMML-1907:
---

thanks [~gweidner] :)






[jira] [Commented] (SYSTEMML-1907) Rename python package from systemml-*-python.tgz to systemml-*-python.tar.gz

2017-09-14 Thread Mike Dusenberry (JIRA)

[ 
https://issues.apache.org/jira/browse/SYSTEMML-1907?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16166759#comment-16166759
 ] 

Mike Dusenberry commented on SYSTEMML-1907:
---

+1 for the name change.






[jira] [Updated] (SYSTEMML-1907) Rename python package from systemml-*-python.tgz to systemml-*-python.tar.gz

2017-09-14 Thread Glenn Weidner (JIRA)

 [ 
https://issues.apache.org/jira/browse/SYSTEMML-1907?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Glenn Weidner updated SYSTEMML-1907:

Sprint: Sprint 6






[jira] [Commented] (SYSTEMML-1907) Rename python package from systemml-*-python.tgz to systemml-*-python.tar.gz

2017-09-14 Thread Glenn Weidner (JIRA)

[ 
https://issues.apache.org/jira/browse/SYSTEMML-1907?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=1612#comment-1612
 ] 

Glenn Weidner commented on SYSTEMML-1907:
-

Thanks for catching this - I'll take this one.



