[jira] [Commented] (MADLIB-1057) Reduce memory footprint for DT

2017-04-18 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/MADLIB-1057?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15973811#comment-15973811
 ] 

ASF GitHub Bot commented on MADLIB-1057:


GitHub user iyerr3 opened a pull request:

https://github.com/apache/incubator-madlib/pull/120

DT: Assign memory only for reachable nodes

JIRA: MADLIB-1057

TreeAccumulator assigns a matrix to track the statistics of rows
reaching the last layer of nodes. This matrix assumes a complete 
tree and assigns memory for all nodes. As the tree gets deeper, 
most of the nodes are unreachable, resulting in excessive wasted
memory. This commit reduces that waste by only assigning memory
for nodes that are reachable and accessing them through a lookup 
table.
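
The lookup-table idea described above can be illustrated with a minimal Python sketch. This is not the code from the pull request (MADlib's TreeAccumulator is C++), and the names `SparseLeafStats`, `lookup`, `row_for`, `add`, and `get` are hypothetical, chosen only for illustration. It assumes the last-layer nodes are numbered 0 .. 2^depth - 1 as they would be in a complete tree.

```python
class SparseLeafStats:
    """Statistics for the last tree layer, stored only for reachable nodes.

    Hypothetical illustration; names and layout are assumptions, not MADlib's API.
    """

    def __init__(self, depth, reachable_nodes, n_stats):
        n_slots = 2 ** depth  # positions in the last layer of a complete tree
        ordered = sorted(set(reachable_nodes))
        # Lookup table: complete-tree position -> compact row, -1 if unreachable.
        self.lookup = [-1] * n_slots
        for row, node in enumerate(ordered):
            self.lookup[node] = row
        # Statistics rows are allocated only for reachable nodes.
        self.stats = [[0.0] * n_stats for _ in ordered]

    def row_for(self, node):
        """Translate a complete-tree position into a compact row index."""
        row = self.lookup[node]
        if row < 0:
            raise KeyError(f"node {node} is unreachable")
        return row

    def add(self, node, values):
        """Accumulate one row's statistics into the node it reaches."""
        target = self.stats[self.row_for(node)]
        for i, value in enumerate(values):
            target[i] += value

    def get(self, node):
        """Return the accumulated statistics for a reachable node."""
        return self.stats[self.row_for(node)]
```

With `depth=4` and three reachable nodes, the sketch keeps a 16-entry index but only 3 statistics rows, so callers can keep addressing nodes by their complete-tree position while the wide matrix stays small.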

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/iyerr3/incubator-madlib feature/dt_reduce_memory

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/incubator-madlib/pull/120.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #120


commit b1cea55925ee1e3f6569d2d7aafac16e608c43b3
Author: Rahul Iyer 
Date:   2017-04-15T00:54:31Z

Initial commit for sparser stats matrices

commit a0875f23ff69f22462a227b500612965976e0358
Author: Rahul Iyer 
Date:   2017-04-18T20:38:04Z

Build lookup index vector

commit 67cb1b121a4829f4840f33f7cdc7eabe839ec343
Author: Rahul Iyer 
Date:   2017-04-19T00:39:24Z

Remove warnings




> Reduce memory footprint for DT
> --
>
> Key: MADLIB-1057
> URL: https://issues.apache.org/jira/browse/MADLIB-1057
> Project: Apache MADlib
> Issue Type: Improvement
> Components: Module: Decision Tree
> Reporter: Frank McQuillan
> Assignee: Rahul Iyer
> Fix For: v1.11
>
>
> Follow on from spike 
> https://issues.apache.org/jira/browse/MADLIB-1035
> Step 1
> As a MADlib developer, I want to recreate the RF memory issue (reported in 
> https://issues.apache.org/jira/browse/MADLIB-1035). 
> The current datasets we have are 
> dt_adult : 32K rows 14 columns
> ecommerce : 1M rows 4 columns (ecommerce isn’t actually suitable for DT/RF)
> We need a table with ~2.2M rows and ~130 features (the actual target table 
> has ~1300 features). Randomly filling them might help diagnose the issue, 
> but ideally we would want a somewhat sensible dataset. The problem seems to 
> involve relatively short trees (depth 5), which means a random dataset will 
> probably fill the whole tree, which might not be true for a structured dataset.
> Step 2
> Refactoring DT for a smaller memory footprint.
> Tree Accumulator has 2 matrices for continuous and categorical variables. 
> The whole structure is recreated at every level. 
> Every matrix has 2^i rows (i is the level)
> The categorical matrix size depends on the total number of categories 
> (weather : {sunny, cloudy, rainy}, isWeekend : {true, false} means this total 
> is 3+2=5) 
> The continuous matrix size depends on the number of cont. features * the 
> number of bins.
> Tree accumulator works like an array, not a linked list. Even if the output is 
> not a complete tree, the tree accumulator creates rows for nonexistent 
> branches in proper order and fills them with 0 values. 
> The refactored version would create a small index table that has the same 
> number of rows as the old tree accumulator (a complete tree) but only a 
> single index column that points to the new tree accumulator row. 
> This will allow us to keep most of the internal function interfaces the same, 
> but the code to access (read/write) the tree accumulator will have to change.
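
To make the sizes in the description concrete, here is a rough Python estimate comparing the complete-tree layout with the index-table layout. The issue only says that the matrix widths "depend on" the category total and on features times bins, so this sketch assumes one cell per category and one cell per (continuous feature, bin) pair; the real widths are probably larger by some constant factor. The function name `accumulator_cells` and the example counts (level 10, 40 reachable nodes, 2 continuous features with 10 bins) are hypothetical.

```python
def accumulator_cells(level, n_reachable, total_categories, n_cont_features, n_bins):
    """Return (dense_cells, sparse_cells) for one tree level.

    Assumption: matrix width is one cell per category plus one cell per
    (continuous feature, bin) pair. This is a simplification for illustration.
    """
    complete_rows = 2 ** level  # rows the old accumulator allocates
    width = total_categories + n_cont_features * n_bins
    dense_cells = complete_rows * width
    # New layout: wide rows only for reachable nodes, plus a single index
    # column with one entry per complete-tree row.
    sparse_cells = n_reachable * width + complete_rows
    return dense_cells, sparse_cells


# weather has 3 categories and isWeekend has 2, so total_categories = 5.
dense, sparse = accumulator_cells(
    level=10, n_reachable=40, total_categories=5, n_cont_features=2, n_bins=10
)
print(dense, sparse)  # prints: 25600 2024
```

The index column still grows as 2^i, but each of its entries is a single value, whereas each avoided row saves the full matrix width.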



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (MADLIB-1077) Double check binary distribution

2017-04-18 Thread Frank McQuillan (JIRA)

 [ 
https://issues.apache.org/jira/browse/MADLIB-1077?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Frank McQuillan updated MADLIB-1077:

Priority: Major  (was: Minor)

> Double check binary distribution
> 
>
> Key: MADLIB-1077
> URL: https://issues.apache.org/jira/browse/MADLIB-1077
> Project: Apache MADlib
> Issue Type: Task
> Components: All Modules
> Reporter: Frank McQuillan
> Assignee: Roman Shaposhnik
> Fix For: v1.11
>
>
> Double check that binary distribution licensing issues are all OK.
> For example, see comments from Ed Espino on 1.10 RC-2 review on thread
> https://mail-archives.apache.org/mod_mbox/incubator-madlib-user/201703.mbox/%3CCAHAuQDzarS7K4u-rOsLLhbwSHCyFn5cKSyjLinE%2BZ%3DjSpU59qw%40mail.gmail.com%3E
> {code}
>  I was performing the build from a simple perspective. Download
>   source, configure, make and glance at docs (in this order).
>   As we have dealt with auto-downloaded files in the HAWQ project, I
>   was surprised that the following packages were automatically
>   downloaded for me. On the HAWQ project we were instructed to require
>   these as pre-requisites and/or make them optional, included via
>   command line options (configure).  I'm guessing other packages would
>   have been automatically downloaded if they were not found on system
>   (eg: boost).
>   Automatically downloaded packages:
>   https://github.com/madlib/eigen/archive/branches/3.2.tar.gz
>   http://sourceforge.net/projects/pyxb/files/PyXB-1.2.4.tar.gz
>   
> Issue: As "make" was running, the following message was a bit alarming:
>PyXB: Removing GPL component from code base
>   
> This comes from the script src/patch/PyXB.sh run after PyXB source
> is downloaded.
> 
>   ...
>   echo "PyXB: Removing GPL component from code base"
>   rm -f doc/extapi.py
>   rm -f doc/extapi.pyc
> {code}





[jira] [Updated] (MADLIB-1076) Review LICENSE file and README.md

2017-04-18 Thread Frank McQuillan (JIRA)

 [ 
https://issues.apache.org/jira/browse/MADLIB-1076?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Frank McQuillan updated MADLIB-1076:

Priority: Major  (was: Minor)

> Review LICENSE file and README.md
> -
>
> Key: MADLIB-1076
> URL: https://issues.apache.org/jira/browse/MADLIB-1076
> Project: Apache MADlib
> Issue Type: Task
> Components: Documentation
> Reporter: Frank McQuillan
> Assignee: Roman Shaposhnik
> Fix For: v1.11
>
>
> Comments from Ed Espino on 1.10 RC-2 review on thread
> https://mail-archives.apache.org/mod_mbox/incubator-madlib-user/201703.mbox/%3CCAHAuQDzarS7K4u-rOsLLhbwSHCyFn5cKSyjLinE%2BZ%3DjSpU59qw%40mail.gmail.com%3E
> {code}
> LICENSE
>   Shouldn't the components with files in licenses/third_party be
>   referenced in LICENSE file?
> Boost_Software_License_v1.txt
> Eigen_v3.1.2.txt
> PyXB_v1.2.3.txt
> PyYAML_v3.10.txt
> Python_License_v2.7.1.txt
> UseLATEX_v1.9.4.txt
> _M_widen_init.txt
> argparse_v1.2.1.txt
> From README.md, I only saw an incomplete reference to the third party 
> components.
>   Third Party Components
>   MADlib incorporates material from the following third-party components
>   
>   argparse 1.2.1 "provides an easy, declarative interface for creating 
> command line tools"
>   Boost 1.47.0 (or newer) "provides peer-reviewed portable C++ source 
> libraries"
>   Eigen 3.2.2 "is a C++ template library for linear algebra"
>   PyYAML 3.10 "is a YAML parser and emitter for Python"
>   PyXB 1.2.4 "is a Python library for XML Schema Bindings"
> {code}
> To dos:
> 1) Confirm that LICENSE file is up to date
> 2) Update README.md with any required 3rd party/licensing clarifications





[jira] [Assigned] (MADLIB-1076) Review LICENSE file and README.md

2017-04-18 Thread Frank McQuillan (JIRA)

 [ 
https://issues.apache.org/jira/browse/MADLIB-1076?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Frank McQuillan reassigned MADLIB-1076:
---

Assignee: Roman Shaposhnik  (was: Frank McQuillan)

> Review LICENSE file and README.md
> -
>
> Key: MADLIB-1076
> URL: https://issues.apache.org/jira/browse/MADLIB-1076
> Project: Apache MADlib
> Issue Type: Task
> Components: Documentation
> Reporter: Frank McQuillan
> Assignee: Roman Shaposhnik
> Priority: Minor
> Fix For: v1.11
>
>
> Comments from Ed Espino on 1.10 RC-2 review on thread
> https://mail-archives.apache.org/mod_mbox/incubator-madlib-user/201703.mbox/%3CCAHAuQDzarS7K4u-rOsLLhbwSHCyFn5cKSyjLinE%2BZ%3DjSpU59qw%40mail.gmail.com%3E
> {code}
> LICENSE
>   Shouldn't the components with files in licenses/third_party be
>   referenced in LICENSE file?
> Boost_Software_License_v1.txt
> Eigen_v3.1.2.txt
> PyXB_v1.2.3.txt
> PyYAML_v3.10.txt
> Python_License_v2.7.1.txt
> UseLATEX_v1.9.4.txt
> _M_widen_init.txt
> argparse_v1.2.1.txt
> From README.md, I only saw an incomplete reference to the third party 
> components.
>   Third Party Components
>   MADlib incorporates material from the following third-party components
>   
>   argparse 1.2.1 "provides an easy, declarative interface for creating 
> command line tools"
>   Boost 1.47.0 (or newer) "provides peer-reviewed portable C++ source 
> libraries"
>   Eigen 3.2.2 "is a C++ template library for linear algebra"
>   PyYAML 3.10 "is a YAML parser and emitter for Python"
>   PyXB 1.2.4 "is a Python library for XML Schema Bindings"
> {code}
> To dos:
> 1) Confirm that LICENSE file is up to date
> 2) Update README.md with any required 3rd party/licensing clarifications





[jira] [Comment Edited] (MADLIB-1076) Review LICENSE file and README.md

2017-04-18 Thread Frank McQuillan (JIRA)

[ 
https://issues.apache.org/jira/browse/MADLIB-1076?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15973765#comment-15973765
 ] 

Frank McQuillan edited comment on MADLIB-1076 at 4/19/17 12:28 AM:
---

Here is a summary of the 3rd party components used at the current time:

1) Bundled with source code:

||component||version||license||license link||notes||
| libstemmer-porter | porter2 | BSD | http://snowballstem.org/license.html |  |
| m_widen_init | none | MIT | https://github.com/apache/incubator-madlib/blob/master/licenses/third_party/_M_widen_init.txt |  |
| python-argparse | 1.2.1 | Python | https://wiki.python.org/moin/PythonSoftwareFoundationLicenseFaq |  |
| pyyaml | 3.10 | MIT | http://pyyaml.org/wiki/PyYAML |  |
| UseLATEX.cmake | 2.1.1 | BSD | See header of https://github.com/kmorel/UseLATEX/blob/master/UseLATEX.cmake |  |

2) Downloaded at build time:

||component||version||license||license link||notes||
| boost | 1.61.0 | MIT | http://www.boost.org/users/license.html |  |
| pyxb | 1.2.4 | Apache 2.0 | http://pyxb.sourceforge.net/legal.html | See Note 1 below |
| eigen | 3.2 | MPL 2.0 | http://eigen.tuxfamily.org/index.php?title=Main_Page#License | See Note 2 below |

*Note 1*
PyXB: Python XML Schema Bindings
http://pyxb.sourceforge.net/

http://pyxb.sourceforge.net/legal.html
says
“PyXB as a whole is made available under the Apache License v 2.0.”
See also
https://github.com/pabigot/pyxb/blob/next/LICENSE
which has the standard Apache License Version 2.0 verbiage.

However, we noticed that it has a GPL 3.0 sub-component:
https://github.com/pabigot/pyxb/blob/next/doc/extapi.py
(This seems strange, but anyways…)

Remedy in Sept 2015 when MADlib entered as an Apache incubating project:
* At build time, we remove this GPL sub-component since it is not needed,
just to be 100% sure we are only including Apache License Version 2.0 software.
* This is the reason there is a message during build that says “PyXB: Removing 
GPL component from code base”

*Note 2*
Eigen
http://eigen.tuxfamily.org/

http://eigen.tuxfamily.org/index.php?title=Main_Page#License
says
“Eigen is Free Software. Starting from the 3.1.1 version, it is licensed under 
the MPL2…Note that currently, a few features rely on third-party code licensed 
under the LGPL: SimplicialCholesky, AMD ordering, and constrained_cg.”  

MADlib uses a later version than 3.1.1.

Remedy in Sept 2015 when MADlib entered as an Apache incubating project, as per 
Roman's guidance at that time and as per 
http://www.apache.org/legal/resolved.html#category-b :
* Cloned Eigen header files and made needed changes, then maintained on a 
separate GitHub repo https://github.com/madlib/eigen
* This modified Eigen GitHub project carries forward the MPL license.  The 
commits showing changes made are here 
https://github.com/madlib/eigen/tree/branches/3.2
* At build time, MADlib project will then include the Eigen header files from 
this repo
* Also we use the EIGEN_MPL2_ONLY preprocessor symbol to explicitly exclude the 
LGPL components

The current license file 
https://github.com/apache/incubator-madlib/blob/master/LICENSE
lists:
* src/libstemmer is taken from Snowball stem project
* src/madpack/yaml is taken from PyYAML project
* cmake/UseLATEX.cmake is taken from LaTeX project
which seems to be incomplete.

Roman, can you please advise on what should be in 
https://github.com/apache/incubator-madlib/blob/master/LICENSE
given the list of 3rd party components that MADlib uses as listed above?

Thanks.









[jira] [Comment Edited] (MADLIB-1076) Review LICENSE file and README.md

2017-04-18 Thread Frank McQuillan (JIRA)

[ 
https://issues.apache.org/jira/browse/MADLIB-1076?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15973765#comment-15973765
 ] 

Frank McQuillan edited comment on MADLIB-1076 at 4/19/17 12:16 AM:
---

Here is a summary of the 3rd party components used at the current time:

1) Bundled with source code:

||component||version||license||license link||notes||
| libstemmer-porter | porter2 | BSD | http://snowballstem.org/license.html |  |
| m_widen_init | none | MIT | https://github.com/apache/incubator-madlib/blob/master/licenses/third_party/_M_widen_init.txt |  |
| python-argparse | 1.2.1 | Python | https://wiki.python.org/moin/PythonSoftwareFoundationLicenseFaq |  |
| pyyaml | 3.10 | MIT | http://pyyaml.org/wiki/PyYAML |  |
| UseLATEX.cmake | 2.1.1 | BSD | See header of https://github.com/kmorel/UseLATEX/blob/master/UseLATEX.cmake |  |

2) Downloaded at build time:

||component||version||license||license link||notes||
| boost | 1.61.0 | MIT | http://www.boost.org/users/license.html |  |
| pyxb | 1.2.4 | Apache 2.0 | http://pyxb.sourceforge.net/legal.html | See Note 1 below |
| eigen | 3.2 | MPL 2.0 | http://eigen.tuxfamily.org/index.php?title=Main_Page#License | See Note 2 below |

*Note 1*
PyXB: Python XML Schema Bindings
http://pyxb.sourceforge.net/

http://pyxb.sourceforge.net/legal.html
says
“PyXB as a whole is made available under the Apache License v 2.0.”
See also
https://github.com/pabigot/pyxb/blob/next/LICENSE
which has the standard Apache License Version 2.0 verbiage.

However, we noticed that it has a GPL 3.0 sub-component:
https://github.com/pabigot/pyxb/blob/next/doc/extapi.py
(This seems strange, but anyways…)

Remedy from Sept 2015:  
* At build time, we remove this GPL sub-component since it is not needed,
just to be 100% sure we are only including Apache License Version 2.0 software. 
 
* This is the reason there is a message during build that says “PyXB: Removing 
GPL component from code base”

*Note 2*
Eigen
http://eigen.tuxfamily.org/

http://eigen.tuxfamily.org/index.php?title=Main_Page#License
says
“Eigen is Free Software. Starting from the 3.1.1 version, it is licensed under 
the MPL2…Note that currently, a few features rely on third-party code licensed 
under the LGPL: SimplicialCholesky, AMD ordering, and constrained_cg.”  

MADlib uses a later version than 3.1.1.

Remedy from Sept 2015, as per Roman's guidance at that time and as per 
http://www.apache.org/legal/resolved.html#category-b :
* Cloned Eigen header files and made needed changes, then maintained on a 
separate GitHub repo https://github.com/madlib/eigen
* This modified Eigen GitHub project carries forward the MPL license.  The 
commits showing changes made are here 
https://github.com/madlib/eigen/tree/branches/3.2
* At build time, MADlib project will then include the Eigen header files from 
this repo
* Also we use the EIGEN_MPL2_ONLY preprocessor symbol to explicitly exclude the 
LGPL components

The current license file 
https://github.com/apache/incubator-madlib/blob/master/LICENSE
lists:
* src/libstemmer is taken from Snowball stem project
* src/madpack/yaml is taken from PyYAML project
* cmake/UseLATEX.cmake is taken from LaTeX project
which seems to be incomplete.

Roman, can you please advise on what should be in 
https://github.com/apache/incubator-madlib/blob/master/LICENSE

Thanks.









[jira] [Comment Edited] (MADLIB-1076) Review LICENSE file and README.md

2017-04-18 Thread Frank McQuillan (JIRA)

[ 
https://issues.apache.org/jira/browse/MADLIB-1076?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15973765#comment-15973765
 ] 

Frank McQuillan edited comment on MADLIB-1076 at 4/19/17 12:14 AM:
---

Here is a summary of the 3rd party components used at the current time:

1) Bundled with source code:

||component||version||license||license link||notes||
| libstemmer-porter | porter2 | BSD | http://snowballstem.org/license.html |  |
| m_widen_init | none | MIT | https://github.com/apache/incubator-madlib/blob/master/licenses/third_party/_M_widen_init.txt |  |
| python-argparse | 1.2.1 | Python | https://wiki.python.org/moin/PythonSoftwareFoundationLicenseFaq |  |
| pyyaml | 3.10 | MIT | http://pyyaml.org/wiki/PyYAML |  |
| UseLATEX.cmake | 2.1.1 | BSD | See header of https://github.com/kmorel/UseLATEX/blob/master/UseLATEX.cmake |  |

2) Downloaded at build time:

||component||version||license||license link||notes||
| boost | 1.61.0 | MIT | http://www.boost.org/users/license.html |  |
| pyxb | 1.2.4 | Apache 2.0 | http://pyxb.sourceforge.net/legal.html | See Note 1 below |
| eigen | 3.2 | MPL 2.0 | http://eigen.tuxfamily.org/index.php?title=Main_Page#License | See Note 2 below |

*Note 1*
PyXB: Python XML Schema Bindings
http://pyxb.sourceforge.net/

http://pyxb.sourceforge.net/legal.html
says
“PyXB as a whole is made available under the Apache License v 2.0.”
See also
https://github.com/pabigot/pyxb/blob/next/LICENSE
which has the standard Apache License Version 2.0 verbiage.

However, we noticed that it has a GPL 3.0 sub-component:
https://github.com/pabigot/pyxb/blob/next/doc/extapi.py
(This seems strange, but anyways…)

Remedy from Sept 2015:  
* At build time, we remove this GPL sub-component since it is not needed,
just to be 100% sure we are only including Apache License Version 2.0 software. 
 
* This is the reason there is a message during build that says “PyXB: Removing 
GPL component from code base”

*Note 2*
Eigen
http://eigen.tuxfamily.org/

http://eigen.tuxfamily.org/index.php?title=Main_Page#License
says
“Eigen is Free Software. Starting from the 3.1.1 version, it is licensed under 
the MPL2…Note that currently, a few features rely on third-party code licensed 
under the LGPL: SimplicialCholesky, AMD ordering, and constrained_cg.”  

MADlib uses a later version than 3.1.1.

Remedy from Sept 2015, as per Roman's guidance at that time and as per 
http://www.apache.org/legal/resolved.html#category-b :
* Cloned Eigen header files and made needed changes, then maintained on a 
separate GitHub repo https://github.com/madlib/eigen
* This modified Eigen GitHub project carries forward the MPL license.  The 
commits showing changes made are here 
https://github.com/madlib/eigen/tree/branches/3.2
* At build time, MADlib project will then include the Eigen header files from 
this repo
* Also we use the EIGEN_MPL2_ONLY preprocessor symbol to explicitly exclude the 
LGPL components






[jira] [Comment Edited] (MADLIB-1076) Review LICENSE file and README.md

2017-04-18 Thread Frank McQuillan (JIRA)

[ 
https://issues.apache.org/jira/browse/MADLIB-1076?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15973765#comment-15973765
 ] 

Frank McQuillan edited comment on MADLIB-1076 at 4/19/17 12:12 AM:
---

Here is a summary of the 3rd party components used at the current time:

1) Bundled with source code:

||component||version||license||license link||notes||
| libstemmer-porter | porter2 | BSD | http://snowballstem.org/license.html |  |
| m_widen_init | none | MIT | https://github.com/apache/incubator-madlib/blob/master/licenses/third_party/_M_widen_init.txt |  |
| python-argparse | 1.2.1 | Python | https://wiki.python.org/moin/PythonSoftwareFoundationLicenseFaq |  |
| pyyaml | 3.10 | MIT | http://pyyaml.org/wiki/PyYAML |  |
| UseLATEX.cmake | 2.1.1 | BSD | See header of https://github.com/kmorel/UseLATEX/blob/master/UseLATEX.cmake |  |

2) Downloaded at build time:

||component||version||license||license link||notes||
| boost | 1.61.0 | MIT | http://www.boost.org/users/license.html |  |
| pyxb | 1.2.4 | Apache 2.0 | http://pyxb.sourceforge.net/legal.html | See Note 1 below |
| eigen | 3.2 | MPL 2.0 | http://eigen.tuxfamily.org/index.php?title=Main_Page#License | See Note 2 below |

Note 1
PyXB: Python XML Schema Bindings
http://pyxb.sourceforge.net/

http://pyxb.sourceforge.net/legal.html
says
“PyXB as a whole is made available under the Apache License v 2.0.”
See also
https://github.com/pabigot/pyxb/blob/next/LICENSE
which has the standard Apache License Version 2.0 verbiage.

However, we noticed that it has a GPL 3.0 sub-component:
https://github.com/pabigot/pyxb/blob/next/doc/extapi.py
(This seems strange, but anyways…)

Remedy from Sept 2015:  
* At build time, we remove this GPL sub-component since it is not needed,
just to be 100% sure we are only including Apache License Version 2.0 software. 
 
* This is the reason there is a message during build that says “PyXB: Removing 
GPL component from code base”

Note 2
Eigen
http://eigen.tuxfamily.org/

http://eigen.tuxfamily.org/index.php?title=Main_Page#License
says
“Eigen is Free Software. Starting from the 3.1.1 version, it is licensed under 
the MPL2…Note that currently, a few features rely on third-party code licensed 
under the LGPL: SimplicialCholesky, AMD ordering, and constrained_cg.”  

MADlib uses a later version than 3.1.1.

Remedy from Sept 2015, as per Roman's guidance at that time and as per 
http://www.apache.org/legal/resolved.html#category-b :
* Cloned Eigen header files and made needed changes, then maintained on a 
separate GitHub repo https://github.com/madlib/eigen
* This modified Eigen GitHub project carries forward the MPL license.  The 
commits showing changes made are here 
https://github.com/madlib/eigen/tree/branches/3.2
* At build time, MADlib project will then include the Eigen header files from 
this repo
* Also we use the EIGEN_MPL2_ONLY preprocessor symbol to explicitly exclude the 
LGPL components






[jira] [Comment Edited] (MADLIB-1076) Review LICENSE file and README.md

2017-04-18 Thread Frank McQuillan (JIRA)

[ 
https://issues.apache.org/jira/browse/MADLIB-1076?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15973765#comment-15973765
 ] 

Frank McQuillan edited comment on MADLIB-1076 at 4/19/17 12:11 AM:
---

Here is a summary of the 3rd party components used at the current time:

1) Bundled with source code:

||component||version||license||license link||notes||
| libstemmer-porter | porter2 | BSD | http://snowballstem.org/license.html |  |
| m_widen_init | none | MIT | https://github.com/apache/incubator-madlib/blob/master/licenses/third_party/_M_widen_init.txt |  |
| python-argparse | 1.2.1 | Python | https://wiki.python.org/moin/PythonSoftwareFoundationLicenseFaq |  |
| pyyaml | 3.10 | MIT | http://pyyaml.org/wiki/PyYAML |  |
| UseLATEX.cmake | 2.1.1 | BSD | See header of https://github.com/kmorel/UseLATEX/blob/master/UseLATEX.cmake |  |

2) Downloaded at build time:

||component||version||license||license link||notes||
| boost | 1.61.0 | MIT | http://www.boost.org/users/license.html | See Note 1 below |
| pyxb | 1.2.4 | Apache 2.0 | http://pyxb.sourceforge.net/legal.html |  |
| eigen | 3.2 | MPL 2.0 | http://eigen.tuxfamily.org/index.php?title=Main_Page#License | See Note 2 below |

Note 1
PyXB: Python XML Schema Bindings
http://pyxb.sourceforge.net/

http://pyxb.sourceforge.net/legal.html
says
“PyXB as a whole is made available under the Apache License v 2.0.”
See also
https://github.com/pabigot/pyxb/blob/next/LICENSE
which has the standard Apache License Version 2.0 verbiage.

However, we noticed that it has a GPL 3.0 sub-component:
https://github.com/pabigot/pyxb/blob/next/doc/extapi.py
(This seems strange, but anyways…)

Remedy from Sept 2015:  
* At build time, we remove this GPL sub-component since it is not needed,
just to be 100% sure we are only including Apache License Version 2.0 software. 
 
* This is the reason there is a message during build that says “PyXB: Removing 
GPL component from code base”

Note 2
Eigen
http://eigen.tuxfamily.org/

http://eigen.tuxfamily.org/index.php?title=Main_Page#License
says
“Eigen is Free Software. Starting from the 3.1.1 version, it is licensed under 
the MPL2…Note that currently, a few features rely on third-party code licensed 
under the LGPL: SimplicialCholesky, AMD ordering, and constrained_cg.”  

MADlib uses a later version than 3.1.1.

Remedy from Sept 2015, as per Roman's guidance at that time and as per 
http://www.apache.org/legal/resolved.html#category-b :
* Cloned Eigen header files and made needed changes, then maintained on a 
separate GitHub repo https://github.com/madlib/eigen
* This modified Eigen GitHub project carries forward the MPL license.  The 
commits showing changes made are here 
https://github.com/madlib/eigen/tree/branches/3.2
* At build time, MADlib project will then include the Eigen header files from 
this repo
* Also we use the EIGEN_MPL2_ONLY preprocessor symbol to explicitly exclude the 
LGPL components





was (Author: fmcquillan):
Here is a summary of the 3rd party components used.

1) Downloaded at build time:

||cpt||version||licence||license link||notes|| 
| boost | 1.61.0 | MIT | http://www.boost.org/users/license.html | See Note 1 below |
| pyxb | 1.2.4 | Apache 2.0 | http://pyxb.sourceforge.net/legal.html |  |
| eigen | 3.2 | MPL 2.0 | http://eigen.tuxfamily.org/index.php?title=Main_Page#License | See Note 2 below |

Note 1
PyXB: Python XML Schema Bindings
http://pyxb.sourceforge.net/

http://pyxb.sourceforge.net/legal.html
says
“PyXB as a whole is made available under the Apache License v 2.0.”
See also
https://github.com/pabigot/pyxb/blob/next/LICENSE
which has the standard Apache License Version 2.0 verbiage.

However, we noticed that it has a GPL 3.0 sub-component:
https://github.com/pabigot/pyxb/blob/next/doc/extapi.py
(This seems strange, but anyways…)

Remedy from Sept 2015:  
* At build time, we remove this GPL sub-component since it is not needed,
just to be 100% sure we are only including Apache License Version 2.0 software. 
 
* This is the reason there is a message during build that says “PyXB: Removing 
GPL component from code base”

Note 2
Eigen
http://eigen.tuxfamily.org/

http://eigen.tuxfamily.org/index.php?title=Main_Page#License
says
“Eigen is Free Software. Starting from the 3.1.1 version, it is licensed under 
the MPL2…Note that currently, a few features rely on third-party code licensed 
under the LGPL: SimplicialCholesky, AMD ordering, and constrained_cg.”  

MADlib uses a later version than 3.1.1.

Remedy from Sept 2015, as per Roman's guidance at that time and as per 
http://www.apache.org/legal/resolved.html#category-b :
* Cloned Eigen header files and made needed changes, then maintained on a 
separate GitHub repo https://github.com/madlib/eigen
* This modified Eigen GitHub project carries forward the MPL license.  The 
commits showing changes made are here 
https://github.com/madlib/eigen

[jira] [Comment Edited] (MADLIB-1076) Review LICENSE file and README.md

2017-04-18 Thread Frank McQuillan (JIRA)

[ 
https://issues.apache.org/jira/browse/MADLIB-1076?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15973765#comment-15973765
 ] 

Frank McQuillan edited comment on MADLIB-1076 at 4/19/17 12:03 AM:
---

Here is a summary of the 3rd party components used.

1) Downloaded at build time:

||cpt||version||licence||license link||notes|| 
| boost | 1.61.0 | MIT | http://www.boost.org/users/license.html | See Note 1 below |
| pyxb | 1.2.4 | Apache 2.0 | http://pyxb.sourceforge.net/legal.html |  |
| eigen | 3.2 | MPL 2.0 | http://eigen.tuxfamily.org/index.php?title=Main_Page#License | See Note 2 below |

Note 1
PyXB: Python XML Schema Bindings
http://pyxb.sourceforge.net/

http://pyxb.sourceforge.net/legal.html
says
“PyXB as a whole is made available under the Apache License v 2.0.”
See also
https://github.com/pabigot/pyxb/blob/next/LICENSE
which has the standard Apache License Version 2.0 verbiage.

However, we noticed that it has a GPL 3.0 sub-component:
https://github.com/pabigot/pyxb/blob/next/doc/extapi.py
(This seems strange, but anyways…)

Remedy from Sept 2015:  
* At build time, we remove this GPL sub-component since it is not needed,
just to be 100% sure we are only including Apache License Version 2.0 software. 
 
* This is the reason there is a message during build that says “PyXB: Removing 
GPL component from code base”

Note 2
Eigen
http://eigen.tuxfamily.org/

http://eigen.tuxfamily.org/index.php?title=Main_Page#License
says
“Eigen is Free Software. Starting from the 3.1.1 version, it is licensed under 
the MPL2…Note that currently, a few features rely on third-party code licensed 
under the LGPL: SimplicialCholesky, AMD ordering, and constrained_cg.”  

MADlib uses a later version than 3.1.1.

Remedy from Sept 2015, as per Roman's guidance at that time and as per 
http://www.apache.org/legal/resolved.html#category-b :
* Cloned Eigen header files and made needed changes, then maintained on a 
separate GitHub repo https://github.com/madlib/eigen
* This modified Eigen GitHub project carries forward the MPL license.  The 
commits showing changes made are here 
https://github.com/madlib/eigen/tree/branches/3.2
* At build time, MADlib project will then include the Eigen header files from 
this repo
* Also we use the EIGEN_MPL2_ONLY preprocessor symbol to explicitly exclude the 
LGPL components





was (Author: fmcquillan):
Here is a summary of the 3rd party components used.

1) Downloaded at build time:

||cpt||version||licence||license link||notes|| 
| boost | 1.61.0 | MIT | http://www.boost.org/users/license.html | See Note 1 below |
| pyxb | 1.2.4 | Apache 2.0 | http://pyxb.sourceforge.net/legal.html |  |
| eigen | 3.2 | MPL 2.0 | http://eigen.tuxfamily.org/index.php?title=Main_Page#License | See Note 2 below |

***Note 1***
PyXB: Python XML Schema Bindings
http://pyxb.sourceforge.net/

http://pyxb.sourceforge.net/legal.html
says
“PyXB as a whole is made available under the Apache License v 2.0.”
See also
https://github.com/pabigot/pyxb/blob/next/LICENSE
which has the standard Apache License Version 2.0 verbiage.

However, we noticed that it has a GPL 3.0 sub-component:
https://github.com/pabigot/pyxb/blob/next/doc/extapi.py
(This seems strange, but anyways…)

Remedy from Sept 2015:  
* At build time, we remove this GPL sub-component since it is not needed,
just to be 100% sure we are only including Apache License Version 2.0 software. 
 
* This is the reason there is a message during build that says “PyXB: Removing 
GPL component from code base”
.

*** Note 2 ***
Eigen
http://eigen.tuxfamily.org/

http://eigen.tuxfamily.org/index.php?title=Main_Page#License
says
“Eigen is Free Software. Starting from the 3.1.1 version, it is licensed under 
the MPL2…Note that currently, a few features rely on third-party code licensed 
under the LGPL: SimplicialCholesky, AMD ordering, and constrained_cg.”  

MADlib uses a later version than 3.1.1.

Remedy from Sept 2015, as per Roman's guidance at that time and as per 
http://www.apache.org/legal/resolved.html#category-b :
* Cloned Eigen header files and made needed changes, then maintained on a 
separate GitHub repo https://github.com/madlib/eigen
* This modified Eigen GitHub project carries forward the MPL license.  The 
commits showing changes made are here 
https://github.com/madlib/eigen/tree/branches/3.2
* At build time, MADlib project will then include the Eigen header files from 
this repo
* Also we use the EIGEN_MPL2_ONLY preprocessor symbol to explicitly exclude the 
LGPL components




> Review LICENSE file and README.md
> -
>
> Key: MADLIB-1076
> URL: https://issues.apache.org/jira/browse/MADLIB-1076
> Project: Apache MADlib
>  Issue Type: Task
>  Components: Documentation
>Reporter: Frank McQuillan
>Assignee: Frank McQuillan
>   

[jira] [Comment Edited] (MADLIB-1076) Review LICENSE file and README.md

2017-04-18 Thread Frank McQuillan (JIRA)

[ 
https://issues.apache.org/jira/browse/MADLIB-1076?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15973765#comment-15973765
 ] 

Frank McQuillan edited comment on MADLIB-1076 at 4/19/17 12:03 AM:
---

Here is a summary of the 3rd party components used.

1) Downloaded at build time:

||cpt||version||licence||license link||notes|| 
| boost | 1.61.0 | MIT | http://www.boost.org/users/license.html | See Note 1 below |
| pyxb | 1.2.4 | Apache 2.0 | http://pyxb.sourceforge.net/legal.html |  |
| eigen | 3.2 | MPL 2.0 | http://eigen.tuxfamily.org/index.php?title=Main_Page#License | See Note 2 below |

***Note 1***
PyXB: Python XML Schema Bindings
http://pyxb.sourceforge.net/

http://pyxb.sourceforge.net/legal.html
says
“PyXB as a whole is made available under the Apache License v 2.0.”
See also
https://github.com/pabigot/pyxb/blob/next/LICENSE
which has the standard Apache License Version 2.0 verbiage.

However, we noticed that it has a GPL 3.0 sub-component:
https://github.com/pabigot/pyxb/blob/next/doc/extapi.py
(This seems strange, but anyways…)

Remedy from Sept 2015:  
* At build time, we remove this GPL sub-component since it is not needed,
just to be 100% sure we are only including Apache License Version 2.0 software. 
 
* This is the reason there is a message during build that says “PyXB: Removing 
GPL component from code base”
.

*** Note 2 ***
Eigen
http://eigen.tuxfamily.org/

http://eigen.tuxfamily.org/index.php?title=Main_Page#License
says
“Eigen is Free Software. Starting from the 3.1.1 version, it is licensed under 
the MPL2…Note that currently, a few features rely on third-party code licensed 
under the LGPL: SimplicialCholesky, AMD ordering, and constrained_cg.”  

MADlib uses a later version than 3.1.1.

Remedy from Sept 2015, as per Roman's guidance at that time and as per 
http://www.apache.org/legal/resolved.html#category-b :
* Cloned Eigen header files and made needed changes, then maintained on a 
separate GitHub repo https://github.com/madlib/eigen
* This modified Eigen GitHub project carries forward the MPL license.  The 
commits showing changes made are here 
https://github.com/madlib/eigen/tree/branches/3.2
* At build time, MADlib project will then include the Eigen header files from 
this repo
* Also we use the EIGEN_MPL2_ONLY preprocessor symbol to explicitly exclude the 
LGPL components





was (Author: fmcquillan):
Here is a summary of the 3rd party components used.

1) Downloaded at build time:

||cpt||version||licence||license link||notes|| 
| boost | 1.61.0 | MIT | http://www.boost.org/users/license.html | See Note 1 below |
| pyxb | 1.2.4 | Apache 2.0 | http://pyxb.sourceforge.net/legal.html |  |
| eigen | 3.2 | MPL 2.0 | http://eigen.tuxfamily.org/index.php?title=Main_Page#License | See Note 2 below |

***Note 1***
PyXB: Python XML Schema Bindings
http://pyxb.sourceforge.net/

http://pyxb.sourceforge.net/legal.html
says
“PyXB as a whole is made available under the Apache License v 2.0.”
See also
https://github.com/pabigot/pyxb/blob/next/LICENSE
which has the standard Apache License Version 2.0 verbiage.

However, we noticed that it has a GPL 3.0 sub-component:
https://github.com/pabigot/pyxb/blob/next/doc/extapi.py
(This seems strange, but anyways…)

Remedy from Sept 2015:  
* At build time, we remove this GPL sub-component since it is not needed,
just to be 100% sure we are only including Apache License Version 2.0 software. 
 
* This is the reason there is a message during build that says “PyXB: Removing 
GPL component from code base”

*** Note 2 ***
Eigen
http://eigen.tuxfamily.org/

http://eigen.tuxfamily.org/index.php?title=Main_Page#License
says
“Eigen is Free Software. Starting from the 3.1.1 version, it is licensed under 
the MPL2…Note that currently, a few features rely on third-party code licensed 
under the LGPL: SimplicialCholesky, AMD ordering, and constrained_cg.”  

MADlib uses a later version than 3.1.1.

Remedy from Sept 2015, as per Roman's guidance at that time and as per 
http://www.apache.org/legal/resolved.html#category-b :
* Cloned Eigen header files and made needed changes, then maintained on a 
separate GitHub repo https://github.com/madlib/eigen
* This modified Eigen GitHub project carries forward the MPL license.  The 
commits showing changes made are here 
https://github.com/madlib/eigen/tree/branches/3.2
* At build time, MADlib project will then include the Eigen header files from 
this repo
* Also we use the EIGEN_MPL2_ONLY preprocessor symbol to explicitly exclude the 
LGPL components




> Review LICENSE file and README.md
> -
>
> Key: MADLIB-1076
> URL: https://issues.apache.org/jira/browse/MADLIB-1076
> Project: Apache MADlib
>  Issue Type: Task
>  Components: Documentation
>Reporter: Frank McQuillan
>Assignee: Frank 

[jira] [Comment Edited] (MADLIB-1076) Review LICENSE file and README.md

2017-04-18 Thread Frank McQuillan (JIRA)

[ 
https://issues.apache.org/jira/browse/MADLIB-1076?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15973765#comment-15973765
 ] 

Frank McQuillan edited comment on MADLIB-1076 at 4/19/17 12:02 AM:
---

Here is a summary of the 3rd party components used.

1) Downloaded at build time:

||cpt||version||licence||license link||notes|| 
| boost | 1.61.0 | MIT | http://www.boost.org/users/license.html | See Note 1 below |
| pyxb | 1.2.4 | Apache 2.0 | http://pyxb.sourceforge.net/legal.html |  |
| eigen | 3.2 | MPL 2.0 | http://eigen.tuxfamily.org/index.php?title=Main_Page#License | See Note 2 below |

***Note 1***
PyXB: Python XML Schema Bindings
http://pyxb.sourceforge.net/

http://pyxb.sourceforge.net/legal.html
says
“PyXB as a whole is made available under the Apache License v 2.0.”
See also
https://github.com/pabigot/pyxb/blob/next/LICENSE
which has the standard Apache License Version 2.0 verbiage.

However, we noticed that it has a GPL 3.0 sub-component:
https://github.com/pabigot/pyxb/blob/next/doc/extapi.py
(This seems strange, but anyways…)

Remedy from Sept 2015:  
* At build time, we remove this GPL sub-component since it is not needed,
just to be 100% sure we are only including Apache License Version 2.0 software. 
 
* This is the reason there is a message during build that says “PyXB: Removing 
GPL component from code base”

*** Note 2 ***
Eigen
http://eigen.tuxfamily.org/

http://eigen.tuxfamily.org/index.php?title=Main_Page#License
says
“Eigen is Free Software. Starting from the 3.1.1 version, it is licensed under 
the MPL2…Note that currently, a few features rely on third-party code licensed 
under the LGPL: SimplicialCholesky, AMD ordering, and constrained_cg.”  

MADlib uses a later version than 3.1.1.

Remedy from Sept 2015, as per Roman's guidance at that time and as per 
http://www.apache.org/legal/resolved.html#category-b :
* Cloned Eigen header files and made needed changes, then maintained on a 
separate GitHub repo https://github.com/madlib/eigen
* This modified Eigen GitHub project carries forward the MPL license.  The 
commits showing changes made are here 
https://github.com/madlib/eigen/tree/branches/3.2
* At build time, MADlib project will then include the Eigen header files from 
this repo
* Also we use the EIGEN_MPL2_ONLY preprocessor symbol to explicitly exclude the 
LGPL components





was (Author: fmcquillan):
Here is a summary of the 3rd party components used.

1) Downloaded at build time:

||cpt||version||licence||license link||notes|| 
| boost | 1.61.0 | MIT | http://www.boost.org/users/license.html | See Note 1 below |
| pyxb | 1.2.4 | Apache 2.0 | http://pyxb.sourceforge.net/legal.html |  |
| eigen | 3.2 | MPL 2.0 | http://eigen.tuxfamily.org/index.php?title=Main_Page#License | See Note 2 below |


> Review LICENSE file and README.md
> -
>
> Key: MADLIB-1076
> URL: https://issues.apache.org/jira/browse/MADLIB-1076
> Project: Apache MADlib
>  Issue Type: Task
>  Components: Documentation
>Reporter: Frank McQuillan
>Assignee: Frank McQuillan
>Priority: Minor
> Fix For: v1.11
>
>
> Comments from Ed Espino on 1.10 RC-2 review on thread
> https://mail-archives.apache.org/mod_mbox/incubator-madlib-user/201703.mbox/%3CCAHAuQDzarS7K4u-rOsLLhbwSHCyFn5cKSyjLinE%2BZ%3DjSpU59qw%40mail.gmail.com%3E
> {code}
> LICENSE
>   Shouldn't the components with files in licenses/third_party be
>   referenced in LICENSE file?
> Boost_Software_License_v1.txt
> Eigen_v3.1.2.txt
> PyXB_v1.2.3.txt
> PyYAML_v3.10.txt
> Python_License_v2.7.1.txt
> UseLATEX_v1.9.4.txt
> _M_widen_init.txt
> argparse_v1.2.1.txt
> From README.md, I only saw an incomplete reference to the third party 
> components.
>   Third Party Components
>   MADlib incorporates material from the following third-party components
>   
>   argparse 1.2.1 "provides an easy, declarative interface for creating 
> command line tools"
>   Boost 1.47.0 (or newer) "provides peer-reviewed portable C++ source 
> libraries"
>   Eigen 3.2.2 "is a C++ template library for linear algebra"
>   PyYAML 3.10 "is a YAML parser and emitter for Python"
>   PyXB 1.2.4 "is a Python library for XML Schema Bindings"
> {code}
> To dos:
> 1) Confirm that LICENSE file is up to date
> 2) Update README.md with any required 3rd party/licensing clarifications



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Comment Edited] (MADLIB-1076) Review LICENSE file and README.md

2017-04-18 Thread Frank McQuillan (JIRA)

[ 
https://issues.apache.org/jira/browse/MADLIB-1076?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15973765#comment-15973765
 ] 

Frank McQuillan edited comment on MADLIB-1076 at 4/19/17 12:00 AM:
---

Here is a summary of the 3rd party components used.

1) Downloaded at build time:

||cpt||version||licence||license link||notes|| 
| boost | 1.61.0 | MIT | http://www.boost.org/users/license.html | See Note 1 below |
| pyxb | 1.2.4 | Apache 2.0 | http://pyxb.sourceforge.net/legal.html |  |
| eigen | 3.2 | MPL 2.0 | http://eigen.tuxfamily.org/index.php?title=Main_Page#License | See Note 2 below |



was (Author: fmcquillan):
Here is a summary of the 3rd party components used.

1) Downloaded at build time:

||heading 1||heading 2||heading 3|| 
|col A1|col A2|col A3| 
|col B1|col B2|col B3|


> Review LICENSE file and README.md
> -
>
> Key: MADLIB-1076
> URL: https://issues.apache.org/jira/browse/MADLIB-1076
> Project: Apache MADlib
>  Issue Type: Task
>  Components: Documentation
>Reporter: Frank McQuillan
>Assignee: Frank McQuillan
>Priority: Minor
> Fix For: v1.11
>
>
> Comments from Ed Espino on 1.10 RC-2 review on thread
> https://mail-archives.apache.org/mod_mbox/incubator-madlib-user/201703.mbox/%3CCAHAuQDzarS7K4u-rOsLLhbwSHCyFn5cKSyjLinE%2BZ%3DjSpU59qw%40mail.gmail.com%3E
> {code}
> LICENSE
>   Shouldn't the components with files in licenses/third_party be
>   referenced in LICENSE file?
> Boost_Software_License_v1.txt
> Eigen_v3.1.2.txt
> PyXB_v1.2.3.txt
> PyYAML_v3.10.txt
> Python_License_v2.7.1.txt
> UseLATEX_v1.9.4.txt
> _M_widen_init.txt
> argparse_v1.2.1.txt
> From README.md, I only saw an incomplete reference to the third party 
> components.
>   Third Party Components
>   MADlib incorporates material from the following third-party components
>   
>   argparse 1.2.1 "provides an easy, declarative interface for creating 
> command line tools"
>   Boost 1.47.0 (or newer) "provides peer-reviewed portable C++ source 
> libraries"
>   Eigen 3.2.2 "is a C++ template library for linear algebra"
>   PyYAML 3.10 "is a YAML parser and emitter for Python"
>   PyXB 1.2.4 "is a Python library for XML Schema Bindings"
> {code}
> To dos:
> 1) Confirm that LICENSE file is up to date
> 2) Update README.md with any required 3rd party/licensing clarifications





[jira] [Commented] (MADLIB-1076) Review LICENSE file and README.md

2017-04-18 Thread Frank McQuillan (JIRA)

[ 
https://issues.apache.org/jira/browse/MADLIB-1076?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15973765#comment-15973765
 ] 

Frank McQuillan commented on MADLIB-1076:
-

Here is a summary of the 3rd party components used.

1) Downloaded at build time:

||heading 1||heading 2||heading 3|| 
|col A1|col A2|col A3| 
|col B1|col B2|col B3|


> Review LICENSE file and README.md
> -
>
> Key: MADLIB-1076
> URL: https://issues.apache.org/jira/browse/MADLIB-1076
> Project: Apache MADlib
>  Issue Type: Task
>  Components: Documentation
>Reporter: Frank McQuillan
>Assignee: Frank McQuillan
>Priority: Minor
> Fix For: v1.11
>
>
> Comments from Ed Espino on 1.10 RC-2 review on thread
> https://mail-archives.apache.org/mod_mbox/incubator-madlib-user/201703.mbox/%3CCAHAuQDzarS7K4u-rOsLLhbwSHCyFn5cKSyjLinE%2BZ%3DjSpU59qw%40mail.gmail.com%3E
> {code}
> LICENSE
>   Shouldn't the components with files in licenses/third_party be
>   referenced in LICENSE file?
> Boost_Software_License_v1.txt
> Eigen_v3.1.2.txt
> PyXB_v1.2.3.txt
> PyYAML_v3.10.txt
> Python_License_v2.7.1.txt
> UseLATEX_v1.9.4.txt
> _M_widen_init.txt
> argparse_v1.2.1.txt
> From README.md, I only saw an incomplete reference to the third party 
> components.
>   Third Party Components
>   MADlib incorporates material from the following third-party components
>   
>   argparse 1.2.1 "provides an easy, declarative interface for creating 
> command line tools"
>   Boost 1.47.0 (or newer) "provides peer-reviewed portable C++ source 
> libraries"
>   Eigen 3.2.2 "is a C++ template library for linear algebra"
>   PyYAML 3.10 "is a YAML parser and emitter for Python"
>   PyXB 1.2.4 "is a Python library for XML Schema Bindings"
> {code}
> To dos:
> 1) Confirm that LICENSE file is up to date
> 2) Update README.md with any required 3rd party/licensing clarifications





[jira] [Commented] (MADLIB-1057) Reduce memory footprint for DT

2017-04-18 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/MADLIB-1057?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15973311#comment-15973311
 ] 

ASF GitHub Bot commented on MADLIB-1057:


Github user iyerr3 commented on the issue:

https://github.com/apache/incubator-madlib/pull/117
  
No, that's a separate JIRA: MADLIB-1057. This one is just about
setting the defaults to a more reasonable value considering the data that
users have shared.

The commit is a little more than just changing two numbers since I updated
the way these defaults are set. Previously they were set in the overloaded
function declarations (in SQL). I changed this to set the defaults in the main
function definition, eliminating the redundancy.
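
As a rough analogy only (the actual change is in SQL function definitions), the difference between the two styles might look like the sketch below. All function names, parameter names and default values here are hypothetical placeholders and are not the ones changed in the pull request.

```python
# Hypothetical names and values, chosen only to illustrate the pattern.

# Before: every shorter "overload" repeats the defaults it fills in.
def train_v1(table, max_depth, n_bins):
    return {"table": table, "max_depth": max_depth, "n_bins": n_bins}


def train_v1_no_bins(table, max_depth):
    return train_v1(table, max_depth, 20)    # default for n_bins repeated here


def train_v1_minimal(table):
    return train_v1(table, 7, 20)            # ...and both defaults repeated here


# After: the main definition owns the defaults, so each value lives in one place.
def train_v2(table, max_depth=7, n_bins=20):
    return {"table": table, "max_depth": max_depth, "n_bins": n_bins}


if __name__ == "__main__":
    # Both styles give the same result; only the second has a single
    # place to edit when a default needs to change.
    print(train_v1_minimal("t") == train_v2("t"))
    print(train_v1_no_bins("t", 3) == train_v2("t", 3))
```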

Thanks,
Rahul



> Reduce memory footprint for DT
> --
>
> Key: MADLIB-1057
> URL: https://issues.apache.org/jira/browse/MADLIB-1057
> Project: Apache MADlib
>  Issue Type: Improvement
>  Components: Module: Decision Tree
>Reporter: Frank McQuillan
>Assignee: Rahul Iyer
> Fix For: v1.11
>
>
> Follow on from spike 
> https://issues.apache.org/jira/browse/MADLIB-1035
> Step 1
> As a madlib developer I want to recreate the RF memory issue (reported in 
> https://issues.apache.org/jira/browse/MADLIB-1035). 
> The current datasets we have are 
> dt_adult : 32K rows 14 columns
> ecommerce : 1M rows 4 columns (ecommerce isn’t actually suitable for DT/RF)
> We need a table with ~2.2M rows and ~130 features (the actual target table 
> has ~1300 features). Randomly filling them might help diagnosing the issue 
> but ideally we would want a somewhat sensible dataset. The problem seems to 
> involve relatively short trees (depth 5) which means a random dataset will 
> probably fill the whole tree which might not be true for a structured dataset.
> Step 2
> Refactoring DT for a smaller memory footprint.
> Tree Accumulator has 2 matrices for continuous and categorical variables. 
> The whole structure is recreated at every level. 
> Every matrix has 2^i rows (i is the level)
> The categorical matrix size depends on the total number of categories 
> (weather : {sunny, cloudy, rainy}, isWeekend : {true, false} means this total 
> is 3+2=5) 
> The continuous matrix size depends on the number of cont. features * the 
> number of bins.
> Tree accumulator works like an array not a linked list. Even if the output is 
> not a complete tree, the tree accumulator creates rows for nonexistent 
> branches in proper order and fills them with 0 values. 
> The refactored version would create a small index table that has the same 
> number of rows as the old tree accumulator (a complete tree) but only a 
> single index column that points to the new tree accumulator row. 
> This will allow us to keep most of the internal function interfaces the same, but 
> the code to access (read/write) the tree accumulator will have to change.
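
The sizing rules and the proposed index table in Step 2 can be sketched as follows. This is a simplified illustration under stated assumptions, not MADlib's TreeAccumulator: the class and function names are hypothetical, the statistics are plain lists rather than matrices, and the real column layout may include factors not described above.

```python
def stats_width(categories_per_feature, n_continuous, n_bins):
    """Return (categorical, continuous) column counts as described in Step 2."""
    return sum(categories_per_feature), n_continuous * n_bins


class DenseLevelStats:
    """Old layout: one row per node of a complete tree at `level`."""

    def __init__(self, level, n_cols):
        self.rows = [[0.0] * n_cols for _ in range(2 ** level)]

    def add(self, node, col, weight=1.0):
        self.rows[node][col] += weight


class SparseLevelStats:
    """Proposed layout: rows only for reachable nodes, plus an index table."""

    def __init__(self, level, reachable_nodes, n_cols):
        nodes = sorted(set(reachable_nodes))
        # The index table still has 2^level entries, but it is one column wide.
        self.lookup = [-1] * (2 ** level)
        for row, node in enumerate(nodes):
            if not 0 <= node < len(self.lookup):
                raise IndexError(f"node {node} is outside level {level}")
            self.lookup[node] = row
        self.rows = [[0.0] * n_cols for _ in nodes]

    def add(self, node, col, weight=1.0):
        row = self.lookup[node]
        if row < 0:
            raise KeyError(f"node {node} is not reachable")
        self.rows[row][col] += weight


if __name__ == "__main__":
    # weather has 3 categories and isWeekend has 2, as in the description;
    # 4 continuous features with 10 bins each is an assumed example.
    cat_cols, cont_cols = stats_width([3, 2], n_continuous=4, n_bins=10)
    level = 5
    dense = DenseLevelStats(level, cat_cols)
    sparse = SparseLevelStats(level, reachable_nodes=[0, 3, 17], n_cols=cat_cols)
    sparse.add(17, 2)
    print(cat_cols, cont_cols)                  # column counts
    print(len(dense.rows), len(sparse.rows))    # rows allocated by each layout
```

Both classes expose the same `add(node, col, weight)` call, which mirrors the point that most interfaces can stay the same while only the read/write path goes through the index table.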


