Re: [VOTE]: MADlib repo(s) migration

2017-08-09 Thread Orhan Kislal
1

Orhan Kislal

On Wed, Aug 9, 2017 at 2:32 PM, Nandish Jayaram  wrote:

> Hi All,
>
> With MADlib's graduation to TLP, it's time to migrate its github
> repos from `*incubator-madlib*` to `*madlib*`. We will have to open
> an Apache Infrastructure ticket to request this move for the following
> repos (along with other stuff like wiki, jenkins etc):
> https://git1-us-west.apache.org/repos/asf?p=incubator-madlib.git
>  (Read/Write)
> https://github.com/apache/incubator-madlib (Github mirror- read only)
> https://git1-us-west.apache.org/repos/asf?p=incubator-madlib-site.git
> https://github.com/apache/incubator-madlib-site (GitHub mirror)
>
> There are two ways to go about this, and the Infra ticket has to be
> raised accordingly.
> 1) Just maintain the current set-up, but have the repos renamed from
> incubator-madlib to madlib.
> 2) Use Gitbox to enable github repo as a R/W repo and not just read-only.
> Check this email (
> https://mail-archives.apache.org/mod_mbox/incubator-madlib-
> dev/201708.mbox/%3cCA+ULb+vP0ViWH4Nc=4eaXvbT0KOmeFtQzp4eAa3p0fKPP7c
> 8...@mail.gmail.com%3e)
> for further information.
>
> Please vote your preference and we can decide to move accordingly.
>
> NJ
>


Re: Question regarding tracking Release v1.12 specific changes.

2017-07-13 Thread Orhan Kislal
Hi Ed,

We have two sql files in the src/madpack folder to track the changes:
diff_udf and diff_udt. To use them, you install the old version of madlib
in the schema "madlib_old_vers" and the new version in the default schema
"madlib". With this setup, running these sql files automatically gives you
the changed UDTs and UDFs. The rest is handled manually.
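
As a rough sketch of the setup described above, the commands might look like the following. The function only prints the commands so they can be reviewed before anything is run. The `.sql` file extensions, the madpack paths, the connection string, the database name, and the use of madpack's `-s` schema option are assumptions made for illustration, not details confirmed in this thread.

```shell
#!/bin/sh
# Print (do not run) the commands for comparing two MADlib installs.
# Assumptions: both madpack paths, the connection string and the database
# name are placeholders, and the diff scripts carry a .sql suffix.
print_diff_commands() {
    old_madpack=$1   # madpack binary of the previous release (hypothetical path)
    new_madpack=$2   # madpack binary of the new release (hypothetical path)
    conn=$3          # madpack connection string, e.g. user@host/database
    db=$4            # database name handed to psql

    # The old release goes into the schema the diff scripts expect.
    echo "$old_madpack -p postgres -c $conn -s madlib_old_vers install"
    # The new release goes into the default schema.
    echo "$new_madpack -p postgres -c $conn install"
    # Each script reports the UDTs / UDFs that differ between the two schemas.
    echo "psql -d $db -f src/madpack/diff_udt.sql"
    echo "psql -d $db -f src/madpack/diff_udf.sql"
}

print_diff_commands /opt/madlib-old/bin/madpack /usr/local/madlib/bin/madpack \
    testuser@127.0.0.1/testing testing
```

The psql paths are relative to a source checkout; anything the scripts do not report still has to be tracked manually.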

I hope it helps, please let me know if there is anything else I can help
with.

Thanks,

Orhan


On Thu, Jul 13, 2017 at 12:14 PM, Ed Espino  wrote:

> MADlib dev,
>
> While reviewing a PR, I noticed the doc top-level generated html page has a
> reference to previous releases
> (https://madlib.incubator.apache.org/docs/latest/). I found the
> corresponding v1.11 changelist
> (https://github.com/apache/incubator-madlib/commit/648b05798826956e9621027447af501c194392b8)
> which updated the previous release versions to include v1.10 (in addition
> to other v1.11 related changes). Aside from this PR, how does the project
> track these types of release changes (Jira, wiki, email, other)? I could
> not find a reference to them in the project's wiki. It might be staring me
> in the face so I apologize if it is. Any guidance is greatly appreciated.
>
> -=e
>
> p.s. And yes, this is in preparation for my v1.12 release manager role.
> Gaining context comes at a price.
>
> --
> *Ed Espino*
>


Re: Installation issue - OSError: [Errno 2] No such file or directory: '/usr/local/madlib/Versions/1.10.0/ports/postgres'

2017-05-09 Thread Orhan Kislal
Hi MADlib community,

I sincerely apologize for the error. The RPM for MADlib v1.10.0 and its
signatures have been updated. Note that the new features/improvements of
v1.11 will not be available with this file (feel free to check the v1.11
RPM Rahul and Frank mentioned for them). Please let us know if you have any
questions.

Thanks,

Orhan Kislal


On Tue, May 9, 2017 at 9:47 AM, Frank McQuillan 
wrote:

> Apologies on the RPM error, Atsushi.
>
> The 1.11 release candidate is posted now
> https://dist.apache.org/repos/dist/dev/incubator/madlib/1.11-incubating-rc3/
> so you could use that RPM.
>
> As Rahul mentioned, we are in the process of voting on the 1.11 release
> currently.  However, I don't expect it to change.  Official release will
> hopefully happen later this week.
>
> Frank
>
>
>
> On Tue, May 9, 2017 at 9:13 AM, Rahul Iyer  wrote:
>
>> +dev for the problem with RPM
>>
>> Hi Atsushi,
>>
>> Thanks for bringing this to our notice!
>>
>> We might have to remove the 1.10 binary from the Apache dist to prevent
>> others from having this problem. We're in the process of releasing 1.11 and
>> would redirect to that binary once that goes through the voting process.
>>
>> @Louis, could you please clear the `/usr/local/madlib` folder and try again
>> with pgxn (or compiling from source as suggested by Markus)?
>>
>>
>>
>>
>> On Mon, May 8, 2017 at 11:53 PM, Neki, Atsushi <
>> neki.atsu...@jp.fujitsu.com>
>> wrote:
>>
>> > Hi Louis, Rahul,
>> >
>> > It seems that the installation using the RPM binary doesn’t work for 1.10.0.
>> > The RPM doesn’t have anything but hawq under the ports directory.
>> >
>> > $ rpm -qlpi ./apache-madlib-1.10.0-incubating-bin-Linux.rpm | grep ports
>> > /usr/local/madlib/Versions/1.10.0/ports/hawq
>> >   (snip)
>> >
>> > For the 1.9.1 RPM binary, this is not the case:
>> >
>> > /usr/local/madlib/Versions/1.9.1/ports/greenplum
>> > /usr/local/madlib/Versions/1.9.1/ports/hawq
>> > /usr/local/madlib/Versions/1.9.1/ports/postgres
>> >
>> > Unfortunately, I have no idea about the problem with pgxn.
>> >
>> > Regards,
>> > Atsushi Neki
>> >
>> > *From:* Markus Paaso [mailto:markus.pa...@gmail.com]
>> > *Sent:* Saturday, May 6, 2017 2:02 PM
>> > *To:* u...@madlib.incubator.apache.org
>> > *Subject:* Re: Installation issue - OSError: [Errno 2] No such file or
>> > directory: '/usr/local/madlib/Versions/1.10.0/ports/postgres'
>> >
>> > Hi Louis,
>> >
>> > I have installed madlib on Ubuntu 16.04 using the following commands:
>> >
>> > PSQL_HOST="127.0.0.1"
>> > PSQL_DB="testing"
>> > PSQL_USER="testuser"
>> > PSQL_PASS=""
>> >
>> > psql -h $PSQL_HOST template1 -c "CREATE ROLE $PSQL_USER PASSWORD '$PSQL_PASS'"
>> > createdb -h $PSQL_HOST $PSQL_DB -O $PSQL_USER
>> >
>> > sudo apt install -y cmake m4
>> > wget https://github.com/apache/incubator-madlib/archive/rel/v1.10.0.tar.gz
>> > tar -xzf v1.10.0.tar.gz
>> > cd incubator-madlib-rel-v1.10.0
>> > ./configure
>> > cd build
>> > make
>> > sudo make install
>> >
>> > MADLIB_USER="mad"
>> > MADLIB_PASS="$(openssl rand -base64 32)"
>> > psql -h $PSQL_HOST $PSQL_DB -c "CREATE USER $MADLIB_USER SUPERUSER PASSWORD '$MADLIB_PASS'"
>> > PGPASSWORD="$MADLIB_PASS" /usr/local/madlib/bin/madpack -p postgres -c $MADLIB_USER@$PSQL_HOST/$PSQL_DB install
>> >
>> > psql -h $PSQL_HOST $PSQL_D

Re: [VOTE] MADlib v1.11-rc3

2017-05-05 Thread Orhan Kislal
+1

On Fri, May 5, 2017 at 10:15 AM, Joseph Hellerstein <
hellerst...@berkeley.edu> wrote:

> dmg install smooth with OSX Postgres.app. Looks clean.
>
> +1
>
> On Fri, May 5, 2017 at 9:34 AM, Frank McQuillan 
> wrote:
>
> > I just want to comment on a couple items raised in the RC1 and RC2 votes
> > that pertain to RC3:
> >
> > (1)
> > “I happened to open the file "CMakeLists.txt" in the root directory
> > and noticed it does not have the standard ASF header. I know there
> > were IP issues resolved globally for the project recently. I
> > noticed many of them are excluded in the pom.xml file. Regardless
> > of the IP issues, shouldn't these files contain the ASF header?”
> >
> > Since this file existed before MADlib’s move to ASF, it does not need an
> > ASF header as per the guidance from ASF on this topic
> > https://issues.apache.org/jira/browse/LEGAL-293
> >
> > (2)
> > “The DMG(apache-madlib-1.11-incubating-bin-Darwin.dmg) contains a
> > pkg file named "madlib-1.11-Darwin.pkg". Shouldn't it be called
> > "apache-madlib-1.11-incubating-Darwin.pkg"?
> >
> > Similarly, the DMG base folder name is madlib-1.11.Darwin.“
> >
> > As per guidance from Roman our mentor, it is not necessary to rename all
> > packages and files.  Also, this may affect some functional tests that look
> > for certain file names.
> >
> > (3)
> > “There are still three outstanding Jira issues in an "Unresolved" state
> > with a fix version of v1.11.  Are they going to be resolved soon? They can
> > be seen with the following url:
> >
> > https://issues.apache.org/jira/browse/MADLIB/fixforversion/12339592/?selectedTab=com.atlassian.jira.jira-projects-plugin:version-summary-panel
> > ”
> >
> > Regarding the JIRAs that are not closed, the actual work has been done so
> > there is nothing material pending.  But I did not close them because I
> > wanted Roman to do that, since he was the one overseeing them.
> >
> > (4)
> > Convenience binaries are being voted on, as Rashmi’s email calls out.
> >
> > (5)
> > I tried out the RC3 dmg and found that install, reinstall, upgrade work
> > fine with the soft link on my OS X box on PG 9.6
> >
> > So...
> >
> >
> > +1
> >
> >
> >
> >
> > On Thu, May 4, 2017 at 6:10 PM, Rashmi Raghu  wrote:
> >
> > > Hello MADlib community,
> > >
> > > We have created a MADlib 1.11 RC-3, with the artifacts below (source and
> > > convenience binaries) up for a vote.
> > >
> > > Note that voting for the RC-2 release has been cancelled due to the need
> > > for minor corrections based on community feedback. Sorry for the
> > > inconvenience.
> > > RC-3 replaces RC-2 with the following minor changes:
> > > * Ensure product naming is consistently 'Apache MADlib (incubating)'
> > > * Git revision tag changed to rc/1.11-rc3
> > >
> > > This will be the 5th release for Apache MADlib (incubating).
> > >
> > > The main goals of this release are:
> > > * new module (PageRank for graph analytics with grouping support included)
> > > * improvements to existing modules (add grouping support to Single Source
> > > Shortest Path, reduce memory footprint of DT and RF, include NULL features
> > > in training DT, add support for array and svec output for Pivot module,
> > > utility to unnest 2-D arrays into rows of 1-D arrays)
> > > * platform updates (GPDB 5)
> > > * updates for Apache Top Level Project readiness and build process on
> > > Apache infrastructure
> > > * bug fixes
> > > * doc improvements
> > >
> > > For more information including release notes, please see:
> > > https://cwiki.apache.org/confluence/display/MADLIB/MADlib+1.11
> > >
> > > *** Please download, review and vote by Tue May 09, 2017 @ 6pm PDT ***
> > >
> > > We're voting upon the source and convenience binaries below:
> > >
> > > Source Repository (tag):  rc/1.11-rc3
> > > https://github.com/apache/incubator-madlib/tree/rc/1.11-rc3
> > >
> > > Source Files and convenience Binaries:
> > > https://dist.apache.org/repos/dist/dev/incubator/madlib/1.11-incubating-rc3/
> > >
> > > Commit:
> > > https://github.com/apache/incubator-madlib/commit/8e2778a3921aa99f009962756881ce4bea5eee16
> > >
> > > KEYS file containing PGP Keys we use to sign the release:
> > > https://dist.apache.org/repos/dist/dev/incubator/madlib/KEYS
> > >
> > > To help in tallying the vote, PMC members please be sure to indicate
> > > "(binding)" with the vote.
> > >
> > > [ ] +1  approve
> > > [ ] +0  no opinion
> > > [ ] -1  disapprove (and reason why)
> > >
> > >
> > > Regards,
> > > Rashmi Raghu
> > >
> > > --
> > > Rashmi Raghu, Ph.D.
> > > Pivotal Data Science
> > >
> >
>


Re: [VOTE] MADlib v1.11-rc1

2017-05-01 Thread Orhan Kislal
+1

On Mon, May 1, 2017 at 4:25 PM, Rahul Iyer  wrote:

> +1
>
> On May 1, 2017 3:55 PM, "Rashmi Raghu"  wrote:
>
> > Hello MADlib community,
> >
> > We have created a MADlib 1.11 RC-1, with the artifacts below up for a vote.
> >
> > This will be the 5th release for Apache MADlib (incubating).
> >
> > The main goals of this release are:
> > * new module (PageRank for graph analytics with grouping support included)
> > * improvements to existing modules (add grouping support to Single Source
> > Shortest Path, reduce memory footprint of DT and RF, include NULL features
> > in training DT, add support for array and svec output for Pivot module,
> > utility to unnest 2-D arrays into rows of 1-D arrays)
> > * platform updates (GPDB 5)
> > * updates for Apache Top Level Project readiness and build process on
> > Apache infrastructure
> > * bug fixes
> > * doc improvements
> >
> > For more information including release notes, please see:
> > https://cwiki.apache.org/confluence/display/MADLIB/MADlib+1.11
> >
> > *** Please download, review and vote by Thu May 04, 2017 @ 6pm PDT ***
> >
> > We're voting upon the source (tag):  rc/1.11-rc1
> > https://github.com/apache/incubator-madlib/tree/rc/1.11-rc1
> >
> > Source Files:
> > https://dist.apache.org/repos/dist/dev/incubator/madlib/1.11-incubating-rc1/
> >
> > Commit to be voted upon:
> > https://github.com/apache/incubator-madlib/commit/0ff829a7060d08f284e8468ebf35c31b6e231d58
> >
> > KEYS file containing PGP Keys we use to sign the release:
> > https://dist.apache.org/repos/dist/dev/incubator/madlib/KEYS
> >
> > To help in tallying the vote, PMC members please be sure to indicate
> > "(binding)" with the vote.
> >
> > [ ] +1  approve
> > [ ] +0  no opinion
> > [ ] -1  disapprove (and reason why)
> >
> >
> > Regards,
> > Rashmi Raghu
> >
> > --
> > Rashmi Raghu, Ph.D.
> > Pivotal Data Science
> >
>


Graph SSSP Scale Tests

2017-04-05 Thread Orhan Kislal
Hello MADlib community,



We have been doing some additional scale testing on SSSP introduced in the
1.10 release

http://madlib.incubator.apache.org/docs/latest/group__grp__sssp.html



A sample of results, going up to 100M vertices and 5B edges can be found in
the following links:


https://drive.google.com/file/d/0B62dTQMossK9eml5LV9EZ09LcmM/
view?usp=sharing

https://drive.google.com/file/d/0B62dTQMossK9dU1rSEs1TTBZN1U/
view?usp=sharing


So scaling looks pretty good.



Please let me know if you have any comments.


Orhan Kislal



Re: [VOTE] MADlib v1.10-rc2

2017-03-03 Thread Orhan Kislal
+1

On Fri, Mar 3, 2017 at 4:14 PM, Rahul Iyer  wrote:

> +1
>
> On Fri, Mar 3, 2017 at 11:17 AM, Frank McQuillan 
> wrote:
>
> > Hello MADlib community,
> >
> > I am sending this email on behalf of the release manager Satoshi Nagayasu
> > <sn...@uptime.jp>.
> >
> > We have created a MADlib 1.10 RC-2, with the artifacts below up for a vote.
> >
> > From project mentor Roman Shaposhnik we heard the ultimate resolution on
> > the IP issue:
> >* we don't do anything with existing (BSD) files even if we edit them
> >* every new file we create gets an ASF license header
> >* more details:
> >
> > https://issues.apache.org/jira/browse/LEGAL-293?focusedCommentId=15881595&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15881595
> >
> > RC-2 replaces RC-1 with the following changes:
> >
> > * Multiple: Update license headers per Apache guidance
> > https://github.com/apache/incubator-madlib/commit/a3863b6c2407eb28ba007f6288d167bf88674e6d
> >
> > * Build: Fix module sort order for PGXN installation
> > https://github.com/apache/incubator-madlib/commit/fa80240f72a6551c2ee567d471afa499fd1d1efe
> >
> > * Update the copyright year.
> > https://github.com/apache/incubator-madlib/commit/0b8415e7eec5c9ebb83fbf22923c69a99b0056ef
> >
> > * Build: Add error for missing server includedir
> > https://github.com/apache/incubator-madlib/commit/b3495c50bf491139ac245a21d97963e81892c610
> >
> > * Encode categorical: Add distributed_by in Postgresql w/ no-op
> > https://github.com/apache/incubator-madlib/commit/7055dceb3fbde35bae602ac80d4b70486f015748
> >
> > * Renamed the top level source directory as suggested:
> > apache-madlib-src-1.10-incubating
> >
> > This will be the 4th release for Apache MADlib (incubating).
> >
> > The main goals of this release are:
> > * new modules (single source shortest path for graph analytics, encode
> > categorical variables, K-nearest neighbors)
> > * improvements to existing modules (add grouping support to elastic
> > net and PCA, add cross validation to elastic net, array input for
> > K-means, verbose output option for DT and RF, limit itemset size in
> > association rules, various madpack installer improvements)
> > * platform updates (PostgreSQL 9.6)
> > * bug fixes
> > * doc improvements
> >
> > For more information including release notes, please see:
> > https://cwiki.apache.org/confluence/display/MADLIB/MADlib+1.10
> >
> > *** Please download, review and vote by Mon Mar 6, 2017 @ 6pm Pacific Time
> > USA ***
> >
> > We're voting upon the source (tag):  rc/1.10.0-rc2
> > https://github.com/apache/incubator-madlib/tree/rc/1.10.0-rc2
> >
> > Source Files:
> > https://dist.apache.org/repos/dist/dev/incubator/madlib/1.10.0-incubating-rc2/
> >
> > Commit to be voted upon:
> > https://github.com/apache/incubator-madlib/commit/a3863b6c2407eb28ba007f6288d167bf88674e6d
> >
> > KEYS file containing PGP Keys we use to sign the release:
> > https://dist.apache.org/repos/dist/dev/incubator/madlib/KEYS
> >
> > To help in tallying the vote, can PMC members please be sure to
> > indicate "(binding)" with their vote.
> >
> > [ ] +1  approve
> > [ ] +0  no opinion
> > [ ] -1  disapprove (and reason why)
> >
> > Regards,
> > Frank McQuillan
> >
>


Re: [VOTE] MADlib v1.10-rc1

2017-02-16 Thread Orhan Kislal
Hi Ed,

Thanks for the review. One of the comments from the previous release was a
preference towards a signature with an Apache id. Since Satoshi-san is not
an Apache committer yet, I took care of the signing process.

Thanks,

Orhan Kislal

On Thu, Feb 16, 2017 at 11:58 AM, Ed Espino  wrote:

> A few MADlib v1.10-rc1 observations from a HAWQ incubator committer.
>
>- The Copyright year (2016) in the NOTICE file needs to be updated to
>2017. I believe this can be handled in next release.
>- As it still applies, similar to a past comment by Roman ([VOTE] MADlib
>v1.9.1-rc2
><https://lists.apache.org/thread.html/981b4c24eaa2ab069b8e18f7aa4bdd
> c7a78d3a9dc26bf659af94fcfe@%3Cgeneral.incubator.apache.org%3E>)
>- *"* name of the top level folder in the archive is weird. The usual
>practice is to call the top level folder as - ID>*"*
> (example: *apache-madlib-src-1.10-incubating* instead of
>*incubator-madlib*)
>- I'm more curious than anything. Why did Orhan sign the release? I was
>expecting the release manager (Satoshi Nagayasu) to have signed the
> release.
>- Checksums and PGP signature are good.
>-  ASF headers check: I spot checked files added (git whatchanged
>--diff-filter=A) since the last release. ASF headers look good.  Nice
> Job!
>
> I was going to try and build but I ran past my allotted time limit for this
> review. Hopefully, I can try this soon.
>
> Regards,
> -=ed espino
>
> On Thu, Feb 16, 2017 at 10:05 AM, Orhan Kislal  wrote:
>
> > +1
> >
> > Orhan Kislal
> >
> > On Thu, Feb 16, 2017 at 9:23 AM, Joe Hellerstein <
> hellerst...@berkeley.edu
> > >
> > wrote:
> >
> > > +1
> > >
> > > Sent from a telephone.
> > >
> > > > On Feb 16, 2017, at 9:17 AM, Frank McQuillan 
> > > wrote:
> > > >
> > > > +1
> > > >
> > > > Frank McQuillan
> > > >
> > > >> On Wed, Feb 15, 2017 at 7:27 PM, Satoshi Nagayasu 
> > > wrote:
> > > >>
> > > >> Hello MADlib community,
> > > >>
> > > >> We have created a MADlib 1.10 RC-1, with the artifacts below up for
> a
> > > vote.
> > > >>
> > > >> This will be the 4th release for Apache MADlib (incubating).
> > > >>
> > > >> The main goals of this release are:
> > > >> * new modules (single source shortest path for graph analytics,
> encode
> > > >> categorical variables, K-nearest neighbors)
> > > >> * improvements to existing modules (add grouping support to elastic
> > > >> net and PCA, add cross validation to elastic net, array input for
> > > >> K-means, verbose output option for DT and RF, limit itemset size in
> > > >> association rules, various madpack installer improvements)
> > > >> * platform updates (PostgreSQL 9.6)
> > > >> * bug fixes
> > > >> * doc improvements
> > > >>
> > > >> For more information including release notes, please see:
> > > >> https://cwiki.apache.org/confluence/display/MADLIB/MADlib+1.10
> > > >>
> > > >> *** Please download, review and vote by Sat Feb 18, 2017 @ 6pm PST
> ***
> > > >>
> > > >> We're voting upon the source (tag):  rc/1.10.0-rc1
> > > >> https://github.com/apache/incubator-madlib/tree/rc/1.10.0-rc1
> > > >>
> > > > >> Source Files:
> > > > >> https://dist.apache.org/repos/dist/dev/incubator/madlib/1.10.0-incubating-rc1/
> > > > >>
> > > > >> Commit to be voted upon:
> > > > >> https://github.com/apache/incubator-madlib/commit/ea17530bfe22a1fde173d7fa83508cbcd9924c20
> > > >>
> > > >> KEYS file containing PGP Keys we use to sign the release:
> > > >> https://dist.apache.org/repos/dist/dev/incubator/madlib/KEYS
> > > >>
> > > >> To help in tallying the vote, can PMC members please be sure to
> > > >> indicate "(binding)" with their vote.
> > > >>
> > > >> [ ] +1  approve
> > > >> [ ] +0  no opinion
> > > >> [ ] -1  disapprove (and reason why)
> > > >>
> > > >> --
> > > >> Satoshi Nagayasu 
> > > >>
> > >
> >
>
>
>
> --
> *Ed Espino*
> *esp...@apache.org *
>


Re: [VOTE] MADlib v1.10-rc1

2017-02-16 Thread Orhan Kislal
+1

Orhan Kislal

On Thu, Feb 16, 2017 at 9:23 AM, Joe Hellerstein 
wrote:

> +1
>
> Sent from a telephone.
>
> > On Feb 16, 2017, at 9:17 AM, Frank McQuillan 
> wrote:
> >
> > +1
> >
> > Frank McQuillan
> >
> >> On Wed, Feb 15, 2017 at 7:27 PM, Satoshi Nagayasu 
> wrote:
> >>
> >> Hello MADlib community,
> >>
> >> We have created a MADlib 1.10 RC-1, with the artifacts below up for a
> vote.
> >>
> >> This will be the 4th release for Apache MADlib (incubating).
> >>
> >> The main goals of this release are:
> >> * new modules (single source shortest path for graph analytics, encode
> >> categorical variables, K-nearest neighbors)
> >> * improvements to existing modules (add grouping support to elastic
> >> net and PCA, add cross validation to elastic net, array input for
> >> K-means, verbose output option for DT and RF, limit itemset size in
> >> association rules, various madpack installer improvements)
> >> * platform updates (PostgreSQL 9.6)
> >> * bug fixes
> >> * doc improvements
> >>
> >> For more information including release notes, please see:
> >> https://cwiki.apache.org/confluence/display/MADLIB/MADlib+1.10
> >>
> >> *** Please download, review and vote by Sat Feb 18, 2017 @ 6pm PST ***
> >>
> >> We're voting upon the source (tag):  rc/1.10.0-rc1
> >> https://github.com/apache/incubator-madlib/tree/rc/1.10.0-rc1
> >>
> >> Source Files:
> >> https://dist.apache.org/repos/dist/dev/incubator/madlib/1.10.0-incubating-rc1/
> >>
> >> Commit to be voted upon:
> >> https://github.com/apache/incubator-madlib/commit/ea17530bfe22a1fde173d7fa83508cbcd9924c20
> >>
> >> KEYS file containing PGP Keys we use to sign the release:
> >> https://dist.apache.org/repos/dist/dev/incubator/madlib/KEYS
> >>
> >> To help in tallying the vote, can PMC members please be sure to
> >> indicate "(binding)" with their vote.
> >>
> >> [ ] +1  approve
> >> [ ] +0  no opinion
> >> [ ] -1  disapprove (and reason why)
> >>
> >> --
> >> Satoshi Nagayasu 
> >>
>


Re: 1.10 release status and release manager

2017-02-02 Thread Orhan Kislal
Hi Satoshi-san,

I wanted to follow up on the release tasks. Our RAT file (release audit
tool - pom.xml) is quite large and it might be hard to track the correct
license headers for the new files. I also think that it might be a good
idea to provide a list of new and changed files for the reviewers'
convenience. Please let us know if you need a hand with anything.
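
One possible way to spot-check license headers on a list of files is sketched below. It assumes the standard ASF header wording ("Licensed to the Apache Software Foundation") and a grep that supports `-L`, as GNU and BSD grep do; the function name is made up for this example and is not part of the project's tooling.

```shell
#!/bin/sh
# Print the files (given as arguments) that do not contain the ASF header phrase.
# grep -L lists the names of files with no matching line.
files_missing_asf_header() {
    # Nothing to check when no files are passed.
    [ "$#" -gt 0 ] || return 0
    # "|| true" keeps the function from failing on grep's exit status,
    # which differs between grep versions when -L is used.
    grep -L 'Licensed to the Apache Software Foundation' "$@" || true
}
```

The file list could come from something like `git diff --name-only --diff-filter=A <previous-release-tag> <rc-tag>`, where both tag names are placeholders for the actual tags.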

Thanks

Orhan

On Fri, Jan 27, 2017 at 8:30 PM, Satoshi Nagayasu  wrote:

> Hi Frank and all,
>
> Thank you for involving me in the development process.
> I believe that In-Database Analytics is one of the next big things and
> MADlib looks very special to me.
> I'm happy to be able to contribute to the product and its community.
> Let's make the next release great. :)
>
> Regards,
>
> 2017-01-28 6:36 GMT+09:00 Frank McQuillan :
> > MADlib community,
> >
> > We are getting fairly close to completing the software for the 1.10 release
> > and putting up an RC.
> >
> > The PR list is getting smaller as we review and complete testing
> > https://github.com/apache/incubator-madlib/pulls
> >
> > Satoshi Nagayasu
> > satoshi.nagay...@gmail.com
> > https://github.com/snaga
> > has graciously offered to be the release manager for 1.10.  Thank you very
> > much Satoshi for your help!
> >
> > Regards,
> > Frank
>
> --
> Satoshi Nagayasu 
>


Upgrade support

2017-02-01 Thread Orhan Kislal
Dear MADlib community,

I started working on the upgrade support for our upcoming release (MADlib
1.10.0) and made some progress. Historically, MADlib supported upgrades
from any 1.x version. However, with every version, this task becomes more
and more time-consuming. Note that all upgrades have to be tested on 6
platforms (the last 2 versions of Postgres, Greenplum and HAWQ). I believe we
can drop support for upgrades from versions prior to 1.8, but I wanted to consult
with you before taking this action. This change will not disable upgrade
for older versions entirely. The upgrade might not give proper error
messages but it should still work if there are no dependencies. In
addition, it is possible to follow an upgrade chain 1.x -> 1.9.1 -> 1.10.0.
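
To make the proposed rule concrete, here is a small sketch that prints the upgrade path for a given installed version. It takes the 1.8 cutoff and the 1.9.1 intermediate step from the proposal above; the function name is hypothetical, and `sort -V` (version sort) is assumed to be available, as it is in GNU coreutils.

```shell
#!/bin/sh
# Print the upgrade path to 1.10.0 for an installed MADlib version.
# Versions 1.8 and later upgrade directly; older ones go through 1.9.1.
upgrade_path() {
    installed=$1
    # Version-sort the installed version against the cutoff; if 1.8 comes
    # out first (or ties), the installed version is 1.8 or newer.
    lowest=$(printf '%s\n%s\n' "$installed" "1.8" | sort -V | head -n 1)
    if [ "$lowest" = "1.8" ]; then
        echo "$installed -> 1.10.0"
    else
        echo "$installed -> 1.9.1 -> 1.10.0"
    fi
}

upgrade_path 1.7.1
upgrade_path 1.9.1
```

This only illustrates the decision; the actual upgrade would still be run with madpack at each step of the chain.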

Please let us know if this change is not reasonable.

Thanks

Orhan Kislal


Re: 1.10 release status and release manager

2017-01-27 Thread Orhan Kislal
Hi MADlib,

Thank you Satoshi-san for taking on this responsibility. I have some
experience with this and I would be happy to help with any step of the
process.

Best,

Orhan Kislal

On Fri, Jan 27, 2017 at 1:36 PM, Frank McQuillan 
wrote:

> MADlib community,
>
> We are getting fairly close to completing the software for the 1.10 release
> and putting up an RC.
>
> The PR list is getting smaller as we review and complete testing
> https://github.com/apache/incubator-madlib/pulls
>
> Satoshi Nagayasu
> satoshi.nagay...@gmail.com
> https://github.com/snaga
> has graciously offered to be the release manager for 1.10.  Thank you very
> much Satoshi for your help!
>
> Regards,
> Frank
>


Re: [VOTE] MADlib v1.9.1-rc2

2016-09-06 Thread Orhan Kislal
+1

On Tue, Sep 6, 2016 at 9:50 AM, Frank McQuillan 
wrote:

> Gentle reminder to vote on this release.  Voting ends at 6 pm Pacific time
> today.
>
> Thanks,
> Frank
>
> On Fri, Sep 2, 2016 at 10:26 AM, Frank McQuillan 
> wrote:
>
> > Hello MADlib community,
> >
> > We have created a MADlib 1.9.1 RC-2, with the artifacts below up for a
> > vote.
> >
> > This release candidate replaces RC-1.  The only difference between RC-1
> > and RC-2 is that the '._' files sneaked in by OSX during the packaging
> > of RC-1 have been removed.
> >
> > This will be the 3rd release for Apache MADlib (incubating).
> >
> > The main goals of this release are:
> > * new modules (1-class SVM for novelty detection, prediction metrics,
> > sessionization, pivoting)
> > * improvements to existing modules (class weights in SVM, overlapping
> > patterns in path)
> > * performance improvements (path)
> > * platform updates (PostgreSQL 9.5 and 9.6)
> > * bug fixes
> > * doc improvements
> >
> > For more information including release notes, please see:
> > https://cwiki.apache.org/confluence/display/MADLIB/MADlib+1.9.1
> >
> > *** Please download, review and vote by Tues Sep 6, 2016 @ 6pm PST ***
> >
> > We're voting upon the source (tag):  rc/1.9.1-rc2
> >
> > Source Files:
> > https://dist.apache.org/repos/dist/dev/incubator/madlib/1.9.1-incubating-rc2
> >
> > Commit to be voted upon:
> > https://git-wip-us.apache.org/repos/asf?p=incubator-madlib.git;a=commit;h=e1c99c1538dc124c9b323ba76382ba2af05c6892
> >
> > KEYS file containing PGP Keys we use to sign the release:
> > https://dist.apache.org/repos/dist/dev/incubator/madlib/KEYS
> >
> > To help in tallying the vote, can PMC members please be sure to indicate
> > "(binding)" with their vote.
> >
> > [ ] +1  approve
> > [ ] +0  no opinion
> > [ ] -1  disapprove (and reason why)
> >
> > Thank you,
> > Frank McQuillan
> >
> >
>


Re: [VOTE] MADlib v1.9.1-rc1

2016-09-02 Thread Orhan Kislal
I am not an Apache veteran but I think it should be a new RC. Here is a
passage from the Apache release best practices website:

Modifications
Once an artifact has been uploaded, it must never be modified. Not only is
there no guarantee that the modified artifact will be correctly archived or
mirrored but a change to an existing artifact is the signature of an attack.

Orhan

On Fri, Sep 2, 2016 at 8:39 AM, Rahul Iyer  wrote:

> Hi Satoshi,
>
> Thanks for trying the RC and discovering the issue. The '._' files were
> sneaked in by osx during the packaging - I'll create a new RC and upload
> that in a few minutes.
>
> @apache-veterans: Should this be uploaded as a new RC or just update the
> existing RC and signature files?
>
> - Rahul
>
> On Fri, Sep 2, 2016 at 7:05 AM, Satoshi Nagayasu  wrote:
>
> > Hi Frank,
> >
> > Running "find . | fgrep /._ | xargs rm" before building fixes the build
> > error, so I think those binary files must be removed before the official
> > release.
> >
> > Regards,
> >
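
A somewhat more defensive variant of the cleanup above is sketched here. It matches on the file name rather than on any path containing "/._", and it copes with spaces in file names; the function name is made up for illustration.

```shell
#!/bin/sh
# Delete AppleDouble "._*" files under a directory tree (default: current dir).
# -type f restricts the match to regular files, and -exec ... {} + passes the
# names straight to rm, so unusual file names are handled safely.
remove_appledouble_files() {
    find "${1:-.}" -type f -name '._*' -exec rm -f {} +
}
```

Calling `remove_appledouble_files path/to/source-tree` before building would have the same effect on the extracted tarball. On macOS, setting `COPYFILE_DISABLE=1` in the environment when running tar is commonly used to keep these files out of the archive in the first place.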
> >
> > 2016-09-02 22:14 GMT+09:00 Satoshi Nagayasu :
> > > Frank,
> > >
> > > I have tried to build RC1 from the tarball, and I have a question.
> > >
> > > I got a build error when building on CentOS 6.6 as below.
> > >
> > > [snaga@localhost build]$ env LANG=C make
> > > [  0%] Built target EP_boost
> > > [  1%] Built target EP_eigen
> > > [  1%] Built target EP_pyxb
> > > [  1%] Built target pythonFiles
> > > [  1%] Built target sqlFiles
> > > [  1%] Built target madlibPatches
> > > [  5%] Built target madpackFiles
> > > [  5%] Built target binaryFiles
> > > [  7%] Built target configFiles
> > > [ 28%] Built target sqlFiles_postgresql
> > > > [ 28%] Building CXX object src/ports/postgres/9.6/CMakeFiles/madlib_postgresql_9_6.dir/__/__/__/modules/tsa/._arima.cpp.o
> > > > /home/snaga/madlib/apache-madlib-1.9.1-incubating-source/src/modules/tsa/._arima.cpp:1:1: warning: null character(s) ignored
> > > > /home/snaga/madlib/apache-madlib-1.9.1-incubating-source/src/modules/tsa/._arima.cpp:1: error: stray '\5' in program
> > > > /home/snaga/madlib/apache-madlib-1.9.1-incubating-source/src/modules/tsa/._arima.cpp:1: error: stray '\26' in program
> > > > /home/snaga/madlib/apache-madlib-1.9.1-incubating-source/src/modules/tsa/._arima.cpp:1: error: stray '\7' in program
> > > > /home/snaga/madlib/apache-madlib-1.9.1-incubating-source/src/modules/tsa/._arima.cpp:1:5: warning: null character(s) ignored
> > > > /home/snaga/madlib/apache-madlib-1.9.1-incubating-source/src/modules/tsa/._arima.cpp:1: error: stray '\2' in program
> > > > /home/snaga/madlib/apache-madlib-1.9.1-incubating-source/src/modules/tsa/._arima.cpp:1:7: warning: null character(s) ignored
> > > > /home/snaga/madlib/apache-madlib-1.9.1-incubating-source/src/modules/tsa/._arima.cpp:1:17: warning: null character(s) ignored
> > > > /home/snaga/madlib/apache-madlib-1.9.1-incubating-source/src/modules/tsa/._arima.cpp:1: error: stray '\2' in program
> > > > /home/snaga/madlib/apache-madlib-1.9.1-incubating-source/src/modules/tsa/._arima.cpp:1:27: warning: null character(s) ignored
> > > > /home/snaga/madlib/apache-madlib-1.9.1-incubating-source/src/modules/tsa/._arima.cpp:1:35: warning: null character(s) ignored
> > > > /home/snaga/madlib/apache-madlib-1.9.1-incubating-source/src/modules/tsa/._arima.cpp:1: error: stray '\260' in program
> > > > /home/snaga/madlib/apache-madlib-1.9.1-incubating-source/src/modules/tsa/._arima.cpp:1:39: warning: null character(s) ignored
> > > > /home/snaga/madlib/apache-madlib-1.9.1-incubating-source/src/modules/tsa/._arima.cpp:1: error: stray '\2' in program
> > > > /home/snaga/madlib/apache-madlib-1.9.1-incubating-source/src/modules/tsa/._arima.cpp:1:43: warning: null character(s) ignored
> > > > /home/snaga/madlib/apache-madlib-1.9.1-incubating-source/src/modules/tsa/._arima.cpp:1: error: stray '\342' in program
> > > > /home/snaga/madlib/apache-madlib-1.9.1-incubating-source/src/modules/tsa/._arima.cpp:1:47: warning: null character(s) ignored
> > > > /home/snaga/madlib/apache-madlib-1.9.1-incubating-source/src/modules/tsa/._arima.cpp:1:89: warning: null character(s) ignored
> > > > /home/snaga/madlib/apache-madlib-1.9.1-incubating-source/src/modules/tsa/._arima.cpp:1: error: stray '\342' in program
> > > > /home/snaga/madlib/apache-madlib-1.9.1-incubating-source/src/modules/tsa/._arima.cpp:1:97: warning: null character(s) ignored
> > > > /home/snaga/madlib/apache-madlib-1.9.1-incubating-source/src/modules/tsa/._arima.cpp:1: error: stray '\230' in program
> > > > /home/snaga/madlib/apache-madlib-1.9.1-incubating-source/src/modules/tsa/._arima.cpp:1:101: warning: null character(s) ignored
> > > > /home/snaga/madlib/apache-madlib-1.9.1-incubating-source/src/modules/tsa/._arima.cpp:1:105: warning: null character(s)

Re: MADlib 1.9.1 release manager

2016-07-13 Thread Orhan Kislal
I volunteer for the release manager duties. I will still need some help
from a committer to complete the procedures but I believe I can take care
of the rest.

Thanks

Orhan

On Mon, Jul 11, 2016 at 4:55 PM, Frank McQuillan wrote:

> As we start moving towards the 1.9.1 release, I am wondering if someone
> would like to offer to be release manager this time around.
>
> A general description of the process is at
> http://incubator.apache.org/guides/releasemanagement.html
>
> I will of course help walk thru the steps with anyone who would like to
> participate.
>
> Frank
>


Prediction Metrics

2016-04-08 Thread Orhan Kislal
Hello MADlib community,

I think it might make sense to add a module to MADlib for prediction metrics.
Since there are quite a few options, I decided to start with the list of
metrics from PDLTools [1]. You can see my proposed interface in the attachment
to the associated JIRA [2,3]. I have pasted a snippet below as an example. I
would like the community's feedback on a number of questions that came up.

1) Are there any other metrics that should take precedence over these?
Please note that binary_classifier reports multiple metrics (tpr, fpr, acc,
f1, etc.).

2) How should we handle grouping? As you can see in the example, the function
returns a double value for regular execution, but an output table is used if a
grouping parameter is passed. This dual interface doesn't seem clean, and
returning a table with a single value for the regular execution feels wrong.

Thanks

Orhan Kislal


[1]
http://pivotalsoftware.github.io/PDLTools/group__grp__prediction__metrics.html

[2] https://issues.apache.org/jira/browse/MADLIB-907

[3]
https://issues.apache.org/jira/secure/attachment/12797816/interface_v1.sql

---

-- Grouped variant: the per-group results are written to table_out.
CREATE OR REPLACE FUNCTION MADLIB_SCHEMA.area_under_roc(
    table_in TEXT,
    prediction_col TEXT,
    observed_col TEXT,
    table_out TEXT,
    grouping_col TEXT
) RETURNS VOID
AS $$
    PythonFunctionBodyOnly(`pred_metrics', `pred_metrics')
    return pred_metrics.area_under_roc(schema_madlib,
        table_in, prediction_col, observed_col, table_out, grouping_col)
$$ LANGUAGE plpythonu
m4_ifdef(`__HAS_FUNCTION_PROPERTIES__', `MODIFIES SQL DATA', `');

-- Ungrouped variant: the metric is returned directly as a double.
CREATE OR REPLACE FUNCTION MADLIB_SCHEMA.area_under_roc(
    table_in TEXT,
    prediction_col TEXT,
    observed_col TEXT
) RETURNS DOUBLE PRECISION
AS $$
    PythonFunctionBodyOnly(`pred_metrics', `pred_metrics')
    return pred_metrics.area_under_roc(schema_madlib,
        table_in, prediction_col, observed_col)
$$ LANGUAGE plpythonu
m4_ifdef(`__HAS_FUNCTION_PROPERTIES__', `MODIFIES SQL DATA', `');

---
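
To make question 2 more concrete, here is a minimal standalone Python sketch of
the same dual behaviour. It is only an illustration and not the PDLTools or
MADlib implementation: it works on an in-memory list of dicts instead of a
database table, the helper `_auc` and the `rows` argument are hypothetical, and
the AUC is computed with the pairwise (Mann-Whitney) formulation, assuming 0/1
labels in the observed column.

```python
from collections import defaultdict


def _auc(pairs):
    """AUC for (score, label) pairs with 0/1 labels, via pairwise comparison."""
    pos = [s for s, y in pairs if y == 1]
    neg = [s for s, y in pairs if y == 0]
    if not pos or not neg:
        return None  # AUC is undefined when one class is missing
    # Count (positive, negative) pairs ranked correctly; ties count as half.
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def area_under_roc(rows, prediction_col, observed_col, grouping_col=None):
    """Return a float without grouping, or a {group: auc} dict with grouping.

    `rows` is a list of dicts standing in for the input table (assumption).
    """
    if grouping_col is None:
        return _auc([(r[prediction_col], r[observed_col]) for r in rows])
    groups = defaultdict(list)
    for r in rows:
        groups[r[grouping_col]].append((r[prediction_col], r[observed_col]))
    return {g: _auc(p) for g, p in groups.items()}
```

The two return shapes (a scalar versus a mapping) mirror the scalar-versus-table
split in the proposed SQL interface, which is exactly the inconsistency the
question is about.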


Re: [VOTE] MADlib v1.9alpha-rc1

2016-02-23 Thread Orhan Kislal
+1

On Tue, Feb 23, 2016 at 2:02 AM, WangChenLiang  wrote:

> +1
>
> > Subject: Re: [VOTE] MADlib v1.9alpha-rc1
> > From: xt...@pivotal.io
> > Date: Mon, 22 Feb 2016 22:50:13 -0800
> > To: dev@madlib.incubator.apache.org
> >
> > +1
> >
> > > On Feb 22, 2016, at 10:30 PM, Rahul Iyer  wrote:
> > >
> > > +1 (binding)
> > > On Feb 22, 2016 9:59 PM, "Atri Sharma"  wrote:
> > >
> > >> +1 (binding)
> > >>
> > >> Regards,
> > >>
> > >> Atri
> > >> On 23 Feb 2016 11:20 am, "Greg Chase"  wrote:
> > >>
> > >>> Sorry to be slow to respond to this.
> > >>>
> > >>> Big +1
> > >>>
> > > >>> Great idea working through the IP clearance as a first release, and then
> > > >>> moving on to new functionality!
> > >>>
> > >>> -Greg
> > >>>
> > > >>> On Fri, Feb 19, 2016 at 6:21 PM, Frank McQuillan <fmcquil...@pivotal.io> wrote:
> > >>>
> > >>>> Hello MADlib community,
> > >>>>
> > > >>>> We have created a MADlib 1.9 alpha release, with the artifacts below up for
> > > >>>> a vote.
> > >>>>
> > >>>> This is the 1st release for Apache MADlib (incubating).
> > >>>>
> > > >>>> First of all, a big thanks to Orhan Kislal for being the release manager
> > > >>>> for this release.
> > >>>>
> > >>>> There are two main goals of this release:
> > > >>>> * Clear all potential IP issues in the code base and make it legally ready
> > > >>>> to be adopted by the community.
> > > >>>> * Share the new features that have been developed so far, in order to give
> > > >>>> the community a good sense of the upcoming 1.9 release.
> > >>>>
> > >>>> For more information including release notes, please see:
> > >>>> https://cwiki.apache.org/confluence/display/MADLIB/MADlib+1.9+alpha
> > >>>>
> > >>>> This is a source code tarball only release.
> > >>>>
> > > >>>> *** Please download, review and vote by Wed Feb 24, 2016 @ 6pm PST ***
> > >>>>
> > >>>> We're voting on the source (tag):
> > >>>> rc/v1.9alpha-rc1
> > >>>>
> > >>>> Source Files:
> > >>>>
> > >>>>
> > >>>
> > >>
> https://dist.apache.org/repos/dist/dev/incubator/madlib/1.9alpha-incubating-rc1/
> > >>>>
> > >>>> Commit to be voted on:
> > >>>>
> > >>>>
> > >>>
> > >>
> https://git-wip-us.apache.org/repos/asf?p=incubator-madlib.git;a=commit;h=581d07b03ba6c7f81fd791548f1b0f7c4909c710
> > >>>>
> > >>>> KEYS file containing PGP Keys we use to sign the release:
> > >>>> https://dist.apache.org/repos/dist/dev/incubator/madlib/KEYS
> > >>>>
> > >>>> To help in tallying the vote, PMC members please be sure to indicate
> > >>>> "(binding)" with your vote.
> > >>>>
> > >>>> [ ] +1  approve
> > >>>> [ ] +0  no opinion
> > >>>> [ ] -1  disapprove (and reason why)
> > >>>>
> > >>>>
> > >>>> Thank you,
> > >>>> Frank McQuillan
> > >>>>
> > >>>
> > >>
> >
>
>


Re: Hello World Example Errors

2016-02-19 Thread Orhan Kislal
Hello Babak,

Thanks for pointing out these errors. I remember making changes to the
hello world examples when I first started as well. I suggest making a pull
request with your updated files (as you noted) so that the next developer
does not get confused with the same bugs. Here is the website for
contribution guidelines:
https://cwiki.apache.org/confluence/display/MADLIB/Contribution+Guidelines
Since you are interested in contributing to MADlib, I would suggest
checking the following link as well for contribution ideas.
https://cwiki.apache.org/confluence/display/MADLIB/Ideas+for+Contribution

Best Regards,

Orhan Kislal


On Thu, Feb 18, 2016 at 6:34 PM, Babak Alipour wrote:

> Greetings everyone,
>
> I am Babak Alipour, a student at University of Florida. I have been trying
> to use MADlib and I hope to contribute to the community later on.
>
> While trying to go through the quick start guide for developers (available
> here:
> https://cwiki.apache.org/confluence/display/MADLIB/Quick+Start+Guide+for+Developers),
> I realized that in avg_var.cpp there are some lines that use 'this.member'.
> I think 'this' is a pointer and it should be used with '->'.
> So I tried to build MADlib from source with these files added, and the build
> fails.
> It produces multiple compile-time errors:
> E.g. ' error: request for member ‘avg’ in ‘this’, which is of pointer type
> ...'
>
> Replacing every 'this.'  with 'this->' takes care of those errors but there
> are still a few more errors in the given example.
>
> Line 53:
> double a = static_cast<double>(state.numRows) / normalizer;
> Produces an error, since 'state' is not defined in this scope. Since this
> function is overloading '+=', I think it's safe to say this was supposed to
> be the state of 'this', so I replaced it with 'this->' and that error went
> away.
>
> Lines 44 & 45:
> template <class OtherHandle>
> AvgVarTransitionState &operator+=(const double x){
>
> This was also producing compile errors. (failed to deduce OtherHandle
> template type) This takes a double and adds it to the state, so I don't see
> any uses for 'OtherHandle' in this context and I commented out line 44.
>
> All said and done, MADlib now compiles, installs and works just fine.
>
> I was wondering if anyone else encountered these issues while going through
> the hello world example.
>
> I could fork the project, fix the example and submit a pull-request if you
> think that's the way to go.
>
>
> Best regards,
> *Babak Alipour,*
> *University of Florida*
>


MADlib 1.9 alpha Release

2016-02-12 Thread Orhan Kislal
Dear MADlib Community,

We have been working on the 1.9 alpha release tracked by this JIRA
https://issues.apache.org/jira/browse/MADLIB-957. It seems most of the
tasks are completed and the remaining ones like code testing will be
completed soon. Roman will be publishing the keys. I would like to ask if
there are any comments or suggestions before we finalize the release.

Thank you for using and contributing to MADlib.

Orhan Kislal


Re: Proposal for 1.9 alpha release

2016-01-19 Thread Orhan Kislal
Thanks for the vote of confidence everyone. I'll start going through this
document right away.

Orhan

On Tue, Jan 19, 2016 at 2:39 PM, Caleb Welton  wrote:

> >
> >
> > > There's no process requirement like that. Being a committer on the
> > project
> > > helps with certain things, but in fact volunteering as an RM is a great
> > way
> > > for new folks to start generating karma towards being considered a
> > > committer
> > > on the project.
> >
> > FM >>> Thanks Orhan for volunteering
> >
>
>
> Perfect, +1 from me :)
>
> Orhan, here's a resource on the Apache Release process [1] in case that is
> helpful to you.
>
> [1] http://incubator.apache.org/guides/releasemanagement.html
>


Re: Proposal for 1.9 alpha release

2016-01-15 Thread Orhan Kislal
Hello,

I have started working on MADlib recently but I am looking forward to being
more involved with the project. I would like to volunteer for the release
manager position. With some guidance, I believe I can handle this
responsibility.

Thanks

Orhan Kislal

On Fri, Jan 15, 2016 at 9:26 AM, Frank McQuillan wrote:

> Hello,
>
> During the second community call held 12/18/15, Roman suggested that
> the MADlib community consider a 1.9 alpha release in the near term.
> There was general agreement on the call and I would like to open this
> question up to the broader community.
>
> Please reply to this thread with your thoughts.  Good idea?
>
> A Release Manager has not yet been identified, so any volunteers would
> be welcome.  I believe it is not an onerous responsibility and Roman
> can help guide you through it.
>
> Frank
>