Re: [VOTE] MADlib v1.10-rc2

2017-03-06 Thread Frank McQuillan
Reminder: please vote on RC-2 today. The vote closes at 6 pm Pacific.

thanks,
Frank

On Fri, Mar 3, 2017 at 8:26 PM, Ed Espino  wrote:

> I had some time and had been wanting to perform a MADlib build.  Here are
> my notes from my quick review of MADlib v1.10-rc2. Sorry if the information
> is a bit scattered.
>
> Regards,
> -=ed espino
>
> ==
> Checksums are good
> ==
> PGP signature is good
> ==
> Extracted tarball base directory (apache-madlib-src-1.10-incubating)
> good
> ==
> LICENSE
>
>   Shouldn't the components with files in licenses/third_party be
>   referenced in LICENSE file?
>
> Boost_Software_License_v1.txt
> Eigen_v3.1.2.txt
> PyXB_v1.2.3.txt
> PyYAML_v3.10.txt
> Python_License_v2.7.1.txt
> UseLATEX_v1.9.4.txt
> _M_widen_init.txt
> argparse_v1.2.1.txt
>
> From README.md, I only saw an incomplete reference to the third party
> components.
>
>   Third Party Components
>   MADlib incorporates material from the following third-party components
>
>   argparse 1.2.1 "provides an easy, declarative interface for creating
> command line tools"
>   Boost 1.47.0 (or newer) "provides peer-reviewed portable C++ source
> libraries"
>   Eigen 3.2.2 "is a C++ template library for linear algebra"
>   PyYAML 3.10 "is a YAML parser and emitter for Python"
>   PyXB 1.2.4 "is a Python library for XML Schema Bindings"
>
> ==
> DISCLAIMER good
> ==
> NOTICE good
> ==
> BUILD, INSTALL and INSTALL-CHECK
>
> I was able to build the package and successfully ran MADlib
> install-check against PostgreSQL 9.6.2.
>
> Issue: There is no obvious reference to the PostgreSQL libxml
>        dependency in dev documentation. The madpack install-check
>        has failures (see below) if "--with-libxml" configure
>        option is not specified for PostgreSQL.
>
>        install-check errors encountered due to PostgreSQL
>        configuration without "--with-libxml" option:
>
>  psql:/tmp/madlib.0UIPlZ/pmml/test/table_to_pmml.sql_in.tmp:73: ERROR:  unsupported XML feature
>  DETAIL:  This functionality requires the server to be built with libxml support.
>  HINT:  You need to rebuild PostgreSQL using --with-libxml.
>  CONTEXT:  while creating return value
>  PL/Python function "pmml"
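
A minimal sketch of a pre-flight check for the libxml issue described above, assuming `pg_config` is on the PATH and describes the same PostgreSQL build that install-check runs against. The helper names `pg_configure_args` and `configured_with_libxml` are hypothetical and are not part of madpack; `pg_config --configure` is the standard PostgreSQL command that prints the options the server was configured with.

```python
import shlex
import subprocess


def pg_configure_args(pg_config="pg_config"):
    """Return the configure options reported by `pg_config --configure`.

    Assumption: the given pg_config belongs to the PostgreSQL build
    that MADlib will be installed into.
    """
    return subprocess.check_output([pg_config, "--configure"],
                                   universal_newlines=True)


def configured_with_libxml(configure_args):
    """Return True if the configure string contains --with-libxml.

    pg_config prints each option in single quotes, so shlex.split is
    used to strip the quoting before comparing tokens.
    """
    return "--with-libxml" in shlex.split(configure_args)
```

Calling `configured_with_libxml(pg_configure_args())` before running install-check would flag the missing option up front, rather than through the `unsupported XML feature` error in the pmml test.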
>
> Issue: AUTO DOWNLOADED PACKAGES
>
>   I was performing the build from a simple perspective. Download
>   source, configure, make and glance at docs (in this order).
>
>   As we have dealt with auto-downloaded files in the HAWQ project, I
>   was surprised that the following packages were automatically
>   downloaded for me. On the HAWQ project we were instructed to require
>   these as prerequisites and/or make them optional, included via
>   command line options (configure).  I'm guessing other packages would
>   have been automatically downloaded if they were not found on the
>   system (e.g. boost).
>
>   Automatically downloaded packages:
>
>   https://github.com/madlib/eigen/archive/branches/3.2.tar.gz
>   http://sourceforge.net/projects/pyxb/files/PyXB-1.2.4.tar.gz
>
> Issue: As "make" was running, the following message was a bit alarming:
>        PyXB: Removing GPL component from code base
>
> This comes from the script src/patch/PyXB.sh run after PyXB source
> is downloaded.
>
>   ...
>   echo "PyXB: Removing GPL component from code base"
>   rm -f doc/extapi.py
>   rm -f doc/extapi.pyc
>
> ==
> JIRA: There is one open Jira for the fix-version v1.10:
> https://issues.apache.org/jira/browse/MADLIB-1005
> ==
> PRODUCT VERSION
>
>   After building from source, shouldn't the version contain the string
>   "incubating" somewhere?
>
> /usr/local/madlib/bin/madpack version
> madpack.py : INFO : MADlib tools version= 1.10.0
> (/usr/local/madlib/Versions/1.10.0/bin/../madpack/madpack.py)
>
> ==
>
> --
> Attached to this email: For reference: here is the entire build log
> (including PostgreSQL 9.6.2) and test run attempts. Several of the
> issues above can be seen in the log.
> --
>
>
> On Fri, Mar 3, 2017 at 4:20 PM, Orhan Kislal  wrote:
>
>> +1
>>
>> On Fri, Mar 3, 2017 at 

Re: [VOTE] MADlib v1.10-rc2

2017-03-06 Thread Feng, Xixuan (Aaron)
+1

On Tue, Mar 7, 2017 at 8:37, Joseph Hellerstein wrote:

> +1
>

Re: [VOTE] MADlib v1.10-rc2

2017-03-06 Thread Joseph Hellerstein
+1

> On Fri, Mar 3, 2017 at 4:20 PM, Orhan Kislal  wrote:
>
>> +1
>>
>> On Fri, Mar 3, 2017 at 4:14 PM, Rahul Iyer  wrote:
>>
>> > +1
>> >
>> > On Fri, Mar 3, 2017 

[GitHub] incubator-madlib pull request #105: Graph:

2017-03-06 Thread orhankislal
Github user orhankislal commented on a diff in the pull request:

https://github.com/apache/incubator-madlib/pull/105#discussion_r104527390
  
--- Diff: src/ports/postgres/modules/graph/graph_utils.py_in ---
@@ -0,0 +1,102 @@
+# coding=utf-8
+#
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+# Graph Methods
+
+# Please refer to the graph.sql_in file for the documentation
+
+"""
+@file graph.py_in
+
+@namespace graph
+"""
+
+import plpy
+from utilities.control import MinWarning
+from utilities.utilities import _assert
+from utilities.utilities import extract_keyvalue_params
+from utilities.utilities import unique_string
+from utilities.validate_args import get_cols
+from utilities.validate_args import unquote_ident
+from utilities.validate_args import table_exists
+from utilities.validate_args import columns_exist_in_table
+from utilities.validate_args import table_is_empty
+
+
+def validate_graph_coding(vertex_table, vertex_id, edge_table, edge_params,
+   out_table, **kwargs):
+   """
+   Validates graph tables (vertex and edge) as well as the output table.
+   """
+   _assert(out_table and out_table.strip().lower() not in ('null', ''),
+   "Graph SSSP: Invalid output table name!")
+   _assert(not table_exists(out_table),
+   "Graph SSSP: Output table already exists!")
+
+   _assert(vertex_table and vertex_table.strip().lower() not in ('null', ''),
+   "Graph SSSP: Invalid vertex table name!")
+   _assert(table_exists(vertex_table),
+   "Graph SSSP: Vertex table ({0}) is missing!".format(vertex_table))
+   _assert(not table_is_empty(vertex_table),
+   "Graph SSSP: Vertex table ({0}) is empty!".format(vertex_table))
+
+   _assert(edge_table and edge_table.strip().lower() not in ('null', ''),
+   "Graph SSSP: Invalid edge table name!")
+   _assert(table_exists(edge_table),
+   "Graph SSSP: Edge table ({0}) is missing!".format(edge_table))
+   _assert(not table_is_empty(edge_table),
+   "Graph SSSP: Edge table ({0}) is empty!".format(edge_table))
+
+   existing_cols = set(unquote_ident(i) for i in get_cols(vertex_table))
+   _assert(vertex_id in existing_cols,
+   """Graph SSSP: The vertex column {vertex_id} is not present in vertex
+   table ({vertex_table}) """.format(**locals()))
+   _assert(columns_exist_in_table(edge_table, edge_params.values()),
+   "Graph SSSP: Not all columns from {0} present in edge table ({1})".
+   format(edge_params.values(), edge_table))
+
+   return None
+
+def get_graph_usage(schema_madlib, func_name, other_text):
+
+   usage = """

+
+USAGE

+
+ SELECT {schema_madlib}.{func_name}(
+    vertex_table  TEXT, -- Name of the table that contains the vertex data.
+    vertex_id     TEXT, -- Name of the column containing the vertex ids.
+    edge_table    TEXT, -- Name of the table that contains the edge data.
+    edge_args     TEXT, -- A comma-delimited string containing multiple
+                        -- named arguments of the form "name=value".
+    {other_text}
+    out_table     TEXT  -- Name of the table to store the result of SSSP.
+);
--- End diff --

I think we might want to move `out_table` to the other parameters as well. 
For some functions like graph diameter, we don't have to create an output 
table. That will allow the pagerank to place its optional parameters after the 
`out_table`.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or 

[GitHub] incubator-madlib pull request #105: Graph:

2017-03-06 Thread njayaram2
Github user njayaram2 commented on a diff in the pull request:

https://github.com/apache/incubator-madlib/pull/105#discussion_r104523519
  

Yes, that might work better. We can have `other_mandatory_params` and 
`optional_params`, before and after `out_table` respectively. We may have to 
follow this rule for other graph modules: `out_table` must be our last 
mandatory param, to maintain some consistency. 

But this might not be in line with our existing modules. For example, I 
checked elastic_net and the output table is one of the mandatory params 
specified early on. There are several algorithm specific mandatory params 
following the output table name.
We 

[GitHub] incubator-madlib pull request #105: Graph:

2017-03-06 Thread orhankislal
Github user orhankislal commented on a diff in the pull request:

https://github.com/apache/incubator-madlib/pull/105#discussion_r104521502
  

I was trying to avoid changing the SSSP notation but I guess it is 
inevitable. Do you think separating `other_text` into two (`mandatory_params` 
and `optional_params`) could work for future graph algorithms?
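
One way the split discussed here could look, as a rough sketch only. It assumes each parameter is described by a `(name, type, comment)` tuple and that `out_table`, when present, is always the last mandatory parameter. The function name `build_graph_usage`, the `has_out_table` flag, the tuple layout, and the example parameter `damping_factor` are assumptions made for illustration and are not code from the pull request.

```python
# Parameters shared by all graph functions, taken from the usage text in the diff.
COMMON_PARAMS = [
    ("vertex_table", "TEXT", "Name of the table that contains the vertex data."),
    ("vertex_id", "TEXT", "Name of the column containing the vertex ids."),
    ("edge_table", "TEXT", "Name of the table that contains the edge data."),
    ("edge_args", "TEXT", "Comma-delimited named arguments of the form \"name=value\"."),
]
OUT_TABLE = ("out_table", "TEXT", "Name of the table to store the result.")


def build_graph_usage(schema_madlib, func_name, mandatory_params=(),
                      optional_params=(), has_out_table=True):
    """Build a usage string with out_table as the last mandatory parameter.

    Hypothetical helper: mandatory_params go before out_table and
    optional_params after it; has_out_table=False covers functions
    that do not create an output table.
    """
    params = list(COMMON_PARAMS) + list(mandatory_params)
    if has_out_table:
        params.append(OUT_TABLE)
    params += list(optional_params)

    width = max(len(name) for name, _, _ in params)
    lines = []
    for i, (name, sql_type, comment) in enumerate(params):
        # Every argument except the last one is followed by a comma.
        sep = "," if i < len(params) - 1 else " "
        lines.append("    {0} {1}{2} -- {3}".format(
            name.ljust(width), sql_type, sep, comment))
    return "SELECT {0}.{1}(\n{2}\n);".format(
        schema_madlib, func_name, "\n".join(lines))
```

For example, `build_graph_usage("madlib", "pagerank", optional_params=[("damping_factor", "FLOAT8", "Illustrative optional parameter.")])` would list the optional parameter after `out_table`, while passing `has_out_table=False` would omit `out_table` altogether.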

