Cool! I'd like to join the next discussion.
Best,
Chenliang Wang
On 01/13/2016 06:36 PM, Greg Chase wrote:
As I said, our next call is not China-friendly:
http://mail-archives.apache.org/mod_mbox/incubator-madlib-dev/201601.mbox/%3CCAMg1VtnKB-WoyVqCstfMNCcJVOn2HKQQ6wNfqdovhgnB7zd5cw%40mail.gmail.com%3E
This is this Friday at 10AM Pacific Standard Time, which is 2AM Saturday
Beijing time.
We will arrange another call in a couple of weeks at an Asia-friendly
time to support contributors in Asia.
However, if you make the next call, we will make time for you to talk :)
Regards,
-Greg
On Wed, Jan 13, 2016 at 2:18 AM, Kuien Liu <k...@pivotal.io> wrote:
Great, I would like to join it, please send me an invitation if possible.
Cheers,
Kuien Liu
On Wed, Jan 13, 2016 at 6:10 PM, Greg Chase <gch...@gmail.com> wrote:
Perhaps ChenLiang would like to join a call with the MADlib community and
discuss his contribution?
We have a call this Friday 10AM PST which is not a friendly time for
China, but we can schedule a next call at a friendlier time.
On Jan 13, 2016, at 1:53 AM, Ivan Novick <inov...@pivotal.io> wrote:
Cool!
On Wed, Jan 13, 2016 at 5:52 PM, Kuien Liu <k...@pivotal.io> wrote:
Got it. I think I can have a face-to-face (f2f) talk with Chenliang Wang,
as he graduated from a CAS institute that is not far from our Beijing
office, and I am familiar with his supervisor and lab director. So I
think it is quite possible to meet him directly in Beijing.
Cheers,
Kuien Liu
On Wed, Jan 13, 2016 at 3:05 PM, Ivan Novick <inov...@pivotal.io>
wrote:
Hello ChenLiang,
I have read your description of the interface, and to my understanding
this is a supervised machine learning algorithm that supports geometry
data. Am I correct?
What would be a good industrial use case for this model? For example,
could you train a system based on locations and weather to find areas
with bad cell phone signal? Can you provide a real-world example
scenario where this type of model would be useful for end users?
Also, I am CCing some of my colleagues at work. Kuien, Max, Yandong,
can you provide any feedback on this proposal from your point of view?
http://mail-archives.apache.org/mod_mbox/incubator-madlib-dev/201601.mbox/%3cblu175-w72199bca72716d8c1a99bf4...@phx.gbl%3E
Cheers,
Ivan
On Wed, Jan 13, 2016 at 11:20 AM, WangChenLiang <hi181904...@msn.com>
wrote:
Sorry, the link to the attachment (http://1drv.ms/1ZjAiCg) was lost in
the previous letter.
From: hi181904...@msn.com
To: dev@madlib.incubator.apache.org
Subject: RE: How to contribute a spatial module to MADlib manipulating objects from PostGIS
Date: Wed, 13 Jan 2016 11:09:17 +0800
Hi Caleb and Ivan!
Thanks for your attention and help. I reviewed the previous draft and
found some things that were inappropriate. The archive containing the
new draft and example code is attached to this letter; it should be
more reasonable than the earlier edition.
Please go over the manuscript and give me your suggestions again.
The following are my answers to Caleb's questions.
- Does this function require PostGIS to also be installed? If yes, it
would be better if we disable the function if PostGIS is not present
rather than introduce PostGIS as a dependency. (Similar to what we do
with our requirement on the xml module with our PMML export
functionality).
A: Yes. I am trying to avoid passing any spatial datatypes through the
GWR interface, but I am not sure whether it is necessary to provide a
simple alternative when PostGIS is not available.
- What are the exact datatypes in the function definition for
regression_location and prediction_location?
A: I changed the datatype to TEXT; the argument is the name of a POINT
or MULTIPOLYGON field (for polygons, the centroid of each polygon is
used for GWR estimation).
- In the description it describes regression_location as "The length of
regression_location must be equal to the length of source_table",
which signals to me that it is likely intended to be a column of the
source table? If not then how is this length represented?
A: In the previous interface, I was trying to pass in a geometry field
that could come from another table with a different number of rows.
Now I have altered the argument definition and made it TEXT. It must be
the name of a geometry field in the source table.
- You didn't mark regression_location as (optional). Due to the way
Postgres functions work all optional arguments must come after all
required arguments, so having a non-optional argument in the middle of
the optional list must be avoided.
A: Thanks for reminding me of this mistake. It is really my fault. The
order of the arguments is changed in this edition.
- I haven't read through the literature, but it is not immediately
clear to me why prediction_location is a parameter to gwregr_train()
rather than gwregr_predict(). Can you provide a brief description to
the way that prediction_location is used in the model and its
relationship to training and prediction.
A: Actually, there are three kinds of location data in GWR modeling:
the locations of the sample data, of regression, and of prediction.
Locations of sample data indicate where the sample data are. Locations
of regression indicate where the regression should be conducted. If
they are identical to the data locations (as in most instances),
diagnostic information can be calculated.
Locations of prediction indicate where coefficients should be
predicted. They should be a parameter of a predict function. Putting
regression_location into the training function is just for omitting
kernel arguments and is maybe not appropriate. In the training process,
GWR estimates the weights and coefficients from the distances between
data_locations and regression_locations. Then, diagnostic information
is estimated when these two sets of locations are identical. We can
treat data_location as regression_location to simplify the process, so
that the training step does not take locations different from the data
locations.
In the prediction process, there are two new pieces of information:
new independent variables and new locations. Therefore, the
coefficients and weight vector must be estimated again. GWR can
estimate coefficients at any position using the independent variables
of the sample data. If we also provide independent variables at any
position, we can also obtain the dependent variable at that position.
So if we treat the coefficients at prediction_location as a training
result and feed them into prediction directly, it is reasonable to put
prediction_location into the training function. But if we treat it as
a part of prediction, it is appropriate to set prediction_location
within the predict function. The prediction function must then require
kernel parameters in addition to the new data and locations for
prediction. Maybe this way is clearer and more reasonable, and it is
similar to other GWR packages in R.
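The training and prediction steps described above can be illustrated
with a minimal sketch. It assumes a single independent variable plus an
intercept, a Gaussian kernel, and planar Euclidean distance. The
function names and the bandwidth parameter are hypothetical and are not
part of the proposed MADlib interface.

```python
import math


def gaussian_weight(distance, bandwidth):
    """Gaussian kernel (an assumed choice): near samples weigh close to 1."""
    return math.exp(-0.5 * (distance / bandwidth) ** 2)


def gwr_coefficients(data_locations, x, y, target, bandwidth):
    """Fit y = b0 + b1 * x by weighted least squares centred on `target`.

    Each sample is weighted by its distance to `target`, so the returned
    (b0, b1) are the local coefficients at that location.
    """
    sw = swx = swy = swxx = swxy = 0.0
    for (px, py), xi, yi in zip(data_locations, x, y):
        d = math.hypot(px - target[0], py - target[1])
        w = gaussian_weight(d, bandwidth)
        sw += w
        swx += w * xi
        swy += w * yi
        swxx += w * xi * xi
        swxy += w * xi * yi
    # Solve the 2x2 weighted normal equations in closed form.
    det = sw * swxx - swx * swx
    if abs(det) < 1e-12:
        raise ValueError("singular local fit; try a larger bandwidth")
    b1 = (sw * swxy - swx * swy) / det
    b0 = (swy - b1 * swx) / sw
    return b0, b1


def gwr_predict(data_locations, x, y, new_location, new_x, bandwidth):
    """Re-estimate coefficients at `new_location`, then apply them to `new_x`."""
    b0, b1 = gwr_coefficients(data_locations, x, y, new_location, bandwidth)
    return b0 + b1 * new_x


# Toy data: four samples at the corners of a unit square.
locations = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
x = [1.0, 2.0, 3.0, 4.0]
y = [5.0, 8.0, 11.0, 14.0]

# Training: regression locations are the data locations themselves.
local_fits = [gwr_coefficients(locations, x, y, loc, 1.0) for loc in locations]

# Prediction: a new location and a new value of the independent variable.
predicted = gwr_predict(locations, x, y, (0.5, 0.5), 10.0, 1.0)
```

Note that in this sketch the predict step needs the kernel bandwidth
again, because the weights are recomputed at the new location.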
I rewrote the description of the interface taking your suggestions
into account. I moved prediction_location into the predict function,
fixed some mistakes, and removed unnecessary arguments. The new draft
of the interface design is attached to this letter.
Regards,
ChenLiang Wang
From: cwel...@pivotal.io
Date: Tue, 5 Jan 2016 10:31:20 -0800
Subject: Re: How to contribute a spatial module to MADlib manipulating objects from PostGIS
To: dev@madlib.incubator.apache.org
Hi ChenLiang,
Thanks for taking the next step to flesh this out.
As a whole:
- naming and basic interface seem consistent with existing conventions.
- names are descriptive.
- references to the literature are provided.
- functionality is complementary to the library.
What is not clear to me is:
- Does this function require PostGIS to also be installed? If yes, it
would be better if we disable the function if PostGIS is not present
rather than introduce PostGIS as a dependency. (Similar to what we do
with our requirement on the xml module with our PMML export
functionality).
- What are the exact datatypes in the function definition for
regression_location and prediction_location?
- In the description it describes regression_location as "The length of
regression_location must be equal to the length of source_table",
which signals to me that it is likely intended to be a column of the
source table? If not then how is this length represented?
- You didn't mark regression_location as (optional). Due to the way
Postgres functions work all optional arguments must come after all
required arguments, so having a non-optional argument in the middle of
the optional list must be avoided.
- I haven't read through the literature, but it is not immediately
clear to me why prediction_location is a parameter to gwregr_train()
rather than gwregr_predict(). Can you provide a brief description to
the way that prediction_location is used in the model and its
relationship to training and prediction.
Regards,
Caleb
ChenLiang wants to share a file with you on OneDrive. To view the
file, click the link below.
gwr4madlib.rar