[ 
https://issues.apache.org/jira/browse/MADLIB-1392?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Orhan Kislal closed MADLIB-1392.
--------------------------------
    Resolution: Fixed

> DL: Preprocessor support for asymmetric segment distribution
> ------------------------------------------------------------
>
>                 Key: MADLIB-1392
>                 URL: https://issues.apache.org/jira/browse/MADLIB-1392
>             Project: Apache MADlib
>          Issue Type: New Feature
>          Components: Deep Learning
>            Reporter: Ekta Khanna
>            Priority: Major
>             Fix For: v1.17
>
>
> Add asymmetric segment redistribution support to the deep learning 
> preprocessor. Applies to {{training_preprocessor_dl()}} and 
> {{validation_preprocessor_dl()}}
> {code:java}
> training_preprocessor_dl(source_table,
>                          output_table,
>                          dependent_varname,
>                          independent_varname,
>                          buffer_size,
>                          normalizing_const,
>                          num_classes,
>                          distribution_rules    -- new optional param
>                         )
> {code}
> The new optional parameter {{distribution_rules}} (TEXT, *default*: 
> {{all_segments}}) specifies how to distribute the {{output_table}} across 
> segments. This determines how the fit function will use resources on the 
> cluster. Possible values:
>  # {{all_segments}} (default): the {{output_table}} is distributed to all 
> segments in the database cluster.
>  # {{gpu_segments}}: the {{output_table}} is distributed to all segments 
> on hosts that have GPUs attached. This makes maximum use of GPU resources.
>  # The name of a resources table listing the segments to use for training. 
> This table is typically created and maintained by the database 
> administrator. It must contain a column called {{dbid}} that specifies the 
> segment id from the {{gp_segment_configuration}} table.
> Sample {{segments_to_use}} table:
> {code:java}
>  dbid | notes
>  -----|--------------
>     2 | comment here
>     3 | comment here
>     4 | comment here
>     5 | comment here
> {code}
> The same parameter and values apply to {{validation_preprocessor_dl()}}.
> This change also adds a new column, {{gpu_config}}, to the output summary 
> table. It contains one of the following values:
> # if {{distribution_rules}} = {{all_segments}}, then {{all_segments}}
> # if {{distribution_rules}} = {{gpu_segments}}, then an array of the ids of 
> all segments on hosts that have GPUs attached
> # if {{distribution_rules}} is the name of a segments table, then an array 
> of the segment ids in that table; for the sample {{segments_to_use}} table 
> above -> [2,3,4,5]
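
The gpu_config rules listed above can be sketched as a small pure function.
This is only an illustration of the mapping described in the issue, not
MADlib's implementation: the function name resolve_gpu_config, its arguments,
and the sorted output order are assumptions made for the example. The segment
ids are passed in as plain lists rather than read from a database.

```python
from typing import Iterable, List, Optional, Union

ALL_SEGMENTS = "all_segments"
GPU_SEGMENTS = "gpu_segments"


def resolve_gpu_config(
    distribution_rules: Optional[str],
    gpu_segment_ids: Iterable[int],
    table_dbids: Optional[Iterable[int]] = None,
) -> Union[str, List[int]]:
    """Hypothetical helper: map a distribution_rules value to the
    gpu_config value described in the issue.

    gpu_segment_ids: ids of segments on hosts with GPUs (assumed known).
    table_dbids: dbid values from a user-supplied segments table, if any.
    """
    # An omitted parameter falls back to the documented default.
    rule = distribution_rules or ALL_SEGMENTS

    if rule == ALL_SEGMENTS:
        return ALL_SEGMENTS
    if rule == GPU_SEGMENTS:
        # Sorting is an assumption made here for a stable result.
        return sorted(gpu_segment_ids)

    # Any other value is treated as the name of a segments table.
    if table_dbids is None:
        raise ValueError(
            "no dbid values supplied for segments table %r" % rule
        )
    return sorted(table_dbids)


if __name__ == "__main__":
    # The sample segments_to_use table from the issue holds dbids 2 to 5.
    print(resolve_gpu_config("segments_to_use", [2, 3], [2, 3, 4, 5]))
```

With the sample segments_to_use table from the issue, the call in the main
block yields the list [2, 3, 4, 5], matching the value given for that case.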



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
