[jira] [Updated] (MADLIB-1393) DL: Fit and evaluate changes for asymmetric cluster config

Ekta Khanna (Jira) Mon, 11 Nov 2019 16:16:00 -0800


     [ 
https://issues.apache.org/jira/browse/MADLIB-1393?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]


Ekta Khanna updated MADLIB-1393:
--------------------------------
    Description: 
*Single fit()*
{code}
madlib_keras_fit(
    source_table,
    model,
    model_arch_table,
    model_arch_id,
    compile_params,
    fit_params,
    num_iterations,
    use_gpus,        -- changed definition
    validation_table,
    metrics_compute_frequency,
    warm_start,
    name,
    description
    )
{code}
{{use_gpus}} (optional)

BOOLEAN, *default*: FALSE (i.e., CPU). Determines whether GPUs are to be used 
for training the neural network.  Set to TRUE to use GPUs.

*Note*

This parameter must not conflict with how the distribution rules are set in the 
preprocessor function.  For example, if you set a distribution rule to use 
certain segments on hosts that do not have GPUs attached, you will get an error 
if you set {{use_gpus}} to TRUE.

Also, we have seen memory-related issues when segments share GPU 
resources. For example, if 4 segments share 1 GPU, you may get 
memory-related errors.  The recommended configuration is 1 GPU per 
segment.
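
The two notes above can be read as a pre-flight check on the cluster layout. The sketch below is illustrative only: {{HostLayout}} and {{check_use_gpus}} are hypothetical names that are not part of MADlib, and the per-host segment and GPU counts are assumed to be supplied by the user. It reports an error for hosts whose selected segments have no GPU attached, and a warning for hosts where segments would share a GPU.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class HostLayout:
    """Hypothetical per-host summary; not a MADlib structure."""
    segments: int  # segments on this host selected by the distribution rule
    gpus: int      # GPUs attached to this host


def check_use_gpus(layout: Dict[str, HostLayout], use_gpus: bool) -> List[str]:
    """Return a list of problems for the given layout.

    'ERROR' entries correspond to the conflict between use_gpus=TRUE and
    segments on hosts without GPUs; 'WARNING' entries correspond to
    segments sharing a GPU (recommended: 1 GPU per segment).
    """
    problems: List[str] = []
    if not use_gpus:
        return problems  # CPU training: the GPU layout does not matter
    for host, info in sorted(layout.items()):
        if info.segments == 0:
            continue  # no segments selected on this host
        if info.gpus == 0:
            problems.append(
                f"ERROR {host}: {info.segments} segment(s) selected but no GPU attached"
            )
        elif info.segments > info.gpus:
            problems.append(
                f"WARNING {host}: {info.segments} segment(s) share {info.gpus} GPU(s)"
            )
    return problems


# Made-up asymmetric cluster used only for illustration.
example_cluster = {
    "host1": HostLayout(segments=4, gpus=1),  # 4 segments sharing 1 GPU
    "host2": HostLayout(segments=2, gpus=0),  # no GPUs attached
    "host3": HostLayout(segments=2, gpus=2),  # 1 GPU per segment
}

if __name__ == "__main__":
    for problem in check_use_gpus(example_cluster, use_gpus=True):
        print(problem)
```

With {{use_gpus}} set to FALSE the check returns nothing, since the GPU layout is then irrelevant.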

*Multi model fit()*
The same {{use_gpus}} definition and notes apply as for single model fit above.

*Evaluate*
The same {{use_gpus}} definition and notes apply as for single model fit above.

  was:
Single fit()
{code}
madlib_keras_fit(
    source_table,
    model,
    model_arch_table,
    model_arch_id,
    compile_params,
    fit_params,
    num_iterations,
    use_gpus,        -- changed definition
    validation_table,
    metrics_compute_frequency,
    warm_start,
    name,
    description
    )
{code}
{{use_gpus}} (optional)

BOOLEAN, *default*: FALSE (i.e., CPU). Determines whether GPUs are to be used 
for training the neural network.  Set to TRUE to use GPUs.

*Note*

This parameter must not conflict with how the distribution rules are set in the 
preprocessor function.  For example, if you set a distribution rule to use 
certain segments on hosts that do not have GPUs attached, you will get an error 
if you set {{use_gpus}} to TRUE.

Also, we have seen some memory related issues when segments share GPU 
resources. For example, if you have 4 segments sharing 1 GPU, you may get 
memory related errors.  The recommended configuration is to have 1 GPU per 
segment.

Multi model fit()
Same idea as above ^^^ for single model fit.

Evaluate
Same idea as above ^^^ for single model fit..


> DL: Fit and evaluate changes for asymmetric cluster config
> ----------------------------------------------------------
>
>                 Key: MADLIB-1393
>                 URL: https://issues.apache.org/jira/browse/MADLIB-1393
>             Project: Apache MADlib
>          Issue Type: New Feature
>          Components: Deep Learning
>            Reporter: Ekta Khanna
>            Priority: Major
>             Fix For: v1.17



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
