This is an automated email from the ASF dual-hosted git repository.

pingsutw pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/submarine.git


The following commit(s) were added to refs/heads/master by this push:
     new fc65e0c  SUBMARINE-1000. Add submarine save model method
fc65e0c is described below

commit fc65e0c948411dfefb878755be1703e2638878d6
Author: jeff-901 <[email protected]>
AuthorDate: Tue Sep 7 20:21:47 2021 +0800

    SUBMARINE-1000. Add submarine save model method
    
    ### What is this PR for?
    Create submarine save model method.
    
    ### What type of PR is it?
    Feature
    
    ### Todos
    * [x] - Create repository
    * [x] - Create save method for pytorch and tensorflow
    * [ ] - Create model version when saving.
    
    ### What is the Jira issue?
    https://issues.apache.org/jira/browse/SUBMARINE-1000
    
    ### How should this be tested?
    Call the save_model_submarine function and check the artifact by minio ui.
    ### Screenshots (if appropriate)
    ![Screenshot from 2021-08-31 
14-57-59](https://user-images.githubusercontent.com/54139205/131457292-a727ee9e-a302-492e-99e4-e212a7201abc.png)
    
    ### Questions:
    * Do the license files need updating? No
    * Are there breaking changes for older versions? No
    * Does this need new documentation? Yes
    
    Author: jeff-901 <[email protected]>
    
    Signed-off-by: Kevin <[email protected]>
    
    Closes #733 from jeff-901/SUBMARINE-1000 and squashes the following commits:
    
    fe0e9fa2 [jeff-901] fix new lint
    5b4b093f [jeff-901] fix lint
    4f9e5a29 [jeff-901] repository use environ value and use kubectl to wait 
minio
    bd259b75 [jeff-901] add the comment of sleep
    c50beec7 [jeff-901] change Repository.py to lowercase
    3f2d1fb6 [jeff-901] exit if bucket creation fail
    20ca0366 [jeff-901] delete local model
    0c313391 [jeff-901] create submarine bucket when server starts
    b2858ea5 [jeff-901] fix lint
    c7d75620 [jeff-901] fix lint
    6d9cb107 [jeff-901] fix lint and add TODO
    0741d07a [jeff-901] add repository and save model method
---
 bin/submarine.sh                                   |  6 ++
 dev-support/docker-images/submarine/Dockerfile     |  8 ++-
 .../docker-images/submarine/create_bucket.sh       | 35 +++++++++++
 .../pysubmarine/submarine/artifacts/__init__.py    | 20 ++++++
 .../pysubmarine/submarine/artifacts/repository.py  | 73 ++++++++++++++++++++++
 .../pysubmarine/submarine/models/client.py         | 38 ++++++++++-
 .../pysubmarine/submarine/models/pytorch.py        | 22 +++++++
 .../pysubmarine/submarine/models/tensorflow.py     | 18 ++++++
 8 files changed, 216 insertions(+), 4 deletions(-)

diff --git a/bin/submarine.sh b/bin/submarine.sh
index e8c109b..d2e74b0 100755
--- a/bin/submarine.sh
+++ b/bin/submarine.sh
@@ -55,4 +55,10 @@ if [[ ! -d "${SUBMARINE_LOG_DIR}" ]]; then
   $(mkdir -p "${SUBMARINE_LOG_DIR}")
 fi
 
+/usr/local/bin/create_bucket.sh
+if [ $? -ne 0 ];then
+  echo "Create submarine bucket fail" 
+  exit 1
+fi
+
 exec $JAVA_RUNNER $JAVA_OPTS -cp ${SUBMARINE_SERVER_CLASSPATH} 
${SUBMARINE_SERVER_MAIN} "$@" | tee -a "${SUBMARINE_SERVER_LOGFILE}" 2>&1
diff --git a/dev-support/docker-images/submarine/Dockerfile 
b/dev-support/docker-images/submarine/Dockerfile
index 792080a..f038407 100644
--- a/dev-support/docker-images/submarine/Dockerfile
+++ b/dev-support/docker-images/submarine/Dockerfile
@@ -23,7 +23,7 @@ MAINTAINER Apache Software Foundation 
<[email protected]>
 
 # INSTALL openjdk
 RUN apk update && \
-    apk add --no-cache openjdk8 tzdata bash tini && \
+    apk add --no-cache openjdk8 tzdata bash tini&& \
     cp /usr/share/zoneinfo/Asia/Shanghai /etc/localtime && \
     echo Asia/Shanghai > /etc/timezone && \
     apk del tzdata && \
@@ -38,6 +38,12 @@ ADD ./tmp/submarine-site.xml "/opt/submarine-current/conf/"
 ADD ./tmp/submarine.sh "/opt/submarine-current/bin/"
 ADD ./tmp/mysql-connector-java-5.1.39.jar "/opt/submarine-current/lib/"
 
+# Create submarine Bucket
+WORKDIR /usr/local/bin
+RUN wget https://dl.min.io/client/mc/release/linux-amd64/mc && chmod +x mc
+RUN wget https://dl.k8s.io/release/v1.15.11/bin/linux/amd64/kubectl && chmod 
+x kubectl
+COPY create_bucket.sh /usr/local/bin
+
 WORKDIR /opt/submarine-current
 
 # Submarine port
diff --git a/dev-support/docker-images/submarine/create_bucket.sh 
b/dev-support/docker-images/submarine/create_bucket.sh
new file mode 100755
index 0000000..826407e
--- /dev/null
+++ b/dev-support/docker-images/submarine/create_bucket.sh
@@ -0,0 +1,35 @@
+#!/usr/bin/env bash
+#
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#     http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+set -euo pipefail
+
+S3_ENDPOINT_URL="http://submarine-minio-service:9000";
+AWS_ACCESS_KEY_ID="submarine_minio"
+AWS_SECRET_ACCESS_KEY="submarine_minio"
+
+
+# Wait for minio pod to setup
+/bin/bash -c "kubectl wait --for=condition=ready pod -l 
app=submarine-minio-pod; mc config host add minio ${S3_ENDPOINT_URL} 
${AWS_ACCESS_KEY_ID} ${AWS_SECRET_ACCESS_KEY}"
+
+# Create if the bucket "minio/submarine" not exists
+
+if /bin/bash -c "mc ls minio/submarine" >/dev/null 2>&1; then
+    echo "Bucket minio/submarine already exists, skipping creation."
+else
+    /bin/bash -c "mc mb minio/submarine"
+fi
diff --git a/submarine-sdk/pysubmarine/submarine/artifacts/__init__.py 
b/submarine-sdk/pysubmarine/submarine/artifacts/__init__.py
new file mode 100644
index 0000000..c75c093
--- /dev/null
+++ b/submarine-sdk/pysubmarine/submarine/artifacts/__init__.py
@@ -0,0 +1,20 @@
+# Licensed to the Apache Software Foundation (ASF) under one or more
+# contributor license agreements. See the NOTICE file distributed with
+# this work for additional information regarding copyright ownership.
+# The ASF licenses this file to You under the Apache License, Version 2.0
+# (the "License"); you may not use this file except in compliance with
+# the License. You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+from submarine.artifacts.repository import Repository
+
+__all__ = [
+    "Repository",
+]
diff --git a/submarine-sdk/pysubmarine/submarine/artifacts/repository.py 
b/submarine-sdk/pysubmarine/submarine/artifacts/repository.py
new file mode 100644
index 0000000..3dff6fc
--- /dev/null
+++ b/submarine-sdk/pysubmarine/submarine/artifacts/repository.py
@@ -0,0 +1,73 @@
+# Licensed to the Apache Software Foundation (ASF) under one or more
+# contributor license agreements. See the NOTICE file distributed with
+# this work for additional information regarding copyright ownership.
+# The ASF licenses this file to You under the Apache License, Version 2.0
+# (the "License"); you may not use this file except in compliance with
+# the License. You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+import os
+
+import boto3
+
+
+class Repository:
+    def __init__(self, experiment_id):
+        self.client = boto3.client(
+            "s3",
+            aws_access_key_id=os.environ.get("AWS_ACCESS_KEY_ID"),
+            aws_secret_access_key=os.environ.get("AWS_SECRET_ACCESS_KEY"),
+            endpoint_url=os.environ.get("MLFLOW_S3_ENDPOINT_URL"),
+        )
+        self.dest_path = experiment_id
+
+    def _upload_file(self, local_file, bucket, key):
+        self.client.upload_file(Filename=local_file, Bucket=bucket, Key=key)
+
+    def _list_artifact_subfolder(self, artifact_path):
+        response = self.client.list_objects(
+            Bucket="submarine",
+            Prefix=os.path.join(self.dest_path, artifact_path) + "/",
+            Delimiter="/",
+        )
+        return response.get("CommonPrefixes")
+
+    def log_artifact(self, local_file, artifact_path):
+        bucket = "submarine"
+        dest_path = self.dest_path
+        dest_path = os.path.join(dest_path, artifact_path)
+        dest_path = os.path.join(dest_path, os.path.basename(local_file))
+        self._upload_file(
+            local_file=local_file,
+            bucket=bucket,
+            key=dest_path,
+        )
+
+    def log_artifacts(self, local_dir, artifact_path):
+        bucket = "submarine"
+        dest_path = self.dest_path
+        list_of_subfolder = self._list_artifact_subfolder(artifact_path)
+        if list_of_subfolder is None:
+            artifact_path = os.path.join(artifact_path, "1")
+        else:
+            artifact_path = os.path.join(artifact_path, 
str(len(list_of_subfolder) + 1))
+        dest_path = os.path.join(dest_path, artifact_path)
+        local_dir = os.path.abspath(local_dir)
+        for (root, _, filenames) in os.walk(local_dir):
+            upload_path = dest_path
+            if root != local_dir:
+                rel_path = os.path.relpath(root, local_dir)
+                upload_path = os.path.join(dest_path, rel_path)
+            for f in filenames:
+                self._upload_file(
+                    local_file=os.path.join(root, f),
+                    bucket=bucket,
+                    key=os.path.join(upload_path, f),
+                )
diff --git a/submarine-sdk/pysubmarine/submarine/models/client.py 
b/submarine-sdk/pysubmarine/submarine/models/client.py
index a954bac..9cf655f 100644
--- a/submarine-sdk/pysubmarine/submarine/models/client.py
+++ b/submarine-sdk/pysubmarine/submarine/models/client.py
@@ -15,12 +15,16 @@
  under the License.
 """
 import os
+import re
+import tempfile
 import time
 
 import mlflow
 from mlflow.exceptions import MlflowException
 from mlflow.tracking import MlflowClient
 
+from submarine.artifacts.repository import Repository
+
 from .constant import (
     AWS_ACCESS_KEY_ID,
     AWS_SECRET_ACCESS_KEY,
@@ -31,15 +35,21 @@ from .utils import exist_ps, get_job_id, get_worker_index
 
 
 class ModelsClient:
-    def __init__(self, tracking_uri=None, registry_uri=None):
+    def __init__(
+        self,
+        tracking_uri=None,
+        registry_uri=None,
+        aws_access_key_id=None,
+        aws_secret_access_key=None,
+    ):
         """
         Set up mlflow server connection, including: s3 endpoint, aws, tracking 
server
         """
         # if setting url in environment variable,
         # there is no need to set it by MlflowClient() or 
mlflow.set_tracking_uri() again
         os.environ["MLFLOW_S3_ENDPOINT_URL"] = registry_uri or 
MLFLOW_S3_ENDPOINT_URL
-        os.environ["AWS_ACCESS_KEY_ID"] = AWS_ACCESS_KEY_ID
-        os.environ["AWS_SECRET_ACCESS_KEY"] = AWS_SECRET_ACCESS_KEY
+        os.environ["AWS_ACCESS_KEY_ID"] = aws_access_key_id or 
AWS_ACCESS_KEY_ID
+        os.environ["AWS_SECRET_ACCESS_KEY"] = aws_secret_access_key or 
AWS_SECRET_ACCESS_KEY
         os.environ["MLFLOW_TRACKING_URI"] = tracking_uri or MLFLOW_TRACKING_URI
         self.client = MlflowClient()
         self.type_to_log_model = {
@@ -48,6 +58,7 @@ class ModelsClient:
             "tensorflow": mlflow.tensorflow.log_model,
             "keras": mlflow.keras.log_model,
         }
+        self.artifact_repo = Repository(get_job_id())
 
     def start(self):
         """
@@ -98,6 +109,27 @@ class ModelsClient:
             else:
                 raise MlflowException("No valid type of model has been 
matched")
 
+    def save_model_submarine(self, model_type, model, artifact_path, 
registered_model_name=None):
+        pattern = r"[0-9A-Za-z][0-9A-Za-z-_]*[0-9A-Za-z]|[0-9A-Za-z]"
+        if not re.fullmatch(pattern, artifact_path):
+            raise Exception(
+                "Artifact_path must only contains numbers, characters, hyphen 
and underscore.      "
+                "        Artifact_path must starts and ends with numbers or 
characters."
+            )
+        with tempfile.TemporaryDirectory() as tempdir:
+            if model_type == "pytorch":
+                import submarine.models.pytorch
+
+                submarine.models.pytorch.save_model(model, tempdir)
+            elif model_type == "tensorflow":
+                import submarine.models.tensorflow
+
+                submarine.models.tensorflow.save_model(model, tempdir)
+            else:
+                raise Exception("No valid type of model has been matched to 
{}".format(model_type))
+            self.artifact_repo.log_artifacts(tempdir, artifact_path)
+        # TODO for registering model ()
+
     def _get_or_create_experiment(self, experiment_name):
         """
         Return the id of experiment.
diff --git a/submarine-sdk/pysubmarine/submarine/models/pytorch.py 
b/submarine-sdk/pysubmarine/submarine/models/pytorch.py
new file mode 100644
index 0000000..a143aa5
--- /dev/null
+++ b/submarine-sdk/pysubmarine/submarine/models/pytorch.py
@@ -0,0 +1,22 @@
+# Licensed to the Apache Software Foundation (ASF) under one or more
+# contributor license agreements. See the NOTICE file distributed with
+# this work for additional information regarding copyright ownership.
+# The ASF licenses this file to You under the Apache License, Version 2.0
+# (the "License"); you may not use this file except in compliance with
+# the License. You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+import os
+
+import torch
+
+
+def save_model(model, artifact_path):
+    torch.save(model, os.path.join(artifact_path, "model.pth"))
diff --git a/submarine-sdk/pysubmarine/submarine/models/tensorflow.py 
b/submarine-sdk/pysubmarine/submarine/models/tensorflow.py
new file mode 100644
index 0000000..fbe5324
--- /dev/null
+++ b/submarine-sdk/pysubmarine/submarine/models/tensorflow.py
@@ -0,0 +1,18 @@
+# Licensed to the Apache Software Foundation (ASF) under one or more
+# contributor license agreements. See the NOTICE file distributed with
+# this work for additional information regarding copyright ownership.
+# The ASF licenses this file to You under the Apache License, Version 2.0
+# (the "License"); you may not use this file except in compliance with
+# the License. You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+
+def save_model(model, artifact_path):
+    model.save(artifact_path)

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to