[GitHub] [incubator-mxnet] zhreshold commented on a change in pull request #17841: Gluon data 2.0: c++ dataloader and built-in image/bbox transforms

GitBox Wed, 29 Apr 2020 22:03:31 -0700


zhreshold commented on a change in pull request #17841:
URL: https://github.com/apache/incubator-mxnet/pull/17841#discussion_r417754919




##########
File path: python/mxnet/gluon/contrib/data/vision/dataloader.py
##########
@@ -0,0 +1,521 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+# coding: utf-8
+# pylint: disable= arguments-differ, wildcard-import
+"Contrib Vision DataLoaders."
+import logging
+import numpy as np
+
+from ..... import nd
+from .....util import is_np_array
+from ..... import np as _mx_np   # pylint: disable=reimported
+from ....nn import HybridSequential, Sequential, HybridBlock, Block
+from ....data.vision import transforms
+from ....data import DataLoader
+from .transforms import bbox
+
+__all__ = ['create_image_augment', 'ImageDataLoader', 'ImageBboxDataLoader']
+
+def create_image_augment(data_shape, resize=0, rand_crop=False, rand_resize=False, rand_mirror=False,
+                         mean=None, std=None, brightness=0, contrast=0, saturation=0, hue=0,
+                         pca_noise=0, rand_gray=0, inter_method=2, dtype='float32'):
+    """Creates an augmenter block.
+
+    Parameters
+    ----------
+    data_shape : tuple of int
+        Shape for output data
+    resize : int
+        Resize shorter edge if larger than 0 at the beginning
+    rand_crop : bool
+        Whether to enable random cropping other than center crop
+    rand_resize : bool
+        Whether to enable random sized cropping, require rand_crop to be enabled
+    rand_gray : float
+        [0, 1], probability to convert to grayscale for all channels, the number
+        of channels will not be reduced to 1
+    rand_mirror : bool
+        Whether to apply horizontal flip to image with probability 0.5
+    mean : np.ndarray or None
+        Mean pixel values for [r, g, b]
+    std : np.ndarray or None
+        Standard deviations for [r, g, b]
+    brightness : float
+        Brightness jittering range (percent)
+    contrast : float
+        Contrast jittering range (percent)
+    saturation : float
+        Saturation jittering range (percent)
+    hue : float
+        Hue jittering range (percent)
+    pca_noise : float
+        Pca noise level (percent)
+    inter_method : int, default=2 (Bicubic)
+        Interpolation method for all resizing operations
+
+        Possible values:
+        0: Nearest Neighbors Interpolation.
+        1: Bilinear interpolation.
+        2: Bicubic interpolation over 4x4 pixel neighborhood.
+        3: Area-based (resampling using pixel area relation). It may be a
+        preferred method for image decimation, as it gives moire-free
+        results. But when the image is zoomed, it is similar to the Nearest
+        Neighbors method.
+        4: Lanczos interpolation over 8x8 pixel neighborhood.
+        10: Randomly select one of the interpolation methods mentioned above.
+        Note:
+        When shrinking an image, it will generally look best with AREA-based
+        interpolation, whereas, when enlarging an image, it will generally look best
+        with Bicubic (slow) or Bilinear (faster but still looks OK).
+
+    Examples
+    --------
+    >>> # An example of creating multiple augmenters
+    >>> augs = mx.gluon.contrib.data.create_image_augment(data_shape=(3, 300, 300), rand_mirror=True,
+    ...    mean=True, brightness=0.125, contrast=0.125, rand_gray=0.05,
+    ...    saturation=0.125, pca_noise=0.05, inter_method=10)
+    """
+    if inter_method == 10:
+        inter_method = np.random.randint(0, 5)
+    augmenter = HybridSequential('default_img_augment_')
+    if resize > 0:
+        augmenter.add(transforms.image.Resize(resize, interpolation=inter_method))
+    crop_size = (data_shape[2], data_shape[1])
+    if rand_resize:
+        assert rand_crop
+        augmenter.add(transforms.image.RandomResizedCrop(crop_size, interpolation=inter_method))
+    elif rand_crop:
+        augmenter.add(transforms.image.RandomCrop(crop_size, interpolation=inter_method))
+    else:
+        augmenter.add(transforms.image.CenterCrop(crop_size, interpolation=inter_method))
+
+    if rand_mirror:
+        augmenter.add(transforms.image.RandomFlipLeftRight(0.5))
+
+    augmenter.add(transforms.Cast())
+
+    if brightness or contrast or saturation or hue:
+        augmenter.add(transforms.image.RandomColorJitter(brightness, contrast, saturation, hue))
+
+    if pca_noise > 0:
+        augmenter.add(transforms.image.RandomLighting(pca_noise))
+
+    if rand_gray > 0:
+        augmenter.add(transforms.image.RandomGray(rand_gray))
+
+    if mean is True:
+        mean = [123.68, 116.28, 103.53]
+    elif mean is not None:
+        assert isinstance(mean, (tuple, list))
+
+    if std is True:
+        std = [58.395, 57.12, 57.375]
+    elif std is not None:
+        assert isinstance(std, (tuple, list))
+
+    augmenter.add(transforms.image.ToTensor())
+
+    if mean is not None or std is not None:
+        augmenter.add(transforms.image.Normalize(mean, std))
+
+    augmenter.add(transforms.Cast(dtype))
+
+    return augmenter
+
+class ImageDataLoader(object):

Review comment:
   About 200 lines of code can save each user at least 5 minutes, especially
beginners. It does not duplicate any existing code block, and it can handle
most image use cases.

   So I think it's worth the effort to put it in `gluon.contrib` and keep it
maintained.
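
   As a rough illustration of what the helper in the diff assembles for the
user, the branching in `create_image_augment` can be traced with a small
pure-Python sketch. This is not MXNet code and makes several assumptions:
`plan_augment_pipeline` is a hypothetical name, each stage is a plain string
rather than a transform block, `random.randrange` stands in for
`np.random.randint`, and a `ValueError` replaces the bare `assert`.

```python
import random


def plan_augment_pipeline(resize=0, rand_crop=False, rand_resize=False,
                          rand_mirror=False, mean=None, std=None,
                          brightness=0, contrast=0, saturation=0, hue=0,
                          pca_noise=0, rand_gray=0, inter_method=2,
                          rng=random):
    """Return (stage names, resolved inter_method) for the given flags.

    Illustrative only: mirrors the control flow of the quoted diff, but the
    stages are strings, not MXNet transform blocks.
    """
    if inter_method == 10:
        # One draw from {0, 1, 2, 3, 4}, made once when the pipeline is
        # built, so every resize stage shares the same method.
        inter_method = rng.randrange(0, 5)

    stages = []
    if resize > 0:
        stages.append("Resize")

    # Exactly one crop stage is always added.
    if rand_resize:
        if not rand_crop:
            raise ValueError("rand_resize requires rand_crop=True")
        stages.append("RandomResizedCrop")
    elif rand_crop:
        stages.append("RandomCrop")
    else:
        stages.append("CenterCrop")

    if rand_mirror:
        stages.append("RandomFlipLeftRight")

    stages.append("Cast")  # cast before the color augmentations

    if brightness or contrast or saturation or hue:
        stages.append("RandomColorJitter")
    if pca_noise > 0:
        stages.append("RandomLighting")
    if rand_gray > 0:
        stages.append("RandomGray")

    stages.append("ToTensor")
    if mean is not None or std is not None:
        stages.append("Normalize")
    stages.append("Cast")  # final cast to the requested dtype

    return stages, inter_method


if __name__ == "__main__":
    print(plan_augment_pipeline())
    print(plan_augment_pipeline(resize=256, rand_crop=True, rand_mirror=True,
                                mean=True, std=True, brightness=0.125))
```

   With default arguments the sketch returns
`(['CenterCrop', 'Cast', 'ToTensor', 'Cast'], 2)`, so even the minimal
pipeline has four stages that a user would otherwise wire up by hand.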




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org
