[ https://issues.apache.org/jira/browse/TIKA-2306?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15968610#comment-15968610 ]

ASF GitHub Bot commented on TIKA-2306:
--------------------------------------

KranthiGV commented on a change in pull request #163: TIKA-2306: Update 
Inception v3 to Inception v4 in Object recognition parser 
URL: https://github.com/apache/tika/pull/163#discussion_r111529725
 
 

 ##########
 File path: tika-parsers/src/main/resources/org/apache/tika/parser/recognition/tf/classify_image.py
 ##########
 @@ -44,169 +49,205 @@
 from six.moves import urllib
 import tensorflow as tf
 
+from datasets import imagenet, dataset_utils
+from nets import inception
+from preprocessing import inception_preprocessing
+
+slim = tf.contrib.slim
+
 FLAGS = tf.app.flags.FLAGS
 
-# classify_image_graph_def.pb:
-#   Binary representation of the GraphDef protocol buffer.
-# imagenet_synset_to_human_label_map.txt:
+# inception_v4.ckpt
+#   Inception V4 checkpoint file.
+# imagenet_metadata.txt
 #   Map from synset ID to a human readable string.
-# imagenet_2012_challenge_label_map_proto.pbtxt:
+# imagenet_lsvrc_2015_synsets.txt
-#   Text representation of a protocol buffer mapping a label to synset ID.
+#   Text file listing the valid ILSVRC 2015 synset IDs, one per line.
 tf.app.flags.DEFINE_string(
     'model_dir', '/tmp/imagenet',
-    """Path to classify_image_graph_def.pb, """
-    """imagenet_synset_to_human_label_map.txt, and """
-    """imagenet_2012_challenge_label_map_proto.pbtxt.""")
+    """Path to inception_v4.ckpt, """
+    """imagenet_lsvrc_2015_synsets.txt, and """
+    """imagenet_metadata.txt.""")
 tf.app.flags.DEFINE_string('image_file', '',
                            """Absolute path to image file.""")
 tf.app.flags.DEFINE_integer('num_top_predictions', 5,
                             """Display this many predictions.""")
 
 # pylint: disable=line-too-long
-DATA_URL = 'http://download.tensorflow.org/models/image/imagenet/inception-2015-12-05.tgz'
+DATA_URL = 'http://download.tensorflow.org/models/inception_v4_2016_09_09.tar.gz'
 # pylint: enable=line-too-long
 
 
-class NodeLookup(object):
-  """Converts integer node ID's to human readable labels."""
+def create_readable_names_for_imagenet_labels():
+    """Create a dict mapping label id to human readable string.
+
+    Returns:
+        labels_to_names: dictionary where keys are integers from 0 to 1000
+        and values are human-readable names.
+
+    We retrieve a synset file, which contains a list of valid synset labels used
+    by the ILSVRC competition. There is one synset per line, e.g.:
+            #   n01440764
+            #   n01443537
+    We also retrieve a synset_to_human_file, which contains a mapping from synsets
+    to human-readable names for every synset in ImageNet. These are stored in
+    TSV format, as follows:
+            #   n02119247    black fox
+            #   n02119359    silver fox
+    We assign each synset (in alphabetical order) an integer, starting from 1
+    (since 0 is reserved for the background class).
+
+    Code is based on
+    https://github.com/tensorflow/models/blob/master/inception/inception/data/build_imagenet_data.py#L463
+    """
+
+    # pylint: disable=line-too-long
+
+    dest_directory = FLAGS.model_dir
+
+    synset_list = [s.strip() for s in open(os.path.join(
+        dest_directory, 'imagenet_lsvrc_2015_synsets.txt')).readlines()]
+    num_synsets_in_ilsvrc = len(synset_list)
+    assert num_synsets_in_ilsvrc == 1000
 
-  def __init__(self,
-               label_lookup_path=None,
-               uid_lookup_path=None):
-    if not label_lookup_path:
-      label_lookup_path = os.path.join(
-          FLAGS.model_dir, 'imagenet_2012_challenge_label_map_proto.pbtxt')
-    if not uid_lookup_path:
-      uid_lookup_path = os.path.join(
-          FLAGS.model_dir, 'imagenet_synset_to_human_label_map.txt')
-    self.node_lookup = self.load(label_lookup_path, uid_lookup_path)
+    synset_to_human_list = open(os.path.join(
+        dest_directory, 'imagenet_metadata.txt')).readlines()
+    num_synsets_in_all_imagenet = len(synset_to_human_list)
+    assert num_synsets_in_all_imagenet == 21842
 
-  def load(self, label_lookup_path, uid_lookup_path):
-    """Loads a human readable English name for each softmax node.
+    synset_to_human = {}
+    for s in synset_to_human_list:
+        parts = s.strip().split('\t')
+        assert len(parts) == 2
+        synset = parts[0]
+        human = parts[1]
+        synset_to_human[synset] = human
+
+    label_index = 1
+    labels_to_names = {0: 'background'}
+    for synset in synset_list:
+        name = synset_to_human[synset]
+        labels_to_names[label_index] = name
+        label_index += 1
+
+    return labels_to_names
+
+
+def run_inference_on_image(image):
+    """Runs inference on an image.
 
     Args:
-      label_lookup_path: string UID to integer node ID.
-      uid_lookup_path: string UID to human-readable string.
+      image: Image file name.
 
     Returns:
-      dict from integer node ID to human-readable string.
+      Nothing
     """
-    if not tf.gfile.Exists(uid_lookup_path):
-      tf.logging.fatal('File does not exist %s', uid_lookup_path)
-    if not tf.gfile.Exists(label_lookup_path):
-      tf.logging.fatal('File does not exist %s', label_lookup_path)
-
-    # Loads mapping from string UID to human-readable string
-    proto_as_ascii_lines = tf.gfile.GFile(uid_lookup_path).readlines()
-    uid_to_human = {}
-    p = re.compile(r'[n\d]*[ \S,]*')
-    for line in proto_as_ascii_lines:
-      parsed_items = p.findall(line)
-      uid = parsed_items[0]
-      human_string = parsed_items[2]
-      uid_to_human[uid] = human_string
-
-    # Loads mapping from string UID to integer node ID.
-    node_id_to_uid = {}
-    proto_as_ascii = tf.gfile.GFile(label_lookup_path).readlines()
-    for line in proto_as_ascii:
-      if line.startswith('  target_class:'):
-        target_class = int(line.split(': ')[1])
-      if line.startswith('  target_class_string:'):
-        target_class_string = line.split(': ')[1]
-        node_id_to_uid[target_class] = target_class_string[1:-2]
-
-    # Loads the final mapping of integer node ID to human-readable string
-    node_id_to_name = {}
-    for key, val in node_id_to_uid.items():
-      if val not in uid_to_human:
-        tf.logging.fatal('Failed to locate: %s', val)
-      name = uid_to_human[val]
-      node_id_to_name[key] = name
-
-    return node_id_to_name
-
-  def id_to_string(self, node_id):
-    if node_id not in self.node_lookup:
-      return ''
-    return self.node_lookup[node_id]
-
-
-def create_graph():
-  """Creates a graph from saved GraphDef file and returns a saver."""
-  # Creates graph from saved graph_def.pb.
-  with tf.gfile.FastGFile(os.path.join(
-      FLAGS.model_dir, 'classify_image_graph_def.pb'), 'rb') as f:
-    graph_def = tf.GraphDef()
-    graph_def.ParseFromString(f.read())
-    _ = tf.import_graph_def(graph_def, name='')
+    dest_directory = FLAGS.model_dir
 
+    image_size = inception.inception_v4.default_image_size
 
-def run_inference_on_image(image):
-  """Runs inference on an image.
-
-  Args:
-    image: Image file name.
-
-  Returns:
-    Nothing
-  """
-  if not tf.gfile.Exists(image):
-    tf.logging.fatal('File does not exist %s', image)
-  image_data = tf.gfile.FastGFile(image, 'rb').read()
-
-  # Creates graph from saved GraphDef.
-  create_graph()
-
-  with tf.Session() as sess:
-    # Some useful tensors:
-    # 'softmax:0': A tensor containing the normalized prediction across
-    #   1000 labels.
-    # 'pool_3:0': A tensor containing the next-to-last layer containing 2048
-    #   float description of the image.
-    # 'DecodeJpeg/contents:0': A tensor containing a string providing JPEG
-    #   encoding of the image.
-    # Runs the softmax tensor by feeding the image_data as input to the graph.
-    softmax_tensor = sess.graph.get_tensor_by_name('softmax:0')
-    predictions = sess.run(softmax_tensor,
-                           {'DecodeJpeg/contents:0': image_data})
-    predictions = np.squeeze(predictions)
-
-    # Creates node ID --> English string lookup.
-    node_lookup = NodeLookup()
-
-    top_k = predictions.argsort()[-FLAGS.num_top_predictions:][::-1]
-    for node_id in top_k:
-      human_string = node_lookup.id_to_string(node_id)
-      score = predictions[node_id]
-      print('%s (score = %.5f)' % (human_string, score))
+    if not tf.gfile.Exists(image):
+        tf.logging.fatal('File does not exist %s', image)
+    image_string = tf.gfile.FastGFile(image, 'rb').read()
 
+    with tf.Graph().as_default():
+        image = tf.image.decode_jpeg(image_string, channels=3)
+        processed_image = inception_preprocessing.preprocess_image(
+            image, image_size, image_size, is_training=False)
+        processed_images = tf.expand_dims(processed_image, 0)
+
+        # Create the model, use the default arg scope to configure the batch
+        # norm parameters.
+        with slim.arg_scope(inception.inception_v4_arg_scope()):
+            logits, _ = inception.inception_v4(
+                processed_images, num_classes=1001, is_training=False)
+        probabilities = tf.nn.softmax(logits)
+
+        init_fn = slim.assign_from_checkpoint_fn(
+            os.path.join(dest_directory, 'inception_v4.ckpt'),
+            slim.get_model_variables('InceptionV4'))
+
+        with tf.Session() as sess:
+            init_fn(sess)
+            _, probabilities = sess.run([image, probabilities])
 
 Review comment:
   Thanks for the review. I'll work on adding a constructor method as suggested. 
One question, though: wouldn't this leave classify_image.py unaffected, since 
we have to re-run that Python file every time recognition is invoked through 
the CLI?
   What you suggested seems more relevant to the REST API, where we can load the 
model once and reuse it. Correct me if I overlooked something.
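   For illustration, here is a minimal sketch of the load-once pattern under discussion, assuming TF 1.x with the slim `nets`/`preprocessing` modules on the path as in this patch. The `InceptionClassifier` name and `classify` method are hypothetical, not part of the PR:

```python
import os

import tensorflow as tf
from nets import inception
from preprocessing import inception_preprocessing

slim = tf.contrib.slim


class InceptionClassifier(object):
    """Loads the Inception V4 checkpoint once; reuses the session per call."""

    def __init__(self, model_dir):
        image_size = inception.inception_v4.default_image_size
        self._graph = tf.Graph()
        with self._graph.as_default():
            # Raw JPEG bytes come in through a placeholder so the same
            # graph serves every image.
            self._jpeg = tf.placeholder(tf.string, shape=[])
            image = tf.image.decode_jpeg(self._jpeg, channels=3)
            processed = inception_preprocessing.preprocess_image(
                image, image_size, image_size, is_training=False)
            processed = tf.expand_dims(processed, 0)
            with slim.arg_scope(inception.inception_v4_arg_scope()):
                logits, _ = inception.inception_v4(
                    processed, num_classes=1001, is_training=False)
            self._probabilities = tf.nn.softmax(logits)
            init_fn = slim.assign_from_checkpoint_fn(
                os.path.join(model_dir, 'inception_v4.ckpt'),
                slim.get_model_variables('InceptionV4'))
        # One session for the lifetime of the object: the checkpoint is
        # read exactly once, however many images are classified.
        self._sess = tf.Session(graph=self._graph)
        init_fn(self._sess)

    def classify(self, image_path):
        image_string = tf.gfile.FastGFile(image_path, 'rb').read()
        return self._sess.run(self._probabilities,
                              feed_dict={self._jpeg: image_string})
```

   A long-lived service (e.g. the REST endpoint) would construct this once and call `classify()` per request, whereas the CLI path still pays the full load cost on every invocation.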
 
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Update Inception v3 to Inception v4 in Object recognition parser 
> -----------------------------------------------------------------
>
>                 Key: TIKA-2306
>                 URL: https://issues.apache.org/jira/browse/TIKA-2306
>             Project: Tika
>          Issue Type: Improvement
>          Components: parser
>    Affects Versions: 1.14
>            Reporter: Kranthi Kiran GV
>            Priority: Minor
>              Labels: inception, object_recognition
>   Original Estimate: 72h
>  Remaining Estimate: 72h
>
> The Object Recognition Parser currently uses the Inception V3 model for the 
> object classification task. Google has released a newer Inception V4 model [1][2].
> It has an improved Top-1 accuracy of 80.2% and Top-5 accuracy of 95.2% [3].
> I believe the Tika community would benefit from it. I will be working on 
> this issue over the next few days.
> [1] https://research.googleblog.com/2016/08/improving-inception-and-image.html
> [2] https://arxiv.org/abs/1602.07261
> [3] https://github.com/tensorflow/models/tree/master/slim



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
