[
https://issues.apache.org/jira/browse/USERGRID-536?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Jeffrey updated USERGRID-536:
------------------------------
Priority: Critical (was: Major)
> Change our index structure for static mapping and cleanup api
> -------------------------------------------------------------
>
> Key: USERGRID-536
> URL: https://issues.apache.org/jira/browse/USERGRID-536
> Project: Usergrid
> Issue Type: Story
> Components: Stack
> Reporter: Todd Nine
> Assignee: Todd Nine
> Priority: Critical
> Fix For: 2.1
>
>
> Currently, our dynamic mapping causes several issues with Elasticsearch. We
> should change our mapping to use a static structure to resolve this
> operational pain.
> We need to make the following changes.
> h2. Modify our IndexScope
> This should more closely resemble the elements of an edge since this
> represents an edge. It will simplify the use of our query module and make
> development clearer. This scope should be refactored into the following
> objects.
> * IndexEdge - Id, name, timestamp, edgeType (source or target)
> * SearchEdge - Id, name, edgeType
> Note: edgeType is the type of the Id within the edge. Does this Id represent
> a source Id, or does it represent a targetId? The entity to be indexed will
> implicitly be the opposite of the type specified. I.e., if it's a source edge,
> the document is the target. If it's a target edge, the document is the
> source.
> These values should also be stored within our document, so that we can index
> our documents. Note that we perform bidirectional indexing in some cases,
> such as users, groups, etc. When we do this, we need to ensure that we mark
> the direction of the edge appropriately.
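A minimal sketch of the two scope objects described above, written in Python for illustration only. The class and attribute names (`EdgeType`, `node_id`, `document_side`, `to_search_edge`) are assumptions, not names taken from the Usergrid code base.

```python
from dataclasses import dataclass
from enum import Enum


class EdgeType(Enum):
    """Which side of the edge the stored Id sits on."""
    SOURCE = "source"
    TARGET = "target"

    def document_side(self) -> "EdgeType":
        # The indexed entity is implicitly the opposite side of the edge.
        return EdgeType.TARGET if self is EdgeType.SOURCE else EdgeType.SOURCE


@dataclass(frozen=True)
class SearchEdge:
    node_id: str          # the Id within the edge (hypothetical attribute name)
    name: str             # the edge name
    edge_type: EdgeType   # whether node_id is the source or the target


@dataclass(frozen=True)
class IndexEdge:
    node_id: str
    name: str
    timestamp: int        # edge timestamp
    edge_type: EdgeType

    def to_search_edge(self) -> SearchEdge:
        # A SearchEdge carries the same identity minus the timestamp.
        return SearchEdge(self.node_id, self.name, self.edge_type)
```

For example, `IndexEdge("user-1", "likes", 1425000000000, EdgeType.SOURCE)` describes an edge whose Id is the source, so the document being indexed is the target.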
> h2. Change default sort ordering
> When sorting is unspecified, we should order by timestamp descending from our
> index edge. This ensures that we retain the correct edge time semantics, and
> will properly order collections and connections.
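As a rough illustration of the default ordering, the sketch below builds a search request body that falls back to sorting on `edgeTimestamp` descending when the caller supplies no sort. The helper name `build_search_body` is hypothetical; the field name comes from the static mapping listed in this issue.

```python
def build_search_body(query, sort=None):
    """Return a search body, defaulting to edge timestamp descending."""
    if not sort:
        # No explicit ordering requested: keep edge time semantics.
        sort = [{"edgeTimestamp": {"order": "desc"}}]
    return {"query": query, "sort": sort}
```

An explicit sort passed by the caller is left untouched, so only unspecified ordering is affected.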
> h2. Remove the legacy query class
> We don't need the Query class; it has far too many functions to be a
> well-encapsulated object. Instead, we should simply take the string QL, the
> SearchEdge and the limit to return our candidates. From there, we should
> parse and visit the query internally to the query logic, NOT externally.
> h2. Create a static mapping
> The mapping should contain the following static fields.
> * entityId - The entity id
> * entityType - The entity type (from the id)
> * entityVersion - The entity version
> * edgeId - The edge Id
> * edgeName - The edge name
> * edgeTimestamp - The edge timestamp
> * edgeType - source | target
> * edgeSearch - edgeId + edgeName + edgeType
> It will then contain an array of "fields". Each of these fields will have the
> following format.
> {code}
> { "name":"[entity field name as a path]", "[field type]":[field value] }
> {code}
> We will define a field type for each type of field. Note that each field
> tuple will always contain a single field and a single value. Possible field
> types are the following.
> * string - This will be mapped into 2 mappings using multi-field mapping: an
> unanalyzed string and an analyzed string. The 2 fields will then be
> "string_u" and "string_a". The Query visitor will need to update the field
> name appropriately.
> * long - An unanalyzed long
> * double - An unanalyzed double
> * boolean - An unanalyzed boolean
> * location - A geolocation field
> * uuid - A UUID stored as an unanalyzed string
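One possible shape for such a mapping is sketched below as a Python dict. Several things here are assumptions rather than decisions from this issue: the pre-5.x Elasticsearch syntax (`"type": "string"` with `"index": "not_analyzed"`), the use of `"dynamic": "strict"` to reject unmapped fields, treating `entityVersion` as an unanalyzed string, and placing "string_u" and "string_a" as multi-field sub-fields of "string".

```python
def unanalyzed_string():
    # Pre-5.x Elasticsearch syntax (assumption); later versions use "keyword".
    return {"type": "string", "index": "not_analyzed"}


STATIC_MAPPING = {
    "dynamic": "strict",  # assumption: reject any field not listed here
    "properties": {
        "entityId": unanalyzed_string(),
        "entityType": unanalyzed_string(),
        "entityVersion": unanalyzed_string(),  # assumption: stored as a string
        "edgeId": unanalyzed_string(),
        "edgeName": unanalyzed_string(),
        "edgeTimestamp": {"type": "long"},
        "edgeType": unanalyzed_string(),       # "source" or "target"
        "edgeSearch": unanalyzed_string(),     # edgeId + edgeName + edgeType
        "fields": {
            # Nested so each name/value tuple is matched as a single unit.
            "type": "nested",
            "properties": {
                "name": unanalyzed_string(),
                "string": {
                    "type": "string",
                    "index": "analyzed",
                    "fields": {
                        "string_u": unanalyzed_string(),
                        "string_a": {"type": "string", "index": "analyzed"},
                    },
                },
                "long": {"type": "long"},
                "double": {"type": "double"},
                "boolean": {"type": "boolean"},
                "location": {"type": "geo_point"},
                "uuid": unanalyzed_string(),
            },
        },
    },
}
```

Because every entity property lands in the same small set of typed sub-fields, the number of mapped fields stays fixed no matter how many distinct property names entities use.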
> The entity path will be a flattened path from the root json element to the
> max json element. It can be thought of as a path through the tree of json
> elements. We will use a dot '.' to delimit the fields. X.Y.Z for nested
> objects. Primitive arrays will contain a field object for each element in
> the array.
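The flattening described above could look roughly like the following sketch. The function names `field_type` and `flatten` are hypothetical. The sketch also makes choices the issue does not specify: null values are skipped, arrays of objects are walked with the array's own path, UUIDs are only recognized when passed as `uuid.UUID` instances, and location detection is left out.

```python
import uuid


def field_type(value):
    """Pick the field type key for a primitive value."""
    # bool must be tested before int because bool is a subclass of int.
    if isinstance(value, bool):
        return "boolean"
    if isinstance(value, int):
        return "long"
    if isinstance(value, float):
        return "double"
    if isinstance(value, uuid.UUID):
        return "uuid"
    if isinstance(value, str):
        return "string"
    raise TypeError(f"unsupported value type: {type(value).__name__}")


def flatten(value, path=""):
    """Yield one {"name": path, type: value} tuple per primitive value."""
    if isinstance(value, dict):
        for key, child in value.items():
            # Nested objects extend the path with a dot: X.Y.Z
            child_path = f"{path}.{key}" if path else key
            yield from flatten(child, child_path)
    elif isinstance(value, (list, tuple)):
        # Each array element gets its own field object under the same path.
        for element in value:
            yield from flatten(element, path)
    elif value is not None:
        stored = str(value) if isinstance(value, uuid.UUID) else value
        yield {"name": path, field_type(value): stored}
```

Flattening `{"address": {"city": "Austin"}, "tags": ["a", "b"]}` yields one tuple named `address.city` and two tuples named `tags`, one per array element.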
> h2. Indexing
> When indexing entities, we will no longer modify or prefix field names.
> They will be inserted into the value exactly as their path appears after
> lowercasing.
> h2. Querying
> When querying, the "contains" operation for a string will need to use the
> "string_a" data type. When using =, we will need to use the "string_u" data
> type. Each criterion will need to use nested object querying, to ensure the
> property name and property value are both part of the same field tuple.
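A sketch of how a single criterion might be translated is shown below. It builds an Elasticsearch `nested` query so that the name and the value must match inside the same field tuple. The helper names and the exact sub-field paths (`fields.string.string_u`, `fields.string.string_a`) are assumptions; they depend on how the multi-field mapping is finally laid out.

```python
# Hypothetical field paths; adjust to the final mapping layout.
FIELD_NAME = "fields.name"
STRING_EXACT = "fields.string.string_u"
STRING_ANALYZED = "fields.string.string_a"


def nested_criterion(name, value_clause):
    """Wrap a value clause so name and value match in the same tuple."""
    return {
        "nested": {
            "path": "fields",
            "query": {
                "bool": {
                    "must": [
                        # Field names are indexed lowercased.
                        {"term": {FIELD_NAME: name.lower()}},
                        value_clause,
                    ]
                }
            },
        }
    }


def string_equals(name, value):
    # "=" compares against the unanalyzed form.
    return nested_criterion(name, {"term": {STRING_EXACT: value}})


def string_contains(name, value):
    # "contains" searches the analyzed form.
    return nested_criterion(name, {"match": {STRING_ANALYZED: value}})
```

Without the `nested` wrapper, a document with tuples `(city, "Austin")` and `(state, "Texas")` could wrongly match `city = "Texas"`, since name and value would be matched independently.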
> h3. References
> Multi Field Mapping:
> http://www.elastic.co/guide/en/elasticsearch/reference/current/_multi_fields.html
> Nested Objects:
> http://www.elastic.co/guide/en/elasticsearch/guide/current/nested-objects.html
> Nested Object Search:
> http://www.elastic.co/guide/en/elasticsearch/guide/master/nested-sorting.html