[jira] [Updated] (USERGRID-536) Change our index structure for static mapping and cleanup api

Jeffrey (JIRA) Sun, 12 Jul 2015 09:01:42 -0700

     [ 
https://issues.apache.org/jira/browse/USERGRID-536?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]


Jeffrey  updated USERGRID-536:
------------------------------
    Priority: Critical  (was: Major)

> Change our index structure for static mapping and cleanup api
> -------------------------------------------------------------
>
>                 Key: USERGRID-536
>                 URL: https://issues.apache.org/jira/browse/USERGRID-536
>             Project: Usergrid
>          Issue Type: Story
>          Components: Stack
>            Reporter: Todd Nine
>            Assignee: Todd Nine
>            Priority: Critical
>             Fix For: 2.1
>
>
> Currently, our dynamic mapping causes several issues with Elasticsearch.  We 
> should change our mapping to use a static structure to resolve this 
> operational pain.
> We need to make the following changes.
> h2. Modify our IndexScope
> This should more closely resemble the elements of an edge since this 
> represents an edge. It will simplify the use of our query module and make 
> development clearer.  This scope should be refactored into the following 
> objects.  
> * IndexEdge - Id, name, timestamp, edgeType (source or target)
> * SearchEdge - Id, name, edgeType
> Note: edgeType is the type of the Id within the edge.  Does this Id represent 
> a source Id, or does it represent a targetId?  The entity to be indexed will 
> implicitly be the opposite of the type specified.  I.e., if it's a source edge, 
> the document is the target.  If it's a target edge, the document is the 
> source.
> These values should also be stored within our document, so that we can index 
> our documents.  Note that we perform bidirectional indexing in some cases, 
> such as users, groups, etc.  When we do this, we need to ensure that we mark 
> the direction of the edge appropriately.
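
A minimal sketch of the two scope objects described above, written in Python for brevity rather than in the project's own language. The class names and the four attributes come from the issue text; the snake_case field names, the `document_side` helper, and `to_search_edge` are hypothetical and only illustrate the "document is the opposite side" rule.

```python
from dataclasses import dataclass
from enum import Enum


class EdgeType(Enum):
    """Which side of the edge the stored Id represents."""
    SOURCE = "source"
    TARGET = "target"

    def document_side(self) -> "EdgeType":
        # The indexed entity is implicitly the opposite side of the edge.
        return EdgeType.TARGET if self is EdgeType.SOURCE else EdgeType.SOURCE


@dataclass(frozen=True)
class SearchEdge:
    node_id: str         # the Id held by the edge
    name: str            # the edge name
    edge_type: EdgeType  # whether node_id is the source or the target


@dataclass(frozen=True)
class IndexEdge:
    node_id: str
    name: str
    timestamp: int       # edge timestamp, used for ordering
    edge_type: EdgeType

    def to_search_edge(self) -> SearchEdge:
        # Hypothetical helper: a search edge is an index edge minus the timestamp.
        return SearchEdge(self.node_id, self.name, self.edge_type)
```

With a source edge, the document being indexed is the target entity, and vice versa.
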
> h2. Change default sort ordering
> When sorting is unspecified, we should order by timestamp descending from our 
> index edge.  This ensures that we retain the correct edge time semantics, and 
> will properly order collections and connections.
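
As a rough illustration of the default ordering, the sketch below falls back to a descending sort on the edge timestamp when the caller supplies no sort. The field name `edgeTimestamp` is taken from the static mapping in this issue; the `sort_clause` function and its argument are hypothetical, and the dictionaries follow the general shape of an Elasticsearch `sort` clause.

```python
from typing import Optional

# Default: newest edges first.
DEFAULT_SORT = [{"edgeTimestamp": {"order": "desc"}}]


def sort_clause(requested: Optional[list] = None) -> list:
    """Return the caller's sort, or fall back to edge timestamp descending."""
    if requested:
        return list(requested)
    return [dict(entry) for entry in DEFAULT_SORT]
```
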
> h2. Remove the legacy query class
> We don't need the Query class; it has far too many functions to be a 
> well-encapsulated object.  Instead, we should simply take the string QL, the 
> SearchEdge and the limit to return our candidates.  From there, we should 
> parse and visit the query internally to the query logic, NOT externally.
> h2. Create a static mapping
> The mapping should contain the following static fields.
> * entityId - The entity id
> * entityType - The entity type (from the id)
> * entityVersion - The entity version
> * edgeId - The edge Id
> * edgeName - The edge name
> * edgeTimestamp - The edge timestamp
> * edgeType - source | target
> * edgeSearch - edgeId + edgeName + edgeType
> It will then contain an array of "fields".  Each of these fields will have the 
> following format.
> {code}
> { "name":"[entity field name as a path]", "[field type]":[field value] }
> {code}
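
To make the document layout concrete, here is a hedged sketch that assembles one index document from the static fields listed above plus a list of field tuples. The key names come from the issue; the `build_document` function, its parameters, and the `_` separator used to compose `edgeSearch` are assumptions, since the issue only says "edgeId + edgeName + edgeType".

```python
from typing import Any, Dict, List


def build_document(
    entity_id: str,
    entity_type: str,
    entity_version: str,
    edge_id: str,
    edge_name: str,
    edge_timestamp: int,
    edge_type: str,
    fields: List[Dict[str, Any]],
    separator: str = "_",  # assumption: the issue does not name a separator
) -> Dict[str, Any]:
    """Combine the static fields and the field tuples into one document."""
    if edge_type not in ("source", "target"):
        raise ValueError("edge_type must be 'source' or 'target'")
    return {
        "entityId": entity_id,
        "entityType": entity_type,
        "entityVersion": entity_version,
        "edgeId": edge_id,
        "edgeName": edge_name,
        "edgeTimestamp": edge_timestamp,
        "edgeType": edge_type,
        # Single value that lets a search match one edge scope exactly.
        "edgeSearch": separator.join([edge_id, edge_name, edge_type]),
        "fields": list(fields),
    }
```
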
> We will define a field type for each type of field.  Note that each field 
> tuple will always contain a single field and a single value.  Possible field 
> types are the following.
> * string - This will be mapped into 2 mappings using multi-field mappings.  It 
> will be indexed as both an unanalyzed string and an analyzed string.  The 2 
> fields will then be "string_u" and "string_a".  The Query visitor will need to 
> update the field name appropriately.
> * long - An unanalyzed long
> * double - An unanalyzed double
> * boolean - An unanalyzed boolean
> * location - A geolocation field
> * uuid - A UUID stored as an unanalyzed string
> The entity path will be a flattened path from the root JSON element to the 
> deepest JSON element.  It can be thought of as a path through the tree of JSON 
> elements.  We will use a dot '.' to delimit the fields, e.g. X.Y.Z for nested 
> objects.  Primitive arrays will contain a field object for each element in 
> the array.
> h2. Indexing
> When indexing entities, we will no longer modify or prefix field names.  
> They will be inserted into the "name" value exactly as their path appears 
> after lower-casing.
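
The flattening and lower-casing rules can be sketched as follows. This is an illustration under assumptions, not the project's implementation: the `flatten` and `field_type` functions are hypothetical, type detection uses Python types as a stand-in for the entity field types, `None` values are skipped, and the `location` type is left out because the issue does not say how a location is recognised.

```python
import uuid
from typing import Any, Dict, List


def field_type(value: Any) -> str:
    """Map a primitive value to one of the field type names from the issue."""
    if isinstance(value, bool):  # must precede int: bool is a subclass of int
        return "boolean"
    if isinstance(value, int):
        return "long"
    if isinstance(value, float):
        return "double"
    if isinstance(value, str):
        return "string"
    raise TypeError(f"unsupported field value: {value!r}")


def flatten(value: Any, path: str = "") -> List[Dict[str, Any]]:
    """Walk the JSON tree and emit one {"name": path, type: value} per leaf."""
    if value is None:  # assumption: nulls are not indexed
        return []
    if isinstance(value, dict):
        out: List[Dict[str, Any]] = []
        for key, child in value.items():
            child_path = f"{path}.{key}" if path else key
            out.extend(flatten(child, child_path))
        return out
    if isinstance(value, list):
        # Each array element becomes its own field object under the same path.
        out = []
        for item in value:
            out.extend(flatten(item, path))
        return out
    if isinstance(value, uuid.UUID):
        return [{"name": path.lower(), "uuid": str(value)}]
    return [{"name": path.lower(), field_type(value): value}]
```

For example, `{"address": {"Zip": 80202}}` yields a single tuple named `address.zip` with a `long` value.
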
> h2. Querying
> When querying, the "contains" operation for a string will need to use the 
> "string_a" data type.  When using =, we will need to use the "string_u" data 
> type.  Each criterion will need to use nested object querying to ensure the 
> property name and property value are both part of the same field tuple.
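
A sketch of how one criterion might be expressed as an Elasticsearch nested query, so that the name and the value are matched within the same field tuple. The `nested`, `bool`, `term`, and `match` clauses are standard query DSL; the nested path `fields`, the full field names `fields.name`, `fields.string_u`, and `fields.string_a`, and the helper functions are assumptions derived from the names in this issue. Using `match` for "contains" is also an assumption about how the analyzed field would be queried.

```python
from typing import Any, Dict


def nested_criterion(field_name: str, value_clause: Dict[str, Any]) -> Dict[str, Any]:
    """Wrap a value clause so it must match the same tuple as the field name."""
    return {
        "nested": {
            "path": "fields",
            "query": {
                "bool": {
                    "must": [
                        # Field names are stored lower-cased at index time.
                        {"term": {"fields.name": field_name.lower()}},
                        value_clause,
                    ]
                }
            },
        }
    }


def string_equals(field_name: str, value: str) -> Dict[str, Any]:
    # "=" compares against the unanalyzed form.
    return nested_criterion(field_name, {"term": {"fields.string_u": value}})


def string_contains(field_name: str, value: str) -> Dict[str, Any]:
    # "contains" searches the analyzed form.
    return nested_criterion(field_name, {"match": {"fields.string_a": value}})
```
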
> h3. References
> Multi Field Mapping: 
> http://www.elastic.co/guide/en/elasticsearch/reference/current/_multi_fields.html
> Nested Objects: 
> http://www.elastic.co/guide/en/elasticsearch/guide/current/nested-objects.html
> Nested Object Search: 
> http://www.elastic.co/guide/en/elasticsearch/guide/master/nested-sorting.html



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
