[ 
https://issues.apache.org/jira/browse/CALCITE-6498?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tim Grein updated CALCITE-6498:
-------------------------------
    Description: 
_EmbeddedElasticsearchPolicy#applyMapping_ tries to apply a multi-field mapping by adding another 
_properties_ field inside a parent field mapping. In Elasticsearch you need to 
use _fields_ rather than _properties_ for a multi-field mapping to work.
 
I've tested this with ES 7 and ES 6. Only the ES 7 example is included in this 
issue description; for ES 6 you need to wrap everything under "mappings" in 
"_doc".

ES 7:
{code:java}
PUT /some_index
{
  "mappings": {
    "properties": {
      "some_field": {
        "type": "text",
        "properties": {
          "keyword": {
            "type": "keyword"
          }
        }
      }
    }
  }
}{code}
This request fails with the following error:
{code:java}
{
  "error" : {
    "root_cause" : [
      {
        "type" : "mapper_parsing_exception",
        "reason" : "Mapping definition for [some_field] has unsupported parameters:  [properties : {keyword={type=keyword}}]"
      }
    ],
    "type" : "mapper_parsing_exception",
    "reason" : "Failed to parse mapping [_doc]: Mapping definition for [some_field] has unsupported parameters:  [properties : {keyword={type=keyword}}]",
    "caused_by" : {
      "type" : "mapper_parsing_exception",
      "reason" : "Mapping definition for [some_field] has unsupported parameters:  [properties : {keyword={type=keyword}}]"
    }
  },
  "status" : 400
} {code}
 
Successful request:
{code:java}
PUT /some_index
{
  "mappings": {
    "properties": {
      "some_field": {
        "type": "text",
        "fields": {
          "keyword": {
            "type": "keyword"
          }
        }
      }
    }
  }
} {code}
You'll also encounter this error if you adapt the test data in 
_ElasticsearchAdapterTest_ to include nested fields:
{code:java}
@BeforeAll
public static void setupInstance() throws Exception {
  // "city.keyword" is a nested field with type "keyword"
  final Map<String, String> mapping =
      ImmutableMap.of("city", "text", "city.keyword", "keyword",
          "state", "keyword", "pop", "long"); {code}
Error:
{code:java}
{...

{"type":"mapper_parsing_exception","reason":"unknown parameter [properties] on mapper [city] of type [text]"}

...} {code}
Looking at _ElasticsearchJson#visitMappingProperties_, I assume it has a 
similar issue, which I'll double-check (we probably need to distinguish 
explicitly between nested fields and multi-field mappings).
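
To illustrate the distinction, here is a minimal sketch of how a flat 
name-to-type map like the test mapping above could be turned into the 
"properties" object of an ES 7 mapping. It is not the current 
_applyMapping_ code: the class and method names are hypothetical, and the rule 
it applies (a parent with a concrete type such as "text" gets its children 
under "fields", while a parent without a type, or of type "object" or "nested", 
gets them under "properties") is an assumption about how a fix might look.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of Calcite.
class MultiFieldMappingSketch {

  /**
   * Builds the "properties" object of an ES 7 mapping from a flat map such as
   * {"city": "text", "city.keyword": "keyword"}.
   * Assumes a parent entry is listed before its dotted children.
   */
  @SuppressWarnings("unchecked")
  static Map<String, Object> toProperties(Map<String, String> flat) {
    Map<String, Object> root = new LinkedHashMap<>();
    for (Map.Entry<String, String> e : flat.entrySet()) {
      String[] path = e.getKey().split("\\.");
      Map<String, Object> container = root;
      for (int i = 0; i < path.length - 1; i++) {
        Map<String, Object> parent = (Map<String, Object>)
            container.computeIfAbsent(path[i], k -> new LinkedHashMap<String, Object>());
        Object type = parent.get("type");
        // Object-like parents hold children under "properties";
        // typed leaf parents (e.g. "text") hold sub-fields under "fields".
        boolean objectLike =
            type == null || "object".equals(type) || "nested".equals(type);
        String childKey = objectLike ? "properties" : "fields";
        container = (Map<String, Object>)
            parent.computeIfAbsent(childKey, k -> new LinkedHashMap<String, Object>());
      }
      Map<String, Object> leaf = (Map<String, Object>)
          container.computeIfAbsent(path[path.length - 1],
              k -> new LinkedHashMap<String, Object>());
      leaf.put("type", e.getValue());
    }
    return root;
  }

  /** Tiny JSON writer for nested maps of strings; does no escaping. */
  static String toJson(Object value) {
    if (value instanceof Map) {
      StringBuilder sb = new StringBuilder("{");
      String sep = "";
      for (Map.Entry<?, ?> e : ((Map<?, ?>) value).entrySet()) {
        sb.append(sep).append('"').append(e.getKey()).append("\":")
            .append(toJson(e.getValue()));
        sep = ",";
      }
      return sb.append('}').toString();
    }
    return "\"" + value + "\"";
  }

  public static void main(String[] args) {
    Map<String, String> flat = new LinkedHashMap<>();
    flat.put("city", "text");
    flat.put("city.keyword", "keyword");
    flat.put("state", "keyword");
    flat.put("pop", "long");
    // "city" ends up with a "fields" block instead of "properties".
    System.out.println(toJson(toProperties(flat)));
  }
}
```

With this rule, "city.keyword" becomes a sub-field of "city", whereas a dotted 
name whose parent has no type of its own (say "address.zip") would still be 
emitted under "properties".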

This is related to https://issues.apache.org/jira/browse/CALCITE-3027: you 
want to detect when a "LIKE" operator is applied to a field that is mapped 
purely as "text". Such fields are analyzed (broken up into several tokens), so 
the operator has surprising semantics unless this is prevented. A field 
containing text usually has a multi-field mapping (a "text" mapping for 
full-text search plus a "keyword" mapping you can use for aggregations, 
wildcard queries, etc.). So it is rather important that this works correctly, 
both overall and in the tests.
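
The following sketch shows why a "LIKE" on an analyzed field behaves 
differently from one on a "keyword" sub-field. It does not call Elasticsearch: 
_analyze_ is a rough stand-in for an analyzer (lowercase, split on 
non-alphanumeric characters), and the pattern is matched against each token, 
the way a term-level wildcard query is evaluated against indexed terms. All 
names here are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.regex.Pattern;

// Hypothetical illustration, not part of Calcite or Elasticsearch.
class TextVersusKeywordSketch {

  /** Rough stand-in for an analyzer: lowercase, split on non-alphanumerics. */
  static List<String> analyze(String value) {
    List<String> tokens = new ArrayList<>();
    for (String t : value.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{N}]+")) {
      if (!t.isEmpty()) {
        tokens.add(t);
      }
    }
    return tokens;
  }

  /** Translates a SQL LIKE pattern (% and _) into a regex for a whole term. */
  static Pattern likeToRegex(String like) {
    StringBuilder sb = new StringBuilder();
    for (char c : like.toCharArray()) {
      if (c == '%') {
        sb.append(".*");
      } else if (c == '_') {
        sb.append('.');
      } else {
        sb.append(Pattern.quote(String.valueOf(c)));
      }
    }
    return Pattern.compile(sb.toString());
  }

  /** "keyword" mapping: the stored value is a single, unmodified term. */
  static boolean matchesKeyword(String stored, String like) {
    return likeToRegex(like).matcher(stored).matches();
  }

  /** "text" mapping: the pattern is tested against each analyzed token. */
  static boolean matchesText(String stored, String like) {
    Pattern p = likeToRegex(like);
    return analyze(stored).stream().anyMatch(t -> p.matcher(t).matches());
  }

  public static void main(String[] args) {
    String city = "San Francisco";
    System.out.println(matchesKeyword(city, "San Fran%")); // true
    System.out.println(matchesText(city, "San Fran%"));    // false
    System.out.println(matchesKeyword(city, "fran%"));     // false
    System.out.println(matchesText(city, "fran%"));        // true
  }
}
```

The pattern a SQL user would expect to match ('San Fran%') only matches 
against the unanalyzed value, while the analyzed tokens match a pattern 
('fran%') that should not match under SQL semantics.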


> Elasticsearch multi-field mappings do not work
> ----------------------------------------------
>
>                 Key: CALCITE-6498
>                 URL: https://issues.apache.org/jira/browse/CALCITE-6498
>             Project: Calcite
>          Issue Type: Bug
>          Components: elasticsearch-adapter
>            Reporter: Tim Grein
>            Assignee: Tim Grein
>            Priority: Major
>             Fix For: 1.38.0



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
