[jira] [Updated] (SPARK-24032) ElasticSearch update fails on nested field

Cristina Luengo (JIRA) Fri, 20 Apr 2018 00:44:13 -0700

     [ 
https://issues.apache.org/jira/browse/SPARK-24032?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]


Cristina Luengo updated SPARK-24032:
------------------------------------
    Description: 
I'm trying to update a nested field on ElasticSearch using a scripted update. 
The code I'm using is:

update_params = "new_samples: samples"
update_script = "ctx._source.samples += new_samples"

es_conf = {
    "es.mapping.id": "id",
    "es.mapping.exclude": "id",
    "es.write.operation": "upsert",
    "es.update.script.params": update_params,
    "es.update.script.inline": update_script,
}

(result.write.format("org.elasticsearch.spark.sql")
    .options(**es_conf)
    .option("es.nodes", configuration["elasticsearch"]["host"])
    .option("es.port", configuration["elasticsearch"]["port"])
    .save(configuration["elasticsearch"]["index_name"] + "/" + configuration["version"], mode="append"))
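
For anyone trying to reproduce this, the connector options from the snippet could be gathered into a single dict before the write. The following is only a sketch: the helper names build_es_options and build_resource and the values in the example configuration are assumptions for illustration and are not part of the report.

```python
def build_es_options(configuration, update_params, update_script):
    """Return the connector options from the report as one flat dict of strings."""
    es = configuration["elasticsearch"]
    return {
        "es.nodes": str(es["host"]),
        "es.port": str(es["port"]),
        "es.mapping.id": "id",
        "es.mapping.exclude": "id",
        "es.write.operation": "upsert",
        "es.update.script.params": update_params,
        "es.update.script.inline": update_script,
    }


def build_resource(configuration):
    """Return the "index/type" resource string that is passed to save()."""
    return configuration["elasticsearch"]["index_name"] + "/" + configuration["version"]


# Hypothetical configuration values, for illustration only.
configuration = {
    "elasticsearch": {"host": "localhost", "port": 9200, "index_name": "variants"},
    "version": "v1",
}

options = build_es_options(
    configuration,
    "new_samples: samples",
    "ctx._source.samples += new_samples",
)
resource = build_resource(configuration)
```

With these two values the write reduces to result.write.format("org.elasticsearch.spark.sql").options(**options).save(resource, mode="append"). This only tidies the call; it is not expected to change the behaviour reported here.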

And the schema of the field is:
|-- samples: array (nullable = true)
|    |-- element: struct (containsNull = true)
|    |    |-- gq: integer (nullable = true)
|    |    |-- dp: integer (nullable = true)
|    |    |-- gt: string (nullable = true)
|    |    |-- adBug: array (nullable = true)
|    |    |    |-- element: integer (containsNull = true)
|    |    |-- ad: double (nullable = true)
|    |    |-- sample: string (nullable = true)
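
To make the shape of the data concrete, here is a sketch of one row with a samples array that follows the schema above, written as plain Python data. The field values, the "row-1" id, and the helper matches_sample_schema are made up for illustration; only the field names and types come from the schema.

```python
# Field name -> expected Python type, mirroring the struct in the schema.
SAMPLE_FIELDS = {
    "gq": int,
    "dp": int,
    "gt": str,
    "adBug": list,  # array of integers
    "ad": float,
    "sample": str,
}


def matches_sample_schema(record):
    """Return True if record has exactly the schema's fields with compatible types.

    None is accepted everywhere because every field is nullable.
    """
    if set(record) != set(SAMPLE_FIELDS):
        return False
    for name, expected in SAMPLE_FIELDS.items():
        value = record[name]
        if value is None:
            continue
        if not isinstance(value, expected):
            return False
        if name == "adBug" and not all(v is None or isinstance(v, int) for v in value):
            return False
    return True


# Illustrative row: one document id with a single entry in "samples".
example_row = {
    "id": "row-1",
    "samples": [
        {"gq": 99, "dp": 30, "gt": "0/1", "adBug": [12, 18], "ad": 0.6, "sample": "S1"},
    ],
}
```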

And I get the following error:

py4j.protocol.Py4JJavaError: An error occurred while calling o83.save.
 : org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in 
stage 1.0 failed 1 times, most recent failure: Lost task 0.0 in stage 1.0 (TID 
1, localhost, executor driver): java.lang.ClassCastException: 
scala.collection.mutable.WrappedArray$ofRef cannot be cast to scala.Tuple2

Am I doing something wrong or is this a bug? Thanks.

  was:
I'm trying to update a nested field using Spark and a scripted update. The code 
I'm using is:

update_params = "new_samples: samples"
update_script = "ctx._source.samples += new_samples"

es_conf = {
    "es.mapping.id": "id",
    "es.mapping.exclude": "id",
    "es.write.operation": "upsert",
    "es.update.script.params": update_params,
    "es.update.script.inline": update_script,
}

(result.write.format("org.elasticsearch.spark.sql")
    .options(**es_conf)
    .option("es.nodes", configuration["elasticsearch"]["host"])
    .option("es.port", configuration["elasticsearch"]["port"])
    .save(configuration["elasticsearch"]["index_name"] + "/" + configuration["version"], mode="append"))

And the schema of the field is:

|-- samples: array (nullable = true)
|    |-- element: struct (containsNull = true)
|    |    |-- gq: integer (nullable = true)
|    |    |-- dp: integer (nullable = true)
|    |    |-- gt: string (nullable = true)
|    |    |-- adBug: array (nullable = true)
|    |    |    |-- element: integer (containsNull = true)
|    |    |-- ad: double (nullable = true)
|    |    |-- sample: string (nullable = true)

And I get the following error:

py4j.protocol.Py4JJavaError: An error occurred while calling o83.save.
 : org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in 
stage 1.0 failed 1 times, most recent failure: Lost task 0.0 in stage 1.0 (TID 
1, localhost, executor driver): java.lang.ClassCastException: 
scala.collection.mutable.WrappedArray$ofRef cannot be cast to scala.Tuple2

Am I doing something wrong or is this a bug? Thanks.


> ElasticSearch update fails on nested field
> ------------------------------------------
>
>                 Key: SPARK-24032
>                 URL: https://issues.apache.org/jira/browse/SPARK-24032
>             Project: Spark
>          Issue Type: Bug
>          Components: Spark Core
>    Affects Versions: 2.2.0
>            Reporter: Cristina Luengo
>            Priority: Critical



