DataSourceV2 APIs creating multiple instances of DataSourceReader and hence not preserving the state

2018-10-08 Thread Shubham Chaurasia
Hi All, Spark built with *tags/v2.4.0-rc2*. Consider the following DataSourceReader implementation: public class MyDataSourceReader implements DataSourceReader, SupportsScanColumnarBatch { StructType schema = null; Map options; public MyDataSourceReader(Map options) { System.out.println

SparkR issue

2018-10-08 Thread ayan guha
Hi, we are seeing some weird behaviour in SparkR. We created an R data frame with 600K records and 29 columns, then tried to convert the R DF to a Spark DF using df <- SparkR::createDataFrame(rdf) from RStudio. It hung, and we had to kill the process after 1-2 hours. We also tried the following: df <- Spa

Re: Re: Executor hang

2018-10-08 Thread 阎志涛
Yeah, the problem was worked around by adding the config spark.sql.codegen.wholeStage=false. Thanks and Regards, Tony From: kathleen li Sent: 2018-10-08 12:15 To: 阎志涛 Cc: user@spark.apache.org Subject: Re: Re: Executor hang https://jaceklaskowski.gitbooks.io/mastering-spark-sql/spark-sql-whole-stage-c

CSV parser - is there a way to find malformed csv record

2018-10-08 Thread Nirav Patel
I am getting `RuntimeException: Malformed CSV record` while parsing CSV records and attaching a schema at the same time. Most likely some fields contain additional commas or JSON data that are not escaped properly. Is there a way for the CSV parser to tell me which record is malformed? This is what I am usin

td

2018-10-08 Thread 冯 远森
td

Error while upserting ElasticSearch from Spark 2.2

2018-10-08 Thread Deepak Sharma
Hi All, I am facing this weird issue while upserting to ElasticSearch using a Spark DataFrame: *org.elasticsearch.hadoop.rest.EsHadoopRemoteException: version_conflict_engine_exception:* After it fails, if I rerun it 2-3 times, it finally succeeds. I thought to check if anyone faced this issue and