Re: [Spark SQL] error in performing dataset union with complex data type (struct, list)

2018-06-04 Thread Pranav Agrawal
Yes, the issue is with the array type only; I have confirmed that. I exploded the array to a struct but am still getting the same error: Exception in thread "main" org.apache.spark.sql.AnalysisException: Union can only be performed on tables with the compatible column types. struct <> struct at the 21th column
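
A Scala sketch for pinpointing the offending column programmatically (ds1 and ds2 are placeholders for the two datasets being unioned):

    // Zip the two schemas field by field and report any type differences.
    val mismatches = ds1.schema.fields.zip(ds2.schema.fields).zipWithIndex.collect {
      case ((f1, f2), i) if f1.dataType != f2.dataType =>
        s"column ${i + 1} (${f1.name}): ${f1.dataType.simpleString} <> ${f2.dataType.simpleString}"
    }
    mismatches.foreach(println)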

Re: [Spark SQL] error in performing dataset union with complex data type (struct, list)

2018-06-04 Thread Jorge Machado
Have you tried to narrow down the problem so that we can be 100% sure that it lies in the array types? Just exclude them for the sake of testing. If we know 100% that it is this array stuff, try to explode those columns into simple types. Jorge Machado > On 4 Jun 2018, at 11:09, Pranav
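
A Scala sketch of that test, with ds as a placeholder dataset and "rooms" as a hypothetical array column name:

    import org.apache.spark.sql.functions.{col, explode}
    import org.apache.spark.sql.types.ArrayType

    // Exclude every array-typed column, then retry the union on the remainder.
    val arrayCols = ds.schema.fields.collect { case f if f.dataType.isInstanceOf[ArrayType] => f.name }
    val withoutArrays = ds.drop(arrayCols: _*)
    // Explode one array column into one row per element (simple types).
    val exploded = ds.withColumn("rooms", explode(col("rooms")))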

Re: [Spark SQL] error in performing dataset union with complex data type (struct, list)

2018-06-04 Thread Pranav Agrawal
I am ordering the columns before doing the union, so I think that should not be an issue:

    String[] columns_original_order = baseDs.columns();
    String[] columns = baseDs.columns();
    Arrays.sort(columns);
    baseDs = baseDs.selectExpr(columns);
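
Worth noting: union resolves columns by position, so the same sort has to be applied to both sides, and sorting top-level names does not reorder fields nested inside a struct. On Spark 2.3+ there is also unionByName (a suggestion, not from the thread); a Scala sketch with a hypothetical second dataset otherDs:

    // unionByName (Spark 2.3+) matches columns by name instead of position.
    val unioned = baseDs.unionByName(otherDs)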

Re: [Spark SQL] error in performing dataset union with complex data type (struct, list)

2018-06-04 Thread Jorge Machado
Try the same union with a dataframe without the array types. It could be something strange there, like ordering. Jorge Machado > On 4 Jun 2018, at 10:17, Pranav Agrawal wrote: > > schema is exactly the same, not sure why it is failing though. > > root > |-- booking_id: integer

Re: [Spark SQL] error in performing dataset union with complex data type (struct, list)

2018-06-04 Thread Pranav Agrawal
schema is exactly the same, not sure why it is failing though.

    root
     |-- booking_id: integer (nullable = true)
     |-- booking_rooms_room_category_id: integer (nullable = true)
     |-- booking_rooms_room_id: integer (nullable = true)
     |-- booking_source: integer (nullable = true)
     |-- booking_status:
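
Since two printSchema trees can look alike while nested element types, field order, or nullability still differ, a quick Scala sketch for a stricter comparison (ds1 and ds2 stand in for the two sides of the union):

    // Structural equality over the full StructType, including nested fields.
    println(ds1.schema == ds2.schema)
    // When false, diff the full tree strings to locate the difference.
    println(ds1.schema.treeString)
    println(ds2.schema.treeString)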

Re: [Spark SQL] error in performing dataset union with complex data type (struct, list)

2018-06-03 Thread Alessandro Solimando
Hi Pranav, I don't have an answer to your issue, but what I generally do in these cases is to first simplify it to a point where it is easier to check what's going on, and then add back "pieces" one by one until I spot the error. In your case I can suggest to: 1) project the dataset to
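
A Scala sketch of that incremental approach, assuming placeholder datasets ds1 and ds2 with the same column names:

    import org.apache.spark.sql.functions.col

    // Union progressively larger column subsets; the first n that throws
    // an AnalysisException identifies the offending column.
    val cols = ds1.columns
    for (n <- 1 to cols.length) {
      val subset = cols.take(n).map(col)
      ds1.select(subset: _*).union(ds2.select(subset: _*))
      println(s"columns 1..$n are union-compatible")
    }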

Re: SPARK SQL Error

2015-10-15 Thread Giridhar Maddukuri
Ritchard's comments on why the --files option may be redundant in your case. Regards, Dilip Biswal Tel: 408-463-4980 dbis...@us.ibm.com From: Giri <giridhar.madduk...@gmail.com> To: user@spark.apache.org Date:

Re: SPARK SQL Error

2015-10-15 Thread Dilip Biswal
comments on why the --files option may be redundant in your case. Regards, Dilip Biswal Tel: 408-463-4980 dbis...@us.ibm.com From: Giri <giridhar.madduk...@gmail.com> To: user@spark.apache.org Date: 10/15/2015 02:44 AM Subject: Re: SPARK SQL Error Hi Ritchard, Thank you so much

Re: SPARK SQL Error

2015-10-15 Thread Giri
Hi Ritchard, Thank you so much again for your input. This time I ran the command in the below way:

    spark-submit --master yarn --class org.spark.apache.CsvDataSource /home/cloudera/Desktop/TestMain.jar hdfs://quickstart.cloudera:8020/people_csv

But I am now facing a new error: "Could not parse

Re: SPARK SQL Error

2015-10-15 Thread pnpritchard
Going back to your code, I see that you instantiate the spark context as:

    val sc = new SparkContext(args(0), "Csv loading example")

which will set the master URL to args(0) and the app name to "Csv loading example". In your case, args(0) is "hdfs://quickstart.cloudera:8020/people_csv", which
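
A sketch of the corresponding fix: leave the master to spark-submit and keep args(0) for the input path (Spark 1.x API, as in the thread):

    import org.apache.spark.{SparkConf, SparkContext}

    // spark-submit's --master flag supplies the master URL; don't hardcode it here.
    val conf = new SparkConf().setAppName("Csv loading example")
    val sc = new SparkContext(conf)
    val path = args(0) // e.g. hdfs://quickstart.cloudera:8020/people_csv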

Re: SPARK SQL Error

2015-10-14 Thread pnpritchard
I think the stack trace is quite informative. Assuming line 10 of CsvDataSource is

    val df = sqlContext.load("com.databricks.spark.csv", Map("path" -> args(1), "header" -> "true"))

then the args(1) call is throwing an ArrayIndexOutOfBoundsException. This is because you aren't
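
With the code as posted, the program expects two arguments (master URL, then CSV path); a small guard sketch makes the failure explicit instead of an ArrayIndexOutOfBoundsException:

    // args(0) = master URL, args(1) = CSV path, per the posted code.
    if (args.length < 2) {
      System.err.println("usage: CsvDataSource <master> <csv-path>")
      sys.exit(1)
    }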

Re: SPARK SQL Error

2015-10-13 Thread pnpritchard
Your app jar should be at the end of the command, without the --jars prefix. That option is only necessary if you have more than one jar to put on the classpath (i.e., dependency jars that aren't packaged inside your app jar).

    spark-submit --master yarn --class org.spark.apache.CsvDataSource
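
For example, with one dependency jar on the classpath (the spark-csv jar path below is hypothetical), the app jar still goes last:

    spark-submit --master yarn --class org.spark.apache.CsvDataSource \
      --jars /path/to/spark-csv_2.10-1.2.0.jar \
      /home/cloudera/Desktop/TestMain.jar hdfs://quickstart.cloudera:8020/people_csv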

Re: Spark SQL Error

2015-07-30 Thread Akhil Das
It seems to be an issue with the ES connector: https://github.com/elastic/elasticsearch-hadoop/issues/482 Thanks Best Regards On Tue, Jul 28, 2015 at 6:14 AM, An Tran tra...@gmail.com wrote: Hello all, I am currently having an error with Spark SQL accessing Elasticsearch using the Elasticsearch Spark

RE: Spark sql error while writing Parquet file- Trying to write more fields than contained in row

2015-05-19 Thread Chandra Mohan, Ananda Vel Murugan
Hi, Thanks for the response. I was looking for a Java solution. I will check the Scala and Python ones. Regards, Anand.C From: Todd Nist [mailto:tsind...@gmail.com] Sent: Tuesday, May 19, 2015 6:17 PM To: Chandra Mohan, Ananda Vel Murugan Cc: ayan guha; user Subject: Re: Spark sql error while

Re: Spark sql error while writing Parquet file- Trying to write more fields than contained in row

2015-05-19 Thread Todd Nist
, Ananda Vel Murugan; user Subject: Re: Spark sql error while writing Parquet file- Trying to write more fields than contained in row Hi Give a try with the dataFrame.fillna function to fill up the missing column Best Ayan On Mon, May 18, 2015 at 8:29 PM, Chandra Mohan, Ananda Vel Murugan

RE: Spark sql error while writing Parquet file- Trying to write more fields than contained in row

2015-05-18 Thread Chandra Mohan, Ananda Vel Murugan
    <artifactId>spark-sql_2.10</artifactId>
    <version>1.3.1</version>
    </dependency>

Regards, Anand.C From: ayan guha [mailto:guha.a...@gmail.com] Sent: Monday, May 18, 2015 5:19 PM To: Chandra Mohan, Ananda Vel Murugan; user Subject: Re: Spark sql error while writing Parquet file- Trying to write more

Re: Spark sql error while writing Parquet file- Trying to write more fields than contained in row

2015-05-18 Thread ayan guha
Hi, Give a try with the dataFrame.fillna function to fill up the missing column. Best, Ayan On Mon, May 18, 2015 at 8:29 PM, Chandra Mohan, Ananda Vel Murugan ananda.muru...@honeywell.com wrote: Hi, I am using spark-sql to read a CSV file and write it as a parquet file. I am building the schema
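
In the Scala API of that era (1.3.1), the equivalent lives on DataFrameNaFunctions; a sketch with df as a placeholder and colA/colB as hypothetical column names:

    // Fill nulls in string columns with "" and in numeric columns with 0.
    val filled = df.na.fill("").na.fill(0)
    // Or per column:
    val filled2 = df.na.fill(Map("colA" -> "", "colB" -> 0))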

Re: spark sql error with proto/parquet

2015-04-20 Thread Michael Armbrust
You are probably using an encoding that we don't support. I think this PR may be adding that support: https://github.com/apache/spark/pull/5422 On Sat, Apr 18, 2015 at 5:40 PM, Abhishek R. Singh abhis...@tetrationanalytics.com wrote: I have created a bunch of protobuf-based parquet files that