Re: Failed to bulk insert

2019-03-29 Thread vbal...@apache.org
Hi Umesh, Can you add the following verbose configs to capture class-loading information when you run the Spark application? You can set the following in the Spark config (by default, $SPARK_HOME/conf/spark-defaults.conf): spark.executor.extraJavaOptions=-verbose:class spark.driver.extraJavaOptions
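Based on the suggestion above, the two options would likely look like this in spark-defaults.conf. Note the driver line is truncated in the original message, so mirroring the executor's `-verbose:class` value for the driver is an assumption:

```properties
# Print each loaded class and the jar it came from, on the executors...
spark.executor.extraJavaOptions  -verbose:class
# ...and on the driver (value assumed; the original message is cut off here).
spark.driver.extraJavaOptions    -verbose:class
```

The `-verbose:class` JVM flag logs every class as it is loaded, along with its source jar, which is how a shaded-vs-unshaded jar conflict like the one discussed below would show up.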

Re: Failed to bulk insert

2019-03-29 Thread Umesh Kacha
Hi Balaji, I tried it; it still gives the same error. I don't have any other Hoodie library except the Spark bundle. I am using Databricks Spark cloud. Do you think Databricks cloud has some other Hoodie dependencies? Regards, Umesh On Thu, Mar 28, 2019 at 9:43 AM Umesh Kacha wrote: > Hi Balaji thanks no I

Re: Failed to bulk insert

2019-03-27 Thread Umesh Kacha
Hi Balaji, thanks, no, I am still getting the same error; will debug more. I am sure I don't have any other Hoodie jar except the bundle one, not sure what's wrong. I will ask for help in case I am stuck. On Thu, Mar 28, 2019, 5:00 AM vbal...@apache.org wrote: > > Hi Umesh, > > Were you able to bulk insert su

Re: Failed to bulk insert

2019-03-27 Thread vbal...@apache.org
Hi Umesh, Were you able to bulk insert successfully? Balaji.V On Monday, March 25, 2019, 9:42:44 AM PDT, vbal...@apache.org wrote: Hi Umesh, I don't see any attachments here. Anyways, I did the following test: 1. Create a HelloWorld IntelliJ java project 2. I added the Spark bundle as

Re: Failed to bulk insert

2019-03-25 Thread vbal...@apache.org
Hi Umesh, I don't see any attachments here. Anyways, I did the following test: 1. Create a HelloWorld IntelliJ java project 2. I added the Spark bundle as a library dependency (added the local jar directly). (In IntelliJ -> File -> Project Structure -> Libraries -> + ) 3. I opened HoodieAvroSuppo

Re: Failed to bulk insert

2019-03-24 Thread Umesh Kacha
Hi Balaji, thanks. I am using only the hoodie-spark-bundle jar, as you told me to do last time. Please find all the Maven jars in my project in the attached snapshot; I am sure they don't clash with each other. On Sun, Mar 24, 2019 at 9:48 AM Balaji Varadarajan wrote: > Hi Umesh, > I suspect you a

Re: Failed to bulk insert

2019-03-23 Thread Balaji Varadarajan
Hi Umesh, I suspect you are including both hoodie-common and hoodie-spark-bundle jars in your runtime package dependencies. There will be a version of the HoodieAvroWriteSupport constructor with a proper shaded signature in the Hoodie Spark bundle, but this may not be picked up if the hoodie-common jar is also
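To confirm which jar a class is actually loaded from at runtime (and thus whether an unshaded hoodie-common copy is shadowing the bundle's shaded one), a common technique is to print the class's code source. This is a generic sketch; the fully qualified Hoodie class name passed in `main` is an assumption based on the class named in the thread, and would only resolve on a classpath that actually contains the jar:

```java
import java.security.CodeSource;

public class WhereFrom {
    /** Returns the jar/path a class was loaded from, or a diagnostic string. */
    public static String locate(String className) {
        try {
            Class<?> c = Class.forName(className);
            CodeSource src = c.getProtectionDomain().getCodeSource();
            // Bootstrap-loaded classes report a null CodeSource.
            return (src == null) ? "bootstrap classloader" : src.getLocation().toString();
        } catch (ClassNotFoundException e) {
            return "not on classpath";
        }
    }

    public static void main(String[] args) {
        // Assumed package for the class mentioned in the thread.
        System.out.println(locate("com.uber.hoodie.avro.HoodieAvroWriteSupport"));
    }
}
```

Running this inside the Spark application (or a spark-shell on the same classpath) shows whether the class resolves to the bundle jar or to a stray hoodie-common jar.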

Re: Failed to bulk insert

2019-03-23 Thread Umesh Kacha
Hi, I filtered out nulls in the dataframe for the review_date field and it went ahead, but failed with the following exception. It looks like some runtime libs are missing. I thought com.uber.hoodie:hoodie-spark-bundle:0.4.5 is an uber jar that has all the transitive dependencies it needs. No? org.apache.spark.Sp

Re: Failed to bulk insert

2019-03-10 Thread Vinoth Chandar
+1 yes, if it's actually null. Good catch, Frank! :) On Sun, Mar 10, 2019 at 7:23 PM kaka chen wrote: > A possible root cause is the field of the record is null. > > public static String getNestedFieldValAsString(GenericRecord record, > String fieldName) { > String[] parts = fieldName.split("\\.")

Re: Failed to bulk insert

2019-03-10 Thread kaka chen
A possible root cause is the field of the record is null. public static String getNestedFieldValAsString(GenericRecord record, String fieldName) { String[] parts = fieldName.split("\\."); GenericRecord valueNode = record; int i = 0; for (; i < parts.length; i++) { String part = parts[i];
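The snippet above walks a dotted field path (like "a.b.c") through nested records; if an intermediate or leaf value is null, the lookup fails with the "field not found in record" symptom seen in this thread. A simplified, self-contained sketch of the same traversal with an explicit null guard, using plain `Map` in place of Avro's `GenericRecord` (an assumption made so the example carries no Avro dependency):

```java
import java.util.Map;

public class NestedLookup {
    /**
     * Walks a dotted path like "a.b.c" through nested maps.
     * Returns null (instead of throwing) when any step is missing or null,
     * which is the guard the quoted code appears to lack.
     */
    @SuppressWarnings("unchecked")
    public static Object getNestedFieldVal(Map<String, Object> record, String fieldName) {
        String[] parts = fieldName.split("\\.");
        Object current = record;
        for (String part : parts) {
            if (!(current instanceof Map)) {
                return null; // intermediate value is null or not a record
            }
            current = ((Map<String, Object>) current).get(part);
        }
        return current;
    }
}
```

With a record where review_date is null, this returns null rather than erroring out; the workaround actually used later in the thread was to filter out null rows before writing.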

Re: Failed to bulk insert

2019-03-09 Thread Vinoth Chandar
Hmmm. That's interesting. I can see that the parsing works, since the exception said "Part - review_date". There are definitely users who have done this before, so not sure what's going on. Can you paste the generated Avro schema? The following is the corresponding code line: log.info(s"Registered avro

Re: Failed to bulk insert

2019-03-09 Thread Umesh Kacha
Hi Vinoth, thanks, I already did and checked that; please see the red column highlighted below. root |-- marketplace: string (nullable = true) |-- customer_id: string (nullable = true) |-- review_id: string (nullable = true) |-- product_id: string (nullable = true) |-- product_parent: string (nullab

Re: Failed to bulk insert

2019-03-09 Thread Vinoth Chandar
Hi, >> review_date (Part - review_date) field not found in record Seems like the precombine field is not in the input DF? Can you try doing df1.printSchema and check that once? On Sat, Mar 9, 2019 at 11:52 AM Umesh Kacha wrote: > Hi I have the following code using which I am trying to bulk inser

Failed to bulk insert

2019-03-09 Thread Umesh Kacha
Hi, I have the following code using which I am trying to bulk insert a huge csv file loaded into a Spark DataFrame, but it fails saying column review_date is not found, although that column is definitely there in the dataframe. Please guide. df1.write .format("com.uber.hoodie") .option(DataSourceWriteOpt