A signature in Logging.class refers to type Logger in package org.slf4j which is not available.

2016-05-02 Thread Kapil Raaj
Hi folks, I am suddenly seeing this error: Error:scalac: bad symbolic reference. A signature in Logging.class refers to type Logger in package org.slf4j which is not available. It may be completely missing from the current classpath, or the version on the classpath might be incompatible with the version
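
If the cause is a missing dependency, as the error message itself suggests, declaring slf4j-api explicitly in the build may resolve it. A minimal sketch for an sbt build follows. The use of sbt and the version number are assumptions, not details from the original message; the version should match whatever the rest of the classpath expects.

```scala
// build.sbt fragment (assumes an sbt project; the version shown is illustrative only)
libraryDependencies += "org.slf4j" % "slf4j-api" % "1.7.21"
```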

Re: Spark groupby and agg inconsistent and missing data

2015-12-10 Thread Kapil Raaj
Hi folks, I am also getting a similar issue: (df.groupBy("email").agg(last("user_id") as "user_id").select("user_id").count,df.groupBy("email").agg(last("user_id") as "user_id").select("user_id").distinct.count) When run on one computer it gives: (15123144,15123144) When run on a cluster it gives:

Getting ParquetDecodingException when I am running my spark application from spark-submit

2015-11-24 Thread Kapil Raaj
The relevant error lines are: Caused by: parquet.io.ParquetDecodingException: Can't read value in column [roll_key] BINARY at value 19600 out of 4814, 19600 out of 19600 in currentPage. repetition level: 0, definition level: 1 Caused by: org.apache.spark.SparkException: Job aborted due to stage

Enriching df.write.jdbc

2015-10-04 Thread Kapil Raaj
Hello folks, I would like to contribute code to enrich the DataFrame writer API for JDBC to cover an "Update table" feature based on some field names/keys passed as a list of strings. Use case: 1. df.write.mode("Update").jdbc(connectionString, "table_name", connectionProperties, keys) Or 2.
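
To make the proposal concrete, here is a rough sketch of how a list of key columns could be turned into a parameterized UPDATE statement. This is not part of Spark's DataFrameWriter API and not the author's implementation; the object name UpdateSql, the method build, and its parameters are hypothetical and exist only for illustration.

```scala
// Hypothetical helper, not a Spark API: builds a parameterized UPDATE
// statement in which non-key columns are set and key columns form the WHERE clause.
object UpdateSql {
  def build(table: String, columns: Seq[String], keys: Seq[String]): String = {
    require(keys.nonEmpty, "at least one key column is required")
    require(keys.forall(k => columns.contains(k)), "every key must be one of the columns")

    // Columns to overwrite: everything that is not a key.
    val setCols = columns.filterNot(c => keys.contains(c))
    require(setCols.nonEmpty, "at least one non-key column is required")

    val setClause = setCols.map(c => s"$c = ?").mkString(", ")
    val whereClause = keys.map(k => s"$k = ?").mkString(" AND ")
    s"UPDATE $table SET $setClause WHERE $whereClause"
  }
}
```

A writer built on this idea would bind each row's values to the placeholders in the same order: non-key columns first, then key columns.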