Re: Spark hive udf: no handler for UDAF analysis exception

2018-09-05 Thread Swapnil Chougule
Looks like SparkSession has an implementation only for UDAF but not for UDF. Is it a bug, or is there a workaround? T. Gaweda has opened a JIRA for this: SPARK-25334. Thanks, Swapnil On Tue, Sep 4, 2018 at 4:20 PM Swapnil Chougule wrote: > Created one project 'spark-udf' & written h

Spark hive udf: no handler for UDAF analysis exception

2018-09-04 Thread Swapnil Chougule
Created a project 'spark-udf' and wrote a Hive UDF as below: package com.spark.udf import org.apache.hadoop.hive.ql.exec.UDF class UpperCase extends UDF with Serializable { def evaluate(input: String): String = { input.toUpperCase } Built it and created a jar for it.
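The UDF from the preview, laid out as a compilable sketch. It assumes `hive-exec` is on the classpath; the package and class name follow the post, and the null guard is an added defensive assumption:

```scala
package com.spark.udf

import org.apache.hadoop.hive.ql.exec.UDF

// Classic Hive UDF: Hive resolves one evaluate() method per input signature.
class UpperCase extends UDF with Serializable {
  def evaluate(input: String): String = {
    if (input == null) null else input.toUpperCase
  }
}
```

Packaged into a jar (e.g. with `sbt package`), this is the class the later `CREATE FUNCTION ... USING JAR` statement refers to.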

Type change support in spark parquet read-write

2018-08-31 Thread Swapnil Chougule
Hi Folks, I came across a problem while reading Parquet through Spark. One Parquet file was written with field 'a' of type 'Integer'. Afterwards, reading this file with a schema declaring 'a' as 'Long' throws an exception. I thought this compatible type change was supported, but it is not working. Code
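A sketch of the scenario, with hypothetical paths. Depending on the Spark version, the read with a widened schema fails with a type mismatch rather than upcasting; reading with the original schema and casting afterwards is a common workaround:

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.types._

val spark = SparkSession.builder.appName("ParquetTypeChange").master("local[*]").getOrCreate()
import spark.implicits._

// Write a file where column 'a' is an Integer (hypothetical path).
Seq(1, 2, 3).toDF("a").write.mode("overwrite").parquet("/tmp/ints")

// Re-reading with 'a' declared as Long is what the post reports as failing.
val asLong = spark.read
  .schema(StructType(Seq(StructField("a", LongType))))
  .parquet("/tmp/ints")

// Workaround: read with the written schema, then cast explicitly.
val widened = spark.read.parquet("/tmp/ints").withColumn("a", $"a".cast(LongType))
```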

Spark udf from external jar without enabling Hive

2018-08-29 Thread Swapnil Chougule
Hi Team, I am creating a UDF as follows from an external jar: val spark = SparkSession.builder.appName("UdfUser") .master("local") .enableHiveSupport() .getOrCreate() spark.sql("CREATE FUNCTION uppercase AS 'path.package.udf.UpperCase' " + "USING JAR
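The preview's registration flow as a runnable sketch; the jar path is a placeholder and the class name follows the post. Note that `CREATE FUNCTION ... USING JAR` for a Hive UDF goes through the Hive catalog, which is why the post enables Hive support:

```scala
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder
  .appName("UdfUser")
  .master("local")
  .enableHiveSupport() // CREATE FUNCTION ... USING JAR is resolved via the Hive catalog
  .getOrCreate()

// Register the permanent function from an external jar (placeholder path).
spark.sql(
  "CREATE FUNCTION uppercase AS 'path.package.udf.UpperCase' " +
  "USING JAR '/path/to/spark-udf.jar'")

spark.sql("SELECT uppercase('hello')").show()
```

Without Hive support, the usual alternative is a Spark-native function registered in code via `spark.udf.register`, which does not require the Hive catalog.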

Re: Spark structured streaming generate output path runtime

2018-06-04 Thread Swapnil Chougule
format("text").partitionBy("flourtimestamp").option("path", > "/home/data").option("checkpointLocation","./checkpoint").start() > > > > The UDF will be called for every row. And partitionBy will create a folder > within /home/data > > > > *From: *

Spark structured streaming generate output path runtime

2018-06-01 Thread Swapnil Chougule
Hi, I want to generate the output directory at runtime for data. The directory name is derived from the current timestamp; let's say data for the same minute should go into the same directory. I tried the following snippet, but it didn't work: all data is being written into the same directory (created with respect to the initial
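The fix suggested in the reply above is to derive a partition column per row and let `partitionBy` create the folders, instead of computing the path once when the query starts. A sketch, assuming a socket source and hypothetical paths:

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions._

val spark = SparkSession.builder.appName("MinuteDirs").master("local[*]").getOrCreate()

val lines = spark.readStream.format("socket")
  .option("host", "localhost").option("port", 9999)
  .load()

// The column expression is evaluated for every row, so each minute's
// records land in their own subdirectory under /home/data.
val withMinute = lines.withColumn(
  "minute", date_format(current_timestamp(), "yyyyMMddHHmm"))

val query = withMinute.writeStream
  .format("text")
  .partitionBy("minute")
  .option("path", "/home/data")
  .option("checkpointLocation", "./checkpoint")
  .start()
```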

Re: Event time aggregation is possible in Spark Streaming ?

2017-07-10 Thread Swapnil Chougule
Thanks, Michael, for the update. Regards, Swapnil On 10 Jul 2017 11:50 p.m., "Michael Armbrust" <mich...@databricks.com> wrote: > Event-time aggregation is only supported in Structured Streaming. > > On Sat, Jul 8, 2017 at 4:18 AM, Swapnil Chougule <the.swapni...@gm

Event time aggregation is possible in Spark Streaming ?

2017-07-08 Thread Swapnil Chougule
Hello, I want to know whether event-time aggregation is possible in Spark Streaming. I can see it's possible in Structured Streaming. As I am working with conventional Spark Streaming, I need event-time aggregation there. I checked but didn't find any relevant documentation. Thanks in advance Regards,
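As the reply in this thread confirms, event-time aggregation is only supported in Structured Streaming. A minimal sketch of what that looks like there, assuming a streaming DataFrame with an event-time column (the `rate` source and column names are stand-ins):

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions._

val spark = SparkSession.builder.appName("EventTimeAgg").master("local[*]").getOrCreate()
import spark.implicits._

// Stand-in source; in practice this would be Kafka, files, etc.
val events = spark.readStream.format("rate").load()
  .withColumnRenamed("timestamp", "eventTime")

// Tumbling one-minute windows on event time, tolerating 10 minutes of late data.
val counts = events
  .withWatermark("eventTime", "10 minutes")
  .groupBy(window($"eventTime", "1 minute"))
  .count()
```

With the DStream API there is no built-in notion of event time; windows there are defined over processing time.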

Kafka 0.8.x / 0.9.x support in structured streaming

2017-05-15 Thread Swapnil Chougule
Hello, I am new to Structured Streaming. I wanted to learn whether there is support for Kafka 0.8.x or Kafka 0.9.x in Structured Streaming. My Kafka source is version 0.9.x and I want to have a Structured Streaming solution on top of it. I checked the documentation for Spark release 2.1.0 but didn't get exact
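For context: the Structured Streaming Kafka source (the `spark-sql-kafka-0-10` package) requires Kafka brokers 0.10.0 or newer; for 0.8.x brokers the usual route at the time was the DStream-based `spark-streaming-kafka-0-8` connector instead. A sketch of the 0.10+ source, with placeholder broker and topic names:

```scala
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder.appName("KafkaSource").master("local[*]").getOrCreate()

// Requires the spark-sql-kafka-0-10 package on the classpath and
// Kafka brokers 0.10 or newer; there is no Structured Streaming
// source for 0.8.x / 0.9.x brokers.
val df = spark.readStream
  .format("kafka")
  .option("kafka.bootstrap.servers", "host1:9092")
  .option("subscribe", "topic1")
  .load()
  .selectExpr("CAST(key AS STRING)", "CAST(value AS STRING)")
```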