Window function / streaming

2017-07-04 Thread Julien CHAMP
Hi there! Let me explain my problem to see if you have a good solution to help me :) Let's imagine that I have all my data in a DB or a file, which I load into a dataframe DF with the following columns:
*id | timestamp(ms) | value*
A | 100 | 100
A | 110 | 50
B | 100 | 100
B |

Re: Analysis Exception after join

2017-07-04 Thread Bernard Jesop
It seems to be caused by SPARK-10925 (https://issues.apache.org/jira/browse/SPARK-10925). I added a checkpoint, as suggested, to break the lineage, and it worked. Best regards,

Re: Analysis Exception after join

2017-07-04 Thread Bernard Jesop
Thanks Didac, My bad, this code is actually incomplete; it should have been: dfAgg = df.groupBy("S_ID").agg(...). I want to access the aggregated values (of dfAgg) for each row of 'df', which is why I do a left outer join. Also, regarding the second parameter, I am using this signature of

Kafka 0.10 with PySpark

2017-07-04 Thread Daniel van der Ende
Hi, I'm working on integrating some PySpark code with Kafka. We'd like to use SSL/TLS, and so want to use Kafka 0.10. Because Structured Streaming is still marked alpha, we'd like to use Spark Streaming. On this page, however, it indicates that the Kafka 0.10 integration in Spark does not support

RE: [PySpark] - running processes and computing time

2017-07-04 Thread Sidney Feiner
To initialize it per executor, I used a class with only class attributes and class methods (like an `object` in Scala), but because I was using PySpark, it was actually being initialized per process ☹ What I went for was the broadcast variable, but there is still something suspicious with my

SparkSession via HS2 - is it supported?

2017-07-04 Thread Sudha KS
This is the code: I created a Java class extending org.apache.hadoop.hive.ql.udf.generic.GenericUDTF, which creates a SparkSession as: SparkSession spark = SparkSession.builder().enableHiveSupport().master("yarn-client").appName("SampleSparkUDTF_yarnV1").getOrCreate(); and tries to

Re: test mail

2017-07-04 Thread Sudhanshu Janghel
Test email received ;p On 4 Jul 2017 7:40 am, "Sudha KS" wrote:

test mail

2017-07-04 Thread Sudha KS