I have written this simple code to try streaming aggregation in Spark 2.4.
Somehow the job keeps running but never returns any result. If I remove the
groupBy and the count aggregation, it does return three columns: JobType,
Timestamp, and TS.
I would really appreciate any help.
val edgeDF = spark
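The snippet above is cut off, but the symptom described (a streaming groupBy/count that runs forever without emitting anything) is typically caused by using append output mode without a watermark: the engine can never finalize a group, so nothing is ever written to the sink. A minimal sketch of what a working version might look like, assuming a Kafka source and the JobType/Timestamp columns mentioned in the question (source, topic, schema, and intervals are all placeholders, not the poster's actual code):

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions._

val spark = SparkSession.builder().appName("stream-agg").getOrCreate()
import spark.implicits._

// Assumed source and schema for illustration only.
val edgeDF = spark.readStream
  .format("kafka")
  .option("kafka.bootstrap.servers", "localhost:9092")
  .option("subscribe", "edges")
  .load()
  .selectExpr("CAST(value AS STRING) AS JobType", "timestamp AS Timestamp")

val counts = edgeDF
  // Without a watermark, append mode never emits aggregated rows.
  .withWatermark("Timestamp", "10 minutes")
  .groupBy(window($"Timestamp", "5 minutes"), $"JobType")
  .count()

counts.writeStream
  .outputMode("append")   // "update" or "complete" would show partial counts
  .format("console")
  .start()
  .awaitTermination()
```

Switching the output mode to "update" or "complete" is a quick way to confirm the aggregation itself works before tuning the watermark.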
1 - I am not sure how I can do what you suggest for #1, because I use the
entries in the initial df to build the query, and from that query I get the
second df. Could you explain more?
2 - I also thought about doing what you suggest in #2, but if I am not
mistaken, if I use regular Scala data
Hi
The Spark Postgres JDBC reader is limited because it relies on plain
SELECT statements with fetchsize, and it crashes on large tables even when
multiple partitions are set up with lower/upper bounds.
I am about to write a new Postgres JDBC reader based on "COPY TO STDOUT".
It would stream the data
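The core of the idea above can be sketched with the pgjdbc driver's CopyManager API, which exposes "COPY ... TO STDOUT" directly. This is only an illustration of the streaming approach, not the proposed reader itself; the URL, credentials, table name, and output path are placeholders:

```scala
import java.io.FileOutputStream
import java.sql.DriverManager
import org.postgresql.PGConnection
import org.postgresql.copy.CopyManager

// Placeholder connection details.
val conn = DriverManager.getConnection(
  "jdbc:postgresql://localhost:5432/mydb", "user", "secret")
try {
  // Unwrap the Postgres-specific connection to reach the COPY API.
  val copyApi: CopyManager = conn.unwrap(classOf[PGConnection]).getCopyAPI
  val out = new FileOutputStream("/tmp/mytable.csv")
  try {
    // The server streams rows as CSV; the driver never materializes
    // the whole table in memory, unlike a paged SELECT.
    copyApi.copyOut("COPY mytable TO STDOUT WITH (FORMAT csv)", out)
  } finally out.close()
} finally conn.close()
```

In a Spark reader, each partition would presumably issue its own COPY with a WHERE clause and parse the resulting byte stream into rows.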
https://issues.apache.org/jira/browse/HIVE-13632
李斌松 wrote on Sat, Dec 29, 2018 at 4:08 PM:
> Hive has fixed this problem, but the fix is not in
> hive-exec-1.2.1.spark2.jar
>
> [image: image.png]
>
>