Hi there,
I read your question and I believe you are on the right path. But one thing worth
checking: are you able to connect to the S3 bucket from your worker nodes?
I did read that you can do it from your machine, but since the write happens
at the worker end, it might be worth verifying there as well.
I hope you are using the StreamingQuery object that is returned when you start
the Structured Streaming query? The returned object contains a lot of
information about each query, and tracking the state of that object should be
helpful.
Hope this helps; if not, could you please share more details with examples?
Best,
A
--
If I understand correctly, you need to create a UDF (if you are using Java).
Implement the appropriate UDF interface, e.g. UDF1, UDF2, etc., depending on
the number of arguments, and keep the static list as a member variable in your
class. You can then use this UDF as a filter on your stream directly.
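A minimal sketch of that idea; the class name, list contents, and column name are placeholders. The membership test itself is plain Java, with the Spark registration shown in comments since it needs spark-sql on the classpath:

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

public class InListFilter {
    // The static list held as a member of the class, as suggested above.
    private static final Set<String> ALLOWED =
        new HashSet<>(Arrays.asList("a", "b", "c")); // placeholder values

    public static boolean accept(String value) {
        return value != null && ALLOWED.contains(value);
    }

    // Spark wiring (illustrative; UDF1 because the filter takes one argument):
    // sqlContext.udf().register("inList",
    //     (UDF1<String, Boolean>) InListFilter::accept, DataTypes.BooleanType);
    // DataFrame filtered = streamDf.filter(callUDF("inList", streamDf.col("key")));

    public static void main(String[] args) {
        System.out.println(accept("a")); // true
        System.out.println(accept("z")); // false
    }
}
```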
On Tue, Feb 21, 2017 at
Oh, that's easy ... just add this to the above statement for each duplicate
column:
.drop(rightDf.col("x")).drop(rightDf.col("y"))
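Putting the pieces together, a sketch of the full statement (the DataFrame names follow the join example elsewhere in this thread, and x and y are the assumed join keys):

```java
// Left outer join on x and y, then drop the right-hand duplicates of the
// join columns so the result keeps a single x and a single y.
DataFrame joinedDf = leftDf
    .select("x", "y", "z")
    .join(rightDf,
          leftDf.col("x").equalTo(rightDf.col("x"))
              .and(leftDf.col("y").equalTo(rightDf.col("y"))),
          "left_outer")
    .drop(rightDf.col("x"))
    .drop(rightDf.col("y"));
```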
thanks!
--
View this message in context:
http://apache-spark-user-list.1001560.n3.nabble.com/Left-Right-Outer-join-on-multiple-Columns-tp26293p26328.html
Did you try this?

DataFrame joinedDf_intersect =
    leftDf.select("x", "y", "z")
        .join(rightDf,
              leftDf.col("x").equalTo(rightDf.col("x"))
                  .and(leftDf.col("y").equalTo(rightDf.col("y"))),
              "left_outer");
Hope that helps.
On Mon, Feb 22, 2016 at 12:22 PM, praneshvyas [via Apache Spark User List] <
Did you get any resolution for this?
--
View this message in context:
http://apache-spark-user-list.1001560.n3.nabble.com/Spark-on-YARN-using-Java-1-8-fails-tp24925p25039.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.
Hi there,
I have saved my records in Parquet format and am using Spark 1.5. But when I
try to fetch the columns it throws an exception:

java.lang.ClassCastException: java.lang.Long cannot be cast to
org.apache.spark.unsafe.types.UTF8String

This field was saved as a String while writing the Parquet file, so