Re: Spark 2.2 With Column usage
Hi,

Why are you doing the following two lines?

    .select("id", lit(referenceFiltered))
    .selectExpr("id")

What are you trying to achieve? What's lit and what's referenceFiltered? What's the difference between select and selectExpr?

Please start at http://spark.apache.org/docs/latest/sql-programming-guide.html and then hop onto http://spark.apache.org/docs/latest/api/scala/index.html#org.apache.spark.package to know the Spark API better. I'm sure you'll quickly find out the answer(s).

Regards,
Jacek Laskowski

https://about.me/JacekLaskowski
The Internals of Spark SQL https://bit.ly/spark-sql-internals
The Internals of Spark Structured Streaming https://bit.ly/spark-structured-streaming
The Internals of Apache Kafka https://bit.ly/apache-kafka-internals
Follow me at https://twitter.com/jaceklaskowski

On Sat, Jun 8, 2019 at 12:53 PM anbutech wrote:

> Thanks, Jacek Laskowski Sir, but I didn't get the point here.
>
> Please advise: is the below what you are expecting?
>
>     dataset1.as("t1")
>       .join(dataset3.as("t2"), col("t1.col1") === col("t2.col1"), JOINTYPE.Inner)
>       .join(dataset4.as("t3"), col("t3.col1") === col("t1.col1"), JOINTYPE.Inner)
>       .select("id", lit(referenceFiltered))
>       .selectExpr("id")
>
> --
> Sent from: http://apache-spark-user-list.1001560.n3.nabble.com/
>
> -
> To unsubscribe e-mail: user-unsubscr...@spark.apache.org
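On the select versus selectExpr question raised in this reply, a minimal sketch may help. It assumes a toy DataFrame with a single `id` column; the object name `SelectVsSelectExpr`, the helper names, and the literal "124567" are illustrative and do not come from the thread. Note that `select` accepts either column names only or Column objects only, so a call that mixes a String with a Column, such as `select("id", lit(x))`, does not compile.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.{col, lit}

// Hypothetical names throughout; this is a sketch, not code from the thread.
object SelectVsSelectExpr {

  // select takes Column objects here: col("id") plus a literal wrapped by lit.
  def viaSelect(df: DataFrame): DataFrame =
    df.select(col("id"), lit("124567").as("new_column"))

  // selectExpr takes SQL expression strings that Spark parses itself.
  def viaSelectExpr(df: DataFrame): DataFrame =
    df.selectExpr("id", "'124567' AS new_column")

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[1]")
      .appName("select-vs-selectExpr")
      .getOrCreate()
    import spark.implicits._

    val df = Seq(1L, 2L).toDF("id")

    // Both calls produce the columns id and new_column.
    viaSelect(df).show()
    viaSelectExpr(df).show()

    spark.stop()
  }
}
```

Both helpers yield the same two columns; the difference is only whether the projection is built from Column objects or from SQL text.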
Re: Spark 2.2 With Column usage
Thanks, Jacek Laskowski Sir, but I didn't get the point here.

Please advise: is the below what you are expecting?

    dataset1.as("t1")
      .join(dataset3.as("t2"), col("t1.col1") === col("t2.col1"), JOINTYPE.Inner)
      .join(dataset4.as("t3"), col("t3.col1") === col("t1.col1"), JOINTYPE.Inner)
      .select("id", lit(referenceFiltered))
      .selectExpr("id")
Re: Spark 2.2 With Column usage
Hi,

> val referenceFiltered = dataset2.filter(_.dataDate == date).filter(_.someColumn).select("id").toString
> .withColumn("new_column", lit(referenceFiltered))

That won't work since lit is a function (adapter) to convert Scala values to Catalyst expressions.

Unless I'm mistaken, in your case, what you really need is to replace `withColumn` with `select("id")` itself and you're done.

As I write this, I realize I'm describing exactly what you already have, and I'm feeling confused.

Regards,
Jacek Laskowski

On Sat, Jun 8, 2019 at 6:05 AM anbutech wrote:

> Hi Sir,
>
> Could you please advise how to fix the below issue with withColumn in
> Spark 2.2 (Scala 2.11) joins?
>
>     def processing(spark: SparkSession,
>                    dataset1: Dataset[Reference],
>                    dataset2: Dataset[DataCore],
>                    dataset3: Dataset[ThirdPartyData],
>                    dataset4: Dataset[OtherData],
>                    date: String): Dataset[DataMerge] = {
>
>       val referenceFiltered = dataset2.filter(_.dataDate == date).filter(_.someColumn).select("id").toString
>
>       dataset1.as("t1")
>         .join(dataset3.as("t2"), col("t1.col1") === col("t2.col1"), JOINTYPE.Inner)
>         .join(dataset4.as("t3"), col("t3.col1") === col("t1.col1"), JOINTYPE.Inner)
>         .withColumn("new_column", lit(referenceFiltered))
>         .selectExpr(
>           "id",      // ---> want to get this value
>           "column1",
>           "column2",
>           "column3",
>           "column4")
>     }
>
> How do I get the String value, let's say "124567" (referenceFiltered),
> inside the withColumn?
>
> I'm getting the withColumn output as "id:BigInt". I want to get the same
> value for all the records.
>
> Note:
>
> I have been asked not to use a cross join in the code. Is there any other
> way to fix this issue?
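One possible way to get the plain value into `lit`, sketched here under stated assumptions rather than taken from the thread: calling `toString` on a Dataset describes the Dataset itself instead of returning its rows, so the sketch brings the single matching id back to the driver first and only then wraps it with `lit`. It assumes at most one row of dataset2 matches the date; the case class fields, the object and helper names, and the sample rows are all hypothetical. No cross join is involved, because `lit` simply attaches a constant to every row.

```scala
import org.apache.spark.sql.{DataFrame, Dataset, SparkSession}
import org.apache.spark.sql.functions.{col, lit}

// Hypothetical names throughout; a sketch under the assumptions stated above.
object LitScalarSketch {

  // Stand-in for the DataCore type mentioned in the question (fields assumed).
  final case class DataCore(id: Long, dataDate: String, someColumn: Boolean)

  // Fetches the first matching id to the driver as a String, if any row matches.
  def referenceId(spark: SparkSession, data: Dataset[DataCore], date: String): Option[String] = {
    import spark.implicits._
    data
      .filter(r => r.dataDate == date && r.someColumn)
      .select(col("id").cast("string"))
      .as[String]
      .head(1)        // Array with zero or one element
      .headOption
  }

  // Adds the same constant to every row; a typed null when nothing matched.
  def withReference(joined: DataFrame, id: Option[String]): DataFrame = id match {
    case Some(value) => joined.withColumn("new_column", lit(value))
    case None        => joined.withColumn("new_column", lit(null).cast("string"))
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[1]")
      .appName("lit-scalar-sketch")
      .getOrCreate()
    import spark.implicits._

    val dataset2 = Seq(
      DataCore(124567L, "2019-06-08", someColumn = true),
      DataCore(1L, "2019-06-07", someColumn = true)
    ).toDS()

    // Stand-in for the result of the two joins in the question.
    val joined = Seq(("a", 1), ("b", 2)).toDF("column1", "column2")

    withReference(joined, referenceId(spark, dataset2, "2019-06-08")).show()

    spark.stop()
  }
}
```

The trade-off is an extra Spark job to fetch the value before the joins run, which is usually acceptable when the lookup returns a single small row.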
Spark 2.2 With Column usage
Hi Sir,

Could you please advise how to fix the below issue with withColumn in Spark 2.2 (Scala 2.11) joins?

    def processing(spark: SparkSession,
                   dataset1: Dataset[Reference],
                   dataset2: Dataset[DataCore],
                   dataset3: Dataset[ThirdPartyData],
                   dataset4: Dataset[OtherData],
                   date: String): Dataset[DataMerge] = {

      val referenceFiltered = dataset2.filter(_.dataDate == date).filter(_.someColumn).select("id").toString

      dataset1.as("t1")
        .join(dataset3.as("t2"), col("t1.col1") === col("t2.col1"), JOINTYPE.Inner)
        .join(dataset4.as("t3"), col("t3.col1") === col("t1.col1"), JOINTYPE.Inner)
        .withColumn("new_column", lit(referenceFiltered))
        .selectExpr(
          "id",      // ---> want to get this value
          "column1",
          "column2",
          "column3",
          "column4")
    }

How do I get the String value, let's say "124567" (referenceFiltered), inside the withColumn?

I'm getting the withColumn output as "id:BigInt". I want to get the same value for all the records.

Note:

I have been asked not to use a cross join in the code. Is there any other way to fix this issue?