Because of some legacy issues I can't immediately upgrade spark version. But I
try filter data before loading it into spark based on the suggestion by
val df = sparkSession.read.format("jdbc").option(...).option("dbtable",
"(select .. from ... where url <> '') table_name").load()
Hi James,
It is always advisable to use the latest Spark version. That said, could you
please give DataFrames and UDFs a try if possible? I think that would be a
much more scalable way to address the issue.
Also, where possible, it is advisable to apply the filter before fetching the
data.
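A minimal sketch of the DataFrame + UDF approach (the UDF body, the `url_no_query` column name, and the transformation itself are assumptions for illustration, not from your schema):

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.{col, udf}

// Hypothetical transformation: drop the query string from a URL.
// Kept as a plain Scala function so it can be unit-tested without Spark.
def stripQuery(url: String): String =
  url.takeWhile(_ != '?')

val spark = SparkSession.builder().getOrCreate()
val stripQueryUdf = udf(stripQuery _)

// Filter first, then transform via the UDF, all within the DataFrame API,
// so Spark can parallelize the work across the cluster.
val result = spark.table("table_a")
  .filter(col("url") =!= "")
  .withColumn("url_no_query", stripQueryUdf(col("url")))
```

Keeping the core logic in an ordinary function like `stripQuery` also makes it easy to reuse it outside Spark.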
I am very new to Spark. I just successfully set up Spark SQL connecting to a
postgresql database, and am able to display a table with
sparkSession.sql("SELECT id, url from table_a where col_b <> '' ").show()
Now I want to perform filter and map operations on the col_b value. In plain Scala it