If you have to get the data into Parquet format for other reasons anyway, then I
think count() on the Parquet should be better. If it is just the count you need,
having the database do the counting by sending dbtable = (select count(*) from ...) might be
quicker; it will avoid unnecessary data transfer from the database to Spark.
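The pushdown described above can be sketched like this. This is a minimal sketch, not a definitive implementation: `my_table`, `jdbcUrl`, and `driver` are hypothetical placeholders, and it assumes the same Spark JDBC options used in the question below.

```java
import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.SparkSession;

// Sketch: let the database compute the count and ship back a single row.
// "my_table", jdbcUrl, and driver are hypothetical placeholders.
SparkSession spark = SparkSession.builder().appName("count-pushdown").getOrCreate();

Dataset<Row> countDf = spark.read()
        .format("jdbc")
        .option("url", jdbcUrl)
        .option("driver", driver)
        // Spark uses this subquery as the FROM clause, so only the
        // aggregated result crosses the wire, not the full table.
        .option("dbtable", "(select count(*) as cnt from my_table) t")
        .load();

long rowCount = countDf.first().getLong(0);
```

The alias `t` after the subquery is required by most databases, since the subquery stands in for a table name in the generated SQL.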
Hi, which of the following is the better approach when there are too many rows in the database?
final Dataset<Row> dataset = spark.read()
.format("jdbc")
.option("url", params.getJdbcUrl())
.option("driver", params.getDriver())