You can use the EXPLAIN statement to see the optimized plan for each query (
https://stackoverflow.com/questions/35883620/spark-how-can-get-the-logical-physical-query-execution-using-thirft-hive
).
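For instance, a minimal sketch in Scala (assuming the spark-shell's built-in
SparkSession; the table and column names are borrowed from the first mail in
this thread, and the filter value is made up):

    // Print the query plans; extended = true also shows the analyzed
    // and optimized logical plans, not just the physical one.
    val df = spark.sql("SELECT businesskey FROM mytable WHERE businesskey = 'abc'")
    df.explain(true)

In the physical plan, the FileScan parquet line shows ReadSchema (the columns
actually read from the files) and PushedFilters (the predicates pushed down
to the Parquet reader).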
2018-03-19 0:52 GMT+07:00 CPC :
Hi nguyen,
Thank you for the quick response. But what I am trying to understand is that
in both queries evaluating the predicate requires only one column. So Spark
does not actually need to read all of the columns in the projection while
filtering, if they are not used in the filter predicate. Just to give an
example, Amazon Redshift has this
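To make the two cases concrete, here is a sketch (same hypothetical table as
in the first mail) that compares the plans side by side:

    // In both queries the filter only needs the small businesskey column.
    spark.sql("SELECT businesskey FROM mytable WHERE businesskey = 'abc'").explain()
    spark.sql("SELECT * FROM mytable WHERE businesskey = 'abc'").explain()
    // The first plan's ReadSchema contains only businesskey; the second
    // lists all four columns, because Spark reads every projected column
    // for the row groups that survive predicate pushdown.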
Hi @CPC,
Parquet is a columnar storage format, so if you want to read data from only
one column, you can do that without reading all of your data. Spark SQL also
includes a query optimizer (see
https://databricks.com/blog/2015/04/13/deep-dive-into-spark-sqls-catalyst-optimizer.html),
so it will
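Regarding the columnar read, a minimal sketch (the path is hypothetical):

    // Because Parquet stores each column's data contiguously, selecting
    // a single column only reads that column's chunks from disk.
    val df = spark.read.parquet("/path/to/mytable")
    df.select("businesskey").show()

Catalyst's column pruning pushes the projection down into the scan, so the
resulting FileScan reads only the businesskey column.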
Hi everybody,
I am trying to understand how Spark reads Parquet files, but I am a little
confused. I have a table with 4 columns, named businesskey, transactionname,
request and response. The request and response columns are huge (10-50 KB
each). When I execute a query like
"select * from mytable