Hello again, Spark users and developers! I have a standalone Spark cluster (1.1.0) with Spark SQL running on it. The cluster consists of 4 datanodes, and the file replication factor is 3.
I use the Thrift server to access Spark SQL and have one table with 30+ partitions. When I run a query over the whole table (something simple like select count(*) from t), Spark generates a lot of network activity, saturating the available 1 Gb link. It looks like Spark is sending the data over the network instead of reading it locally. Is there any way to log which blocks were read locally and which were not? Thanks!