Hi community, I have set up a 3-node Spark cluster in standalone mode; each machine has 16 GB of memory and 4 cores.
When I run

    val file = sc.textFile("/user/hive/warehouse/b/test.txt")
    file.filter(line => line.contains("2013-")).count()

it takes 2.7 s, but when I run "select count(*) from b;" using Shark, it takes 15.81 s. So why does Shark take more time than Spark?

Other info:
1. I have set export SPARK_MEM=10g in shark-env.sh.
2. test.txt is 4.21 GB and exists on each machine in the directory /user/hive/warehouse/b/, and test.txt has been loaded into memory.
3. There are 38,532,979 lines in test.txt.
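One thing worth noting about the comparison above: the two jobs do not compute the same thing. The Spark snippet counts only the lines containing "2013-", while "select count(*)" counts every row. A minimal plain-Scala sketch (using a hypothetical five-line sample standing in for test.txt) illustrates the difference:

```scala
// Hypothetical sample lines standing in for the contents of test.txt.
val lines = Seq(
  "2013-01-01 a",
  "2012-12-31 b",
  "2013-02-15 c",
  "2011-07-04 d",
  "2013-03-03 e"
)

// What the Spark job computes: only lines matching the filter.
val filteredCount = lines.count(_.contains("2013-"))

// What "select count(*)" computes: every row.
val totalCount = lines.size

println(s"filtered=$filteredCount total=$totalCount")
```

So part of the timing gap may simply be that the Shark query scans and counts all 38,532,979 rows, whereas the Spark filter may short-circuit much of the per-line work.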