Could you check the Spark web UI for the number of tasks issued when the
query is executed? I dug out |mapred.map.tasks| because I saw that 2 tasks
were issued.
On 2/26/15 3:01 AM, Kannan Rajah wrote:
Cheng, We tried this setting and it still did not help. This was on
Spark 1.2.0.
--
Kannan
On Mon, Feb 23, 2015 at 6:38 PM, Cheng Lian <lian.cs....@gmail.com> wrote:
(Move to user list.)
Hi Kannan,
You need to set |mapred.map.tasks| to 1 in hive-site.xml. The
reason is this line of code
<https://github.com/apache/spark/blob/master/sql/hive/src/main/scala/org/apache/spark/sql/hive/TableReader.scala#L68>,
which overrides |spark.default.parallelism|. Also,
|spark.sql.shuffle.partitions| isn’t used here since there’s no
shuffle involved (we only need to sort within each partition).
The default value of |mapred.map.tasks| is 2
<https://hadoop.apache.org/docs/r1.0.4/mapred-default.html>. You
can see that the Spark SQL result splits down the middle into two
halves, each of which is sorted on its own.
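To illustrate the point, here is a minimal plain-Python sketch (not Spark code) of sorting within partitions. It assumes the eight input rows are split evenly into contiguous partitions and that each partition is sorted by age with a stable sort. The function name `sort_within_partitions` is made up for this example.

```python
# Rows in the order they appear in the input file: (name, age).
rows = [
    ("Aditya", 28), ("aash", 25), ("prashanth", 27), ("bharath", 26),
    ("terry", 27), ("nanda", 26), ("pradeep", 27), ("pratyay", 26),
]


def sort_within_partitions(rows, num_partitions):
    """Split rows into contiguous partitions and sort each one by age.

    Hypothetical helper: it mimics "sort by" semantics (ordering within
    each partition only), not a global "order by".
    """
    size = -(-len(rows) // num_partitions)  # ceiling division
    partitions = [rows[i:i + size] for i in range(0, len(rows), size)]
    result = []
    for part in partitions:
        # sorted() is stable, so rows with equal ages keep their input order.
        result.extend(sorted(part, key=lambda r: r[1]))
    return result


two_parts = sort_within_partitions(rows, 2)  # two map tasks
one_part = sort_within_partitions(rows, 1)   # a single task

for name, age in two_parts:
    print(name, age)
```

Under these assumptions, two partitions give the same ordering as the Spark SQL output reported in the original message, and a single partition gives the same ordering as the Hive output.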
Cheng
On 2/19/15 10:33 AM, Kannan Rajah wrote:
According to the Hive documentation, "sort by" is supposed to order the results
for each reducer. So if we set a single reducer, then the results should be
sorted, right? But this is not happening. Any idea why? It looks like the
settings I am using to restrict the number of reducers are not having an
effect.
*Tried the following:*
Set spark.default.parallelism to 1
Set spark.sql.shuffle.partitions to 1
These were set in hive-site.xml and also inside the Spark shell.
*Spark-SQL*
create table if not exists testSortBy (key int, name string, age int);
LOAD DATA LOCAL INPATH '/home/mapr/sample-name-age.txt' OVERWRITE INTO TABLE testSortBy;
select * from testSortBY;
1 Aditya 28
2 aash 25
3 prashanth 27
4 bharath 26
5 terry 27
6 nanda 26
7 pradeep 27
8 pratyay 26
set spark.default.parallelism=1;
set spark.sql.shuffle.partitions=1;
select name,age from testSortBy sort by age;
aash 25
bharath 26
prashanth 27
Aditya 28
nanda 26
pratyay 26
terry 27
pradeep 27
*HIVE*
select name,age from testSortBy sort by age;
aash 25
bharath 26
nanda 26
pratyay 26
prashanth 27
terry 27
pradeep 27
Aditya 28
--
Kannan