[ https://issues.apache.org/jira/browse/SPARK-9604?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14695617#comment-14695617 ]
Davies Liu commented on SPARK-9604:
-----------------------------------

[~cloud_fan] Yeah, the test looks much better now. Because [~rxin] removed the old generated aggregation, it now runs in safe mode. It's still very slow if we do distinct on ArrayData or ArrayMap in tungsten mode.

> Unsafe ArrayData and MapData is very very slow
> ----------------------------------------------
>
>                 Key: SPARK-9604
>                 URL: https://issues.apache.org/jira/browse/SPARK-9604
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>            Reporter: Davies Liu
>            Assignee: Wenchen Fan
>            Priority: Critical
>
> After the unsafe ArrayData and MapData were merged in, this test became very slow
> (from less than 1 second to more than 35 seconds).
> https://amplab.cs.berkeley.edu/jenkins/job/Spark-Master-SBT/AMPLAB_JENKINS_BUILD_PROFILE=hadoop2.3,label=centos/3157/testReport/org.apache.spark.sql.columnar/InMemoryColumnarQuerySuite/test_different_data_types/history/
> I tried disabling the cache, but it's still very slow (almost the same). Once
> ArrayData and ArrayMap are removed, it becomes much faster (but still takes about 10
> seconds).
> Related changes:
> https://amplab.cs.berkeley.edu/jenkins/job/Spark-Master-SBT/AMPLAB_JENKINS_BUILD_PROFILE=hadoop2.3,label=centos/3148/changes
> Also, the duration of the Hive tests increased from 32 min to 45 min:
> https://amplab.cs.berkeley.edu/jenkins/job/Spark-Master-SBT/AMPLAB_JENKINS_BUILD_PROFILE=hadoop2.3,label=centos/3154/testReport/junit/org.apache.spark.sql.hive.execution/history/
> cc [~rxin]