[ https://issues.apache.org/jira/browse/KNOX-1524?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16650250#comment-16650250 ]
Kevin Risden commented on KNOX-1524:
------------------------------------

h2. Test Case and Reproduction

The following results were obtained with:
* a single 4-core, 8GB RAM CentOS 7 VM on my MacBook Pro laptop
* openjdk version "1.8.0_181"
* Hadoop 3.1.1 single node pseudo distributed
* Hive 3.1.0 with single HiveServer2 node
** {code:java}
/opt/apache-hive-3.1.0-bin/bin/hiveserver2 --hiveconf hive.server2.transport.mode=http --hiveconf hive.server2.enable.doAs=false --hiveconf fs.hdfs.impl.disable.cache=true --hiveconf fs.file.impl.disable.cache=true{code}
** Enabling or disabling the filesystem cache did not change the results
* Knox 1.1.0 without SSL
* data set - [http://stat-computing.org/dataexpo/2009/the-data.html] - 1990.csv - 486MB
* "select *" from table with single column
* Limit to first 1 million rows

Create table
{code:java}
CREATE TABLE tbl (a string) STORED AS TEXTFILE LOCATION '/tmp/1990';{code}

Testing commands
* HDFS native
** {code:java}
time hdfs dfs -text /tmp/1990/1990.csv | head -n 1000000 > /dev/null{code}
* Hive binary
** {code:java}
time /opt/apache-hive-3.1.0-bin/bin/beeline -u 'jdbc:hive2://hive.vagrant:10000/' -n admin -p admin-password -e 'select * from tbl limit 1000000' > /dev/null{code}
* Hive HTTP
** {code:java}
time /opt/apache-hive-3.1.0-bin/bin/beeline -u 'jdbc:hive2://hive.vagrant:10001/;transportMode=http;httpPath=cliservice' -n admin -p admin-password -e 'select * from tbl limit 1000000' > /dev/null{code}
* Hive Knox
** {code:java}
time /opt/apache-hive-3.1.0-bin/bin/beeline -u 'jdbc:hive2://hive.vagrant:8443/;transportMode=http;httpPath=gateway/sandbox/hive' -n admin -p admin-password -e 'select * from tbl limit 1000000' > /dev/null{code}

Assumptions
* JVM startup time is approximately the same for each run
* Hive is using native Hadoop libraries (checked with ps aux | grep native)

> Hive "select *" performance evaluation
> --------------------------------------
>
> Key: KNOX-1524
> URL: https://issues.apache.org/jira/browse/KNOX-1524
> Project: Apache Knox
> Issue Type: Task
> Reporter: Kevin Risden
> Assignee: Kevin Risden
> Priority: Major
> Fix For: 1.2.0
>
>
> While looking at WebHDFS performance in KNOX-1221, I decided to look a bit
> more into performance for common use cases. Hive performance is another area
> that could use some research.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)