Hello Hive Experts,

I am a software engineer at Microsoft, and I am having trouble running a standalone Hive metastore service on my Windows 10 machine. Your assistance would be greatly appreciated.

[0] GitHub project (running branch "3.1"): <https://github.com/apache/hive/tree/master/standalone-metastore>
[1] Related documentation: <https://cwiki.apache.org/confluence/display/Hive/AdminManual+Metastore+3.0+Administration>
[2] Path to setup scripts: $HOME\hive\standalone-metastore\src\main\scripts

I was able to build it successfully using Maven, but I am having trouble running the jar. Here are my questions:

1. I believe this standalone metastore project is relatively new, as it was introduced in Hive 3.0. When was the stable version of the standalone metastore released?

2. While the documentation [1] helps with the config setup, is there a resource for step-by-step guidance on how to bootstrap the standalone metastore service on a Windows 10 or Linux machine? Without it, I have been trying to reverse engineer how to get it running.

   * For starters, it seems that the scripts under the directory above [2] are not Windows 10 friendly because of carriage returns. Having to use Cygwin also left me unsure which path convention to use for the environment/system variables (E:\src\hive\... vs. /cygdrive/e/src/hive/). I removed the carriage returns and used the /cygdrive/ convention, which got it working partially.

   * I had no clue which environment/system variables I needed or whether there were any dependencies. I assumed there were none, because the documentation [1] notes the independent nature of the standalone metastore project. However, by studying the scripts (base, start-metastore, and metastore.sh) under the path above [2], I found the following:

       i. I need to define the METASTORE_HOME ($HOME\hive\standalone-metastore\target\apache-hive-metastore-3.1.1-bin\apache-hive-metastore-3.1.1-bin) and METASTORE_CONF_DIR environment variables.

       ii. I need to install Hadoop as a dependency, because the metastore.sh script uses it to start the metastore service. So I installed it and defined the HADOOP_HOME ($HOME\hadoop-3.1.1) environment variable. (I also had to remove the carriage returns from the files under $HOME\hadoop-3.1.1.)

       iii. I have no environment variables or dependencies other than the ones mentioned above.

   * At this point, the metastore service began to start; however, I ran into the exception "Failed to get schema version". More information here: <https://community.hortonworks.com/questions/15136/orgapachehadoophivemetastorehivemetaexception-fail.html>. I believe this is because the default Derby database was not initialized.

   * So, using the schematool script under my apache-hive-metastore-3.1.1-bin directory, I ran schematool --dbType derby -initSchema. I then ran into the exception "Unknown version specified for initialization: 3.1.0", which is thrown here: <https://github.com/apache/hive/blob/e867d1c693e966706d3b7c6fe18e039a85928f51/standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/MetaStoreSchemaInfo.java#L137>. It cannot find the Derby schema script, but I confirmed that the script is there ($HOME\hive\standalone-metastore\target\apache-hive-metastore-3.1.1-bin\apache-hive-metastore-3.1.1-bin\scripts\metastore\upgrade\derby\hive-schema-3.1.0.derby.sql). This led me to believe that there is again a conflict between the Windows "\" and Linux "/" file path conventions, and I have hit a dead end.

For the time being, I am redirecting my efforts to setting up a Linux machine to see if I have a smoother experience there, but any help with the concerns and issues above would be greatly appreciated.

Thank you!
Joo Wan
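
P.S. For anyone trying to reproduce this, the two manual steps described above (removing the carriage returns and defining the three environment variables) could be sketched roughly as the shell helpers below. This is only an illustration under stated assumptions, not the exact commands that were run: the function names strip_cr and set_metastore_env are hypothetical, the directory arguments are placeholders, and pointing METASTORE_CONF_DIR at a conf directory inside the distribution is an assumption.

```shell
# Sketch only: helper names and paths are illustrative placeholders.

# Remove every carriage return from a file in place, so that a script
# checked out with Windows line endings can be run by bash under Cygwin.
strip_cr() {
  local file="$1"
  local tmp
  tmp="$(mktemp)"
  # Write through a temp file, then copy back so the original file
  # keeps its permissions.
  tr -d '\r' < "$file" > "$tmp" && cat "$tmp" > "$file"
  rm -f "$tmp"
}

# Export the three variables the metastore scripts turned out to need.
#   $1 = unpacked apache-hive-metastore-3.1.1-bin directory
#   $2 = Hadoop installation directory
set_metastore_env() {
  export METASTORE_HOME="$1"
  # Assumption: the configuration lives in a conf directory inside the distribution.
  export METASTORE_CONF_DIR="$1/conf"
  export HADOOP_HOME="$2"
}
```

Under Cygwin, one would presumably call set_metastore_env with /cygdrive/-style paths and run strip_cr over each of the setup scripts before starting the metastore.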