Are you trying to debug the HiveCatalog code? You can refer to the test cases in Flink: some of our tests run with an embedded metastore (e.g. HiveCatalogHiveMetadataTest), while others start a standalone HMS process (e.g. TableEnvHiveConnectorTest).
On Wed, May 27, 2020 at 7:27 PM Zhou Zach <wander...@163.com> wrote:

> Yes, I found that — thanks for the pointer. One more question: when you
> debug with IntelliJ IDEA, do you do it locally? If so, do I need to set up
> a local Hadoop cluster, or at least a local Hive? Or do you attach IntelliJ
> IDEA to a remote cluster — and if the cluster is on Alibaba Cloud, would
> that require opening extra ports?
>
> At 2020-05-27 19:19:58, "Rui Li" <lirui.fu...@gmail.com> wrote:
> >year is a reserved keyword in Calcite — try quoting it as `year`.
> >
> >On Wed, May 27, 2020 at 7:09 PM Zhou Zach <wander...@163.com> wrote:
> >
> >> The program finished with the following exception:
> >>
> >> org.apache.flink.client.program.ProgramInvocationException: The main
> >> method caused an error: SQL parse failed. Encountered "year =" at line 4, column 51.
> >> Was expecting one of:
> >>     "ARRAY" "CASE" "CURRENT" "CURRENT_CATALOG" "CURRENT_DATE"
> >>     "CURRENT_DEFAULT_TRANSFORM_GROUP" "CURRENT_PATH" "CURRENT_ROLE"
> >>     "CURRENT_SCHEMA" "CURRENT_TIME" "CURRENT_TIMESTAMP" "CURRENT_USER"
> >>     "DATE" "EXISTS" "FALSE" "INTERVAL" "LOCALTIME" "LOCALTIMESTAMP"
> >>     "MULTISET" "NEW" "NEXT" "NOT" "NULL" "PERIOD" "SESSION_USER"
> >>     "SYSTEM_USER" "TIME" "TIMESTAMP" "TRUE" "UNKNOWN" "USER"
> >>     <UNSIGNED_INTEGER_LITERAL> <APPROX_NUMERIC_LITERAL>
> >>     <DECIMAL_NUMERIC_LITERAL> <BINARY_STRING_LITERAL> <QUOTED_STRING>
> >>     <PREFIXED_STRING_LITERAL> <UNICODE_STRING_LITERAL> <LBRACE_D>
> >>     <LBRACE_T> <LBRACE_TS> <LBRACE_FN> "?" "+" "-"
> >>     <BRACKET_QUOTED_IDENTIFIER> <QUOTED_IDENTIFIER>
> >>     <BACK_QUOTED_IDENTIFIER> <IDENTIFIER> <UNICODE_QUOTED_IDENTIFIER>
> >>     "CAST" "EXTRACT" "POSITION" "CONVERT" "TRANSLATE" "OVERLAY" "FLOOR"
> >>     "CEIL" "CEILING" "SUBSTRING" "TRIM" "CLASSIFIER" "MATCH_NUMBER"
> >>     "RUNNING" "PREV" "JSON_EXISTS" "JSON_VALUE" "JSON_QUERY"
> >>     "JSON_OBJECT" "JSON_OBJECTAGG" "JSON_ARRAY" "JSON_ARRAYAGG" "MAP"
> >>     "SPECIFIC" "ABS" "AVG" "CARDINALITY" "CHAR_LENGTH"
> >>     "CHARACTER_LENGTH" "COALESCE" "COLLECT" "COVAR_POP" "COVAR_SAMP"
> >>     "CUME_DIST" "COUNT" "DENSE_RANK" "ELEMENT" "EXP" "FIRST_VALUE"
> >>     "FUSION" "GROUPING" "HOUR" "LAG" "LEAD" "LEFT" "LAST_VALUE" "LN"
> >>     "LOWER" "MAX" "MIN" "MINUTE" "MOD" "MONTH" "NTH_VALUE" "NTILE"
> >>     "NULLIF" "OCTET_LENGTH" "PERCENT_RANK" "POWER" "RANK" "REGR_COUNT"
> >>     "REGR_SXX" "REGR_SYY" "RIGHT" "ROW_NUMBER" "SECOND" "SQRT"
> >>     "STDDEV_POP" "STDDEV_SAMP" "SUM" "UPPER" "TRUNCATE" "VAR_POP"
> >>     "VAR_SAMP" "YEAR" "YEAR" "(" ...
> >> at org.apache.flink.client.program.PackagedProgram.callMainMethod(PackagedProgram.java:335)
> >> at org.apache.flink.client.program.PackagedProgram.invokeInteractiveModeForExecution(PackagedProgram.java:205)
> >> at org.apache.flink.client.ClientUtils.executeProgram(ClientUtils.java:138)
> >> at org.apache.flink.client.cli.CliFrontend.executeProgram(CliFrontend.java:664)
> >> at org.apache.flink.client.cli.CliFrontend.run(CliFrontend.java:213)
> >> at org.apache.flink.client.cli.CliFrontend.parseParameters(CliFrontend.java:895)
> >> at org.apache.flink.client.cli.CliFrontend.lambda$main$10(CliFrontend.java:968)
> >> at java.security.AccessController.doPrivileged(Native Method)
> >> at javax.security.auth.Subject.doAs(Subject.java:422)
> >> at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1875)
> >> at org.apache.flink.runtime.security.HadoopSecurityContext.runSecured(HadoopSecurityContext.java:41)
> >> at org.apache.flink.client.cli.CliFrontend.main(CliFrontend.java:968)
> >> Caused by: org.apache.flink.table.api.SqlParserException: SQL parse
> >> failed. Encountered "year =" at line 4, column 51.
> >> Was expecting one of:
> >>     [same expected-token list as above]
> >> at org.apache.flink.table.planner.calcite.CalciteParser.parse(CalciteParser.java:50)
> >> at org.apache.flink.table.planner.delegation.ParserImpl.parse(ParserImpl.java:64)
> >> at org.apache.flink.table.api.internal.TableEnvironmentImpl.sqlUpdate(TableEnvironmentImpl.java:484)
> >>
> >> At 2020-05-27 19:08:09, "Rui Li" <lirui.fu...@gmail.com> wrote:
> >> >What error do you get when reading the Hive partitioned table? Could you paste the stacktrace?
> >> >
> >> >On Wed, May 27, 2020 at 6:08 PM Zhou Zach <wander...@163.com> wrote:
> >> >
> >> >> Hive partition table:
> >> >>
> >> >> CREATE TABLE `dwd.bill`(
> >> >>   `id` bigint,
> >> >>   `gid` bigint,
> >> >>   `count` bigint,
> >> >>   `price` bigint,
> >> >>   `srcuid` bigint,
> >> >>   `srcnickname` string,
> >> >>   `srcleftmoney` bigint,
> >> >>   `srcwealth` bigint,
> >> >>   `srccredit` decimal(10,0),
> >> >>   `dstnickname` string,
> >> >>   `dstuid` bigint,
> >> >>   `familyid` int,
> >> >>   `dstleftmoney` bigint,
> >> >>   `dstwealth` bigint,
> >> >>   `dstcredit` decimal(10,0),
> >> >>   `addtime` bigint,
> >> >>   `type` int,
> >> >>   `getmoney` decimal(10,0),
> >> >>   `os` int,
> >> >>   `bak` string,
> >> >>   `getbonus` decimal(10,0),
> >> >>   `unionbonus` decimal(10,0))
> >> >> PARTITIONED BY (
> >> >>   `year` int,
> >> >>   `month` int)
> >> >> ROW FORMAT SERDE
> >> >>   'org.apache.hadoop.hive.ql.io.parquet.serde.ParquetHiveSerDe'
> >> >> STORED AS INPUTFORMAT
> >> >>   'org.apache.hadoop.hive.ql.io.parquet.MapredParquetInputFormat'
> >> >> OUTPUTFORMAT
> >> >>   'org.apache.hadoop.hive.ql.io.parquet.MapredParquetOutputFormat'
> >> >>
> >> >> Query:
> >> >>
> >> >> tableEnv.sqlUpdate(
> >> >>   """
> >> >>     |INSERT INTO catalog2.dwd.orders
> >> >>     |select srcuid, price from catalog2.dwd.bill where year = 2020
> >> >>     |""".stripMargin)
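[Editor's note] Per Rui Li's point upthread that year is a reserved keyword in Calcite, back-quoting the partition column should get the statement past the parser. A sketch of the corrected query, with the catalog and table names taken unchanged from the thread:

```sql
INSERT INTO catalog2.dwd.orders
SELECT srcuid, price FROM catalog2.dwd.bill WHERE `year` = 2020
```

The same applies to the other partition column, `month`, which is also a Calcite reserved word: quote it as `` `month` `` whenever it appears in a Flink SQL filter or projection.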
> >> >>
> >> >> At 2020-05-27 18:01:19, "Leonard Xu" <xbjt...@gmail.com> wrote:
> >> >> >Flink does support Hive partitioned tables. I see you pasted them in
> >> >> >another email — could you paste your Hive table and query in this
> >> >> >thread?
> >> >> >
> >> >> >Best,
> >> >> >Leonard Xu
> >> >> >
> >> >> >> On May 27, 2020, at 17:40, Zhou Zach <wander...@163.com> wrote:
> >> >> >>
> >> >> >> Thanks for the reply — with the catalog and db prefixes in front
> >> >> >> of the table name, access works now.
> >> >> >> But I've hit a new problem: when Flink reads a Hive partitioned
> >> >> >> table, filtering in the WHERE clause on a partition key such as
> >> >> >> year fails with an error, while filtering on other columns of the
> >> >> >> table works fine. Does Flink not support Hive partitioned tables,
> >> >> >> or is something misconfigured on my side?
> >> >> >>
> >> >> >> At 2020-05-27 17:33:11, "Leonard Xu" <xbjt...@gmail.com> wrote:
> >> >> >>> Hi,
> >> >> >>>> because one HiveCatalog can only be associated with one database
> >> >> >>> A catalog can be associated with multiple databases, and tables
> >> >> >>> in different catalogs and different databases are all accessible:
> >> >> >>>
> >> >> >>> Flink SQL> show catalogs;
> >> >> >>> default_catalog
> >> >> >>> myhive
> >> >> >>> Flink SQL> use catalog myhive;
> >> >> >>> Flink SQL> show databases;
> >> >> >>> default
> >> >> >>> hive_test
> >> >> >>> hive_test1
> >> >> >>> Flink SQL> select * from hive_test.db2_table union select * from myhive.hive_test1.db1_table;
> >> >> >>> 2020-05-27 17:25:48,565 INFO  org.apache.hadoop.hive.conf.HiveConf
> >> >> >>>
> >> >> >>> Best,
> >> >> >>> Leonard Xu
> >> >> >>>
> >> >> >>>> On May 27, 2020, at 10:55, Zhou Zach <wander...@163.com> wrote:
> >> >> >>>>
> >> >> >>>> hi all,
> >> >> >>>> Is the HiveCatalog in Flink SQL unable to operate across
> >> >> >>>> databases? That is, the two tables joined in one Flink SQL
> >> >> >>>> statement belong to two different databases, and one HiveCatalog
> >> >> >>>> can only be associated with one database.
> >> >
> >> >--
> >> >Best regards!
> >> >Rui Li
> >
> >--
> >Best regards!
> >Rui Li

--
Best regards!
Rui Li
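[Editor's note] The general takeaway from this thread is that any column whose name collides with a Calcite reserved word (year, month, user, count, ...) must be back-quoted in Flink SQL, even though plain Hive accepts it unquoted. A minimal, hypothetical helper illustrating the rule — the reserved-word set below is a small illustrative subset, not Calcite's full list (see the parser error above for the real one):

```java
import java.util.Locale;
import java.util.Set;

public class SqlIdentifiers {
    // Illustrative subset of Calcite reserved words; the real list is
    // much longer (assumption: subset chosen from the thread's error dump).
    private static final Set<String> RESERVED = Set.of(
            "YEAR", "MONTH", "DAY", "HOUR", "MINUTE", "SECOND",
            "USER", "COUNT", "CURRENT", "LEFT", "RIGHT");

    // Back-quote an identifier only when it collides with a reserved word,
    // so it can safely be used as a column name in Flink SQL.
    public static String quote(String ident) {
        return RESERVED.contains(ident.toUpperCase(Locale.ROOT))
                ? "`" + ident + "`"
                : ident;
    }

    public static void main(String[] args) {
        System.out.println(quote("year"));  // `year`
        System.out.println(quote("price")); // price
    }
}
```

With such a helper, generated WHERE clauses over partition columns like `year` and `month` stay parseable regardless of which column names a Hive table happens to use.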