你是想要调试HiveCatalog的代码么?可以参考flink里的测试用例,我们有的测试是用embedded模式做的(比如HiveCatalogHiveMetadataTest),有些测试是单独起一个HMS进程(比如TableEnvHiveConnectorTest)。

On Wed, May 27, 2020 at 7:27 PM Zhou Zach <wander...@163.com> wrote:

> 是的,发现了,感谢指点。请教下,用intellij
> idea调试,你是在本地调试吗,那样的话,要在本地搭建个hadoop集群吗,至少要搭建个本地的hive吧,还是直接用intellij
> idea连接远程,如果集群在阿里云上,是不是要另外开端口的
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
> 在 2020-05-27 19:19:58,"Rui Li" <lirui.fu...@gmail.com> 写道:
> >year在calcite里是保留关键字,你用`year`试试呢
> >
> >On Wed, May 27, 2020 at 7:09 PM Zhou Zach <wander...@163.com> wrote:
> >
> >> The program finished with the following exception:
> >>
> >>
> >> org.apache.flink.client.program.ProgramInvocationException: The main
> >> method caused an error: SQL parse failed. Encountered "year =" at line
> 4,
> >> column 51.
> >> Was expecting one of:
> >>     "ARRAY" ...
> >>     "CASE" ...
> >>     "CURRENT" ...
> >>     "CURRENT_CATALOG" ...
> >>     "CURRENT_DATE" ...
> >>     "CURRENT_DEFAULT_TRANSFORM_GROUP" ...
> >>     "CURRENT_PATH" ...
> >>     "CURRENT_ROLE" ...
> >>     "CURRENT_SCHEMA" ...
> >>     "CURRENT_TIME" ...
> >>     "CURRENT_TIMESTAMP" ...
> >>     "CURRENT_USER" ...
> >>     "DATE" ...
> >>     "EXISTS" ...
> >>     "FALSE" ...
> >>     "INTERVAL" ...
> >>     "LOCALTIME" ...
> >>     "LOCALTIMESTAMP" ...
> >>     "MULTISET" ...
> >>     "NEW" ...
> >>     "NEXT" ...
> >>     "NOT" ...
> >>     "NULL" ...
> >>     "PERIOD" ...
> >>     "SESSION_USER" ...
> >>     "SYSTEM_USER" ...
> >>     "TIME" ...
> >>     "TIMESTAMP" ...
> >>     "TRUE" ...
> >>     "UNKNOWN" ...
> >>     "USER" ...
> >>     <UNSIGNED_INTEGER_LITERAL> ...
> >>     <APPROX_NUMERIC_LITERAL> ...
> >>     <DECIMAL_NUMERIC_LITERAL> ...
> >>     <BINARY_STRING_LITERAL> ...
> >>     <QUOTED_STRING> ...
> >>     <PREFIXED_STRING_LITERAL> ...
> >>     <UNICODE_STRING_LITERAL> ...
> >>     <LBRACE_D> ...
> >>     <LBRACE_T> ...
> >>     <LBRACE_TS> ...
> >>     <LBRACE_FN> ...
> >>     "?" ...
> >>     "+" ...
> >>     "-" ...
> >>     <BRACKET_QUOTED_IDENTIFIER> ...
> >>     <QUOTED_IDENTIFIER> ...
> >>     <BACK_QUOTED_IDENTIFIER> ...
> >>     <IDENTIFIER> ...
> >>     <UNICODE_QUOTED_IDENTIFIER> ...
> >>     "CAST" ...
> >>     "EXTRACT" ...
> >>     "POSITION" ...
> >>     "CONVERT" ...
> >>     "TRANSLATE" ...
> >>     "OVERLAY" ...
> >>     "FLOOR" ...
> >>     "CEIL" ...
> >>     "CEILING" ...
> >>     "SUBSTRING" ...
> >>     "TRIM" ...
> >>     "CLASSIFIER" ...
> >>     "MATCH_NUMBER" ...
> >>     "RUNNING" ...
> >>     "PREV" ...
> >>     "JSON_EXISTS" ...
> >>     "JSON_VALUE" ...
> >>     "JSON_QUERY" ...
> >>     "JSON_OBJECT" ...
> >>     "JSON_OBJECTAGG" ...
> >>     "JSON_ARRAY" ...
> >>     "JSON_ARRAYAGG" ...
> >>     "MAP" ...
> >>     "SPECIFIC" ...
> >>     "ABS" ...
> >>     "AVG" ...
> >>     "CARDINALITY" ...
> >>     "CHAR_LENGTH" ...
> >>     "CHARACTER_LENGTH" ...
> >>     "COALESCE" ...
> >>     "COLLECT" ...
> >>     "COVAR_POP" ...
> >>     "COVAR_SAMP" ...
> >>     "CUME_DIST" ...
> >>     "COUNT" ...
> >>     "DENSE_RANK" ...
> >>     "ELEMENT" ...
> >>     "EXP" ...
> >>     "FIRST_VALUE" ...
> >>     "FUSION" ...
> >>     "GROUPING" ...
> >>     "HOUR" ...
> >>     "LAG" ...
> >>     "LEAD" ...
> >>     "LEFT" ...
> >>     "LAST_VALUE" ...
> >>     "LN" ...
> >>     "LOWER" ...
> >>     "MAX" ...
> >>     "MIN" ...
> >>     "MINUTE" ...
> >>     "MOD" ...
> >>     "MONTH" ...
> >>     "NTH_VALUE" ...
> >>     "NTILE" ...
> >>     "NULLIF" ...
> >>     "OCTET_LENGTH" ...
> >>     "PERCENT_RANK" ...
> >>     "POWER" ...
> >>     "RANK" ...
> >>     "REGR_COUNT" ...
> >>     "REGR_SXX" ...
> >>     "REGR_SYY" ...
> >>     "RIGHT" ...
> >>     "ROW_NUMBER" ...
> >>     "SECOND" ...
> >>     "SQRT" ...
> >>     "STDDEV_POP" ...
> >>     "STDDEV_SAMP" ...
> >>     "SUM" ...
> >>     "UPPER" ...
> >>     "TRUNCATE" ...
> >>     "VAR_POP" ...
> >>     "VAR_SAMP" ...
> >>     "YEAR" ...
> >>     "YEAR" "(" ...
> >>
> >>
> >> at
> >>
> org.apache.flink.client.program.PackagedProgram.callMainMethod(PackagedProgram.java:335)
> >> at
> >>
> org.apache.flink.client.program.PackagedProgram.invokeInteractiveModeForExecution(PackagedProgram.java:205)
> >> at
> org.apache.flink.client.ClientUtils.executeProgram(ClientUtils.java:138)
> >> at
> >>
> org.apache.flink.client.cli.CliFrontend.executeProgram(CliFrontend.java:664)
> >> at org.apache.flink.client.cli.CliFrontend.run(CliFrontend.java:213)
> >> at
> >>
> org.apache.flink.client.cli.CliFrontend.parseParameters(CliFrontend.java:895)
> >> at
> >>
> org.apache.flink.client.cli.CliFrontend.lambda$main$10(CliFrontend.java:968)
> >> at java.security.AccessController.doPrivileged(Native Method)
> >> at javax.security.auth.Subject.doAs(Subject.java:422)
> >> at
> >>
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1875)
> >> at
> >>
> org.apache.flink.runtime.security.HadoopSecurityContext.runSecured(HadoopSecurityContext.java:41)
> >> at org.apache.flink.client.cli.CliFrontend.main(CliFrontend.java:968)
> >> Caused by: org.apache.flink.table.api.SqlParserException: SQL parse
> >> failed. Encountered "year =" at line 4, column 51.
> >> Was expecting one of:
> >>     "ARRAY" ...
> >>     "CASE" ...
> >>     "CURRENT" ...
> >>     "CURRENT_CATALOG" ...
> >>     "CURRENT_DATE" ...
> >>     "CURRENT_DEFAULT_TRANSFORM_GROUP" ...
> >>     "CURRENT_PATH" ...
> >>     "CURRENT_ROLE" ...
> >>     "CURRENT_SCHEMA" ...
> >>     "CURRENT_TIME" ...
> >>     "CURRENT_TIMESTAMP" ...
> >>     "CURRENT_USER" ...
> >>     "DATE" ...
> >>     "EXISTS" ...
> >>     "FALSE" ...
> >>     "INTERVAL" ...
> >>     "LOCALTIME" ...
> >>     "LOCALTIMESTAMP" ...
> >>     "MULTISET" ...
> >>     "NEW" ...
> >>     "NEXT" ...
> >>     "NOT" ...
> >>     "NULL" ...
> >>     "PERIOD" ...
> >>     "SESSION_USER" ...
> >>     "SYSTEM_USER" ...
> >>     "TIME" ...
> >>     "TIMESTAMP" ...
> >>     "TRUE" ...
> >>     "UNKNOWN" ...
> >>     "USER" ...
> >>     <UNSIGNED_INTEGER_LITERAL> ...
> >>     <APPROX_NUMERIC_LITERAL> ...
> >>     <DECIMAL_NUMERIC_LITERAL> ...
> >>     <BINARY_STRING_LITERAL> ...
> >>     <QUOTED_STRING> ...
> >>     <PREFIXED_STRING_LITERAL> ...
> >>     <UNICODE_STRING_LITERAL> ...
> >>     <LBRACE_D> ...
> >>     <LBRACE_T> ...
> >>     <LBRACE_TS> ...
> >>     <LBRACE_FN> ...
> >>     "?" ...
> >>     "+" ...
> >>     "-" ...
> >>     <BRACKET_QUOTED_IDENTIFIER> ...
> >>     <QUOTED_IDENTIFIER> ...
> >>     <BACK_QUOTED_IDENTIFIER> ...
> >>     <IDENTIFIER> ...
> >>     <UNICODE_QUOTED_IDENTIFIER> ...
> >>     "CAST" ...
> >>     "EXTRACT" ...
> >>     "POSITION" ...
> >>     "CONVERT" ...
> >>     "TRANSLATE" ...
> >>     "OVERLAY" ...
> >>     "FLOOR" ...
> >>     "CEIL" ...
> >>     "CEILING" ...
> >>     "SUBSTRING" ...
> >>     "TRIM" ...
> >>     "CLASSIFIER" ...
> >>     "MATCH_NUMBER" ...
> >>     "RUNNING" ...
> >>     "PREV" ...
> >>     "JSON_EXISTS" ...
> >>     "JSON_VALUE" ...
> >>     "JSON_QUERY" ...
> >>     "JSON_OBJECT" ...
> >>     "JSON_OBJECTAGG" ...
> >>     "JSON_ARRAY" ...
> >>     "JSON_ARRAYAGG" ...
> >>     "MAP" ...
> >>     "SPECIFIC" ...
> >>     "ABS" ...
> >>     "AVG" ...
> >>     "CARDINALITY" ...
> >>     "CHAR_LENGTH" ...
> >>     "CHARACTER_LENGTH" ...
> >>     "COALESCE" ...
> >>     "COLLECT" ...
> >>     "COVAR_POP" ...
> >>     "COVAR_SAMP" ...
> >>     "CUME_DIST" ...
> >>     "COUNT" ...
> >>     "DENSE_RANK" ...
> >>     "ELEMENT" ...
> >>     "EXP" ...
> >>     "FIRST_VALUE" ...
> >>     "FUSION" ...
> >>     "GROUPING" ...
> >>     "HOUR" ...
> >>     "LAG" ...
> >>     "LEAD" ...
> >>     "LEFT" ...
> >>     "LAST_VALUE" ...
> >>     "LN" ...
> >>     "LOWER" ...
> >>     "MAX" ...
> >>     "MIN" ...
> >>     "MINUTE" ...
> >>     "MOD" ...
> >>     "MONTH" ...
> >>     "NTH_VALUE" ...
> >>     "NTILE" ...
> >>     "NULLIF" ...
> >>     "OCTET_LENGTH" ...
> >>     "PERCENT_RANK" ...
> >>     "POWER" ...
> >>     "RANK" ...
> >>     "REGR_COUNT" ...
> >>     "REGR_SXX" ...
> >>     "REGR_SYY" ...
> >>     "RIGHT" ...
> >>     "ROW_NUMBER" ...
> >>     "SECOND" ...
> >>     "SQRT" ...
> >>     "STDDEV_POP" ...
> >>     "STDDEV_SAMP" ...
> >>     "SUM" ...
> >>     "UPPER" ...
> >>     "TRUNCATE" ...
> >>     "VAR_POP" ...
> >>     "VAR_SAMP" ...
> >>     "YEAR" ...
> >>     "YEAR" "(" ...
> >>
> >>
> >> at
> >>
> org.apache.flink.table.planner.calcite.CalciteParser.parse(CalciteParser.java:50)
> >> at
> >>
> org.apache.flink.table.planner.delegation.ParserImpl.parse(ParserImpl.java:64)
> >> at
> >>
> org.apache.flink.table.api.internal.TableEnvironmentImpl.sqlUpdate(TableEnvironmentImpl.java:484)
> >>
> >>
> >>
> >>
> >>
> >>
> >>
> >>
> >>
> >>
> >>
> >>
> >>
> >>
> >>
> >>
> >>
> >> 在 2020-05-27 19:08:09,"Rui Li" <lirui.fu...@gmail.com> 写道:
> >> >读hive分区表报的什么错啊,把stacktrace贴一下?
> >> >
> >> >On Wed, May 27, 2020 at 6:08 PM Zhou Zach <wander...@163.com> wrote:
> >> >
> >> >>
> >> >>
> >> >> hive partition table:
> >> >>
> >> >>
> >> >> 1CREATE TABLE `dwd.bill`(
> >> >> 2  `id` bigint,
> >> >> 3  `gid` bigint,
> >> >> 4  `count` bigint,
> >> >> 5  `price` bigint,
> >> >> 6  `srcuid` bigint,
> >> >> 7  `srcnickname` string,
> >> >> 8  `srcleftmoney` bigint,
> >> >> 9  `srcwealth` bigint,
> >> >> 10  `srccredit` decimal(10,0),
> >> >> 11  `dstnickname` string,
> >> >> 12  `dstuid` bigint,
> >> >> 13  `familyid` int,
> >> >> 14  `dstleftmoney` bigint,
> >> >> 15  `dstwealth` bigint,
> >> >> 16  `dstcredit` decimal(10,0),
> >> >> 17  `addtime` bigint,
> >> >> 18  `type` int,
> >> >> 19  `getmoney` decimal(10,0),
> >> >> 20  `os` int,
> >> >> 21  `bak` string,
> >> >> 22  `getbonus` decimal(10,0),
> >> >> 23  `unionbonus` decimal(10,0))
> >> >> 24PARTITIONED BY (
> >> >> 25  `year` int,
> >> >> 26  `month` int)
> >> >> 27ROW FORMAT SERDE
> >> >> 28  'org.apache.hadoop.hive.ql.io.parquet.serde.ParquetHiveSerDe'
> >> >> 29STORED AS INPUTFORMAT
> >> >> 30  'org.apache.hadoop.hive.ql.io.parquet.MapredParquetInputFormat'
> >> >> 31OUTPUTFORMAT
> >> >> 32  'org.apache.hadoop.hive.ql.io.parquet.MapredParquetOutputFormat'
> >> >>
> >> >>
> >> >>
> >> >>
> >> >> Query:
> >> >>
> >> >>
> >> >> tableEnv.sqlUpdate(
> >> >>       """
> >> >>         |
> >> >>         |INSERT INTO catalog2.dwd.orders
> >> >>         |select srcuid, price from catalog2.dwd.bill where year =
> 2020
> >> >>         |
> >> >>         |
> >> >>         |""".stripMargin)
> >> >>
> >> >>
> >> >>
> >> >>
> >> >>
> >> >>
> >> >>
> >> >>
> >> >>
> >> >>
> >> >>
> >> >>
> >> >>
> >> >>
> >> >>
> >> >>
> >> >>
> >> >> 在 2020-05-27 18:01:19,"Leonard Xu" <xbjt...@gmail.com> 写道:
> >> >> >Flink 支持hive分区表的,看你在另外一个邮件里贴了,你能把你的hive表和query在邮件里贴下吗?
> >> >> >
> >> >> >祝好
> >> >> >Leonard Xu
> >> >> >
> >> >> >> 在 2020年5月27日,17:40,Zhou Zach <wander...@163.com> 写道:
> >> >> >>
> >> >> >>
> >> >> >>
> >> >> >>
> >> >> >> 感谢回复,表名前加上Catalog和db前缀可以成功访问了。
> >> >> >> 现在遇到个问题,flink 读hive
> >> >> 分区表时,如果where子句用分区键,比如year过滤就会报错,用表中其他字段过滤是没问题的,是flink 不支持
> >> hive分区表,还是哪个地方没设置对
> >> >> >>
> >> >> >>
> >> >> >>
> >> >> >>
> >> >> >>
> >> >> >>
> >> >> >>
> >> >> >>
> >> >> >>
> >> >> >>
> >> >> >>
> >> >> >>
> >> >> >>
> >> >> >>
> >> >> >> 在 2020-05-27 17:33:11,"Leonard Xu" <xbjt...@gmail.com> 写道:
> >> >> >>> Hi,
> >> >> >>>> 因为一个HiveCatalog只能关联一个库
> >> >> >>> 一个Catalog是可以关联到多个db的,不同catalog,不同db中表都可以访问的.
> >> >> >>>
> >> >> >>> Flink SQL> show catalogs;
> >> >> >>> default_catalog
> >> >> >>> myhive
> >> >> >>> Flink SQL> use catalog myhive;
> >> >> >>> Flink SQL> show databases;
> >> >> >>> default
> >> >> >>> hive_test
> >> >> >>> hive_test1
> >> >> >>> Flink SQL> select * from hive_test.db2_table union select * from
> >> >> myhive.hive_test1.db1_table;
> >> >> >>> 2020-05-27 17:25:48,565 INFO
> org.apache.hadoop.hive.conf.HiveConf
> >> >> >>>
> >> >> >>>
> >> >> >>>
> >> >> >>> 祝好
> >> >> >>> Leonard Xu
> >> >> >>>
> >> >> >>>
> >> >> >>>> 在 2020年5月27日,10:55,Zhou Zach <wander...@163.com> 写道:
> >> >> >>>>
> >> >> >>>> hi all,
> >> >> >>>> Flink sql 的HiveCatalog 是不是不能跨库操作啊,就是一个flink
> >> >> sql中join的两个表涉及到两个不同到库,因为一个HiveCatalog只能关联一个库
> >> >>
> >> >
> >> >
> >> >--
> >> >Best regards!
> >> >Rui Li
> >>
> >
> >
> >--
> >Best regards!
> >Rui Li
>


-- 
Best regards!
Rui Li

回复