[ https://issues.apache.org/jira/browse/HIVE-15267?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Eugene Koifman reassigned HIVE-15267:
-------------------------------------

    Assignee: Steve Yeom  (was: Wei Zheng)

> Make query length calculation logic more accurate in TxnUtils.needNewQuery()
> ----------------------------------------------------------------------------
>
>                 Key: HIVE-15267
>                 URL: https://issues.apache.org/jira/browse/HIVE-15267
>             Project: Hive
>          Issue Type: Bug
>          Components: Hive, Transactions
>    Affects Versions: 1.2.1, 2.1.0
>            Reporter: Wei Zheng
>            Assignee: Steve Yeom
>
> HIVE-15181 received the following review comment, which this ticket will address:
> {code}
> In TxnUtils.needNewQuery(), "sizeInBytes / 1024 > queryMemoryLimit" doesn't do
> the right thing.
> If the user sets METASTORE_DIRECT_SQL_MAX_QUERY_LENGTH to 1K, they most
> likely want each SQL string to be at most 1K.
> But if sizeInBytes=2047, this still returns false.
> It should include the length of the "suffix" in the computation of sizeInBytes.
> Along the same lines: the check for max query length is done after each batch
> is already added to the query. Suppose there are 1000 9-digit txn IDs in each
> IN(...). That's, conservatively, 18KB of text. So the length of each query
> increases in 18KB chunks.
> I think the check for query length should be done for each item in the IN
> clause. If some DB has a limit on query length of X, then any query > X will
> fail. So this must ensure not to produce any queries > X, even by 1 char.
> For example, case 3.1 of the UT generates a query of almost 4000 characters -
> this is clearly > 1KB.
> {code}


--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
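The per-item check the comment asks for could look roughly like the sketch below: before appending each id to the IN(...) list, compute the projected query length including the closing paren and the suffix, and start a new query if the limit would be exceeded. This is a hypothetical standalone illustration, not the actual Hive patch; the class and method names (QueryBatcher, buildQueries) are invented for the example.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of per-item length checking for IN(...) batching.
// Not the real TxnUtils code; names and signatures are illustrative only.
public class QueryBatcher {

  /**
   * Splits ids into queries of the form prefix + "IN (id,id,...)" + suffix,
   * each at most maxLen characters (assuming a single id always fits).
   */
  public static List<String> buildQueries(String prefix, String suffix,
                                          long[] ids, int maxLen) {
    List<String> queries = new ArrayList<>();
    StringBuilder sb = new StringBuilder(prefix).append("IN (");
    boolean empty = true;
    for (long id : ids) {
      String item = Long.toString(id);
      // Projected final length if this item is added now:
      // current text + separator comma + item + ")" + suffix.
      int projected = sb.length() + (empty ? 0 : 1) + item.length()
                      + 1 + suffix.length();
      if (!empty && projected > maxLen) {
        // Close out the current query before it would exceed the limit.
        queries.add(sb.append(')').append(suffix).toString());
        sb = new StringBuilder(prefix).append("IN (");
        empty = true;
      }
      if (!empty) {
        sb.append(',');
      }
      sb.append(item);
      empty = false;
    }
    if (!empty) {
      queries.add(sb.append(')').append(suffix).toString());
    }
    return queries;
  }

  public static void main(String[] args) {
    long[] ids = new long[1000];
    for (int i = 0; i < ids.length; i++) {
      ids[i] = 100_000_000L + i; // 9-digit txn ids, as in the comment
    }
    List<String> qs =
        buildQueries("DELETE FROM TXNS WHERE TXN_ID ", "", ids, 1024);
    for (String q : qs) {
      if (q.length() > 1024) {
        throw new AssertionError("query too long: " + q.length());
      }
    }
    System.out.println(qs.size() + " queries, all <= 1024 chars");
  }
}
```

Because the projected length is checked before each append, no emitted query can exceed the limit even by one character, which is the invariant the reviewer asks for.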