I guess it's partly the DBMS's fault, because it either treats identifiers case-insensitively or the schema metadata API returns column names in upper case. I'm not sure if we could do anything about that in the JDBC adapter.

I think I will just inspect the JDBC schema before preparing any materializations, to avoid the casing problem and the need to rely on a specific case. Thanks for the clarifications!
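For what it's worth, the normalization I have in mind is roughly the following (a minimal sketch; `canonicalize` and the hard-coded upper-casing are my own illustration, not Calcite or JDBC API):

```java
import java.util.List;
import java.util.Locale;
import java.util.stream.Collectors;

public class ColumnNameNormalizer {
    // Normalize the column names reported by the JDBC metadata API to a single
    // canonical case, so later row-type comparisons no longer depend on how
    // the DBMS happens to report identifiers.
    static List<String> canonicalize(List<String> jdbcColumnNames) {
        return jdbcColumnNames.stream()
                .map(n -> n.toUpperCase(Locale.ROOT))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // Whatever mix of cases the driver returns, we compare canonical forms.
        System.out.println(canonicalize(List.of("id", "Name"))); // [ID, NAME]
    }
}
```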


Kind regards,
------------------------------------------------------------------------
*Christian Beikov*
On 24.08.2017 at 21:52, Jesus Camacho Rodriguez wrote:
I never hit this issue as we do not go through the JDBC adaptor when we
use the MV rewriting within Hive.

I am not familiar with that code path, but I guess that no matter whether it is
an MV or a table definition, we should end up doing the same thing wrt casing of
column names, so there should be no need for a case-insensitive comparison?

- Jesús



On 8/24/17, 12:19 PM, "Christian Beikov" <christian.bei...@gmail.com> wrote:

I apparently had a different problem that led me to believe the view
was the problem. In fact, the actual query was the problem.

So I have the query for the materialized view "select id as `id`, name
as `name` from document" and the query for the normal view "select
cast(_MAP['id'] AS bigint) AS `id`, cast(_MAP['name'] AS varchar(255))
AS `name` from elasticsearch_raw.document_index".

Now when I send the query "select id as col1, name as col2 from
document", the row type at first is "col1 bigint, col2 varchar(255)" and
later it becomes "ID bigint, NAME varchar(255)", which is to some
extent a good thing. The materialization logic determines that it can
substitute the query, but during the substitution it compares that row
type with the one from the view. The JdbcSchema receives the columns in
upper case, which is why the row type of the sent query is in upper
case. Either the comparison should be case-insensitive, or I simply
upper-case the names of the columns in the view, which is what I did for now.

Doing that unfortunately causes a slight mismatch in the ES adapter,
which expects the field names to have the same case as the fields of
the row type. This is why I adapted some rules to extract the correctly
cased field name from the _MAP expression.

Now the question is: should the comparison be case-insensitive, or should
I rely on the fact that the JDBC schema will always have upper-cased
column names?
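If the comparison route is taken, the check could look like this (just an illustration with plain strings; `fieldsMatch` is a hypothetical helper, not the actual SubstitutionVisitor code):

```java
import java.util.List;

public class RowTypeCompare {
    // Compare two field-name lists positionally, ignoring identifier case,
    // so "col1"/"COL1"-style mismatches between the query row type and the
    // view row type no longer trip the equality assertion.
    static boolean fieldsMatch(List<String> a, List<String> b) {
        if (a.size() != b.size()) {
            return false;
        }
        for (int i = 0; i < a.size(); i++) {
            if (!a.get(i).equalsIgnoreCase(b.get(i))) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(fieldsMatch(List.of("ID", "NAME"), List.of("id", "name"))); // true
        System.out.println(fieldsMatch(List.of("ID"), List.of("name"))); // false
    }
}
```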


Kind regards,
------------------------------------------------------------------------
*Christian Beikov*
On 24.08.2017 at 21:00, Julian Hyde wrote:
Rather than "select id, name from document", could you create your view as
"select `id`, `name` from document" (or however the back-end system quotes
identifiers)? Then "id" would still be in lower case when the JDBC adapter queries the
catalog.
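A tiny helper along the lines Julian describes, assuming MySQL-style back-tick quoting (the helper and its name are mine, purely illustrative):

```java
public class IdentifierQuoter {
    // Wrap an identifier in back-ticks so the parser treats it as quoted and
    // preserves its case, escaping any embedded back-tick by doubling it
    // (MySQL-style quoting rules).
    static String quote(String identifier) {
        return "`" + identifier.replace("`", "``") + "`";
    }

    public static void main(String[] args) {
        System.out.println("select " + quote("id") + ", " + quote("name") + " from document");
        // select `id`, `name` from document
    }
}
```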

On Aug 24, 2017, at 5:17 AM, Christian Beikov <christian.bei...@gmail.com> 
wrote:

My main problem is the row type equality assertion in 
org.apache.calcite.plan.SubstitutionVisitor#go(org.apache.calcite.rel.mutable.MutableRel)

Imagine I have a table "document" with columns "id" and "name". When the JdbcSchema reads the 
structure, it gets column names in upper case. Now I register a materialized view for a query like "select id, name from 
document". The materialized table for that view is in my case a view again defined like "select ... AS `id`, ... AS 
`name` from ...".

The row type of my view is correctly "id, name". The row type of the table "document" is "ID,
NAME" because the JdbcSchema gets upper-cased names. Initially, the row type of the query for the materialized
view is also correct, but during the "trim fields" phase the row type gets replaced with the types from the
table. Is this replacement of field types even correct?

Because of that, the assertion in the substitution visitor fails. What would be
the appropriate solution for this mismatch?


Kind regards,
------------------------------------------------------------------------
*Christian Beikov*
On 24.08.2017 at 12:57, Julian Hyde wrote:
Or supply your own TableFactory? I'm not quite sure of your use case.
I've only tested cases where materialized views are "internal",
therefore they work fine with Calcite's default dialect.

On Thu, Aug 24, 2017 at 3:21 AM, Christian Beikov
<christian.bei...@gmail.com> wrote:
Actually, it seems the root cause is that the materialization uses the wrong
configuration.

org.apache.calcite.materialize.MaterializationService.DefaultTableFactory#createTable
creates a new connection with the default configuration that does TO_UPPER.
Would it be ok for it to receive a CalciteConnectionConfig?


Kind regards,
------------------------------------------------------------------------
*Christian Beikov*
On 24.08.2017 at 11:36, Christian Beikov wrote:
It seems org.apache.calcite.prepare.CalcitePrepareImpl#prepare2_ is missing a
call to
org.apache.calcite.sql.parser.SqlParser.ConfigBuilder#setCaseSensitive to
configure the parser according to the LEX configuration. Is that a bug or
expected behavior?


Kind regards,
------------------------------------------------------------------------
*Christian Beikov*
On 24.08.2017 at 11:24, Christian Beikov wrote:
Hey,

I have configured Lex.MYSQL_ANSI, but when a query gets parsed, the column
names of the select items are upper-cased.

I'm having problems with matching the row types of materialized views and
the source sql because of that. Any idea how to fix that?

--

Kind regards,
------------------------------------------------------------------------
*Christian Beikov*
