[ https://issues.apache.org/jira/browse/LENS-598?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14640262#comment-14640262 ]
Rajat Khandelwal commented on LENS-598:
---------------------------------------
[~amareshwari] Is this still in progress? Can we update the status accordingly?
> Dimtable selection not honoring requirements of multiple inter-related join
> chains properly
> -------------------------------------------------------------------------------------------
>
> Key: LENS-598
> URL: https://issues.apache.org/jira/browse/LENS-598
> Project: Apache Lens
> Issue Type: Bug
> Affects Versions: 2.1
> Reporter: Angad Singh
> Assignee: Amareshwari Sriramadasu
>
> Scenario:
> We have cubeX, dimension1, and dimension2.
> table1 and table2 are dimtables of dimension1.
> table3 is a dimtable of dimension2.
> We have joinchain1 from cubeX to dimension1 to dimension2, and joinchain2
> from cubeX to dimension1 directly.
> The problem is that when a cube query selects columns from cubeX,
> dimension1 and dimension2, Lens should consider the requirements of both
> joinchain1 and joinchain2 when selecting a dimtable of dimension1, but it
> does not.
> At times it selects the wrong dimtable of dimension1 (table2) and then
> fails to join table3.
> E.g.:
> Query:
> {noformat}
> cube select apps_owned_title from user j where apps_owned.source='raf'
> {noformat}
> Correctly generated Hive query (when it selected dimtable table1):
> {code:sql}
> SELECT ( apps_owned_metadata . apptitle ) FROM
> user.uh1_hdfs_user_attributestore_er_cubetable_dimension_gender_state j join
> user.uh1_hdfs_user_attributestore_er_cubetable_dimension_bundleid_install
> apps_owned on j.userid = apps_owned.userid and (apps_owned.dt = 'latest')
> join user.uh1_hdfs_hive_dimtable_uac_app apps_owned_metadata on
> apps_owned.bundleid = apps_owned_metadata.bundleid and
> (apps_owned_metadata.dt = 'latest') WHERE (((( apps_owned . source ) = 'raf'
> ) AND ((j.dt = 'latest'))))
> {code}
> Incorrectly generated Hive query (when it selected dimtable table2):
> {code:sql}
> SELECT ( apps_owned_metadata . apptitle ) FROM
> user.uh1_hdfs_user_attributestore_er_cubetable_dimension_gender_state j join
> user.uh1_hdfs_user_attributestore_er_cubetable_dimension_app_install
> apps_owned on j.userid = apps_owned.userid and (apps_owned.dt = 'latest')
> WHERE (((( apps_owned . source ) = 'raf' ) AND ((apps_owned_metadata.dt =
> 'latest')) AND ((j.dt = 'latest'))))
> {code}
> Exception in the Lens server logs as a result of the above query:
> {noformat}
> 10 Jun 2015 08:56:27,850 [QuerySubmitter] INFO
> org.apache.lens.driver.hive.HiveDriver - Hive driver query:INSERT OVERWRITE
> DIRECTORY "/tmp/lensreports/hdfsout/1be8556b-64b2-42f0-a32b-45d3be75537f"
> SELECT ( apps_owned_metadata . apptitle ) FROM
> user.uh1_hdfs_user_attributestore_er_cubetable_dimension_gender_state j join
> user.uh1_hdfs_user_attributestore_er_cubetable_dimension_app_install
> apps_owned on j.userid = apps_owned.userid and (apps_owned.dt = 'latest')
> WHERE (((( apps_owned . source ) = 'raf' ) AND ((apps_owned_metadata.dt =
> 'latest')) AND ((j.dt = 'latest'))))
> 10 Jun 2015 08:56:27,851 [QuerySubmitter] INFO
> org.apache.lens.driver.hive.HiveDriver - whetherCalculatePriority: true
> 10 Jun 2015 08:56:27,852 [QuerySubmitter] INFO
> org.apache.lens.server.api.priority.QueryPriorityDecider - Deciding Priority
> VERY_HIGH since cost = 0.0
> 10 Jun 2015 08:56:27,852 [QuerySubmitter] WARN
> org.apache.hadoop.conf.Configuration - mapred.job.priority is deprecated.
> Instead, use mapreduce.job.priority
> 10 Jun 2015 08:56:27,852 [QuerySubmitter] INFO
> org.apache.lens.driver.hive.HiveDriver - set priority to VERY_HIGH
> 10 Jun 2015 08:56:28,468 [QuerySubmitter] INFO
> org.apache.lens.driver.hive.HiveDriver - No hive operation available for
> 1be8556b-64b2-42f0-a32b-45d3be75537f
> 10 Jun 2015 08:56:28,469 [QuerySubmitter] ERROR
> org.apache.lens.server.query.QueryExecutionServiceImpl - Error launching
> query 1be8556b-64b2-42f0-a32b-45d3be75537f
> org.apache.lens.server.api.error.LensException: Error executing async query
> at org.apache.lens.driver.hive.HiveDriver.executeAsync(HiveDriver.java:524)
> at org.apache.lens.server.query.QueryExecutionServiceImpl$QuerySubmitter.run(QueryExecutionServiceImpl.java:497)
> at java.lang.Thread.run(Thread.java:745)
> Caused by: org.apache.hive.service.cli.HiveSQLException: Error while
> compiling statement: FAILED: SemanticException [Error 10004]: Line 1:409
> Invalid table alias or column reference 'apps_owned_metadata': (possible
> column names are: j.userid, j.timestamp, j.gender, j.gender_state_score,
> j.gender_state_confidence, j.dt, apps_owned.userid, apps_owned.timestamp,
> apps_owned.appid, apps_owned.app_install_confidence, apps_owned.dt,
> apps_owned.source)
> at org.apache.hive.service.cli.thrift.ThriftCLIServiceClient.checkStatus(ThriftCLIServiceClient.java:52)
> at org.apache.hive.service.cli.thrift.ThriftCLIServiceClient.executeStatementInternal(ThriftCLIServiceClient.java:151)
> at org.apache.hive.service.cli.thrift.ThriftCLIServiceClient.executeStatementAsync(ThriftCLIServiceClient.java:139)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
> at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:606)
> at org.apache.hive.service.cli.thrift.RetryingThriftCLIServiceClient.invokeInternal(RetryingThriftCLIServiceClient.java:301)
> at org.apache.hive.service.cli.thrift.RetryingThriftCLIServiceClient.invoke(RetryingThriftCLIServiceClient.java:329)
> at com.sun.proxy.$Proxy59.executeStatementAsync(Unknown Source)
> at org.apache.hive.service.cli.thrift.RetryingThriftCLIServiceClient$CLIServiceClientWrapper.executeStatementAsync(RetryingThriftCLIServiceClient.java:118)
> at org.apache.lens.driver.hive.HiveDriver.executeAsync(HiveDriver.java:515)
> ... 2 more
> {noformat}
> Note: in the above example, cubeX is "user", dimension1 is "user_owned_apps",
> dimension2 is "hive_dimension_uac_app", table1 is
> "user_attributestore_er_cubetable_dimension_bundleid_install", table2 is
> "user_attributestore_er_cubetable_dimension_app_install", and table3 is
> "hive_dimtable_uac_app".
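The selection rule requested in the description can be illustrated with a minimal Python sketch. This is not Lens code: `DimTable`, `required_columns` and `candidate_dimtables` are hypothetical names, join chains are modeled as lists of (left, left column, right, right column) hops, and the column sets are assumptions inferred from the queries and the "possible column names" list in the error message. The idea is to collect the columns needed from a dimension across all join chains in the query before filtering that dimension's dimtables.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DimTable:
    """Hypothetical stand-in for a dimtable: a name plus its available columns."""
    name: str
    columns: frozenset


def required_columns(dimension, join_chains, queried_columns):
    """Union of the columns the query and every join chain need from `dimension`."""
    needed = set(queried_columns.get(dimension, ()))
    for chain in join_chains:
        for left, left_col, right, right_col in chain:
            if left == dimension:
                needed.add(left_col)
            if right == dimension:
                needed.add(right_col)
    return needed


def candidate_dimtables(dimension, dimtables, join_chains, queried_columns):
    """Names of the dimtables of `dimension` that provide every required column."""
    needed = required_columns(dimension, join_chains, queried_columns)
    return [t.name for t in dimtables[dimension] if needed <= t.columns]


# joinchain1: cubeX -> dimension1 -> dimension2; joinchain2: cubeX -> dimension1.
JOINCHAIN1 = [
    ("user", "userid", "user_owned_apps", "userid"),
    ("user_owned_apps", "bundleid", "hive_dimension_uac_app", "bundleid"),
]
JOINCHAIN2 = [("user", "userid", "user_owned_apps", "userid")]

# Columns referenced directly by the example query, per dimension.
QUERIED = {
    "user_owned_apps": {"source"},
    "hive_dimension_uac_app": {"apptitle"},
}

# Assumed column sets: table1 and table3 from the correct query, table2 from
# the "possible column names" in the SemanticException.
DIMTABLES = {
    "user_owned_apps": [
        DimTable("table1", frozenset({"userid", "bundleid", "source", "dt"})),
        DimTable("table2", frozenset({"userid", "timestamp", "appid",
                                      "app_install_confidence", "source", "dt"})),
    ],
    "hive_dimension_uac_app": [
        DimTable("table3", frozenset({"bundleid", "apptitle", "dt"})),
    ],
}

both_chains = candidate_dimtables(
    "user_owned_apps", DIMTABLES, [JOINCHAIN1, JOINCHAIN2], QUERIED)
direct_only = candidate_dimtables(
    "user_owned_apps", DIMTABLES, [JOINCHAIN2], QUERIED)
```

With both chains considered, only table1 qualifies for dimension1 because table2 lacks `bundleid`. Considering joinchain2 alone leaves table2 as a candidate, which matches the failure reported above.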
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)