[ 
https://issues.apache.org/jira/browse/LENS-598?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14640262#comment-14640262
 ] 

Rajat Khandelwal commented on LENS-598:
---------------------------------------

[~amareshwari] is it still in progress? Can we change status accordingly?

> Dimtable selection not honoring requirements of multiple inter-related join 
> chains properly
> -------------------------------------------------------------------------------------------
>
>                 Key: LENS-598
>                 URL: https://issues.apache.org/jira/browse/LENS-598
>             Project: Apache Lens
>          Issue Type: Bug
>    Affects Versions: 2.1
>            Reporter: Angad Singh
>            Assignee: Amareshwari Sriramadasu
>
> Scenario:
> We have a cubeX, dimension1, dimension2
> table1, table2 are dimtables of dimension1
> table3 is a dimtable of dimension2
> We have joinchain1 from cubeX to dimension1 to dimension2, and joinchain2 
> from cubeX to dimension1 directly.
> Problem is that when a cube query is run which selects columns from cubex, 
> dimension1 and dimension2, lens should ideally look at requirements of both 
> joinchain1 and joinchain2 when selecting dimtables of dimension1, but it is 
> not.
> At times when it is failing to join table3, it is selecting the wrong 
> dimtable of dimension1 (table2).
> E.g.:
> Query:
> {noformat}
> cube select apps_owned_title from user j where apps_owned.source='raf'
> {noformat}
> correctly generated hive query (when it selected dimtable table1):
> {code:sql}
> SELECT ( apps_owned_metadata . apptitle ) FROM 
> user.uh1_hdfs_user_attributestore_er_cubetable_dimension_gender_state j join 
> user.uh1_hdfs_user_attributestore_er_cubetable_dimension_bundleid_install 
> apps_owned on j.userid = apps_owned.userid and (apps_owned.dt = 'latest') 
> join user.uh1_hdfs_hive_dimtable_uac_app apps_owned_metadata on 
> apps_owned.bundleid = apps_owned_metadata.bundleid and 
> (apps_owned_metadata.dt = 'latest') WHERE (((( apps_owned . source ) =  'raf' 
> ) AND ((j.dt = 'latest'))))
> {code}
> incorrectly generated query:
> {code:sql}
> SELECT ( apps_owned_metadata . apptitle ) FROM 
> user.uh1_hdfs_user_attributestore_er_cubetable_dimension_gender_state j join 
> user.uh1_hdfs_user_attributestore_er_cubetable_dimension_app_install 
> apps_owned on j.userid = apps_owned.userid and (apps_owned.dt = 'latest') 
> WHERE (((( apps_owned . source ) =  'raf' ) AND ((apps_owned_metadata.dt = 
> 'latest')) AND ((j.dt = 'latest')))) 
> {code}
> Exception in lensserver logs as a result of above query:
> {noformat}
> 10 Jun 2015 08:56:27,850 [QuerySubmitter] INFO  
> org.apache.lens.driver.hive.HiveDriver  - Hive driver query:INSERT OVERWRITE 
> DIRECTORY "/tmp/lensreports/hdfsout/1be8556b-64b2-42f0-a32b-45d3be75537f"  
> SELECT ( apps_owned_metadata . apptitle ) FROM 
> user.uh1_hdfs_user_attributestore_er_cubetable_dimension_gender_state j join 
> user.uh1_hdfs_user_attributestore_er_cubetable_dimension_app_install 
> apps_owned on j.userid = apps_owned.userid and (apps_owned.dt = 'latest') 
> WHERE (((( apps_owned . source ) =  'raf' ) AND ((apps_owned_metadata.dt = 
> 'latest')) AND ((j.dt = 'latest')))) 
> 10 Jun 2015 08:56:27,851 [QuerySubmitter] INFO  
> org.apache.lens.driver.hive.HiveDriver  - whetherCalculatePriority: true
> 10 Jun 2015 08:56:27,852 [QuerySubmitter] INFO  
> org.apache.lens.server.api.priority.QueryPriorityDecider  - Deciding Priority 
> VERY_HIGH since cost = 0.0
> 10 Jun 2015 08:56:27,852 [QuerySubmitter] WARN  
> org.apache.hadoop.conf.Configuration  - mapred.job.priority is deprecated. 
> Instead, use mapreduce.job.priority
> 10 Jun 2015 08:56:27,852 [QuerySubmitter] INFO  
> org.apache.lens.driver.hive.HiveDriver  - set priority to VERY_HIGH
> 10 Jun 2015 08:56:28,468 [QuerySubmitter] INFO  
> org.apache.lens.driver.hive.HiveDriver  - No hive operation available for 
> 1be8556b-64b2-42f0-a32b-45d3be75537f
> 10 Jun 2015 08:56:28,469 [QuerySubmitter] ERROR 
> org.apache.lens.server.query.QueryExecutionServiceImpl  - Error launching 
> query 1be8556b-64b2-42f0-a32b-45d3be75537f
> org.apache.lens.server.api.error.LensException: Error executing async query
>       at 
> org.apache.lens.driver.hive.HiveDriver.executeAsync(HiveDriver.java:524)
>       at 
> org.apache.lens.server.query.QueryExecutionServiceImpl$QuerySubmitter.run(QueryExecutionServiceImpl.java:497)
>       at java.lang.Thread.run(Thread.java:745)
> Caused by: org.apache.hive.service.cli.HiveSQLException: Error while 
> compiling statement: FAILED: SemanticException [Error 10004]: Line 1:409 
> Invalid table alias or column reference 'apps_owned_metadata': (possible 
> column names are: j.userid, j.timestamp, j.gender, j.gender_state_score, 
> j.gender_state_confidence, j.dt, apps_owned.userid, apps_owned.timestamp, 
> apps_owned.appid, apps_owned.app_install_confidence, apps_owned.dt, 
> apps_owned.source)
>       at 
> org.apache.hive.service.cli.thrift.ThriftCLIServiceClient.checkStatus(ThriftCLIServiceClient.java:52)
>       at 
> org.apache.hive.service.cli.thrift.ThriftCLIServiceClient.executeStatementInternal(ThriftCLIServiceClient.java:151)
>       at 
> org.apache.hive.service.cli.thrift.ThriftCLIServiceClient.executeStatementAsync(ThriftCLIServiceClient.java:139)
>       at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>       at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>       at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>       at java.lang.reflect.Method.invoke(Method.java:606)
>       at 
> org.apache.hive.service.cli.thrift.RetryingThriftCLIServiceClient.invokeInternal(RetryingThriftCLIServiceClient.java:301)
>       at 
> org.apache.hive.service.cli.thrift.RetryingThriftCLIServiceClient.invoke(RetryingThriftCLIServiceClient.java:329)
>       at com.sun.proxy.$Proxy59.executeStatementAsync(Unknown Source)
>       at 
> org.apache.hive.service.cli.thrift.RetryingThriftCLIServiceClient$CLIServiceClientWrapper.executeStatementAsync(RetryingThriftCLIServiceClient.java:118)
>       at 
> org.apache.lens.driver.hive.HiveDriver.executeAsync(HiveDriver.java:515)
>       ... 2 more
> {noformat}
> Note: in above example, cubeX is "user", dimension1 is "user_owned_apps", 
> dimension2 is "hive_dimension_uac_app". table 1 is 
> "user_attributestore_er_cubetable_dimension_bundleid_install", table2 is 
> "user_attributestore_er_cubetable_dimension_app_install", table3 is 
> "hive_dimtable_uac_app".



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Reply via email to