[ 
https://issues.apache.org/jira/browse/SPARK-17403?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15468449#comment-15468449
 ] 

Davies Liu commented on SPARK-17403:
------------------------------------

[~rhernando] Could you pull out the string column (SL_RD_ColR_N) and dump it as 
a Parquet file so that we can reproduce the issue here? There could be a bug in 
scanning cached strings, depending on the data. 
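
A minimal PySpark sketch of that request, assuming the JDBC-backed view is 
already available as a DataFrame. The helper names (dump_column, scan_cached) 
and any output path are hypothetical, not part of Spark or of the reporter's 
job; only the DataFrame calls themselves are standard API.

```python
def dump_column(df, column, path):
    """Write one column of `df` to `path` as Parquet and return the path.

    `df` is assumed to be a Spark DataFrame; `column` would be
    "SL_RD_ColR_N" for this issue. `path` is a hypothetical location.
    """
    # Keep only the suspect column, replacing any earlier dump at `path`.
    df.select(column).write.mode("overwrite").parquet(path)
    return path


def scan_cached(spark, path, column):
    """Reload the dump, cache it, and read the column back from the cache.

    `spark` is assumed to be a SparkSession. Returns the number of
    distinct values, which forces every cached value to be scanned.
    """
    cached = spark.read.parquet(path).cache()
    cached.count()  # materialize the in-memory cache
    return cached.select(column).distinct().count()
```

If scan_cached crashes the JVM on the dumped file alone, the Parquet file is 
enough to attach to the issue; if it does not, the problem may depend on the 
other cached columns or on the JDBC source.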

> Fatal Error: Scan cached strings
> --------------------------------
>
>                 Key: SPARK-17403
>                 URL: https://issues.apache.org/jira/browse/SPARK-17403
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 2.0.0
>         Environment: Spark standalone cluster (3 Workers, 47 cores)
> Ubuntu 14
> Java 8
>            Reporter: Ruben Hernando
>
> The process creates views from a JDBC (SQL Server) source and combines them 
> to create other views.
> Finally, it writes the results out via JDBC.
> Error:
> {quote}
> # JRE version: Java(TM) SE Runtime Environment (8.0_101-b13) (build 
> 1.8.0_101-b13)
> # Java VM: Java HotSpot(TM) 64-Bit Server VM (25.101-b13 mixed mode 
> linux-amd64 )
> # Problematic frame:
> # J 4895 C1 org.apache.spark.unsafe.Platform.getLong(Ljava/lang/Object;J)J (9 
> bytes) @ 0x00007fbb355dfd6c [0x00007fbb355dfd60+0xc]
> #
> {quote}
> SQL Query plan (fields truncated):
> {noformat}
> == Parsed Logical Plan ==
> 'Project [*]
> +- 'UnresolvedRelation `COEQ_63`
> == Analyzed Logical Plan ==
> InstanceId: bigint, price: double, ZoneId: int, priceItemId: int, priceId: int
> Project [InstanceId#20236L, price#20237, ZoneId#20239, priceItemId#20242, 
> priceId#20244]
> +- SubqueryAlias coeq_63
>    +- Project [_TableSL_SID#143L AS InstanceId#20236L, SL_RD_ColR_N#189 AS 
> price#20237, 24 AS ZoneId#20239, 6 AS priceItemId#20242, 63 AS priceId#20244]
>       +- SubqueryAlias 6__input
>          +- 
> Relation[_TableSL_SID#143L,_TableP_DC_SID#144L,_TableSH_SID#145L,ID#146,Name#147,TableP_DCID#148,TableSHID#149,SL_ACT_GI_DTE#150,SL_Xcl_C#151,SL_Xcl_C#152,SL_Css_Cojs#153L,SL_Config#154,SL_CREATEDON#
>  .......... 36 more fields] JDBCRelation((select [SLTables].[_TableSL_SID], 
> [SLTables]. ... [...]  FROM [sch].[SLTables] [SLTables] JOIN sch.TPSLTables 
> TPSLTables ON [TPSLTables].[_TableSL_SID] = [SLTables].[_TableSL_SID] where 
> _TP = 24) input)
> == Optimized Logical Plan ==
> Project [_TableSL_SID#143L AS InstanceId#20236L, SL_RD_ColR_N#189 AS 
> price#20237, 24 AS ZoneId#20239, 6 AS priceItemId#20242, 63 AS priceId#20244]
> +- InMemoryRelation [_TableSL_SID#143L, _TableP_DC_SID#144L, 
> _TableSH_SID#145L, ID#146, Name#147, ... 36 more fields], true, 10000, 
> StorageLevel(disk, memory, deserialized, 1 replicas)
>    :  +- *Scan JDBCRelation((select [SLTables].[_TableSL_SID], 
> [SLTables].[_TableP_DC_SID], [SLTables].[_TableSH_SID], [SLTables].[ID], 
> [SLTables].[Name], [SLTables].[TableP_DCID], [SLTables].[TableSHID], 
> [TPSLTables].[SL_ACT_GI_DTE],  ... [...] FROM [sch].[SLTables] [SLTables] 
> JOIN sch.TPSLTables TPSLTables ON [TPSLTables].[_TableSL_SID] = 
> [SLTables].[_TableSL_SID] where _TP = 24) input) 
> [_TableSL_SID#143L,_TableP_DC_SID#144L,_TableSH_SID#145L,ID#146,Name#147,TableP_DCID#148,TableSHID#149,SL_ACT_GI_DTE#150,SL_Xcl_C#151,...
>  36 more fields] 
> == Physical Plan ==
> *Project [_TableSL_SID#143L AS InstanceId#20236L, SL_RD_ColR_N#189 AS 
> price#20237, 24 AS ZoneId#20239, 6 AS priceItemId#20242, 63 AS priceId#20244]
> +- InMemoryTableScan [_TableSL_SID#143L, SL_RD_ColR_N#189]
>    :  +- InMemoryRelation [_TableSL_SID#143L, _TableP_DC_SID#144L, 
> _TableSH_SID#145L, ID#146, Name#147, ... 36 more fields], true, 10000, 
> StorageLevel(disk, memory, deserialized, 1 replicas)
>    :     :  +- *Scan JDBCRelation((select [SLTables].[_TableSL_SID], 
> [SLTables].[_TableP_DC_SID], [SLTables].[_TableSH_SID], [SLTables].[ID], 
> [SLTables].[Name], [SLTables].[TableP_DCID],  ... [...] FROM [sch].[SLTables] 
> [SLTables] JOIN sch.TPSLTables TPSLTables ON [TPSLTables].[_TableSL_SID] = 
> [SLTables].[_TableSL_SID] where _TP = 24) input) 
> [_TableSL_SID#143L,_TableP_DC_SID#144L,_TableSH_SID#145L,ID#146,Name#147,,... 
> 36 more fields]
> {noformat}



