[ 
https://issues.apache.org/jira/browse/SPARK-27211?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Guiju Zhang updated SPARK-27211:
--------------------------------
    Description: 
First, I have an object RawLogPayload which has a field: long timestamp

Then I try to join two Dataset<RawLogPayload> and select some of the columns.

The code snippet follows:

extractedRawTc.printSchema();   // output1

Dataset<RawLogPayload> extractedRawW3cFilled = extractedRawW3c.alias("extractedRawW3c")
    .join(extractedRawTc.alias("extractedRawTc"),
        functions.col("extractedRawW3c.rawsessionid")
            .equalTo(functions.col("extractedRawTc.rawsessionid")),
        "inner")
    .select(
        functions.col("extractedRawW3c.df_logdatetime"),
        functions.col("extractedRawW3c.rawsessionid"),
        functions.col("extractedRawTc.uid"),
        functions.col("extractedRawW3c.time"),
        functions.col("extractedRawW3c.T"),
        functions.col("extractedRawW3c.url"),
        functions.col("extractedRawW3c.wid"),
        functions.col("extractedRawW3c.tid"),
        functions.col("extractedRawW3c.fid"),
        functions.col("extractedRawW3c.string1"),
        functions.col("extractedRawW3c.curWindow"),
        functions.col("extractedRawW3c.timestamp"))   // the long column in question
    .as(Encoders.bean(RawLogPayload.class));

extractedRawW3cFilled.printSchema();  // output2

 

Running this throws the following exception:

2019-03-20 15:28:31 ERROR CodeGenerator:91 ## failed to compile: org.codehaus.commons.compiler.CompileException: File 'generated.java', Line 103, Column 32: No applicable constructor/method found for actual parameters "org.apache.spark.unsafe.types.UTF8String"; candidates are: "public void com.microsoft.datamining.spartan.api.core.RawLogPayload.setTimestamp(long)"

org.codehaus.commons.compiler.CompileException: File 'generated.java', Line 103, Column 32: No applicable constructor/method found for actual parameters "org.apache.spark.unsafe.types.UTF8String"; candidates are: "public void com.xxxx.xxxx.spartan.api.core.RawLogPayload.setTimestamp(long)"
    at org.codehaus.janino.UnitCompiler.compileError(UnitCompiler.java:11821)
    at org.codehaus.janino.UnitCompiler.findMostSpecificIInvocable(UnitCompiler.java:8910)
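
To make the compile error concrete, here is a minimal plain-Java sketch of what the message says. It does not use Spark and does not explain why the generated code passes a UTF8String; it only shows that a bean whose sole setter is setTimestamp(long) has no applicable method for a string-typed argument. The nested RawLogPayload class is a hypothetical stand-in that models only the timestamp property, the hasSetter helper is made up for this illustration, and java.lang.String stands in for Spark's UTF8String.

```java
import java.lang.reflect.Method;

final class SetterMismatchDemo {

    // Hypothetical stand-in for the real bean; only the timestamp property is modeled.
    static final class RawLogPayload {
        private long timestamp;

        public long getTimestamp() {
            return timestamp;
        }

        public void setTimestamp(long timestamp) {
            this.timestamp = timestamp;
        }
    }

    /**
     * Returns true if beanClass has a public void method with the given name
     * whose single parameter is exactly paramType.
     */
    static boolean hasSetter(Class<?> beanClass, String name, Class<?> paramType) {
        try {
            Method m = beanClass.getMethod(name, paramType);
            return m.getReturnType() == void.class;
        } catch (NoSuchMethodException e) {
            // No method with this exact signature: the situation the compiler reports.
            return false;
        }
    }

    public static void main(String[] args) {
        // The setter the bean actually declares.
        System.out.println(hasSetter(RawLogPayload.class, "setTimestamp", long.class));
        // String stands in for UTF8String (assumption): no such overload exists.
        System.out.println(hasSetter(RawLogPayload.class, "setTimestamp", String.class));
    }
}
```

So the message itself only states that the value handed to setTimestamp is string-typed at that point in the generated code, while the bean accepts nothing but long.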

 

Output1: extractedRawTc schema

root
 |-- curWindow: string (nullable = true)
 |-- df_logdatetime: string (nullable = true)
 |-- fid: string (nullable = true)
 |-- rawsessionid: string (nullable = true)
 |-- string1: string (nullable = true)
 |-- t: string (nullable = true)
 |-- tid: string (nullable = true)
 |-- time: string (nullable = true)
 |-- timestamp: long (nullable = true)
 |-- uid: string (nullable = true)
 |-- url: string (nullable = true)
 |-- wid: string (nullable = true)

 

Output2: extractedRawW3cFilled schema

root
 |-- df_logdatetime: string (nullable = true)
 |-- rawsessionid: string (nullable = true)
 |-- uid: string (nullable = true)
 |-- time: string (nullable = true)
 |-- T: string (nullable = true)
 |-- url: string (nullable = true)
 |-- wid: string (nullable = true)
 |-- tid: string (nullable = true)
 |-- fid: string (nullable = true)
 |-- string1: string (nullable = true)
 |-- curWindow: string (nullable = true)
 |-- timestamp: long (nullable = true)

My question: the schema shows the timestamp column as long, but according to the exception log its datatype seems to become UTF8String after the select. Why would this happen? Is it a bug? If not, could you point out how to use it correctly?

Thanks

  was:
(1) RawLogPayload has a field: long timestamp

 

(2)

extractedRawTc.printSchema();   // output1

Dataset<RawLogPayload> extractedRawW3cFilled = 
extractedRawW3c.alias("extractedRawW3c")

.join(extractedRawTc.alias("extractedRawTc"), 
functions.col("extractedRawW3c.rawsessionid").equalTo(functions.col("extractedRawTc.rawsessionid")),
 "inner")

.select(functions.col("extractedRawW3c.df_logdatetime"), 
functions.col("extractedRawW3c.rawsessionid"), 
functions.col("extractedRawTc.uid"),

functions.col("extractedRawW3c.time"),functions.col("extractedRawW3c.T"),functions.col("extractedRawW3c.url"),functions.col("extractedRawW3c.wid"),

functions.col("extractedRawW3c.tid"), 
functions.col("extractedRawW3c.fid"),functions.col("extractedRawW3c.string1"),

functions.col("extractedRawW3c.curWindow"), 
functions.col("extractedRawW3c.timestamp"))

.as(Encoders.bean(RawLogPayload.class));

extractedRawW3cFilled.printSchema();  // output2

 

(4) cast exception

 

2019-03-20 15:28:31 ERROR CodeGenerator:91 ## failed to compile: 
org.codehaus.commons.compiler.CompileException: File 'generated.java', Line 
103, Column 32: No applicable constructor/method found for actual parameters 
"org.apache.spark.unsafe.types.UTF8String"; candidates are: "public void 
com.microsoft.datamining.spartan.api.core.RawLogPayload.setTimestamp(long)"

org.codehaus.commons.compiler.CompileException: File 'generated.java', Line 
103, Column 32: *No applicable constructor/method found for actual parameters 
"org.apache.spark.unsafe.types.UTF8String"; candidates are: "public void 
com.**xxxx**.**xxxx**.spartan.api.core.RawLogPayload.setTimestamp(long)"*

at org.codehaus.janino.UnitCompiler.compileError(UnitCompiler.java:11821)

at 
org.codehaus.janino.UnitCompiler.findMostSpecificIInvocable(UnitCompiler.java:8910)

 

 

Output1 extractedRawTc schema

root

 |-- curWindow: string (nullable = true)

 |-- df_logdatetime: string (nullable = true)

 |-- fid: string (nullable = true)

 |-- rawsessionid: string (nullable = true)

 |-- string1: string (nullable = true)

 |-- t: string (nullable = true)

 |-- tid: string (nullable = true)

 |-- time: string (nullable = true)

 |-- *timestamp: long (nullable = true)*

 |-- uid: string (nullable = true)

 |-- url: string (nullable = true)

 |-- wid: string (nullable = true)

 

 

Output2  extractedRawW3cFilled schema

root

 |-- df_logdatetime: string (nullable = true)

 |-- rawsessionid: string (nullable = true)

 |-- uid: string (nullable = true)

 |-- time: string (nullable = true)

 |-- T: string (nullable = true)

 |-- url: string (nullable = true)

 |-- wid: string (nullable = true)

 |-- tid: string (nullable = true)

 |-- fid: string (nullable = true)

 |-- string1: string (nullable = true)

 |-- curWindow: string (nullable = true)

 |-- *timestamp: long (nullable = true)*


> cast error when select column from Row
> --------------------------------------
>
>                 Key: SPARK-27211
>                 URL: https://issues.apache.org/jira/browse/SPARK-27211
>             Project: Spark
>          Issue Type: Question
>          Components: Java API
>    Affects Versions: 2.3.0, 2.3.1
>            Reporter: Guiju Zhang
>            Priority: Major
>              Labels: SQL, Spark



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
