[ https://issues.apache.org/jira/browse/SPARK-27211?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Guiju Zhang updated SPARK-27211:
--------------------------------
    Description:
I have a bean class RawLogPayload with a field: long timestamp.
I join two Dataset<RawLogPayload> instances and select a subset of the columns.

Code snippet:

    import org.apache.spark.sql.Dataset;
    import org.apache.spark.sql.Encoders;
    import org.apache.spark.sql.functions;

    extractedRawTc.printSchema(); // output1

    // Inner-join the two datasets on rawsessionid, keep the W3C columns plus
    // uid from the TC side, then map the rows back to the RawLogPayload bean.
    Dataset<RawLogPayload> extractedRawW3cFilled =
        extractedRawW3c.alias("extractedRawW3c")
            .join(extractedRawTc.alias("extractedRawTc"),
                functions.col("extractedRawW3c.rawsessionid")
                    .equalTo(functions.col("extractedRawTc.rawsessionid")),
                "inner")
            .select(
                functions.col("extractedRawW3c.df_logdatetime"),
                functions.col("extractedRawW3c.rawsessionid"),
                functions.col("extractedRawTc.uid"),
                functions.col("extractedRawW3c.time"),
                functions.col("extractedRawW3c.T"),
                functions.col("extractedRawW3c.url"),
                functions.col("extractedRawW3c.wid"),
                functions.col("extractedRawW3c.tid"),
                functions.col("extractedRawW3c.fid"),
                functions.col("extractedRawW3c.string1"),
                functions.col("extractedRawW3c.curWindow"),
                functions.col("extractedRawW3c.timestamp")) // the column in question
            .as(Encoders.bean(RawLogPayload.class));

    extractedRawW3cFilled.printSchema(); // output2

Running this throws the following exception:

2019-03-20 15:28:31 ERROR CodeGenerator:91 ## failed to compile: org.codehaus.commons.compiler.CompileException: File 'generated.java', Line 103, Column 32: No applicable constructor/method found for actual parameters "org.apache.spark.unsafe.types.UTF8String"; candidates are: "public void com.microsoft.datamining.spartan.api.core.RawLogPayload.setTimestamp(long)"
org.codehaus.commons.compiler.CompileException: File 'generated.java', Line 103, Column 32: *No applicable constructor/method found for actual parameters "org.apache.spark.unsafe.types.UTF8String"; candidates are: "public void com.**xxxx**.**xxxx**.spartan.api.core.RawLogPayload.setTimestamp(long)"*
    at org.codehaus.janino.UnitCompiler.compileError(UnitCompiler.java:11821)
    at org.codehaus.janino.UnitCompiler.findMostSpecificIInvocable(UnitCompiler.java:8910)

Output1: extractedRawTc schema

root
 |-- curWindow: string (nullable = true)
 |-- df_logdatetime: string (nullable = true)
 |-- fid: string (nullable = true)
 |-- rawsessionid: string (nullable = true)
 |-- string1: string (nullable = true)
 |-- t: string (nullable = true)
 |-- tid: string (nullable = true)
 |-- time: string (nullable = true)
 |-- *timestamp: long (nullable = true)*
 |-- uid: string (nullable = true)
 |-- url: string (nullable = true)
 |-- wid: string (nullable = true)

Output2: extractedRawW3cFilled schema

root
 |-- df_logdatetime: string (nullable = true)
 |-- rawsessionid: string (nullable = true)
 |-- uid: string (nullable = true)
 |-- time: string (nullable = true)
 |-- T: string (nullable = true)
 |-- url: string (nullable = true)
 |-- wid: string (nullable = true)
 |-- tid: string (nullable = true)
 |-- fid: string (nullable = true)
 |-- string1: string (nullable = true)
 |-- curWindow: string (nullable = true)
 |-- *timestamp: long (nullable = true)*

My question: the schema shows that the timestamp column is long, but according to the exception log its data type appears to become UTF8String after the select. Why does this happen? Is it a bug? If not, could you point out how to use it correctly?

Thanks

  was:
(1) RawLogPayload has a field: long timestamp

(2)

    import org.apache.spark.sql.Dataset;
    import org.apache.spark.sql.Encoders;
    import org.apache.spark.sql.functions;

    extractedRawTc.printSchema(); // output1

    // Inner-join the two datasets on rawsessionid, keep the W3C columns plus
    // uid from the TC side, then map the rows back to the RawLogPayload bean.
    Dataset<RawLogPayload> extractedRawW3cFilled =
        extractedRawW3c.alias("extractedRawW3c")
            .join(extractedRawTc.alias("extractedRawTc"),
                functions.col("extractedRawW3c.rawsessionid")
                    .equalTo(functions.col("extractedRawTc.rawsessionid")),
                "inner")
            .select(
                functions.col("extractedRawW3c.df_logdatetime"),
                functions.col("extractedRawW3c.rawsessionid"),
                functions.col("extractedRawTc.uid"),
                functions.col("extractedRawW3c.time"),
                functions.col("extractedRawW3c.T"),
                functions.col("extractedRawW3c.url"),
                functions.col("extractedRawW3c.wid"),
                functions.col("extractedRawW3c.tid"),
                functions.col("extractedRawW3c.fid"),
                functions.col("extractedRawW3c.string1"),
                functions.col("extractedRawW3c.curWindow"),
                functions.col("extractedRawW3c.timestamp")) // the column in question
            .as(Encoders.bean(RawLogPayload.class));

    extractedRawW3cFilled.printSchema(); // output2

(4) cast exception

2019-03-20 15:28:31 ERROR CodeGenerator:91 ## failed to compile: org.codehaus.commons.compiler.CompileException: File 'generated.java', Line 103, Column 32: No applicable constructor/method found for actual parameters "org.apache.spark.unsafe.types.UTF8String"; candidates are: "public void com.microsoft.datamining.spartan.api.core.RawLogPayload.setTimestamp(long)"
org.codehaus.commons.compiler.CompileException: File 'generated.java', Line 103, Column 32: *No applicable constructor/method found for actual parameters "org.apache.spark.unsafe.types.UTF8String"; candidates are: "public void com.**xxxx**.**xxxx**.spartan.api.core.RawLogPayload.setTimestamp(long)"*
    at org.codehaus.janino.UnitCompiler.compileError(UnitCompiler.java:11821)
    at org.codehaus.janino.UnitCompiler.findMostSpecificIInvocable(UnitCompiler.java:8910)

Output1: extractedRawTc schema

root
 |-- curWindow: string (nullable = true)
 |-- df_logdatetime: string (nullable = true)
 |-- fid: string (nullable = true)
 |-- rawsessionid: string (nullable = true)
 |-- string1: string (nullable = true)
 |-- t: string (nullable = true)
 |-- tid: string (nullable = true)
 |-- time: string (nullable = true)
 |-- *timestamp: long (nullable = true)*
 |-- uid: string (nullable = true)
 |-- url: string (nullable = true)
 |-- wid: string (nullable = true)

Output2: extractedRawW3cFilled schema

root
 |-- df_logdatetime: string (nullable = true)
 |-- rawsessionid: string (nullable = true)
 |-- uid: string (nullable = true)
 |-- time: string (nullable = true)
 |-- T: string (nullable = true)
 |-- url: string (nullable = true)
 |-- wid: string (nullable = true)
 |-- tid: string (nullable = true)
 |-- fid: string (nullable = true)
 |-- string1: string (nullable = true)
 |-- curWindow: string (nullable = true)
 |-- *timestamp: long (nullable = true)*


> cast error when select column from Row
> --------------------------------------
>
>                 Key: SPARK-27211
>                 URL: https://issues.apache.org/jira/browse/SPARK-27211
>             Project: Spark
>          Issue Type: Question
>          Components: Java API
>    Affects Versions: 2.3.0, 2.3.1
>            Reporter: Guiju Zhang
>            Priority: Major
>              Labels: SQL, Spark

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)