Re: SparkR Supported Types - Please add bigint
They are actually the same thing: LongType. `long` is the friendlier name for developers; `bigint` is friendlier for database people, and maybe data scientists.

On Thu, Jul 23, 2015 at 11:33 PM, Sun, Rui <rui@intel.com> wrote:

> printSchema calls StructField.buildFormattedString() to output schema information. buildFormattedString() uses DataType.typeName as the string representation of the data type.
>
> LongType.typeName = long
> LongType.simpleString = bigint
>
> I am not sure about the difference between these two type name representations.
To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
For additional commands, e-mail: user-h...@spark.apache.org
RE: SparkR Supported Types - Please add bigint
printSchema calls StructField.buildFormattedString() to output schema information. buildFormattedString() uses DataType.typeName as the string representation of the data type.

LongType.typeName = long
LongType.simpleString = bigint

I am not sure about the difference between these two type name representations.

-----Original Message-----
From: Exie [mailto:tfind...@prodevelop.com.au]
Sent: Friday, July 24, 2015 1:35 PM
To: user@spark.apache.org
Subject: Re: SparkR Supported Types - Please add bigint

Interestingly, after more digging, df.printSchema() in raw spark shows the columns as a long, not a bigint.
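To make the two spellings concrete: printSchema() prints DataType.typeName, while show() prints DataType.simpleString, so the same column can appear as `long` in one place and `bigint` in the other. Below is a minimal sketch in plain Python (no Spark required) of a lookup that treats the two spellings as one type. The helper names `canonical_type_name` and `same_type` are hypothetical and not part of any Spark API; the table covers only the integral types, on the assumption that those are the ones whose two names differ in a way that matters here.

```python
# Hypothetical helper, NOT part of Spark: maps the names printed by
# printSchema() (DataType.typeName) to the SQL-style names printed by
# show() (DataType.simpleString) for Spark's integral types.
TYPE_NAME_TO_SIMPLE_STRING = {
    "byte": "tinyint",
    "short": "smallint",
    "integer": "int",
    "long": "bigint",
}


def canonical_type_name(name):
    """Return the SQL-style spelling of a type name; unknown names pass through."""
    return TYPE_NAME_TO_SIMPLE_STRING.get(name, name)


def same_type(a, b):
    """True if two type-name strings refer to the same underlying type."""
    return canonical_type_name(a) == canonical_type_name(b)


if __name__ == "__main__":
    print(same_type("long", "bigint"))
    print(same_type("long", "int"))
```

A lookup like this is only useful when comparing schema strings collected from different outputs; inside Spark both names resolve to the same LongType.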
RE: SparkR Supported Types - Please add bigint
Exie,

Reported your issue: https://issues.apache.org/jira/browse/SPARK-9302

SparkR supports the long (bigint) type in serde. This issue is related to supporting complex Scala types in serde.

-----Original Message-----
From: Exie [mailto:tfind...@prodevelop.com.au]
Sent: Friday, July 24, 2015 10:26 AM
To: user@spark.apache.org
Subject: SparkR Supported Types - Please add bigint

Hi Folks,

Using Spark to read in JSON files and detect the schema gives me a dataframe with a bigint field. R then fails to import the dataframe because it can't convert the type.

    head(mydf)
    Error in as.data.frame.default(x[[i]], optional = TRUE) :
      cannot coerce class jobj to a data.frame

    show(mydf)
    DataFrame[localEventDtTm:timestamp, asset:string, assetCategory:string, assetType:string, event:string, extras:array<struct<name:string,value:string>>, ipAddress:string, memberId:string, system:string, timestamp:bigint, title:string, trackingId:string, version:bigint]

I believe this is related to: https://issues.apache.org/jira/browse/SPARK-8840

A sample record in raw JSON looks like this:

    {"version": 1,"event": "view","timestamp": 1427846422377,"system": "DCDS","asset": "6404476","assetType": "myType","assetCategory": "myCategory","extras": [{"name": "videoSource","value": "mySource"},{"name": "playerType","value": "Article"},{"name": "duration","value": "202088"}],"trackingId": "155629a0-d802-11e4-13ee-6884e43d6000","ipAddress": "165.69.2.4","title": "myTitle"}

Can someone turn this into a feature request or something for 1.5.0?

--
View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/SparkR-Supported-Types-Please-add-bigint-tp23975.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.
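For context on why a 64-bit type shows up in this schema at all: the `timestamp` field in the sample record is far outside the 32-bit integer range, and R's native integer type is 32-bit. The following is a rough sketch in plain Python (no Spark involved) that checks the integer fields of a trimmed copy of the sample record. The record subset, the restored JSON quotes, and the helper name `classify` are assumptions made for illustration; the sketch says nothing about how SparkR itself converts values.

```python
import json

INT32_MIN, INT32_MAX = -2**31, 2**31 - 1
# Every integer with magnitude up to 2**53 is exactly representable as a double.
DOUBLE_EXACT_MAX = 2**53

# Trimmed copy of the sample record; quotes restored by assumption.
record = json.loads(
    '{"version": 1, "event": "view", "timestamp": 1427846422377}'
)


def classify(value):
    """Describe which integer width a value needs (hypothetical helper)."""
    if INT32_MIN <= value <= INT32_MAX:
        return "fits in 32 bits"
    if abs(value) <= DOUBLE_EXACT_MAX:
        return "needs 64 bits, exact as a double"
    return "needs 64 bits, not exact as a double"


# Keep only true integer fields (bool is a subclass of int in Python).
report = {
    key: classify(value)
    for key, value in record.items()
    if isinstance(value, int) and not isinstance(value, bool)
}

if __name__ == "__main__":
    for key, verdict in report.items():
        print(key, "->", verdict)
```

Note that in the schema above even `version`, whose sample value is 1, is reported as bigint, so the inferred type does not depend on the magnitude of any single value.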
Re: SparkR Supported Types - Please add bigint
Interestingly, after more digging, df.printSchema() in raw spark shows the columns as a long, not a bigint.

root
 |-- localEventDtTm: timestamp (nullable = true)
 |-- asset: string (nullable = true)
 |-- assetCategory: string (nullable = true)
 |-- assetType: string (nullable = true)
 |-- event: string (nullable = true)
 |-- extras: array (nullable = true)
 |    |-- element: struct (containsNull = true)
 |    |    |-- name: string (nullable = true)
 |    |    |-- value: string (nullable = true)
 |-- ipAddress: string (nullable = true)
 |-- memberId: string (nullable = true)
 |-- system: string (nullable = true)
 |-- timestamp: long (nullable = true)
 |-- title: string (nullable = true)
 |-- trackingId: string (nullable = true)
 |-- version: long (nullable = true)

I'm going to have to keep digging I guess. :(

--
View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/SparkR-Supported-Types-Please-add-bigint-tp23975p23978.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.