[jira] [Updated] (SPARK-13299) DataFrame limit operation is not consistent
[ https://issues.apache.org/jira/browse/SPARK-13299?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Nazarii Balkovskyi updated SPARK-13299:
---
Description:

I ran into a problem with the limit method of the DataFrame API. I am trying to get the first 999 records from an Avro source that contains about 3.5K records.

{code:java}
DataFrame df = sqlContext.load(inputSource, "com.databricks.spark.avro");
df = df.limit(999);
{code}

After the save operation, the rows are not in the same order as in the input data set. Sometimes the order is correct, but usually it is not.

{code:java}
df.save(filepathToSave, "com.databricks.spark.avro", SaveMode.ErrorIfExists);
{code}

Here is the Spark plan (it may help to figure out the cause of the issue):

{code}
== Parsed Logical Plan ==
Limit 999
 Relation[mobileNumber#0L,tariff#1,debit#2] AvroRelation(hdfs://:8020/user/hdfs/dataset.avro,None,0)

== Analyzed Logical Plan ==
mobileNumber: bigint, tariff: string, debit: float
Limit 999
 Relation[mobileNumber#0L,tariff#1,debit#2] AvroRelation(hdfs://:8020/user/hdfs/dataset.avro,None,0)

== Optimized Logical Plan ==
Limit 999
 Relation[mobileNumber#0L,tariff#1,debit#2] AvroRelation(hdfs://:8020/user/hdfs/dataset.avro,None,0)

== Physical Plan ==
Limit 999
 Scan AvroRelation(hdfs://:8020/user/hdfs/dataset.avro,None,0)[mobileNumber#0L,tariff#1,debit#2]

Code Generation: true
{code}

> DataFrame limit operation is not consistent
> -------------------------------------------
>
>                 Key: SPARK-13299
>                 URL: https://issues.apache.org/jira/browse/SPARK-13299
>             Project: Spark
>          Issue Type: Bug
>    Affects Versions: 1.3.1, 1.5.0, 1.5.1, 1.5.2, 1.6.0
>            Reporter: Nazarii Balkovskyi
>              Labels: SparkSQL, dataframe
>         Attachments: SparkLimitIssue.png

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
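The behavior reported above is consistent with how a distributed limit works when no ordering is imposed: rows are gathered from whichever partitions are drained first, so the output order depends on partition layout and scheduling rather than on the input file order. The following plain-Java sketch (not Spark code; the two-partition split and "completion order" below are hypothetical) illustrates how taking the first N rows across partitions can reorder them, and how an explicit sort restores a deterministic result:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

public class LimitOrderSketch {
    // Drain partitions in the order they "complete", stopping after n rows.
    // This mimics a limit that carries no ordering guarantee.
    static List<Integer> limitFromPartitions(List<List<Integer>> partitions, int n) {
        List<Integer> out = new ArrayList<>();
        for (List<Integer> partition : partitions) {   // completion order, not input order
            for (int row : partition) {
                if (out.size() == n) {
                    return out;
                }
                out.add(row);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<Integer> input = IntStream.range(0, 10).boxed().collect(Collectors.toList());
        // Input split into two partitions; suppose the second half "finishes" first.
        List<List<Integer>> completionOrder =
                Arrays.asList(input.subList(5, 10), input.subList(0, 5));

        List<Integer> limited = limitFromPartitions(completionOrder, 6);
        System.out.println(limited);   // prints [5, 6, 7, 8, 9, 0] -- input order is lost

        // An explicit sort key makes the result independent of partition scheduling.
        List<Integer> sorted = new ArrayList<>(limited);
        Collections.sort(sorted);
        System.out.println(sorted);    // prints [0, 5, 6, 7, 8, 9]
    }
}
```

This is only an illustration of the general mechanism; the actual execution path of Limit in Spark may differ. The practical implication is the same, though: without an explicit orderBy/sort before limit, the selected rows and their order are not guaranteed to match the input file.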
[jira] [Updated] (SPARK-13299) DataFrame limit operation is not consistent
[ https://issues.apache.org/jira/browse/SPARK-13299?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Nazarii Balkovskyi updated SPARK-13299:
---
Attachment: SparkLimitIssue.png
[jira] [Created] (SPARK-13299) DataFrame limit operation is not consistent
Nazarii Balkovskyi created SPARK-13299:
--

             Summary: DataFrame limit operation is not consistent
                 Key: SPARK-13299
                 URL: https://issues.apache.org/jira/browse/SPARK-13299
             Project: Spark
          Issue Type: Bug
    Affects Versions: 1.6.0, 1.5.2, 1.5.1, 1.5.0, 1.3.1
            Reporter: Nazarii Balkovskyi

I ran into a problem with the limit method of the DataFrame API. I am trying to get the first 999 records from an Avro source that contains about 3.5K records.

{code:java}
DataFrame df = sqlContext.load(inputSource, "com.databricks.spark.avro");
df = df.limit(999);
{code}

After the save operation, the rows are not in the same order as in the input data set. Sometimes the order is correct, but usually it is not.

{code:java}
df.save(filepathToSave, "com.databricks.spark.avro", SaveMode.ErrorIfExists);
{code}

Here is the Spark plan (it may help to figure out the cause of the issue):

{code}
== Parsed Logical Plan ==
Limit 999
 Relation[color#0,id#1,type#2,rand#3,junk#4] AvroRelation(hdfs://:8020/tmp/hdfs.2016-02-12--10-18-55-171-488/hdfs.2016-02-12--10-19-05-109-895.avro,None,0)

== Analyzed Logical Plan ==
color: string, id: int, type: string, rand: int, junk: string
Limit 999
 Relation[color#0,id#1,type#2,rand#3,junk#4] AvroRelation(hdfs://:8020/tmp/hdfs.2016-02-12--10-18-55-171-488/hdfs.2016-02-12--10-19-05-109-895.avro,None,0)

== Optimized Logical Plan ==
InMemoryRelation [color#0,id#1,type#2,rand#3,junk#4], true, 1, StorageLevel(true, true, false, true, 1), (Limit 999), None

== Physical Plan ==
InMemoryColumnarTableScan [color#0,id#1,type#2,rand#3,junk#4], (InMemoryRelation [color#0,id#1,type#2,rand#3,junk#4], true, 1, StorageLevel(true, true, false, true, 1), (Limit 999), None)

Code Generation: true
{code}