[GitHub] spark pull request: [SPARK-14114][SQL] implement buildReader for t...
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/11934 --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-14114][SQL] implement buildReader for t...
Github user liancheng commented on the pull request: https://github.com/apache/spark/pull/11934#issuecomment-203344269 LGTM, merging to master.
[GitHub] spark pull request: [SPARK-14114][SQL] implement buildReader for t...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/11934#issuecomment-203323790 Merged build finished. Test PASSed.
[GitHub] spark pull request: [SPARK-14114][SQL] implement buildReader for t...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/11934#issuecomment-203323791 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/54501/ Test PASSed.
[GitHub] spark pull request: [SPARK-14114][SQL] implement buildReader for t...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/11934#issuecomment-203323445
**[Test build #54501 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/54501/consoleFull)** for PR 11934 at commit [`2b87aa6`](https://github.com/apache/spark/commit/2b87aa6d013efa5762123bb3480b7a5b64879055).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] spark pull request: [SPARK-14114][SQL] implement buildReader for t...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/11934#issuecomment-203282708 **[Test build #54501 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/54501/consoleFull)** for PR 11934 at commit [`2b87aa6`](https://github.com/apache/spark/commit/2b87aa6d013efa5762123bb3480b7a5b64879055).
[GitHub] spark pull request: [SPARK-14114][SQL] implement buildReader for t...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/11934#issuecomment-203256047 Merged build finished. Test FAILed.
[GitHub] spark pull request: [SPARK-14114][SQL] implement buildReader for t...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/11934#issuecomment-203256050 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/54491/ Test FAILed.
[GitHub] spark pull request: [SPARK-14114][SQL] implement buildReader for t...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/11934#issuecomment-203255912
**[Test build #54491 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/54491/consoleFull)** for PR 11934 at commit [`ef77e70`](https://github.com/apache/spark/commit/ef77e7067e1ae9fe8e8e00f79d72f03d92b0f33e).
* This patch **fails Spark unit tests**.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] spark pull request: [SPARK-14114][SQL] implement buildReader for t...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/11934#issuecomment-203238196 **[Test build #54491 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/54491/consoleFull)** for PR 11934 at commit [`ef77e70`](https://github.com/apache/spark/commit/ef77e7067e1ae9fe8e8e00f79d72f03d92b0f33e).
[GitHub] spark pull request: [SPARK-14114][SQL] implement buildReader for t...
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/11934#discussion_r57829895
--- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/text/DefaultSource.scala ---
@@ -125,6 +126,33 @@ class DefaultSource extends FileFormat with DataSourceRegister {
         }
       }
     }
+
+  override def buildReader(
+      sqlContext: SQLContext,
+      partitionSchema: StructType,
+      dataSchema: StructType,
+      filters: Seq[Filter],
+      options: Map[String, String]): PartitionedFile => Iterator[InternalRow] = {
+    verifySchema(dataSchema)
+
+    val conf = new Configuration(sqlContext.sparkContext.hadoopConfiguration)
+    val broadcastedConf =
+      sqlContext.sparkContext.broadcast(new SerializableConfiguration(conf))
+
+    val unsafeRow = new UnsafeRow(1)
+    val bufferHolder = new BufferHolder(unsafeRow)
+    val unsafeRowWriter = new UnsafeRowWriter(bufferHolder, 1)
--- End diff --
yea, `HadoopFileLinesReader` returns `Text` and we should process it directly instead of converting it to string first.
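The point about processing `Text` directly can be illustrated without Hadoop on the classpath. The sketch below is only an illustration under stated assumptions: `ReusableLine` is a hypothetical stand-in for a reusable Writable such as Hadoop's `Text`, mirroring its `getBytes`/`getLength` contract (the backing array may be longer than the valid data), and `DirectBytesSketch.copyDirect` is likewise a made-up helper. It is not the code from the pull request.

```scala
import java.io.ByteArrayOutputStream
import java.nio.charset.StandardCharsets.UTF_8

// Hypothetical stand-in for a reusable Writable such as Hadoop's Text:
// one backing array is refilled for every record instead of allocating anew.
final class ReusableLine {
  private var bytes = new Array[Byte](16)
  private var length = 0

  def set(s: String): Unit = {
    val src = s.getBytes(UTF_8)
    // Grow the backing array only when the new record does not fit.
    if (src.length > bytes.length) bytes = new Array[Byte](src.length)
    System.arraycopy(src, 0, bytes, 0, src.length)
    length = src.length
  }

  // The returned array may hold stale bytes past getLength.
  def getBytes: Array[Byte] = bytes
  def getLength: Int = length
}

object DirectBytesSketch {
  // Copies only the valid prefix of the reusable buffer into the sink,
  // without building an intermediate String for each line.
  def copyDirect(line: ReusableLine, out: ByteArrayOutputStream): Unit =
    out.write(line.getBytes, 0, line.getLength)

  def main(args: Array[String]): Unit = {
    val line = new ReusableLine
    val out = new ByteArrayOutputStream()
    for (s <- Seq("first line", "2nd")) {
      line.set(s)
      copyDirect(line, out)
      out.write('\n')
    }
    print(new String(out.toByteArray, UTF_8))
  }
}
```

The detail that matters is passing `getLength` along with `getBytes`: because the buffer is reused, a shorter record leaves bytes from the previous one in the tail of the array.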
[GitHub] spark pull request: [SPARK-14114][SQL] implement buildReader for t...
Github user marmbrus commented on a diff in the pull request: https://github.com/apache/spark/pull/11934#discussion_r57828026
--- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/text/DefaultSource.scala ---
@@ -125,6 +126,33 @@ class DefaultSource extends FileFormat with DataSourceRegister {
         }
       }
     }
+
+  override def buildReader(
+      sqlContext: SQLContext,
+      partitionSchema: StructType,
+      dataSchema: StructType,
+      filters: Seq[Filter],
+      options: Map[String, String]): PartitionedFile => Iterator[InternalRow] = {
+    verifySchema(dataSchema)
+
+    val conf = new Configuration(sqlContext.sparkContext.hadoopConfiguration)
+    val broadcastedConf =
+      sqlContext.sparkContext.broadcast(new SerializableConfiguration(conf))
+
+    val unsafeRow = new UnsafeRow(1)
+    val bufferHolder = new BufferHolder(unsafeRow)
+    val unsafeRowWriter = new UnsafeRowWriter(bufferHolder, 1)
--- End diff --
Oh, it's a Writable, so you are avoiding the extra object allocation...
[GitHub] spark pull request: [SPARK-14114][SQL] implement buildReader for t...
Github user marmbrus commented on a diff in the pull request: https://github.com/apache/spark/pull/11934#discussion_r57827885
--- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/text/DefaultSource.scala ---
@@ -125,6 +126,33 @@ class DefaultSource extends FileFormat with DataSourceRegister {
         }
       }
     }
+
+  override def buildReader(
+      sqlContext: SQLContext,
+      partitionSchema: StructType,
+      dataSchema: StructType,
+      filters: Seq[Filter],
+      options: Map[String, String]): PartitionedFile => Iterator[InternalRow] = {
+    verifySchema(dataSchema)
+
+    val conf = new Configuration(sqlContext.sparkContext.hadoopConfiguration)
+    val broadcastedConf =
+      sqlContext.sparkContext.broadcast(new SerializableConfiguration(conf))
+
+    val unsafeRow = new UnsafeRow(1)
+    val bufferHolder = new BufferHolder(unsafeRow)
+    val unsafeRowWriter = new UnsafeRowWriter(bufferHolder, 1)
--- End diff --
These are not serializable and need to be moved into the closure. Also, is this the same as the following?
```scala
val encoder = ExpressionEncoder[String]()
new HadoopFileLinesReader(file, broadcastedConf.value.value).map(encoder.toRow)
```
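The serializability concern raised above can be reproduced with plain Java serialization, which is used here as a stand-in for how a closure gets shipped to executors. This is a minimal sketch under stated assumptions, not the code that was merged: `LineBuffer`, `ClosureSketch`, `serializable`, `capturing`, and `deferred` are hypothetical names, and `LineBuffer` merely plays the role of a non-serializable helper such as the row writer in the diff.

```scala
import java.io.{ByteArrayOutputStream, NotSerializableException, ObjectOutputStream}

// Hypothetical stand-in for a reusable, non-serializable helper
// (it deliberately does not extend Serializable).
final class LineBuffer {
  private val sb = new java.lang.StringBuilder
  def fill(s: String): String = { sb.setLength(0); sb.append(s); sb.toString }
}

object ClosureSketch {
  // Returns true if Java serialization accepts the object.
  def serializable(obj: AnyRef): Boolean =
    try {
      val out = new ObjectOutputStream(new ByteArrayOutputStream())
      out.writeObject(obj)
      out.close()
      true
    } catch {
      case _: NotSerializableException => false
    }

  // The buffer is created outside the returned function, so the function
  // captures it and can no longer be serialized.
  def capturing(): String => Iterator[String] = {
    val buffer = new LineBuffer
    (file: String) => Iterator(file).map(buffer.fill)
  }

  // The buffer is created inside the returned function, once per call,
  // so nothing non-serializable is captured.
  def deferred(): String => Iterator[String] = {
    (file: String) => {
      val buffer = new LineBuffer
      Iterator(file).map(buffer.fill)
    }
  }

  def main(args: Array[String]): Unit = {
    println(s"capturing closure serializable: ${serializable(capturing())}")
    println(s"deferred closure serializable:  ${serializable(deferred())}")
  }
}
```

Creating the helper inside the function still allows reuse across all lines of one file, since it is built once per invocation rather than once per line.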
[GitHub] spark pull request: [SPARK-14114][SQL] implement buildReader for t...
Github user cloud-fan commented on the pull request: https://github.com/apache/spark/pull/11934#issuecomment-202844477 ping @marmbrus, do you have any idea why the file source stress test keeps timing out?
[GitHub] spark pull request: [SPARK-14114][SQL] implement buildReader for t...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/11934#issuecomment-202831579 Merged build finished. Test FAILed.
[GitHub] spark pull request: [SPARK-14114][SQL] implement buildReader for t...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/11934#issuecomment-202831587 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/54418/ Test FAILed.
[GitHub] spark pull request: [SPARK-14114][SQL] implement buildReader for t...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/11934#issuecomment-202831035
**[Test build #54418 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/54418/consoleFull)** for PR 11934 at commit [`da95258`](https://github.com/apache/spark/commit/da95258ab6be0822d86f008f706beb467cc38c4e).
* This patch **fails from timeout after a configured wait of `250m`**.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] spark pull request: [SPARK-14114][SQL] implement buildReader for t...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/11934#issuecomment-202825515 Merged build finished. Test FAILed.
[GitHub] spark pull request: [SPARK-14114][SQL] implement buildReader for t...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/11934#issuecomment-202825518 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/54414/ Test FAILed.
[GitHub] spark pull request: [SPARK-14114][SQL] implement buildReader for t...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/11934#issuecomment-202825355
**[Test build #54414 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/54414/consoleFull)** for PR 11934 at commit [`f590e87`](https://github.com/apache/spark/commit/f590e87689a5cc49b25b9553315482e9ec4e5e23).
* This patch **fails from timeout after a configured wait of `250m`**.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] spark pull request: [SPARK-14114][SQL] implement buildReader for t...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/11934#issuecomment-202742522 **[Test build #54418 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/54418/consoleFull)** for PR 11934 at commit [`da95258`](https://github.com/apache/spark/commit/da95258ab6be0822d86f008f706beb467cc38c4e).
[GitHub] spark pull request: [SPARK-14114][SQL] implement buildReader for t...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/11934#issuecomment-202735703 **[Test build #54414 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/54414/consoleFull)** for PR 11934 at commit [`f590e87`](https://github.com/apache/spark/commit/f590e87689a5cc49b25b9553315482e9ec4e5e23).
[GitHub] spark pull request: [SPARK-14114][SQL] implement buildReader for t...
Github user liancheng commented on the pull request: https://github.com/apache/spark/pull/11934#issuecomment-202733752 retest this please
[GitHub] spark pull request: [SPARK-14114][SQL] implement buildReader for t...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/11934#issuecomment-202713243 Merged build finished. Test FAILed.
[GitHub] spark pull request: [SPARK-14114][SQL] implement buildReader for t...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/11934#issuecomment-202713247 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/54398/ Test FAILed.
[GitHub] spark pull request: [SPARK-14114][SQL] implement buildReader for t...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/11934#issuecomment-202712888
**[Test build #54398 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/54398/consoleFull)** for PR 11934 at commit [`f590e87`](https://github.com/apache/spark/commit/f590e87689a5cc49b25b9553315482e9ec4e5e23).
* This patch **fails from timeout after a configured wait of `250m`**.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] spark pull request: [SPARK-14114][SQL] implement buildReader for t...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/11934#issuecomment-202712337 Merged build finished. Test FAILed.
[GitHub] spark pull request: [SPARK-14114][SQL] implement buildReader for t...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/11934#issuecomment-202712338 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/54396/ Test FAILed.
[GitHub] spark pull request: [SPARK-14114][SQL] implement buildReader for t...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/11934#issuecomment-202712224
**[Test build #54396 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/54396/consoleFull)** for PR 11934 at commit [`8ea0fb2`](https://github.com/apache/spark/commit/8ea0fb282f9fc49052fa774720818f02d6ac8acb).
* This patch **fails from timeout after a configured wait of `250m`**.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] spark pull request: [SPARK-14114][SQL] implement buildReader for t...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/11934#issuecomment-202650547 **[Test build #54398 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/54398/consoleFull)** for PR 11934 at commit [`f590e87`](https://github.com/apache/spark/commit/f590e87689a5cc49b25b9553315482e9ec4e5e23).
[GitHub] spark pull request: [SPARK-14114][SQL] implement buildReader for t...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/11934#issuecomment-202649380 **[Test build #54396 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/54396/consoleFull)** for PR 11934 at commit [`8ea0fb2`](https://github.com/apache/spark/commit/8ea0fb282f9fc49052fa774720818f02d6ac8acb).
[GitHub] spark pull request: [SPARK-14114][SQL] implement buildReader for t...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/11934#issuecomment-201521446 Merged build finished. Test FAILed.
[GitHub] spark pull request: [SPARK-14114][SQL] implement buildReader for t...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/11934#issuecomment-201521464 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/54188/ Test FAILed.
[GitHub] spark pull request: [SPARK-14114][SQL] implement buildReader for t...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/11934#issuecomment-201520326
**[Test build #54188 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/54188/consoleFull)** for PR 11934 at commit [`947fc70`](https://github.com/apache/spark/commit/947fc70f0d71259f3d499fffa1d71e6ec3fe3ca5).
* This patch **fails from timeout after a configured wait of `250m`**.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] spark pull request: [SPARK-14114][SQL] implement buildReader for t...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/11934#issuecomment-201376305 **[Test build #54188 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/54188/consoleFull)** for PR 11934 at commit [`947fc70`](https://github.com/apache/spark/commit/947fc70f0d71259f3d499fffa1d71e6ec3fe3ca5).
[GitHub] spark pull request: [SPARK-14114][SQL] implement buildReader for t...
Github user yhuai commented on the pull request: https://github.com/apache/spark/pull/11934#issuecomment-201374918 test this please
[GitHub] spark pull request: [SPARK-14114][SQL] implement buildReader for t...
Github user cloud-fan commented on the pull request: https://github.com/apache/spark/pull/11934#issuecomment-201236007 Looks like it's a legitimate failure. @marmbrus, do you have any idea why the file source stress test times out?
[GitHub] spark pull request: [SPARK-14114][SQL] implement buildReader for t...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/11934#issuecomment-201228026 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/54152/ Test FAILed.
[GitHub] spark pull request: [SPARK-14114][SQL] implement buildReader for t...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/11934#issuecomment-201228023 Merged build finished. Test FAILed.
[GitHub] spark pull request: [SPARK-14114][SQL] implement buildReader for t...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/11934#issuecomment-201227889 **[Test build #54152 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/54152/consoleFull)** for PR 11934 at commit [`947fc70`](https://github.com/apache/spark/commit/947fc70f0d71259f3d499fffa1d71e6ec3fe3ca5). * This patch **fails from timeout after a configured wait of \`250m\`**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request: [SPARK-14114][SQL] implement buildReader for t...
Github user liancheng commented on the pull request: https://github.com/apache/spark/pull/11934#issuecomment-201156989 LGTM pending Jenkins
[GitHub] spark pull request: [SPARK-14114][SQL] implement buildReader for t...
Github user liancheng commented on a diff in the pull request: https://github.com/apache/spark/pull/11934#discussion_r57421991

    --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/FileSourceStrategy.scala ---
    @@ -57,9 +57,10 @@ import org.apache.spark.sql.types._
     private[sql] object FileSourceStrategy extends Strategy with Logging {
       def apply(plan: LogicalPlan): Seq[SparkPlan] = plan match {
         case PhysicalOperation(projects, filters, l@LogicalRelation(files: HadoopFsRelation, _, _))
    -      if (files.fileFormat.toString == "TestFileFormat" ||
    -         files.fileFormat.isInstanceOf[parquet.DefaultSource]) &&
    -         files.sqlContext.conf.parquetFileScan =>
    +      if files.fileFormat.toString == "TestFileFormat" ||
    +         (files.fileFormat.isInstanceOf[parquet.DefaultSource] &&
    +          files.sqlContext.conf.parquetFileScan) ||
    +         files.fileFormat.isInstanceOf[text.DefaultSource] =>
    --- End diff --

No need to be addressed in this PR, but we should probably have a more general way to do this kind of data source dispatching. For example, ORC is in the hive package and can't be directly referenced here.
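To make the reviewer's suggestion concrete, one possible shape for a more general dispatch is to let each format declare whether it supports the new scan path, so the planner never has to name a concrete class. This is only a sketch of that idea, not what Spark does: `FileFormatLike`, `supportsBuildReader`, `Dispatch`, and the three format objects are hypothetical names invented for illustration.

```scala
// Hypothetical sketch: none of these names are Spark APIs.
// Each format declares whether it supports the new scan path, so the
// planner does not need isInstanceOf checks against concrete classes.
trait FileFormatLike {
  def shortName: String
  // Formats opt in by overriding this; the default keeps the old code path.
  def supportsBuildReader(parquetFileScanEnabled: Boolean): Boolean = false
}

object TextFormat extends FileFormatLike {
  val shortName = "text"
  override def supportsBuildReader(parquetFileScanEnabled: Boolean): Boolean = true
}

object ParquetFormat extends FileFormatLike {
  val shortName = "parquet"
  // Mirrors the guard in the diff: Parquet is gated by a config flag.
  override def supportsBuildReader(parquetFileScanEnabled: Boolean): Boolean =
    parquetFileScanEnabled
}

// Stands in for a format defined in a module the planner cannot reference.
object OrcFormat extends FileFormatLike {
  val shortName = "orc"
}

object Dispatch {
  // The planner asks the format instead of matching on its class.
  def useNewScanPath(format: FileFormatLike, parquetFileScanEnabled: Boolean): Boolean =
    format.supportsBuildReader(parquetFileScanEnabled)
}
```

With this shape, a format living in another package (the ORC case mentioned above) would opt in by overriding one method, and the strategy's guard would not need to change.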
[GitHub] spark pull request: [SPARK-14114][SQL] implement buildReader for t...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/11934#issuecomment-201156632 **[Test build #54152 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/54152/consoleFull)** for PR 11934 at commit [`947fc70`](https://github.com/apache/spark/commit/947fc70f0d71259f3d499fffa1d71e6ec3fe3ca5).
[GitHub] spark pull request: [SPARK-14114][SQL] implement buildReader for t...
Github user cloud-fan commented on the pull request: https://github.com/apache/spark/pull/11934#issuecomment-201156127 retest this please
[GitHub] spark pull request: [SPARK-14114][SQL] implement buildReader for t...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/11934#issuecomment-201128330 Merged build finished. Test FAILed.
[GitHub] spark pull request: [SPARK-14114][SQL] implement buildReader for t...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/11934#issuecomment-201128331 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/54111/ Test FAILed.
[GitHub] spark pull request: [SPARK-14114][SQL] implement buildReader for t...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/11934#issuecomment-201128268 **[Test build #54111 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/54111/consoleFull)** for PR 11934 at commit [`947fc70`](https://github.com/apache/spark/commit/947fc70f0d71259f3d499fffa1d71e6ec3fe3ca5). * This patch **fails from timeout after a configured wait of \`250m\`**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request: [SPARK-14114][SQL] implement buildReader for t...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/11934#issuecomment-201078960 **[Test build #54111 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/54111/consoleFull)** for PR 11934 at commit [`947fc70`](https://github.com/apache/spark/commit/947fc70f0d71259f3d499fffa1d71e6ec3fe3ca5).
[GitHub] spark pull request: [SPARK-14114][SQL] implement buildReader for t...
Github user cloud-fan commented on the pull request: https://github.com/apache/spark/pull/11934#issuecomment-201077863 retest this please
[GitHub] spark pull request: [SPARK-14114][SQL] implement buildReader for t...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/11934#issuecomment-200960948 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/54042/ Test FAILed.
[GitHub] spark pull request: [SPARK-14114][SQL] implement buildReader for t...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/11934#issuecomment-200960945 Merged build finished. Test FAILed.
[GitHub] spark pull request: [SPARK-14114][SQL] implement buildReader for t...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/11934#issuecomment-200960522 **[Test build #54042 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/54042/consoleFull)** for PR 11934 at commit [`947fc70`](https://github.com/apache/spark/commit/947fc70f0d71259f3d499fffa1d71e6ec3fe3ca5). * This patch **fails from timeout after a configured wait of \`250m\`**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request: [SPARK-14114][SQL] implement buildReader for t...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/11934#issuecomment-200857874 **[Test build #54042 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/54042/consoleFull)** for PR 11934 at commit [`947fc70`](https://github.com/apache/spark/commit/947fc70f0d71259f3d499fffa1d71e6ec3fe3ca5).
[GitHub] spark pull request: [SPARK-14114][SQL] implement buildReader for t...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/11934#issuecomment-200825503 Merged build finished. Test FAILed.
[GitHub] spark pull request: [SPARK-14114][SQL] implement buildReader for t...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/11934#issuecomment-200825495 **[Test build #54040 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/54040/consoleFull)** for PR 11934 at commit [`60d7958`](https://github.com/apache/spark/commit/60d7958822660e3a6409d9bde55f0d37183e40e1). * This patch **fails Scala style tests**. * This patch merges cleanly. * This patch adds the following public classes _(experimental)_: * `class DefaultSource extends FileFormat with DataSourceRegister with Serializable `
[GitHub] spark pull request: [SPARK-14114][SQL] implement buildReader for t...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/11934#issuecomment-200825505 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/54040/ Test FAILed.
[GitHub] spark pull request: [SPARK-14114][SQL] implement buildReader for t...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/11934#issuecomment-200825003 **[Test build #54040 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/54040/consoleFull)** for PR 11934 at commit [`60d7958`](https://github.com/apache/spark/commit/60d7958822660e3a6409d9bde55f0d37183e40e1).
[GitHub] spark pull request: [SPARK-14114][SQL] implement buildReader for t...
Github user cloud-fan commented on the pull request: https://github.com/apache/spark/pull/11934#issuecomment-200824066 cc @marmbrus @liancheng
[GitHub] spark pull request: [SPARK-14114][SQL] implement buildReader for t...
GitHub user cloud-fan opened a pull request: https://github.com/apache/spark/pull/11934

[SPARK-14114][SQL] implement buildReader for text data source

## What changes were proposed in this pull request?

This PR implements buildReader for the text data source and enables it in the new data source code path.

## How was this patch tested?

Existing tests.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/cloud-fan/spark text

Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/11934.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

    This closes #11934

commit 60d7958822660e3a6409d9bde55f0d37183e40e1
Author: Wenchen Fan
Date: 2016-03-24T09:54:04Z

    implement buildReader for text data source
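As a rough illustration of what a reader for a text source amounts to, the sketch below builds a function that turns one file into an iterator of single-column rows, one per line. It is a simplified stand-in under stated assumptions rather than the code in this PR: `TextRow` and `TextReaderSketch` are hypothetical names, the file is addressed by a plain path string, and lines are read eagerly so the file handle can be closed before the iterator is returned.

```scala
import scala.io.Source

// Hypothetical stand-in for a one-column row; not Spark's row type.
final case class TextRow(value: String)

object TextReaderSketch {
  // Returns a reader function: given a file path, produce one row per line.
  // Assumption: files are UTF-8 and small enough to read eagerly.
  def buildReader(): String => Iterator[TextRow] = { path =>
    val source = Source.fromFile(path, "UTF-8")
    try {
      // Materialize the lines before closing so the iterator stays valid.
      source.getLines().map(line => TextRow(line)).toList.iterator
    } finally {
      source.close()
    }
  }
}
```

The point of returning a function rather than reading directly is that the planner can build the reader once and apply it to each file it is assigned; a production reader would stream lines lazily instead of materializing them.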