[GitHub] spark issue #17640: [SPARK-17608][SPARKR]:Long type has incorrect serializat...

2017-04-14 Thread frreiss
Github user frreiss commented on the issue: https://github.com/apache/spark/pull/17640 Overall, this looks like a sensible approach to a messy problem. You might want to think about adding some overflow handling to the SQL-->R translation. That is, if a DataFrame conta

[GitHub] spark issue #15027: [SPARK-17475] [STREAMING] Delete CRC files if the filesy...

2016-11-02 Thread frreiss
Github user frreiss commented on the issue: https://github.com/apache/spark/pull/15027 @viirya to answer your question re deleting vs moving the files: Deleting is easier to implement, because once the .crc file is deleted, you can be sure it won't appear again. Moving the checksum

[GitHub] spark issue #15027: [SPARK-17475] [STREAMING] Delete CRC files if the filesy...

2016-10-27 Thread frreiss
Github user frreiss commented on the issue: https://github.com/apache/spark/pull/15027 When I comment out line 155 in HDFSMetadataLog.scala on this branch (`if (fileManager.exists(crcPath)) fileManager.delete(crcPath)`) and run the test case attached to this PR, the test case fails

[GitHub] spark issue #15162: [SPARK-17386] [STREAMING] [WIP] Make polling rate adapti...

2016-10-26 Thread frreiss
Github user frreiss commented on the issue: https://github.com/apache/spark/pull/15162 Closing the PR. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so

[GitHub] spark pull request #15162: [SPARK-17386] [STREAMING] [WIP] Make polling rate...

2016-10-26 Thread frreiss
Github user frreiss closed the pull request at: https://github.com/apache/spark/pull/15162

[GitHub] spark issue #14553: [SPARK-16963] [STREAMING] [SQL] Changes to Source trait ...

2016-10-26 Thread frreiss
Github user frreiss commented on the issue: https://github.com/apache/spark/pull/14553 Updated the branch and addressed new review comments. Looks like my last push missed a one-line change to memory.scala. Tests are running now.

[GitHub] spark pull request #14553: [SPARK-16963] [STREAMING] [SQL] Changes to Source...

2016-10-26 Thread frreiss
Github user frreiss commented on a diff in the pull request: https://github.com/apache/spark/pull/14553#discussion_r85227714 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/memory.scala --- @@ -111,6 +126,23 @@ case class MemoryStream[A : Encoder](id

[GitHub] spark pull request #14553: [SPARK-16963] [STREAMING] [SQL] Changes to Source...

2016-10-26 Thread frreiss
Github user frreiss commented on a diff in the pull request: https://github.com/apache/spark/pull/14553#discussion_r85227658 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/memory.scala --- @@ -111,6 +126,23 @@ case class MemoryStream[A : Encoder](id

[GitHub] spark pull request #14553: [SPARK-16963] [STREAMING] [SQL] Changes to Source...

2016-10-21 Thread frreiss
Github user frreiss commented on a diff in the pull request: https://github.com/apache/spark/pull/14553#discussion_r84569335 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/StreamExecution.scala --- @@ -337,17 +343,27 @@ class StreamExecution

[GitHub] spark pull request #14553: [SPARK-16963] [STREAMING] [SQL] Changes to Source...

2016-10-21 Thread frreiss
Github user frreiss commented on a diff in the pull request: https://github.com/apache/spark/pull/14553#discussion_r84539662 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/StreamExecution.scala --- @@ -336,17 +342,27 @@ class StreamExecution

[GitHub] spark pull request #14553: [SPARK-16963] [STREAMING] [SQL] Changes to Source...

2016-10-21 Thread frreiss
Github user frreiss commented on a diff in the pull request: https://github.com/apache/spark/pull/14553#discussion_r84538526 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/StreamExecution.scala --- @@ -336,17 +342,27 @@ class StreamExecution

[GitHub] spark issue #14553: [SPARK-16963] [STREAMING] [SQL] Changes to Source trait ...

2016-10-21 Thread frreiss
Github user frreiss commented on the issue: https://github.com/apache/spark/pull/14553 I've been running tests since this morning; should have updates in soon.

[GitHub] spark pull request #14553: [SPARK-16963] [STREAMING] [SQL] Changes to Source...

2016-10-19 Thread frreiss
Github user frreiss commented on a diff in the pull request: https://github.com/apache/spark/pull/14553#discussion_r84138507 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/Source.scala --- @@ -30,16 +30,30 @@ trait Source { /** Returns

[GitHub] spark issue #14553: [SPARK-16963] [STREAMING] [SQL] Changes to Source trait ...

2016-10-17 Thread frreiss
Github user frreiss commented on the issue: https://github.com/apache/spark/pull/14553 All my changes are in now, and regression tests pass. As far as I can see, all the review comments have been addressed at this point.

[GitHub] spark issue #14553: [SPARK-16963] [STREAMING] [SQL] Changes to Source trait ...

2016-10-14 Thread frreiss
Github user frreiss commented on the issue: https://github.com/apache/spark/pull/14553 Sorry, I missed the last few email notifications about this PR. I've merged with the head version and made updates to address the most recent round of review comments. Currently running regression

[GitHub] spark pull request #14553: [SPARK-16963] [STREAMING] [SQL] Changes to Source...

2016-10-14 Thread frreiss
Github user frreiss commented on a diff in the pull request: https://github.com/apache/spark/pull/14553#discussion_r83524491 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/socket.scala --- @@ -92,21 +105,64 @@ class TextSocketSource(host: String, port

[GitHub] spark pull request #14553: [SPARK-16963] [STREAMING] [SQL] Changes to Source...

2016-10-14 Thread frreiss
Github user frreiss commented on a diff in the pull request: https://github.com/apache/spark/pull/14553#discussion_r83524216 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/Source.scala --- @@ -30,16 +30,37 @@ trait Source { /** Returns

[GitHub] spark pull request #15352: [SPARK-17780][SQL]Report Throwable to user in Str...

2016-10-05 Thread frreiss
Github user frreiss commented on a diff in the pull request: https://github.com/apache/spark/pull/15352#discussion_r82085912 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/StreamExecution.scala --- @@ -207,13 +207,18 @@ class StreamExecution

[GitHub] spark pull request #15307: [WIP][SPARK-17731][SQL][STREAMING] Metrics for st...

2016-10-03 Thread frreiss
Github user frreiss commented on a diff in the pull request: https://github.com/apache/spark/pull/15307#discussion_r81684775 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/StreamExecution.scala --- @@ -136,16 +139,30 @@ class StreamExecution

[GitHub] spark pull request #15307: [WIP][SPARK-17731][SQL][STREAMING] Metrics for st...

2016-10-03 Thread frreiss
Github user frreiss commented on a diff in the pull request: https://github.com/apache/spark/pull/15307#discussion_r81672432 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/StreamExecution.scala --- @@ -317,15 +358,18 @@ class StreamExecution

[GitHub] spark pull request #15307: [WIP][SPARK-17731][SQL][STREAMING] Metrics for st...

2016-10-03 Thread frreiss
Github user frreiss commented on a diff in the pull request: https://github.com/apache/spark/pull/15307#discussion_r81672040 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/StreamExecution.scala --- @@ -136,16 +139,30 @@ class StreamExecution

[GitHub] spark pull request #15307: [WIP][SPARK-17731][SQL][STREAMING] Metrics for st...

2016-10-03 Thread frreiss
Github user frreiss commented on a diff in the pull request: https://github.com/apache/spark/pull/15307#discussion_r81668841 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/StatefulAggregate.scala --- @@ -56,7 +57,12 @@ case class StateStoreRestoreExec

[GitHub] spark issue #15262: [SPARK-17690][STREAMING][SQL] Add mini-dfs cluster based...

2016-09-28 Thread frreiss
Github user frreiss commented on the issue: https://github.com/apache/spark/pull/15262 LGTM overall. We may want to switch more of the test cases to use HDFS in a follow-on JIRA.

[GitHub] spark issue #15258: [SPARK-17689][SQL][STREAMING] added excludeFiles option ...

2016-09-27 Thread frreiss
Github user frreiss commented on the issue: https://github.com/apache/spark/pull/15258 This change allows FileInputStream to consume partial outputs of a system such as Hadoop or another copy of Spark, provided that the system adheres rigidly to the write policy of recent versions

[GitHub] spark pull request #15258: [SPARK-17689][SQL][STREAMING] added excludeFiles ...

2016-09-27 Thread frreiss
Github user frreiss commented on a diff in the pull request: https://github.com/apache/spark/pull/15258#discussion_r80838376 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/ListingFileCatalog.scala --- @@ -50,6 +50,19 @@ class ListingFileCatalog

[GitHub] spark pull request #15262: [SPARK-17690][STREAMING][SQL] Add mini-dfs cluste...

2016-09-27 Thread frreiss
Github user frreiss commented on a diff in the pull request: https://github.com/apache/spark/pull/15262#discussion_r80826485 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/streaming/FileStreamSourceSuite.scala --- @@ -330,15 +353,42 @@ class FileStreamSourceSuite extends

[GitHub] spark issue #15005: [SPARK-17421] [DOCS] Documenting the current treatment o...

2016-09-23 Thread frreiss
Github user frreiss commented on the issue: https://github.com/apache/spark/pull/15005 Thanks @srowen for all the thoughtful comments! It's great to see committers spending time to help improve the build experience for new developers.

[GitHub] spark pull request #15005: [SPARK-17421] [DOCS] Documenting the current trea...

2016-09-21 Thread frreiss
Github user frreiss commented on a diff in the pull request: https://github.com/apache/spark/pull/15005#discussion_r79902326 --- Diff: docs/building-spark.md --- @@ -16,24 +16,32 @@ Building Spark using Maven requires Maven 3.3.9 or newer and Java 7+. ### Setting up

[GitHub] spark pull request #15005: [SPARK-17421] [DOCS] Documenting the current trea...

2016-09-21 Thread frreiss
Github user frreiss commented on a diff in the pull request: https://github.com/apache/spark/pull/15005#discussion_r79766509 --- Diff: docs/building-spark.md --- @@ -16,24 +16,27 @@ Building Spark using Maven requires Maven 3.3.9 or newer and Java 7+. ### Setting up

[GitHub] spark pull request #15005: [SPARK-17421] [DOCS] Documenting the current trea...

2016-09-21 Thread frreiss
Github user frreiss commented on a diff in the pull request: https://github.com/apache/spark/pull/15005#discussion_r79765892 --- Diff: docs/building-spark.md --- @@ -16,24 +16,31 @@ Building Spark using Maven requires Maven 3.3.9 or newer and Java 7+. ### Setting up

[GitHub] spark issue #15005: [SPARK-17421] [DOCS] Documenting the current treatment o...

2016-09-21 Thread frreiss
Github user frreiss commented on the issue: https://github.com/apache/spark/pull/15005 Summary of testing: - On Java 8, the build fails intermittently with OOM when `-Xmx2g` is omitted - The `-XX:ReservedCodeCacheSize=512m` argument prevents warnings on both Java 7

[GitHub] spark pull request #15166: [SPARK-17513][SQL] Make StreamExecution garbage-c...

2016-09-20 Thread frreiss
Github user frreiss commented on a diff in the pull request: https://github.com/apache/spark/pull/15166#discussion_r79730904 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/streaming/StreamingQuerySuite.scala --- @@ -125,6 +125,30 @@ class StreamingQuerySuite extends

[GitHub] spark pull request #15166: [SPARK-17513][SQL] Make StreamExecution garbage-c...

2016-09-20 Thread frreiss
Github user frreiss commented on a diff in the pull request: https://github.com/apache/spark/pull/15166#discussion_r79730664 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/streaming/StreamingQuerySuite.scala --- @@ -125,6 +125,30 @@ class StreamingQuerySuite extends

[GitHub] spark pull request #15166: [SPARK-17513][SQL] Make StreamExecution garbage-c...

2016-09-20 Thread frreiss
Github user frreiss commented on a diff in the pull request: https://github.com/apache/spark/pull/15166#discussion_r79722643 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/streaming/StreamingQuerySuite.scala --- @@ -125,6 +125,30 @@ class StreamingQuerySuite extends

[GitHub] spark pull request #15162: [SPARK-17386] [STREAMING] [WIP] Make polling rate...

2016-09-20 Thread frreiss
GitHub user frreiss opened a pull request: https://github.com/apache/spark/pull/15162 [SPARK-17386] [STREAMING] [WIP] Make polling rate adaptive ## What changes were proposed in this pull request? This change makes the scheduler in `StreamExecution` adjust its rate

[GitHub] spark pull request #15067: [SPARK-17513] [STREAMING] [SQL] Make StreamExecut...

2016-09-20 Thread frreiss
Github user frreiss commented on a diff in the pull request: https://github.com/apache/spark/pull/15067#discussion_r79662093 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/streaming/StreamingQuerySuite.scala --- @@ -125,6 +125,32 @@ class StreamingQuerySuite extends

[GitHub] spark pull request #15067: [SPARK-17513] [STREAMING] [SQL] Make StreamExecut...

2016-09-20 Thread frreiss
Github user frreiss closed the pull request at: https://github.com/apache/spark/pull/15067

[GitHub] spark issue #15005: [SPARK-17421] [DOCS] Documenting the current treatment o...

2016-09-20 Thread frreiss
Github user frreiss commented on the issue: https://github.com/apache/spark/pull/15005 I've about narrowed down the options that work for OpenJDK 7 and 8 on Mac and Linux. Working on IBM Java on Linux. I can have an update in by EOD today. BTW, one thing that's been slowing me

[GitHub] spark issue #15005: [SPARK-17421] [DOCS] Documenting the current treatment o...

2016-09-14 Thread frreiss
Github user frreiss commented on the issue: https://github.com/apache/spark/pull/15005 Quick update: I'm running a series of test builds with various parameters to determine what parts of MAVEN_OPTS are currently necessary on different versions of Java. Will report back in a few days

[GitHub] spark issue #13513: [SPARK-15698][SQL][Streaming] Add the ability to remove ...

2016-09-13 Thread frreiss
Github user frreiss commented on the issue: https://github.com/apache/spark/pull/13513 Ah, now I fully understand @zsxwing's earlier comment about the semantics of `Source.getBatch()`. Those semantics have a design flaw; see the email thread I started at http

[GitHub] spark pull request #15067: [SPARK-17513] [STREAMING] [SQL] Make StreamExecut...

2016-09-12 Thread frreiss
GitHub user frreiss opened a pull request: https://github.com/apache/spark/pull/15067 [SPARK-17513] [STREAMING] [SQL] Make StreamExecution garbage-collect its metadata ## What changes were proposed in this pull request? This PR modifies StreamExecution

[GitHub] spark issue #13513: [SPARK-15698][SQL][Streaming] Add the ability to remove ...

2016-09-12 Thread frreiss
Github user frreiss commented on the issue: https://github.com/apache/spark/pull/13513 You could just move the metadata deletion logic from FileStreamSinkLog into CompactibleFileStreamLog. Then FileStreamSource could issue DELETE log records for files that are older than

[GitHub] spark pull request #15027: [SPARK-17475] [STREAMING] Delete CRC files if the...

2016-09-12 Thread frreiss
Github user frreiss commented on a diff in the pull request: https://github.com/apache/spark/pull/15027#discussion_r78469982 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/HDFSMetadataLog.scala --- @@ -146,6 +146,11 @@ class HDFSMetadataLog[T: ClassTag

[GitHub] spark issue #15005: [SPARK-17421] [DOCS] Documenting the current treatment o...

2016-09-09 Thread frreiss
Github user frreiss commented on the issue: https://github.com/apache/spark/pull/15005 Sure, I'll redo that part so that it includes two sets of recommended options. Note that docs in the Spark 2.0.0 release say that these options aren't necessary for Java 8.

[GitHub] spark pull request #15027: [SPARK-17475] [STREAMING] Delete CRC files if the...

2016-09-09 Thread frreiss
GitHub user frreiss opened a pull request: https://github.com/apache/spark/pull/15027 [SPARK-17475] [STREAMING] Delete CRC files if the filesystem doesn't use checksum files ## What changes were proposed in this pull request? When the metadata logs for various parts

[GitHub] spark pull request #14945: [SPARK-17386] Set default trigger interval to 1/1...

2016-09-07 Thread frreiss
Github user frreiss closed the pull request at: https://github.com/apache/spark/pull/14945

[GitHub] spark issue #14945: [SPARK-17386] Set default trigger interval to 1/10 secon...

2016-09-07 Thread frreiss
Github user frreiss commented on the issue: https://github.com/apache/spark/pull/14945 On a closer reading of the code, there is a more expedient fix; change the default STREAMING_POLLING_DELAY parameter. Will redo.

[GitHub] spark pull request #15005: [SPARK-17421] Documenting the current treatment o...

2016-09-07 Thread frreiss
GitHub user frreiss opened a pull request: https://github.com/apache/spark/pull/15005 [SPARK-17421] Documenting the current treatment of MAVEN_OPTS. ## What changes were proposed in this pull request? Modified the documentation to clarify that `build/mvn` and `pom.xml

[GitHub] spark pull request #14986: [WIP] [SPARK-17421] Don't use -XX:MaxPermSize opt...

2016-09-07 Thread frreiss
Github user frreiss closed the pull request at: https://github.com/apache/spark/pull/14986

[GitHub] spark issue #14986: [WIP] [SPARK-17421] Don't use -XX:MaxPermSize option whe...

2016-09-07 Thread frreiss
Github user frreiss commented on the issue: https://github.com/apache/spark/pull/14986 Makes sense. I will close this PR and just add a clarification to the documentation.

[GitHub] spark pull request #14986: [WIP] [SPARK-17421] Don't use -XX:MaxPermSize opt...

2016-09-06 Thread frreiss
GitHub user frreiss opened a pull request: https://github.com/apache/spark/pull/14986 [WIP] [SPARK-17421] Don't use -XX:MaxPermSize option when Java version >= 8 ## What changes were proposed in this pull request? Modifies the `build/mvn` and `build/sbt-launch-lib.b

[GitHub] spark pull request #14945: [SPARK-17386] Set default trigger interval to 1/1...

2016-09-02 Thread frreiss
GitHub user frreiss opened a pull request: https://github.com/apache/spark/pull/14945 [SPARK-17386] Set default trigger interval to 1/10 second ## What changes were proposed in this pull request? This pull request implements the most expedient change to fix SPARK-17386

[GitHub] spark issue #14553: [SPARK-16963] Changes to Source trait and related implem...

2016-08-31 Thread frreiss
Github user frreiss commented on the issue: https://github.com/apache/spark/pull/14553 @ScrapCodes, would you mind triggering a build of this PR?

[GitHub] spark issue #14803: [SPARK-17153][SQL] Should read partition data when readi...

2016-08-30 Thread frreiss
Github user frreiss commented on the issue: https://github.com/apache/spark/pull/14803 LGTM

[GitHub] spark pull request #14870: [SPARK-17303] Added spark-warehouse to dev/.rat-e...

2016-08-29 Thread frreiss
GitHub user frreiss opened a pull request: https://github.com/apache/spark/pull/14870 [SPARK-17303] Added spark-warehouse to dev/.rat-excludes ## What changes were proposed in this pull request? Excludes the `spark-warehouse` directory from the Apache RAT checks that src

[GitHub] spark pull request #14803: [SPARK-17153][SQL] Should read partition data whe...

2016-08-29 Thread frreiss
Github user frreiss commented on a diff in the pull request: https://github.com/apache/spark/pull/14803#discussion_r76646983 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/FileStreamSource.scala --- @@ -129,13 +129,20 @@ class FileStreamSource

[GitHub] spark issue #14553: [SPARK-16963] Changes to Source trait and related implem...

2016-08-29 Thread frreiss
Github user frreiss commented on the issue: https://github.com/apache/spark/pull/14553 @rxin and @marmbrus, would it be possible to get this PR reviewed soon? I can split it into smaller chunks if that would make things easier; I just need to know.

[GitHub] spark pull request #14691: [SPARK-16407][STREAMING] Allow users to supply cu...

2016-08-26 Thread frreiss
Github user frreiss commented on a diff in the pull request: https://github.com/apache/spark/pull/14691#discussion_r76505064 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/streaming/DataStreamWriter.scala --- @@ -123,12 +124,30 @@ final class DataStreamWriter[T] private

[GitHub] spark pull request #14691: [SPARK-16407][STREAMING] Allow users to supply cu...

2016-08-26 Thread frreiss
Github user frreiss commented on a diff in the pull request: https://github.com/apache/spark/pull/14691#discussion_r76504239 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/streaming/DataStreamWriter.scala --- @@ -123,12 +124,30 @@ final class DataStreamWriter[T] private

[GitHub] spark pull request #14773: [SPARK-17203][SQL] data source options should alw...

2016-08-26 Thread frreiss
Github user frreiss commented on a diff in the pull request: https://github.com/apache/spark/pull/14773#discussion_r76503740 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/command/createDataSourceTables.scala --- @@ -65,7 +65,7 @@ case class

[GitHub] spark issue #14802: [SPARK-17235][SQL] Support purging of old logs in Metada...

2016-08-26 Thread frreiss
Github user frreiss commented on the issue: https://github.com/apache/spark/pull/14802 LGTM. I have written nearly the exact same thing as part of [https://github.com/apache/spark/pull/14553], but can use this version of the method instead.

[GitHub] spark pull request #13513: [SPARK-15698][SQL][Streaming] Add the ability to ...

2016-08-26 Thread frreiss
Github user frreiss commented on a diff in the pull request: https://github.com/apache/spark/pull/13513#discussion_r76499068 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/FileStreamSource.scala --- @@ -129,3 +131,86 @@ class FileStreamSource

[GitHub] spark pull request #14553: [SPARK-16963] Changes to Source trait and related...

2016-08-26 Thread frreiss
Github user frreiss commented on a diff in the pull request: https://github.com/apache/spark/pull/14553#discussion_r76498637 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/streaming/FileStreamSourceSuite.scala --- @@ -727,6 +732,48 @@ class FileStreamSourceSuite extends

[GitHub] spark pull request #14553: [SPARK-16963] Changes to Source trait and related...

2016-08-26 Thread frreiss
Github user frreiss commented on a diff in the pull request: https://github.com/apache/spark/pull/14553#discussion_r76498301 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/socket.scala --- @@ -24,21 +24,24 @@ import java.text.SimpleDateFormat

[GitHub] spark pull request #14553: [SPARK-16963] Changes to Source trait and related...

2016-08-26 Thread frreiss
Github user frreiss commented on a diff in the pull request: https://github.com/apache/spark/pull/14553#discussion_r76498251 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/StreamExecution.scala --- @@ -244,6 +250,21 @@ class StreamExecution

[GitHub] spark pull request #14553: [SPARK-16963] Changes to Source trait and related...

2016-08-26 Thread frreiss
Github user frreiss commented on a diff in the pull request: https://github.com/apache/spark/pull/14553#discussion_r76498223 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/MetadataLog.scala --- @@ -48,4 +49,13 @@ trait MetadataLog[T] { * Return

[GitHub] spark issue #14553: [WIP] [SPARK-16963] Initial version of changes to Source...

2016-08-22 Thread frreiss
Github user frreiss commented on the issue: https://github.com/apache/spark/pull/14553 These changes are now ready for review. The contents of this PR pass regression tests on my machines. Can one of the committers please start a Jenkins build?

[GitHub] spark pull request #14151: [SPARK-16496][SQL] Add wholetext as option for re...

2016-08-15 Thread frreiss
Github user frreiss commented on a diff in the pull request: https://github.com/apache/spark/pull/14151#discussion_r74805700 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/text/TextSuite.scala --- @@ -39,6 +39,11 @@ class TextSuite extends QueryTest

[GitHub] spark pull request #14151: [SPARK-16496][SQL] Add wholetext as option for re...

2016-08-15 Thread frreiss
Github user frreiss commented on a diff in the pull request: https://github.com/apache/spark/pull/14151#discussion_r74804217 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala --- @@ -533,6 +533,12 @@ object SQLConf { .timeConf

[GitHub] spark pull request #14553: [WIP] [SPARK-16963] Initial version of changes to...

2016-08-08 Thread frreiss
GitHub user frreiss opened a pull request: https://github.com/apache/spark/pull/14553 [WIP] [SPARK-16963] Initial version of changes to Source trait ## What changes were proposed in this pull request? Initial proposed changes to the Source trait such that the scheduler can

[GitHub] spark issue #13155: [SPARK-15370] [SQL] Update RewriteCorrelatedScalarSubque...

2016-06-10 Thread frreiss
Github user frreiss commented on the issue: https://github.com/apache/spark/pull/13155 Tests ran successfully on my machine.

[GitHub] spark issue #13155: [SPARK-15370] [SQL] Update RewriteCorrelatedScalarSubque...

2016-06-10 Thread frreiss
Github user frreiss commented on the issue: https://github.com/apache/spark/pull/13155 Updated changes are in. Running a full regression suite overnight.

[GitHub] spark pull request #13155: [SPARK-15370] [SQL] Update RewriteCorrelatedScala...

2016-06-09 Thread frreiss
Github user frreiss commented on a diff in the pull request: https://github.com/apache/spark/pull/13155#discussion_r66564793 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala --- @@ -1695,16 +1695,176 @@ object

[GitHub] spark pull request #13155: [SPARK-15370] [SQL] Update RewriteCorrelatedScala...

2016-06-09 Thread frreiss
Github user frreiss commented on a diff in the pull request: https://github.com/apache/spark/pull/13155#discussion_r66561815 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala --- @@ -1695,16 +1695,176 @@ object

[GitHub] spark pull request #13155: [SPARK-15370] [SQL] Update RewriteCorrelatedScala...

2016-06-09 Thread frreiss
Github user frreiss commented on a diff in the pull request: https://github.com/apache/spark/pull/13155#discussion_r66561017 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala --- @@ -1695,16 +1695,176 @@ object

[GitHub] spark pull request #13155: [SPARK-15370] [SQL] Update RewriteCorrelatedScala...

2016-06-09 Thread frreiss
Github user frreiss commented on a diff in the pull request: https://github.com/apache/spark/pull/13155#discussion_r66560947 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala --- @@ -1695,16 +1695,176 @@ object

[GitHub] spark pull request #13155: [SPARK-15370] [SQL] Update RewriteCorrelatedScala...

2016-06-09 Thread frreiss
Github user frreiss commented on a diff in the pull request: https://github.com/apache/spark/pull/13155#discussion_r66560868 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala --- @@ -1695,16 +1695,176 @@ object

[GitHub] spark pull request #13155: [SPARK-15370] [SQL] Update RewriteCorrelatedScala...

2016-06-09 Thread frreiss
Github user frreiss commented on a diff in the pull request: https://github.com/apache/spark/pull/13155#discussion_r66558119 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala --- @@ -1695,16 +1695,176 @@ object

[GitHub] spark pull request #13155: [SPARK-15370] [SQL] Update RewriteCorrelatedScala...

2016-06-09 Thread frreiss
Github user frreiss commented on a diff in the pull request: https://github.com/apache/spark/pull/13155#discussion_r66558125 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala --- @@ -1695,16 +1695,176 @@ object

[GitHub] spark pull request #13155: [SPARK-15370] [SQL] Update RewriteCorrelatedScala...

2016-06-09 Thread frreiss
Github user frreiss commented on a diff in the pull request: https://github.com/apache/spark/pull/13155#discussion_r66539170 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala --- @@ -1695,16 +1695,176 @@ object

[GitHub] spark pull request #13155: [SPARK-15370] [SQL] Update RewriteCorrelatedScala...

2016-06-09 Thread frreiss
Github user frreiss commented on a diff in the pull request: https://github.com/apache/spark/pull/13155#discussion_r66509510 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala --- @@ -1695,16 +1695,176 @@ object

[GitHub] spark pull request #13155: [SPARK-15370] [SQL] Update RewriteCorrelatedScala...

2016-06-09 Thread frreiss
Github user frreiss commented on a diff in the pull request: https://github.com/apache/spark/pull/13155#discussion_r66509444 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala --- @@ -1695,16 +1695,176 @@ object

[GitHub] spark pull request #13155: [SPARK-15370] [SQL] Update RewriteCorrelatedScala...

2016-06-09 Thread frreiss
Github user frreiss commented on a diff in the pull request: https://github.com/apache/spark/pull/13155#discussion_r66509405 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala --- @@ -1695,16 +1695,176 @@ object

[GitHub] spark issue #13155: [SPARK-15370] [SQL] Update RewriteCorrelatedScalarSubque...

2016-06-09 Thread frreiss
Github user frreiss commented on the issue: https://github.com/apache/spark/pull/13155 @rxin I'll have an updated set of changes in tonight

[GitHub] spark pull request #13155: [SPARK-15370] [SQL] Update RewriteCorrelatedScala...

2016-06-09 Thread frreiss
Github user frreiss commented on a diff in the pull request: https://github.com/apache/spark/pull/13155#discussion_r66478261 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala --- @@ -1695,16 +1695,176 @@ object

[GitHub] spark pull request #13155: [SPARK-15370] [SQL] Update RewriteCorrelatedScala...

2016-06-02 Thread frreiss
Github user frreiss commented on a diff in the pull request: https://github.com/apache/spark/pull/13155#discussion_r65584546 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala --- @@ -1695,16 +1695,176 @@ object

[GitHub] spark pull request #13155: [SPARK-15370] [SQL] Update RewriteCorrelatedScala...

2016-06-02 Thread frreiss
Github user frreiss commented on a diff in the pull request: https://github.com/apache/spark/pull/13155#discussion_r65583548 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala --- @@ -1695,16 +1695,176 @@ object

[GitHub] spark pull request #13155: [SPARK-15370] [SQL] Update RewriteCorrelatedScala...

2016-06-01 Thread frreiss
Github user frreiss commented on a diff in the pull request: https://github.com/apache/spark/pull/13155#discussion_r65454461 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala --- @@ -1695,16 +1695,176 @@ object

[GitHub] spark pull request: [SPARK-15370] [SQL] Update RewriteCorrelatedScalarSubque...

2016-05-31 Thread frreiss
Github user frreiss commented on the pull request: https://github.com/apache/spark/pull/13155 Thanks @hvanhovell for the additional pass of review! I'll be preparing my slides for Spark Summit all day today but will come back to this PR as soon as that's done.

[GitHub] spark pull request: [SPARK-15370] [SQL] Update RewriteCorrelatedSc...

2016-05-28 Thread frreiss
Github user frreiss commented on a diff in the pull request: https://github.com/apache/spark/pull/13155#discussion_r64996012 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala --- @@ -1695,16 +1695,176 @@ object

[GitHub] spark pull request: [SPARK-15370] [SQL] Update RewriteCorrelatedSc...

2016-05-28 Thread frreiss
Github user frreiss commented on a diff in the pull request: https://github.com/apache/spark/pull/13155#discussion_r64995985 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala --- @@ -1695,16 +1695,176 @@ object

[GitHub] spark pull request: [SPARK-15370] [SQL] Update RewriteCorrelatedSc...

2016-05-27 Thread frreiss
Github user frreiss commented on a diff in the pull request: https://github.com/apache/spark/pull/13155#discussion_r64944724 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala --- @@ -1695,16 +1695,176 @@ object

[GitHub] spark pull request: [SPARK-15370] [SQL] Update RewriteCorrelatedSc...

2016-05-27 Thread frreiss
Github user frreiss commented on a diff in the pull request: https://github.com/apache/spark/pull/13155#discussion_r64943404 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala --- @@ -1695,16 +1695,176 @@ object

[GitHub] spark pull request: [SPARK-15370] [SQL] Update RewriteCorrelatedSc...

2016-05-27 Thread frreiss
Github user frreiss commented on a diff in the pull request: https://github.com/apache/spark/pull/13155#discussion_r64942870 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala --- @@ -1695,16 +1695,176 @@ object

[GitHub] spark pull request: [SPARK-15370] [SQL] Update RewriteCorrelatedSc...

2016-05-27 Thread frreiss
Github user frreiss commented on a diff in the pull request: https://github.com/apache/spark/pull/13155#discussion_r64942480 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala --- @@ -1695,16 +1695,176 @@ object

[GitHub] spark pull request: [SPARK-15370] [SQL] Update RewriteCorrelatedSc...

2016-05-27 Thread frreiss
Github user frreiss commented on a diff in the pull request: https://github.com/apache/spark/pull/13155#discussion_r64941953 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala --- @@ -1695,16 +1695,176 @@ object

[GitHub] spark pull request: [SPARK-15370] [SQL] Update RewriteCorrelatedSc...

2016-05-24 Thread frreiss
Github user frreiss commented on the pull request: https://github.com/apache/spark/pull/13155#issuecomment-221336991 Could one of the committers please trigger another build on this PR? The change set passes all the tests on my machine, but it's good to be safe.

[GitHub] spark pull request: [SPARK-15370] [SQL] Update RewriteCorrelatedSc...

2016-05-20 Thread frreiss
Github user frreiss commented on the pull request: https://github.com/apache/spark/pull/13155#issuecomment-220684859 I've added changes to cover two more cases that @hvanhovell pointed out in review, plus one further case that came up while fixing the first two

[GitHub] spark pull request: [SPARK-15370] [SQL] Update RewriteCorrelatedSc...

2016-05-19 Thread frreiss
Github user frreiss commented on a diff in the pull request: https://github.com/apache/spark/pull/13155#discussion_r63954767 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala --- @@ -1648,16 +1648,56 @@ object

[GitHub] spark pull request: [SPARK-15370] [SQL] Update RewriteCorrelatedSc...

2016-05-19 Thread frreiss
Github user frreiss commented on a diff in the pull request: https://github.com/apache/spark/pull/13155#discussion_r63933593 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala --- @@ -1648,16 +1648,56 @@ object
