Github user steveloughran commented on the issue:
https://github.com/apache/spark/pull/17872
at a glance, patch LGTM.
---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so to do, please contact infrastructure at
infrastructure@apache.org or file a JIRA ticket with INFRA.
Github user steveloughran commented on a diff in the pull request:
https://github.com/apache/spark/pull/17872#discussion_r114985141
--- Diff:
resource-managers/yarn/src/main/scala/org/apache/spark/deploy/yarn/security/HadoopFSCredentialProvider.scala
---
@@ -81,8 +90,15
Github user steveloughran commented on a diff in the pull request:
https://github.com/apache/spark/pull/17834#discussion_r114982772
--- Diff: hadoop-cloud/pom.xml ---
@@ -0,0 +1,185 @@
+
+
+ xmlns="http://maven.apache.org/POM/4.0.0"
+ xmlns:xsi="http://www.w3.org
Github user steveloughran commented on a diff in the pull request:
https://github.com/apache/spark/pull/17834#discussion_r114982436
--- Diff: docs/cloud-integration.md ---
@@ -0,0 +1,203 @@
+---
+layout: global
+displayTitle: Integration with Cloud Infrastructures
Github user steveloughran commented on a diff in the pull request:
https://github.com/apache/spark/pull/17834#discussion_r114982357
--- Diff: docs/cloud-integration.md ---
@@ -0,0 +1,203 @@
+---
+layout: global
+displayTitle: Integration with Cloud Infrastructures
Github user steveloughran commented on a diff in the pull request:
https://github.com/apache/spark/pull/17870#discussion_r114972572
--- Diff:
yarn/src/main/scala/org/apache/spark/deploy/yarn/security/HDFSCredentialProvider.scala
---
@@ -75,8 +84,15 @@ private[security] class
Github user steveloughran commented on the issue:
https://github.com/apache/spark/pull/17834
OK, now I understand. Let me revert that bit of the patch.
Github user steveloughran commented on the issue:
https://github.com/apache/spark/pull/17834
I've just pushed up an update which changes the module name; tested in
maven and SBT; hadoop cloud JAR dependencies pulled down.
A JAR is created, it's just a stub one. As a result
Github user steveloughran commented on a diff in the pull request:
https://github.com/apache/spark/pull/17834#discussion_r114652036
--- Diff: docs/cloud-integration.md ---
@@ -0,0 +1,190 @@
+---
+layout: global
+displayTitle: Integration with Cloud Infrastructures
Github user steveloughran commented on a diff in the pull request:
https://github.com/apache/spark/pull/17834#discussion_r114646578
--- Diff: cloud/pom.xml ---
@@ -0,0 +1,106 @@
+
--- End diff --
OK
Github user steveloughran commented on a diff in the pull request:
https://github.com/apache/spark/pull/17834#discussion_r114644953
--- Diff: pom.xml ---
@@ -1145,6 +1150,70 @@
+
+
--- End diff --
OK, I'll
Github user steveloughran commented on the issue:
https://github.com/apache/spark/pull/17834
The last one was on all the doc comments, and I believe I've addressed them,
both by fixing the little typos and by focusing the docs on the main points for
Spark users: how stores differ from
GitHub user steveloughran opened a pull request:
https://github.com/apache/spark/pull/17834
[SPARK-7481] [build] Add spark-hadoop-cloud module to pull in object store
access.
## What changes were proposed in this pull request?
Add a new `spark-hadoop-cloud` module
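For readers following along: with a module like this merged, building Spark with the object-store dependencies pulled in looks roughly like the following. The `hadoop-cloud` profile name is taken from the PR discussion; exact flags vary by branch, so treat this as a sketch rather than the canonical invocation.

```shell
# Build Spark with the hadoop-cloud profile enabled so the spark-hadoop-cloud
# module and its object-store client JARs are included in the build.
# Illustrative invocation; exact profiles depend on the branch being built.
./build/mvn -Phadoop-cloud -DskipTests clean package
```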
Github user steveloughran commented on a diff in the pull request:
https://github.com/apache/spark/pull/12004#discussion_r114331264
--- Diff: docs/cloud-integration.md ---
@@ -0,0 +1,512 @@
+---
+layout: global
+displayTitle: Integration with Cloud Infrastructures
Github user steveloughran commented on a diff in the pull request:
https://github.com/apache/spark/pull/12004#discussion_r114330941
--- Diff: docs/cloud-integration.md ---
@@ -0,0 +1,512 @@
+---
+layout: global
+displayTitle: Integration with Cloud Infrastructures
Github user steveloughran commented on the issue:
https://github.com/apache/spark/pull/12004
github isn't letting me reopen this, so I'm going to submit the patch with
reworked docs as a new PR. The machines do not like me today.
Github user steveloughran commented on a diff in the pull request:
https://github.com/apache/spark/pull/12004#discussion_r114056158
--- Diff: docs/cloud-integration.md ---
@@ -0,0 +1,512 @@
+---
+layout: global
+displayTitle: Integration with Cloud Infrastructures
Github user steveloughran commented on a diff in the pull request:
https://github.com/apache/spark/pull/12004#discussion_r113997967
--- Diff: docs/cloud-integration.md ---
@@ -0,0 +1,512 @@
+---
+layout: global
+displayTitle: Integration with Cloud Infrastructures
Github user steveloughran commented on a diff in the pull request:
https://github.com/apache/spark/pull/12004#discussion_r113995508
--- Diff: docs/cloud-integration.md ---
@@ -0,0 +1,512 @@
+---
+layout: global
+displayTitle: Integration with Cloud Infrastructures
Github user steveloughran commented on a diff in the pull request:
https://github.com/apache/spark/pull/12004#discussion_r113979717
--- Diff: docs/cloud-integration.md ---
@@ -0,0 +1,512 @@
+---
+layout: global
+displayTitle: Integration with Cloud Infrastructures
Github user steveloughran commented on a diff in the pull request:
https://github.com/apache/spark/pull/12004#discussion_r113979701
--- Diff: docs/cloud-integration.md ---
@@ -0,0 +1,512 @@
+---
+layout: global
+displayTitle: Integration with Cloud Infrastructures
Github user steveloughran commented on a diff in the pull request:
https://github.com/apache/spark/pull/12004#discussion_r113976778
--- Diff: docs/cloud-integration.md ---
@@ -0,0 +1,512 @@
+---
+layout: global
+displayTitle: Integration with Cloud Infrastructures
Github user steveloughran commented on a diff in the pull request:
https://github.com/apache/spark/pull/12004#discussion_r113976861
--- Diff: docs/cloud-integration.md ---
@@ -0,0 +1,512 @@
+---
+layout: global
+displayTitle: Integration with Cloud Infrastructures
Github user steveloughran commented on a diff in the pull request:
https://github.com/apache/spark/pull/12004#discussion_r113976699
--- Diff: docs/cloud-integration.md ---
@@ -0,0 +1,512 @@
+---
+layout: global
+displayTitle: Integration with Cloud Infrastructures
Github user steveloughran commented on a diff in the pull request:
https://github.com/apache/spark/pull/12004#discussion_r113976264
--- Diff: docs/cloud-integration.md ---
@@ -0,0 +1,512 @@
+---
+layout: global
+displayTitle: Integration with Cloud Infrastructures
Github user steveloughran commented on a diff in the pull request:
https://github.com/apache/spark/pull/12004#discussion_r113971968
--- Diff: docs/cloud-integration.md ---
@@ -0,0 +1,512 @@
+---
+layout: global
+displayTitle: Integration with Cloud Infrastructures
Github user steveloughran commented on a diff in the pull request:
https://github.com/apache/spark/pull/12004#discussion_r113970275
--- Diff: docs/cloud-integration.md ---
@@ -0,0 +1,512 @@
+---
+layout: global
+displayTitle: Integration with Cloud Infrastructures
Github user steveloughran commented on a diff in the pull request:
https://github.com/apache/spark/pull/12004#discussion_r113969133
--- Diff: docs/cloud-integration.md ---
@@ -0,0 +1,512 @@
+---
+layout: global
+displayTitle: Integration with Cloud Infrastructures
Github user steveloughran commented on a diff in the pull request:
https://github.com/apache/spark/pull/12004#discussion_r113967945
--- Diff: docs/cloud-integration.md ---
@@ -0,0 +1,512 @@
+---
+layout: global
+displayTitle: Integration with Cloud Infrastructures
Github user steveloughran commented on a diff in the pull request:
https://github.com/apache/spark/pull/12004#discussion_r113962864
--- Diff: docs/cloud-integration.md ---
@@ -0,0 +1,512 @@
+---
+layout: global
+displayTitle: Integration with Cloud Infrastructures
Github user steveloughran commented on a diff in the pull request:
https://github.com/apache/spark/pull/12004#discussion_r113950929
--- Diff: docs/cloud-integration.md ---
@@ -0,0 +1,512 @@
+---
+layout: global
+displayTitle: Integration with Cloud Infrastructures
Github user steveloughran commented on a diff in the pull request:
https://github.com/apache/spark/pull/12004#discussion_r113950943
--- Diff: docs/cloud-integration.md ---
@@ -0,0 +1,512 @@
+---
+layout: global
+displayTitle: Integration with Cloud Infrastructures
Github user steveloughran commented on a diff in the pull request:
https://github.com/apache/spark/pull/12004#discussion_r113950840
--- Diff: docs/cloud-integration.md ---
@@ -0,0 +1,512 @@
+---
+layout: global
+displayTitle: Integration with Cloud Infrastructures
Github user steveloughran commented on a diff in the pull request:
https://github.com/apache/spark/pull/12004#discussion_r113750071
--- Diff: pom.xml ---
@@ -1145,6 +1150,70 @@
+
+
--- End diff --
I'm
Github user steveloughran commented on a diff in the pull request:
https://github.com/apache/spark/pull/12004#discussion_r113749971
--- Diff: pom.xml ---
@@ -621,6 +621,11 @@
${fasterxml.jackson.version
Github user steveloughran commented on a diff in the pull request:
https://github.com/apache/spark/pull/12004#discussion_r113749781
--- Diff: cloud/pom.xml ---
@@ -0,0 +1,117 @@
+
+
+ xmlns="http://maven.apache.org/POM/4.0.0"
+ xmlns:xsi="http://www.w3.org/2001/XMLS
Github user steveloughran commented on a diff in the pull request:
https://github.com/apache/spark/pull/12004#discussion_r113749812
--- Diff: docs/storage-openstack-swift.md ---
@@ -19,41 +20,32 @@ Although not mandatory, it is recommended to configure
the proxy server of Swift
Github user steveloughran commented on a diff in the pull request:
https://github.com/apache/spark/pull/12004#discussion_r113749428
--- Diff: cloud/pom.xml ---
@@ -0,0 +1,158 @@
+
+
+ xmlns="http://maven.apache.org/POM/4.0.0"
xmlns:xsi="http://www.w3.org/2001/XMLS
Github user steveloughran commented on a diff in the pull request:
https://github.com/apache/spark/pull/12004#discussion_r113725231
--- Diff: cloud/pom.xml ---
@@ -0,0 +1,117 @@
+
+
+ xmlns="http://maven.apache.org/POM/4.0.0"
+ xmlns:xsi="http://www.w3.org/2001/XMLS
Github user steveloughran commented on a diff in the pull request:
https://github.com/apache/spark/pull/12004#discussion_r113725132
--- Diff: assembly/pom.xml ---
@@ -226,5 +226,19 @@
provided
+
+
+
+ cloud
--- End
Github user steveloughran commented on the issue:
https://github.com/apache/spark/pull/14731
Reynold, I care very much about the time of reviewers; I put 1+h a day into
reviewing the hadoop codebase, generally trying to review the work of
non-colleagues, so as to pull
Github user steveloughran closed the pull request at:
https://github.com/apache/spark/pull/12004
GitHub user steveloughran opened a pull request:
https://github.com/apache/spark/pull/17747
[SPARK-11373] [CORE] Add metrics to the FsHistoryProvider
## What changes were proposed in this pull request?
This adds metrics to the history server, with the `FsHistoryProvider
Github user steveloughran commented on the issue:
https://github.com/apache/spark/pull/9571
I'm going to close this PR and start one based on a reapplication of this
patch onto master; gets rid of all the merge pain and is intended to be more
minimal. The latest comments of this one
GitHub user steveloughran opened a pull request:
https://github.com/apache/spark/pull/17745
[SPARK-17159][Streaming] optimise check for new files in FileInputDStream
## What changes were proposed in this pull request?
Changes to `FileInputDStream` to eliminate multiple
GitHub user steveloughran opened a pull request:
https://github.com/apache/spark/pull/17743
[SPARK-20448][DOCS] Document how FileInputDStream works with object storage
Change-Id: I88c272444ca734dc2cbc2592607c11287b90a383
## What changes were proposed in this pull request
Github user steveloughran closed the pull request at:
https://github.com/apache/spark/pull/14731
Github user steveloughran commented on the issue:
https://github.com/apache/spark/pull/14731
OK. What is the way? Do I write a formal proposal?
Because right now there is no reliable way to get the full dependency graph
of Spark + hadoop cloud JARs + direct cloud provider
Github user steveloughran commented on the issue:
https://github.com/apache/spark/pull/9571
> I was secretly hoping you'd just give up on this patch, since it will
generate a lot of conflicts with the code I'm working on in parallel..
No. Sorry
I do susp
Github user steveloughran commented on a diff in the pull request:
https://github.com/apache/spark/pull/9571#discussion_r111942037
--- Diff:
core/src/main/scala/org/apache/spark/deploy/history/FsHistoryProvider.scala ---
@@ -310,77 +338,87 @@ private[history] class
Github user steveloughran commented on a diff in the pull request:
https://github.com/apache/spark/pull/9571#discussion_r111934303
--- Diff:
core/src/main/scala/org/apache/spark/deploy/history/FsHistoryProvider.scala ---
@@ -729,6 +778,116 @@ private[history] class
Github user steveloughran commented on a diff in the pull request:
https://github.com/apache/spark/pull/9571#discussion_r111914872
--- Diff:
core/src/main/scala/org/apache/spark/deploy/history/ApplicationHistoryProvider.scala
---
@@ -99,6 +104,19 @@ private[history] abstract
Github user steveloughran commented on a diff in the pull request:
https://github.com/apache/spark/pull/9571#discussion_r111913717
--- Diff:
core/src/main/scala/org/apache/spark/deploy/history/ApplicationCache.scala ---
@@ -410,34 +409,25 @@ private[history] class CacheMetrics
Github user steveloughran commented on the issue:
https://github.com/apache/spark/pull/17364
thanks.
Github user steveloughran commented on a diff in the pull request:
https://github.com/apache/spark/pull/17342#discussion_r111365777
--- Diff: core/src/test/scala/org/apache/spark/util/UtilsSuite.scala ---
@@ -1021,4 +1021,19 @@ class UtilsSuite extends SparkFunSuite
Github user steveloughran commented on the issue:
https://github.com/apache/spark/pull/9571
Line lengths fixed, tests all happy.
@vanzin: any chance of adding this to your review list?
Github user steveloughran commented on a diff in the pull request:
https://github.com/apache/spark/pull/17149#discussion_r04548
--- Diff:
sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveExternalCatalog.scala ---
@@ -285,7 +285,7 @@ private[spark] class
Github user steveloughran commented on the issue:
https://github.com/apache/spark/pull/17364
@squito Is this ready to go in? Like I warned, I'm not going to add tests
for this, not on its own
Github user steveloughran commented on the issue:
https://github.com/apache/spark/pull/14731
@srowen anything else I need to do here?
Github user steveloughran commented on the issue:
https://github.com/apache/spark/pull/12004
@srowen anything else I need to do here?
Github user steveloughran commented on a diff in the pull request:
https://github.com/apache/spark/pull/17342#discussion_r110517523
--- Diff: core/src/main/scala/org/apache/spark/util/Utils.scala ---
@@ -2767,3 +2767,24 @@ private[spark] class CircularBuffer(sizeInBytes: Int
Github user steveloughran commented on a diff in the pull request:
https://github.com/apache/spark/pull/17149#discussion_r110420176
--- Diff:
sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveExternalCatalog.scala ---
@@ -285,7 +285,7 @@ private[spark] class
Github user steveloughran commented on the issue:
https://github.com/apache/spark/pull/12004
Any comments on the latest patch? Anyone?
Github user steveloughran commented on the issue:
https://github.com/apache/spark/pull/14731
Is there anything else I need to do here?
Github user steveloughran commented on the issue:
https://github.com/apache/spark/pull/17364
I don't have time/plans to do the test here, as it's a fairly complex
piece of test setup for what a review should show isn't doing anything other
than guarantee the outcome of
Github user steveloughran commented on a diff in the pull request:
https://github.com/apache/spark/pull/17342#discussion_r107381697
--- Diff: core/src/main/scala/org/apache/spark/util/Utils.scala ---
@@ -2767,3 +2767,24 @@ private[spark] class CircularBuffer(sizeInBytes: Int
Github user steveloughran commented on the issue:
https://github.com/apache/spark/pull/17364
looking some more, yes, as `tryWithSafeFinallyAndFailureCallbacks` wraps
task commit, it guarantees that the original cause doesn't get lost. The
abortJob code isn't so well guarded
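The guard being described can be sketched as follows. This is an illustrative pattern, not Spark's actual `tryWithSafeFinallyAndFailureCallbacks`: if both the body (the task commit) and the cleanup throw, the cleanup failure is attached as a suppressed exception so the original cause is not lost.

```scala
// Sketch of the exception-preserving pattern discussed above.
// `body` stands in for the task commit; `finallyBlock` is cleanup that may
// itself throw. If both throw, the original cause is rethrown and the
// cleanup failure is attached as a suppressed exception, not a replacement.
def withSafeFinally[T](body: => T)(finallyBlock: => Unit): T = {
  var original: Throwable = null
  try {
    body
  } catch {
    case t: Throwable =>
      original = t
      throw t
  } finally {
    try {
      finallyBlock
    } catch {
      // Only swallow the cleanup failure when there is an original cause
      // to preserve; otherwise the cleanup exception propagates normally.
      case t: Throwable if original != null =>
        original.addSuppressed(t)
    }
  }
}
```

The abortJob path the comment mentions lacks exactly this kind of guard: a failure inside the abort can replace the exception that triggered the abort in the first place.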
Github user steveloughran commented on a diff in the pull request:
https://github.com/apache/spark/pull/14731#discussion_r107194624
--- Diff:
streaming/src/test/scala/org/apache/spark/streaming/InputStreamsSuite.scala ---
@@ -27,7 +27,8 @@ import scala.collection.JavaConverters
Github user steveloughran commented on a diff in the pull request:
https://github.com/apache/spark/pull/14731#discussion_r107152771
--- Diff:
streaming/src/test/scala/org/apache/spark/streaming/InputStreamsSuite.scala ---
@@ -27,7 +27,8 @@ import scala.collection.JavaConverters
Github user steveloughran commented on a diff in the pull request:
https://github.com/apache/spark/pull/14731#discussion_r107152263
--- Diff:
streaming/src/test/scala/org/apache/spark/streaming/TestSuiteBase.scala ---
@@ -557,4 +557,16 @@ trait TestSuiteBase extends SparkFunSuite
Github user steveloughran commented on the issue:
https://github.com/apache/spark/pull/17364
Created [SPARK-20045](https://issues.apache.org/jira/browse/SPARK-20045). I
think there's room to improve resilience in the abort code, primarily to ensure
that the underlying failure cause
Github user steveloughran commented on the issue:
https://github.com/apache/spark/pull/17364
I haven't reviewed that bit of code: make it a separate JIRA and assign to
me. This one I came across in the HADOOP-2.8.0 RC3 testing; the underlying fix
there is going in, but the spark code
Github user steveloughran commented on the issue:
https://github.com/apache/spark/pull/12004
The latest patch embraces the fact that 2.6 is the base hadoop version so
the `hadoop-aws` JAR is always pulled in and dependencies set up. One thing to
bear in mind here is that the [Phase I
Github user steveloughran commented on the issue:
https://github.com/apache/spark/pull/14731
Any more comments?
Github user steveloughran commented on the issue:
https://github.com/apache/spark/pull/17364
Note that as [the exception
handler](https://github.com/apache/spark/blob/master/sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/FileFormatWriter.scala#L244)
tries to close
GitHub user steveloughran opened a pull request:
https://github.com/apache/spark/pull/17364
[SPARK-20038] [core]: move the currentWriter=null assignments into finally
{} …
## What changes were proposed in this pull request?
have
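The shape of the fix the PR title describes, clearing the writer reference inside `finally {}` so that a failing `close()` cannot leave a stale reference behind, can be sketched like this (illustrative class and method names, not the actual `FileFormatWriter` code):

```scala
// Illustrative sketch: releasing a writer so that even if close() throws,
// the reference is nulled in finally {} and cannot be closed or written twice.
class RecordSink {
  private var currentWriter: java.io.Closeable = _

  def setWriter(w: java.io.Closeable): Unit = { currentWriter = w }

  def releaseResources(): Unit = {
    if (currentWriter != null) {
      try {
        currentWriter.close()
      } finally {
        // the assignment moved into finally {}: it runs whether or not
        // close() throws, so no later code sees a half-closed writer
        currentWriter = null
      }
    }
  }

  def hasWriter: Boolean = currentWriter != null
}
```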
Github user steveloughran commented on a diff in the pull request:
https://github.com/apache/spark/pull/17342#discussion_r107001274
--- Diff: core/src/main/scala/org/apache/spark/util/Utils.scala ---
@@ -2767,3 +2767,24 @@ private[spark] class CircularBuffer(sizeInBytes: Int
Github user steveloughran commented on the issue:
https://github.com/apache/spark/pull/12004
I haven't forgotten this; I've just been trying to make the module
POM-only, while adding support for Hadoop 2.6 builds, which is causing some
issues downstream. Specifically, my downstream
Github user steveloughran commented on the issue:
https://github.com/apache/spark/pull/17163
FWIW, if there is something related to serialization that people should be
pushing for in Hadoop 3, it is making all the little types serializable, such
as `Path`, `FileStatus` and the like
Github user steveloughran commented on the issue:
https://github.com/apache/spark/pull/17163
oh, this sucks. Find anyone who experienced "The great protobuf update of
2012" and ask them if they want to do it again.
Looking at the issues, AVRO-997 catches out &
Github user steveloughran commented on the issue:
https://github.com/apache/spark/pull/14731
The Hadoop FS Spec has now been updated to declare exactly what HDFS does
w.r.t timestamps, and warn that what other filesystems and object stores do are
implementation and installation
Github user steveloughran commented on the issue:
https://github.com/apache/spark/pull/17163
Check with @busbey about binary compatibility with older generated/compiled
classes; that's the recurrent problem with protobuf
Github user steveloughran commented on the issue:
https://github.com/apache/spark/pull/17163
Looking @ hadoop source, there's not much use of `import org.apache.avro`
in hadoop-common, but the `avro.Utf8` type surfaces, and someone has tagged
`fs.Path` as `@Stringable`, which
Github user steveloughran commented on the issue:
https://github.com/apache/spark/pull/17163
It's invariably the transient stuff, isn't it?
Mvnrepository on Avro 1.8.1 lists [jackson as a compile-time
dependency](http://mvnrepository.com/artifact/org.apache.avro/avro/1.8.1);
that's
Github user steveloughran commented on the issue:
https://github.com/apache/spark/pull/12004
comments?
Github user steveloughran commented on a diff in the pull request:
https://github.com/apache/spark/pull/17120#discussion_r104286483
--- Diff:
sql/core/src/test/scala/org/apache/spark/sql/streaming/FileStreamSourceSuite.scala
---
@@ -1253,8 +1253,26 @@ class FileStreamSourceSuite
Github user steveloughran commented on the issue:
https://github.com/apache/spark/pull/17120
I know that's the *current* use case, but I'm thinking about future
confusion, especially as the use case you espoused, "move from s3n to s3a
within the same window" is
Github user steveloughran commented on the issue:
https://github.com/apache/spark/pull/17080
thanks. One thing I realised last night is that logging the session token,
even at debug level, would have been a security risk. So it's very good that
the log statement got cut, even
Github user steveloughran commented on the issue:
https://github.com/apache/spark/pull/17080
@srowen don't worry, been tracking this: I filed the JIRA. Core code is good
(i.e. property/env var names).
One thing to bear in mind, the existing code propagates the env vars even
Github user steveloughran commented on the issue:
https://github.com/apache/spark/pull/17120
-1, non binding
I understand the rationale for this, to aid migration from s3/s3n to s3a,
but given the need is schema independence, you should be using the full path
name from
Github user steveloughran commented on the issue:
https://github.com/apache/spark/pull/9571
Style police. FWIW I think the lines that failed were already >100 chars;
it was just that they got indented slightly more.
```
Scalastyle checks failed at following occurrences:
[er
Github user steveloughran commented on the issue:
https://github.com/apache/spark/pull/17080
I agree. I was just checking the files to make sure the strings were
consistent/correct, rather than trusting the documentation
Github user steveloughran commented on the issue:
https://github.com/apache/spark/pull/17080
LGTM. Verified option name in `org.apache.hadoop.fs.s3a.Constants` file;
env var name in `com.amazonaws.SDKGlobalConfiguration`
Github user steveloughran commented on a diff in the pull request:
https://github.com/apache/spark/pull/16990#discussion_r103185158
--- Diff:
sql/hive/src/test/resources/ql/src/test/queries/clientpositive/smb_mapjoin_25.q
---
@@ -19,7 +19,7 @@ select * from (select a.key from
Github user steveloughran commented on a diff in the pull request:
https://github.com/apache/spark/pull/14731#discussion_r103184528
--- Diff: docs/streaming-programming-guide.md ---
@@ -615,35 +615,114 @@ which creates a DStream from text
data received over a TCP socket
Github user steveloughran commented on a diff in the pull request:
https://github.com/apache/spark/pull/14731#discussion_r103183646
--- Diff:
streaming/src/main/scala/org/apache/spark/streaming/dstream/FileInputDStream.scala
---
@@ -140,7 +137,7 @@ class FileInputDStream[K, V, F
Github user steveloughran commented on a diff in the pull request:
https://github.com/apache/spark/pull/16990#discussion_r103183030
--- Diff:
sql/hive/src/test/resources/ql/src/test/queries/clientpositive/smb_mapjoin_25.q
---
@@ -19,7 +19,7 @@ select * from (select a.key from
Github user steveloughran commented on the issue:
https://github.com/apache/spark/pull/14601
1. It's good to have some tests
2. I note that `appendS3AndSparkHadoopConfigurations()` has a weakness in
how it propagates env vars: no propagation of the session environment
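To illustrate the gap being pointed out: a propagation helper that copies AWS credential env vars into Hadoop-style `fs.s3a.*` settings needs to include the session token as well, or temporary STS credentials won't work. A minimal sketch, with a plain map standing in for Hadoop's `Configuration` and a hypothetical helper name (the `fs.s3a.*` keys are the real s3a option names):

```scala
// Sketch of the env-var propagation under discussion: copying AWS credentials
// from the process environment into fs.s3a.* settings. A mutable Map stands
// in for Hadoop's Configuration; the helper itself is hypothetical.
def appendAwsCredentials(
    env: Map[String, String],
    conf: scala.collection.mutable.Map[String, String]): Unit = {
  for (key <- env.get("AWS_ACCESS_KEY_ID")) conf("fs.s3a.access.key") = key
  for (secret <- env.get("AWS_SECRET_ACCESS_KEY")) conf("fs.s3a.secret.key") = secret
  // the missing piece the comment refers to: without the session token,
  // temporary (STS) credentials from the session environment are unusable
  for (token <- env.get("AWS_SESSION_TOKEN")) conf("fs.s3a.session.token") = token
}
```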
Github user steveloughran commented on the issue:
https://github.com/apache/spark/pull/16990
LGTM, though you'd have to go do the full coverage to verify that there's
not a typo in any of the strings. This is why although Spark has adopted the
more readable inline strings, I'm more
Github user steveloughran commented on the issue:
https://github.com/apache/spark/pull/14601
spark.hadoop.fs.* would work.
The (not yet shipped in ASF code) Azure Data Lake FS has, for reasons I
don't know and have only just noticed, adopted "dfs.adl" as their prefi
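The `spark.hadoop.fs.*` mechanism referred to here works by stripping the `spark.hadoop.` prefix and copying the remainder into the Hadoop configuration. A minimal sketch of that translation (hypothetical helper, not Spark's actual code):

```scala
// Minimal sketch of the spark.hadoop.* convention: any Spark conf entry
// with that prefix becomes a Hadoop configuration entry, prefix removed.
def hadoopEntries(sparkConf: Map[String, String]): Map[String, String] =
  sparkConf.collect {
    case (k, v) if k.startsWith("spark.hadoop.") =>
      k.stripPrefix("spark.hadoop.") -> v
  }
```

So, for example, setting `spark.hadoop.fs.s3a.endpoint` in the Spark conf lands in the Hadoop configuration as `fs.s3a.endpoint`, which is why the mechanism covers any filesystem prefix, including ones like "dfs.adl" chosen by third-party connectors.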