Github user steveloughran commented on a diff in the pull request:
https://github.com/apache/spark/pull/9571#discussion_r76773046
--- Diff:
core/src/main/scala/org/apache/spark/deploy/history/FsHistoryProvider.scala ---
@@ -667,6 +700,90 @@ private[history] class FsHistoryProvider
Github user steveloughran commented on the issue:
https://github.com/apache/spark/pull/9571
Sorry, didn't see that one; will fix.
Github user steveloughran commented on a diff in the pull request:
https://github.com/apache/spark/pull/14835#discussion_r76764190
--- Diff:
core/src/test/scala/org/apache/spark/deploy/history/HistoryServerSuite.scala ---
@@ -100,6 +100,7 @@ class HistoryServerSuite extends
Github user steveloughran commented on the issue:
https://github.com/apache/spark/pull/14718
LevelDB is JNI, so you can't shade it; there's been some careful review to keep
YARN NMs and the Spark shuffle in sync here. It's Jackson versions that
break things.
Github user steveloughran commented on a diff in the pull request:
https://github.com/apache/spark/pull/14731#discussion_r76593850
--- Diff: core/src/main/scala/org/apache/spark/deploy/SparkHadoopUtil.scala
---
@@ -244,6 +244,31 @@ class SparkHadoopUtil extends Logging
Github user steveloughran commented on the issue:
https://github.com/apache/spark/pull/9571
Test failures are timeout related; unlikely to be due to this patch
```
Test Result (2 failures / +2)
org.apache.spark.sql.hive.HiveSparkSubmitSuite.dir
Github user steveloughran commented on a diff in the pull request:
https://github.com/apache/spark/pull/9571#discussion_r76515026
--- Diff:
core/src/main/scala/org/apache/spark/deploy/history/HistoryServer.scala ---
@@ -225,14 +274,26 @@ class HistoryServer
Github user steveloughran commented on a diff in the pull request:
https://github.com/apache/spark/pull/14731#discussion_r76514866
--- Diff: core/src/main/scala/org/apache/spark/deploy/SparkHadoopUtil.scala
---
@@ -244,6 +244,31 @@ class SparkHadoopUtil extends Logging
GitHub user steveloughran reopened a pull request:
https://github.com/apache/spark/pull/9571
[SPARK-11373] [CORE] Add metrics to the History Server and FsHistoryProvider
This adds metrics to the history server, with the `FsHistoryProvider`
metering its load, performance and
Github user steveloughran closed the pull request at:
https://github.com/apache/spark/pull/9571
Github user steveloughran commented on the issue:
https://github.com/apache/spark/pull/14835
I'll let other people review the source code in detail, except note that
people calling the REST API may want to ask for all entries, even if the web
view asks for fewer.
1. the
Github user steveloughran commented on a diff in the pull request:
https://github.com/apache/spark/pull/14835#discussion_r76491726
--- Diff: dev/.rat-excludes ---
@@ -101,3 +101,4 @@ org.apache.spark.scheduler.ExternalClusterManager
.*\.sql
.Rbuildignore
Github user steveloughran commented on a diff in the pull request:
https://github.com/apache/spark/pull/14827#discussion_r76401596
--- Diff: pom.xml ---
@@ -2511,8 +2511,11 @@
hadoop-2.7
+
--- End diff --
How can I set this profile
Github user steveloughran commented on the issue:
https://github.com/apache/spark/pull/14827
This patch tries to set the default version to 2.7; I'll see if SBT picks
it up.
This is *not* something I'm proposing for the final merge; there I expect
people t
GitHub user steveloughran opened a pull request:
https://github.com/apache/spark/pull/14827
[SPARK-17259] [build] [WiP] Hadoop 2.7 profile to depend on Hadoop 2.7.3
## What changes were proposed in this pull request?
increment the `hadoop.version` value in the `hadoop-2.7
Github user steveloughran commented on the issue:
https://github.com/apache/spark/pull/14731
Having looked at the source code, `FileSystem.globStatus()` uses glob
patterns, which are not the same as POSIX regexps.
[org.apache.hadoop.fs.GlobPattern](http://grepcode.com
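A minimal Scala sketch of the distinction, with a hypothetical path; Hadoop globs use `*`, `?`, `[abc]` and `{a,b}`, so a POSIX regexp such as `data-.+\.csv` would be taken literally rather than expanded:
```
import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.fs.{FileStatus, FileSystem, Path}

// globStatus expands Hadoop glob syntax only; it can return null when the
// parent path does not exist, hence the Option wrapper.
val fs = FileSystem.get(new Configuration())
val statuses = Option(fs.globStatus(new Path("/logs/data-*.csv"))) // hypothetical path
  .getOrElse(Array.empty[FileStatus])
statuses.foreach(s => println(s.getPath))
```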
Github user steveloughran commented on a diff in the pull request:
https://github.com/apache/spark/pull/14731#discussion_r76110009
--- Diff: docs/streaming-programming-guide.md ---
@@ -644,13 +644,39 @@ methods for creating DStreams from files as input
sources
Github user steveloughran commented on a diff in the pull request:
https://github.com/apache/spark/pull/14731#discussion_r76105141
--- Diff: docs/streaming-programming-guide.md ---
@@ -644,13 +644,39 @@ methods for creating DStreams from files as input
sources
Github user steveloughran commented on a diff in the pull request:
https://github.com/apache/spark/pull/14731#discussion_r75945926
--- Diff:
streaming/src/main/scala/org/apache/spark/streaming/dstream/FileInputDStream.scala
---
@@ -196,29 +192,33 @@ class FileInputDStream[K, V, F
Github user steveloughran commented on a diff in the pull request:
https://github.com/apache/spark/pull/14731#discussion_r75945790
--- Diff: docs/streaming-programming-guide.md ---
@@ -644,13 +644,39 @@ methods for creating DStreams from files as input
sources
Github user steveloughran commented on the issue:
https://github.com/apache/spark/pull/14731
The logic has got complex enough that it merits unit tests. Pulling it into
SparkHadoopUtil itself and writing some for the possible cases: simple path, glob
matches one, glob matches 1+, glob doesn't
Github user steveloughran commented on the issue:
https://github.com/apache/spark/pull/14731
1. Updated the code to bypass the glob routine when there is no wildcard;
this bypasses something fairly inefficient (a sketch of the fast path follows below).
1. Reporting FNFE on that base dir differently; skip the stack trace
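A sketch of that fast path; the helper name and the glob-character set are illustrative, not the patch's actual code:
```
import org.apache.hadoop.fs.{FileStatus, FileSystem, Path}

// Only pay for glob expansion when the path actually contains glob
// characters; a plain path needs just one listStatus call.
def listMatching(fs: FileSystem, path: Path): Array[FileStatus] = {
  val hasGlob = path.toString.exists(c => "{}[]*?\\".indexOf(c) >= 0)
  if (hasGlob) {
    Option(fs.globStatus(path)).getOrElse(Array.empty[FileStatus])
  } else {
    fs.listStatus(path) // throws FileNotFoundException if the path is absent
  }
}
```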
Github user steveloughran commented on the issue:
https://github.com/apache/spark/pull/14731
I've now done the [s3a streaming
test/example](https://github.com/steveloughran/spark/blob/features/SPARK-7481-cloud/cloud/src/main/scala/org/apache/spark/cloud/s3/examples/S3Streaming.
Github user steveloughran commented on the issue:
https://github.com/apache/spark/pull/14731
Actually, I've just noticed that DStream behaviour isn't in sync with the
streaming programming guide, which says "(files written in nested directories
not supported)".
Github user steveloughran commented on the issue:
https://github.com/apache/spark/pull/14731
LGTM. I was trying to see if there was a way to create a good test here by
triggering the takes-too-long codepath and having a counter, but there's no
obvious way to do that deterministi
Github user steveloughran commented on the issue:
https://github.com/apache/spark/pull/14038
There's no performance problem from filtering just on names. It's when
people try to filter on more complex things (file type, timestamp) that they need
to call `getFileStatus(path)`
Github user steveloughran commented on the issue:
https://github.com/apache/spark/pull/14038
Oh, I don't want to take on any more work... I just think you should make
the predicate passed in something that goes `FileStatus => Boolean` instead of
`String => Boolean`, an
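A sketch of what that signature change could look like; the helper is hypothetical, not the PR's actual code:
```
import org.apache.hadoop.fs.{FileStatus, FileSystem, Path}

// A FileStatus => Boolean predicate lets callers filter on file type or
// timestamp using data the listing already returned, with no extra RPCs.
def listFiltered(fs: FileSystem, dir: Path)(p: FileStatus => Boolean): Seq[FileStatus] =
  fs.listStatus(dir).filter(p).toSeq

// Name-only filtering still works; richer filters now cost nothing extra:
// listFiltered(fs, dir)(s => !s.isDirectory && s.getModificationTime > cutoff)
```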
Github user steveloughran commented on the issue:
https://github.com/apache/spark/pull/14659
Chris: maybe the CallerContext class could check for bad characters,
including spaces, newlines, "," and quotation marks... the usual things that
break parsers.
Th
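A hedged sketch of such a check; the object and method names are hypothetical, not part of Hadoop's actual CallerContext API:
```
object CallerContextValidation {
  // characters that typically break downstream log and audit parsers
  private val forbidden: Set[Char] = Set(' ', '\t', '\n', '\r', ',', '"', '\'')

  def isValid(context: String): Boolean =
    context.nonEmpty && !context.exists(forbidden.contains)
}

// CallerContextValidation.isValid("spark_app_123") == true
// CallerContextValidation.isValid("bad,value")     == false
```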
Github user steveloughran commented on the issue:
https://github.com/apache/spark/pull/14659
Having some problems adding you as a contributor: JIRA scale issues,
browser problems &c. I've asked others to try and do it. Start with the coding;
I'll sort out the con
Github user steveloughran commented on the issue:
https://github.com/apache/spark/pull/14601
I'd like to propose that the list of filesystem properties to propagate is
actually defined as a list in a Spark property; the default could be "fs.s3a,
fs.s3n, fs.s3, fs.swift, fs.w
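A sketch of how that proposal might read in code; the property name and the trimmed-down default list here are illustrative assumptions:
```
import org.apache.spark.SparkConf

val conf = new SparkConf()
// hypothetical property naming the prefixes to propagate
val prefixes = conf
  .get("spark.hadoop.propagated.filesystems", "fs.s3a,fs.s3n,fs.s3,fs.swift")
  .split(",").map(_.trim).filter(_.nonEmpty)

// Spark settings matching any prefix would then be copied into the Hadoop
// Configuration handed to the filesystem clients.
val toPropagate = conf.getAll.filter { case (k, _) => prefixes.exists(p => k.startsWith(p)) }
```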
Github user steveloughran commented on a diff in the pull request:
https://github.com/apache/spark/pull/14601#discussion_r75584298
--- Diff: core/src/main/scala/org/apache/spark/deploy/SparkHadoopUtil.scala
---
@@ -107,6 +107,14 @@ class SparkHadoopUtil extends Logging
Github user steveloughran commented on the issue:
https://github.com/apache/spark/pull/12695
As #13868 does adopt `org.apache.hadoop.fs.Path`, I don't see this patch
being needed, though it may highlight some places where the new code may need
applying
Github user steveloughran commented on the issue:
https://github.com/apache/spark/pull/12695
If you are working with Windows paths, Hadoop's Path class contains the
code to do this, stabilised and addressing the corner cases
Github user steveloughran commented on a diff in the pull request:
https://github.com/apache/spark/pull/14601#discussion_r75584303
--- Diff: core/src/main/scala/org/apache/spark/deploy/SparkHadoopUtil.scala
---
@@ -102,11 +102,19 @@ class SparkHadoopUtil extends Logging
Github user steveloughran commented on the issue:
https://github.com/apache/spark/pull/14718
Moving the jackson/leveldb dependencies isn't going to create problems on
the YARN shuffle CP, is it? Given the versions aren't changing, I'm not too
worried; I just wa
Github user steveloughran commented on the issue:
https://github.com/apache/spark/pull/14659
That CallerContext class doesn't list Spark as one of the users in its
LimitedPrivate scope. Add a Hadoop patch there and I'll get it in. This avoids
arguments later when someone brea
Github user steveloughran commented on the issue:
https://github.com/apache/spark/pull/14038
Path filtering in Hadoop FS calls on anything other than the filename is very
suboptimal; in #14731 you can see where the filtering has been postponed until
after the listing, when the full
Github user steveloughran commented on a diff in the pull request:
https://github.com/apache/spark/pull/14731#discussion_r75584026
--- Diff:
streaming/src/main/scala/org/apache/spark/streaming/dstream/FileInputDStream.scala
---
@@ -293,8 +290,8 @@ class FileInputDStream[K, V, F
Github user steveloughran commented on a diff in the pull request:
https://github.com/apache/spark/pull/14731#discussion_r75584030
--- Diff:
streaming/src/main/scala/org/apache/spark/streaming/dstream/FileInputDStream.scala
---
@@ -241,16 +233,21 @@ class FileInputDStream[K, V, F
Github user steveloughran commented on the issue:
https://github.com/apache/spark/pull/14731
To be precise: the caching of file modification times is superfluous. It's
there to avoid the cost of executing `getFileStatus()` on previously scanned
files. Once you use the FileS
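A sketch of the point, with a hypothetical helper: the FileStatus entries from a single listing already carry modification times, so no per-file `getFileStatus()` calls, and hence no mod-time cache, are needed:
```
import org.apache.hadoop.fs.{FileSystem, Path}

def newFilesSince(fs: FileSystem, dir: Path, thresholdMs: Long): Seq[Path] =
  fs.listStatus(dir) // one RPC; each FileStatus includes the mod time
    .filter(s => !s.isDirectory && s.getModificationTime >= thresholdMs)
    .map(_.getPath)
    .toSeq
```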
Github user steveloughran commented on the issue:
https://github.com/apache/spark/pull/14731
> I'm going to scan through and tune them elsewhere; really I'm going by
> uses of the listFiles calls

There's actually no significant use elsewhere that I can see; ju
GitHub user steveloughran opened a pull request:
https://github.com/apache/spark/pull/14731
[SPARK-17159] [streaming]: optimise check for new files in FileInputDStream
## What changes were proposed in this pull request?
This PR optimises the filesystem metadata reads in
Github user steveloughran commented on the issue:
https://github.com/apache/spark/pull/14371
Rebased the patch against master; addressed @vanzin's comments. The
`mkdirs()` change in `HDFSBackedStateStoreProvider` was done after reviewing code
in Hadoop, esp. HDFS and RawLocal. Whe
Github user steveloughran commented on a diff in the pull request:
https://github.com/apache/spark/pull/14371#discussion_r75114999
--- Diff:
streaming/src/main/scala/org/apache/spark/streaming/util/FileBasedWriteAheadLog.scala
---
@@ -231,13 +232,17 @@ private[streaming] class
Github user steveloughran commented on a diff in the pull request:
https://github.com/apache/spark/pull/14371#discussion_r75114918
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/state/HDFSBackedStateStoreProvider.scala
---
@@ -443,6 +445,9 @@ private
Github user steveloughran commented on a diff in the pull request:
https://github.com/apache/spark/pull/14371#discussion_r75114636
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/state/HDFSBackedStateStoreProvider.scala
---
@@ -278,14 +278,15 @@ private
Github user steveloughran commented on a diff in the pull request:
https://github.com/apache/spark/pull/9571#discussion_r74963037
--- Diff:
core/src/main/scala/org/apache/spark/deploy/history/HistoryServer.scala ---
@@ -226,6 +259,135 @@ class HistoryServer
Github user steveloughran commented on a diff in the pull request:
https://github.com/apache/spark/pull/9571#discussion_r74962913
--- Diff:
core/src/main/scala/org/apache/spark/deploy/history/HistoryServer.scala ---
@@ -226,6 +259,135 @@ class HistoryServer
Github user steveloughran commented on a diff in the pull request:
https://github.com/apache/spark/pull/9571#discussion_r74962863
--- Diff:
core/src/main/scala/org/apache/spark/deploy/history/HistoryServer.scala ---
@@ -226,6 +259,135 @@ class HistoryServer
Github user steveloughran commented on a diff in the pull request:
https://github.com/apache/spark/pull/9571#discussion_r74960995
--- Diff:
core/src/main/scala/org/apache/spark/deploy/history/HistoryServer.scala ---
@@ -226,6 +259,135 @@ class HistoryServer
Github user steveloughran commented on a diff in the pull request:
https://github.com/apache/spark/pull/9571#discussion_r74911975
--- Diff:
core/src/main/scala/org/apache/spark/deploy/history/HistoryServer.scala ---
@@ -114,28 +123,45 @@ class HistoryServer(
* this UI with
Github user steveloughran commented on a diff in the pull request:
https://github.com/apache/spark/pull/9571#discussion_r74910950
--- Diff:
core/src/main/scala/org/apache/spark/deploy/history/FsHistoryProvider.scala ---
@@ -667,6 +710,123 @@ private[history] class
Github user steveloughran commented on a diff in the pull request:
https://github.com/apache/spark/pull/9571#discussion_r74910734
--- Diff:
core/src/main/scala/org/apache/spark/deploy/history/FsHistoryProvider.scala ---
@@ -667,6 +710,123 @@ private[history] class
Github user steveloughran commented on a diff in the pull request:
https://github.com/apache/spark/pull/9571#discussion_r74910796
--- Diff:
core/src/main/scala/org/apache/spark/deploy/history/FsHistoryProvider.scala ---
@@ -667,6 +710,123 @@ private[history] class
Github user steveloughran commented on the issue:
https://github.com/apache/spark/pull/14646
I'm adding the ability to test against staged releases, such as Hadoop
2.7.3 RC1. With this profile added, testing that Spark runs with the new RC is a
matter of setting the version with a -
Github user steveloughran commented on the issue:
https://github.com/apache/spark/pull/14646
I'd be against making it the default for a few reasons:
1. You don't want to accidentally pick up some staging artifact or upstream
snapshot.
2. I don't know how SBT/Ivy
Github user steveloughran commented on the issue:
https://github.com/apache/spark/pull/14646
Note that Jenkins, being SBT-based, isn't going to explore the codepath here
GitHub user steveloughran opened a pull request:
https://github.com/apache/spark/pull/14646
[SPARK-17058] [build] Add maven snapshots-and-staging profile to build/test
against staging artifacts
## What changes were proposed in this pull request?
Adds a `snapshots-and
Github user steveloughran commented on a diff in the pull request:
https://github.com/apache/spark/pull/13830#discussion_r74684998
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/ListingFileCatalog.scala
---
@@ -73,21 +73,67 @@ class
Github user steveloughran commented on a diff in the pull request:
https://github.com/apache/spark/pull/14371#discussion_r74684971
--- Diff:
streaming/src/main/scala/org/apache/spark/streaming/util/FileBasedWriteAheadLog.scala
---
@@ -231,13 +232,17 @@ private[streaming] class
Github user steveloughran commented on the issue:
https://github.com/apache/spark/pull/14371
Pulled the WiP; happy for full reviews, though I'm on vacation right now,
so can't handle feedback just yet
Github user steveloughran commented on a diff in the pull request:
https://github.com/apache/spark/pull/14371#discussion_r72452587
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/state/HDFSBackedStateStoreProvider.scala
---
@@ -340,13 +341,15 @@ private
Github user steveloughran commented on a diff in the pull request:
https://github.com/apache/spark/pull/14371#discussion_r72451806
--- Diff:
core/src/main/scala/org/apache/spark/scheduler/EventLoggingListener.scala ---
@@ -301,14 +303,23 @@ private[spark] object
Github user steveloughran commented on a diff in the pull request:
https://github.com/apache/spark/pull/14371#discussion_r72451702
--- Diff:
core/src/main/scala/org/apache/spark/scheduler/EventLoggingListener.scala ---
@@ -90,8 +90,13 @@ private[spark] class EventLoggingListener
Github user steveloughran commented on a diff in the pull request:
https://github.com/apache/spark/pull/14371#discussion_r72451554
--- Diff:
core/src/main/scala/org/apache/spark/rdd/ReliableCheckpointRDD.scala ---
@@ -240,7 +248,7 @@ private[spark] object ReliableCheckpointRDD
Github user steveloughran commented on a diff in the pull request:
https://github.com/apache/spark/pull/14371#discussion_r72450542
--- Diff:
core/src/main/scala/org/apache/spark/rdd/ReliableCheckpointRDD.scala ---
@@ -166,17 +166,25 @@ private[spark] object ReliableCheckpointRDD
Github user steveloughran commented on a diff in the pull request:
https://github.com/apache/spark/pull/14371#discussion_r72446633
--- Diff: core/src/main/scala/org/apache/spark/SparkContext.scala ---
@@ -1410,10 +1410,12 @@ class SparkContext(config: SparkConf) extends
Logging
Github user steveloughran commented on a diff in the pull request:
https://github.com/apache/spark/pull/14371#discussion_r72429947
--- Diff: core/src/main/scala/org/apache/spark/SparkContext.scala ---
@@ -1410,10 +1410,12 @@ class SparkContext(config: SparkConf) extends
Logging
Github user steveloughran commented on a diff in the pull request:
https://github.com/apache/spark/pull/14371#discussion_r72426957
--- Diff:
core/src/main/scala/org/apache/spark/scheduler/EventLoggingListener.scala ---
@@ -301,14 +303,23 @@ private[spark] object
Github user steveloughran commented on a diff in the pull request:
https://github.com/apache/spark/pull/14371#discussion_r72426791
--- Diff:
core/src/main/scala/org/apache/spark/rdd/ReliableCheckpointRDD.scala ---
@@ -166,17 +166,25 @@ private[spark] object ReliableCheckpointRDD
Github user steveloughran commented on a diff in the pull request:
https://github.com/apache/spark/pull/14371#discussion_r72426474
--- Diff: core/src/main/scala/org/apache/spark/deploy/SparkHadoopUtil.scala
---
@@ -404,6 +404,27 @@ class SparkHadoopUtil extends Logging
Github user steveloughran commented on a diff in the pull request:
https://github.com/apache/spark/pull/14371#discussion_r72425952
--- Diff: core/src/main/scala/org/apache/spark/SparkContext.scala ---
@@ -1410,10 +1410,12 @@ class SparkContext(config: SparkConf) extends
Logging
Github user steveloughran commented on a diff in the pull request:
https://github.com/apache/spark/pull/14371#discussion_r72302178
--- Diff:
core/src/main/scala/org/apache/spark/scheduler/EventLoggingListener.scala ---
@@ -90,8 +90,13 @@ private[spark] class EventLoggingListener
GitHub user steveloughran opened a pull request:
https://github.com/apache/spark/pull/14371
[SPARK-16736] WiP Core+ SQL superfluous fs calls
## What changes were proposed in this pull request?
A review of the code, working back from Hadoop's `FileSystem.exists()
Github user steveloughran commented on a diff in the pull request:
https://github.com/apache/spark/pull/13830#discussion_r72242750
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/ListingFileCatalog.scala
---
@@ -73,21 +73,67 @@ class
Github user steveloughran commented on the issue:
https://github.com/apache/spark/pull/14163
LGTM. Clarifies that it is yarn-cluster mode only, not client mode.
Github user steveloughran commented on the issue:
https://github.com/apache/spark/pull/9571
This patch adds separate average values of the load times vs merge times
per event; this shows a ~2x difference between replay and load in the test case.
These `.time` gauges are little
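A hedged sketch of this kind of instrumentation with the Dropwizard/Codahale metrics library Spark uses; the metric names are illustrative, not the patch's actual keys:
```
import com.codahale.metrics.MetricRegistry

val registry = new MetricRegistry()
val loadTimer = registry.timer("history.event.load.time")   // illustrative name
val mergeTimer = registry.timer("history.event.merge.time") // illustrative name

// Each Timer tracks a duration distribution, so per-event averages fall
// out of its snapshot.
val ctx = loadTimer.time()
try {
  // ... load one event from the log ...
} finally {
  ctx.stop()
}
println(s"mean load time (ns): ${loadTimer.getSnapshot.getMean}")
```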
Github user steveloughran commented on the issue:
https://github.com/apache/spark/pull/12004
Downgrading to a WIP: to work reliably it needs
[HADOOP-12636](https://issues.apache.org/jira/browse/HADOOP-12636) in the
Hadoop code, else the presence of `hadoop-aws.jar` on the CP without
Github user steveloughran commented on a diff in the pull request:
https://github.com/apache/spark/pull/13218#discussion_r67489509
--- Diff:
core/src/main/scala/org/apache/spark/deploy/rest/RestCsrfPreventionFilter.scala
---
@@ -0,0 +1,56 @@
+/*
+ * Licensed to the Apache
Github user steveloughran commented on the issue:
https://github.com/apache/spark/pull/13218
I can see there is fear of breaking things, especially with third party
clients. There's also the risk of cross-version submissions; the REST API is
meant to be stable enough for back
Github user steveloughran commented on the issue:
https://github.com/apache/spark/pull/9571
One other metric set I'm thinking of relates to a JIRA on app UIs not being
visible: making the time of the last scan a metric, both as an epoch time and as
a diff from the current time. That would
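A sketch of that metric pair; the names are illustrative and the wiring into a Spark metrics Source is omitted:
```
import java.util.concurrent.atomic.AtomicLong
import com.codahale.metrics.{Gauge, MetricRegistry}

val registry = new MetricRegistry()
val lastScanTime = new AtomicLong(0L) // updated at the end of each scan

// the scan time as an epoch timestamp...
registry.register("history.scan.last.time", new Gauge[Long] {
  override def getValue: Long = lastScanTime.get()
})
// ...and as an age relative to the current time
registry.register("history.scan.last.age", new Gauge[Long] {
  override def getValue: Long = System.currentTimeMillis() - lastScanTime.get()
})
```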
Github user steveloughran commented on a diff in the pull request:
https://github.com/apache/spark/pull/9571#discussion_r67326183
--- Diff:
core/src/main/scala/org/apache/spark/deploy/history/FsHistoryProvider.scala ---
@@ -395,7 +429,8 @@ private[history] class FsHistoryProvider
Github user steveloughran commented on a diff in the pull request:
https://github.com/apache/spark/pull/9571#discussion_r67325987
--- Diff:
core/src/main/scala/org/apache/spark/deploy/history/ApplicationHistoryProvider.scala
---
@@ -110,3 +127,87 @@ private[history] abstract
Github user steveloughran commented on the issue:
https://github.com/apache/spark/pull/9571
Updated patch. Addresses indentation; found and eliminated one more call to
`initialize()` outside of the constructor.
Adds a whole new counter, `event.replay.count`, which counts the
Github user steveloughran commented on a diff in the pull request:
https://github.com/apache/spark/pull/9571#discussion_r66614402
--- Diff:
core/src/main/scala/org/apache/spark/deploy/history/FsHistoryProvider.scala ---
@@ -667,6 +700,90 @@ private[history] class FsHistoryProvider
Github user steveloughran commented on a diff in the pull request:
https://github.com/apache/spark/pull/9571#discussion_r66613829
--- Diff:
core/src/main/scala/org/apache/spark/deploy/history/FsHistoryProvider.scala ---
@@ -278,6 +303,9 @@ private[history] class FsHistoryProvider
Github user steveloughran commented on a diff in the pull request:
https://github.com/apache/spark/pull/9571#discussion_r66612954
--- Diff:
core/src/main/scala/org/apache/spark/deploy/history/HistoryServer.scala ---
@@ -114,28 +123,45 @@ class HistoryServer(
* this UI with
Github user steveloughran commented on the issue:
https://github.com/apache/spark/pull/7786
> But I'm just trying to point out that the current change doesn't really
make things better. Without killing the executor, you'll still be holding on to
resources, except n
GitHub user steveloughran opened a pull request:
https://github.com/apache/spark/pull/13579
[SPARK-15844] [core] HistoryServer doesn't come up if spark.authenticate =
true
## What changes were proposed in this pull request?
During history server startup, the
Github user steveloughran commented on the issue:
https://github.com/apache/spark/pull/7786
@vanzin I suspect that if you get told you are being pre-empted, you aren't
likely to get containers elsewhere; pre-emption is a sign of demand being too
high, and your queue lower pri
Github user steveloughran commented on a diff in the pull request:
https://github.com/apache/spark/pull/9571#discussion_r66204561
--- Diff:
core/src/main/scala/org/apache/spark/deploy/history/HistoryServer.scala ---
@@ -114,28 +123,45 @@ class HistoryServer(
* this UI with
Github user steveloughran commented on the pull request:
https://github.com/apache/spark/pull/11033#issuecomment-221970852
thanks
Github user steveloughran commented on the pull request:
https://github.com/apache/spark/pull/13217#issuecomment-220592819
1. It's nice to see someone sitting down to deal with the Windows test
problem.
1. Hadoop 2.8+ will fail meaningfully here, with an exception includ
Github user steveloughran commented on a diff in the pull request:
https://github.com/apache/spark/pull/13217#discussion_r64032005
--- Diff: R/WINDOWS.md ---
@@ -11,3 +11,19 @@ include Rtools and R in `PATH`.
directory in Maven in `PATH`.
4. Set `MAVEN_OPTS` as described
Github user steveloughran commented on the pull request:
https://github.com/apache/spark/pull/12004#issuecomment-217536672
For anyone trying to run these tests, they'll need a test XML file and to
refer to it
```
mvn test -Phadoop-2.6 -Dcloud.test.configuration
Github user steveloughran commented on the pull request:
https://github.com/apache/spark/pull/11129#issuecomment-217520612
It's actually being fixed right now in Hadoop 2.8, which will take a while
to surface.
[YARN-4925](https://issues.apache.org/jira/browse/YARN
Github user steveloughran commented on the pull request:
https://github.com/apache/spark/pull/11033#issuecomment-214831899
@tgravescs the logging bit of the patch is in sync with master.
Is there anything else you want me to do regarding the documentation to get
it into a
Github user steveloughran closed the pull request at:
https://github.com/apache/spark/pull/11394
Github user steveloughran commented on the pull request:
https://github.com/apache/spark/pull/12004#issuecomment-214801641
Note that as this module only builds on Hadoop >= 2.6, Jenkins won't be
compiling it. The tests are designed to skip running if no config file
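A minimal ScalaTest sketch of that skip-if-unconfigured behaviour, assuming a hypothetical system property for the config file; `assume()` cancels, rather than fails, the test:
```
import org.scalatest.funsuite.AnyFunSuite

class CloudStoreSuite extends AnyFunSuite {
  // hypothetical property pointing at the cloud test configuration file
  private val cloudConfig = sys.props.get("cloud.test.configuration.file")

  test("object store round trip") {
    assume(cloudConfig.isDefined, "no cloud test configuration file set")
    // ... run against the configured object store ...
  }
}
```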
Github user steveloughran commented on the pull request:
https://github.com/apache/spark/pull/12004#issuecomment-214753851
Oh, and there's an initial documentation page on Spark + cloud
infrastructure, which tries to make clear that object stores are not real
filesystems