Github user steveloughran commented on the issue:
https://github.com/apache/spark/pull/19497
I guess one aspect of `saveAsNewAPIHadoopFile` is that it calls
`jobConfiguration.set("mapreduce.output.fileoutputformat.outputdir", path)`, and
`Configuration.set(String key, St
Github user steveloughran commented on a diff in the pull request:
https://github.com/apache/spark/pull/19269#discussion_r145096772
--- Diff:
sql/core/src/test/scala/org/apache/spark/sql/sources/v2/SimpleWritableDataSource.scala
---
@@ -0,0 +1,252 @@
+/*
+ * Licensed to
Github user steveloughran commented on a diff in the pull request:
https://github.com/apache/spark/pull/19269#discussion_r144948292
--- Diff:
sql/core/src/test/scala/org/apache/spark/sql/sources/v2/SimpleWritableDataSource.scala
---
@@ -0,0 +1,254 @@
+/*
+ * Licensed to
Github user steveloughran commented on the issue:
https://github.com/apache/spark/pull/18979
thanks for the review everyone!
---
-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands
Github user steveloughran commented on a diff in the pull request:
https://github.com/apache/spark/pull/19269#discussion_r144823664
--- Diff:
sql/core/src/test/scala/org/apache/spark/sql/sources/v2/SimpleWritableDataSource.scala
---
@@ -0,0 +1,254 @@
+/*
+ * Licensed to
Github user steveloughran commented on a diff in the pull request:
https://github.com/apache/spark/pull/19269#discussion_r144823298
--- Diff:
sql/core/src/test/scala/org/apache/spark/sql/sources/v2/SimpleWritableDataSource.scala
---
@@ -0,0 +1,254 @@
+/*
+ * Licensed to
Github user steveloughran commented on a diff in the pull request:
https://github.com/apache/spark/pull/19269#discussion_r144822800
--- Diff:
sql/core/src/test/scala/org/apache/spark/sql/sources/v2/SimpleWritableDataSource.scala
---
@@ -0,0 +1,254 @@
+/*
+ * Licensed to
Github user steveloughran commented on a diff in the pull request:
https://github.com/apache/spark/pull/19269#discussion_r144822829
--- Diff:
sql/core/src/test/scala/org/apache/spark/sql/sources/v2/SimpleWritableDataSource.scala
---
@@ -0,0 +1,254 @@
+/*
+ * Licensed to
Github user steveloughran commented on a diff in the pull request:
https://github.com/apache/spark/pull/19269#discussion_r144821527
--- Diff:
sql/core/src/test/scala/org/apache/spark/sql/sources/v2/SimpleWritableDataSource.scala
---
@@ -0,0 +1,254 @@
+/*
+ * Licensed to
Github user steveloughran commented on a diff in the pull request:
https://github.com/apache/spark/pull/19269#discussion_r144821389
--- Diff:
sql/core/src/test/scala/org/apache/spark/sql/sources/v2/SimpleWritableDataSource.scala
---
@@ -0,0 +1,254 @@
+/*
+ * Licensed to
Github user steveloughran commented on the issue:
https://github.com/apache/spark/pull/19487
The more I see of the committer internals, the less confident I am about
understanding any of it.
If your committer isn't writing stuff out, it doesn't need to have any
Github user steveloughran commented on the issue:
https://github.com/apache/spark/pull/18979
done. Not writing 0-byte files will offer significant speedup against
object stores, where the cost of a call to getFileStatus() can take hundreds of
millis. I look forward to it
Github user steveloughran commented on the issue:
https://github.com/apache/spark/pull/19448
> But, if I were working on a Spark distribution at a vendor, this is
something I would definitely include because it's such a useful feature.
Github user steveloughran commented on the issue:
https://github.com/apache/spark/pull/19487
"" can come in via configuration files; I'd treat that the same as null.
Things which aren't valid URIs though, that's som
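A minimal sketch of the "treat `""` the same as null" idea, assuming a hypothetical helper named `normalizePath`; it is not the code from the pull request, and it deliberately does nothing about strings that are not valid URIs.

```scala
// Hypothetical helper, not part of Spark or Hadoop: normalises an
// output path string that may have come from a configuration file.
object PathOption {
  /** Returns None for null or "", and Some(path) for anything else. */
  def normalizePath(path: String): Option[String] =
    Option(path).filter(_.nonEmpty)
}
```

Callers then only have one "no path supplied" case to handle, whichever way the value was left unset.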
Github user steveloughran commented on the issue:
https://github.com/apache/spark/pull/19487
Looking a bit more at this. I see it handles """ as well as empty, and also
other forms of invalid URI which Path can't handle today ("multiple colons
except wit
Github user steveloughran commented on a diff in the pull request:
https://github.com/apache/spark/pull/19487#discussion_r144545827
--- Diff:
core/src/main/scala/org/apache/spark/internal/io/HadoopMapReduceCommitProtocol.scala
---
@@ -60,15 +71,6 @@ class
Github user steveloughran commented on the issue:
https://github.com/apache/spark/pull/18979
The latest PR update pulls in @dongjoon-hyun's new test; to avoid merge
conflict in the Insert suite I've rebased against master.
1. Everything handles missing files on ou
Github user steveloughran commented on the issue:
https://github.com/apache/spark/pull/19448
PS, for people who are interested in dynamic committers,
[MAPREDUCE-6823](https://issues.apache.org/jira/browse/MAPREDUCE-6823) is
something to look at. It allows you to switch committers
Github user steveloughran commented on the issue:
https://github.com/apache/spark/pull/19448
Thanks for reviewing this/getting it in. Personally, I had it in the
"improvement" category rather than bug fix. If it wasn't for that line in the
docs, there'd be no amb
Github user steveloughran commented on the issue:
https://github.com/apache/spark/pull/18979
Noted :)
@dongjoon-hyun : is the issue with ORC that if there's nothing to write, it
doesn't generate a file (so avoiding that issue with sometimes you get 0-byte
ORC file
Github user steveloughran commented on a diff in the pull request:
https://github.com/apache/spark/pull/18979#discussion_r144505454
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/BasicWriteStatsTracker.scala
---
@@ -57,7 +60,14 @@ class
Github user steveloughran commented on the issue:
https://github.com/apache/spark/pull/19487
LGTM. I'm going to stick out today a slight roll of my PathOutputCommitter
class, which is one layer above FileOutputCommitter: it lets people write
committers without output & work p
Github user steveloughran commented on a diff in the pull request:
https://github.com/apache/spark/pull/19448#discussion_r144381367
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/parquet/ParquetFileFormat.scala
---
@@ -138,6 +138,10 @@ class
Github user steveloughran commented on a diff in the pull request:
https://github.com/apache/spark/pull/19448#discussion_r144375059
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/parquet/ParquetFileFormat.scala
---
@@ -138,6 +138,11 @@ class
Github user steveloughran commented on a diff in the pull request:
https://github.com/apache/spark/pull/19448#discussion_r144239543
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/parquet/ParquetFileFormat.scala
---
@@ -138,6 +138,10 @@ class
Github user steveloughran commented on a diff in the pull request:
https://github.com/apache/spark/pull/19448#discussion_r144238941
--- Diff:
sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/parquet/ParquetCommitterSuite.scala
---
@@ -0,0 +1,152
Github user steveloughran commented on a diff in the pull request:
https://github.com/apache/spark/pull/19448#discussion_r144065810
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/parquet/ParquetFileFormat.scala
---
@@ -138,6 +138,13 @@ class
Github user steveloughran commented on a diff in the pull request:
https://github.com/apache/spark/pull/19448#discussion_r144065074
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/parquet/ParquetFileFormat.scala
---
@@ -138,6 +138,13 @@ class
Github user steveloughran commented on a diff in the pull request:
https://github.com/apache/spark/pull/19448#discussion_r144065041
--- Diff:
sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/parquet/ParquetCommitterSuite.scala
---
@@ -0,0 +1,149
Github user steveloughran commented on the issue:
https://github.com/apache/spark/pull/18979
@viirya : the new data writer API will allow for a broader set of stats to
be propagated back from workers. When you are working with the object stores,
a useful stat to get back is throttle
Github user steveloughran commented on a diff in the pull request:
https://github.com/apache/spark/pull/19448#discussion_r143992362
--- Diff:
sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/parquet/ParquetCommitterSuite.scala
---
@@ -0,0 +1,149
Github user steveloughran commented on a diff in the pull request:
https://github.com/apache/spark/pull/19448#discussion_r143992319
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/parquet/ParquetFileFormat.scala
---
@@ -138,6 +138,13 @@ class
Github user steveloughran commented on a diff in the pull request:
https://github.com/apache/spark/pull/19448#discussion_r143992018
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/parquet/ParquetFileFormat.scala
---
@@ -138,6 +138,13 @@ class
Github user steveloughran commented on the issue:
https://github.com/apache/spark/pull/18979
Has anyone had a look at this recently?
The problem still exists, and while downstream filesystems can address it if
they recognise the use case & lie about values, they wil
Github user steveloughran commented on the issue:
https://github.com/apache/spark/pull/19269
+1 for the ability to return statistics: the remote stores have lots of
information which committers may return
Github user steveloughran commented on a diff in the pull request:
https://github.com/apache/spark/pull/19269#discussion_r143530841
--- Diff:
sql/core/src/test/java/test/org/apache/spark/sql/sources/v2/JavaSimpleWritableDataSource.java
---
@@ -0,0 +1,297
Github user steveloughran commented on the issue:
https://github.com/apache/spark/pull/19448
+ @rdblue
GitHub user steveloughran opened a pull request:
https://github.com/apache/spark/pull/19448
[SPARK-22217] [SQL] ParquetFileFormat to support arbitrary OutputCommitters
## What changes were proposed in this pull request?
`ParquetFileFormat` to relax its requirement of output
Github user steveloughran commented on the issue:
https://github.com/apache/spark/pull/19294
@szhem that null path support in `FileOutputCommitter` came with the App
Master recovery work of
[MAPREDUCE-3711](https://issues.apache.org/jira/browse/MAPREDUCE-3711); it's
trying to
Github user steveloughran commented on the issue:
https://github.com/apache/spark/pull/19368
Looking @ this, things would be a lot less brittle if there wasn't a round
trip Path -> String -> Path. I'm thinking of Windows paths here in particular.
Other than tests,
Github user steveloughran commented on the issue:
https://github.com/apache/spark/pull/19269
One other thing that would be good now and invaluable in future is for the
`DataWriter.commit()` call to return a `Map[String,Long]` of statistics
alongside the message sent to the committer
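One way such per-task statistics could be combined on the driver is sketched below. This is an illustration only: `WriterStats` and `merge` are hypothetical names, not part of the Spark API, and summing values per key is an assumption about how the maps would be aggregated.

```scala
// Hypothetical sketch: these names are illustrative, not the Spark API.
object WriterStats {
  /** Sums a sequence of per-task statistics maps key by key. */
  def merge(stats: Seq[Map[String, Long]]): Map[String, Long] =
    stats.foldLeft(Map.empty[String, Long]) { (acc, taskStats) =>
      taskStats.foldLeft(acc) { case (m, (key, value)) =>
        // A key missing from the accumulator starts from zero.
        m.updated(key, m.getOrElse(key, 0L) + value)
      }
    }
}
```

A plain `Map[String, Long]` keeps the contract open-ended, so a store-specific writer could report counters the framework has never heard of.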
Github user steveloughran commented on a diff in the pull request:
https://github.com/apache/spark/pull/19269#discussion_r142005126
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/v2/WriteToDataSourceV2Command.scala
---
@@ -0,0 +1,113
Github user steveloughran commented on a diff in the pull request:
https://github.com/apache/spark/pull/19269#discussion_r142005072
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/v2/WriteToDataSourceV2Command.scala
---
@@ -0,0 +1,113
Github user steveloughran commented on a diff in the pull request:
https://github.com/apache/spark/pull/19269#discussion_r142004971
--- Diff:
sql/core/src/test/java/test/org/apache/spark/sql/sources/v2/JavaSimpleWritableDataSource.java
---
@@ -0,0 +1,297
Github user steveloughran commented on a diff in the pull request:
https://github.com/apache/spark/pull/19269#discussion_r142004889
--- Diff:
sql/core/src/test/java/test/org/apache/spark/sql/sources/v2/JavaSimpleWritableDataSource.java
---
@@ -0,0 +1,297
Github user steveloughran commented on a diff in the pull request:
https://github.com/apache/spark/pull/19269#discussion_r142004831
--- Diff:
sql/core/src/test/java/test/org/apache/spark/sql/sources/v2/JavaSimpleWritableDataSource.java
---
@@ -0,0 +1,297
Github user steveloughran commented on a diff in the pull request:
https://github.com/apache/spark/pull/19269#discussion_r142004814
--- Diff:
sql/core/src/test/java/test/org/apache/spark/sql/sources/v2/JavaSimpleWritableDataSource.java
---
@@ -0,0 +1,297
Github user steveloughran commented on a diff in the pull request:
https://github.com/apache/spark/pull/19269#discussion_r142004778
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/v2/WriteToDataSourceV2Command.scala
---
@@ -0,0 +1,113
Github user steveloughran commented on the issue:
https://github.com/apache/spark/pull/19269
People may know that I'm busy with some S3 committers which work with
Hadoop MapReduce & Spark, with an import of Ryan's committer into the Hadoop
codebase. This includes
Github user steveloughran commented on a diff in the pull request:
https://github.com/apache/spark/pull/19269#discussion_r142004644
--- Diff:
sql/core/src/main/java/org/apache/spark/sql/sources/v2/writer/DataSourceV2Writer.java
---
@@ -0,0 +1,71 @@
+/*
+ * Licensed to the
Github user steveloughran closed the pull request at:
https://github.com/apache/spark/pull/17745
Github user steveloughran commented on a diff in the pull request:
https://github.com/apache/spark/pull/19294#discussion_r140658582
--- Diff:
core/src/main/scala/org/apache/spark/internal/io/HadoopMapReduceCommitProtocol.scala
---
@@ -130,17 +135,21 @@ class
Github user steveloughran commented on the issue:
https://github.com/apache/spark/pull/17743
People don't realise how much object stores aren't file systems until they
discover all their assumptions are broken.
Once you know how they work, you can set up a workflo
Github user steveloughran commented on the issue:
https://github.com/apache/spark/pull/19294
As I play with commit logic all the way through the stack, I can't help
thinking everyone's lives would be better if we tagged the MRv1 commit APIs as
deprecated in Hadoop 3. and u
Github user steveloughran commented on a diff in the pull request:
https://github.com/apache/spark/pull/19294#discussion_r140188088
--- Diff:
core/src/main/scala/org/apache/spark/internal/io/HadoopMapReduceCommitProtocol.scala
---
@@ -130,17 +135,21 @@ class
Github user steveloughran commented on the issue:
https://github.com/apache/spark/pull/17745
Due to lack of support/interest, moved to
https://github.com/hortonworks-spark/cloud-integration
Github user steveloughran closed the pull request at:
https://github.com/apache/spark/pull/17747
Github user steveloughran commented on a diff in the pull request:
https://github.com/apache/spark/pull/19294#discussion_r140008216
--- Diff:
core/src/test/scala/org/apache/spark/rdd/PairRDDFunctionsSuite.scala ---
@@ -568,6 +568,51 @@ class PairRDDFunctionsSuite extends
Github user steveloughran commented on a diff in the pull request:
https://github.com/apache/spark/pull/19294#discussion_r140008084
--- Diff:
core/src/test/scala/org/apache/spark/rdd/PairRDDFunctionsSuite.scala ---
@@ -568,6 +568,51 @@ class PairRDDFunctionsSuite extends
Github user steveloughran commented on the issue:
https://github.com/apache/spark/pull/18111
thx
---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the
Github user steveloughran commented on the issue:
https://github.com/apache/spark/pull/18111
I believe this patch implements the original design goal: if a committer
doesn't have a working path supplied by `getWorkingPath()` then it downgrades.
It might be worthwhile do
Github user steveloughran commented on the issue:
https://github.com/apache/spark/pull/18979
Related to this, updated spec on [Hadoop output stream, Syncable and
StreamCapabilities](https://github.com/steveloughran/hadoop/blob/s3/HADOOP-13327-outputstream-trunk/hadoop-common-project
Github user steveloughran commented on a diff in the pull request:
https://github.com/apache/spark/pull/18979#discussion_r134460176
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/BasicWriteStatsTracker.scala
---
@@ -57,7 +60,14 @@ class
Github user steveloughran commented on the issue:
https://github.com/apache/spark/pull/17342
@Chopinxb no worries; the hard part is thinking how to fix this. I don't
see it being possible to do reliably except through an explicit download.
Hadoop 2.8+ has moved off commons-loggi
Github user steveloughran commented on the issue:
https://github.com/apache/spark/pull/18979
@adrian-ionescu wrote
> is there a need for calling getFinalStats() more than once?
No. As long as everyone is aware of it, it won't be an issue.
Github user steveloughran commented on the issue:
https://github.com/apache/spark/pull/18979
> To mimic S3-like behavior, you can overwrite the file system
"spark.hadoop.fs.$scheme.impl"
@gatorsmile: you will be able to do something better soon, as S3A is ad
Github user steveloughran commented on the issue:
https://github.com/apache/spark/pull/18979
Currently *nobody should be using s3a:// as the temp file destination*,
which is the same as saying "nobody should be using s3a:// as the direct
destination of work", not
Github user steveloughran commented on a diff in the pull request:
https://github.com/apache/spark/pull/18979#discussion_r133919035
--- Diff:
sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/BasicWriteTaskStatsTrackerSuite.scala
---
@@ -0,0 +1,212
Github user steveloughran commented on a diff in the pull request:
https://github.com/apache/spark/pull/18979#discussion_r133918269
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/BasicWriteStatsTracker.scala
---
@@ -57,7 +60,14 @@ class
Github user steveloughran commented on a diff in the pull request:
https://github.com/apache/spark/pull/18979#discussion_r133913173
--- Diff:
sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/BasicWriteTaskStatsTrackerSuite.scala
---
@@ -0,0 +1,212
GitHub user steveloughran opened a pull request:
https://github.com/apache/spark/pull/18979
[SPARK-21762][SQL] FileFormatWriter/BasicWriteTaskStatsTracker metrics
collection fails if a new file isn't yet visible
## What changes were proposed in this pull re
Github user steveloughran commented on a diff in the pull request:
https://github.com/apache/spark/pull/18111#discussion_r133751724
--- Diff:
core/src/main/scala/org/apache/spark/internal/io/HadoopMapReduceCommitProtocol.scala
---
@@ -73,7 +73,10 @@ class
Github user steveloughran commented on the issue:
https://github.com/apache/spark/pull/17743
Just reread this; still looks correct. Review comments welcome
Github user steveloughran commented on the issue:
https://github.com/apache/spark/pull/17342
Created: [SPARK-21697](https://issues.apache.org/jira/browse/SPARK-21697)
with the stack trace attached
Github user steveloughran commented on the issue:
https://github.com/apache/spark/pull/17342
I'm going to recommend you file a SPARK bug on issues.apache.org there & an
HDFS linked to it "NPE in BlockReaderFactory log init". It looks like the
creation of the LOG
Github user steveloughran commented on the issue:
https://github.com/apache/spark/pull/14601
I know this hasn't been updated, but it is still important. I can take it
on if all it needs is a test case
Github user steveloughran commented on the issue:
https://github.com/apache/spark/pull/18628
Thanks for making sure this is consistent with other uses of
Configuration.get(); consistency is critical here
Github user steveloughran commented on a diff in the pull request:
https://github.com/apache/spark/pull/18668#discussion_r131095350
--- Diff:
sql/hive-thriftserver/src/main/scala/org/apache/spark/sql/hive/thriftserver/SparkSQLCLIDriver.scala
---
@@ -50,6 +50,7 @@ private[hive
Github user steveloughran commented on a diff in the pull request:
https://github.com/apache/spark/pull/18668#discussion_r131094720
--- Diff: docs/configuration.md ---
@@ -2335,5 +2335,61 @@ The location of these configuration files varies
across Hadoop versions, but
a common
Github user steveloughran commented on a diff in the pull request:
https://github.com/apache/spark/pull/18668#discussion_r131093892
--- Diff: docs/configuration.md ---
@@ -2335,5 +2335,61 @@ The location of these configuration files varies
across Hadoop versions, but
a common
Github user steveloughran commented on a diff in the pull request:
https://github.com/apache/spark/pull/18668#discussion_r131093320
--- Diff: docs/configuration.md ---
@@ -2335,5 +2335,61 @@ The location of these configuration files varies
across Hadoop versions, but
a common
Github user steveloughran commented on a diff in the pull request:
https://github.com/apache/spark/pull/18628#discussion_r130598803
--- Diff:
sql/hive-thriftserver/src/main/scala/org/apache/spark/sql/hive/thriftserver/SparkSQLCLIService.scala
---
@@ -57,6 +59,19 @@ private[hive
Github user steveloughran commented on a diff in the pull request:
https://github.com/apache/spark/pull/18628#discussion_r130598230
--- Diff:
sql/hive-thriftserver/src/main/scala/org/apache/spark/sql/hive/thriftserver/SparkSQLCLIService.scala
---
@@ -57,6 +59,19 @@ private[hive
Github user steveloughran commented on the issue:
https://github.com/apache/spark/pull/17747
I know, I just have too many open JIRAs to try and manage
Github user steveloughran commented on the issue:
https://github.com/apache/spark/pull/17747
Pushing up a new patch rebased to work with master.
It's getting boring all round for this patch: me having to do a merge,
retest, repush. How about finalising the review so w
Github user steveloughran commented on the issue:
https://github.com/apache/spark/pull/18111
Is there anything else which needs to be done here, or is it a matter of
finding the right reviewer?
Github user steveloughran commented on the issue:
https://github.com/apache/spark/pull/17747
Mima test failure was about a new method in hist server
```
[info] spark-mllib: found 0 potential binary incompatibilities while
checking against org.apache.spark:spark-mllib_2.11
Github user steveloughran commented on the issue:
https://github.com/apache/spark/pull/9518
BTW, here are some ongoing Hadoop JIRAs related to its shipping statsd:
[HADOOP-12360](https://issues.apache.org/jira/browse/HADOOP-12360?focusedCommentId=16034826&
Github user steveloughran commented on a diff in the pull request:
https://github.com/apache/spark/pull/9518#discussion_r123483576
--- Diff:
core/src/main/scala/org/apache/spark/metrics/sink/StatsdReporter.scala ---
@@ -0,0 +1,160 @@
+/*
+ * Licensed to the Apache Software
Github user steveloughran commented on the issue:
https://github.com/apache/spark/pull/14601
Testing should not be too hard. Here's my *untested* attempt
```scala
val sconf = new SparkConf(false)
sconf.set("fs.example.value", "true"
Github user steveloughran commented on the issue:
https://github.com/apache/spark/pull/17747
I'm going to go with your suggestion and go via the metricServer to get at
the state of counters and gauges; this is actually better in that it will
verify that all metrics are m
Github user steveloughran commented on a diff in the pull request:
https://github.com/apache/spark/pull/17747#discussion_r122295075
--- Diff:
core/src/main/scala/org/apache/spark/deploy/history/FsHistoryProvider.scala ---
@@ -164,6 +169,16 @@ private[history] class
Github user steveloughran commented on a diff in the pull request:
https://github.com/apache/spark/pull/17747#discussion_r122294995
--- Diff:
core/src/main/scala/org/apache/spark/deploy/history/FsHistoryProvider.scala ---
@@ -129,6 +131,9 @@ private[history] class
Github user steveloughran commented on a diff in the pull request:
https://github.com/apache/spark/pull/17747#discussion_r122279400
--- Diff:
core/src/main/scala/org/apache/spark/deploy/history/HistoryServer.scala ---
@@ -110,6 +117,14 @@ class HistoryServer
Github user steveloughran commented on a diff in the pull request:
https://github.com/apache/spark/pull/17747#discussion_r122256050
--- Diff:
core/src/main/scala/org/apache/spark/deploy/history/HistoryMetricSource.scala
---
@@ -0,0 +1,166 @@
+/*
+ * Licensed to the Apache
Github user steveloughran commented on the issue:
https://github.com/apache/spark/pull/18247
..just caught this. No, no issues with it. A retrospective non-binding +1
Github user steveloughran commented on the issue:
https://github.com/apache/spark/pull/18111
Not really. I thought about how I could do it, but essentially you do need
to do things underneath the commit protocol, either in the Hadoop codebase (me)
or in a test which somehow
Github user steveloughran commented on a diff in the pull request:
https://github.com/apache/spark/pull/18111#discussion_r118530138
--- Diff:
core/src/main/scala/org/apache/spark/internal/io/HadoopMapReduceCommitProtocol.scala
---
@@ -73,7 +73,10 @@ class
GitHub user steveloughran opened a pull request:
https://github.com/apache/spark/pull/18111
[SPARK-20886][CORE] HadoopMapReduceCommitProtocol to fail meaningfully if
FileOutputCommitter.getWorkPath==null
## What changes were proposed in this pull request?
Handles the
Github user steveloughran closed the pull request at:
https://github.com/apache/spark/pull/9571