Github user HyukjinKwon commented on the pull request:
https://github.com/apache/spark/pull/13103#issuecomment-219187010
I think it would be good if you follow
https://cwiki.apache.org/confluence/display/SPARK/Spark+Code+Style+Guide.
---
If your project is set up for it, you can
GitHub user HyukjinKwon opened a pull request:
https://github.com/apache/spark/pull/13113
[SPARK-15325][SQL] Replace the usage of deprecated DataSet API in tests
(Scala/Java)
## What changes were proposed in this pull request?
It seems `unionAll(other: Dataset[T
Github user HyukjinKwon commented on the pull request:
https://github.com/apache/spark/pull/13104#issuecomment-219207342
Oh, I just meant it changes the code to support partitioned tables for the text
data source, which seems disabled in Spark 2.0. It seems the guide says it does
not need a JIRA
Github user HyukjinKwon commented on a diff in the pull request:
https://github.com/apache/spark/pull/13112#discussion_r63272945
--- Diff: mllib/src/main/scala/org/apache/spark/ml/util/stopwatches.scala
---
@@ -19,7 +19,8 @@ package org.apache.spark.ml.util
import
Github user HyukjinKwon commented on the pull request:
https://github.com/apache/spark/pull/13104#issuecomment-219207979
Let me cc @liancheng
Github user HyukjinKwon commented on a diff in the pull request:
https://github.com/apache/spark/pull/13165#discussion_r63667236
--- Diff: R/pkg/R/client.R ---
@@ -43,6 +43,17 @@ determineSparkSubmitBin <- function() {
sparkSubmitBinName
}
+# R supports b
Github user HyukjinKwon commented on the pull request:
https://github.com/apache/spark/pull/13135#issuecomment-220009767
I see. Thanks! I will change them tomorrow.
Github user HyukjinKwon commented on the pull request:
https://github.com/apache/spark/pull/13104#issuecomment-219969148
@jurriaan It might be nicer if the title is
`[SPARK-15323][SPARK-14463][SQL] ...` if this fixes the issue as well (So that
both can be closed as you might already
Github user HyukjinKwon commented on the pull request:
https://github.com/apache/spark/pull/13165#issuecomment-219955778
Please let me cc @sun-rui and @JoshRosen
GitHub user HyukjinKwon opened a pull request:
https://github.com/apache/spark/pull/13165
[SPARK-8603][SPARKR] Incorrect file separator passed to Java and Scripts
from R in windows
## What changes were proposed in this pull request?
This PR corrects R file separator
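The description above is cut off; as a rough illustration of this class of bug (a hypothetical Python sketch of separator normalization, not the actual R fix — `to_jvm_path` is a made-up name):

```python
def to_jvm_path(path):
    """Hypothetical helper: normalize Windows backslash separators to the
    forward slashes the JVM side accepts. A sketch only, not the R code
    changed in the PR."""
    return path.replace("\\", "/")

print(to_jvm_path("C:\\Users\\spark\\bin\\spark-submit.cmd"))
# -> C:/Users/spark/bin/spark-submit.cmd
```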
Github user HyukjinKwon commented on the pull request:
https://github.com/apache/spark/pull/13164#issuecomment-219956200
Oh, I actually didn't even know there are some more.
Github user HyukjinKwon commented on a diff in the pull request:
https://github.com/apache/spark/pull/13041#discussion_r62793802
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/csv/DefaultSource.scala
---
@@ -61,7 +61,9 @@ class DefaultSource extends
Github user HyukjinKwon commented on a diff in the pull request:
https://github.com/apache/spark/pull/13048#discussion_r63117779
--- Diff:
sql/hive/src/main/scala/org/apache/spark/sql/hive/orc/OrcOptions.scala ---
@@ -0,0 +1,67 @@
+/*
+ * Licensed to the Apache Software
Github user HyukjinKwon commented on the pull request:
https://github.com/apache/spark/pull/13086#issuecomment-218922187
(Just to make sure, it seems this is the only one across the Spark codebase)
```bash
grep -r "SparkSession.builder()" . | grep ".scala"
```
Github user HyukjinKwon commented on a diff in the pull request:
https://github.com/apache/spark/pull/13048#discussion_r63120588
--- Diff:
sql/hive/src/main/scala/org/apache/spark/sql/hive/orc/OrcOptions.scala ---
@@ -0,0 +1,67 @@
+/*
+ * Licensed to the Apache Software
Github user HyukjinKwon commented on a diff in the pull request:
https://github.com/apache/spark/pull/13067#discussion_r62963753
--- Diff:
sql/hive/src/main/scala/org/apache/spark/sql/hive/execution/InsertIntoDir.scala
---
@@ -0,0 +1,137 @@
+/*
+ * Licensed to the Apache
Github user HyukjinKwon commented on a diff in the pull request:
https://github.com/apache/spark/pull/13067#discussion_r62963841
--- Diff:
sql/hive/src/main/scala/org/apache/spark/sql/hive/execution/InsertIntoDir.scala
---
@@ -0,0 +1,137 @@
+/*
+ * Licensed to the Apache
Github user HyukjinKwon commented on a diff in the pull request:
https://github.com/apache/spark/pull/13067#discussion_r62963804
--- Diff:
sql/hive/src/main/scala/org/apache/spark/sql/hive/execution/InsertIntoDir.scala
---
@@ -0,0 +1,137 @@
+/*
+ * Licensed to the Apache
Github user HyukjinKwon commented on a diff in the pull request:
https://github.com/apache/spark/pull/13105#discussion_r63274194
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/csv/DefaultSource.scala
---
@@ -172,4 +173,13 @@ class DefaultSource
Github user HyukjinKwon commented on a diff in the pull request:
https://github.com/apache/spark/pull/13116#discussion_r63274228
--- Diff: sql/core/src/main/scala/org/apache/spark/sql/Dataset.scala ---
@@ -18,6 +18,9 @@
package org.apache.spark.sql
import
Github user HyukjinKwon commented on a diff in the pull request:
https://github.com/apache/spark/pull/13116#discussion_r63274238
--- Diff: sql/core/src/test/scala/org/apache/spark/sql/DatasetSuite.scala
---
@@ -402,6 +402,76 @@ class DatasetSuite extends QueryTest
Github user HyukjinKwon commented on a diff in the pull request:
https://github.com/apache/spark/pull/13116#discussion_r63274249
--- Diff: sql/core/src/test/scala/org/apache/spark/sql/DatasetSuite.scala
---
@@ -402,6 +402,76 @@ class DatasetSuite extends QueryTest
Github user HyukjinKwon commented on the pull request:
https://github.com/apache/spark/pull/13115#issuecomment-219213787
(I think it would be nicer if the PR description is filled in.)
Github user HyukjinKwon commented on the pull request:
https://github.com/apache/spark/pull/12855#issuecomment-216436047
@rxin Sure, I will. Thanks.
Github user HyukjinKwon commented on the pull request:
https://github.com/apache/spark/pull/12855#issuecomment-216435745
@rxin I thought so but I haven't tested it yet. Could I look into that after
this one is merged, maybe?
Github user HyukjinKwon commented on the pull request:
https://github.com/apache/spark/pull/12856#issuecomment-216441995
(Maybe adding "Closes #12774" in the description?)
Github user HyukjinKwon commented on the pull request:
https://github.com/apache/spark/pull/12601#issuecomment-216445374
@rxin I also realised the Python API supports properties as a dict having
"arbitrary string tag/value",
[here](https://github.com/apache/
Github user HyukjinKwon commented on the pull request:
https://github.com/apache/spark/pull/11724#issuecomment-216465994
@rxin Sure I will add more explicit description and some tests for this.
Thanks.
Github user HyukjinKwon commented on a diff in the pull request:
https://github.com/apache/spark/pull/12855#discussion_r61863832
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/WriterContainer.scala
---
@@ -239,48 +239,50 @@ private[sql] class
Github user HyukjinKwon commented on a diff in the pull request:
https://github.com/apache/spark/pull/12855#discussion_r61863848
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/WriterContainer.scala
---
@@ -363,84 +365,87 @@ private[sql] class
Github user HyukjinKwon commented on a diff in the pull request:
https://github.com/apache/spark/pull/12855#discussion_r61863843
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/WriterContainer.scala
---
@@ -363,84 +365,87 @@ private[sql] class
Github user HyukjinKwon commented on the pull request:
https://github.com/apache/spark/pull/11724#issuecomment-216499340
@rxin I added some more commits for unit tests in `CSVInferSchemaSuite`.
Github user HyukjinKwon commented on the pull request:
https://github.com/apache/spark/pull/12855#issuecomment-216491871
@rxin I could find the same issue in internal datasources. I just added the
same logic and a test in `HadoopFsRelationTest`.
Github user HyukjinKwon commented on the pull request:
https://github.com/apache/spark/pull/11724#issuecomment-216704194
@rxin I see. Thank you. Let me fix this up and change the description as
well with some rules for `LongType`, `DoubleType` and `DecimalType`.
Github user HyukjinKwon commented on the pull request:
https://github.com/apache/spark/pull/12818#issuecomment-216733974
Hi @falaki, could you take a quick look? It won't take too long!
Github user HyukjinKwon commented on the pull request:
https://github.com/apache/spark/pull/12818#issuecomment-216734653
Please allow me to cc you, @jbax, who I guess is the author of the Univocity
parser. Could you please confirm that `Format.setComment()` is not affected if
we only call
Github user HyukjinKwon commented on a diff in the pull request:
https://github.com/apache/spark/pull/12693#discussion_r61987018
--- Diff:
streaming/src/test/scala/org/apache/spark/streaming/CheckpointSuite.scala ---
@@ -640,12 +640,14 @@ class CheckpointSuite extends
Github user HyukjinKwon commented on the pull request:
https://github.com/apache/spark/pull/12818#issuecomment-216740931
@jbax Cool! Thank you for the detailed explanation.
So, this uses the OS default newline without `setLineSeparator()`, which is
trimmed
[here](https://github.com
Github user HyukjinKwon commented on the pull request:
https://github.com/apache/spark/pull/12818#issuecomment-216743678
@jbax Ah, I guess `foo` and `bar` are separate rows, right? `stripLineEnd`
will be applied to each row.
If I got you wrong and `setLineSeparator
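The per-row trimming being discussed can be sketched in Python (an analogue of Scala's `stripLineEnd`; `strip_line_end` is a made-up helper, and `rstrip` drops all trailing newline characters rather than exactly one):

```python
def strip_line_end(row):
    # Rough analogue of Scala's String.stripLineEnd: remove trailing
    # newline characters (\r, \n) from a single row, leaving content intact.
    return row.rstrip("\r\n")

# "foo" and "bar" stay separate rows; each is trimmed independently.
rows = ["foo\n", "bar\r\n"]
print([strip_line_end(r) for r in rows])  # -> ['foo', 'bar']
```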
Github user HyukjinKwon commented on the pull request:
https://github.com/apache/spark/pull/10943#issuecomment-217461632
ping @cloud-fan
Github user HyukjinKwon commented on the pull request:
https://github.com/apache/spark/pull/10655#issuecomment-217465276
ping @RussellSpitzer
Github user HyukjinKwon commented on the pull request:
https://github.com/apache/spark/pull/10953#issuecomment-217477017
@markgrover Mind adding `Closes #10681` in the PR description so that the
merging script can close that together?
Github user HyukjinKwon commented on a diff in the pull request:
https://github.com/apache/spark/pull/12951#discussion_r62329769
--- Diff:
core/src/main/scala/org/apache/spark/scheduler/SchedulableBuilder.scala ---
@@ -90,10 +92,11 @@ private[spark] class FairSchedulableBuilder
Github user HyukjinKwon commented on a diff in the pull request:
https://github.com/apache/spark/pull/12951#discussion_r62330358
--- Diff: core/src/main/scala/org/apache/spark/scheduler/Pool.scala ---
@@ -21,6 +21,7 @@ import java.util.concurrent.{ConcurrentHashMap
Github user HyukjinKwon commented on a diff in the pull request:
https://github.com/apache/spark/pull/12951#discussion_r62330957
--- Diff: core/src/main/scala/org/apache/spark/scheduler/Pool.scala ---
@@ -47,6 +49,15 @@ private[spark] class Pool(
var name = poolName
Github user HyukjinKwon commented on a diff in the pull request:
https://github.com/apache/spark/pull/12951#discussion_r62330929
--- Diff:
core/src/main/scala/org/apache/spark/scheduler/TaskSetManager.scala ---
@@ -98,6 +98,14 @@ private[spark] class TaskSetManager(
var
Github user HyukjinKwon commented on a diff in the pull request:
https://github.com/apache/spark/pull/12951#discussion_r62331312
--- Diff: core/src/main/scala/org/apache/spark/scheduler/Pool.scala ---
@@ -21,6 +21,7 @@ import java.util.concurrent.{ConcurrentHashMap
Github user HyukjinKwon commented on a diff in the pull request:
https://github.com/apache/spark/pull/12944#discussion_r62331757
--- Diff:
common/network-shuffle/src/main/java/org/apache/spark/network/shuffle/ShuffleIndexRecord.java
---
@@ -0,0 +1,39 @@
+/*
+ * Licensed
GitHub user HyukjinKwon opened a pull request:
https://github.com/apache/spark/pull/12972
[SPARK-15198][SQL] Support for pushing down filters for boolean types in
ORC data source
## What changes were proposed in this pull request?
This PR adds the support for pushing
Github user HyukjinKwon commented on the pull request:
https://github.com/apache/spark/pull/12972#issuecomment-217601711
Let me please cc @liancheng and also @tedyu who suggested this change.
Github user HyukjinKwon commented on the pull request:
https://github.com/apache/spark/pull/11317#issuecomment-217602140
@RussellSpitzer I saw you answered my ping before. Excuse my ping here
again.
Github user HyukjinKwon commented on the pull request:
https://github.com/apache/spark/pull/12904#issuecomment-217604986
@rxin @sureshthalamati Do you mind holding off this change until #12921 is
merged? That PR also handles `nullValue`. Apparently, I guess `nullValue` could
affect
Github user HyukjinKwon commented on a diff in the pull request:
https://github.com/apache/spark/pull/12904#discussion_r62411095
--- Diff:
sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/csv/CSVSuite.scala
---
@@ -555,4 +558,37 @@ class CSVSuite extends
Github user HyukjinKwon commented on the pull request:
https://github.com/apache/spark/pull/12904#issuecomment-217605160
Here is how I think the CSV datasource should handle `""`, empty strings and
`nullValue`.
With the option, `nullValue` set to `"
Github user HyukjinKwon commented on the pull request:
https://github.com/apache/spark/pull/12904#issuecomment-217605838
In case of writing, I think
```
Row("", "null", null)
```
should produce the CSV as below:
1. With the opt
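The numbered list above is truncated. As a sketch of one possible convention (an assumption, not necessarily the one proposed in the thread) for writing `Row("", "null", null)` so that empty strings remain distinguishable from nulls:

```python
def write_row(row, null_value="null"):
    # Hypothetical serializer sketch: None becomes the nullValue token
    # (unquoted), while an empty string is written quoted so a reader can
    # tell it apart from null. Quoting/escaping of delimiters is omitted.
    fields = []
    for v in row:
        if v is None:
            fields.append(null_value)
        elif v == "":
            fields.append('""')
        else:
            fields.append(v)
    return ",".join(fields)

print(write_row(["", "null", None]))  # -> "",null,null
```

Under this convention the literal string `"null"` round-trips ambiguously with a real null, which is exactly the kind of corner case the discussion is about.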
Github user HyukjinKwon commented on a diff in the pull request:
https://github.com/apache/spark/pull/12971#discussion_r62411488
--- Diff:
mllib/src/test/java/org/apache/spark/mllib/tree/JavaDecisionTreeSuite.java ---
@@ -21,6 +21,8 @@
import java.util.HashMap;
import
Github user HyukjinKwon commented on the pull request:
https://github.com/apache/spark/pull/12817#issuecomment-216084433
@rxin I am so sorry, I think I totally misunderstood your initial comments
before. I just addressed your comments later. Thank you.
Github user HyukjinKwon commented on the pull request:
https://github.com/apache/spark/pull/12629#issuecomment-216086878
Hi @davies @viirya, if you are not sure about handling `null`s, I can close
this for now.
But, this PR includes
- adding `OrcOptions` just like
Github user HyukjinKwon commented on the pull request:
https://github.com/apache/spark/pull/12774#issuecomment-216098660
@gatorsmile Does that maybe imply closing this for now and making a JIRA or
sending an email to the dev mailing list to discuss this further?
Github user HyukjinKwon commented on the pull request:
https://github.com/apache/spark/pull/12817#issuecomment-216090207
Sure. Thank you. Do you want me to remove Python documentation here?
Github user HyukjinKwon commented on the pull request:
https://github.com/apache/spark/pull/12834#issuecomment-216112173
cc @rxin
Github user HyukjinKwon commented on the pull request:
https://github.com/apache/spark/pull/12834#issuecomment-216112196
(@rxin Do you want me to do this for `json()` as well?)
GitHub user HyukjinKwon opened a pull request:
https://github.com/apache/spark/pull/12834
[SPARK-15050][SQL] Put CSV options as Python csv function parameters
## What changes were proposed in this pull request?
https://issues.apache.org/jira/browse/SPARK-15050
Github user HyukjinKwon commented on the pull request:
https://github.com/apache/spark/pull/12817#issuecomment-216090703
@rxin I see. Thank you.
Github user HyukjinKwon commented on the pull request:
https://github.com/apache/spark/pull/12629#issuecomment-216086947
cc @rxin as well.
Github user HyukjinKwon commented on a diff in the pull request:
https://github.com/apache/spark/pull/12834#discussion_r61716763
--- Diff: python/pyspark/sql/readwriter.py ---
@@ -177,31 +180,35 @@ def json(self, path, schema=None):
:param path: string represents path
Github user HyukjinKwon commented on a diff in the pull request:
https://github.com/apache/spark/pull/12834#discussion_r61708031
--- Diff: python/pyspark/sql/readwriter.py ---
@@ -274,48 +274,44 @@ def text(self, paths):
return
self._df(self._jreader.text(self
Github user HyukjinKwon commented on the pull request:
https://github.com/apache/spark/pull/12868#issuecomment-216848307
I think we had better change the title and description. From your
comments, I guess it is not a wrong type but something dependent on Python's
version. So, I think
Github user HyukjinKwon commented on a diff in the pull request:
https://github.com/apache/spark/pull/12921#discussion_r62159750
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/csv/CSVInferSchema.scala
---
@@ -192,59 +192,59 @@ private[csv] object
Github user HyukjinKwon commented on the pull request:
https://github.com/apache/spark/pull/12920#issuecomment-217105494
I am not sure if fixing examples can have the component `[DOC]` in the
title. I saw the `[EXAMPLE]` component was used by @dongjoon-hyun. This is a
pretty minor
Github user HyukjinKwon commented on the pull request:
https://github.com/apache/spark/pull/12925#issuecomment-217105714
@dongjoon-hyun This one as well. Do you mind if I ask your thoughts on the
component in the title? Making good examples for PRs will help all other
contributors
Github user HyukjinKwon commented on the pull request:
https://github.com/apache/spark/pull/12904#issuecomment-217103186
+1 for treating them as empty strings.
I guess this will conflict with
https://github.com/apache/spark/pull/12921 because that PR deals with some bug
Github user HyukjinKwon commented on a diff in the pull request:
https://github.com/apache/spark/pull/12921#discussion_r62164414
--- Diff:
sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/csv/CSVTypeCastSuite.scala
---
@@ -73,10 +73,10 @@ class CSVTypeCastSuite
Github user HyukjinKwon commented on the pull request:
https://github.com/apache/spark/pull/12927#issuecomment-217105871
@dongjoon-hyun Sorry for cc'ing you a lot, but it would be great if I could
hear your thoughts.
GitHub user HyukjinKwon opened a pull request:
https://github.com/apache/spark/pull/12921
[SPARK-15143][SPARK-15144][SQL] Add CSV tests with HadoopFsRelationTest and
support for nullValue for other types
## What changes were proposed in this pull request?
Currently
Github user HyukjinKwon commented on the pull request:
https://github.com/apache/spark/pull/12923#issuecomment-217086289
cc @rxin and @jbax (who is the author of Univocity library and suggested
this change).
Github user HyukjinKwon commented on a diff in the pull request:
https://github.com/apache/spark/pull/12921#discussion_r62151223
--- Diff:
sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/csv/CSVSuite.scala
---
@@ -447,7 +446,7 @@ class CSVSuite extends
Github user HyukjinKwon commented on the pull request:
https://github.com/apache/spark/pull/12921#issuecomment-217082924
cc @rxin @falaki
GitHub user HyukjinKwon opened a pull request:
https://github.com/apache/spark/pull/12923
[SPARK-15148][SQL] Upgrade Univocity library from 2.0.2 to 2.1.0
## What changes were proposed in this pull request?
https://issues.apache.org/jira/browse/SPARK-15148
Mainly
Github user HyukjinKwon commented on the pull request:
https://github.com/apache/spark/pull/12923#issuecomment-217094506
Thanks, @holdenk! It is not urgent. I can do this in this PR.
Github user HyukjinKwon commented on the pull request:
https://github.com/apache/spark/pull/12946#issuecomment-217373085
(@koeninger it seems the last part of the title is truncated)
Github user HyukjinKwon commented on the pull request:
https://github.com/apache/spark/pull/12777#issuecomment-217391825
Hi @yhuai Would you mind taking a look for this please?
Github user HyukjinKwon closed the pull request at:
https://github.com/apache/spark/pull/13021
Github user HyukjinKwon commented on the pull request:
https://github.com/apache/spark/pull/13021#issuecomment-218073322
I am closing this after talking with @srowen in the JIRA ticket.
GitHub user HyukjinKwon opened a pull request:
https://github.com/apache/spark/pull/13021
[SPARK-15245][SQL] Stream API throws an exception for non-directory path
with incorrect message.
## What changes were proposed in this pull request?
https://issues.apache.org/jira
Github user HyukjinKwon commented on the pull request:
https://github.com/apache/spark/pull/10194#issuecomment-218082133
Maybe @davies, because I see most of the code was written by him
Github user HyukjinKwon commented on the pull request:
https://github.com/apache/spark/pull/12865#issuecomment-216527112
(This is pretty minor, but I think `cc @rxin` can be removed from the
description and put in the comments instead, because the PR description
explains the PR itself and the names of reviewers
Github user HyukjinKwon commented on the pull request:
https://github.com/apache/spark/pull/12818#issuecomment-216772060
Thank you @jbax. I will try to do so after checking if I can identify any
useful changes for Spark.
Github user HyukjinKwon commented on a diff in the pull request:
https://github.com/apache/spark/pull/12868#discussion_r61996965
--- Diff: examples/src/main/python/mllib/gaussian_mixture_model.py ---
@@ -49,7 +49,7 @@ def parseVector(line):
parser.add_argument
Github user HyukjinKwon commented on the pull request:
https://github.com/apache/spark/pull/12818#issuecomment-216764368
@rxin In terms of functionality and performance, no.
But it shortens the code, and I thought it was confusing whether the `comment`
option in `CSVOptions` affects
Github user HyukjinKwon commented on the pull request:
https://github.com/apache/spark/pull/12818#issuecomment-216765450
@rxin If it is too minor to merge, I can close and then do this in another
PR maybe after investigating the newline stuff discussed above.
Github user HyukjinKwon closed the pull request at:
https://github.com/apache/spark/pull/12818
Github user HyukjinKwon commented on the pull request:
https://github.com/apache/spark/pull/12818#issuecomment-216765934
Closing this. I will bring up this again maybe in
https://github.com/apache/spark/pull/12268.
Github user HyukjinKwon commented on the pull request:
https://github.com/apache/spark/pull/12818#issuecomment-216748780
Oh, I misunderstood your first comment. I think I should not take out
`setLineSeparator()` here, but maybe I should open another issue ticket to set
Github user HyukjinKwon commented on the pull request:
https://github.com/apache/spark/pull/12818#issuecomment-216750176
retest this please
Github user HyukjinKwon commented on the pull request:
https://github.com/apache/spark/pull/12818#issuecomment-216763118
@rxin To cut it short, I got confirmation from the original author of
Univocity that `setComment()` has no effect as long as Spark does not write
comments from
Github user HyukjinKwon commented on a diff in the pull request:
https://github.com/apache/spark/pull/12834#discussion_r61745877
--- Diff: python/pyspark/sql/readwriter.py ---
@@ -258,64 +283,73 @@ def parquet(self, *paths):
@ignore_unicode_prefix
@since(1.6
GitHub user HyukjinKwon opened a pull request:
https://github.com/apache/spark/pull/12855
[SPARK-10216][SQL] Avoid creating empty files during overwrite into Hive
table with group by query
## What changes were proposed in this pull request?
Currently, `INSERT
Github user HyukjinKwon commented on the pull request:
https://github.com/apache/spark/pull/12855#issuecomment-216411891
I submitted this PR because #8411 looks abandoned and it looks like the author
has not answered since the last comment by a committer. (It has been inactive
for almost half
Github user HyukjinKwon commented on the pull request:
https://github.com/apache/spark/pull/12855#issuecomment-216411956
@yhuai Could you please take a look?