GitHub user ueshin opened a pull request:
https://github.com/apache/spark/pull/2638
[SPARK-3771][SQL] AppendingParquetOutputFormat should use reflection to
prevent breaking binary-compatibility.
The original problem is
[SPARK-3764](https://issues.apache.org/jira/browse/SPARK-3764
Github user ueshin commented on the pull request:
https://github.com/apache/spark/pull/2638#issuecomment-57800537
@srowen, Thank you for your comment.
Indeed, when deploying completed apps to a Spark cluster, there is a particular
instance of Spark.
But Spark app developers
Github user ueshin commented on the pull request:
https://github.com/apache/spark/pull/2673#issuecomment-58225042
Hi @pwendell, I had a similar issue related to artifacts in Maven Central
and Hadoop versions.
Could you take a look at
[SPARK-3764](https://issues.apache.org/jira
Github user ueshin commented on a diff in the pull request:
https://github.com/apache/spark/pull/1586#discussion_r15445571
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/stringOperations.scala
---
@@ -209,6 +209,70 @@ case class EndsWith(left
Github user ueshin commented on a diff in the pull request:
https://github.com/apache/spark/pull/1586#discussion_r15445580
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/stringOperations.scala
---
@@ -209,6 +209,70 @@ case class EndsWith(left
Github user ueshin commented on a diff in the pull request:
https://github.com/apache/spark/pull/1586#discussion_r15503212
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/stringOperations.scala
---
@@ -209,6 +212,81 @@ case class EndsWith(left
Github user ueshin commented on a diff in the pull request:
https://github.com/apache/spark/pull/1586#discussion_r15503576
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/stringOperations.scala
---
@@ -209,6 +212,81 @@ case class EndsWith(left
Github user ueshin commented on a diff in the pull request:
https://github.com/apache/spark/pull/1586#discussion_r15503732
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/stringOperations.scala
---
@@ -209,6 +212,81 @@ case class EndsWith(left
Github user ueshin commented on the pull request:
https://github.com/apache/spark/pull/1586#issuecomment-50437674
I'm sorry, but now I'm confused.
`Length` and `Strlen` look like they are becoming almost the same implementation.
What do you intend the difference between them
Github user ueshin commented on the pull request:
https://github.com/apache/spark/pull/1586#issuecomment-50442618
First, I would like to confirm: which do you want to add to HQL,
`Length` or `Strlen`?
The title of this PR says to add `Length` to HQL, but the implementation
Github user ueshin commented on the pull request:
https://github.com/apache/spark/pull/1586#issuecomment-50450124
@javadba Ah, I see. Thank you for the details.
I'll continue to review :)
---
If your project is set up for it, you can reply to this email and have your
reply appear
Github user ueshin commented on a diff in the pull request:
https://github.com/apache/spark/pull/1586#discussion_r15516740
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/stringOperations.scala
---
@@ -209,6 +212,82 @@ case class EndsWith(left
Github user ueshin commented on a diff in the pull request:
https://github.com/apache/spark/pull/1586#discussion_r15517240
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/stringOperations.scala
---
@@ -209,6 +212,82 @@ case class EndsWith(left
Github user ueshin commented on a diff in the pull request:
https://github.com/apache/spark/pull/1586#discussion_r15517292
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/stringOperations.scala
---
@@ -209,6 +212,82 @@ case class EndsWith(left
Github user ueshin commented on the pull request:
https://github.com/apache/spark/pull/1586#issuecomment-50465289
@javadba I couldn't understand what you want `Strlen` to return.
Could you clarify the semantics of `Strlen` again, please?
Github user ueshin commented on the pull request:
https://github.com/apache/spark/pull/1586#issuecomment-50554217
Hi, the helper `len()` is not needed.
What Hive's `Length` does is calculate the number of code points,
which can be done by `String#codePointCount
Github user ueshin commented on a diff in the pull request:
https://github.com/apache/spark/pull/1586#discussion_r15560814
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/stringOperations.scala
---
@@ -209,6 +212,82 @@ case class EndsWith(left
Github user ueshin commented on the pull request:
https://github.com/apache/spark/pull/993#issuecomment-50578802
Hi @marmbrus, thanks for the great work!
But it seems to break the build.
I got the following result when I ran `sbt assembly` or `sbt publish-local
Github user ueshin commented on the pull request:
https://github.com/apache/spark/pull/1586#issuecomment-50637172
Hi @javadba, FYI.
I believe there are 3 types of length around string in Java/Scala.
1) the number of 16-bit characters in the string
To get this, use
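The distinction can be sketched as below, assuming the three lengths are UTF-16 units, Unicode code points, and encoded bytes (the latter two are inferred from the surrounding discussion of `codePointCount` and the binary type; the first is stated in the comment):

```scala
// "𝒳" (U+1D4B3) is a supplementary character: it is stored as a surrogate
// pair (two 16-bit chars), is one code point, and takes four UTF-8 bytes.
val s = "𝒳a"

val utf16Units = s.length                       // 16-bit chars in the string: 3
val codePoints = s.codePointCount(0, s.length)  // Unicode code points: 2
val utf8Bytes  = s.getBytes("UTF-8").length     // encoded byte length: 5
```

Only the code-point count matches what Hive's `Length` is described as computing earlier in the thread.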
Github user ueshin commented on the pull request:
https://github.com/apache/spark/pull/1586#issuecomment-50858955
Oops, I had forgotten that Hive's `Length` can handle the binary type.
It would be better to use `Length` instead of `CharLength` and make it
handle the binary type
Github user ueshin commented on the pull request:
https://github.com/apache/spark/pull/1586#issuecomment-50947564
Hi @javadba, you said some of your results differ from mine, but which one is
different?
As far as I can see, these tests check the same values as my results.
Some
Github user ueshin commented on the pull request:
https://github.com/apache/spark/pull/1586#issuecomment-50947924
Ah, I see. Thanks.
Github user ueshin commented on a diff in the pull request:
https://github.com/apache/spark/pull/1586#discussion_r15725455
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/stringOperations.scala
---
@@ -208,6 +211,96 @@ case class EndsWith(left
Github user ueshin commented on a diff in the pull request:
https://github.com/apache/spark/pull/1586#discussion_r15725507
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/stringOperations.scala
---
@@ -208,6 +211,96 @@ case class EndsWith(left
Github user ueshin commented on a diff in the pull request:
https://github.com/apache/spark/pull/1586#discussion_r15726561
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/stringOperations.scala
---
@@ -208,6 +211,96 @@ case class EndsWith(left
Github user ueshin commented on a diff in the pull request:
https://github.com/apache/spark/pull/1750#discussion_r15740705
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/arithmetic.scala
---
@@ -33,6 +33,19 @@ case class UnaryMinus(child
Github user ueshin commented on the pull request:
https://github.com/apache/spark/pull/1586#issuecomment-51145793
Hi @javadba, I tested `org.apache.spark.sql.SQLQuerySuite` and
`org.apache.spark.sql.hive.execution.HiveQuerySuite` locally, and they worked
fine even if I reverted
Github user ueshin commented on the pull request:
https://github.com/apache/spark/pull/1586#issuecomment-51179894
@javadba, @marmbrus
I have seen the SOF case sometimes, though it was not with @javadba's
sequence.
I can't identify the exact reason now, but I guess
GitHub user ueshin opened a pull request:
https://github.com/apache/spark/pull/1887
[SPARK-2965][SQL] Fix HashOuterJoin output nullabilities.
Output attributes of opposite side of `OuterJoin` should be nullable.
You can merge this pull request into a Git repository by running
GitHub user ueshin opened a pull request:
https://github.com/apache/spark/pull/1889
[SPARK-2969][SQL] Make ScalaReflection be able to handle
MapType.containsNull and MapType.valueContainsNull.
Make `ScalaReflection` able to handle cases like:
- `Seq[Int]` as `ArrayType
Github user ueshin commented on a diff in the pull request:
https://github.com/apache/spark/pull/1889#discussion_r16094328
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/types/dataTypes.scala
---
@@ -372,7 +372,7 @@ object MapType
Github user ueshin commented on a diff in the pull request:
https://github.com/apache/spark/pull/1889#discussion_r16096763
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/types/dataTypes.scala
---
@@ -372,7 +372,7 @@ object MapType
Github user ueshin commented on a diff in the pull request:
https://github.com/apache/spark/pull/1889#discussion_r16097181
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/types/dataTypes.scala
---
@@ -372,7 +372,7 @@ object MapType
GitHub user ueshin opened a pull request:
https://github.com/apache/spark/pull/2825
SPARK-3969 Optimizer should have a super class as an interface.
Some developers want to replace `Optimizer` to fit their projects, but they
can't do so because `Optimizer` is currently an `object`.
You
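The proposed change can be sketched as follows; all names below are illustrative stand-ins, not Spark's actual `Optimizer` definitions:

```scala
// Before: a concrete `object Optimizer` cannot be replaced.
// After: an abstract super class acts as the interface.
abstract class OptimizerBase {
  def batchNames: Seq[String]  // simplified stand-in for the rule batches
}

// The default optimizer becomes one concrete instance of the interface.
object DefaultOptimizer extends OptimizerBase {
  def batchNames: Seq[String] = Seq("ConstantFolding", "NullPropagation")
}

// A project-specific optimizer can now substitute or extend the batches.
object MyOptimizer extends OptimizerBase {
  def batchNames: Seq[String] = DefaultOptimizer.batchNames :+ "MyCustomRule"
}
```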
Github user ueshin commented on a diff in the pull request:
https://github.com/apache/spark/pull/2825#discussion_r18995386
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala
---
@@ -28,7 +28,9 @@ import
Github user ueshin commented on the pull request:
https://github.com/apache/spark/pull/2825#issuecomment-59461447
Hi @chenghao-intel, thank you for your comment.
Yes, that's right. I don't want to mix logical-plan and physical-plan
optimization, and I'll extend `SparkStrategies
GitHub user ueshin opened a pull request:
https://github.com/apache/spark/pull/2835
[SPARK-3986][SQL] Fix package names to fit their directory names.
The package names of 2 test suites differ from their directory names.
- `GeneratedEvaluationSuite
Github user ueshin commented on a diff in the pull request:
https://github.com/apache/spark/pull/2825#discussion_r19050063
--- Diff:
sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/optimizer/ExpressionOptimizationSuite.scala
---
@@ -30,7 +30,7 @@ class
Github user ueshin commented on a diff in the pull request:
https://github.com/apache/spark/pull/2825#discussion_r19050137
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala
---
@@ -28,7 +28,9 @@ import
Github user ueshin commented on a diff in the pull request:
https://github.com/apache/spark/pull/990#discussion_r13576056
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala
---
@@ -154,6 +155,10 @@ object NullPropagation extends Rule
Github user ueshin commented on a diff in the pull request:
https://github.com/apache/spark/pull/990#discussion_r13576722
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala
---
@@ -154,6 +155,10 @@ object NullPropagation extends Rule
Github user ueshin commented on the pull request:
https://github.com/apache/spark/pull/990#issuecomment-45582389
@marmbrus I checked and removed the unneeded rules from `NullPropagation`.
Could you please check the changes?
GitHub user ueshin opened a pull request:
https://github.com/apache/spark/pull/1034
[SPARK-2093] [SQL] NullPropagation should use exact type value.
`NullPropagation` should use an exactly typed value when transforming `Count`
or `Sum`.
You can merge this pull request into a Git repository
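The point about exact types can be sketched on a toy expression tree (illustrative types, not Catalyst's actual classes):

```scala
sealed trait Expr
case object NullLit extends Expr
case class Count(child: Expr) extends Expr
case class Lit(value: Any) extends Expr

// Count over an always-null child is always 0 -- but Count returns a Long,
// so the replacement must be the exactly typed literal 0L, not an Int 0.
def propagateNull(e: Expr): Expr = e match {
  case Count(NullLit) => Lit(0L)
  case other          => other
}
```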
Github user ueshin commented on a diff in the pull request:
https://github.com/apache/spark/pull/1034#discussion_r13584575
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala
---
@@ -100,8 +100,8 @@ object ColumnPruning extends Rule
Github user ueshin commented on a diff in the pull request:
https://github.com/apache/spark/pull/990#discussion_r13635448
--- Diff:
sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/optimizer/ConstantFoldingSuite.scala
---
@@ -173,4 +173,63 @@ class ConstantFoldingSuite
GitHub user ueshin opened a pull request:
https://github.com/apache/spark/pull/1133
[SPARK-2196] [SQL] Fix nullability of CaseWhen.
`CaseWhen` should use `branches.length` to check whether `elseValue` is
provided.
You can merge this pull request into a Git repository by running
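A minimal sketch of the nullability rule described above, using plain nullability flags instead of Catalyst expressions (the layout of `branches` as condition/value pairs with an optional trailing else value is assumed from the description):

```scala
// branches = cond1, value1, cond2, value2, ..., optionally a trailing
// elseValue; each entry here is just its nullability flag.
case class CaseWhenSketch(branchNullable: Seq[Boolean]) {
  // An odd length means the trailing elseValue is present.
  def hasElse: Boolean = branchNullable.length % 2 == 1

  // Nullable when there is no ELSE (a non-matching row yields NULL), or
  // when any produced value -- including the elseValue -- is nullable.
  def nullable: Boolean = {
    val valueIndices = branchNullable.indices.filter(i =>
      i % 2 == 1 || (hasElse && i == branchNullable.length - 1))
    !hasElse || valueIndices.exists(branchNullable)
  }
}
```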
Github user ueshin commented on the pull request:
https://github.com/apache/spark/pull/1133#issuecomment-46633857
Added some tests, and I noticed that `CaseWhen` should also be nullable
if the `elseValue` is nullable.
GitHub user ueshin opened a pull request:
https://github.com/apache/spark/pull/1193
[SPARK-2254] [SQL] ScalaReflection should mark primitive types as
non-nullable.
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/ueshin/apache
GitHub user ueshin opened a pull request:
https://github.com/apache/spark/pull/1226
[SPARK-2287] [SQL] Make ScalaReflection be able to handle Generic case
classes.
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/ueshin/apache
Github user ueshin commented on the pull request:
https://github.com/apache/spark/pull/1226#issuecomment-47192918
This will cause a merge conflict with #1193, but I can fix it soon.
Github user ueshin commented on the pull request:
https://github.com/apache/spark/pull/1193#issuecomment-47193596
@rxin Thanks! I will fix it soon.
Github user ueshin commented on the pull request:
https://github.com/apache/spark/pull/1226#issuecomment-47194382
Fixed the merge conflict with #1193.
Github user ueshin commented on the pull request:
https://github.com/apache/spark/pull/1226#issuecomment-47206743
Could you please retest this?
In the previous tests, something seemed to be wrong with the Hive metastore.
(Can I ask Jenkins to retest?)
GitHub user ueshin opened a pull request:
https://github.com/apache/spark/pull/1235
[SPARK-2295] [SQL] Make JavaBeans nullability stricter.
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/ueshin/apache-spark issues/SPARK-2295
Github user ueshin commented on the pull request:
https://github.com/apache/spark/pull/1226#issuecomment-47298915
@marmbrus I see, I will do it next time, thanks.
But why did it fail...
Github user ueshin commented on a diff in the pull request:
https://github.com/apache/spark/pull/1151#discussion_r14285252
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/SqlParser.scala ---
@@ -136,13 +137,12 @@ class SqlParser extends StandardTokenParsers
Github user ueshin commented on the pull request:
https://github.com/apache/spark/pull/1226#issuecomment-47328485
The Hive tests passed. Why? I just merged master.
And the Python tests failed...
Github user ueshin commented on the pull request:
https://github.com/apache/spark/pull/1226#issuecomment-47416531
Hi @marmbrus,
I found that there is a case on my local machine throwing the same exception
with `test-only`, like:
```
testOnly
```
Github user ueshin commented on the pull request:
https://github.com/apache/spark/pull/1226#issuecomment-47417440
Ah, we must call `SHOW TABLES` before `TestHive.reset()` the same as
[here](https://github.com/ueshin/apache-spark/blob/issues/SPARK-2287/sql/hive/src/test/scala/org
GitHub user ueshin opened a pull request:
https://github.com/apache/spark/pull/1266
[SPARK-2327] [SQL] Fix nullabilities of Join/Generate/Aggregate.
Fix the nullabilities of `Join`/`Generate`/`Aggregate` because:
- Output attributes of the opposite side of an `OuterJoin` should be nullable
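The outer-join rule can be sketched as a small nullability function (illustrative, not Spark's actual code):

```scala
sealed trait JoinType
case object Inner      extends JoinType
case object LeftOuter  extends JoinType
case object RightOuter extends JoinType
case object FullOuter  extends JoinType

// An attribute from the side opposite the preserved side can be NULL-padded
// when no match exists, so it must be nullable whatever its input says.
def outputNullable(jt: JoinType, fromLeft: Boolean, inputNullable: Boolean): Boolean =
  jt match {
    case Inner      => inputNullable
    case LeftOuter  => if (fromLeft) inputNullable else true  // right side padded
    case RightOuter => if (fromLeft) true else inputNullable  // left side padded
    case FullOuter  => true                                   // both sides padded
  }
```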
GitHub user ueshin opened a pull request:
https://github.com/apache/spark/pull/1268
[SPARK-2328] [SQL] Add execution of `SHOW TABLES` before `TestHive.reset()`.
Unfortunately, `PruningSuite` is executed first among the Hive tests, and
`TestHive.reset()` breaks the test environment
Github user ueshin commented on the pull request:
https://github.com/apache/spark/pull/1266#issuecomment-47518136
I guess this needs #1268 to pass Hive tests.
Github user ueshin commented on the pull request:
https://github.com/apache/spark/pull/1268#issuecomment-47518299
If #1226 is merged before this, this is not needed.
Github user ueshin commented on the pull request:
https://github.com/apache/spark/pull/1266#issuecomment-47635811
Oops, it passed the #1268-related errors, but others failed...
Github user ueshin commented on the pull request:
https://github.com/apache/spark/pull/906#issuecomment-47743422
Hi, I encountered this kind of Servlet API conflict.
When are you planning to merge this? Or is there something left to do?
Github user ueshin commented on the pull request:
https://github.com/apache/spark/pull/906#issuecomment-47804679
Should we close this?
I encountered it after #1271 was merged.
I believe we at least need to exclude it from the Hive-related dependencies.
Github user ueshin commented on a diff in the pull request:
https://github.com/apache/spark/pull/1266#discussion_r14543737
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/basicOperators.scala
---
@@ -46,10 +46,22 @@ case class Generate
Github user ueshin commented on a diff in the pull request:
https://github.com/apache/spark/pull/1266#discussion_r14543764
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/joins.scala ---
@@ -319,7 +319,27 @@ case class BroadcastNestedLoopJoin
Github user ueshin commented on the pull request:
https://github.com/apache/spark/pull/1266#issuecomment-4714
@marmbrus, Thank you for your comments.
Modified to use `withNullability`.
And no problem, because exact types and nullability are important for my
project
Github user ueshin commented on a diff in the pull request:
https://github.com/apache/spark/pull/1150#discussion_r14569171
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/SparkStrategies.scala ---
@@ -274,7 +274,7 @@ private[sql] abstract class SparkStrategies
Github user ueshin commented on the pull request:
https://github.com/apache/spark/pull/1301#issuecomment-48077448
Thank you for your comments.
Fixed and pushed.
GitHub user ueshin opened a pull request:
https://github.com/apache/spark/pull/1339
[SPARK-2415] [SQL] RowWriteSupport should handle empty ArrayType correctly.
`RowWriteSupport` doesn't write an empty `ArrayType` value, so the read value
becomes `null`.
It should write empty
Github user ueshin commented on the pull request:
https://github.com/apache/spark/pull/1346#issuecomment-48560066
Hi, I'm wondering if `MapType` will get something like `ArrayType`'s
`containsNull`.
Github user ueshin commented on the pull request:
https://github.com/apache/spark/pull/1346#issuecomment-48566629
@yhuai, I understand. Thank you for your reply.
GitHub user ueshin opened a pull request:
https://github.com/apache/spark/pull/1355
[SPARK-2428][SQL] Add except and intersect methods to SchemaRDD.
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/ueshin/apache-spark issues
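The semantics of the two new methods can be sketched on plain Scala collections (the method names `except`/`intersect` come from the PR title; everything else here is illustrative):

```scala
// Set semantics, as in SQL EXCEPT / INTERSECT: duplicates are removed
// before the membership test.
val left  = Seq(1, 2, 2, 3, 4)
val right = Seq(3, 4, 5)

val exceptResult    = left.distinct.filterNot(right.contains)  // rows only on the left
val intersectResult = left.distinct.filter(right.contains)     // rows on both sides
```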
GitHub user ueshin opened a pull request:
https://github.com/apache/spark/pull/1357
[SPARK-2431][SQL] Refine StringComparison and related codes.
Refine `StringComparison` and related code as follows:
- `StringComparison` could be similar to `StringRegexExpression
GitHub user ueshin opened a pull request:
https://github.com/apache/spark/pull/1373
[SPARK-2446][SQL] Add BinaryType support to Parquet I/O.
To support `BinaryType`, the following changes are needed:
- Make `StringType` use `OriginalType.UTF8`
- Add `BinaryType` using
Github user ueshin commented on the pull request:
https://github.com/apache/spark/pull/1373#issuecomment-48781530
@marmbrus Yes, I think so.
But this new behavior is the same as that of Avro, Thrift, and the upcoming
Hive (0.14).
To load string data saved with previous versions, `Cast
GitHub user ueshin opened a pull request:
https://github.com/apache/spark/pull/1398
[SPARK-2467] Revert SparkBuild to publish-local to both .m2 and .ivy2.
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/ueshin/apache-spark
Github user ueshin commented on a diff in the pull request:
https://github.com/apache/spark/pull/1359#discussion_r14915562
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/stringOperations.scala
---
@@ -205,3 +207,72 @@ case class EndsWith(left
GitHub user ueshin opened a pull request:
https://github.com/apache/spark/pull/1426
[SPARK-2504][SQL] Fix nullability of Substring expression.
This is a follow-up of #1359 with nullability narrowing.
You can merge this pull request into a Git repository by running:
$ git pull
GitHub user ueshin opened a pull request:
https://github.com/apache/spark/pull/1428
[SPARK-2509][SQL] Add optimization for Substring.
`Substring` expressions including `null` literal cases could be added to
`NullPropagation`.
You can merge this pull request into a Git repository by running
Github user ueshin commented on a diff in the pull request:
https://github.com/apache/spark/pull/1428#discussion_r14981973
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala
---
@@ -171,6 +171,9 @@ object NullPropagation extends Rule
Github user ueshin commented on the pull request:
https://github.com/apache/spark/pull/1428#issuecomment-49125413
@rxin Ah, wait for a moment.
What do you think about foldability of the `Substring` I mentioned above?
Github user ueshin commented on the pull request:
https://github.com/apache/spark/pull/1428#issuecomment-49125499
I'll open another issue about foldability of `Substring`.
Github user ueshin commented on the pull request:
https://github.com/apache/spark/pull/1428#issuecomment-49125889
@rxin I agree that we need a better way for `NullPropagation` instead of
too much pattern matching.
GitHub user ueshin opened a pull request:
https://github.com/apache/spark/pull/1432
[SPARK-2518][SQL] Fix foldability of Substring expression.
This is a follow-up of #1428.
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/ueshin
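Foldability, as it applies to `Substring`, can be sketched on a toy expression tree (illustrative names, not Catalyst's actual classes):

```scala
sealed trait FoldExpr { def foldable: Boolean }
case class FoldLit(v: Any) extends FoldExpr { val foldable = true }
case class Attr(name: String) extends FoldExpr { val foldable = false }

case class Substr(str: FoldExpr, pos: FoldExpr, len: FoldExpr) extends FoldExpr {
  // Foldable exactly when every child is foldable, so a Substring of
  // literals can be evaluated once at optimization time.
  def foldable: Boolean = str.foldable && pos.foldable && len.foldable
}
```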
Github user ueshin commented on a diff in the pull request:
https://github.com/apache/spark/pull/1440#discussion_r14993202
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/Cast.scala
---
@@ -249,3 +263,7 @@ case class Cast(child: Expression
GitHub user ueshin opened a pull request:
https://github.com/apache/spark/pull/1451
[SPARK-2535][SQL] Add StringComparison case to NullPropagation.
`StringComparison` expressions including `null` literal cases could be
added to `NullPropagation`.
You can merge this pull request
GitHub user ueshin opened a pull request:
https://github.com/apache/spark/pull/1491
[SPARK-2588][SQL] Add some more DSLs.
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/ueshin/apache-spark issues/SPARK-2588
Alternatively you
Github user ueshin commented on the pull request:
https://github.com/apache/spark/pull/15#issuecomment-38432140
Added a test case.
Github user ueshin commented on the pull request:
https://github.com/apache/spark/pull/15#issuecomment-38757007
`LocalActor` has 4 threads to handle Actor messages, but the replacement of
the ContextClassLoader happens on only one of them.
I believe that the **UNBALANCED** state
Github user ueshin commented on the pull request:
https://github.com/apache/spark/pull/15#issuecomment-38766469
I checked the related code:
-
[CoarseGrainedExecutorBackend.scala#L56](https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/executor
Github user ueshin commented on the pull request:
https://github.com/apache/spark/pull/15#issuecomment-38872508
No, but what should I do? Rebase onto the current master branch?
Github user ueshin commented on the pull request:
https://github.com/apache/spark/pull/15#issuecomment-38872759
Ah I see, there are some changes in `Executor`.
Github user ueshin commented on the pull request:
https://github.com/apache/spark/pull/15#issuecomment-38877616
Rebased onto master and pushed.
Github user ueshin commented on the pull request:
https://github.com/apache/spark/pull/15#issuecomment-38890320
Thanks!
GitHub user ueshin opened a pull request:
https://github.com/apache/spark/pull/283
SPARK-1380: Add sort-merge based cogroup/joins.
I've written cogroup/joins based on the 'Sort-Merge' algorithm.
You can merge this pull request into a Git repository by running:
$ git pull https
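A minimal sort-merge join sketch, assuming unique keys per side for brevity (a full implementation would also group runs of duplicate keys, as cogroup requires):

```scala
// Two-pointer merge join over inputs already sorted by key: no hashing,
// and no need to buffer a whole side in memory.
def mergeJoin[A, B](left: Seq[(Int, A)], right: Seq[(Int, B)]): Seq[(Int, (A, B))] = {
  val out = Seq.newBuilder[(Int, (A, B))]
  var i = 0
  var j = 0
  while (i < left.length && j < right.length) {
    val (lk, lv) = left(i)
    val (rk, rv) = right(j)
    if (lk < rk) i += 1                              // advance the smaller key
    else if (lk > rk) j += 1
    else { out += ((lk, (lv, rv))); i += 1; j += 1 } // keys match: emit pair
  }
  out.result()
}
```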
Github user ueshin commented on the pull request:
https://github.com/apache/spark/pull/283#issuecomment-39417683
@rxin Thank you for your reply.
There are some cases where merge join can be used for optimization:
1. If the data to be joined are already sorted by the join keys, merge join
Github user ueshin commented on the pull request:
https://github.com/apache/spark/pull/283#issuecomment-39421176
@mridulm Thank you for your reply.
There are 2 points I have to mention about memory:
- Before the shuffle
If the data are sorted, no more memory is needed
1 - 100 of 2567 matches