GitHub user wangyum opened a pull request:
https://github.com/apache/spark/pull/18019
[SPARK-20748][SQL] Add built-in SQL function CH[A]R.
## What changes were proposed in this pull request?
Add built-in SQL function `CH[A]R`:
For `CHR(bigint|double n)`, returns the ASCII
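A minimal sketch of the expected behavior, assuming the function is registered as `chr` (as it is in later Spark releases):
```scala
import org.apache.spark.sql.SparkSession

// Hedged illustration: chr(n) maps a numeric ASCII code to its character.
val spark = SparkSession.builder().master("local[*]").appName("chr-sketch").getOrCreate()
spark.sql("SELECT chr(65)").show()  // the single row should be 'A'
spark.stop()
```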
GitHub user wangyum opened a pull request:
https://github.com/apache/spark/pull/17999
[SPARK-20751][SQL] Add built-in SQL Function - COT
## What changes were proposed in this pull request?
Add built-in SQL Function - COT.
## How was this patch tested?
Unit tests.
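For reference, a minimal sketch of the expected behavior, assuming `cot(x)` is defined as `cos(x) / sin(x)`:
```scala
import org.apache.spark.sql.SparkSession

// Hedged illustration: cot(1) = cos(1) / sin(1) ≈ 0.6420926159343306
val spark = SparkSession.builder().master("local[*]").appName("cot-sketch").getOrCreate()
spark.sql("SELECT cot(1)").show()
spark.stop()
```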
Github user wangyum closed the pull request at:
https://github.com/apache/spark/pull/17558
---
Github user wangyum commented on the issue:
https://github.com/apache/spark/pull/17558
@HyukjinKwon
This implementation is not elegant, but it solves my problem; I'll apply it
to my own branch later.
---
Github user wangyum commented on the issue:
https://github.com/apache/spark/pull/14720
It has been fixed by https://github.com/apache/spark/pull/17342.
---
Github user wangyum commented on the issue:
https://github.com/apache/spark/pull/17920
@hvanhovell I closed the outdated PR: https://github.com/apache/spark/pull/15259
---
Github user wangyum closed the pull request at:
https://github.com/apache/spark/pull/15259
---
GitHub user wangyum opened a pull request:
https://github.com/apache/spark/pull/17920
[SPARK-17685][SQL] Make SortMergeJoinExec's currentVars null when
calling createJoinKey
## What changes were proposed in this pull request?
The following SQL query causes
Github user wangyum commented on the issue:
https://github.com/apache/spark/pull/15259
Jenkins, retest this please
---
Github user wangyum commented on the issue:
https://github.com/apache/spark/pull/15259
Jenkins, retest this please
---
Github user wangyum commented on the issue:
https://github.com/apache/spark/pull/15259
@hvanhovell I updated the PR description.
---
Github user wangyum commented on the issue:
https://github.com/apache/spark/pull/17886
@gatorsmile `JDBC client` and `beeline` both work.
---
Github user wangyum commented on the issue:
https://github.com/apache/spark/pull/17886
CC @srowen, @liancheng Can you please review this change?
---
Github user wangyum closed the pull request at:
https://github.com/apache/spark/pull/15466
---
GitHub user wangyum opened a pull request:
https://github.com/apache/spark/pull/17886
[SPARK-13983][SQL][WIP] Fix HiveThriftServer2 cannot get "--hiveconf" and
"--hivevar" variables since 2.x
## What changes were proposed in this pull request?
Fix HiveTh
Github user wangyum commented on the issue:
https://github.com/apache/spark/pull/17856
Jenkins, retest this please
---
GitHub user wangyum opened a pull request:
https://github.com/apache/spark/pull/17856
[SPARK-19660][SQL] Replace the deprecated property name fs.default.name with
the newly introduced fs.defaultFS
## What changes were proposed in this pull request?
Replace the deprecated
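The rename is a drop-in replacement; a minimal sketch (the NameNode address is a placeholder):
```scala
import org.apache.hadoop.conf.Configuration

val conf = new Configuration()
// Deprecated spelling, still honored via Hadoop's deprecation table:
//   conf.set("fs.default.name", "hdfs://namenode:8020")
// Current spelling introduced by Hadoop 2.x:
conf.set("fs.defaultFS", "hdfs://namenode:8020")
```
Because Hadoop translates the deprecated key automatically, the change is a cleanup rather than a behavior fix.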
Github user wangyum closed the pull request at:
https://github.com/apache/spark/pull/17637
---
GitHub user wangyum opened a pull request:
https://github.com/apache/spark/pull/17637
[SPARK-20337][CORE] Support upgrading a jar dependency without restarting
SparkContext
## What changes were proposed in this pull request?
Support upgrading a jar dependency without restarting
Github user wangyum commented on the issue:
https://github.com/apache/spark/pull/17558
`SparkContext` supports `add jar`, but doesn't support `uninstall jar`.
Imagine that I have a spark-sql or
[spark-thriftserver](https://github.com/apache/spark/tree/v2.1.0/sql/hive-thriftserver
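A minimal illustration of the asymmetry (the jar path is a placeholder):
```scala
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder().master("local[*]").getOrCreate()
// Jars can be added at runtime...
spark.sparkContext.addJar("/path/to/udf.jar")
// ...but there is no removeJar counterpart, so a long-running
// spark-sql or thriftserver process keeps every jar ever added.
```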
GitHub user wangyum opened a pull request:
https://github.com/apache/spark/pull/17558
[SPARK-20247][CORE] An added jar that later goes missing shouldn't affect
jobs that don't use this jar
## What changes were proposed in this pull request?
Catch exception when jar
Github user wangyum closed the pull request at:
https://github.com/apache/spark/pull/17505
---
Github user wangyum commented on the issue:
https://github.com/apache/spark/pull/17505
@gatorsmile You are right. It is similar to
[HIVE-12908](https://github.com/apache/hive/commit/26268deb4844d3f3c530769c6276b17b0c6caaa0).
There are 3 bottlenecks for many output files
Github user wangyum commented on a diff in the pull request:
https://github.com/apache/spark/pull/17505#discussion_r109852106
--- Diff:
sql/hive/src/main/scala/org/apache/spark/sql/hive/client/HiveClientImpl.scala
---
@@ -694,12 +694,25 @@ private[hive] class HiveClientImpl
Github user wangyum commented on a diff in the pull request:
https://github.com/apache/spark/pull/17505#discussion_r109851825
--- Diff:
sql/hive/src/main/scala/org/apache/spark/sql/hive/client/HiveClientImpl.scala
---
@@ -694,12 +694,25 @@ private[hive] class HiveClientImpl
Github user wangyum commented on a diff in the pull request:
https://github.com/apache/spark/pull/17505#discussion_r109846638
--- Diff:
sql/hive/src/main/scala/org/apache/spark/sql/hive/client/HiveShim.scala ---
@@ -242,6 +251,16 @@ private[client] class Shim_v0_12 extends Shim
GitHub user wangyum opened a pull request:
https://github.com/apache/spark/pull/17505
[SPARK-20187][SQL] Replace loadTable with moveFile to speed up loading
tables with many output files
## What changes were proposed in this pull request?
[HiveClientImpl.loadTable](https
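The description is truncated, but the gist is that an HDFS rename is a metadata-only operation, so moving files beats copying them one at a time. A rough sketch under that assumption (paths are hypothetical):
```scala
import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.fs.{FileSystem, Path}

// Sketch only: rename the staging output into the table location
// instead of letting Hive's loadTable copy each file.
val fs = FileSystem.get(new Configuration())
val staging = new Path("/tmp/staging/part-00000")  // hypothetical
val target  = new Path("/warehouse/t/part-00000")  // hypothetical
fs.rename(staging, target)
```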
GitHub user wangyum opened a pull request:
https://github.com/apache/spark/pull/17449
[SPARK-20120][SQL] Support silent mode in spark-sql
## What changes were proposed in this pull request?
It is similar to Hive's silent mode: only the query result is shown. See: [Hive
GitHub user wangyum opened a pull request:
https://github.com/apache/spark/pull/17442
[SPARK-20107][SQL] Speed up HadoopMapReduceCommitProtocol#commitJob for
many output files
## What changes were proposed in this pull request?
Set
Github user wangyum commented on the issue:
https://github.com/apache/spark/pull/17362
@weiqingy is working on [Allow adding jars from
hdfs](https://github.com/apache/spark/pull/17342).
---
Github user wangyum commented on the issue:
https://github.com/apache/spark/pull/15466
@srowen Please help review, thanks a lot.
---
Github user wangyum closed the pull request at:
https://github.com/apache/spark/pull/13735
---
Github user wangyum commented on the issue:
https://github.com/apache/spark/pull/17020
@srowen The Hadoop
[DeprecatedProperties](https://hadoop.apache.org/docs/r2.6.0/hadoop-project-dist/hadoop-common/DeprecatedProperties.html)
entry `mapred.reduce.tasks` can be automatically converted
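Hadoop's `Configuration` applies that deprecation table transparently, which can be checked directly:
```scala
import org.apache.hadoop.conf.Configuration

val conf = new Configuration()
conf.set("mapred.reduce.tasks", "10")        // deprecated name
// The deprecation table maps the old key to the new one:
println(conf.get("mapreduce.job.reduces"))   // prints "10"
```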
Github user wangyum commented on the issue:
https://github.com/apache/incubator-hivemall/pull/59
Be consistent with Spark.
https://github.com/apache/spark/blob/d76c18746bdd0908bae89e62da9667001be75624/pom.xml#L125
---
Github user wangyum commented on the issue:
https://github.com/apache/spark/pull/17162
cc @srowen
---
GitHub user wangyum opened a pull request:
https://github.com/apache/spark/pull/17162
[SPARK-19550][SparkR][DOCS] Update R document to use JDK8
## What changes were proposed in this pull request?
Update R document to use JDK8.
## How was this patch tested
GitHub user wangyum opened a pull request:
https://github.com/apache/incubator-hivemall/pull/59
[HIVEMALL-85] Upgrade hivemall-xgboost's hadoop version to 2.6.5
## What changes were proposed in this pull request?
Upgrade hivemall-xgboost's hadoop version to 2.6.5
Github user wangyum commented on the issue:
https://github.com/apache/spark/pull/16819
@vanzin What do you think about the current approach? I have tested on the same
Spark hive-thriftserver; `spark.dynamicAllocation.maxExecutors` will
decrease if I kill 4 NodeManagers:
```
17
```
Github user wangyum commented on the issue:
https://github.com/apache/spark/pull/16990
OK. I have reverted `set
hive.mapreduce.job.reduces.speculative.execution=false` to `set
hive.mapred.reduce.tasks.speculative.execution=false`.
---
Github user wangyum commented on a diff in the pull request:
https://github.com/apache/spark/pull/16990#discussion_r103200223
--- Diff: python/pyspark/tests.py ---
@@ -1515,12 +1515,12 @@ def test_oldhadoop(self):
conf
Github user wangyum commented on the issue:
https://github.com/apache/spark/pull/16819
@vanzin We must pull the configuration from the ResourceManager;
the ResourceManager can't push.
So set the max before each stage? That feels too frequent. In fact, this is suitable
GitHub user wangyum opened a pull request:
https://github.com/apache/spark/pull/17020
[SPARK-19693][SQL] Make SET mapreduce.job.reduces automatically
convert to spark.sql.shuffle.partitions
## What changes were proposed in this pull request?
Make the `SET
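A sketch of the proposed behavior (this is the PR's intent, not Spark's behavior at the time):
```scala
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder().master("local[*]").getOrCreate()
// Proposed: the Hadoop-style SET...
spark.sql("SET mapreduce.job.reduces=10")
// ...would be treated as an alias for:
spark.sql("SET spark.sql.shuffle.partitions=10")
```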
Github user wangyum commented on a diff in the pull request:
https://github.com/apache/spark/pull/16819#discussion_r102365927
--- Diff:
resource-managers/yarn/src/main/scala/org/apache/spark/deploy/yarn/Client.scala
---
@@ -1193,6 +1189,37 @@ private[spark] class Client
Github user wangyum commented on the issue:
https://github.com/apache/spark/pull/16990
@srowen @felixcheung
The SQL query is related to the file name; see:
https://github.com/apache/spark/blob/v2.1.0/sql/hive/src/test/scala/org/apache/spark/sql/hive/execution
Github user wangyum commented on the issue:
https://github.com/apache/spark/pull/16990
I'm working on the test failures.
---
Github user wangyum commented on the issue:
https://github.com/apache/spark/pull/16819
@srowen Dynamically setting `spark.dynamicAllocation.maxExecutors` can avoid
some strange problems:
1. [Spark application hang when dynamic allocation is
enabled](https://issues.apache.org/jira
GitHub user wangyum opened a pull request:
https://github.com/apache/spark/pull/16990
[SPARK-19660][CORE][SQL] Replace the configuration property names that are
deprecated as of Hadoop 2.6
## What changes were proposed in this pull request?
Replace all
Github user wangyum commented on the issue:
https://github.com/apache/incubator-hivemall/pull/46
@myui I agree with you.
---
GitHub user wangyum opened a pull request:
https://github.com/apache/incubator-hivemall/pull/45
[HIVEMALL-71] Handle null values and add unit tests to RescaleUDF.
## What changes were proposed in this pull request?
Handle null values and add unit tests to `RescaleUDF
Github user wangyum commented on the issue:
https://github.com/apache/incubator-hivemall/pull/44
@maropu, @myui I have sent an email to
`d...@hivemall.incubator.apache.org`. Have you received it?
---
Github user wangyum commented on the issue:
https://github.com/apache/incubator-hivemall/pull/44
OK.
`hivemall.smile.regression.RandomForestRegressionUDTF` creates 2 functions.
Should `define-all.spark` also add them?
https://github.com/apache/incubator-hivemall/blob/master
Github user wangyum commented on the issue:
https://github.com/apache/incubator-hivemall/pull/44
I'm not sure whether these functions should be added to `define-all.spark`:
https://github.com/apache/incubator-hivemall/blob/master/resources/ddl/define-all.hive#L467-L483
---
GitHub user wangyum opened a pull request:
https://github.com/apache/incubator-hivemall/pull/44
[HIVEMALL-65] Update define-all.spark
## What changes were proposed in this pull request?
Keep define-all.spark in correspondence with define-all.hive.
## What type
GitHub user wangyum opened a pull request:
https://github.com/apache/incubator-hivemall/pull/38
Support spark-sql
## What changes were proposed in this pull request?
Support using spark-sql to source define-all.hive,
define-additional.hive, and define-all.deprecated.hive
Github user wangyum commented on the issue:
https://github.com/apache/spark/pull/16819
It will reduce the number of calls to
[CoarseGrainedSchedulerBackend.requestTotalExecutors()](https://github.com/apache/spark/blob/v2.1.0/core/src/main/scala/org/apache/spark/scheduler/cluster
GitHub user wangyum opened a pull request:
https://github.com/apache/spark/pull/16819
[SPARK-16441][YARN] Set maxNumExecutors depending on YARN cluster resources.
## What changes were proposed in this pull request?
Dynamically set `spark.dynamicAllocation.maxExecutors` based on cluster
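The description is truncated; a back-of-the-envelope sketch of deriving such a cap from cluster totals (all figures and names are hypothetical):
```scala
// Hypothetical YARN cluster totals and per-executor sizing.
val clusterMemoryMb  = 512L * 1024   // total YARN memory
val clusterVcores    = 256           // total YARN vcores
val executorMemoryMb = 8L * 1024     // executor memory incl. overhead
val executorCores    = 4             // spark.executor.cores

// The cluster can host at most this many executors:
val byMemory = clusterMemoryMb / executorMemoryMb  // 64
val byCores  = clusterVcores / executorCores       // 64
val maxExecutors = math.min(byMemory, byCores).toInt
// maxExecutors would then seed spark.dynamicAllocation.maxExecutors.
```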
Github user wangyum commented on the issue:
https://github.com/apache/incubator-hivemall/pull/23
@maropu, do not merge it. I will fix the `XGBoostSuite` test failure after the
Spring Festival.
---
GitHub user wangyum opened a pull request:
https://github.com/apache/incubator-hivemall/pull/23
[HIVEMALL-31] Change the spark-2.0 branch to spark-2 and upgrade Spark
to 2.1.0
## What changes were proposed in this pull request?
Change the spark-2.0 branch to spark-2
Github user wangyum commented on the issue:
https://github.com/apache/spark/pull/16527
I don't quite understand what @srowen mentioned, so I simply changed it to drop
`dataSize - retainedSize + retainedSize / 10` items at a time.
If max is 100 and there are 150 items, it would drop 60 (150 - 100 + 100 / 10) at a time.
Github user wangyum commented on the issue:
https://github.com/apache/spark/pull/16527
I used the following code to log how long trimming stages/jobs takes:
```scala
/** If stages is too large, remove and garbage collect old stages */
private def trimStagesIfNecessary(stages
```
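The snippet above is cut off; a self-contained sketch of the drop rule discussed in this thread (`dataSize - retainedSize + retainedSize / 10`), using the 150-vs-100 example from the earlier comment:
```scala
import scala.collection.mutable.ListBuffer

// Drop the overflow plus an extra 10% of the limit, so trimming
// runs less often than dropping one element at a time would.
def numberToRemove(dataSize: Int, retainedSize: Int): Int =
  if (dataSize > retainedSize) dataSize - retainedSize + retainedSize / 10 else 0

val stages = ListBuffer.tabulate(150)(identity)
val toDrop = numberToRemove(stages.size, 100)  // 150 - 100 + 10 = 60
stages.remove(0, toDrop)                       // oldest entries first
```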
Github user wangyum commented on a diff in the pull request:
https://github.com/apache/spark/pull/16527#discussion_r95377515
--- Diff:
core/src/main/scala/org/apache/spark/ui/jobs/JobProgressListener.scala ---
@@ -409,7 +409,8 @@ class JobProgressListener(conf: SparkConf) extends
GitHub user wangyum opened a pull request:
https://github.com/apache/spark/pull/16527
[SPARK-19146][Core] Drop more elements when stageData.taskData.size >
retainedTasks
## What changes were proposed in this pull request?
Drop more elements when `stageData.taskData.s
Github user wangyum closed the pull request at:
https://github.com/apache/spark/pull/16526
---
GitHub user wangyum opened a pull request:
https://github.com/apache/spark/pull/16526
Drop more elements when stageData.taskData.size > retainedTasks
## What changes were proposed in this pull request?
(Please fill in changes proposed in this fix)
##
Github user wangyum commented on the issue:
https://github.com/apache/spark/pull/16252
@srowen I have restored it.
---
Github user wangyum commented on a diff in the pull request:
https://github.com/apache/spark/pull/16252#discussion_r92439275
--- Diff:
core/src/main/scala/org/apache/spark/storage/memory/MemoryStore.scala ---
@@ -694,7 +694,7 @@ private[storage] class PartiallyUnrolledIterator[T
Github user wangyum commented on a diff in the pull request:
https://github.com/apache/spark/pull/16252#discussion_r92396555
--- Diff:
core/src/main/scala/org/apache/spark/storage/memory/MemoryStore.scala ---
@@ -694,7 +694,7 @@ private[storage] class PartiallyUnrolledIterator[T
Github user wangyum commented on the issue:
https://github.com/apache/spark/pull/16252
@srowen @viirya I have added it.
---
Github user wangyum commented on the issue:
https://github.com/apache/spark/pull/16252
retest this please.
---
Github user wangyum commented on the issue:
https://github.com/apache/spark/pull/16252
`org.apache.spark.sql.kafka010.KafkaSourceStressForDontFailOnDataLossSuite.stress
test for failOnDataLoss=false` succeeded in my local run.
---
Github user wangyum commented on a diff in the pull request:
https://github.com/apache/spark/pull/16252#discussion_r91969759
--- Diff:
core/src/main/scala/org/apache/spark/storage/memory/MemoryStore.scala ---
@@ -694,7 +694,7 @@ private[storage] class PartiallyUnrolledIterator[T
GitHub user wangyum opened a pull request:
https://github.com/apache/spark/pull/16252
[SPARK-18827][Core] Fix failure to read broadcast on disk
## What changes were proposed in this pull request?
Fix failure to read broadcast on disk
## How was this patch tested?
Add
Github user wangyum commented on the issue:
https://github.com/apache/spark/pull/16122
@mallman OK, thanks
---
Github user wangyum commented on the issue:
https://github.com/apache/spark/pull/16122
This patch fails because hive-0.12 and hive-0.13 don't have the `getMetaConf` method;
see [HIVE-7532](https://issues.apache.org/jira/browse/HIVE-7532).
---
Github user wangyum commented on the issue:
https://github.com/apache/spark/pull/16122
@mallman MySQL, version 5.6.29.
---
GitHub user wangyum opened a pull request:
https://github.com/apache/spark/pull/16122
[SPARK-18681][SQL] Fix filtering to be compatible with partition keys of type
int
## What changes were proposed in this pull request?
Cloudera's Hive default configuration
Github user wangyum commented on a diff in the pull request:
https://github.com/apache/spark/pull/16079#discussion_r90250928
--- Diff: sbin/spark-daemon.sh ---
@@ -176,11 +175,11 @@ run_command() {
case "$mode" in
(class)
- execute_comma
Github user wangyum commented on a diff in the pull request:
https://github.com/apache/spark/pull/16079#discussion_r90244270
--- Diff: sbin/spark-daemon.sh ---
@@ -124,9 +124,8 @@ if [ "$SPARK_NICENESS" = "" ]; then
fi
execute_command() {
GitHub user wangyum opened a pull request:
https://github.com/apache/spark/pull/16079
[SPARK-18645][Deploy] Fix spark-daemon.sh argument errors that lead to
Unrecognized option being thrown
## What changes were proposed in this pull request?
spark-daemon.sh will lose single quotes
Github user wangyum commented on the issue:
https://github.com/apache/spark/pull/15466
@liancheng, please help review this PR; I have applied this patch to our
Spark cluster.
![spark-hive-var](https://cloud.githubusercontent.com/assets/5399861/19689537/4ca5239e-9b00-11e6-912f
GitHub user wangyum opened a pull request:
https://github.com/apache/spark/pull/15466
[SPARK-13983][SQL] HiveThriftServer2 cannot get "--hiveconf" or
"--hivevar" variables since version 1.6 (both multi-session and single-session)
## What changes were proposed in th
Github user wangyum commented on the issue:
https://github.com/apache/spark/pull/15259
@hvanhovell
The following SQL query throws an IndexOutOfBoundsException:
```sql
SELECT
  count(int)
FROM
(
  SELECT t1.int, t2.int2
  FROM (SELECT * FROM
```
GitHub user wangyum opened a pull request:
https://github.com/apache/spark/pull/15259
[SPARK-17685][SQL] Make SortMergeJoinExec's currentVars null when
calling createJoinKey
## What changes were proposed in this pull request?
Fix the IndexOutOfBoundsException thrown by queries like
GitHub user wangyum opened a pull request:
https://github.com/apache/spark/pull/14377
[SPARK-16625][SQL] General data types to be mapped to Oracle
## What changes were proposed in this pull request?
Spark will convert **BooleanType** to **BIT(1)**, **LongType
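For context, Spark's public `JdbcDialect` API is the usual place such mappings live; a hedged sketch of an Oracle-friendly override (the exact target types here are illustrative, not necessarily the PR's):
```scala
import java.sql.Types
import org.apache.spark.sql.jdbc.{JdbcDialect, JdbcDialects, JdbcType}
import org.apache.spark.sql.types._

// Sketch: map Spark types that Oracle rejects (e.g. BIT) to NUMBER.
object OracleDialectSketch extends JdbcDialect {
  override def canHandle(url: String): Boolean = url.startsWith("jdbc:oracle")

  override def getJDBCType(dt: DataType): Option[JdbcType] = dt match {
    case BooleanType => Some(JdbcType("NUMBER(1)", Types.NUMERIC))
    case LongType    => Some(JdbcType("NUMBER(19)", Types.NUMERIC))
    case _           => None  // fall back to the default mapping
  }
}

JdbcDialects.registerDialect(OracleDialectSketch)
```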
GitHub user wangyum opened a pull request:
https://github.com/apache/spark/pull/13735
[SPARK-15328][MLLIB][ML] Word2Vec import for original binary format
## What changes were proposed in this pull request?
Add `loadGoogleModel()` function to import the original word2vec binary
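The original binary format is a plain-text header (`vocabSize vectorSize`) followed by each word and its little-endian float vector. A minimal sketch of reading it; `loadGoogleModel` is the name this PR proposes, but the body below is an illustrative reconstruction, not the PR's code:
```scala
import java.io.{BufferedInputStream, DataInputStream, FileInputStream}
import java.nio.{ByteBuffer, ByteOrder}

def loadGoogleModel(path: String): Map[String, Array[Float]] = {
  val in = new DataInputStream(new BufferedInputStream(new FileInputStream(path)))
  try {
    // Read a whitespace-delimited token, skipping leading separators.
    def readToken(): String = {
      var b = in.read()
      while (b == ' ' || b == '\n') b = in.read()
      val sb = new StringBuilder
      while (b != -1 && b != ' ' && b != '\n') { sb.append(b.toChar); b = in.read() }
      sb.toString
    }
    val vocabSize  = readToken().toInt
    val vectorSize = readToken().toInt
    (0 until vocabSize).map { _ =>
      val word  = readToken()
      val bytes = new Array[Byte](4 * vectorSize)
      in.readFully(bytes)                     // raw little-endian floats
      val buf = ByteBuffer.wrap(bytes).order(ByteOrder.LITTLE_ENDIAN)
      word -> Array.fill(vectorSize)(buf.getFloat())
    }.toMap
  } finally in.close()
}
```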
Github user wangyum closed the pull request at:
https://github.com/apache/spark/pull/13119
---
Github user wangyum closed the pull request at:
https://github.com/apache/incubator-eagle/pull/230
---
Github user wangyum closed the pull request at:
https://github.com/apache/incubator-eagle/pull/228
---
GitHub user wangyum opened a pull request:
https://github.com/apache/incubator-eagle/pull/228
EAGLE-330 Fix Hive ql.Parser can't parse a Hive query SQL with keywords.
[EAGLE-330](https://issues.apache.org/jira/browse/EAGLE-330): Fix Hive
ql.Parser can't parse a Hive query SQL
Github user wangyum closed the pull request at:
https://github.com/apache/incubator-eagle/pull/227
---
GitHub user wangyum opened a pull request:
https://github.com/apache/incubator-eagle/pull/227
EAGLE-330 Fix Hive ql.Parser can't parse a Hive query SQL with keywords.
[EAGLE-330](https://issues.apache.org/jira/browse/EAGLE-330): Fix Hive
ql.Parser can't parse a Hive query SQL
GitHub user wangyum opened a pull request:
https://github.com/apache/spark/pull/13119
[SPARK-15328][MLLIB][ML] Word2Vec import for original binary format
## What changes were proposed in this pull request?
Add `loadGoogleModel()` function to import the original word2vec binary