Github user liufengdb commented on the issue:
https://github.com/apache/spark/pull/20864
I thought the directory is also created by this line:
https://github.com/apache/spark/blob/master/sql/hive-thriftserver/src/main/java/org/apache/hive/service/cli/session/HiveSessionImpl.java
Github user liufengdb commented on the issue:
https://github.com/apache/spark/pull/18666
Maybe I missed something, but it seems Spark has its own class loader right
now, which can load the class from the given URL:
https://github.com/apache/spark/blob/master/sql/hive/src/main/scala
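The class-loading mechanism the comment above alludes to can be illustrated with plain JDK APIs. This is a hedged sketch only, not Spark's actual loader code; the class and jar names are hypothetical.

```java
import java.net.MalformedURLException;
import java.net.URL;
import java.net.URLClassLoader;

// Sketch: build a class loader over user-supplied jar URLs so classes can
// be resolved by name from those jars (illustrative, not Spark's code).
public class UrlLoaderSketch {
    public static URLClassLoader loaderFor(String... jarUrls) {
        URL[] urls = new URL[jarUrls.length];
        try {
            for (int i = 0; i < jarUrls.length; i++) {
                urls[i] = new URL(jarUrls[i]);
            }
        } catch (MalformedURLException e) {
            throw new IllegalArgumentException("bad jar URL", e);
        }
        // Delegate to the context class loader so already-loaded classes
        // still resolve; classes in the jars are looked up on demand.
        return new URLClassLoader(urls, Thread.currentThread().getContextClassLoader());
    }
}
```

With such a loader in hand, `loader.loadClass("com.example.MyUdf")` would resolve a class shipped in the jar.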
Github user liufengdb commented on the issue:
https://github.com/apache/spark/pull/18666
I asked the following question in
https://github.com/apache/spark/pull/20864: is it necessary to create these
temp directories when the Hive thrift server starts? It sounds like some
legacy from Hive
Github user liufengdb commented on the issue:
https://github.com/apache/spark/pull/20864
@samartinucci @zuotingbing a high-level question: is it necessary to create
these temp directories when the Hive thrift server starts? It sounds like
some legacy from Hive, and we can skip creating
Github user liufengdb commented on the issue:
https://github.com/apache/spark/pull/20702
lgtm!
---
-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h
Github user liufengdb commented on a diff in the pull request:
https://github.com/apache/spark/pull/20681#discussion_r171770982
--- Diff: R/pkg/tests/fulltests/test_sparkSQL.R ---
@@ -67,6 +67,8 @@ sparkSession <- if (windows_with_hadoop()) {
sparkR.session(mas
Github user liufengdb commented on the issue:
https://github.com/apache/spark/pull/20681
retest this please
Github user liufengdb commented on the issue:
https://github.com/apache/spark/pull/20681
Overall, I think this suite needs refactoring: split it into an in-memory
catalog suite and a Hive catalog suite. The catalog conf should not be
manipulated after the Spark context is created. The other way
Github user liufengdb commented on the issue:
https://github.com/apache/spark/pull/20681
We can remove the test, but it is not good practice. You don't know
exactly why the test was added, or which hidden assumption it was meant to guarantee,
Github user liufengdb commented on the issue:
https://github.com/apache/spark/pull/20681
My original plan to fix the test does not work, because of this test:
https://github.com/apache/spark/blob/master/R/pkg/tests/fulltests/test_sparkSQL.R#L3343
The new plan is to run some
Github user liufengdb commented on the issue:
https://github.com/apache/spark/pull/20681
retest this please
Github user liufengdb commented on the issue:
https://github.com/apache/spark/pull/20681
@felixcheung Can you take a look at the changes in the R tests?
GitHub user liufengdb opened a pull request:
https://github.com/apache/spark/pull/20681
[SPARK-23518][SQL] Completely remove metastore access if the query is not
using tables
## What changes were proposed in this pull request?
https://github.com/apache/spark/pull/18944
Github user liufengdb commented on the issue:
https://github.com/apache/spark/pull/20557
There may be some Spark JDBC/ODBC drivers that need to parse the returned
results to get all the columns. We should avoid making changes to the
"schema" returned from the server side. You c
GitHub user liufengdb opened a pull request:
https://github.com/apache/spark/pull/20565
[SPARK-23379][SQL] remove redundant metastore access
## What changes were proposed in this pull request?
If the target database name is the same as the current database, we should
be
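The description breaks off, but the PR title ("remove redundant metastore access") suggests a guard that skips the metastore call when the database is unchanged. A minimal sketch of that idea, with illustrative names rather than Spark's actual `HiveClientImpl` code:

```java
// Sketch: track the current database and skip the metastore round trip
// when the requested database is already current (names illustrative).
public class CurrentDbTracker {
    private String currentDb = "default";
    private int metastoreCalls = 0;

    public void setCurrentDatabase(String db) {
        if (db.equalsIgnoreCase(currentDb)) {
            return; // same database: no metastore access needed
        }
        metastoreCalls++; // stands in for the real metastore call
        currentDb = db;
    }

    public int metastoreCalls() {
        return metastoreCalls;
    }
}
```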
Github user liufengdb commented on a diff in the pull request:
https://github.com/apache/spark/pull/20564#discussion_r167381054
--- Diff:
sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveExternalCatalog.scala ---
@@ -1107,11 +1107,6 @@ private[spark] class HiveExternalCatalog
GitHub user liufengdb opened a pull request:
https://github.com/apache/spark/pull/20564
[SPARK-23378][SQL] move setCurrentDatabase from HiveExternalCatalog to
HiveClientImpl
## What changes were proposed in this pull request?
This enforces the rule that no calls from
Github user liufengdb commented on a diff in the pull request:
https://github.com/apache/spark/pull/20407#discussion_r167352400
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala ---
@@ -156,6 +156,15 @@ object SQLConf {
.booleanConf
Github user liufengdb commented on a diff in the pull request:
https://github.com/apache/spark/pull/20407#discussion_r167351883
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/SparkStrategies.scala ---
@@ -262,6 +262,10 @@ abstract class SparkStrategies extends
Github user liufengdb commented on the issue:
https://github.com/apache/spark/pull/20441
@gatorsmile sorry for the late reply. I think the root cause is in the Hive
metastore. I created a PR to bypass it:
https://github.com/apache/spark/pull/20562
GitHub user liufengdb opened a pull request:
https://github.com/apache/spark/pull/20562
[SPARK-23275][SQL] fix the thread leaking in hive/tests
## What changes were proposed in this pull request?
The two lines actually can trigger the hive metastore bug:
https
Github user liufengdb commented on a diff in the pull request:
https://github.com/apache/spark/pull/17886#discussion_r165441473
--- Diff:
sql/hive-thriftserver/src/main/java/org/apache/hive/service/cli/session/HiveSessionImpl.java
---
@@ -221,6 +227,70 @@ private void
Github user liufengdb commented on the issue:
https://github.com/apache/spark/pull/17886
@gatorsmile this is a great patch. The test can be improved, but I think it
is safe to merge as is.
Github user liufengdb commented on the issue:
https://github.com/apache/spark/pull/19219
The major issue this PR tries to cover has been fixed by
https://github.com/apache/spark/pull/20029, so I think we are good if there are
no calls to `HiveClientImpl.newSession`. We can close this
Github user liufengdb commented on the issue:
https://github.com/apache/spark/pull/20385
Actually, one more thing: do you need to consider the UDT as an attribute
of a struct type?
https://github.com/apache/spark/pull/20385/files#diff-842e3447fc453de26c706db1cac8f2c4L467
Github user liufengdb commented on the issue:
https://github.com/apache/spark/pull/20385
LGTM!
GitHub user liufengdb opened a pull request:
https://github.com/apache/spark/pull/20425
[WIP] remove the redundant code in HiveExternalCatalog and HiveClientImpl
## What changes were proposed in this pull request?
(Please fill in changes proposed in this fix)
## How
Github user liufengdb commented on the issue:
https://github.com/apache/spark/pull/20420
LGTM! Thanks for doing this!
Github user liufengdb commented on a diff in the pull request:
https://github.com/apache/spark/pull/20385#discussion_r163648777
--- Diff:
sql/hive-thriftserver/src/main/scala/org/apache/spark/sql/hive/thriftserver/SparkExecuteStatementOperation.scala
---
@@ -102,6 +102,8
Github user liufengdb commented on the issue:
https://github.com/apache/spark/pull/20025
lgtm
Github user liufengdb commented on the issue:
https://github.com/apache/spark/pull/18983
LGTM! It is only created once though.
Frankly, we should completely remove the implementation of the `newSession()`
method
Github user liufengdb commented on a diff in the pull request:
https://github.com/apache/spark/pull/20025#discussion_r162698093
--- Diff:
sql/hive-thriftserver/src/main/java/org/apache/hive/service/cli/session/SessionManager.java
---
@@ -80,7 +76,6 @@ public synchronized void
Github user liufengdb commented on the issue:
https://github.com/apache/spark/pull/20025
@gatorsmile @felixcheung I left one comment, otherwise lgtm.
Github user liufengdb commented on a diff in the pull request:
https://github.com/apache/spark/pull/20025#discussion_r162426604
--- Diff:
sql/hive-thriftserver/src/main/java/org/apache/hive/service/cli/session/SessionManager.java
---
@@ -79,35 +75,19 @@ public synchronized void
Github user liufengdb commented on the issue:
https://github.com/apache/spark/pull/20174
@mgaido91 Your proposal and the current approach are both one-line changes.
Since the issue is actually related to the hash aggregate implementation, I
think it is reasonable to include it in the
Github user liufengdb commented on a diff in the pull request:
https://github.com/apache/spark/pull/20174#discussion_r160757582
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala
---
@@ -1221,7 +1221,12 @@ object
Github user liufengdb commented on the issue:
https://github.com/apache/spark/pull/20202
thanks!
Github user liufengdb commented on a diff in the pull request:
https://github.com/apache/spark/pull/20174#discussion_r160305419
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/aggregate/HashAggregateExec.scala
---
@@ -230,6 +236,7 @@ case class HashAggregateExec
Github user liufengdb commented on a diff in the pull request:
https://github.com/apache/spark/pull/20174#discussion_r160217718
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/aggregate/HashAggregateExec.scala
---
@@ -230,6 +236,7 @@ case class HashAggregateExec
Github user liufengdb commented on a diff in the pull request:
https://github.com/apache/spark/pull/20174#discussion_r160216751
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/aggregate/HashAggregateExec.scala
---
@@ -102,10 +102,12 @@ case class
Github user liufengdb commented on the issue:
https://github.com/apache/spark/pull/20174
test this please
Github user liufengdb commented on a diff in the pull request:
https://github.com/apache/spark/pull/20174#discussion_r160042482
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/aggregate/HashAggregateExec.scala
---
@@ -245,11 +252,15 @@ case class
GitHub user liufengdb opened a pull request:
https://github.com/apache/spark/pull/20174
[SPARK-22951][SQL] aggregate should not produce empty rows if data frame
is empty
## What changes were proposed in this pull request?
WIP
## How was this patch tested
Github user liufengdb commented on the issue:
https://github.com/apache/spark/pull/20029
lgtm!
Github user liufengdb commented on the issue:
https://github.com/apache/spark/pull/20029
By [this
line](https://github.com/apache/spark/blob/master/sql/hive-thriftserver/src/main/scala/org/apache/spark/sql/hive/thriftserver/SparkSQLSessionManager.scala#L78),
yes
Github user liufengdb commented on the issue:
https://github.com/apache/spark/pull/20025
My understanding is that the reflection was used because we might use a
different version of Hive, so we didn't control what was done inside
`super.init`. However, after we inline
Github user liufengdb commented on the issue:
https://github.com/apache/spark/pull/20025
@zuotingbing I think all the code in SparkSQLSessionManager.scala should
go away, because it is just some reflection hacks. It is possible to call
`super.init(hiveConf)` instead to get the session
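The reflection-removal idea in the comment above can be sketched in miniature. The class names here are illustrative stand-ins, not the real `SessionManager` hierarchy; the point is only that once the parent class is part of our own source tree, a plain `super.init` call replaces `Method.setAccessible(true)` gymnastics.

```java
// Illustrative parent service; stands in for the inlined Hive base class.
class CompositeServiceSketch {
    private boolean initialized = false;

    public void init(String conf) {
        initialized = true; // real code would set up pools, handlers, etc.
    }

    public boolean isInitialized() {
        return initialized;
    }
}

// Illustrative subclass: a direct super call, no reflection needed.
public class SessionManagerSketch extends CompositeServiceSketch {
    @Override
    public void init(String conf) {
        super.init(conf); // replaces the reflective invocation of the parent
    }
}
```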
GitHub user liufengdb opened a pull request:
https://github.com/apache/spark/pull/20109
[SPARK-22891][SQL] Make hive client creation thread safe
## What changes were proposed in this pull request?
This is to work around the Hive issue:
https://issues.apache.org/jira/browse
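One common shape for "make creation thread safe" is to serialize construction behind a single lock so concurrent sessions cannot hit a non-thread-safe construction path twice. A hedged generic sketch (not the actual Spark patch; names are illustrative):

```java
import java.util.function.Supplier;

// Sketch: construct an expensive, non-thread-safe client at most once,
// with all construction serialized behind one monitor.
public class SynchronizedClientFactory<T> {
    private final Object lock = new Object();
    private final Supplier<T> ctor;
    private T client;

    public SynchronizedClientFactory(Supplier<T> ctor) {
        this.ctor = ctor;
    }

    public T getOrCreate() {
        synchronized (lock) { // only one thread constructs at a time
            if (client == null) {
                client = ctor.get(); // stands in for building the Hive client
            }
            return client;
        }
    }
}
```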
Github user liufengdb commented on the issue:
https://github.com/apache/spark/pull/20029
@zuotingbing I took a close look at the related code and thought the issue
you raised is valid:
1. The hiveClient created for the
[resourceLoader](https://github.com/apache/spark/blob
GitHub user liufengdb opened a pull request:
https://github.com/apache/spark/pull/20099
[SPARK-22916][SQL] shouldn't bias towards build right if user does not
specify
## What changes were proposed in this pull request?
When there are no broadcast hints, the current
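The PR title suggests the fix is to stop defaulting to build-right and instead pick the build side from relation sizes. A minimal sketch of size-based selection, with illustrative names (the real planner logic lives in Spark's join strategies):

```java
// Sketch: choose the hash-join build side by estimated size rather than
// unconditionally building right (names and policy illustrative).
public class BuildSideChooser {
    public enum BuildSide { LEFT, RIGHT }

    // Prefer building the hash table on the smaller relation.
    public static BuildSide choose(long leftSizeBytes, long rightSizeBytes) {
        return rightSizeBytes <= leftSizeBytes ? BuildSide.RIGHT : BuildSide.LEFT;
    }
}
```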
Github user liufengdb commented on the issue:
https://github.com/apache/spark/pull/18812
I actually did not get the motivation for this PR. HiveThriftServer2 can run
independently or be started with a SQL context:
https://github.com/apache/spark/pull/18812/files#diff
Github user liufengdb commented on a diff in the pull request:
https://github.com/apache/spark/pull/18812#discussion_r157361046
--- Diff:
sql/hive-thriftserver/src/main/java/org/apache/hive/service/server/HiveServer2.java
---
@@ -39,6 +32,8 @@
import
Github user liufengdb commented on a diff in the pull request:
https://github.com/apache/spark/pull/18812#discussion_r157361025
--- Diff:
sql/hive-thriftserver/src/main/java/org/apache/hive/service/cli/thrift/ThriftBinaryCLIService.java
---
@@ -37,21 +30,29 @@
import
Github user liufengdb commented on the issue:
https://github.com/apache/spark/pull/19989
I think this method can take care of resource cleanup automatically:
https://github.com/apache/spark/blob/master/sql/hive-thriftserver/src/main/java/org/apache/hive/service/cli/session
Github user liufengdb commented on the issue:
https://github.com/apache/spark/pull/19721
lgtm!
GitHub user liufengdb opened a pull request:
https://github.com/apache/spark/pull/19460
[SPARK-2][core] Fix the ARRAY_MAX in BufferHolder and add a test
## What changes were proposed in this pull request?
We should not break the assumption that the length of the
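The description breaks off, but the PR is about keeping buffer growth within a safe maximum array size. A hedged sketch of capped growth; the exact `ARRAY_MAX` value and overflow policy in Spark's `BufferHolder` may differ:

```java
// Sketch: grow a buffer (typically by doubling) without ever requesting an
// array larger than what the JVM can allocate (constants illustrative).
public class GrowthSketch {
    // JVMs reserve a few words of header; stay safely under Integer.MAX_VALUE.
    static final int ARRAY_MAX = Integer.MAX_VALUE - 8;

    public static int grownSize(int current, int needed) {
        long required = (long) current + needed;
        if (required > ARRAY_MAX) {
            throw new IllegalStateException("cannot grow buffer past ARRAY_MAX");
        }
        // Double when possible, but never exceed the cap.
        long target = Math.max((long) current * 2, required);
        return (int) Math.min(target, ARRAY_MAX);
    }
}
```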
Github user liufengdb commented on a diff in the pull request:
https://github.com/apache/spark/pull/19266#discussion_r143346155
--- Diff:
sql/catalyst/src/main/java/org/apache/spark/sql/catalyst/expressions/codegen/BufferHolder.java
---
@@ -35,6 +35,11 @@
* if the fields of
Github user liufengdb commented on a diff in the pull request:
https://github.com/apache/spark/pull/19394#discussion_r142290483
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/SparkPlan.scala ---
@@ -280,13 +280,20 @@ abstract class SparkPlan extends QueryPlan
Github user liufengdb commented on a diff in the pull request:
https://github.com/apache/spark/pull/19230#discussion_r139329522
--- Diff:
sql/core/src/main/java/org/apache/spark/sql/execution/vectorized/ColumnVector.java
---
@@ -16,6 +16,7 @@
*/
package
Github user liufengdb commented on a diff in the pull request:
https://github.com/apache/spark/pull/19230#discussion_r139329523
--- Diff:
sql/core/src/test/scala/org/apache/spark/sql/execution/vectorized/ColumnVectorSuite.scala
---
@@ -0,0 +1,201 @@
+/*
+ * Licensed to
Github user liufengdb commented on the issue:
https://github.com/apache/spark/pull/19230
@viirya @cloud-fan unit test updated.
Github user liufengdb commented on a diff in the pull request:
https://github.com/apache/spark/pull/19230#discussion_r139072681
--- Diff:
sql/core/src/main/java/org/apache/spark/sql/execution/vectorized/ColumnVector.java
---
@@ -99,73 +100,18 @@ public ArrayData copy
GitHub user liufengdb opened a pull request:
https://github.com/apache/spark/pull/19230
[SPARK-22003][SQL] support array column in vectorized reader with UDF
## What changes were proposed in this pull request?
The UDF needs to deserialize the `UnsafeRow`. When the column
GitHub user liufengdb opened a pull request:
https://github.com/apache/spark/pull/18400
[SPARK-21188][CORE] releaseAllLocksForTask should synchronize the whole
method
## What changes were proposed in this pull request?
Since the objects `readLocksByTask`, `writeLocksByTask
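The fix described (synchronize the whole method) can be sketched with a toy lock registry: because both per-task maps are only touched while holding the object's monitor, a release can never interleave with a concurrent acquire. This is an illustration of the synchronization pattern, not Spark's `BlockInfoManager` code.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Sketch: both per-task lock maps are guarded by one monitor, so
// releaseAllLocksForTask is atomic with respect to acquisitions.
public class LockRegistry {
    private final Map<Long, Set<Long>> readLocksByTask = new HashMap<>();
    private final Map<Long, Set<Long>> writeLocksByTask = new HashMap<>();

    public synchronized void acquireRead(long taskId, long blockId) {
        readLocksByTask.computeIfAbsent(taskId, t -> new HashSet<>()).add(blockId);
    }

    public synchronized void acquireWrite(long taskId, long blockId) {
        writeLocksByTask.computeIfAbsent(taskId, t -> new HashSet<>()).add(blockId);
    }

    // Returns how many locks were released for the task.
    public synchronized int releaseAllLocksForTask(long taskId) {
        Set<Long> reads = readLocksByTask.remove(taskId);
        Set<Long> writes = writeLocksByTask.remove(taskId);
        return (reads == null ? 0 : reads.size())
             + (writes == null ? 0 : writes.size());
    }
}
```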
GitHub user liufengdb opened a pull request:
https://github.com/apache/spark/pull/18208
[SPARK-20991] BROADCAST_TIMEOUT conf should be a TimeoutConf
## What changes were proposed in this pull request?
(Please fill in changes proposed in this fix)
The construction
Github user liufengdb commented on the issue:
https://github.com/apache/spark/pull/17397
lgtm