[GitHub] spark pull request #21783: [SPARK-24799]A solution of dealing with data skew...

2018-07-16 Thread marymwu
GitHub user marymwu opened a pull request: https://github.com/apache/spark/pull/21783 [SPARK-24799]A solution of dealing with data skew in left,right,inner join ## What changes were proposed in this pull request? For the left,right,inner join statement execution

[GitHub] spark pull request #21759: sfas

2018-07-13 Thread marymwu
Github user marymwu closed the pull request at: https://github.com/apache/spark/pull/21759

[GitHub] spark pull request #21759: sfas

2018-07-13 Thread marymwu
GitHub user marymwu opened a pull request: https://github.com/apache/spark/pull/21759 sfas ## What changes were proposed in this pull request? (Please fill in changes proposed in this fix) ## How was this patch tested? (Please explain how this patch

[jira] [Created] (SPARK-24799) A solution of dealing with data skew in left,right,inner join

2018-07-13 Thread marymwu (JIRA)
marymwu created SPARK-24799: --- Summary: A solution of dealing with data skew in left,right,inner join Key: SPARK-24799 URL: https://issues.apache.org/jira/browse/SPARK-24799 Project: Spark Issue

[jira] [Updated] (SPARK-17181) [Spark2.0 web ui]The status of the certain jobs is still displayed as running even if all the stages of this job have already finished

2016-08-22 Thread marymwu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17181?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] marymwu updated SPARK-17181: Attachment: job1000-2.png job1000-1.png > [Spark2.0 web ui]The status of the certain j

[jira] [Created] (SPARK-17181) [Spark2.0 web ui]The status of the certain jobs is still displayed as running even if all the stages of this job have already finished

2016-08-22 Thread marymwu (JIRA)
marymwu created SPARK-17181: --- Summary: [Spark2.0 web ui]The status of the certain jobs is still displayed as running even if all the stages of this job have already finished Key: SPARK-17181 URL: https

[jira] [Commented] (SPARK-5770) Use addJar() to upload a new jar file to executor, it can't be added to classloader

2016-08-22 Thread marymwu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5770?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15430243#comment-15430243 ] marymwu commented on SPARK-5770: Hey, we have ran into the same issue too. We try to fix this but failed

[jira] [Created] (SPARK-16970) [spark2.0] spark2.0 doesn't catch the java exception thrown by reflect function in sql statement which causes the job abort

2016-08-09 Thread marymwu (JIRA)
marymwu created SPARK-16970: --- Summary: [spark2.0] spark2.0 doesn't catch the java exception thrown by reflect function in sql statement which causes the job abort Key: SPARK-16970 URL: https://issues.apache.org/jira

[jira] [Created] (SPARK-16833) [Spark2.0]when creating temporary function,command "add jar" doesn't work unless restart spark

2016-08-01 Thread marymwu (JIRA)
marymwu created SPARK-16833: --- Summary: [Spark2.0]when creating temporary function,command "add jar" doesn't work unless restart spark Key: SPARK-16833 URL: https://issues.apache.org/jira/browse/S

[jira] [Commented] (SPARK-16601) Spark2.0 fail in creating table using sql statement "create table `db.tableName` xxx" while spark1.6 supports

2016-08-01 Thread marymwu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16601?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15401599#comment-15401599 ] marymwu commented on SPARK-16601: - Got it, thanks.BTW, some grammar changes in spark2.0 compared

[jira] [Commented] (SPARK-16603) Spark2.0 fail in executing the sql statement which field name begins with number,like "d.30_day_loss_user" while spark1.6 supports

2016-08-01 Thread marymwu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16603?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15401598#comment-15401598 ] marymwu commented on SPARK-16603: - ok,got it,thanks > Spark2.0 fail in executing the sql statement wh

[jira] [Commented] (SPARK-16605) Spark2.0 cannot "select" data from a table stored as an orc file which has been created by hive while hive or spark1.6 supports

2016-08-01 Thread marymwu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16605?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15401597#comment-15401597 ] marymwu commented on SPARK-16605: - I see,thanks! > Spark2.0 cannot "select" data from a

[jira] [Commented] (SPARK-16601) Spark2.0 fail in creating table using sql statement "create table `db.tableName` xxx" while spark1.6 supports

2016-07-19 Thread marymwu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16601?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15383625#comment-15383625 ] marymwu commented on SPARK-16601: - I'd like to create a table in a named DB > Spark2.0 fail in creat

[jira] [Updated] (SPARK-16605) Spark2.0 cannot "select" data from a table stored as an orc file which has been created by hive while hive or spark1.6 supports

2016-07-18 Thread marymwu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16605?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] marymwu updated SPARK-16605: Attachment: screenshot-1.png > Spark2.0 cannot "select" data from a table stored as an orc f

[jira] [Created] (SPARK-16605) Spark2.0 cannot "select" data from a table stored as an orc file which has been created by hive while hive or spark1.6 supports

2016-07-18 Thread marymwu (JIRA)
marymwu created SPARK-16605: --- Summary: Spark2.0 cannot "select" data from a table stored as an orc file which has been created by hive while hive or spark1.6 supports Key: SPARK-16605 URL: https://issues.

[jira] [Created] (SPARK-16604) Spark2.0 fail in executing the sql statement which includes partition field in the "select" statement while spark1.6 supports

2016-07-18 Thread marymwu (JIRA)
marymwu created SPARK-16604: --- Summary: Spark2.0 fail in executing the sql statement which includes partition field in the "select" statement while spark1.6 supports Key: SPARK-16604 URL: https://issues.apach

[jira] [Created] (SPARK-16603) Spark2.0 fail in executing the sql statement which field name begins with number,like "d.30_day_loss_user" while spark1.6 supports

2016-07-18 Thread marymwu (JIRA)
marymwu created SPARK-16603: --- Summary: Spark2.0 fail in executing the sql statement which field name begins with number,like "d.30_day_loss_user" while spark1.6 supports Key: SPARK-16603 URL: https://issues.

[jira] [Created] (SPARK-16602) Spark2.0-error occurs when execute the sql statement which includes "nvl" function while spark1.6 supports

2016-07-18 Thread marymwu (JIRA)
marymwu created SPARK-16602: --- Summary: Spark2.0-error occurs when execute the sql statement which includes "nvl" function while spark1.6 supports Key: SPARK-16602 URL: https://issues.apache.org/jira/browse/S

[jira] [Updated] (SPARK-16601) Spark2.0 fail in creating table using sql statement "create table `db.tableName` xxx" while spark1.6 supports

2016-07-18 Thread marymwu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16601?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] marymwu updated SPARK-16601: Attachment: error log.png > Spark2.0 fail in creating table using sql statement "crea

[jira] [Created] (SPARK-16601) Spark2.0 fail in creating table using sql statement "create table `db.tableName` xxx" while spark1.6 supports

2016-07-18 Thread marymwu (JIRA)
marymwu created SPARK-16601: --- Summary: Spark2.0 fail in creating table using sql statement "create table `db.tableName` xxx" while spark1.6 supports Key: SPARK-16601 URL: https://issues.apache.org/jira/br

[jira] [Created] (HIVE-14160) Reduce-task costs a long time to finish on the condition that the certain sql "select a,distinct(b) group by a" has been executed on the data which has skew distribution

2016-07-05 Thread marymwu (JIRA)
marymwu created HIVE-14160: -- Summary: Reduce-task costs a long time to finish on the condition that the certain sql "select a,distinct(b) group by a" has been executed on the data which has skew distribution Key:

[jira] [Created] (SPARK-16376) [Spark web UI]:HTTP ERROR 500 when using rest api "/applications/[app-id]/jobs" if array "stageIds" is empty

2016-07-05 Thread marymwu (JIRA)
marymwu created SPARK-16376: --- Summary: [Spark web UI]:HTTP ERROR 500 when using rest api "/applications/[app-id]/jobs" if array "stageIds" is empty Key: SPARK-16376 URL: https://issues.apache.o

[jira] [Updated] (SPARK-16375) [Spark web UI]:The wrong value(numCompletedTasks) has been assigned to the variable numSkippedTasks

2016-07-05 Thread marymwu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16375?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] marymwu updated SPARK-16375: Attachment: numSkippedTasksWrongValue.png > [Spark web UI]:The wrong value(numCompletedTasks) has b

[jira] [Created] (SPARK-16375) [Spark web UI]:The wrong value(numCompletedTasks) has been assigned to the variable numSkippedTasks

2016-07-05 Thread marymwu (JIRA)
marymwu created SPARK-16375: --- Summary: [Spark web UI]:The wrong value(numCompletedTasks) has been assigned to the variable numSkippedTasks Key: SPARK-16375 URL: https://issues.apache.org/jira/browse/SPARK-16375

[jira] [Updated] (SPARK-16089) Spark2.0 doesn't support the certain static partition SQL statment as "insert overwrite table targetTB PARTITION (partition field=xx) select field1,field2,...,partition

2016-07-03 Thread marymwu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16089?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] marymwu updated SPARK-16089: Component/s: SQL > Spark2.0 doesn't support the certain static partition SQL statment as "

[jira] [Updated] (SPARK-16092) Spark2.0 take no effect after set hive.exec.dynamic.partition.mode=nonstrict as a global variable in Spark2.0 configuration file while Spark1.6 does

2016-07-03 Thread marymwu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16092?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] marymwu updated SPARK-16092: Component/s: SQL > Spark2.0 take no effect after set hive.exec.dynamic.partition.mode=nonstr

[jira] [Updated] (SPARK-16093) Spark2.0 take no effect after set spark.sql.autoBroadcastJoinThreshold = 1

2016-07-03 Thread marymwu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16093?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] marymwu updated SPARK-16093: Component/s: SQL > Spark2.0 take no effect after set spark.sql.autoBroadcastJoinThreshold

[jira] [Created] (SPARK-16255) Spark2.0 doesn't support the following SQL statement:"insert into directory "/u_qa_user/hive_testdata/test1/t1" select * from d_test_tpc_2g_txt.auction" while Hive suppo

2016-06-28 Thread marymwu (JIRA)
marymwu created SPARK-16255: --- Summary: Spark2.0 doesn't support the following SQL statement:"insert into directory "/u_qa_user/hive_testdata/test1/t1" select * from d_test_tpc_2g_txt.auction" while Hive supports

[jira] [Updated] (SPARK-16254) Spark2.0 monitor web ui->Tasks (for all stages)->the number of Succeed is more than Total

2016-06-28 Thread marymwu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16254?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] marymwu updated SPARK-16254: Affects Version/s: 2.0.0 > Spark2.0 monitor web ui->Tasks (for all stages)->the number o

[jira] [Updated] (SPARK-16254) Spark2.0 monitor web ui->Tasks (for all stages)->the number of Succeed is more than Total

2016-06-28 Thread marymwu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16254?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] marymwu updated SPARK-16254: Attachment: Reference.png > Spark2.0 monitor web ui->Tasks (for all stages)->the number o

[jira] [Created] (SPARK-16254) Spark2.0 monitor web ui->Tasks (for all stages)->the number of Succeed is more than Total

2016-06-28 Thread marymwu (JIRA)
marymwu created SPARK-16254: --- Summary: Spark2.0 monitor web ui->Tasks (for all stages)->the number of Succeed is more than Total Key: SPARK-16254 URL: https://issues.apache.org/jira/browse/SPARK

[jira] [Commented] (SPARK-16092) Spark2.0 take no effect after set hive.exec.dynamic.partition.mode=nonstrict as a global variable in Spark2.0 configuration file while Spark1.6 does

2016-06-27 Thread marymwu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16092?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15350764#comment-15350764 ] marymwu commented on SPARK-16092: - Any body help? > Spark2.0 take no effect after

[jira] [Commented] (SPARK-16089) Spark2.0 doesn't support the certain static partition SQL statment as "insert overwrite table targetTB PARTITION (partition field=xx) select field1,field2,...,partitio

2016-06-27 Thread marymwu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16089?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15350765#comment-15350765 ] marymwu commented on SPARK-16089: - Any body help? > Spark2.0 doesn't support the certain sta

[jira] [Commented] (SPARK-16093) Spark2.0 take no effect after set spark.sql.autoBroadcastJoinThreshold = 1

2016-06-27 Thread marymwu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16093?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15350763#comment-15350763 ] marymwu commented on SPARK-16093: - Any body help? > Spark2.0 take no effect after

[jira] [Updated] (SPARK-16093) Spark2.0 take no effect after set spark.sql.autoBroadcastJoinThreshold = 1

2016-06-21 Thread marymwu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16093?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] marymwu updated SPARK-16093: Attachment: Errorlog.txt > Spark2.0 take no effect after set spark.sql.autoBroadcastJoinThreshold

[jira] [Created] (SPARK-16093) Spark2.0 take no effect after set spark.sql.autoBroadcastJoinThreshold = 1

2016-06-21 Thread marymwu (JIRA)
marymwu created SPARK-16093: --- Summary: Spark2.0 take no effect after set spark.sql.autoBroadcastJoinThreshold = 1 Key: SPARK-16093 URL: https://issues.apache.org/jira/browse/SPARK-16093 Project: Spark

[jira] [Created] (SPARK-16092) Spark2.0 take no effect after set hive.exec.dynamic.partition.mode=nonstrict as a global variable in Spark2.0 configuration file while Spark1.6 does

2016-06-21 Thread marymwu (JIRA)
marymwu created SPARK-16092: --- Summary: Spark2.0 take no effect after set hive.exec.dynamic.partition.mode=nonstrict as a global variable in Spark2.0 configuration file while Spark1.6 does Key: SPARK-16092 URL: https

[jira] [Updated] (SPARK-16089) Spark2.0 doesn't support the certain static partition SQL statment as "insert overwrite table targetTB PARTITION (partition field=xx) select field1,field2,...,partition

2016-06-21 Thread marymwu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16089?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] marymwu updated SPARK-16089: Attachment: StaticPartitionSQLStatementError.png > Spark2.0 doesn't support the certain static partit

[jira] [Created] (SPARK-16089) Spark2.0 doesn't support the certain static partition SQL statment as "insert overwrite table targetTB PARTITION (partition field=xx) select field1,field2,...,partition

2016-06-21 Thread marymwu (JIRA)
marymwu created SPARK-16089: --- Summary: Spark2.0 doesn't support the certain static partition SQL statment as "insert overwrite table targetTB PARTITION (partition field=xx) select field1,field2,...,partition field from sourceTB where partition

[jira] [Commented] (SPARK-15802) SparkSQL connection fail using shell command "bin/beeline -u "jdbc:hive2://*.*.*.*:10000/default""

2016-06-14 Thread marymwu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15802?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15329051#comment-15329051 ] marymwu commented on SPARK-15802: - We still have a question. How to use "binary" protocol? It s

[jira] [Commented] (SPARK-15757) Error occurs when using Spark sql "select" statement on orc file after hive sql "insert overwrite tb1 select * from sourcTb" has been executed on this orc file

2016-06-13 Thread marymwu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15757?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15328860#comment-15328860 ] marymwu commented on SPARK-15757: - Any update? > Error occurs when using Spark sql "select"

[jira] [Commented] (SPARK-15802) SparkSQL connection fail using shell command "bin/beeline -u "jdbc:hive2://*.*.*.*:10000/default""

2016-06-07 Thread marymwu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15802?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15319908#comment-15319908 ] marymwu commented on SPARK-15802: - looking forward to your reply, thanks > SparkSQL connection f

[jira] [Commented] (SPARK-15802) SparkSQL connection fail using shell command "bin/beeline -u "jdbc:hive2://*.*.*.*:10000/default""

2016-06-07 Thread marymwu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15802?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15319906#comment-15319906 ] marymwu commented on SPARK-15802: - what's the right protocol? how to specify it ? > SparkSQL connect

[GitHub] spark pull request #13550: SPARK-15755

2016-06-07 Thread marymwu
GitHub user marymwu opened a pull request: https://github.com/apache/spark/pull/13550 SPARK-15755 JIRA Issue: https://issues.apache.org/jira/browse/SPARK-15755 java.lang.NullPointerException when run spark 2.0 setting spark.serializer

[jira] [Created] (SPARK-15802) SparkSQL connection fail using shell command "bin/beeline -u "jdbc:hive2://*.*.*.*:10000/default""

2016-06-07 Thread marymwu (JIRA)
marymwu created SPARK-15802: --- Summary: SparkSQL connection fail using shell command "bin/beeline -u "jdbc:hive2://*.*.*.*:10000/default"" Key: SPARK-15802 URL: https://issues.apache.org/jir

[jira] [Commented] (SPARK-15755) java.lang.NullPointerException when run spark 2.0 setting spark.serializer=org.apache.spark.serializer.KryoSerializer

2016-06-06 Thread marymwu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15755?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15317782#comment-15317782 ] marymwu commented on SPARK-15755: - Any comments? > java.lang.NullPointerException when run spark

[jira] [Commented] (SPARK-15757) Error occurs when using Spark sql "select" statement on orc file after hive sql "insert overwrite tb1 select * from sourcTb" has been executed on this orc file

2016-06-06 Thread marymwu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15757?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15317778#comment-15317778 ] marymwu commented on SPARK-15757: - The error occurs steps are as follows: hope it helps 1. use hive

[jira] [Commented] (SPARK-15757) Error occurs when using Spark sql "select" statement on orc file after hive sql "insert overwrite tb1 select * from sourcTb" has been executed on this orc file

2016-06-05 Thread marymwu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15757?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15316170#comment-15316170 ] marymwu commented on SPARK-15757: - Actually, Field "inv_date_sk" does exist! We have exec

[jira] [Issue Comment Deleted] (SPARK-15757) Error occurs when using Spark sql "select" statement on orc file after hive sql "insert overwrite tb1 select * from sourcTb" has been executed on this orc

2016-06-05 Thread marymwu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15757?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] marymwu updated SPARK-15757: Comment: was deleted (was: Actually, Field "inv_date_sk" does exist! We have executed "

[jira] [Updated] (SPARK-15757) Error occurs when using Spark sql "select" statement on orc file after hive sql "insert overwrite tb1 select * from sourcTb" has been executed on this orc file

2016-06-05 Thread marymwu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15757?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] marymwu updated SPARK-15757: Attachment: Result.png > Error occurs when using Spark sql "select" statement on orc fi

[jira] [Commented] (SPARK-15757) Error occurs when using Spark sql "select" statement on orc file after hive sql "insert overwrite tb1 select * from sourcTb" has been executed on this orc file

2016-06-05 Thread marymwu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15757?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15316168#comment-15316168 ] marymwu commented on SPARK-15757: - Actually, Field "inv_date_sk" does exist! We have exec

[jira] [Updated] (SPARK-15757) Error occurs when using Spark sql "select" statement on orc file after hive sql "insert overwrite tb1 select * from sourcTb" has been executed on this orc file

2016-06-03 Thread marymwu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15757?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] marymwu updated SPARK-15757: Summary: Error occurs when using Spark sql "select" statement on orc file after hive sql "i

[jira] [Created] (SPARK-15757) Error occurs when using Spark sql "select" statement on orc file after hive sql "insert overwrite tb1 select * from sourcTb" has been executed

2016-06-03 Thread marymwu (JIRA)
marymwu created SPARK-15757: --- Summary: Error occurs when using Spark sql "select" statement on orc file after hive sql "insert overwrite tb1 select * from sourcTb" has been executed Key: SPARK-15757

[jira] [Created] (SPARK-15756) SQL “stored as orcfile” cannot be supported while hive supports both keywords "orc" and "orcfile"

2016-06-03 Thread marymwu (JIRA)
marymwu created SPARK-15756: --- Summary: SQL “stored as orcfile” cannot be supported while hive supports both keywords "orc" and "orcfile" Key: SPARK-15756 URL: https://issues.apache.org/jir

[jira] [Created] (SPARK-15755) java.lang.NullPointerException when run spark 2.0 setting spark.serializer=org.apache.spark.serializer.KryoSerializer

2016-06-03 Thread marymwu (JIRA)
marymwu created SPARK-15755: --- Summary: java.lang.NullPointerException when run spark 2.0 setting spark.serializer=org.apache.spark.serializer.KryoSerializer Key: SPARK-15755 URL: https://issues.apache.org/jira/browse