[jira] [Created] (SPARK-23306) Race condition in TaskMemoryManager

2018-02-01 Thread Zhan Zhang (JIRA)
Zhan Zhang created SPARK-23306: -- Summary: Race condition in TaskMemoryManager Key: SPARK-23306 URL: https://issues.apache.org/jira/browse/SPARK-23306 Project: Spark Issue Type: Bug

[jira] [Commented] (SPARK-21505) A dynamic join operator to improve the join reliability

2017-11-02 Thread Zhan Zhang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21505?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16236365#comment-16236365 ] Zhan Zhang commented on SPARK-21505: Any comments on this feature? Do you think the design is OK? If

[jira] [Commented] (SPARK-21492) Memory leak in SortMergeJoin

2017-07-20 Thread Zhan Zhang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21492?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16095425#comment-16095425 ] Zhan Zhang commented on SPARK-21492: root cause: In the SortMergeJoin, inner/leftOuter/rightOuter,

[jira] [Created] (SPARK-21492) Memory leak in SortMergeJoin

2017-07-20 Thread Zhan Zhang (JIRA)
Zhan Zhang created SPARK-21492: -- Summary: Memory leak in SortMergeJoin Key: SPARK-21492 URL: https://issues.apache.org/jira/browse/SPARK-21492 Project: Spark Issue Type: Bug

[jira] [Issue Comment Deleted] (SPARK-20215) ReuseExchange is boken in SparkSQL

2017-04-14 Thread Zhan Zhang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20215?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhan Zhang updated SPARK-20215: --- Comment: was deleted (was: Seems to be fixed in SPARK-20229) > ReuseExchange is boken in SparkSQL >

[jira] [Commented] (SPARK-20215) ReuseExchange is boken in SparkSQL

2017-04-14 Thread Zhan Zhang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20215?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15968734#comment-15968734 ] Zhan Zhang commented on SPARK-20215: Seems to be fixed in SPARK-20229 > ReuseExchange is boken in

[jira] [Created] (SPARK-20215) ReuseExchange is boken in SparkSQL

2017-04-04 Thread Zhan Zhang (JIRA)
Zhan Zhang created SPARK-20215: -- Summary: ReuseExchange is boken in SparkSQL Key: SPARK-20215 URL: https://issues.apache.org/jira/browse/SPARK-20215 Project: Spark Issue Type: Bug

[jira] [Comment Edited] (SPARK-20006) Separate threshold for broadcast and shuffled hash join

2017-03-17 Thread Zhan Zhang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20006?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15931054#comment-15931054 ] Zhan Zhang edited comment on SPARK-20006 at 3/18/17 4:42 AM: - The default

[jira] [Commented] (SPARK-20006) Separate threshold for broadcast and shuffled hash join

2017-03-17 Thread Zhan Zhang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20006?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15931054#comment-15931054 ] Zhan Zhang commented on SPARK-20006: The default ShuffledHashJoin threshold can fall back to the

[jira] [Updated] (SPARK-20006) Separate threshold for broadcast and shuffled hash join

2017-03-17 Thread Zhan Zhang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20006?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhan Zhang updated SPARK-20006: --- Description: Currently both canBroadcast and canBuildLocalHashMap use the same configuration:

[jira] [Created] (SPARK-20006) Separate threshold for broadcast and shuffled hash join

2017-03-17 Thread Zhan Zhang (JIRA)
Zhan Zhang created SPARK-20006: -- Summary: Separate threshold for broadcast and shuffled hash join Key: SPARK-20006 URL: https://issues.apache.org/jira/browse/SPARK-20006 Project: Spark Issue

[jira] [Created] (SPARK-19908) Direct buffer memory OOM should not cause stage retries.

2017-03-10 Thread Zhan Zhang (JIRA)
Zhan Zhang created SPARK-19908: -- Summary: Direct buffer memory OOM should not cause stage retries. Key: SPARK-19908 URL: https://issues.apache.org/jira/browse/SPARK-19908 Project: Spark Issue

[jira] [Updated] (SPARK-19890) Make MetastoreRelation statistics estimation more accurately

2017-03-09 Thread Zhan Zhang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19890?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhan Zhang updated SPARK-19890: --- Description: Currently the MetastoreRelation statistics are retrieved during the analysis phase, and the

[jira] [Created] (SPARK-19890) Make MetastoreRelation statistics estimation more accurately

2017-03-09 Thread Zhan Zhang (JIRA)
Zhan Zhang created SPARK-19890: -- Summary: Make MetastoreRelation statistics estimation more accurately Key: SPARK-19890 URL: https://issues.apache.org/jira/browse/SPARK-19890 Project: Spark

[jira] [Commented] (SPARK-19839) Fix memory leak in BytesToBytesMap

2017-03-06 Thread Zhan Zhang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19839?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15897975#comment-15897975 ] Zhan Zhang commented on SPARK-19839: When BytesToBytesMap spills, its longArray should be released.

[jira] [Created] (SPARK-19839) Fix memory leak in BytesToBytesMap

2017-03-06 Thread Zhan Zhang (JIRA)
Zhan Zhang created SPARK-19839: -- Summary: Fix memory leak in BytesToBytesMap Key: SPARK-19839 URL: https://issues.apache.org/jira/browse/SPARK-19839 Project: Spark Issue Type: Bug

[jira] [Commented] (SPARK-19815) Not orderable should be applied to right key instead of left key

2017-03-03 Thread Zhan Zhang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19815?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15895522#comment-15895522 ] Zhan Zhang commented on SPARK-19815: I am thinking through the logic again. On the surface, the logic may be

[jira] [Updated] (SPARK-19815) Not orderable should be applied to right key instead of left key

2017-03-03 Thread Zhan Zhang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19815?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhan Zhang updated SPARK-19815: --- Summary: Not orderable should be applied to right key instead of left key (was: Not order able

[jira] [Created] (SPARK-19815) Not order able should be applied to right key instead of left key

2017-03-03 Thread Zhan Zhang (JIRA)
Zhan Zhang created SPARK-19815: -- Summary: Not order able should be applied to right key instead of left key Key: SPARK-19815 URL: https://issues.apache.org/jira/browse/SPARK-19815 Project: Spark

[jira] [Commented] (SPARK-19354) Killed tasks are getting marked as FAILED

2017-02-11 Thread Zhan Zhang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19354?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15862588#comment-15862588 ] Zhan Zhang commented on SPARK-19354: This fix is actually critical. In production, we found that this

[jira] [Commented] (SPARK-13450) SortMergeJoin will OOM when join rows have lot of same keys

2017-01-09 Thread Zhan Zhang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13450?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15812550#comment-15812550 ] Zhan Zhang commented on SPARK-13450: ExternalAppendOnlyMap estimates the size of the data saved. In

[jira] [Commented] (SPARK-18637) Stateful UDF should be considered as nondeterministic

2016-11-29 Thread Zhan Zhang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18637?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15706961#comment-15706961 ] Zhan Zhang commented on SPARK-18637: [~hvanhovell] It is an annotation. /** * UDFType annotations

[jira] [Comment Edited] (SPARK-18637) Stateful UDF should be considered as nondeterministic

2016-11-29 Thread Zhan Zhang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18637?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15706961#comment-15706961 ] Zhan Zhang edited comment on SPARK-18637 at 11/29/16 11:52 PM: ---

[jira] [Updated] (SPARK-18637) Stateful UDF should be considered as nondeterministic

2016-11-29 Thread Zhan Zhang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18637?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhan Zhang updated SPARK-18637: --- Component/s: SQL > Stateful UDF should be considered as nondeterministic >

[jira] [Commented] (SPARK-18637) Stateful UDF should be considered as nondeterministic

2016-11-29 Thread Zhan Zhang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18637?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15706905#comment-15706905 ] Zhan Zhang commented on SPARK-18637: Here are the comments from UDFType /** * If a UDF stores

[jira] [Created] (SPARK-18637) Stateful UDF should be considered as nondeterministic

2016-11-29 Thread Zhan Zhang (JIRA)
Zhan Zhang created SPARK-18637: -- Summary: Stateful UDF should be considered as nondeterministic Key: SPARK-18637 URL: https://issues.apache.org/jira/browse/SPARK-18637 Project: Spark Issue

[jira] [Commented] (SPARK-18550) Make the queue capacity of LiveListenerBus configurable.

2016-11-22 Thread Zhan Zhang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18550?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15688361#comment-15688361 ] Zhan Zhang commented on SPARK-18550: I was not aware it has been fixed already. Please help to close

[jira] [Created] (SPARK-18550) Make the queue capacity of LiveListenerBus configurable.

2016-11-22 Thread Zhan Zhang (JIRA)
Zhan Zhang created SPARK-18550: -- Summary: Make the queue capacity of LiveListenerBus configurable. Key: SPARK-18550 URL: https://issues.apache.org/jira/browse/SPARK-18550 Project: Spark Issue

[jira] [Commented] (SPARK-17637) Packed scheduling for Spark tasks across executors

2016-11-22 Thread Zhan Zhang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17637?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15687474#comment-15687474 ] Zhan Zhang commented on SPARK-17637: [~hvanhovell] Thanks. PR is updated with conflicts resolved. >

[jira] [Commented] (SPARK-17637) Packed scheduling for Spark tasks across executors

2016-09-23 Thread Zhan Zhang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17637?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15516851#comment-15516851 ] Zhan Zhang commented on SPARK-17637: [~jerryshao] The idea is straightforward. Instead of doing round

[jira] [Commented] (SPARK-17637) Packed scheduling for Spark tasks across executors

2016-09-22 Thread Zhan Zhang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17637?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15515247#comment-15515247 ] Zhan Zhang commented on SPARK-17637: cc [~rxin] A quick prototype shows that for a tested pipeline,

[jira] [Commented] (SPARK-17637) Packed scheduling for Spark tasks across executors

2016-09-22 Thread Zhan Zhang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17637?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15514105#comment-15514105 ] Zhan Zhang commented on SPARK-17637: The plan is to introduce a new configuration so that different

[jira] [Created] (SPARK-17637) Packed scheduling for Spark tasks across executors

2016-09-22 Thread Zhan Zhang (JIRA)
Zhan Zhang created SPARK-17637: -- Summary: Packed scheduling for Spark tasks across executors Key: SPARK-17637 URL: https://issues.apache.org/jira/browse/SPARK-17637 Project: Spark Issue Type:

[jira] [Created] (SPARK-17526) Display the executor log links with the job failure message on Spark UI and Console

2016-09-13 Thread Zhan Zhang (JIRA)
Zhan Zhang created SPARK-17526: -- Summary: Display the executor log links with the job failure message on Spark UI and Console Key: SPARK-17526 URL: https://issues.apache.org/jira/browse/SPARK-17526

[jira] [Updated] (SPARK-15848) Spark unable to read partitioned table in avro format and column name in upper case

2016-06-09 Thread Zhan Zhang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15848?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhan Zhang updated SPARK-15848: --- Affects Version/s: 1.6.1 > Spark unable to read partitioned table in avro format and column name in

[jira] [Commented] (SPARK-15848) Spark unable to read partitioned table in avro format and column name in upper case

2016-06-09 Thread Zhan Zhang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15848?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15323195#comment-15323195 ] Zhan Zhang commented on SPARK-15848: cat > file1.csv< file2.csv< val tbl =

[jira] [Created] (SPARK-15848) Spark unable to read partitioned table in avro format and column name in upper case

2016-06-09 Thread Zhan Zhang (JIRA)
Zhan Zhang created SPARK-15848: -- Summary: Spark unable to read partitioned table in avro format and column name in upper case Key: SPARK-15848 URL: https://issues.apache.org/jira/browse/SPARK-15848

[jira] [Commented] (SPARK-15441) dataset outer join seems to return incorrect result

2016-05-24 Thread Zhan Zhang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15441?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15297881#comment-15297881 ] Zhan Zhang commented on SPARK-15441: Currently new GenericInternalRow(right.output.length) is used as

[jira] [Commented] (SPARK-7009) Build assembly JAR via ant to avoid zip64 problems

2016-01-30 Thread Zhan Zhang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7009?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15125205#comment-15125205 ] Zhan Zhang commented on SPARK-7009: --- Yes. This one is obsolete. > Build assembly JAR via ant to avoid

[jira] [Commented] (SPARK-11075) Spark SQL Thrift Server authentication issue on kerberized yarn cluster

2016-01-22 Thread Zhan Zhang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11075?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15113560#comment-15113560 ] Zhan Zhang commented on SPARK-11075: Duplicate of SPARK-5159? > Spark SQL Thrift Server

[jira] [Commented] (SPARK-5159) Thrift server does not respect hive.server2.enable.doAs=true

2016-01-15 Thread Zhan Zhang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5159?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15102183#comment-15102183 ] Zhan Zhang commented on SPARK-5159: --- What happens if a user has valid access to a table, which will be

[jira] [Comment Edited] (SPARK-5159) Thrift server does not respect hive.server2.enable.doAs=true

2016-01-15 Thread Zhan Zhang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5159?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15102183#comment-15102183 ] Zhan Zhang edited comment on SPARK-5159 at 1/15/16 5:50 PM: What happens if an

[jira] [Commented] (SPARK-5159) Thrift server does not respect hive.server2.enable.doAs=true

2016-01-14 Thread Zhan Zhang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5159?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15098734#comment-15098734 ] Zhan Zhang commented on SPARK-5159: --- This issue is definitely broken. But fixing it needs a complete

[jira] [Commented] (SPARK-11704) Optimize the Cartesian Join

2015-11-14 Thread Zhan Zhang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11704?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15005682#comment-15005682 ] Zhan Zhang commented on SPARK-11704: [~maropu] You are right. I mean fetching from network is a big

[jira] [Commented] (SPARK-11705) Eliminate unnecessary Cartesian Join

2015-11-13 Thread Zhan Zhang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11705?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15004744#comment-15004744 ] Zhan Zhang commented on SPARK-11705: simple reproduce step: import sqlContext.implicits._ case class

[jira] [Comment Edited] (SPARK-11704) Optimize the Cartesian Join

2015-11-13 Thread Zhan Zhang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11704?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15005136#comment-15005136 ] Zhan Zhang edited comment on SPARK-11704 at 11/14/15 5:16 AM: -- I think we

[jira] [Commented] (SPARK-11704) Optimize the Cartesian Join

2015-11-13 Thread Zhan Zhang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11704?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15005134#comment-15005134 ] Zhan Zhang commented on SPARK-11704: [~maropu] Maybe I misunderstand. If RDD2 is coming from

[jira] [Commented] (SPARK-11704) Optimize the Cartesian Join

2015-11-13 Thread Zhan Zhang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11704?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15005136#comment-15005136 ] Zhan Zhang commented on SPARK-11704: I think we can add a cleanup hook in SQLContext, and when the

[jira] [Created] (SPARK-11704) Optimize the Cartesian Join

2015-11-12 Thread Zhan Zhang (JIRA)
Zhan Zhang created SPARK-11704: -- Summary: Optimize the Cartesian Join Key: SPARK-11704 URL: https://issues.apache.org/jira/browse/SPARK-11704 Project: Spark Issue Type: Bug

[jira] [Updated] (SPARK-11704) Optimize the Cartesian Join

2015-11-12 Thread Zhan Zhang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11704?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhan Zhang updated SPARK-11704: --- Issue Type: Improvement (was: Bug) > Optimize the Cartesian Join > --- > >

[jira] [Created] (SPARK-11705) Eliminate unnecessary Cartesian Join

2015-11-12 Thread Zhan Zhang (JIRA)
Zhan Zhang created SPARK-11705: -- Summary: Eliminate unnecessary Cartesian Join Key: SPARK-11705 URL: https://issues.apache.org/jira/browse/SPARK-11705 Project: Spark Issue Type: Improvement

[jira] [Commented] (SPARK-11562) Provide user an option to init SQLContext or HiveContext in spark shell

2015-11-06 Thread Zhan Zhang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11562?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14994661#comment-14994661 ] Zhan Zhang commented on SPARK-11562: Thanks [~jerrylam] for reporting the issue and providing the suggestion.

[jira] [Created] (SPARK-11562) Provide user an option to init SQLContext or HiveContext in spark shell

2015-11-06 Thread Zhan Zhang (JIRA)
Zhan Zhang created SPARK-11562: -- Summary: Provide user an option to init SQLContext or HiveContext in spark shell Key: SPARK-11562 URL: https://issues.apache.org/jira/browse/SPARK-11562 Project: Spark

[jira] [Commented] (SPARK-11087) spark.sql.orc.filterPushdown does not work, No ORC pushdown predicate

2015-10-21 Thread Zhan Zhang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11087?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14967456#comment-14967456 ] Zhan Zhang commented on SPARK-11087: [~patcharee] I tried again, used the step you provided, and

[jira] [Commented] (SPARK-11087) spark.sql.orc.filterPushdown does not work, No ORC pushdown predicate

2015-10-21 Thread Zhan Zhang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11087?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=1496#comment-1496 ] Zhan Zhang commented on SPARK-11087: [~patcharee] I use the embedded hive metastore without any

[jira] [Comment Edited] (SPARK-11087) spark.sql.orc.filterPushdown does not work, No ORC pushdown predicate

2015-10-15 Thread Zhan Zhang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11087?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14959599#comment-14959599 ] Zhan Zhang edited comment on SPARK-11087 at 10/15/15 8:58 PM: -- [~patcharee]

[jira] [Commented] (SPARK-11087) spark.sql.orc.filterPushdown does not work, No ORC pushdown predicate

2015-10-15 Thread Zhan Zhang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11087?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14959599#comment-14959599 ] Zhan Zhang commented on SPARK-11087: [~patcharee] I tried to duplicate your table as much as possible,

[jira] [Commented] (SPARK-11087) spark.sql.orc.filterPushdown does not work, No ORC pushdown predicate

2015-10-15 Thread Zhan Zhang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11087?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14959342#comment-14959342 ] Zhan Zhang commented on SPARK-11087: [~patcharee] I tried a simple case with partition and predicate

[jira] [Commented] (SPARK-11087) spark.sql.orc.filterPushdown does not work, No ORC pushdown predicate

2015-10-14 Thread Zhan Zhang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11087?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14957301#comment-14957301 ] Zhan Zhang commented on SPARK-11087: I will take a look at this one. > spark.sql.orc.filterPushdown

[jira] [Commented] (SPARK-11087) spark.sql.orc.filterPushdown does not work, No ORC pushdown predicate

2015-10-13 Thread Zhan Zhang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11087?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=1499#comment-1499 ] Zhan Zhang commented on SPARK-11087: no matter whether the table is sorted or not, the predicate

[jira] [Commented] (SPARK-10623) turning on predicate pushdown throws nonsuch element exception when RDD is empty

2015-09-15 Thread Zhan Zhang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10623?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14746164#comment-14746164 ] Zhan Zhang commented on SPARK-10623: It is caused by the SearchArgument.Builder not being correctly

[jira] [Commented] (SPARK-10304) Partition discovery does not throw an exception if the dir structure is invalid

2015-09-08 Thread Zhan Zhang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10304?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14735710#comment-14735710 ] Zhan Zhang commented on SPARK-10304: Did more investigation. Currently all files are included

[jira] [Commented] (SPARK-10304) Partition discovery does not throw an exception if the dir structure is valid

2015-08-31 Thread Zhan Zhang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10304?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14723851#comment-14723851 ] Zhan Zhang commented on SPARK-10304: [~yhuai] I tried to reproduce the problem with the same

[jira] [Commented] (SPARK-10304) Partition discovery does not throw an exception if the dir structure is valid

2015-08-31 Thread Zhan Zhang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10304?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14723994#comment-14723994 ] Zhan Zhang commented on SPARK-10304: [~yhuai] I think the NPE is caused by the directory having multiple

[jira] [Commented] (SPARK-10304) Partition discovery does not throw an exception if the dir structure is valid

2015-08-31 Thread Zhan Zhang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10304?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14724122#comment-14724122 ] Zhan Zhang commented on SPARK-10304: [~lian cheng] forget about my question. From the code, it is not

[jira] [Commented] (SPARK-10304) Partition discovery does not throw an exception if the dir structure is valid

2015-08-27 Thread Zhan Zhang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10304?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14717951#comment-14717951 ] Zhan Zhang commented on SPARK-10304: [~yhuai] Thanks for the information, and initial

[jira] [Commented] (SPARK-10304) Need to add a null check in unwrapperFor in HiveInspectors

2015-08-27 Thread Zhan Zhang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10304?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14716204#comment-14716204 ] Zhan Zhang commented on SPARK-10304: [~yhuai] Is the field.getFieldObjectInspector

[jira] [Commented] (SPARK-10304) Need to add a null check in unwrapperFor in HiveInspectors

2015-08-26 Thread Zhan Zhang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10304?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14715958#comment-14715958 ] Zhan Zhang commented on SPARK-10304: [~yhuai] NP. Will look at it. Need to add a

[jira] [Commented] (SPARK-5111) HiveContext and Thriftserver cannot work in secure cluster beyond hadoop2.5

2015-08-11 Thread Zhan Zhang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5111?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14682189#comment-14682189 ] Zhan Zhang commented on SPARK-5111: --- [~adkathu...@yahoo.com] Hive is upgraded to 1.2 in

[jira] [Commented] (SPARK-5111) HiveContext and Thriftserver cannot work in secure cluster beyond hadoop2.5

2015-08-07 Thread Zhan Zhang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5111?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14662227#comment-14662227 ] Zhan Zhang commented on SPARK-5111: --- Since the Hive upgrade is done, this JIRA is not valid

[jira] [Commented] (SPARK-8501) ORC data source may give empty schema if an ORC file containing zero rows is picked for schema discovery

2015-07-02 Thread Zhan Zhang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-8501?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14612643#comment-14612643 ] Zhan Zhang commented on SPARK-8501: --- Because in spark, we will not create the orc file

[jira] [Commented] (SPARK-2883) Spark Support for ORCFile format

2015-06-26 Thread Zhan Zhang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2883?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14603166#comment-14603166 ] Zhan Zhang commented on SPARK-2883: --- [~philclaridge] Please refer to the test case in

[jira] [Commented] (SPARK-2883) Spark Support for ORCFile format

2015-06-26 Thread Zhan Zhang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2883?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14603164#comment-14603164 ] Zhan Zhang commented on SPARK-2883: --- [~biao luo] saveAsOrcFile and orcFile are not

[jira] [Commented] (SPARK-2883) Spark Support for ORCFile format

2015-06-26 Thread Zhan Zhang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2883?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14603669#comment-14603669 ] Zhan Zhang commented on SPARK-2883: --- [~philclaridge] I tried the spark-shell in local

[jira] [Commented] (SPARK-5111) HiveContext and Thriftserver cannot work in secure cluster beyond hadoop2.5

2015-06-22 Thread Zhan Zhang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5111?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14596244#comment-14596244 ] Zhan Zhang commented on SPARK-5111: --- [~bolke] Thanks for the feedback. I will take a

[jira] [Commented] (SPARK-6112) Provide external block store support through HDFS RAM_DISK

2015-06-22 Thread Zhan Zhang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6112?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14596277#comment-14596277 ] Zhan Zhang commented on SPARK-6112: --- [~bghit] Here is one example link for the ramdisk

[jira] [Commented] (SPARK-6112) Provide external block store support through HDFS RAM_DISK

2015-06-20 Thread Zhan Zhang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6112?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14594935#comment-14594935 ] Zhan Zhang commented on SPARK-6112: --- Thanks [~arpitagarwal] for the detailed setup.

[jira] [Commented] (SPARK-7009) Build assembly JAR via ant to avoid zip64 problems

2015-06-17 Thread Zhan Zhang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7009?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14590680#comment-14590680 ] Zhan Zhang commented on SPARK-7009: --- [~airhorns] Please refer to

[jira] [Commented] (SPARK-7009) Build assembly JAR via ant to avoid zip64 problems

2015-06-16 Thread Zhan Zhang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7009?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14589108#comment-14589108 ] Zhan Zhang commented on SPARK-7009: --- The PR may be outdated, and not working against

[jira] [Updated] (SPARK-6112) Provide external block store support through HDFS RAM_DISK

2015-05-27 Thread Zhan Zhang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6112?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhan Zhang updated SPARK-6112: -- Summary: Provide external block store support through HDFS RAM_DISK (was: Provide OffHeap support

[jira] [Commented] (SPARK-5111) HiveContext and Thriftserver cannot work in secure cluster beyond hadoop2.5

2015-04-13 Thread Zhan Zhang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5111?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14493483#comment-14493483 ] Zhan Zhang commented on SPARK-5111: --- [~crystal_gaoyu] I am not sure. You may try to

[jira] [Commented] (SPARK-6479) Create off-heap block storage API (internal)

2015-04-08 Thread Zhan Zhang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6479?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14486143#comment-14486143 ] Zhan Zhang commented on SPARK-6479: --- [~rxin] I updated the doc. If you think the overall

[jira] [Updated] (SPARK-6479) Create off-heap block storage API (internal)

2015-04-08 Thread Zhan Zhang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6479?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhan Zhang updated SPARK-6479: -- Attachment: spark-6479-tachyon.patch patch with Tachyon migration. Not complete patch, as it will add

[jira] [Updated] (SPARK-6479) Create off-heap block storage API (internal)

2015-04-08 Thread Zhan Zhang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6479?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhan Zhang updated SPARK-6479: -- Attachment: SPARK-6479OffheapAPIdesign (1).pdf Add exception from implementation Create off-heap

[jira] [Comment Edited] (SPARK-2883) Spark Support for ORCFile format

2015-04-02 Thread Zhan Zhang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2883?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14393141#comment-14393141 ] Zhan Zhang edited comment on SPARK-2883 at 4/2/15 7:54 PM: ---

[jira] [Comment Edited] (SPARK-6479) Create off-heap block storage API (internal)

2015-04-02 Thread Zhan Zhang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6479?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14393590#comment-14393590 ] Zhan Zhang edited comment on SPARK-6479 at 4/2/15 10:24 PM:

[jira] [Updated] (SPARK-6479) Create off-heap block storage API (internal)

2015-04-02 Thread Zhan Zhang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6479?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhan Zhang updated SPARK-6479: -- Attachment: SPARK-6479OffheapAPIdesign.pdf Add failure case handling overall design and example.

[jira] [Commented] (SPARK-6112) Provide OffHeap support through HDFS RAM_DISK

2015-04-02 Thread Zhan Zhang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6112?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14393287#comment-14393287 ] Zhan Zhang commented on SPARK-6112: --- Design spec for API attached to SPARK-6479 and wait

[jira] [Updated] (SPARK-6479) Create off-heap block storage API (internal)

2015-04-02 Thread Zhan Zhang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6479?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhan Zhang updated SPARK-6479: -- Attachment: SPARK-6479.pdf This is the updated version for offheap store internal api design. Create

[jira] [Comment Edited] (SPARK-6479) Create off-heap block storage API (internal)

2015-04-02 Thread Zhan Zhang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6479?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14393590#comment-14393590 ] Zhan Zhang edited comment on SPARK-6479 at 4/2/15 10:23 PM:

[jira] [Commented] (SPARK-6479) Create off-heap block storage API (internal)

2015-04-02 Thread Zhan Zhang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6479?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14393590#comment-14393590 ] Zhan Zhang commented on SPARK-6479: --- [~rxin] Thanks for the feedback. I updated the

[jira] [Commented] (SPARK-2883) Spark Support for ORCFile format

2015-04-02 Thread Zhan Zhang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2883?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14393141#comment-14393141 ] Zhan Zhang commented on SPARK-2883: --- The following code demonstrates the usage of the orc

[jira] [Commented] (SPARK-3720) support ORC in spark sql

2015-04-02 Thread Zhan Zhang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3720?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14393146#comment-14393146 ] Zhan Zhang commented on SPARK-3720: --- [~iward] I have updated the patch with the new API

[jira] [Commented] (SPARK-6479) Create off-heap block storage API (internal)

2015-03-27 Thread Zhan Zhang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6479?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14384872#comment-14384872 ] Zhan Zhang commented on SPARK-6479: --- I have a short version for this API and will post

[jira] [Commented] (SPARK-3720) support ORC in spark sql

2015-03-23 Thread Zhan Zhang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3720?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14376361#comment-14376361 ] Zhan Zhang commented on SPARK-3720: --- [~iward] Since this JIRA is a duplicate of

[jira] [Updated] (SPARK-6112) Provide OffHeap support through HDFS RAM_DISK

2015-03-23 Thread Zhan Zhang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6112?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhan Zhang updated SPARK-6112: -- Attachment: SparkOffheapsupportbyHDFS.pdf Design doc for hdfs offheap support Provide OffHeap support

[jira] [Updated] (SPARK-6479) Create off-heap block storage API (internal)

2015-03-23 Thread Zhan Zhang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6479?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhan Zhang updated SPARK-6479: -- Attachment: SparkOffheapsupportbyHDFS.pdf The design doc also includes stuff from SPARK-6112 Create

[jira] [Commented] (SPARK-6479) Create off-heap block storage API (internal)

2015-03-23 Thread Zhan Zhang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6479?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14376978#comment-14376978 ] Zhan Zhang commented on SPARK-6479: --- The current API may not be good enough as it has

[jira] [Updated] (SPARK-6112) Provide OffHeap support through HDFS RAM_DISK

2015-03-19 Thread Zhan Zhang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6112?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhan Zhang updated SPARK-6112: -- Summary: Provide OffHeap support through HDFS RAM_DISK (was: Leverage HDFS RAM_DISK capacity to
