[jira] [Commented] (SPARK-21271) UnsafeRow.hashCode assertion when sizeInBytes not multiple of 8

2017-06-30 Thread Kazuaki Ishizaki (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21271?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16070973#comment-16070973 ] Kazuaki Ishizaki commented on SPARK-21271: -- I see. I will work on this. Thank you for letting

[jira] [Commented] (SPARK-21271) UnsafeRow.hashCode assertion when sizeInBytes not multiple of 8

2017-06-30 Thread Kazuaki Ishizaki (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21271?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16070966#comment-16070966 ] Kazuaki Ishizaki commented on SPARK-21271: -- I see. This issue comes from the violation

[jira] [Commented] (SPARK-21271) UnsafeRow.hashCode assertion when sizeInBytes not multiple of 8

2017-06-30 Thread Kazuaki Ishizaki (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21271?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16070961#comment-16070961 ] Kazuaki Ishizaki commented on SPARK-21271: -- I see. For var-length part, its regulation

[jira] [Commented] (SPARK-21271) UnsafeRow.hashCode assertion when sizeInBytes not multiple of 8

2017-06-30 Thread Kazuaki Ishizaki (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21271?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16070933#comment-16070933 ] Kazuaki Ishizaki commented on SPARK-21271: -- In {{UnsafeRow}}, the length of fixed part should

[jira] [Updated] (SPARK-20783) Enhance ColumnVector to support compressed representation

2017-06-29 Thread Kazuaki Ishizaki (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20783?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kazuaki Ishizaki updated SPARK-20783: - Description: Current {{ColumnVector}} handles uncompressed data for parquet

[jira] [Updated] (SPARK-20783) Enhance ColumnVector to support compressed representation

2017-06-29 Thread Kazuaki Ishizaki (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20783?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kazuaki Ishizaki updated SPARK-20783: - Summary: Enhance ColumnVector to support compressed representation (was: Enhance

[jira] [Created] (SPARK-21217) Support ColumnVector.Array.toArray()

2017-06-26 Thread Kazuaki Ishizaki (JIRA)
Kazuaki Ishizaki created SPARK-21217: Summary: Support ColumnVector.Array.toArray() Key: SPARK-21217 URL: https://issues.apache.org/jira/browse/SPARK-21217 Project: Spark Issue Type

[jira] [Commented] (SPARK-21074) Parquet files are read fully even though only count() is requested

2017-06-19 Thread Kazuaki Ishizaki (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21074?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16053532#comment-16053532 ] Kazuaki Ishizaki commented on SPARK-21074: -- I think that [~spektom] expected the similar

Re: Java access to internal representation of DataTypes.DateType

2017-06-14 Thread Kazuaki Ishizaki
Does this code help you? https://github.com/apache/spark/blob/master/sql/core/src/test/java/test/org/apache/spark/sql/JavaDataFrameSuite.java#L156-L194 Kazuaki Ishizaki From: Anton Kravchenko <kravchenko.anto...@gmail.com> To: "user @spark" <user@spark.apache.org>

[jira] [Updated] (SPARK-21047) Add test suites for complicated cases in ColumnarBatchSuite

2017-06-12 Thread Kazuaki Ishizaki (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21047?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kazuaki Ishizaki updated SPARK-21047: - Summary: Add test suites for complicated cases in ColumnarBatchSuite (was: Add test

[jira] [Updated] (SPARK-21047) Add test suites for complicated cases

2017-06-11 Thread Kazuaki Ishizaki (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21047?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kazuaki Ishizaki updated SPARK-21047: - Description: Current {{ColumnarBatchSuite}} has very simple test cases for array

[jira] [Updated] (SPARK-21047) Add test suites for complicated cases

2017-06-11 Thread Kazuaki Ishizaki (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21047?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kazuaki Ishizaki updated SPARK-21047: - Summary: Add test suites for complicated cases (was: Add test cases for nested array

[jira] [Commented] (SPARK-21053) Number overflow on agg function of Dataframe

2017-06-11 Thread Kazuaki Ishizaki (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21053?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16045858#comment-16045858 ] Kazuaki Ishizaki commented on SPARK-21053: -- Can you post a program that can reproduce this issue

[jira] [Updated] (SPARK-21047) Add test cases for nested array

2017-06-10 Thread Kazuaki Ishizaki (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21047?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kazuaki Ishizaki updated SPARK-21047: - Summary: Add test cases for nested array (was: Add a test case for nested array) >

[jira] [Created] (SPARK-21047) Add a test case for nested array

2017-06-10 Thread Kazuaki Ishizaki (JIRA)
Kazuaki Ishizaki created SPARK-21047: Summary: Add a test case for nested array Key: SPARK-21047 URL: https://issues.apache.org/jira/browse/SPARK-21047 Project: Spark Issue Type: Sub

[jira] [Commented] (SPARK-20752) Build-in SQL Function Support - SQRT

2017-06-10 Thread Kazuaki Ishizaki (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20752?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16045466#comment-16045466 ] Kazuaki Ishizaki commented on SPARK-20752: -- ping [~smilegator] > Build-in SQL Function Supp

Re: [VOTE] Apache Spark 2.2.0 (RC4)

2017-06-05 Thread Kazuaki Ishizaki
6T13:44:48+09:00 [INFO] Final Memory: 53M/510M [INFO] [WARNING] The requested profile "hive" could not be activated because it does not exist. Kazuaki Ishizaki From: Michael Armbrust <mich...@datab

[jira] [Created] (SPARK-20907) Use testQuietly for test suites that generate long log output

2017-05-27 Thread Kazuaki Ishizaki (JIRA)
Kazuaki Ishizaki created SPARK-20907: Summary: Use testQuietly for test suites that generate long log output Key: SPARK-20907 URL: https://issues.apache.org/jira/browse/SPARK-20907 Project: Spark

[jira] [Commented] (SPARK-19372) Code generation for Filter predicate including many OR conditions exceeds JVM method size limit

2017-05-25 Thread Kazuaki Ishizaki (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19372?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16025789#comment-16025789 ] Kazuaki Ishizaki commented on SPARK-19372: -- I see. Let me create a PR for 2.2.0 > C

[jira] [Commented] (SPARK-20866) Dataset map does not respect nullable field

2017-05-24 Thread Kazuaki Ishizaki (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20866?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16023435#comment-16023435 ] Kazuaki Ishizaki commented on SPARK-20866: -- I confirmed that it has been fixed in master branch

[jira] [Comment Edited] (SPARK-20825) Generate code to get value from table cache with wider column in ColumnarBatch for other data types

2017-05-22 Thread Kazuaki Ishizaki (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20825?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16018746#comment-16018746 ] Kazuaki Ishizaki edited comment on SPARK-20825 at 5/23/17 4:46 AM

[jira] [Updated] (SPARK-20825) Generate code to get value from table cache with wider column in ColumnarBatch for other data types

2017-05-22 Thread Kazuaki Ishizaki (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20825?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kazuaki Ishizaki updated SPARK-20825: - Summary: Generate code to get value from table cache with wider column in ColumnarBatch

[jira] [Updated] (SPARK-20824) Generate code to get value from table cache with wider column in ColumnarBatch

2017-05-22 Thread Kazuaki Ishizaki (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20824?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kazuaki Ishizaki updated SPARK-20824: - Summary: Generate code to get value from table cache with wider column in ColumnarBatch

[jira] [Comment Edited] (SPARK-20824) Generate code to get value from table cache with wider column in ColumnarBatch

2017-05-22 Thread Kazuaki Ishizaki (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16018744#comment-16018744 ] Kazuaki Ishizaki edited comment on SPARK-20824 at 5/23/17 4:45 AM

[jira] [Comment Edited] (SPARK-20823) Generate code to build table cache using ColumnarBatch and to get value from ColumnVector for other types

2017-05-22 Thread Kazuaki Ishizaki (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20823?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16018742#comment-16018742 ] Kazuaki Ishizaki edited comment on SPARK-20823 at 5/23/17 4:43 AM

[jira] [Updated] (SPARK-20823) Generate code to build table cache using ColumnarBatch and to get value from ColumnVector for other types

2017-05-22 Thread Kazuaki Ishizaki (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20823?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kazuaki Ishizaki updated SPARK-20823: - Summary: Generate code to build table cache using ColumnarBatch and to get value from

[jira] [Comment Edited] (SPARK-20822) Generate code to build table cache using ColumnarBatch and to get value from ColumnVector

2017-05-22 Thread Kazuaki Ishizaki (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20822?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16018738#comment-16018738 ] Kazuaki Ishizaki edited comment on SPARK-20822 at 5/23/17 4:43 AM

[jira] [Updated] (SPARK-20822) Generate code to build table cache using ColumnarBatch and to get value from ColumnVector

2017-05-22 Thread Kazuaki Ishizaki (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20822?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kazuaki Ishizaki updated SPARK-20822: - Summary: Generate code to build table cache using ColumnarBatch and to get value from

[jira] [Comment Edited] (SPARK-20823) Generate code for building table cache using ColumnarBatch for other types

2017-05-22 Thread Kazuaki Ishizaki (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20823?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16018742#comment-16018742 ] Kazuaki Ishizaki edited comment on SPARK-20823 at 5/22/17 10:35 PM

[jira] [Updated] (SPARK-20823) Generate code for building table cache using ColumnarBatch for other types

2017-05-22 Thread Kazuaki Ishizaki (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20823?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kazuaki Ishizaki updated SPARK-20823: - Summary: Generate code for building table cache using ColumnarBatch for other types

[jira] [Updated] (SPARK-20822) Generate code for building table cache using ColumnarBatch

2017-05-22 Thread Kazuaki Ishizaki (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20822?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kazuaki Ishizaki updated SPARK-20822: - Summary: Generate code for building table cache using ColumnarBatch (was: Generate code

[jira] [Comment Edited] (SPARK-20822) Generate code for building table cache using ColumnarBatch

2017-05-22 Thread Kazuaki Ishizaki (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20822?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16018738#comment-16018738 ] Kazuaki Ishizaki edited comment on SPARK-20822 at 5/22/17 10:34 PM

Re: [build system] jenkins got itself wedged...

2017-05-21 Thread Kazuaki Ishizaki
It has looked fine these days. However, it seems to be going down slowly again... When I tried to view the console log (e.g. https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/77149/consoleFull ), the server returned "proxy error." Regards, Kazuaki Ishizaki From: shane

[jira] [Commented] (SPARK-20826) Support compression/decompression of ColumnVector in generated code

2017-05-21 Thread Kazuaki Ishizaki (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20826?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16018748#comment-16018748 ] Kazuaki Ishizaki commented on SPARK-20826: -- Wait for merge of SPARK-20820, SPARK-20822

[jira] [Commented] (SPARK-20825) Generate code that directly gets a value from each column in CachedBatch for table cache for other data types

2017-05-21 Thread Kazuaki Ishizaki (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20825?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16018746#comment-16018746 ] Kazuaki Ishizaki commented on SPARK-20825: -- Follow-on of SPARK-20824. Generate Java code

[jira] [Commented] (SPARK-20824) Generate code that directly gets a value from each column in CachedBatch for table cache

2017-05-21 Thread Kazuaki Ishizaki (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16018744#comment-16018744 ] Kazuaki Ishizaki commented on SPARK-20824: -- Follow-on of SPARK-20822. Generate Java code

[jira] [Comment Edited] (SPARK-20824) Generate code that directly gets a value from each column in CachedBatch for table cache

2017-05-21 Thread Kazuaki Ishizaki (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16018744#comment-16018744 ] Kazuaki Ishizaki edited comment on SPARK-20824 at 5/21/17 8:18 AM

[jira] [Commented] (SPARK-20822) Generate code for table cache using ColumnarBatch

2017-05-21 Thread Kazuaki Ishizaki (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20822?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16018738#comment-16018738 ] Kazuaki Ishizaki commented on SPARK-20822: -- Waiting for merge of SPARK-20770. Generate Java code

[jira] [Commented] (SPARK-20823) Generate code for table cache using ColumnarBatch for other types

2017-05-21 Thread Kazuaki Ishizaki (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20823?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16018742#comment-16018742 ] Kazuaki Ishizaki commented on SPARK-20823: -- Follow-on of SPARK-20822. Generate Java code

[jira] [Commented] (SPARK-20821) Add compression/decompression of data to ColumnVector for other data types

2017-05-21 Thread Kazuaki Ishizaki (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20821?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16018737#comment-16018737 ] Kazuaki Ishizaki commented on SPARK-20821: -- Follow-on of SPARK-20807. Support other data types

[jira] [Commented] (SPARK-20820) Add compression/decompression of data to ColumnVector for other compression schemes

2017-05-21 Thread Kazuaki Ishizaki (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20820?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16018736#comment-16018736 ] Kazuaki Ishizaki commented on SPARK-20820: -- Follow-on of SPARK-20807. Support additional

[jira] [Created] (SPARK-20826) Support compression/decompression of ColumnVector in generated code

2017-05-21 Thread Kazuaki Ishizaki (JIRA)
Kazuaki Ishizaki created SPARK-20826: Summary: Support compression/decompression of ColumnVector in generated code Key: SPARK-20826 URL: https://issues.apache.org/jira/browse/SPARK-20826 Project

[jira] [Created] (SPARK-20825) Generate code that directly gets a value from each column in CachedBatch for table cache for other data types

2017-05-21 Thread Kazuaki Ishizaki (JIRA)
Kazuaki Ishizaki created SPARK-20825: Summary: Generate code that directly gets a value from each column in CachedBatch for table cache for other data types Key: SPARK-20825 URL: https://issues.apache.org

[jira] [Created] (SPARK-20824) Generate code that a value from each column in CachedBatch for table cache

2017-05-21 Thread Kazuaki Ishizaki (JIRA)
Kazuaki Ishizaki created SPARK-20824: Summary: Generate code that a value from each column in CachedBatch for table cache Key: SPARK-20824 URL: https://issues.apache.org/jira/browse/SPARK-20824

[jira] [Updated] (SPARK-20824) Generate code that directly gets a value from each column in CachedBatch for table cache

2017-05-21 Thread Kazuaki Ishizaki (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20824?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kazuaki Ishizaki updated SPARK-20824: - Summary: Generate code that directly gets a value from each column in CachedBatch

[jira] [Created] (SPARK-20823) Generate code for table cache using ColumnarBatch for other types

2017-05-21 Thread Kazuaki Ishizaki (JIRA)
Kazuaki Ishizaki created SPARK-20823: Summary: Generate code for table cache using ColumnarBatch for other types Key: SPARK-20823 URL: https://issues.apache.org/jira/browse/SPARK-20823 Project

[jira] [Created] (SPARK-20822) Generate code for table cache using ColumnarBatch

2017-05-21 Thread Kazuaki Ishizaki (JIRA)
Kazuaki Ishizaki created SPARK-20822: Summary: Generate code for table cache using ColumnarBatch Key: SPARK-20822 URL: https://issues.apache.org/jira/browse/SPARK-20822 Project: Spark

[jira] [Created] (SPARK-20821) Add compression/decompression of data to ColumnVector for other data types

2017-05-21 Thread Kazuaki Ishizaki (JIRA)
Kazuaki Ishizaki created SPARK-20821: Summary: Add compression/decompression of data to ColumnVector for other data types Key: SPARK-20821 URL: https://issues.apache.org/jira/browse/SPARK-20821

[jira] [Created] (SPARK-20820) Add compression/decompression of data to ColumnVector for other compression schemes

2017-05-21 Thread Kazuaki Ishizaki (JIRA)
Kazuaki Ishizaki created SPARK-20820: Summary: Add compression/decompression of data to ColumnVector for other compression schemes Key: SPARK-20820 URL: https://issues.apache.org/jira/browse/SPARK-20820

[jira] [Comment Edited] (SPARK-20819) Enhance ColumnVector to keep UnsafeArrayData for other types

2017-05-21 Thread Kazuaki Ishizaki (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20819?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16018732#comment-16018732 ] Kazuaki Ishizaki edited comment on SPARK-20819 at 5/21/17 7:58 AM

[jira] [Commented] (SPARK-20819) Enhance ColumnVector to keep UnsafeArrayData for other types

2017-05-21 Thread Kazuaki Ishizaki (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20819?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16018732#comment-16018732 ] Kazuaki Ishizaki commented on SPARK-20819: -- Follow-on of SPARK-20783 to support other type

[jira] [Created] (SPARK-20819) Enhance ColumnVector to keep UnsafeArrayData for other types

2017-05-21 Thread Kazuaki Ishizaki (JIRA)
Kazuaki Ishizaki created SPARK-20819: Summary: Enhance ColumnVector to keep UnsafeArrayData for other types Key: SPARK-20819 URL: https://issues.apache.org/jira/browse/SPARK-20819 Project: Spark

[jira] [Commented] (SPARK-20750) Built-in SQL Function Support - REPLACE

2017-05-20 Thread Kazuaki Ishizaki (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20750?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16018637#comment-16018637 ] Kazuaki Ishizaki commented on SPARK-20750: -- I will work on this > Built-in SQL Function Supp

[jira] [Commented] (SPARK-20749) Built-in SQL Function Support - all variants of LEN[GTH]

2017-05-20 Thread Kazuaki Ishizaki (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16018625#comment-16018625 ] Kazuaki Ishizaki commented on SPARK-20749: -- I will work on this > Built-in SQL Function Supp

[jira] [Commented] (SPARK-20752) Build-in SQL Function Support - SQRT

2017-05-20 Thread Kazuaki Ishizaki (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20752?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16018624#comment-16018624 ] Kazuaki Ishizaki commented on SPARK-20752: -- [~smilegator] Can I confirm this JIRA? According

[jira] [Created] (SPARK-20817) Benchmark.getProcessorName() returns "Unknown processor" on ppc and 390 platforms

2017-05-20 Thread Kazuaki Ishizaki (JIRA)
Kazuaki Ishizaki created SPARK-20817: Summary: Benchmark.getProcessorName() returns "Unknown processor" on ppc and 390 platforms Key: SPARK-20817 URL: https://issues.apache.org/jira/browse/S

[jira] [Commented] (SPARK-20752) Build-in SQL Function Support - SQRT

2017-05-19 Thread Kazuaki Ishizaki (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20752?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16018330#comment-16018330 ] Kazuaki Ishizaki commented on SPARK-20752: -- I am working on this > Build-in SQL Funct

[jira] [Created] (SPARK-20807) Add compression/decompression of data to ColumnVector

2017-05-19 Thread Kazuaki Ishizaki (JIRA)
Kazuaki Ishizaki created SPARK-20807: Summary: Add compression/decompression of data to ColumnVector Key: SPARK-20807 URL: https://issues.apache.org/jira/browse/SPARK-20807 Project: Spark

[jira] [Created] (SPARK-20783) Enhance ColumnVector to keep UnsafeArrayData for array

2017-05-17 Thread Kazuaki Ishizaki (JIRA)
Kazuaki Ishizaki created SPARK-20783: Summary: Enhance ColumnVector to keep UnsafeArrayData for array Key: SPARK-20783 URL: https://issues.apache.org/jira/browse/SPARK-20783 Project: Spark

[jira] [Created] (SPARK-20770) Improve ColumnStats

2017-05-16 Thread Kazuaki Ishizaki (JIRA)
Kazuaki Ishizaki created SPARK-20770: Summary: Improve ColumnStats Key: SPARK-20770 URL: https://issues.apache.org/jira/browse/SPARK-20770 Project: Spark Issue Type: Sub-task

[jira] [Updated] (SPARK-14098) Generate Java code to build CachedColumnarBatch and get values from CachedColumnarBatch when DataFrame.cache() is called

2017-05-16 Thread Kazuaki Ishizaki (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14098?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kazuaki Ishizaki updated SPARK-14098: - Description: [Here|https://docs.google.com/document/d/1

[jira] [Updated] (SPARK-14098) Generate Java code to build CachedColumnarBatch and get values from CachedColumnarBatch when DataFrame.cache() is called

2017-05-16 Thread Kazuaki Ishizaki (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14098?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kazuaki Ishizaki updated SPARK-14098: - Issue Type: Umbrella (was: Improvement) > Generate Java code to bu

[jira] [Commented] (SPARK-14098) Generate Java code to build CachedColumnarBatch and get values from CachedColumnarBatch when DataFrame.cache() is called

2017-05-16 Thread Kazuaki Ishizaki (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14098?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16012488#comment-16012488 ] Kazuaki Ishizaki commented on SPARK-14098: -- [~lins05] Sorry, I overlooked this message. I synced

[jira] [Updated] (SPARK-14098) Generate Java code to build CachedColumnarBatch and get values from CachedColumnarBatch when DataFrame.cache() is called

2017-05-16 Thread Kazuaki Ishizaki (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14098?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kazuaki Ishizaki updated SPARK-14098: - Summary: Generate Java code to build CachedColumnarBatch and get values from

[jira] [Commented] (ARROW-300) [Format] Add buffer compression option to IPC file format

2017-05-12 Thread Kazuaki Ishizaki (JIRA)
[ https://issues.apache.org/jira/browse/ARROW-300?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16008315#comment-16008315 ] Kazuaki Ishizaki commented on ARROW-300: Thank you for your response. I was also busy preparing

Re: [VOTE] Apache Spark 2.2.0 (RC2)

2017-05-09 Thread Kazuaki Ishizaki
9T17:51:04+09:00 [INFO] Final Memory: 53M/514M [INFO] [WARNING] The requested profile "hive" could not be activated because it does not exist. Kazuaki Ishizaki, From: Michael Armbrust <mich...@datab

[jira] [Created] (SPARK-20537) OffHeapColumnVector reallocation may not copy existing data

2017-04-30 Thread Kazuaki Ishizaki (JIRA)
Kazuaki Ishizaki created SPARK-20537: Summary: OffHeapColumnVector reallocation may not copy existing data Key: SPARK-20537 URL: https://issues.apache.org/jira/browse/SPARK-20537 Project: Spark

Re: [VOTE] Apache Spark 2.2.0 (RC1)

2017-04-28 Thread Kazuaki Ishizaki
9T02:23:08+09:00 [INFO] Final Memory: 53M/491M [INFO] ---- Kazuaki Ishizaki, From: Michael Armbrust <mich...@databricks.com> To: "dev@spark.apache.org" <dev@spark.apache.org> Date: 2017/04/28 03:

Re: [VOTE] Apache Spark 2.1.1 (RC4)

2017-04-28 Thread Kazuaki Ishizaki
576M [INFO] Regards, Kazuaki Ishizaki, From: Michael Armbrust <mich...@databricks.com> To: "dev@spark.apache.org" <dev@spark.apache.org> Date: 2017/04/27 09:30 Subject:[VOTE] Apache

[jira] [Created] (SPARK-20479) Performance degradation for large number of hash-aggregated columns

2017-04-26 Thread Kazuaki Ishizaki (JIRA)
Kazuaki Ishizaki created SPARK-20479: Summary: Performance degradation for large number of hash-aggregated columns Key: SPARK-20479 URL: https://issues.apache.org/jira/browse/SPARK-20479 Project

[jira] [Commented] (SPARK-20184) performance regression for complex/long sql when enable whole stage codegen

2017-04-26 Thread Kazuaki Ishizaki (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20184?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15985319#comment-15985319 ] Kazuaki Ishizaki commented on SPARK-20184: -- When # of the aggregated columns gets large, I saw

[jira] [Commented] (SPARK-20392) Slow performance when calling fit on ML pipeline for dataset with many columns but few rows

2017-04-25 Thread Kazuaki Ishizaki (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20392?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15983490#comment-15983490 ] Kazuaki Ishizaki commented on SPARK-20392: -- Here are my observations: According

[jira] [Commented] (SPARK-20392) Slow performance when calling fit on ML pipeline for dataset with many columns but few rows

2017-04-24 Thread Kazuaki Ishizaki (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20392?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15981728#comment-15981728 ] Kazuaki Ishizaki commented on SPARK-20392: -- Thank you. I confirmed that blockbuster.csv is slow

[jira] [Commented] (SPARK-20392) Slow performance when calling fit on ML pipeline for dataset with many columns but few rows

2017-04-21 Thread Kazuaki Ishizaki (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20392?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15978178#comment-15978178 ] Kazuaki Ishizaki commented on SPARK-20392: -- Is there a program to reproduce this while we see

[jira] [Commented] (SPARK-20184) performance regression for complex/long sql when enable whole stage codegen

2017-04-20 Thread Kazuaki Ishizaki (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20184?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15976897#comment-15976897 ] Kazuaki Ishizaki commented on SPARK-20184: -- The root cause is overhead in Java code generated

[jira] [Commented] (ARROW-300) [Format] Add buffer compression option to IPC file format

2017-04-19 Thread Kazuaki Ishizaki (JIRA)
[ https://issues.apache.org/jira/browse/ARROW-300?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15975288#comment-15975288 ] Kazuaki Ishizaki commented on ARROW-300: [~wesmckinn] Thank you for your kind and positive

[jira] [Commented] (SPARK-16845) org.apache.spark.sql.catalyst.expressions.GeneratedClass$SpecificOrdering" grows beyond 64 KB

2017-04-19 Thread Kazuaki Ishizaki (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16845?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15975194#comment-15975194 ] Kazuaki Ishizaki commented on SPARK-16845: -- You are seeing another exception. While [This PR

[jira] [Comment Edited] (SPARK-16845) org.apache.spark.sql.catalyst.expressions.GeneratedClass$SpecificOrdering" grows beyond 64 KB

2017-04-19 Thread Kazuaki Ishizaki (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16845?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15975194#comment-15975194 ] Kazuaki Ishizaki edited comment on SPARK-16845 at 4/19/17 6:05 PM: --- You

[jira] [Comment Edited] (SPARK-18891) Support for specific collection types

2017-04-19 Thread Kazuaki Ishizaki (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18891?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15974841#comment-15974841 ] Kazuaki Ishizaki edited comment on SPARK-18891 at 4/19/17 3:08 PM: --- I

[jira] [Comment Edited] (SPARK-18891) Support for specific collection types

2017-04-19 Thread Kazuaki Ishizaki (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18891?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15974841#comment-15974841 ] Kazuaki Ishizaki edited comment on SPARK-18891 at 4/19/17 3:07 PM: --- I

[jira] [Comment Edited] (SPARK-18891) Support for specific collection types

2017-04-19 Thread Kazuaki Ishizaki (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18891?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15974841#comment-15974841 ] Kazuaki Ishizaki edited comment on SPARK-18891 at 4/19/17 3:00 PM: --- I

[jira] [Commented] (SPARK-18891) Support for specific collection types

2017-04-19 Thread Kazuaki Ishizaki (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18891?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15974841#comment-15974841 ] Kazuaki Ishizaki commented on SPARK-18891: -- I confirmed that the latest master branch works

[jira] [Commented] (SPARK-18492) GeneratedIterator grows beyond 64 KB

2017-04-19 Thread Kazuaki Ishizaki (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18492?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15974766#comment-15974766 ] Kazuaki Ishizaki commented on SPARK-18492: -- Do you see the same problem with the latest master

[jira] [Commented] (SPARK-20184) performance regression for complex/long sql when enable whole stage codegen

2017-04-19 Thread Kazuaki Ishizaki (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20184?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15974682#comment-15974682 ] Kazuaki Ishizaki commented on SPARK-20184: -- I succeeded in reproducing this... {code} % git log

Re: [VOTE] Apache Spark 2.1.1 (RC3)

2017-04-19 Thread Kazuaki Ishizaki
672M [INFO] Regards, Kazuaki Ishizaki, From: Michael Armbrust <mich...@databricks.com> To: "dev@spark.apache.org" <dev@spark.apache.org> Date: 2017/04/19 04:00 Subject:[VOTE] Apache

[jira] [Commented] (SPARK-20341) Support BigIngeger values > 19 precision

2017-04-18 Thread Kazuaki Ishizaki (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20341?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15973343#comment-15973343 ] Kazuaki Ishizaki commented on SPARK-20341: -- This exception is thrown since {{BigInt

[jira] [Commented] (SPARK-20341) Support BigIngeger values > 19 precision

2017-04-18 Thread Kazuaki Ishizaki (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20341?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15973227#comment-15973227 ] Kazuaki Ishizaki commented on SPARK-20341: -- Master branch also throws an exception. > Supp

[jira] [Closed] (SPARK-13644) Add the source file name and line into Logger when an exception occurs in the generated code

2017-04-16 Thread Kazuaki Ishizaki (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13644?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kazuaki Ishizaki closed SPARK-13644. Resolution: Unresolved > Add the source file name and line into Logger when an except

[jira] [Commented] (SPARK-13644) Add the source file name and line into Logger when an exception occurs in the generated code

2017-04-16 Thread Kazuaki Ishizaki (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13644?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15970313#comment-15970313 ] Kazuaki Ishizaki commented on SPARK-13644: -- It is not resolved yet. But, let me close this once

[jira] [Commented] (ARROW-300) [Format] Add buffer compression option to IPC file format

2017-04-10 Thread Kazuaki Ishizaki (JIRA)
[ https://issues.apache.org/jira/browse/ARROW-300?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15963751#comment-15963751 ] Kazuaki Ishizaki commented on ARROW-300: Current Apache Spark supports [the following compression

[jira] [Updated] (SPARK-20254) SPARK-19716 generates unnecessary data conversion for Dataset with primitive array

2017-04-07 Thread Kazuaki Ishizaki (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20254?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kazuaki Ishizaki updated SPARK-20254: - Summary: SPARK-19716 generates unnecessary data conversion for Dataset with primitive

[jira] [Updated] (SPARK-20254) SPARK-19716 generates inefficient Java code from a primitive array of Dataset

2017-04-07 Thread Kazuaki Ishizaki (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20254?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kazuaki Ishizaki updated SPARK-20254: - Description: Since {{unresolvedmapobjects}} is newly introduced by SPARK-19716

[jira] [Updated] (SPARK-20254) SPARK-19716 generates inefficient Java code from a primitive array of Dataset

2017-04-07 Thread Kazuaki Ishizaki (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20254?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kazuaki Ishizaki updated SPARK-20254: - Summary: SPARK-19716 generates inefficient Java code from a primitive array of Dataset

[jira] [Created] (SPARK-20254) SPARK-19716 generate inefficient Java code from a primitive array of Dataset

2017-04-07 Thread Kazuaki Ishizaki (JIRA)
Kazuaki Ishizaki created SPARK-20254: Summary: SPARK-19716 generate inefficient Java code from a primitive array of Dataset Key: SPARK-20254 URL: https://issues.apache.org/jira/browse/SPARK-20254

[jira] [Created] (SPARK-20253) Remove unnecessary nullchecks of a return value from Spark runtime routines in generated Java code

2017-04-07 Thread Kazuaki Ishizaki (JIRA)
Kazuaki Ishizaki created SPARK-20253: Summary: Remove unnecessary nullchecks of a return value from Spark runtime routines in generated Java code Key: SPARK-20253 URL: https://issues.apache.org/jira/browse

[jira] [Comment Edited] (SPARK-20184) performance regression for complex/long sql when enable whole stage codegen

2017-04-07 Thread Kazuaki Ishizaki (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20184?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15956420#comment-15956420 ] Kazuaki Ishizaki edited comment on SPARK-20184 at 4/7/17 8:29 AM

[jira] [Commented] (SPARK-20241) java.lang.IllegalArgumentException: Can not set final [B field org.codehaus.janino.util.ClassFile$CodeAttribute.code to org.codehaus.janino.util.ClassFile$CodeAttribut

2017-04-06 Thread Kazuaki Ishizaki (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20241?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15960275#comment-15960275 ] Kazuaki Ishizaki commented on SPARK-20241: -- I realized that it is not easy to reproduce

[jira] [Commented] (SPARK-20241) java.lang.IllegalArgumentException: Can not set final [B field org.codehaus.janino.util.ClassFile$CodeAttribute.code to org.codehaus.janino.util.ClassFile$CodeAttribut

2017-04-06 Thread Kazuaki Ishizaki (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20241?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15959495#comment-15959495 ] Kazuaki Ishizaki commented on SPARK-20241: -- I think that this problem is caused by using

[jira] [Commented] (SPARK-20210) Scala tests aborted in Spark SQL on ppc64le

2017-04-05 Thread Kazuaki Ishizaki (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20210?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15956436#comment-15956436 ] Kazuaki Ishizaki commented on SPARK-20210: -- I run the following two tests (DatasetCacheSuite
