[ https://issues.apache.org/jira/browse/SPARK-16070?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Xiangrui Meng updated SPARK-16070:
----------------------------------
    Description:
I created this umbrella JIRA to track DataFrame/Parquet issues with primitive arrays. This is mostly related to machine learning use cases, where feature indices/values are stored as (usually large) primitive arrays.

Issues:
* SPARK-16043: Tungsten array data is not specialized for primitive types
* SPARK-16071: Not sufficient array size checks ([NegativeArraySizeException|https://issues.apache.org/jira/issues/?jql=project%20%3D%20SPARK%20AND%20text%20~%20NegativeArraySizeException] or silent errors)
* SPARK-16073: Performance of Parquet encodings on saving primitive arrays

  was:
I created this umbrella JIRA to track DataFrame/Parquet issues with primitive arrays. This is mostly related to machine learning use cases, where feature indices/values are stored as (usually large) primitive arrays.

Issues:
* SPARK-16043: Tungsten array data is not specialized for primitive types
* SPARK-16071: Not sufficient array size checks ([NegativeArraySizeException|https://issues.apache.org/jira/issues/?jql=project%20%3D%20SPARK%20AND%20text%20~%20NegativeArraySizeException] or silent errors)
* Performance of Parquet encodings on saving primitive arrays

> DataFrame/Parquet issues with primitive arrays
> ----------------------------------------------
>
>                 Key: SPARK-16070
>                 URL: https://issues.apache.org/jira/browse/SPARK-16070
>             Project: Spark
>          Issue Type: Umbrella
>          Components: MLlib, SQL
>    Affects Versions: 2.0.0
>            Reporter: Xiangrui Meng
>
> I created this umbrella JIRA to track DataFrame/Parquet issues with primitive
> arrays. This is mostly related to machine learning use cases, where feature
> indices/values are stored as (usually large) primitive arrays.
> Issues:
> * SPARK-16043: Tungsten array data is not specialized for primitive types
> * SPARK-16071: Not sufficient array size checks
> ([NegativeArraySizeException|https://issues.apache.org/jira/issues/?jql=project%20%3D%20SPARK%20AND%20text%20~%20NegativeArraySizeException]
> or silent errors)
> * SPARK-16073: Performance of Parquet encodings on saving primitive arrays