[jira] [Updated] (SPARK-16070) DataFrame/Parquet issues with primitive arrays
[ https://issues.apache.org/jira/browse/SPARK-16070?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Hyukjin Kwon updated SPARK-16070:
---------------------------------
    Labels: bulk-closed  (was: )

> DataFrame/Parquet issues with primitive arrays
> ----------------------------------------------
>
>                 Key: SPARK-16070
>                 URL: https://issues.apache.org/jira/browse/SPARK-16070
>             Project: Spark
>          Issue Type: Umbrella
>          Components: MLlib, SQL
>    Affects Versions: 2.0.0
>            Reporter: Xiangrui Meng
>            Priority: Major
>              Labels: bulk-closed
>
> I created this umbrella JIRA to track DataFrame/Parquet issues with primitive
> arrays. This is mostly related to machine learning use cases, where feature
> indices/values are stored as (usually large) primitive arrays.
> Issues:
> * SPARK-16043: Tungsten array data is not specialized for primitive types
> * SPARK-16071: Not sufficient array size checks
> ([NegativeArraySizeException|https://issues.apache.org/jira/issues/?jql=project%20%3D%20SPARK%20AND%20text%20~%20NegativeArraySizeException]
> or silent errors)
> * SPARK-16073: Performance of Parquet encodings on saving primitive arrays

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org

[jira] [Updated] (SPARK-16070) DataFrame/Parquet issues with primitive arrays
[ https://issues.apache.org/jira/browse/SPARK-16070?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Xiangrui Meng updated SPARK-16070:
----------------------------------
    Description:
I created this umbrella JIRA to track DataFrame/Parquet issues with primitive arrays. This is mostly related to machine learning use cases, where feature indices/values are stored as (usually large) primitive arrays.

Issues:
* SPARK-16043: Tungsten array data is not specialized for primitive types
* SPARK-16071: Not sufficient array size checks ([NegativeArraySizeException|https://issues.apache.org/jira/issues/?jql=project%20%3D%20SPARK%20AND%20text%20~%20NegativeArraySizeException] or silent errors)
* SPARK-16073: Performance of Parquet encodings on saving primitive arrays

  was:
I created this umbrella JIRA to track DataFrame/Parquet issues with primitive arrays. This is mostly related to machine learning use cases, where feature indices/values are stored as (usually large) primitive arrays.

Issues:
* SPARK-16043: Tungsten array data is not specialized for primitive types
* SPARK-16071: Not sufficient array size checks ([NegativeArraySizeException|https://issues.apache.org/jira/issues/?jql=project%20%3D%20SPARK%20AND%20text%20~%20NegativeArraySizeException] or silent errors)
* Performance of Parquet encodings on saving primitive arrays

[jira] [Updated] (SPARK-16070) DataFrame/Parquet issues with primitive arrays
[ https://issues.apache.org/jira/browse/SPARK-16070?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Xiangrui Meng updated SPARK-16070:
----------------------------------
    Description:
I created this umbrella JIRA to track DataFrame/Parquet issues with primitive arrays. This is mostly related to machine learning use cases, where feature indices/values are stored as (usually large) primitive arrays.

Issues:
* SPARK-16043: Tungsten array data is not specialized for primitive types
* SPARK-16071: Not sufficient array size checks ([NegativeArraySizeException|https://issues.apache.org/jira/issues/?jql=project%20%3D%20SPARK%20AND%20text%20~%20NegativeArraySizeException] or silent errors)
* Performance of Parquet encodings on saving primitive arrays

  was:
I created this umbrella JIRA to track DataFrame/Parquet issues with primitive arrays. This is mostly related to machine learning use cases, where feature indices/values are stored as (usually large) primitive arrays.

Issues:
* SPARK-16043: Tungsten array data is not specialized for primitive types
* Not sufficient array size checks ([NegativeArraySizeException|https://issues.apache.org/jira/issues/?jql=project%20%3D%20SPARK%20AND%20text%20~%20NegativeArraySizeException] or silent errors)
* Performance of Parquet encodings on saving primitive arrays

[jira] [Updated] (SPARK-16070) DataFrame/Parquet issues with primitive arrays
[ https://issues.apache.org/jira/browse/SPARK-16070?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Xiangrui Meng updated SPARK-16070:
----------------------------------
    Description:
I created this umbrella JIRA to track DataFrame/Parquet issues with primitive arrays. This is mostly related to machine learning use cases, where feature indices/values are stored as (usually large) primitive arrays.

Issues:
* SPARK-16043: Tungsten array data is not specialized for primitive types
* Not sufficient array size checks ([NegativeArraySizeException|https://issues.apache.org/jira/issues/?jql=project%20%3D%20SPARK%20AND%20text%20~%20NegativeArraySizeException] or silent errors)
** There
* Performance of Parquet encodings on saving primitive arrays

  was:
I created this umbrella JIRA to track DataFrame/Parquet issues with primitive arrays. This is mostly related to machine learning use cases, where feature indices/values are stored as (usually large) primitive arrays.

Issues:
* SPARK-16043: Tungsten array data is not specialized for primitive types
* Not sufficient array size checks ([[https://issues.apache.org/jira/issues/?jql=project%20%3D%20SPARK%20AND%20text%20~%20NegativeArraySizeException NegativeArraySizeException]] or silent errors)
** There
* Performance of Parquet encodings on saving primitive arrays

[jira] [Updated] (SPARK-16070) DataFrame/Parquet issues with primitive arrays
[ https://issues.apache.org/jira/browse/SPARK-16070?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Xiangrui Meng updated SPARK-16070:
----------------------------------
    Description:
I created this umbrella JIRA to track DataFrame/Parquet issues with primitive arrays. This is mostly related to machine learning use cases, where feature indices/values are stored as (usually large) primitive arrays.

Issues:
* SPARK-16043: Tungsten array data is not specialized for primitive types
* Not sufficient array size checks ([NegativeArraySizeException|https://issues.apache.org/jira/issues/?jql=project%20%3D%20SPARK%20AND%20text%20~%20NegativeArraySizeException] or silent errors)
* Performance of Parquet encodings on saving primitive arrays

  was:
I created this umbrella JIRA to track DataFrame/Parquet issues with primitive arrays. This is mostly related to machine learning use cases, where feature indices/values are stored as (usually large) primitive arrays.

Issues:
* SPARK-16043: Tungsten array data is not specialized for primitive types
* Not sufficient array size checks ([NegativeArraySizeException|https://issues.apache.org/jira/issues/?jql=project%20%3D%20SPARK%20AND%20text%20~%20NegativeArraySizeException] or silent errors)
** There
* Performance of Parquet encodings on saving primitive arrays

[jira] [Updated] (SPARK-16070) DataFrame/Parquet issues with primitive arrays
[ https://issues.apache.org/jira/browse/SPARK-16070?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Xiangrui Meng updated SPARK-16070:
----------------------------------
    Description:
I created this umbrella JIRA to track DataFrame/Parquet issues with primitive arrays. This is mostly related to machine learning use cases, where feature indices/values are stored as (usually large) primitive arrays.

Issues:
* SPARK-16043: Tungsten array data is not specialized for primitive types
* Not sufficient array size checks ([[https://issues.apache.org/jira/issues/?jql=project%20%3D%20SPARK%20AND%20text%20~%20NegativeArraySizeException NegativeArraySizeException]] or silent errors)
** There
* Performance of Parquet encodings on saving primitive arrays

  was:
I created this umbrella JIRA to track DataFrame/Parquet issues with primitive arrays. This is mostly related to machine learning use cases, where feature indices/values are stored as (usually large) primitive arrays.

Issues:
* SPARK-16043: Tungsten array data is not specialized for primitive types
* Not sufficient array size checks ([NegativeArraySizeException](https://issues.apache.org/jira/issues/?jql=project%20%3D%20SPARK%20AND%20text%20~%20NegativeArraySizeException) or silent errors)
** There
* Performance of Parquet encodings on saving primitive arrays

[jira] [Updated] (SPARK-16070) DataFrame/Parquet issues with primitive arrays
[ https://issues.apache.org/jira/browse/SPARK-16070?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Xiangrui Meng updated SPARK-16070:
----------------------------------
    Description:
I created this umbrella JIRA to track DataFrame/Parquet issues with primitive arrays. This is mostly related to machine learning use cases, where feature indices/values are stored as (usually large) primitive arrays.

Issues:
* SPARK-16043: Tungsten array data is not specialized for primitive types
* Not sufficient array size checks ([NegativeArraySizeException](https://issues.apache.org/jira/issues/?jql=project%20%3D%20SPARK%20AND%20text%20~%20NegativeArraySizeException) or silent errors)
** There
* Performance of Parquet encodings on saving primitive arrays

  was:
I created this umbrella JIRA to track DataFrame/Parquet issues with primitive arrays. This is mostly related to machine learning use cases, where feature indices/values are stored as primitive arrays.