[jira] [Created] (SPARK-17555) ExternalShuffleBlockResolver fails randomly with External Shuffle Service and Dynamic Resource Allocation on Mesos running under Marathon

2016-09-15 Thread Brad Willard (JIRA)
Brad Willard created SPARK-17555: Summary: ExternalShuffleBlockResolver fails randomly with External Shuffle Service and Dynamic Resource Allocation on Mesos running under Marathon Key: SPARK-17555 URL: https://i

[jira] [Commented] (SPARK-14564) Python Word2Vec missing setWindowSize method

2016-04-19 Thread Brad Willard (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14564?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15248230#comment-15248230 ] Brad Willard commented on SPARK-14564: -- Do you guys think it's possible to get this

[jira] [Created] (SPARK-14564) Python Word2Vec missing setWindowSize method

2016-04-12 Thread Brad Willard (JIRA)
Brad Willard created SPARK-14564: Summary: Python Word2Vec missing setWindowSize method Key: SPARK-14564 URL: https://issues.apache.org/jira/browse/SPARK-14564 Project: Spark Issue Type: Bug

[jira] [Commented] (SPARK-11428) Schema Merging Broken for Some Queries

2015-11-03 Thread Brad Willard (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11428?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14988337#comment-14988337 ] Brad Willard commented on SPARK-11428: -- As a workaround, you can create a new parqu

[jira] [Updated] (SPARK-11428) Schema Merging Broken for Some Queries

2015-10-30 Thread Brad Willard (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11428?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Brad Willard updated SPARK-11428: - Description: I have data being written into parquet format via spark streaming. The data can cha

[jira] [Created] (SPARK-11428) Schema Merging Broken for Some Queries

2015-10-30 Thread Brad Willard (JIRA)
Brad Willard created SPARK-11428: Summary: Schema Merging Broken for Some Queries Key: SPARK-11428 URL: https://issues.apache.org/jira/browse/SPARK-11428 Project: Spark Issue Type: Bug

[jira] [Comment Edited] (SPARK-10488) No longer possible to create SparkConf in pyspark application

2015-09-08 Thread Brad Willard (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10488?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14735159#comment-14735159 ] Brad Willard edited comment on SPARK-10488 at 9/8/15 5:03 PM: -

[jira] [Commented] (SPARK-10488) No longer possible to create SparkConf in pyspark application

2015-09-08 Thread Brad Willard (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10488?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14735159#comment-14735159 ] Brad Willard commented on SPARK-10488: -- So I have a comical workaround now. I can le

[jira] [Comment Edited] (SPARK-10488) No longer possible to create SparkConf in pyspark application

2015-09-08 Thread Brad Willard (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10488?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14735141#comment-14735141 ] Brad Willard edited comment on SPARK-10488 at 9/8/15 4:54 PM: -

[jira] [Commented] (SPARK-10488) No longer possible to create SparkConf in pyspark application

2015-09-08 Thread Brad Willard (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10488?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14735141#comment-14735141 ] Brad Willard commented on SPARK-10488: -- [~srowen] I have it working via that method

[jira] [Created] (SPARK-10488) No longer possible to create SparkConf in pyspark application

2015-09-08 Thread Brad Willard (JIRA)
Brad Willard created SPARK-10488: Summary: No longer possible to create SparkConf in pyspark application Key: SPARK-10488 URL: https://issues.apache.org/jira/browse/SPARK-10488 Project: Spark

[jira] [Issue Comment Deleted] (SPARK-8128) Schema Merging Broken: Dataframe Fails to Recognize Column in Schema

2015-06-30 Thread Brad Willard (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-8128?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Brad Willard updated SPARK-8128: Comment: was deleted (was: This is only broken on queries. I can load the dataframe with the unquer

[jira] [Commented] (SPARK-8128) Schema Merging Broken: Dataframe Fails to Recognize Column in Schema

2015-06-30 Thread Brad Willard (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-8128?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14609100#comment-14609100 ] Brad Willard commented on SPARK-8128: - This is only broken on queries. I can load the

[jira] [Updated] (SPARK-8128) Schema Merging Broken: Dataframe Fails to Recognize Column in Schema

2015-06-12 Thread Brad Willard (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-8128?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Brad Willard updated SPARK-8128: Description: I'm loading a folder of parquet files with about 600 parquet files and loading it into

[jira] [Updated] (SPARK-8128) Schema Merging Broken: Dataframe Fails to Recognize Column in Schema

2015-06-11 Thread Brad Willard (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-8128?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Brad Willard updated SPARK-8128: Description: I'm loading a folder of parquet files with about 600 parquet files and loading it into

[jira] [Comment Edited] (SPARK-8128) Schema Merging Broken: Dataframe Fails to Recognize Column in Schema

2015-06-11 Thread Brad Willard (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-8128?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14576027#comment-14576027 ] Brad Willard edited comment on SPARK-8128 at 6/11/15 9:36 PM: --

[jira] [Updated] (SPARK-8128) Schema Merging Broken: Dataframe Fails to Recognize Column in Schema

2015-06-11 Thread Brad Willard (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-8128?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Brad Willard updated SPARK-8128: Description: I'm loading a folder of parquet files with about 600 parquet files and loading it into

[jira] [Updated] (SPARK-8128) Schema Merging Broken: Dataframe Fails to Recognize Column in Schema

2015-06-11 Thread Brad Willard (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-8128?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Brad Willard updated SPARK-8128: Summary: Schema Merging Broken: Dataframe Fails to Recognize Column in Schema (was: Dataframe Fails

[jira] [Commented] (SPARK-8128) Dataframe Fails to Recognize Column in Schema

2015-06-11 Thread Brad Willard (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-8128?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14582543#comment-14582543 ] Brad Willard commented on SPARK-8128: - I have more logging from the job before it dies

[jira] [Updated] (SPARK-8128) Dataframe Fails to Recognize Column in Schema

2015-06-11 Thread Brad Willard (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-8128?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Brad Willard updated SPARK-8128: Affects Version/s: 1.4.0 1.3.0 > Dataframe Fails to Recognize Column in Schem

[jira] [Commented] (SPARK-8128) Dataframe Fails to Recognize Column in Schema

2015-06-06 Thread Brad Willard (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-8128?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14576027#comment-14576027 ] Brad Willard commented on SPARK-8128: - Initially I had thought this was a bug related

[jira] [Updated] (SPARK-8128) Dataframe Fails to Recognize Column in Schema

2015-06-05 Thread Brad Willard (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-8128?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Brad Willard updated SPARK-8128: Description: I'm loading a folder of parquet files with about 600 parquet files and loading it into

[jira] [Created] (SPARK-8128) Dataframe Fails to Recognize Column in Schema

2015-06-05 Thread Brad Willard (JIRA)
Brad Willard created SPARK-8128: --- Summary: Dataframe Fails to Recognize Column in Schema Key: SPARK-8128 URL: https://issues.apache.org/jira/browse/SPARK-8128 Project: Spark Issue Type: Bug

[jira] [Created] (SPARK-7869) Spark Data Frame Fails to Load Postgres Tables with JSONB DataType Columns

2015-05-26 Thread Brad Willard (JIRA)
Brad Willard created SPARK-7869: --- Summary: Spark Data Frame Fails to Load Postgres Tables with JSONB DataType Columns Key: SPARK-7869 URL: https://issues.apache.org/jira/browse/SPARK-7869 Project: Spark

[jira] [Commented] (SPARK-7640) Private VPC with default Spark AMI breaks yum

2015-05-14 Thread Brad Willard (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7640?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14544295#comment-14544295 ] Brad Willard commented on SPARK-7640: - Sounds good. I'm going to try and make a custom

[jira] [Commented] (SPARK-7640) Private VPC with default Spark AMI breaks yum

2015-05-14 Thread Brad Willard (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7640?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14544251#comment-14544251 ] Brad Willard commented on SPARK-7640: - So installing python 27 was just an example to

[jira] [Commented] (SPARK-7640) Private VPC with default Spark AMI breaks yum

2015-05-14 Thread Brad Willard (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7640?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14544242#comment-14544242 ] Brad Willard commented on SPARK-7640: - So the CentOS repo doesn't seem to actually ins

[jira] [Commented] (SPARK-7640) Private VPC with default Spark AMI breaks yum

2015-05-14 Thread Brad Willard (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7640?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14544136#comment-14544136 ] Brad Willard commented on SPARK-7640: - I think this might be working. I disabled the a

[jira] [Commented] (SPARK-7640) Private VPC with default Spark AMI breaks yum

2015-05-14 Thread Brad Willard (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7640?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14544077#comment-14544077 ] Brad Willard commented on SPARK-7640: - I'm happy to try, do you know specifically whic

[jira] [Updated] (SPARK-7640) Private VPC with default Spark AMI breaks yum

2015-05-14 Thread Brad Willard (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7640?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Brad Willard updated SPARK-7640: Description: If you create a spark cluster in a private vpc, the amazon yum repos return 403 permis

[jira] [Commented] (SPARK-7640) Private VPC with default Spark AMI breaks yum

2015-05-14 Thread Brad Willard (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7640?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14544066#comment-14544066 ] Brad Willard commented on SPARK-7640: - I manually implemented the first one just to ge

[jira] [Created] (SPARK-7640) Private VPC with default Spark AMI breaks yum

2015-05-14 Thread Brad Willard (JIRA)
Brad Willard created SPARK-7640: --- Summary: Private VPC with default Spark AMI breaks yum Key: SPARK-7640 URL: https://issues.apache.org/jira/browse/SPARK-7640 Project: Spark Issue Type: Improve

[jira] [Commented] (SPARK-7447) Large Job submission lag when using Parquet w/ Schema Merging

2015-05-08 Thread Brad Willard (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7447?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14535208#comment-14535208 ] Brad Willard commented on SPARK-7447: - Thanks, you are a hero. > Large Job submission

[jira] [Updated] (SPARK-7447) Large Job submission lag when using Parquet w/ Schema Merging

2015-05-07 Thread Brad Willard (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7447?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Brad Willard updated SPARK-7447: Environment: Spark 1.3.1, aws, persistent hdfs version 2 with ebs storage, pyspark, 8 x c3.8xlarge

[jira] [Created] (SPARK-7447) Large Job submission lag when using Parquet w/ Schema Merging

2015-05-07 Thread Brad Willard (JIRA)
Brad Willard created SPARK-7447: --- Summary: Large Job submission lag when using Parquet w/ Schema Merging Key: SPARK-7447 URL: https://issues.apache.org/jira/browse/SPARK-7447 Project: Spark Is

[jira] [Closed] (SPARK-5075) Memory Leak when repartitioning SchemaRDD or running queries in general

2015-04-22 Thread Brad Willard (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5075?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Brad Willard closed SPARK-5075. --- Resolution: Won't Fix > Memory Leak when repartitioning SchemaRDD or running queries in general >

[jira] [Commented] (SPARK-4977) spark-ec2 start resets all the spark/conf configurations

2015-04-21 Thread Brad Willard (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4977?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14504977#comment-14504977 ] Brad Willard commented on SPARK-4977: - I would love to see this addressed. I have larg

[jira] [Created] (SPARK-6982) Data Frame and Spark SQL should allow filtering on key portion of incremental parquet files

2015-04-17 Thread Brad Willard (JIRA)
Brad Willard created SPARK-6982: --- Summary: Data Frame and Spark SQL should allow filtering on key portion of incremental parquet files Key: SPARK-6982 URL: https://issues.apache.org/jira/browse/SPARK-6982

[jira] [Commented] (SPARK-5008) Persistent HDFS does not recognize EBS Volumes

2015-01-13 Thread Brad Willard (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5008?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14275586#comment-14275586 ] Brad Willard commented on SPARK-5008: - [~nchammas] I went ahead and created a cluster

[jira] [Commented] (SPARK-5008) Persistent HDFS does not recognize EBS Volumes

2015-01-11 Thread Brad Willard (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5008?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14272991#comment-14272991 ] Brad Willard commented on SPARK-5008: - [~nchammas] I can try that once I get back into

[jira] [Commented] (SPARK-5008) Persistent HDFS does not recognize EBS Volumes

2015-01-10 Thread Brad Willard (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5008?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14272593#comment-14272593 ] Brad Willard commented on SPARK-5008: - Yes. 1.1.1 was fine.

[jira] [Commented] (SPARK-4778) PySpark Json and groupByKey broken

2015-01-10 Thread Brad Willard (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4778?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14272591#comment-14272591 ] Brad Willard commented on SPARK-4778: - You can close as cannot reproduce. I've alread

[jira] [Commented] (SPARK-5075) Memory Leak when repartitioning SchemaRDD or running queries in general

2015-01-09 Thread Brad Willard (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5075?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14271706#comment-14271706 ] Brad Willard commented on SPARK-5075: - I wanted to add that this is greatly exacerbate

[jira] [Created] (SPARK-5151) Parquet Predicate Pushdown Does Not Work with Nested Structures.

2015-01-08 Thread Brad Willard (JIRA)
Brad Willard created SPARK-5151: --- Summary: Parquet Predicate Pushdown Does Not Work with Nested Structures. Key: SPARK-5151 URL: https://issues.apache.org/jira/browse/SPARK-5151 Project: Spark

[jira] [Updated] (SPARK-5075) Memory Leak when repartitioning SchemaRDD or running queries in general

2015-01-06 Thread Brad Willard (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5075?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Brad Willard updated SPARK-5075: Labels: ec2 json memory-leak memory_leak parquet pyspark repartition s3 (was: ec2 json parquet pysp

[jira] [Updated] (SPARK-5075) Memory Leak when repartitioning SchemaRDD or running queries in general

2015-01-06 Thread Brad Willard (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5075?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Brad Willard updated SPARK-5075: Description: I'm trying to repartition a json dataset for better cpu optimization and save in parqu

[jira] [Updated] (SPARK-5075) Memory Leak when repartitioning SchemaRDD or running queries in general

2015-01-06 Thread Brad Willard (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5075?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Brad Willard updated SPARK-5075: Summary: Memory Leak when repartitioning SchemaRDD or running queries in general (was: Memory Leak

[jira] [Created] (SPARK-5092) Selecting from a nested structure with SparkSQL should return a nested structure

2015-01-05 Thread Brad Willard (JIRA)
Brad Willard created SPARK-5092: --- Summary: Selecting from a nested structure with SparkSQL should return a nested structure Key: SPARK-5092 URL: https://issues.apache.org/jira/browse/SPARK-5092 Project:

[jira] [Created] (SPARK-5075) Memory Leak when repartitioning SchemaRDD from JSON

2015-01-04 Thread Brad Willard (JIRA)
Brad Willard created SPARK-5075: --- Summary: Memory Leak when repartitioning SchemaRDD from JSON Key: SPARK-5075 URL: https://issues.apache.org/jira/browse/SPARK-5075 Project: Spark Issue Type: B

[jira] [Commented] (SPARK-4779) PySpark Shuffle Fails Looking for Files that Don't Exist when low on Memory

2015-01-04 Thread Brad Willard (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4779?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14263925#comment-14263925 ] Brad Willard commented on SPARK-4779: - [~davies] I've already killed the environment a

[jira] [Created] (SPARK-5008) Persistent HDFS does not recognize EBS Volumes

2014-12-30 Thread Brad Willard (JIRA)
Brad Willard created SPARK-5008: --- Summary: Persistent HDFS does not recognize EBS Volumes Key: SPARK-5008 URL: https://issues.apache.org/jira/browse/SPARK-5008 Project: Spark Issue Type: Bug

[jira] [Commented] (SPARK-4779) PySpark Shuffle Fails Looking for Files that Don't Exist when low on Memory

2014-12-30 Thread Brad Willard (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4779?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14261240#comment-14261240 ] Brad Willard commented on SPARK-4779: - I got this error on doing a large group by key

[jira] [Created] (SPARK-4779) PySpark Shuffle Fails Looking for Files that Don't Exist when low on Memory

2014-12-06 Thread Brad Willard (JIRA)
Brad Willard created SPARK-4779: --- Summary: PySpark Shuffle Fails Looking for Files that Don't Exist when low on Memory Key: SPARK-4779 URL: https://issues.apache.org/jira/browse/SPARK-4779 Project: Spar

[jira] [Created] (SPARK-4778) PySpark Json and groupByKey broken

2014-12-06 Thread Brad Willard (JIRA)
Brad Willard created SPARK-4778: --- Summary: PySpark Json and groupByKey broken Key: SPARK-4778 URL: https://issues.apache.org/jira/browse/SPARK-4778 Project: Spark Issue Type: Bug Comp