[jira] [Updated] (SPARK-22217) ParquetFileFormat to support arbitrary OutputCommitters

2017-10-06 Thread Steve Loughran (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22217?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Steve Loughran updated SPARK-22217: --- Priority: Minor (was: Major) > ParquetFileFormat to support arbitrary OutputCommitters > ---

[jira] [Commented] (SPARK-22240) S3 CSV number of partitions incorrectly computed

2017-10-11 Thread Steve Loughran (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22240?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16200367#comment-16200367 ] Steve Loughran commented on SPARK-22240: Amazon EMR is Amazon's own fork of Spark

[jira] [Commented] (SPARK-22240) S3 CSV number of partitions incorrectly computed

2017-10-11 Thread Steve Loughran (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22240?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16200485#comment-16200485 ] Steve Loughran commented on SPARK-22240: What's the link to the multiline JIRA? A

[jira] [Commented] (SPARK-22240) S3 CSV number of partitions incorrectly computed

2017-10-12 Thread Steve Loughran (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22240?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16201863#comment-16201863 ] Steve Loughran commented on SPARK-22240: thanks. Now for a question which is prob

[jira] [Commented] (SPARK-21797) spark cannot read partitioned data in S3 that are partly in glacier

2017-10-12 Thread Steve Loughran (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21797?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16201917#comment-16201917 ] Steve Loughran commented on SPARK-21797: Update, in HADOOP-14874 I've noted we co

[jira] [Commented] (SPARK-22240) S3 CSV number of partitions incorrectly computed

2017-10-13 Thread Steve Loughran (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22240?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16203495#comment-16203495 ] Steve Loughran commented on SPARK-22240: We've got a test in HADOOP-14943 which l

[jira] [Commented] (SPARK-22240) S3 CSV number of partitions incorrectly computed

2017-10-13 Thread Steve Loughran (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22240?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16203709#comment-16203709 ] Steve Loughran commented on SPARK-22240: [~hyukjin.kwon]: we now see that on s3a,

[jira] [Commented] (SPARK-22240) S3 CSV number of partitions incorrectly computed

2017-10-13 Thread Steve Loughran (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22240?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16203966#comment-16203966 ] Steve Loughran commented on SPARK-22240: Point me at a simple test suite for the

[jira] [Commented] (SPARK-2984) FileNotFoundException on _temporary directory

2017-10-17 Thread Steve Loughran (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2984?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16207573#comment-16207573 ] Steve Loughran commented on SPARK-2984: --- bq. multiple batches writing to same locati

[jira] [Commented] (SPARK-22240) S3 CSV number of partitions incorrectly computed

2017-10-24 Thread Steve Loughran (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22240?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16217325#comment-16217325 ] Steve Loughran commented on SPARK-22240: I'm doing some testing with master & rea

[jira] [Commented] (SPARK-22240) S3 CSV number of partitions incorrectly computed

2017-10-24 Thread Steve Loughran (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22240?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16217398#comment-16217398 ] Steve Loughran commented on SPARK-22240: no, spark 2.2 doesn't fix this. I have

[jira] [Commented] (SPARK-22240) S3 CSV number of partitions incorrectly computed

2017-10-28 Thread Steve Loughran (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22240?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16223400#comment-16223400 ] Steve Loughran commented on SPARK-22240: so this partition calculation problem is

[jira] [Commented] (SPARK-2984) FileNotFoundException on _temporary directory

2017-11-01 Thread Steve Loughran (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2984?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16233924#comment-16233924 ] Steve Loughran commented on SPARK-2984: --- Darron: different stack trace, different pa

[jira] [Commented] (SPARK-2984) FileNotFoundException on _temporary directory

2017-11-01 Thread Steve Loughran (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2984?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16233937#comment-16233937 ] Steve Loughran commented on SPARK-2984: --- [~soumdmw] you asked bq. is there a simple

[jira] [Commented] (SPARK-17593) list files on s3 very slow

2017-11-10 Thread Steve Loughran (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17593?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16247886#comment-16247886 ] Steve Loughran commented on SPARK-17593: Hey nick, yes, need to move to FileSys

[jira] [Comment Edited] (SPARK-19111) S3 Mesos history upload fails silently if too large

2017-01-09 Thread Steve Loughran (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19111?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15811767#comment-15811767 ] Steve Loughran edited comment on SPARK-19111 at 1/9/17 1:26 PM: ---

[jira] [Commented] (SPARK-19100) Schedule tasks in descending order of estimated input size / estimated task duration

2017-01-09 Thread Steve Loughran (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19100?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15811786#comment-15811786 ] Steve Loughran commented on SPARK-19100: it's hard to imagine any dataset where l

[jira] [Commented] (SPARK-19100) Schedule tasks in descending order of estimated input size / estimated task duration

2017-01-09 Thread Steve Loughran (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19100?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15811797#comment-15811797 ] Steve Loughran commented on SPARK-19100: Relevant citations * Grover and Carey,

[jira] [Commented] (SPARK-19111) S3 Mesos history upload fails silently if too large

2017-01-13 Thread Steve Loughran (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19111?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15821773#comment-15821773 ] Steve Loughran commented on SPARK-19111: Just realised one more thing If the all

[jira] [Updated] (SPARK-11353) Writing to S3 buckets, which only support AWS4-HMAC-SHA256 fails with s3n URLs

2017-01-15 Thread Steve Loughran (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11353?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Steve Loughran updated SPARK-11353: --- Summary: Writing to S3 buckets, which only support AWS4-HMAC-SHA256 fails with s3n URLs (was

[jira] [Resolved] (SPARK-11353) Writing to S3 buckets, which only support AWS4-HMAC-SHA256 fails with s3n URLs

2017-01-15 Thread Steve Loughran (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11353?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Steve Loughran resolved SPARK-11353. Resolution: Duplicate This is a duplicate of SPARK-13044; that's transitively a WONTFIX due t

[jira] [Commented] (SPARK-19111) S3 Mesos history upload fails silently if too large

2017-02-01 Thread Steve Loughran (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19111?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15848403#comment-15848403 ] Steve Loughran commented on SPARK-19111: Charles, you might also want to keep an

[jira] [Comment Edited] (SPARK-19407) defaultFS is used FileSystem.get instead of getting it from uri scheme

2017-02-03 Thread Steve Loughran (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19407?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15851290#comment-15851290 ] Steve Loughran edited comment on SPARK-19407 at 2/3/17 10:17 AM: --

[jira] [Commented] (SPARK-19407) defaultFS is used FileSystem.get instead of getting it from uri scheme

2017-02-03 Thread Steve Loughran (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19407?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15851290#comment-15851290 ] Steve Loughran commented on SPARK-19407: Yes, looks like {{StreamMetadata.read()

[jira] [Commented] (SPARK-19715) Option to Strip Paths in FileSource

2017-02-24 Thread Steve Loughran (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19715?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15882995#comment-15882995 ] Steve Loughran commented on SPARK-19715: This is a silly question, but has the si

[jira] [Updated] (SPARK-14561) History Server does not see new logs in S3

2017-02-24 Thread Steve Loughran (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14561?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Steve Loughran updated SPARK-14561: --- Component/s: Spark Core > History Server does not see new logs in S3 > --

[jira] [Commented] (SPARK-19715) Option to Strip Paths in FileSource

2017-02-25 Thread Steve Loughran (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19715?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15884230#comment-15884230 ] Steve Loughran commented on SPARK-19715: OK. I'd recommend going with Path.getUR

[jira] [Comment Edited] (SPARK-19715) Option to Strip Paths in FileSource

2017-02-25 Thread Steve Loughran (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19715?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15884230#comment-15884230 ] Steve Loughran edited comment on SPARK-19715 at 2/25/17 1:24 PM: --

[jira] [Created] (SPARK-19739) SparkHadoopUtil.appendS3AndSparkHadoopConfigurations to propagate full set of AWS env vars

2017-02-25 Thread Steve Loughran (JIRA)
Steve Loughran created SPARK-19739: -- Summary: SparkHadoopUtil.appendS3AndSparkHadoopConfigurations to propagate full set of AWS env vars Key: SPARK-19739 URL: https://issues.apache.org/jira/browse/SPARK-19739

[jira] [Updated] (SPARK-7481) Add spark-cloud module to pull in object store support

2017-02-27 Thread Steve Loughran (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7481?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Steve Loughran updated SPARK-7481: -- Description: To keep the s3n classpath right, to add s3a, swift & azure, the dependencies of sp

[jira] [Updated] (SPARK-7481) Add spark-cloud module to pull in object store support

2017-02-27 Thread Steve Loughran (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7481?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Steve Loughran updated SPARK-7481: -- Affects Version/s: (was: 1.3.1) 2.1.0 > Add spark-cloud module to pul

[jira] [Updated] (SPARK-7481) Add spark-hadoop-cloud module to pull in object store support

2017-02-27 Thread Steve Loughran (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7481?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Steve Loughran updated SPARK-7481: -- Summary: Add spark-hadoop-cloud module to pull in object store support (was: Add spark-cloud mo

[jira] [Commented] (SPARK-6951) History server slow startup if the event log directory is large

2017-03-01 Thread Steve Loughran (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6951?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15890040#comment-15890040 ] Steve Loughran commented on SPARK-6951: --- Having been downstream of YARN timeline ser

[jira] [Commented] (SPARK-19790) OutputCommitCoordinator should not allow another task to commit after an ExecutorFailure

2017-03-06 Thread Steve Loughran (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19790?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15897863#comment-15897863 ] Steve Loughran commented on SPARK-19790: The only time a task output committer sh

[jira] [Commented] (SPARK-19790) OutputCommitCoordinator should not allow another task to commit after an ExecutorFailure

2017-03-06 Thread Steve Loughran (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19790?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15898003#comment-15898003 ] Steve Loughran commented on SPARK-19790: Thinking some more & looking at code sni

[jira] [Created] (SPARK-19978) spark thrift server to switch to normative hadoop 2.2+ service lifecycle

2017-03-16 Thread Steve Loughran (JIRA)
Steve Loughran created SPARK-19978: -- Summary: spark thrift server to switch to normative hadoop 2.2+ service lifecycle Key: SPARK-19978 URL: https://issues.apache.org/jira/browse/SPARK-19978 Project:

[jira] [Commented] (SPARK-10109) NPE when saving Parquet To HDFS

2017-03-20 Thread Steve Loughran (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10109?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15933324#comment-15933324 ] Steve Loughran commented on SPARK-10109: I think the cause is actually that in so

[jira] [Commented] (SPARK-10109) NPE when saving Parquet To HDFS

2017-03-20 Thread Steve Loughran (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10109?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15933340#comment-15933340 ] Steve Loughran commented on SPARK-10109: This is a bit related to the execution/c

[jira] [Created] (SPARK-20038) FileFormatWriter.ExecuteWriteTask.releaseResources() implementations to be re-entrant

2017-03-20 Thread Steve Loughran (JIRA)
Steve Loughran created SPARK-20038: -- Summary: FileFormatWriter.ExecuteWriteTask.releaseResources() implementations to be re-entrant Key: SPARK-20038 URL: https://issues.apache.org/jira/browse/SPARK-20038

[jira] [Commented] (SPARK-19013) java.util.ConcurrentModificationException when using s3 path as checkpointLocation

2017-03-23 Thread Steve Loughran (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19013?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15938321#comment-15938321 ] Steve Loughran commented on SPARK-19013: One thing that could be done here would b

[jira] [Commented] (SPARK-20061) Reading a file with colon (:) from S3 fails with URISyntaxException

2017-03-27 Thread Steve Loughran (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20061?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15943008#comment-15943008 ] Steve Loughran commented on SPARK-20061: ":" is one of those "implicitly forbidde

[jira] [Resolved] (SPARK-20061) Reading a file with colon (:) from S3 fails with URISyntaxException

2017-03-27 Thread Steve Loughran (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20061?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Steve Loughran resolved SPARK-20061. Resolution: Duplicate > Reading a file with colon (:) from S3 fails with URISyntaxException

[jira] [Commented] (SPARK-10294) When Parquet writer's close method throws an exception, we will call close again and trigger a NPE

2017-03-28 Thread Steve Loughran (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10294?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15944887#comment-15944887 ] Steve Loughran commented on SPARK-10294: consider it a failure in the exception l

[jira] [Comment Edited] (SPARK-6527) sc.binaryFiles can not access files on s3

2017-04-01 Thread Steve Loughran (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6527?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15259963#comment-15259963 ] Steve Loughran edited comment on SPARK-6527 at 4/1/17 12:41 PM:

[jira] [Commented] (SPARK-6527) sc.binaryFiles can not access files on s3

2017-04-01 Thread Steve Loughran (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6527?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15952201#comment-15952201 ] Steve Loughran commented on SPARK-6527: --- Hadoop 2.8.0 is out the door, try against t

[jira] [Commented] (SPARK-20153) Support Multiple aws credentials in order to access multiple Hive on S3 table in spark application

2017-04-03 Thread Steve Loughran (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20153?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15954233#comment-15954233 ] Steve Loughran commented on SPARK-20153: This is fixed in Hadoop 2.8 with [per-bu

[jira] [Comment Edited] (SPARK-20153) Support Multiple aws credentials in order to access multiple Hive on S3 table in spark application

2017-04-03 Thread Steve Loughran (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20153?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15954233#comment-15954233 ] Steve Loughran edited comment on SPARK-20153 at 4/3/17 10:13 PM: --

[jira] [Commented] (SPARK-20153) Support Multiple aws credentials in order to access multiple Hive on S3 table in spark application

2017-04-04 Thread Steve Loughran (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20153?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15955956#comment-15955956 ] Steve Loughran commented on SPARK-20153: I'm glad we are both in agreement about

[jira] [Commented] (SPARK-20202) Remove references to org.spark-project.hive

2017-04-05 Thread Steve Loughran (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20202?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15956879#comment-15956879 ] Steve Loughran commented on SPARK-20202: # the ugliness need to inset the spark t

[jira] [Commented] (SPARK-2984) FileNotFoundException on _temporary directory

2017-04-07 Thread Steve Loughran (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2984?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15960925#comment-15960925 ] Steve Loughran commented on SPARK-2984: --- For s3a commits, HADOOP-13786 is going to b

[jira] [Commented] (SPARK-20202) Remove references to org.spark-project.hive

2017-04-10 Thread Steve Loughran (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20202?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15962903#comment-15962903 ] Steve Loughran commented on SPARK-20202: One thing I do recall as trouble here wa

[jira] [Commented] (SPARK-10109) NPE when saving Parquet To HDFS

2017-04-14 Thread Steve Loughran (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10109?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15968934#comment-15968934 ] Steve Loughran commented on SPARK-10109: SPARK-20038 should stop the failure bein

[jira] [Commented] (SPARK-20153) Support Multiple aws credentials in order to access multiple Hive on S3 table in spark application

2017-04-18 Thread Steve Loughran (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20153?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15973541#comment-15973541 ] Steve Loughran commented on SPARK-20153: [~tafra...@gmail.com] : thanks for disco

[jira] [Resolved] (SPARK-17593) list files on s3 very slow

2017-04-20 Thread Steve Loughran (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17593?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Steve Loughran resolved SPARK-17593. Resolution: Fixed closing as fixed now that Hadoop 2.8.0 is out the door. Upgrade your hado

[jira] [Created] (SPARK-20448) Document how FileInputDStream works with object storage

2017-04-24 Thread Steve Loughran (JIRA)
Steve Loughran created SPARK-20448: -- Summary: Document how FileInputDStream works with object storage Key: SPARK-20448 URL: https://issues.apache.org/jira/browse/SPARK-20448 Project: Spark I

[jira] [Commented] (SPARK-17159) Improve FileInputDStream.findNewFiles list performance

2017-04-24 Thread Steve Loughran (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17159?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15981081#comment-15981081 ] Steve Loughran commented on SPARK-17159: pulled out documentation into separate J

[jira] [Commented] (SPARK-20107) Add spark.hadoop.mapreduce.fileoutputcommitter.algorithm.version option to configuration.md

2017-04-24 Thread Steve Loughran (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20107?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15981552#comment-15981552 ] Steve Loughran commented on SPARK-20107: This does not solve the problem you thin

[jira] [Comment Edited] (SPARK-20107) Add spark.hadoop.mapreduce.fileoutputcommitter.algorithm.version option to configuration.md

2017-04-24 Thread Steve Loughran (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20107?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15981552#comment-15981552 ] Steve Loughran edited comment on SPARK-20107 at 4/24/17 5:30 PM: --

[jira] [Commented] (SPARK-7481) Add spark-hadoop-cloud module to pull in object store support

2017-04-24 Thread Steve Loughran (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7481?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15981602#comment-15981602 ] Steve Loughran commented on SPARK-7481: --- One thing I want to emphasise here is: I ha

[jira] [Commented] (SPARK-7481) Add spark-hadoop-cloud module to pull in object store support

2017-04-26 Thread Steve Loughran (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7481?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15984771#comment-15984771 ] Steve Loughran commented on SPARK-7481: --- I think we ended up going in circles on tha

[jira] [Commented] (SPARK-7481) Add spark-hadoop-cloud module to pull in object store support

2017-04-26 Thread Steve Loughran (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7481?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15985040#comment-15985040 ] Steve Loughran commented on SPARK-7481: --- (This is a fairly long comment, but it trie

[jira] [Commented] (SPARK-19582) DataFrameReader conceptually inadequate

2017-05-02 Thread Steve Loughran (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19582?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15992985#comment-15992985 ] Steve Loughran commented on SPARK-19582: All Spark is doing is taking a URL to da

[jira] [Created] (SPARK-20560) Review Spark's handling of filesystems returning "localhost" in getFileBlockLocations

2017-05-02 Thread Steve Loughran (JIRA)
Steve Loughran created SPARK-20560: -- Summary: Review Spark's handling of filesystems returning "localhost" in getFileBlockLocations Key: SPARK-20560 URL: https://issues.apache.org/jira/browse/SPARK-20560

[jira] [Commented] (SPARK-20560) Review Spark's handling of filesystems returning "localhost" in getFileBlockLocations

2017-05-02 Thread Steve Loughran (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20560?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15993008#comment-15993008 ] Steve Loughran commented on SPARK-20560: {{FileSystem.getFileBlockLocations(path)

[jira] [Resolved] (SPARK-20560) Review Spark's handling of filesystems returning "localhost" in getFileBlockLocations

2017-05-02 Thread Steve Loughran (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20560?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Steve Loughran resolved SPARK-20560. Resolution: Invalid "localhost" is filtered, been done in {{HadoopRDD.getPreferredLocations

[jira] [Commented] (SPARK-20560) Review Spark's handling of filesystems returning "localhost" in getFileBlockLocations

2017-05-02 Thread Steve Loughran (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20560?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15993323#comment-15993323 ] Steve Loughran commented on SPARK-20560: To follow this up, I've now got a test w

[jira] [Commented] (SPARK-20370) create external table on read only location fails

2017-05-02 Thread Steve Loughran (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20370?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15993405#comment-15993405 ] Steve Loughran commented on SPARK-20370: Is this the bit under the PR tagged "!!

[jira] [Commented] (SPARK-15923) Spark Application rest api returns "no such app: "

2016-07-11 Thread Steve Loughran (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15923?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15370547#comment-15370547 ] Steve Loughran commented on SPARK-15923: I do think the docs could be clarified a

[jira] [Commented] (SPARK-13514) Spark Shuffle Service 1.6.0 issue in Yarn

2016-07-12 Thread Steve Loughran (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13514?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15372728#comment-15372728 ] Steve Loughran commented on SPARK-13514: does the same file:// config setting exi

[jira] [Commented] (SPARK-7481) Add spark-cloud module to pull in aws+azure object store FS accessors; test integration

2016-07-22 Thread Steve Loughran (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7481?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15389707#comment-15389707 ] Steve Loughran commented on SPARK-7481: --- Sad but true. * The PR I've put up adds th

[jira] [Commented] (SPARK-7481) Add spark-cloud module to pull in aws+azure object store FS accessors; test integration

2016-07-22 Thread Steve Loughran (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7481?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15389730#comment-15389730 ] Steve Loughran commented on SPARK-7481: --- ps, latest s3a state # [Object stores in

[jira] [Created] (SPARK-16736) remove redundant FileSystem.exists() calls from Spark codebase

2016-07-26 Thread Steve Loughran (JIRA)
Steve Loughran created SPARK-16736: -- Summary: remove redundant FileSystem.exists() calls from Spark codebase Key: SPARK-16736 URL: https://issues.apache.org/jira/browse/SPARK-16736 Project: Spark

[jira] [Updated] (SPARK-16737) ListingFileCatalog comments about RPC calls in object store isn't correct

2016-07-26 Thread Steve Loughran (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16737?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Steve Loughran updated SPARK-16737: --- Description: The comment text which came in with SPARK-16121 says {code} - Although S3/S3A/

[jira] [Created] (SPARK-16737) ListingFileCatalog comments about RPC calls in object store isn't correct

2016-07-26 Thread Steve Loughran (JIRA)
Steve Loughran created SPARK-16737: -- Summary: ListingFileCatalog comments about RPC calls in object store isn't correct Key: SPARK-16737 URL: https://issues.apache.org/jira/browse/SPARK-16737 Project

[jira] [Updated] (SPARK-16736) remove redundant FileSystem status checks calls from Spark codebase

2016-07-26 Thread Steve Loughran (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16736?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Steve Loughran updated SPARK-16736: --- Summary: remove redundant FileSystem status checks calls from Spark codebase (was: remove re

[jira] [Commented] (SPARK-16736) remove redundant FileSystem status checks calls from Spark codebase

2016-07-26 Thread Steve Loughran (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16736?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15393770#comment-15393770 ] Steve Loughran commented on SPARK-16736: See also HIVE-14323 > remove redundant

[jira] [Commented] (SPARK-10063) Remove DirectParquetOutputCommitter

2016-08-07 Thread Steve Loughran (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10063?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15410870#comment-15410870 ] Steve Loughran commented on SPARK-10063: The solution for this is going to be s3g

[jira] [Created] (SPARK-17058) Add maven snapshots-and-staging profile to build/test against staging artifacts

2016-08-15 Thread Steve Loughran (JIRA)
Steve Loughran created SPARK-17058: -- Summary: Add maven snapshots-and-staging profile to build/test against staging artifacts Key: SPARK-17058 URL: https://issues.apache.org/jira/browse/SPARK-17058 P

[jira] [Updated] (SPARK-14387) Enable Hive-1.x ORC compatibility with spark.sql.hive.convertMetastoreOrc

2016-08-15 Thread Steve Loughran (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14387?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Steve Loughran updated SPARK-14387: --- Affects Version/s: 2.0.0 Target Version/s: 2.0.1 Component/s: SQL > Enable Hiv

[jira] [Created] (SPARK-17159) Improve FileInputDStream.findNewFiles list performance

2016-08-19 Thread Steve Loughran (JIRA)
Steve Loughran created SPARK-17159: -- Summary: Improve FileInputDStream.findNewFiles list performance Key: SPARK-17159 URL: https://issues.apache.org/jira/browse/SPARK-17159 Project: Spark Is

[jira] [Commented] (SPARK-17159) Improve FileInputDStream.findNewFiles list performance

2016-08-19 Thread Steve Loughran (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17159?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15428677#comment-15428677 ] Steve Loughran commented on SPARK-17159: # the most minimal change is to get rid

[jira] [Updated] (SPARK-17179) Consider improving partition pruning in HiveMetastoreCatalog

2016-08-22 Thread Steve Loughran (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17179?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Steve Loughran updated SPARK-17179: --- Affects Version/s: 2.0.0 Priority: Major (was: Critical) Description:

[jira] [Commented] (SPARK-3685) Spark's local dir should accept only local paths

2015-03-17 Thread Steve Loughran (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3685?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14365087#comment-14365087 ] Steve Loughran commented on SPARK-3685: --- YARN-1197 covers supporting resizing existi

[jira] [Created] (SPARK-6389) YARN app diagnostics report doesn't report NPEs

2015-03-17 Thread Steve Loughran (JIRA)
Steve Loughran created SPARK-6389: - Summary: YARN app diagnostics report doesn't report NPEs Key: SPARK-6389 URL: https://issues.apache.org/jira/browse/SPARK-6389 Project: Spark Issue Type: B

[jira] [Commented] (SPARK-3351) Yarn YarnRMClientImpl.shutdown can be called before register - NPE

2015-03-17 Thread Steve Loughran (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3351?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14365419#comment-14365419 ] Steve Loughran commented on SPARK-3351: --- Thomas: what Yarn version was this against?

[jira] [Commented] (SPARK-3351) Yarn YarnRMClientImpl.shutdown can be called before register - NPE

2015-03-17 Thread Steve Loughran (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3351?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14365547#comment-14365547 ] Steve Loughran commented on SPARK-3351: --- The NPE is certainly gone. I don't know abo

[jira] [Commented] (SPARK-1200) Make it possible to use unmanaged AM in yarn-client mode

2015-03-19 Thread Steve Loughran (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1200?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14369339#comment-14369339 ] Steve Loughran commented on SPARK-1200: --- You know, we could benefit all YARN apps if

[jira] [Created] (SPARK-6433) hive tests to import spark-sql test JAR for QueryTest access

2015-03-20 Thread Steve Loughran (JIRA)
Steve Loughran created SPARK-6433: - Summary: hive tests to import spark-sql test JAR for QueryTest access Key: SPARK-6433 URL: https://issues.apache.org/jira/browse/SPARK-6433 Project: Spark

[jira] [Commented] (SPARK-6433) hive tests to import spark-sql test JAR for QueryTest access

2015-03-20 Thread Steve Loughran (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6433?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14371280#comment-14371280 ] Steve Loughran commented on SPARK-6433: --- ..sorry, I'd missed that previous report. W

[jira] [Commented] (SPARK-6433) hive tests to import spark-sql test JAR for QueryTest access

2015-03-20 Thread Steve Loughran (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6433?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14371283#comment-14371283 ] Steve Loughran commented on SPARK-6433: --- They have diverged, the original sql one ha

[jira] [Commented] (SPARK-6433) hive tests to import spark-sql test JAR for QueryTest access

2015-03-20 Thread Steve Loughran (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6433?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14371299#comment-14371299 ] Steve Loughran commented on SPARK-6433: --- Similarly, the original sql package's {{Que

[jira] [Commented] (SPARK-6433) hive tests to import spark-sql test JAR for QueryTest access

2015-03-20 Thread Steve Loughran (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6433?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14371350#comment-14371350 ] Steve Loughran commented on SPARK-6433: --- There's one interesting question: whether o

[jira] [Commented] (SPARK-6479) Create off-heap block storage API (internal)

2015-03-25 Thread Steve Loughran (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6479?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14379817#comment-14379817 ] Steve Loughran commented on SPARK-6479: --- As part of my ongoing work to take formalit

[jira] [Commented] (SPARK-6479) Create off-heap block storage API (internal)

2015-03-27 Thread Steve Loughran (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6479?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14384520#comment-14384520 ] Steve Loughran commented on SPARK-6479: --- Henry: utterly unrelated. I was merely offe

[jira] [Commented] (SPARK-1537) Add integration with Yarn's Application Timeline Server

2015-03-29 Thread Steve Loughran (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1537?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14385890#comment-14385890 ] Steve Loughran commented on SPARK-1537: --- # I've just tried to see where YARN-2444 st

[jira] [Commented] (SPARK-6568) spark-shell.cmd --jars option does not accept the jar that has space in its path

2015-03-30 Thread Steve Loughran (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6568?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14386552#comment-14386552 ] Steve Loughran commented on SPARK-6568: --- Can you show the full stack trace? > spark

[jira] [Commented] (SPARK-2348) In Windows having a enviorinment variable named 'classpath' gives error

2015-03-30 Thread Steve Loughran (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14386561#comment-14386561 ] Steve Loughran commented on SPARK-2348: --- Supporting {{%CLASSPATH%}} is dangerous, as

[jira] [Commented] (SPARK-799) Windows versions of the deploy scripts

2015-03-30 Thread Steve Loughran (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-799?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14386571#comment-14386571 ] Steve Loughran commented on SPARK-799: -- Proving python versions of the launcher script

[jira] [Commented] (SPARK-2356) Exception: Could not locate executable null\bin\winutils.exe in the Hadoop

2015-03-30 Thread Steve Loughran (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2356?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14386585#comment-14386585 ] Steve Loughran commented on SPARK-2356: --- It's coming from {{ UserGroupInformation.s

[jira] [Commented] (SPARK-6646) Spark 2.0: Rearchitecting Spark for Mobile Platforms

2015-04-01 Thread Steve Loughran (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6646?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14390373#comment-14390373 ] Steve Loughran commented on SPARK-6646: --- Obviously the barrier will be data source a
