[jira] [Commented] (SPARK-18294) Implement commit protocol to support `mapred` package's committer
[ https://issues.apache.org/jira/browse/SPARK-18294?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16292595#comment-16292595 ]

Steve Loughran commented on SPARK-18294:

Following up on this, one question: why support the older mapred protocol? The standard implementation in Hadoop just relays to the newer one, so supporting both only complicates everyone's life: there are two test paths, two APIs to document, and a risk of different failure modes. The v1 API isn't being actively developed, and really it's time to move off it.

> Implement commit protocol to support `mapred` package's committer
>
>                 Key: SPARK-18294
>                 URL: https://issues.apache.org/jira/browse/SPARK-18294
>             Project: Spark
>          Issue Type: Sub-task
>          Components: Spark Core
>            Reporter: Jiang Xingbo
>            Assignee: Jiang Xingbo
>             Fix For: 2.3.0
>
> The current `FileCommitProtocol` is based on the `mapreduce` package; we should
> implement a `HadoopMapRedCommitProtocol` that supports the older `mapred`
> package's committer.

--
This message was sent by Atlassian JIRA (v6.4.14#64029)

To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-18294) Implement commit protocol to support `mapred` package's committer
[ https://issues.apache.org/jira/browse/SPARK-18294?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16065023#comment-16065023 ]

Apache Spark commented on SPARK-18294:

User 'jiangxb1987' has created a pull request for this issue: https://github.com/apache/spark/pull/18438
[jira] [Commented] (SPARK-18294) Implement commit protocol to support `mapred` package's committer
[ https://issues.apache.org/jira/browse/SPARK-18294?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16049816#comment-16049816 ]

Dayou Zhou commented on SPARK-18294:

Hi [~jiangxb1987], does this answer your question? Any help is appreciated, thanks.
[jira] [Commented] (SPARK-18294) Implement commit protocol to support `mapred` package's committer
[ https://issues.apache.org/jira/browse/SPARK-18294?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16048164#comment-16048164 ]

Aarati Khobare commented on SPARK-18294:

Hi Jiang,

I am new to Spark and Hive, so please let me know if I am missing anything. We are running an insert command on a Hive table (created from a storage handler with a custom input/output format) through the Spark 2 shell. The executors start and do their work, but the driver keeps waiting. This is implemented using the older mapred API. Most likely the problem is with the committer: the custom output format class does not have a getOutputCommitter() method. I looked at the code in SparkHadoopMapReduceWriter.scala, and it seems it does not take the committer class from the JobConf. Also, the output format's checkOutputSpecs is not called, even if the spark.hadoop.validateOutputSpecs property is set to true.

From the JIRA and the code, it seems that Spark 2 does not support the mapred API and supports only the mapreduce API. Please let us know if we are missing anything. Thanks.
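For readers following the thread, the difference described in the comment above can be sketched roughly as follows. In Hadoop's older `mapred` API the output format has no committer method and the committer is a job-level setting (`JobConf.getOutputCommitter()`), while in the newer `mapreduce` API the output format supplies it (`OutputFormat.getOutputCommitter(TaskAttemptContext)`). The types below are simplified stand-ins written only for illustration; every class and method name in the block is hypothetical, and none of it is the Hadoop or Spark code.

```java
import java.util.HashMap;
import java.util.Map;

class CommitterLookupSketch {

    /** Hypothetical stand-in for an output committer; only its name matters here. */
    interface Committer {
        String name();
    }

    /** Hypothetical stand-in for an old-style job configuration: the committer is a job-level setting. */
    static class OldStyleConf {
        private final Map<String, Committer> settings = new HashMap<>();

        void setCommitter(Committer committer) {
            settings.put("committer", committer);
        }

        Committer getCommitter() {
            // Fall back to a default committer when none was configured.
            Committer fallback = () -> "default-file-committer";
            return settings.getOrDefault("committer", fallback);
        }
    }

    /** Old-style output format stand-in: it validates output specs but has no committer method. */
    interface OldStyleOutputFormat {
        void checkOutputSpecs(OldStyleConf conf);
    }

    /** New-style output format stand-in: the format itself supplies the committer. */
    interface NewStyleOutputFormat {
        Committer getCommitter();
    }

    /** Old-style lookup: ask the job configuration. */
    static Committer resolveOldStyle(OldStyleConf conf) {
        return conf.getCommitter();
    }

    /** New-style lookup: ask the output format. */
    static Committer resolveNewStyle(NewStyleOutputFormat format) {
        return format.getCommitter();
    }

    public static void main(String[] args) {
        OldStyleConf conf = new OldStyleConf();
        conf.setCommitter(() -> "custom-committer");
        System.out.println(resolveOldStyle(conf).name());

        NewStyleOutputFormat format = () -> () -> "format-committer";
        System.out.println(resolveNewStyle(format).name());
    }
}
```

A writer that performs only the new-style lookup never consults the job configuration for a committer, which matches the behaviour reported above. Whether that is the actual cause of the waiting driver is not established in this thread.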
[jira] [Commented] (SPARK-18294) Implement commit protocol to support `mapred` package's committer
[ https://issues.apache.org/jira/browse/SPARK-18294?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16047233#comment-16047233 ]

Dayou Zhou commented on SPARK-18294:

Thanks for responding. My colleague Aarati Khobare will provide you with the details.
[jira] [Commented] (SPARK-18294) Implement commit protocol to support `mapred` package's committer
[ https://issues.apache.org/jira/browse/SPARK-18294?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16047229#comment-16047229 ]

Jiang Xingbo commented on SPARK-18294:

This is actually a refactoring of legacy code; it shouldn't affect common use cases because the old code is still valid. Could you expand on why you need this?
[jira] [Commented] (SPARK-18294) Implement commit protocol to support `mapred` package's committer
[ https://issues.apache.org/jira/browse/SPARK-18294?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16047196#comment-16047196 ]

Dayou Zhou commented on SPARK-18294:

Hi [~jiangxb1987] [~jiangxb], thank you for making this fix; we really need it. Could you share the status of this and when you expect to merge it into master?
[jira] [Commented] (SPARK-18294) Implement commit protocol to support `mapred` package's committer
[ https://issues.apache.org/jira/browse/SPARK-18294?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15659269#comment-15659269 ]

Apache Spark commented on SPARK-18294:

User 'jiangxb1987' has created a pull request for this issue: https://github.com/apache/spark/pull/15861