[jira] [Commented] (FLINK-10218) Allow writing DataSet without explicit path parameter

2018-08-27 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-10218?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16593478#comment-16593478
 ] 

ASF GitHub Bot commented on FLINK-10218:


link3280 closed pull request #6616: [FLINK-10218] Allow writing DataSet without 
explicit path parameter
URL: https://github.com/apache/flink/pull/6616
 
 
   

This is a PR from a forked repository.
As GitHub hides the original diff once such a PR is closed, it is displayed
below for the sake of provenance:

diff --git a/flink-java/src/main/java/org/apache/flink/api/java/DataSet.java b/flink-java/src/main/java/org/apache/flink/api/java/DataSet.java
index 3dd4f6a8216..976a3c65f8c 100644
--- a/flink-java/src/main/java/org/apache/flink/api/java/DataSet.java
+++ b/flink-java/src/main/java/org/apache/flink/api/java/DataSet.java
@@ -1727,6 +1727,21 @@ public void printToErr() throws Exception {
 		return output(new PrintingOutputFormat(sinkIdentifier, true));
 	}
 
+	/**
+	 * Writes a DataSet using a {@link FileOutputFormat} to a specified location.
+	 * This method adds a data sink to the program.
+	 *
+	 * @param outputFormat The FileOutputFormat to write the DataSet.
+	 * @return The DataSink that writes the DataSet.
+	 *
+	 * @see FileOutputFormat
+	 */
+	public DataSink write(FileOutputFormat outputFormat) {
+		Preconditions.checkNotNull(outputFormat, "Output format must not be null.");
+		Preconditions.checkNotNull(outputFormat.getOutputFilePath(), "File path must not be null.");
+		return output(outputFormat);
+	}
+
 	/**
 	 * Writes a DataSet using a {@link FileOutputFormat} to a specified location.
 	 * This method adds a data sink to the program.


 


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Allow writing DataSet without explicit path parameter
> -----------------------------------------------------
>
>                 Key: FLINK-10218
>                 URL: https://issues.apache.org/jira/browse/FLINK-10218
>             Project: Flink
>          Issue Type: Improvement
>          Components: DataSet API
>    Affects Versions: 1.6.0
>            Reporter: Paul Lin
>            Priority: Minor
>              Labels: pull-request-available
>
> Currently, the DataSet API has two overloaded `write` methods that take a 
> FileOutputFormat as the output format. Both require a path parameter, even 
> though the output path could already be set in the FileOutputFormat object. 
> What's more, most subclasses of FileOutputFormat have no default constructor 
> and require a path parameter too, so users have to set the output path twice 
> in the code, like:
> {code:java}
>   String output = "hdfs:///tmp/";
>   dataset.write(new TextOutputFormat<>(new Path(output)), output);
> {code}
> So I propose adding another write helper method that requires no path 
> parameter. Could someone assign this issue to me?



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (FLINK-10218) Allow writing DataSet without explicit path parameter

2018-08-27 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-10218?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16593477#comment-16593477
 ] 

ASF GitHub Bot commented on FLINK-10218:


link3280 commented on issue #6616: [FLINK-10218] Allow writing DataSet without 
explicit path parameter
URL: https://github.com/apache/flink/pull/6616#issuecomment-416189573
 
 
   @yanghua @zentol OK, I will close this PR. Thanks for your time.




[jira] [Commented] (FLINK-10218) Allow writing DataSet without explicit path parameter

2018-08-27 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-10218?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16593285#comment-16593285
 ] 

ASF GitHub Bot commented on FLINK-10218:


zentol commented on a change in pull request #6616: [FLINK-10218] Allow writing 
DataSet without explicit path parameter
URL: https://github.com/apache/flink/pull/6616#discussion_r212897718
 
 

 ##
 File path: flink-java/src/main/java/org/apache/flink/api/java/DataSet.java
 ##
 @@ -1727,6 +1727,21 @@ public void printToErr() throws Exception {
 		return output(new PrintingOutputFormat(sinkIdentifier, true));
 	}
 
+	/**
+	 * Writes a DataSet using a {@link FileOutputFormat} to a specified location.
+	 * This method adds a data sink to the program.
+	 *
+	 * @param outputFormat The FileOutputFormat to write the DataSet.
+	 * @return The DataSink that writes the DataSet.
+	 *
+	 * @see FileOutputFormat
+	 */
+	public DataSink write(FileOutputFormat outputFormat) {
+		Preconditions.checkNotNull(outputFormat, "Output format must not be null.");
+		Preconditions.checkNotNull(outputFormat.getOutputFilePath(), "File path must not be null.");
+		return output(outputFormat);
 
 Review comment:
   The `output(outputFormat)` call right here is already a viable alternative 
for users, hence I would reject this PR.
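
To make the point of the review comment concrete, here is a minimal, self-contained sketch of the three call styles under discussion. `SketchFormat` and `SketchDataSet` are hypothetical stand-ins written for this illustration, not Flink's `FileOutputFormat` or `DataSet`, and the path-overriding behaviour of the two-argument `write` is an assumption of the sketch. It only shows that the proposed overload amounts to two null checks in front of the existing `output` call.

```java
import java.util.Objects;

/**
 * Sketch only: SketchFormat and SketchDataSet are hypothetical stand-ins,
 * not Flink's FileOutputFormat or DataSet.
 */
public class WriteWithoutPathSketch {

    /** Stand-in for an output format that carries its own output path. */
    static final class SketchFormat {
        private String outputFilePath;

        SketchFormat(String outputFilePath) {
            this.outputFilePath = outputFilePath;
        }

        String getOutputFilePath() {
            return outputFilePath;
        }

        void setOutputFilePath(String outputFilePath) {
            this.outputFilePath = outputFilePath;
        }
    }

    /** Stand-in for a data set; each method returns the path the sink would use. */
    static final class SketchDataSet {

        /** Generic sink registration: uses whatever path the format already carries. */
        String output(SketchFormat format) {
            Objects.requireNonNull(format, "Output format must not be null.");
            return format.getOutputFilePath();
        }

        /** Existing style: the path is passed a second time (assumed to override the format's path). */
        String write(SketchFormat format, String filePath) {
            Objects.requireNonNull(format, "Output format must not be null.");
            Objects.requireNonNull(filePath, "File path must not be null.");
            format.setOutputFilePath(filePath);
            return output(format);
        }

        /** Proposed overload from the diff: two null checks, then delegate to output. */
        String write(SketchFormat format) {
            Objects.requireNonNull(format, "Output format must not be null.");
            Objects.requireNonNull(format.getOutputFilePath(), "File path must not be null.");
            return output(format);
        }
    }

    public static void main(String[] args) {
        SketchDataSet dataset = new SketchDataSet();
        String path = "hdfs:///tmp/";

        // Path given twice, as in the issue description.
        System.out.println(dataset.write(new SketchFormat(path), path));
        // Proposed overload: path given once.
        System.out.println(dataset.write(new SketchFormat(path)));
        // Reviewer's alternative: call output directly, path given once.
        System.out.println(dataset.output(new SketchFormat(path)));
    }
}
```

In this sketch the only thing the one-argument `write` adds over `output` is an early failure when the format has no path set, which is the trade-off the review comment weighs against adding another public method.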




[jira] [Commented] (FLINK-10218) Allow writing DataSet without explicit path parameter

2018-08-27 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-10218?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16593225#comment-16593225
 ] 

ASF GitHub Bot commented on FLINK-10218:


yanghua commented on issue #6616: [FLINK-10218] Allow writing DataSet without 
explicit path parameter
URL: https://github.com/apache/flink/pull/6616#issuecomment-416125382
 
 
   @link3280 Thanks for your contribution. It would be a good idea to add some 
tests for this PR.




[jira] [Commented] (FLINK-10218) Allow writing DataSet without explicit path parameter

2018-08-26 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-10218?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16593154#comment-16593154
 ] 

ASF GitHub Bot commented on FLINK-10218:


link3280 opened a new pull request #6616: [FLINK-10218] Allow writing DataSet 
without explicit path parameter
URL: https://github.com/apache/flink/pull/6616
 
 
   ## What is the purpose of the change
   
   Add a file output helper method to the DataSet API that requires only a 
FileOutputFormat parameter. This avoids setting the path twice, since the 
output path can already be found in the FileOutputFormat.
   
   ## Brief change log
   
 - *Add a file output helper method to the DataSet API that requires only a 
FileOutputFormat parameter.*
   
   ## Verifying this change
   
   This change is a trivial rework / code cleanup without any test coverage.
   
   ## Does this pull request potentially affect one of the following parts:
   
 - Dependencies (does it add or upgrade a dependency): (no)
 - The public API, i.e., is any changed class annotated with 
`@Public(Evolving)`: (yes)
 - The serializers: (no)
 - The runtime per-record code paths (performance sensitive): (no)
 - Anything that affects deployment or recovery: JobManager (and its 
components), Checkpointing, Yarn/Mesos, ZooKeeper: (no)
 - The S3 file system connector: (no)
   
   ## Documentation
   
 - Does this pull request introduce a new feature? (no)
 - If yes, how is the feature documented? (not applicable)
   




[jira] [Commented] (FLINK-10218) Allow writing DataSet without explicit path parameter

2018-08-26 Thread vinoyang (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-10218?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16593135#comment-16593135
 ] 

vinoyang commented on FLINK-10218:
--

[~Paul Lin] You may not have contributor permissions yet. Pinging 
[~till.rohrmann] and [~Zentol].
