[jira] [Commented] (DRILL-7261) Simplify Easy format config for new scan framework

2019-06-07 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-7261?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16858473#comment-16858473
 ] 

ASF GitHub Bot commented on DRILL-7261:
---

asfgit commented on pull request #1796: DRILL-7261: Simplify Easy framework config
URL: https://github.com/apache/drill/pull/1796
 
 
   
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Simplify Easy format config for new scan framework
> --
>
> Key: DRILL-7261
> URL: https://issues.apache.org/jira/browse/DRILL-7261
> Project: Apache Drill
> Issue Type: Improvement
> Affects Versions: 1.16.0
> Reporter: Paul Rogers
> Assignee: Paul Rogers
> Priority: Major
> Labels: ready-to-commit
> Fix For: 1.17.0
>
>
> Rollup of related CSV V3 fixes along with supporting row set framework fixes.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (DRILL-7261) Simplify Easy format config for new scan framework

2019-06-06 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-7261?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16858230#comment-16858230
 ] 

ASF GitHub Bot commented on DRILL-7261:
---

paul-rogers commented on issue #1796: DRILL-7261: Simplify Easy framework config
URL: https://github.com/apache/drill/pull/1796#issuecomment-499724041
 
 
   Squashed commits
 



[jira] [Commented] (DRILL-7261) Simplify Easy format config for new scan framework

2019-06-06 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-7261?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16857386#comment-16857386
 ] 

ASF GitHub Bot commented on DRILL-7261:
---

arina-ielchiieva commented on issue #1796: DRILL-7261: Simplify Easy framework config
URL: https://github.com/apache/drill/pull/1796#issuecomment-499388249
 
 
   +1, please squash the commits.
 



[jira] [Commented] (DRILL-7261) Simplify Easy format config for new scan framework

2019-06-03 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-7261?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16854294#comment-16854294
 ] 

ASF GitHub Bot commented on DRILL-7261:
---

arina-ielchiieva commented on pull request #1796: DRILL-7261: Simplify Easy framework config
URL: https://github.com/apache/drill/pull/1796#discussion_r289711757
 
 

 ##
 File path: exec/java-exec/src/main/java/org/apache/drill/exec/store/easy/text/TextFormatPlugin.java
 ##
 @@ -336,6 +268,53 @@ public RecordReader getRecordReader(FragmentContext context,
 }
   }
 
+  @Override
+  protected FileScanBuilder frameworkBuilder(
+      OptionManager options, EasySubScan scan) throws ExecutionSetupException {
+    ColumnsScanBuilder builder = new ColumnsScanBuilder();
+    builder.setReaderFactory(new ColumnsReaderFactory(this));
+
+    // If this format has no headers, or wants to skip them,
+    // then we must use the columns column to hold the data.
+
+    builder.requireColumnsArray(
+        ! getConfig().isHeaderExtractionEnabled());
+
+    // Text files handle nulls in an unusual way. Missing columns
+    // are set to required Varchar and filled with blanks. Yes, this
+    // means that the SQL statement or code cannot differentiate missing
+    // columns from empty columns, but that is how CSV and other text
+    // files have been defined within Drill.
+
+    builder.setNullType(Types.required(MinorType.VARCHAR));
+
+    // CSV maps blank columns to nulls (for nullable non-string columns),
+    // or to the default value (for non-nullable non-string columns.)
+
+    builder.setConversionProperty(AbstractConvertFromString.BLANK_ACTION_PROP,
+        AbstractConvertFromString.BLANK_AS_NULL);
+
+    // The text readers use required Varchar columns to represent null columns.
+
+    builder.allowRequiredNullColumns(true);
+
+    // Provide custom error context
+    builder.setContext(
+        new CustomErrorContext() {
+          @Override
+          public void addContext(UserException.Builder builder) {
+            builder.addContext("Format plugin:", PLUGIN_NAME);
+            builder.addContext("Plugin config name:", getName());
+            builder.addContext("Extract headers:",
+                Boolean.toString(getConfig().isHeaderExtractionEnabled()));
+            builder.addContext("Skip headers:",
 
 Review comment:
   Skip lines?
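
The comments in the quoted diff describe how text readers treat missing and blank columns. A minimal sketch of those rules follows, for illustration only. It is not Drill code: the class and method names are hypothetical, and the default value of 0 for a required INT is an assumption rather than something stated in the diff.

```java
import java.util.Arrays;

// Hypothetical sketch of the blank-handling rules described in the diff
// comments. None of these names come from Drill.
final class BlankHandlingSketch {

  // Missing text columns are filled with empty (non-null) strings, so a
  // missing column cannot be told apart from an empty one.
  static String[] projectText(String[] row, int width) {
    String[] out = new String[width];
    Arrays.fill(out, "");
    System.arraycopy(row, 0, out, 0, Math.min(row.length, width));
    return out;
  }

  // A blank field converted to a nullable non-string column becomes null.
  static Integer toNullableInt(String field) {
    String trimmed = field.trim();
    return trimmed.isEmpty() ? null : Integer.valueOf(trimmed);
  }

  // A blank field converted to a non-nullable non-string column becomes
  // the default value (assumed here to be 0).
  static int toRequiredInt(String field) {
    String trimmed = field.trim();
    return trimmed.isEmpty() ? 0 : Integer.parseInt(trimmed);
  }

  public static void main(String[] args) {
    System.out.println(Arrays.toString(projectText(new String[] {"a"}, 3)));
    System.out.println(toNullableInt(" "));
    System.out.println(toRequiredInt(""));
  }
}
```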
 



[jira] [Commented] (DRILL-7261) Simplify Easy format config for new scan framework

2019-06-03 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-7261?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16854293#comment-16854293
 ] 

ASF GitHub Bot commented on DRILL-7261:
---

arina-ielchiieva commented on pull request #1796: DRILL-7261: Simplify Easy framework config
URL: https://github.com/apache/drill/pull/1796#discussion_r289711815
 
 

 ##
 File path: exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/scan/file/FileScanFramework.java
 ##
 @@ -89,6 +89,7 @@
  * @return Hadoop file split object with the file path, block
  * offset, and length.
  */
+
 
 Review comment:
   Nit: remove new line.
 



[jira] [Commented] (DRILL-7261) Simplify Easy format config for new scan framework

2019-05-25 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-7261?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16848328#comment-16848328
 ] 

ASF GitHub Bot commented on DRILL-7261:
---

paul-rogers commented on pull request #1796: DRILL-7261: Simplify Easy framework config for new scan
URL: https://github.com/apache/drill/pull/1796
 
 
   Most format plugins are created using the Easy format plugin. A recent
   change added support for the "row set" scan framework. After converting
   the text and log reader plugins, it became clear that the setup code
   could be made simpler.
   
   * Add the user name to the "file scan" framework.
   * Pass the file system, split and user name to the batch reader via
   the "schema negotiator" rather than via the constructor.
   * Create the traditional "scan batch" scan or the new row-set scan via
   functions instead of classes.
   * Add an Easy config option and method to choose the kind of scan
   framework.
   * Add Easy config options for newer features, such as whether the
   plugin supports statistics.
   
   Tested by running all unit tests for the CSV reader, which is based on
   the new framework, and by testing the converted log reader (that reader
   is not part of this commit).
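
The idea of choosing the scan framework through a config option and building the scan via a function instead of a class can be sketched roughly as follows. This is an illustration under stated assumptions, not the code in this PR: `EasyScanSketch`, `ScanFramework`, `EasyConfig`, `ReaderInputs` and `scanFactory` are all hypothetical names, and the "scan" is reduced to a descriptive string.

```java
import java.util.function.Function;

// Hypothetical sketch: none of these names are Drill's actual classes or options.
final class EasyScanSketch {

  // Assumed config switch for the kind of scan framework a plugin wants.
  enum ScanFramework { SCAN_BATCH, ROW_SET }

  // Assumed stand-in for the Easy plugin's config options.
  static final class EasyConfig {
    ScanFramework framework = ScanFramework.SCAN_BATCH;
    boolean supportsStatistics = false;
  }

  // Assumed stand-in for what a batch reader would obtain from the
  // schema negotiator instead of from its constructor.
  static final class ReaderInputs {
    final String fileSystem;
    final String split;
    final String userName;

    ReaderInputs(String fileSystem, String split, String userName) {
      this.fileSystem = fileSystem;
      this.split = split;
      this.userName = userName;
    }
  }

  static String scanBatchScan(ReaderInputs in) {
    return "scan-batch scan of " + in.split + " as " + in.userName;
  }

  static String rowSetScan(ReaderInputs in) {
    return "row-set scan of " + in.split + " as " + in.userName;
  }

  // The config picks a function; no per-framework creator class is needed.
  static Function<ReaderInputs, String> scanFactory(EasyConfig config) {
    return config.framework == ScanFramework.ROW_SET
        ? EasyScanSketch::rowSetScan
        : EasyScanSketch::scanBatchScan;
  }

  public static void main(String[] args) {
    EasyConfig config = new EasyConfig();
    config.framework = ScanFramework.ROW_SET;
    ReaderInputs inputs = new ReaderInputs("file:///", "/data/example.csv", "alice");
    System.out.println(scanFactory(config).apply(inputs));
  }
}
```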
 
