[ https://issues.apache.org/jira/browse/ORC-233?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16136380#comment-16136380 ]
ASF GitHub Bot commented on ORC-233:
------------------------------------
Github user ajayyadava commented on a diff in the pull request:
https://github.com/apache/orc/pull/160#discussion_r134393721
--- Diff: java/mapreduce/src/java/org/apache/orc/mapred/OrcInputFormat.java ---
@@ -58,7 +58,7 @@
*/
public static boolean[] parseInclude(TypeDescription schema,
String columnsStr) {
- if (columnsStr == null ||
+ if (StringUtils.isBlank(columnsStr) ||
--- End diff --
Hi @dongjoon-hyun,
Thank you for reviewing the patch and pointing me in the right direction. I
had earlier fixed only the exception. Based on your comment, some reading
through the code, and the Spark test that you linked in the JIRA, my
understanding is that the desired behavior is to not select/project any columns
when `orc.include.columns` is a blank string. I have added a test case for
what I understand to be the desired behavior.
I have left the behavior for the null (default) value as before, which
seems a bit inconsistent (null and blank should not behave differently), but
I am not sure why it was changed or whether it's OK to modify that behavior.
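To make the null versus blank distinction concrete, here is a minimal sketch of one possible reading of it. It is not the actual `parseInclude` from the patch: the class name `IncludeSketch` is hypothetical, and the `TypeDescription` schema is replaced by a plain column count under the assumption of a struct with only primitive fields (type id 0 is the root, field i has type id i + 1).

```java
import java.util.Arrays;

class IncludeSketch {
  // Hypothetical simplification: the schema is a struct with `numColumns`
  // primitive fields, so type id 0 is the root and field i has type id i + 1.
  static boolean[] parseInclude(int numColumns, String columnsStr) {
    if (columnsStr == null) {
      return null;  // unset: keep the existing default (no include mask)
    }
    boolean[] result = new boolean[numColumns + 1];
    result[0] = true;  // assumption: the root struct is always included
    if (columnsStr.trim().isEmpty()) {
      return result;  // blank: project no columns
    }
    for (String idString : columnsStr.split(",")) {
      int field = Integer.parseInt(idString.trim());
      result[field + 1] = true;
    }
    return result;
  }

  public static void main(String[] args) {
    System.out.println(Arrays.toString(parseInclude(3, null)));
    System.out.println(Arrays.toString(parseInclude(3, "")));
    System.out.println(Arrays.toString(parseInclude(3, "0,2")));
  }
}
```

In this sketch a null value yields no mask at all, while a blank value yields a mask that selects only the root, which is where the two cases diverge.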
I am completely new to ORC and this is my first patch, so my understanding
might be completely off. Please feel free to point me in the right direction.
Thanks!
> Allow `orc.include.columns` to be empty
> ---------------------------------------
>
> Key: ORC-233
> URL: https://issues.apache.org/jira/browse/ORC-233
> Project: ORC
> Issue Type: Bug
> Components: Java
> Affects Versions: 1.4.0
> Reporter: Dongjoon Hyun
>
> Apache ORC should support returning all NULLs when configured as follows.
> {code}
> conf.set(OrcConf.INCLUDE_COLUMNS.getAttribute, "")
> {code}
> Currently, it raises the following exceptions.
> {code}
> For input string: ""
> java.lang.NumberFormatException: For input string: ""
> at java.lang.NumberFormatException.forInputString(NumberFormatException.java:65)
> at java.lang.Integer.parseInt(Integer.java:592)
> at java.lang.Integer.parseInt(Integer.java:615)
> at org.apache.orc.mapred.OrcInputFormat.parseInclude(OrcInputFormat.java:69)
> {code}
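The exception in the trace above can be reproduced without ORC. The following is a small sketch, assuming the column list is split on commas and each piece is passed to `Integer.parseInt`, as the trace suggests; the class and method names are hypothetical.

```java
class BlankIncludeRepro {
  // Returns true when parsing the comma-separated ids throws
  // NumberFormatException, as in the stack trace above.
  static boolean failsToParse(String columnsStr) {
    try {
      // "".split(",") yields a single empty string, not an empty array,
      // so the loop body still runs once for a blank value.
      for (String idString : columnsStr.split(",")) {
        Integer.parseInt(idString);
      }
      return false;
    } catch (NumberFormatException e) {
      return true;
    }
  }

  public static void main(String[] args) {
    System.out.println(failsToParse(""));     // blank value
    System.out.println(failsToParse("0,1"));  // well-formed list
  }
}
```

This is why a blank value has to be checked before the split-and-parse loop rather than relying on the loop to be skipped.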