[jira] [Resolved] (ARROW-805) listing empty HDFS directory returns an error instead of returning empty list
[ https://issues.apache.org/jira/browse/ARROW-805?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Wes McKinney resolved ARROW-805.
Resolution: Fixed

Issue resolved by pull request 528
[https://github.com/apache/arrow/pull/528]

> listing empty HDFS directory returns an error instead of returning empty list
> -----------------------------------------------------------------------------
>
> Key: ARROW-805
> URL: https://issues.apache.org/jira/browse/ARROW-805
> Project: Apache Arrow
> Issue Type: Bug
> Affects Versions: 0.2.0, 0.3.0
> Reporter: Leif Walsh
> Assignee: Leif Walsh
> Fix For: 0.3.0
>
> https://github.com/apache/arrow/blob/master/cpp/src/arrow/io/hdfs.cc#L409-L410
> {code}
>   if (entries == nullptr) {
>     // If the directory is empty, entries is NULL but errno is 0. Non-zero
>     // errno indicates error
>     //
>     // Note: errno is thread-local
>     if (errno == 0) { num_entries = 0; }
>     { return Status::IOError("HDFS: list directory failed"); }
>   }
> {code}
> I think that should have an else:
> {code}
>   if (entries == nullptr) {
>     // If the directory is empty, entries is NULL but errno is 0. Non-zero
>     // errno indicates error
>     //
>     // Note: errno is thread-local
>     if (errno == 0) {
>       num_entries = 0;
>     } else {
>       return Status::IOError("HDFS: list directory failed");
>     }
>   }
> {code}

--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
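The missing `else` described in ARROW-805 can be reproduced in isolation. The following is a minimal sketch under stated assumptions, not the Arrow implementation: `Status` is a hypothetical stand-in for `arrow::Status`, `ListBuggy` and `ListFixed` are illustrative names, and the `entries` and `err` parameters simulate what the HDFS listing call and `errno` would report.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for arrow::Status, reduced to what this sketch needs.
struct Status {
  bool ok_flag;
  std::string message;
  static Status OK() { return Status{true, ""}; }
  static Status IOError(const std::string& msg) { return Status{false, msg}; }
  bool ok() const { return ok_flag; }
};

// Buggy shape from the issue: the second braced block is a plain compound
// statement, not an else branch, so it runs whenever entries is null,
// even when err == 0 (the empty-directory case).
Status ListBuggy(const void* entries, int err, int* num_entries) {
  assert(num_entries != nullptr);
  if (entries == nullptr) {
    if (err == 0) { *num_entries = 0; }
    { return Status::IOError("HDFS: list directory failed"); }
  }
  return Status::OK();
}

// Proposed shape: the error is returned only when err is non-zero.
Status ListFixed(const void* entries, int err, int* num_entries) {
  assert(num_entries != nullptr);
  if (entries == nullptr) {
    if (err == 0) {
      *num_entries = 0;
    } else {
      return Status::IOError("HDFS: list directory failed");
    }
  }
  return Status::OK();
}
```

With a null `entries` and `err == 0`, `ListBuggy` sets `*num_entries` to 0 but still returns an IOError, whereas `ListFixed` returns OK; both return an IOError when `err` is non-zero.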
[jira] [Commented] (ARROW-539) [Python] Support reading Parquet datasets with standard partition directory schemes
[ https://issues.apache.org/jira/browse/ARROW-539?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15965070#comment-15965070 ] Wes McKinney commented on ARROW-539: PR: https://github.com/apache/arrow/pull/529 > [Python] Support reading Parquet datasets with standard partition directory > schemes > --- > > Key: ARROW-539 > URL: https://issues.apache.org/jira/browse/ARROW-539 > Project: Apache Arrow > Issue Type: New Feature > Components: Python >Reporter: Wes McKinney >Assignee: Wes McKinney > Fix For: 0.3.0 > > Attachments: partitioned_parquet.tar.gz > > > Currently, we only support multi-file directories with a flat structure > (non-partitioned). -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (ARROW-805) listing empty HDFS directory returns an error instead of returning empty list
[ https://issues.apache.org/jira/browse/ARROW-805?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15965047#comment-15965047 ]

Leif Walsh commented on ARROW-805:
----------------------------------

OK then, let's just focus on merging this.

> listing empty HDFS directory returns an error instead of returning empty list
> -----------------------------------------------------------------------------
>
> Key: ARROW-805
> URL: https://issues.apache.org/jira/browse/ARROW-805
> Project: Apache Arrow
> Issue Type: Bug
> Affects Versions: 0.2.0, 0.3.0
> Reporter: Leif Walsh
> Assignee: Leif Walsh
> Fix For: 0.3.0
>
> https://github.com/apache/arrow/blob/master/cpp/src/arrow/io/hdfs.cc#L409-L410
> {code}
>   if (entries == nullptr) {
>     // If the directory is empty, entries is NULL but errno is 0. Non-zero
>     // errno indicates error
>     //
>     // Note: errno is thread-local
>     if (errno == 0) { num_entries = 0; }
>     { return Status::IOError("HDFS: list directory failed"); }
>   }
> {code}
> I think that should have an else:
> {code}
>   if (entries == nullptr) {
>     // If the directory is empty, entries is NULL but errno is 0. Non-zero
>     // errno indicates error
>     //
>     // Note: errno is thread-local
>     if (errno == 0) {
>       num_entries = 0;
>     } else {
>       return Status::IOError("HDFS: list directory failed");
>     }
>   }
> {code}

--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
[jira] [Commented] (ARROW-805) listing empty HDFS directory returns an error instead of returning empty list
[ https://issues.apache.org/jira/browse/ARROW-805?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15965039#comment-15965039 ]

Wes McKinney commented on ARROW-805:

I don't think we'll be able to run this CI in the main repo unless we can get CircleCI. We should set up an arrow-hadoop repository someplace (I can volunteer wesm/arrow-hadoop) where we have more admin control.

> listing empty HDFS directory returns an error instead of returning empty list
> -----------------------------------------------------------------------------
>
> Key: ARROW-805
> URL: https://issues.apache.org/jira/browse/ARROW-805
> Project: Apache Arrow
> Issue Type: Bug
> Affects Versions: 0.2.0, 0.3.0
> Reporter: Leif Walsh
> Assignee: Leif Walsh
> Fix For: 0.3.0
>
> https://github.com/apache/arrow/blob/master/cpp/src/arrow/io/hdfs.cc#L409-L410
> {code}
>   if (entries == nullptr) {
>     // If the directory is empty, entries is NULL but errno is 0. Non-zero
>     // errno indicates error
>     //
>     // Note: errno is thread-local
>     if (errno == 0) { num_entries = 0; }
>     { return Status::IOError("HDFS: list directory failed"); }
>   }
> {code}
> I think that should have an else:
> {code}
>   if (entries == nullptr) {
>     // If the directory is empty, entries is NULL but errno is 0. Non-zero
>     // errno indicates error
>     //
>     // Note: errno is thread-local
>     if (errno == 0) {
>       num_entries = 0;
>     } else {
>       return Status::IOError("HDFS: list directory failed");
>     }
>   }
> {code}

--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
[jira] [Commented] (ARROW-805) listing empty HDFS directory returns an error instead of returning empty list
[ https://issues.apache.org/jira/browse/ARROW-805?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15965027#comment-15965027 ]

Leif Walsh commented on ARROW-805:
----------------------------------

Living proof:
{noformat}
ubuntu@49b5b4f128cb:~/build$ ARROW_HDFS_TEST_PORT=9000 ARROW_HDFS_TEST_USER=hdfs ARROW_HDFS_TEST_HOST=impala debug/io-hdfs-test
[==========] Running 18 tests from 2 test cases.
[----------] Global test environment set-up.
[----------] 9 tests from TestHdfsClient/0, where TypeParam = arrow::io::JNIDriver
[ RUN      ] TestHdfsClient/0.ConnectsAgain
17/04/11 22:06:02 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
17/04/11 22:06:03 WARN shortcircuit.DomainSocketFactory: The short-circuit local reads feature cannot be used because libhadoop cannot be loaded.
[       OK ] TestHdfsClient/0.ConnectsAgain (1837 ms)
[ RUN      ] TestHdfsClient/0.CreateDirectory
[       OK ] TestHdfsClient/0.CreateDirectory (211 ms)
[ RUN      ] TestHdfsClient/0.GetCapacityUsed
[       OK ] TestHdfsClient/0.GetCapacityUsed (156 ms)
[ RUN      ] TestHdfsClient/0.GetPathInfo
[       OK ] TestHdfsClient/0.GetPathInfo (393 ms)
[ RUN      ] TestHdfsClient/0.AppendToFile
[       OK ] TestHdfsClient/0.AppendToFile (232 ms)
[ RUN      ] TestHdfsClient/0.ListDirectory
[       OK ] TestHdfsClient/0.ListDirectory (228 ms)
[ RUN      ] TestHdfsClient/0.ReadableMethods
[       OK ] TestHdfsClient/0.ReadableMethods (230 ms)
[ RUN      ] TestHdfsClient/0.LargeFile
[       OK ] TestHdfsClient/0.LargeFile (263 ms)
[ RUN      ] TestHdfsClient/0.RenameFile
[       OK ] TestHdfsClient/0.RenameFile (192 ms)
[----------] 9 tests from TestHdfsClient/0 (3742 ms total)
[----------] 9 tests from TestHdfsClient/1, where TypeParam = arrow::io::PivotalDriver
[ RUN      ] TestHdfsClient/1.ConnectsAgain
[       OK ] TestHdfsClient/1.ConnectsAgain (150 ms)
[ RUN      ] TestHdfsClient/1.CreateDirectory
[       OK ] TestHdfsClient/1.CreateDirectory (156 ms)
[ RUN      ] TestHdfsClient/1.GetCapacityUsed
[       OK ] TestHdfsClient/1.GetCapacityUsed (127 ms)
[ RUN      ] TestHdfsClient/1.GetPathInfo
[       OK ] TestHdfsClient/1.GetPathInfo (196 ms)
[ RUN      ] TestHdfsClient/1.AppendToFile
[       OK ] TestHdfsClient/1.AppendToFile (225 ms)
[ RUN      ] TestHdfsClient/1.ListDirectory
[       OK ] TestHdfsClient/1.ListDirectory (230 ms)
[ RUN      ] TestHdfsClient/1.ReadableMethods
[       OK ] TestHdfsClient/1.ReadableMethods (195 ms)
[ RUN      ] TestHdfsClient/1.LargeFile
[       OK ] TestHdfsClient/1.LargeFile (251 ms)
[ RUN      ] TestHdfsClient/1.RenameFile
[       OK ] TestHdfsClient/1.RenameFile (189 ms)
[----------] 9 tests from TestHdfsClient/1 (1719 ms total)
[----------] Global test environment tear-down
[==========] 18 tests from 2 test cases ran. (5462 ms total)
[  PASSED  ] 18 tests.
{noformat}

> listing empty HDFS directory returns an error instead of returning empty list
> -----------------------------------------------------------------------------
>
> Key: ARROW-805
> URL: https://issues.apache.org/jira/browse/ARROW-805
> Project: Apache Arrow
> Issue Type: Bug
> Affects Versions: 0.2.0, 0.3.0
> Reporter: Leif Walsh
> Assignee: Leif Walsh
> Fix For: 0.3.0
>
> https://github.com/apache/arrow/blob/master/cpp/src/arrow/io/hdfs.cc#L409-L410
> {code}
>   if (entries == nullptr) {
>     // If the directory is empty, entries is NULL but errno is 0. Non-zero
>     // errno indicates error
>     //
>     // Note: errno is thread-local
>     if (errno == 0) { num_entries = 0; }
>     { return Status::IOError("HDFS: list directory failed"); }
>   }
> {code}
> I think that should have an else:
> {code}
>   if (entries == nullptr) {
>     // If the directory is empty, entries is NULL but errno is 0. Non-zero
>     // errno indicates error
>     //
>     // Note: errno is thread-local
>     if (errno == 0) {
>       num_entries = 0;
>     } else {
>       return Status::IOError("HDFS: list directory failed");
>     }
>   }
> {code}

--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
[jira] [Commented] (ARROW-805) listing empty HDFS directory returns an error instead of returning empty list
[ https://issues.apache.org/jira/browse/ARROW-805?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15965024#comment-15965024 ]

Leif Walsh commented on ARROW-805:
----------------------------------

Good news, everyone! I got these tests to pass. I had to run them in a separate Docker container that was linked with the HDFS one though, which isn't great. I'll try my hand at automating it tonight but don't expect to get that far.

This PR has my changes but not the CI changes: https://github.com/apache/arrow/pull/528

I think we should deal with CI in a separate PR, though I will start working on it now.

> listing empty HDFS directory returns an error instead of returning empty list
> -----------------------------------------------------------------------------
>
> Key: ARROW-805
> URL: https://issues.apache.org/jira/browse/ARROW-805
> Project: Apache Arrow
> Issue Type: Bug
> Affects Versions: 0.2.0, 0.3.0
> Reporter: Leif Walsh
> Assignee: Leif Walsh
> Fix For: 0.3.0
>
> https://github.com/apache/arrow/blob/master/cpp/src/arrow/io/hdfs.cc#L409-L410
> {code}
>   if (entries == nullptr) {
>     // If the directory is empty, entries is NULL but errno is 0. Non-zero
>     // errno indicates error
>     //
>     // Note: errno is thread-local
>     if (errno == 0) { num_entries = 0; }
>     { return Status::IOError("HDFS: list directory failed"); }
>   }
> {code}
> I think that should have an else:
> {code}
>   if (entries == nullptr) {
>     // If the directory is empty, entries is NULL but errno is 0. Non-zero
>     // errno indicates error
>     //
>     // Note: errno is thread-local
>     if (errno == 0) {
>       num_entries = 0;
>     } else {
>       return Status::IOError("HDFS: list directory failed");
>     }
>   }
> {code}

--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
[jira] [Commented] (ARROW-720) [java] arrow should not have a dependency on slf4j bridges in compile
[ https://issues.apache.org/jira/browse/ARROW-720?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15964815#comment-15964815 ]

Julien Le Dem commented on ARROW-720:
-------------------------------------

Possibly, although it's more related to this:
https://github.com/apache/arrow/commit/55d8f99c351c22c2357924b4e70fcef7c8fd119a
I think it is an interaction between the version of Maven and Maven plugins that use incompatible versions of slf4j.
You can add the parameter -Dcheckstyle.skip as a workaround.

> [java] arrow should not have a dependency on slf4j bridges in compile
> ---------------------------------------------------------------------
>
> Key: ARROW-720
> URL: https://issues.apache.org/jira/browse/ARROW-720
> Project: Apache Arrow
> Issue Type: Bug
> Components: Java - Vectors
> Reporter: Julien Le Dem
> Assignee: Julien Le Dem
>
> See:
> https://github.com/apache/arrow/blob/d2d27555b4b2f3f0ba26539211bfe8b4d1b52481/java/pom.xml#L472
> As a library, Arrow should not pick the direction of the bridges.
> We should move those to test scope.

--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
[jira] [Resolved] (ARROW-808) [GLib] Remove needless ignore entries
[ https://issues.apache.org/jira/browse/ARROW-808?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wes McKinney resolved ARROW-808. Resolution: Fixed Issue resolved by pull request 527 [https://github.com/apache/arrow/pull/527] > [GLib] Remove needless ignore entries > - > > Key: ARROW-808 > URL: https://issues.apache.org/jira/browse/ARROW-808 > Project: Apache Arrow > Issue Type: Improvement > Components: GLib >Reporter: Kouhei Sutou >Priority: Minor > -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (ARROW-808) [GLib] Remove needless ignore entries
[ https://issues.apache.org/jira/browse/ARROW-808?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wes McKinney updated ARROW-808: --- Fix Version/s: 0.3.0 > [GLib] Remove needless ignore entries > - > > Key: ARROW-808 > URL: https://issues.apache.org/jira/browse/ARROW-808 > Project: Apache Arrow > Issue Type: Improvement > Components: GLib >Reporter: Kouhei Sutou >Assignee: Kouhei Sutou >Priority: Minor > Fix For: 0.3.0 > > -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Assigned] (ARROW-808) [GLib] Remove needless ignore entries
[ https://issues.apache.org/jira/browse/ARROW-808?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wes McKinney reassigned ARROW-808: -- Assignee: Kouhei Sutou > [GLib] Remove needless ignore entries > - > > Key: ARROW-808 > URL: https://issues.apache.org/jira/browse/ARROW-808 > Project: Apache Arrow > Issue Type: Improvement > Components: GLib >Reporter: Kouhei Sutou >Assignee: Kouhei Sutou >Priority: Minor > Fix For: 0.3.0 > > -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Assigned] (ARROW-807) [GLib] Update "Since" tag
[ https://issues.apache.org/jira/browse/ARROW-807?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wes McKinney reassigned ARROW-807: -- Assignee: Kouhei Sutou > [GLib] Update "Since" tag > - > > Key: ARROW-807 > URL: https://issues.apache.org/jira/browse/ARROW-807 > Project: Apache Arrow > Issue Type: Improvement > Components: GLib >Reporter: Kouhei Sutou >Assignee: Kouhei Sutou >Priority: Minor > Fix For: 0.3.0 > > -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Resolved] (ARROW-807) [GLib] Update "Since" tag
[ https://issues.apache.org/jira/browse/ARROW-807?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wes McKinney resolved ARROW-807. Resolution: Fixed Issue resolved by pull request 526 [https://github.com/apache/arrow/pull/526] > [GLib] Update "Since" tag > - > > Key: ARROW-807 > URL: https://issues.apache.org/jira/browse/ARROW-807 > Project: Apache Arrow > Issue Type: Improvement > Components: GLib >Reporter: Kouhei Sutou >Priority: Minor > -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (ARROW-807) [GLib] Update "Since" tag
[ https://issues.apache.org/jira/browse/ARROW-807?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wes McKinney updated ARROW-807: --- Fix Version/s: 0.3.0 > [GLib] Update "Since" tag > - > > Key: ARROW-807 > URL: https://issues.apache.org/jira/browse/ARROW-807 > Project: Apache Arrow > Issue Type: Improvement > Components: GLib >Reporter: Kouhei Sutou >Assignee: Kouhei Sutou >Priority: Minor > Fix For: 0.3.0 > > -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (ARROW-806) [GLib] Support add/remove a column from table
[ https://issues.apache.org/jira/browse/ARROW-806?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wes McKinney updated ARROW-806: --- Fix Version/s: 0.3.0 > [GLib] Support add/remove a column from table > - > > Key: ARROW-806 > URL: https://issues.apache.org/jira/browse/ARROW-806 > Project: Apache Arrow > Issue Type: Improvement > Components: GLib >Reporter: Kouhei Sutou >Assignee: Kouhei Sutou > Fix For: 0.3.0 > > -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Resolved] (ARROW-806) [GLib] Support add/remove a column from table
[ https://issues.apache.org/jira/browse/ARROW-806?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wes McKinney resolved ARROW-806. Resolution: Fixed Issue resolved by pull request 525 [https://github.com/apache/arrow/pull/525] > [GLib] Support add/remove a column from table > - > > Key: ARROW-806 > URL: https://issues.apache.org/jira/browse/ARROW-806 > Project: Apache Arrow > Issue Type: Improvement > Components: GLib >Reporter: Kouhei Sutou > -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Assigned] (ARROW-806) [GLib] Support add/remove a column from table
[ https://issues.apache.org/jira/browse/ARROW-806?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wes McKinney reassigned ARROW-806: -- Assignee: Kouhei Sutou > [GLib] Support add/remove a column from table > - > > Key: ARROW-806 > URL: https://issues.apache.org/jira/browse/ARROW-806 > Project: Apache Arrow > Issue Type: Improvement > Components: GLib >Reporter: Kouhei Sutou >Assignee: Kouhei Sutou > Fix For: 0.3.0 > > -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (ARROW-801) [JAVA] Provide direct access to underlying buffer memory addresses in consistent way without generating garbage or large amount indirections
[ https://issues.apache.org/jira/browse/ARROW-801?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15964572#comment-15964572 ]

Julien Le Dem commented on ARROW-801:
-------------------------------------

We'd need to cast only if the types are not declared properly. java.util.Iterator is pretty old. Guava and Scala, for example, clearly separate mutable and immutable collections without needing to cast.
[Fixed|Variable]WidthVector can be subclasses of FieldVector. I think we should try before declaring it's too much work.

> [JAVA] Provide direct access to underlying buffer memory addresses in
> consistent way without generating garbage or large amount indirections
>
> Key: ARROW-801
> URL: https://issues.apache.org/jira/browse/ARROW-801
> Project: Apache Arrow
> Issue Type: Bug
> Components: Java - Vectors
> Reporter: Jacques Nadeau
>
> When working with Arrow vectors recently, we observed a situation where our
> time was dominated by calls to getFieldBuffers() to be able to retrieve
> memory addresses (22s out of 26s total for a piece of code). We should
> provide a direct mechanism to access this data so we can avoid all the extra
> indirection and object creation.
> A proposal:
> getBitAddress();
> getDataAddress();
> getOffsetAddress();
> These interfaces would be made available at the FieldVector interface and
> simply throw UnsupportedOperationException where not supported.
> Unsupported Operations:
> data for list type
> offset for fixed width types
> data and offset for struct type
> data for union type

--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
[jira] [Commented] (ARROW-801) [JAVA] Provide direct access to underlying buffer memory addresses in consistent way without generating garbage or large amount indirections
[ https://issues.apache.org/jira/browse/ARROW-801?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15964505#comment-15964505 ]

Jacques Nadeau commented on ARROW-801:
--------------------------------------

Since these are low-level interfaces, I'd prefer to not have to constantly deal with type casting (much like java.util.Iterator has remove() even if it isn't always supported). Using the [Fixed|Variable]WidthVector interfaces is especially problematic since they aren't connected to FieldVector.
I think the interface hierarchy could be improved, but I'd prefer not to block this work on that, since I think it is a longer-term piece of work.

> [JAVA] Provide direct access to underlying buffer memory addresses in
> consistent way without generating garbage or large amount indirections
>
> Key: ARROW-801
> URL: https://issues.apache.org/jira/browse/ARROW-801
> Project: Apache Arrow
> Issue Type: Bug
> Components: Java - Vectors
> Reporter: Jacques Nadeau
>
> When working with Arrow vectors recently, we observed a situation where our
> time was dominated by calls to getFieldBuffers() to be able to retrieve
> memory addresses (22s out of 26s total for a piece of code). We should
> provide a direct mechanism to access this data so we can avoid all the extra
> indirection and object creation.
> A proposal:
> getBitAddress();
> getDataAddress();
> getOffsetAddress();
> These interfaces would be made available at the FieldVector interface and
> simply throw UnsupportedOperationException where not supported.
> Unsupported Operations:
> data for list type
> offset for fixed width types
> data and offset for struct type
> data for union type

--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
[jira] [Commented] (ARROW-805) listing empty HDFS directory returns an error instead of returning empty list
[ https://issues.apache.org/jira/browse/ARROW-805?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15964499#comment-15964499 ]

Leif Walsh commented on ARROW-805:
----------------------------------

Okay, I got it set up but got a failure. I think it might be failing a permission check? Not sure yet.
{noformat}
ARROW_HDFS_TEST_PORT=9000 ARROW_HDFS_TEST_USER=hdfs ARROW_HDFS_TEST_HOST=impala debug/io-hdfs-test
[==========] Running 18 tests from 2 test cases.
[----------] Global test environment set-up.
[----------] 9 tests from TestHdfsClient/0, where TypeParam = arrow::io::JNIDriver
[ RUN      ] TestHdfsClient/0.ConnectsAgain
2017-04-11 11:13:25,511 WARN [main] util.NativeCodeLoader (NativeCodeLoader.java:(62)) - Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
[       OK ] TestHdfsClient/0.ConnectsAgain (1381 ms)
[ RUN      ] TestHdfsClient/0.CreateDirectory
/home/leif/git/arrow/cpp/src/arrow/io/io-hdfs-test.cc:174: Failure
Value of: s.ok()
  Actual: false
Expected: true
[  FAILED  ] TestHdfsClient/0.CreateDirectory, where TypeParam = arrow::io::JNIDriver (201 ms)
[ RUN      ] TestHdfsClient/0.GetCapacityUsed
[       OK ] TestHdfsClient/0.GetCapacityUsed (137 ms)
[ RUN      ] TestHdfsClient/0.GetPathInfo
[       OK ] TestHdfsClient/0.GetPathInfo (799 ms)
[ RUN      ] TestHdfsClient/0.AppendToFile
2017-04-11 11:13:27,536 WARN [Thread-15] hdfs.DFSClient (DFSOutputStream.java:run(557)) - DataStreamer Exception
java.io.IOException: Failed to replace a bad datanode on the existing pipeline due to no more good datanodes being available to try. (Nodes: current=[DatanodeInfoWithStorage[172.17.0.2:50010,DS-12c742d4-9f2f-46b4-a512-ee1a1ebd732b,DISK]], original=[DatanodeInfoWithStorage[172.17.0.2:50010,DS-12c742d4-9f2f-46b4-a512-ee1a1ebd732b,DISK]]). The current failed datanode replacement policy is DEFAULT, and a client may configure this via 'dfs.client.block.write.replace-datanode-on-failure.policy' in its configuration.
	at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.findNewDatanode(DFSOutputStream.java:914)
	at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.addDatanode2ExistingPipeline(DFSOutputStream.java:988)
	at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.setupPipelineForAppendOrRecovery(DFSOutputStream.java:1156)
	at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:454)
FSDataOutputStream#close error:
java.io.IOException: Failed to replace a bad datanode on the existing pipeline due to no more good datanodes being available to try. (Nodes: current=[DatanodeInfoWithStorage[172.17.0.2:50010,DS-12c742d4-9f2f-46b4-a512-ee1a1ebd732b,DISK]], original=[DatanodeInfoWithStorage[172.17.0.2:50010,DS-12c742d4-9f2f-46b4-a512-ee1a1ebd732b,DISK]]). The current failed datanode replacement policy is DEFAULT, and a client may configure this via 'dfs.client.block.write.replace-datanode-on-failure.policy' in its configuration.
	at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.findNewDatanode(DFSOutputStream.java:914)
	at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.addDatanode2ExistingPipeline(DFSOutputStream.java:988)
	at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.setupPipelineForAppendOrRecovery(DFSOutputStream.java:1156)
	at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:454)
/home/leif/git/arrow/cpp/src/arrow/io/io-hdfs-test.cc:236: Failure
Failed
IOError: HDFS: CloseFile failed
2017-04-11 11:13:27,550 ERROR [main] hdfs.DFSClient (DFSClient.java:closeAllFilesBeingWritten(940)) - Failed to close inode 16419
java.io.IOException: Failed to replace a bad datanode on the existing pipeline due to no more good datanodes being available to try. (Nodes: current=[DatanodeInfoWithStorage[172.17.0.2:50010,DS-12c742d4-9f2f-46b4-a512-ee1a1ebd732b,DISK]], original=[DatanodeInfoWithStorage[172.17.0.2:50010,DS-12c742d4-9f2f-46b4-a512-ee1a1ebd732b,DISK]]). The current failed datanode replacement policy is DEFAULT, and a client may configure this via 'dfs.client.block.write.replace-datanode-on-failure.policy' in its configuration.
	at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.findNewDatanode(DFSOutputStream.java:914)
	at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.addDatanode2ExistingPipeline(DFSOutputStream.java:988)
	at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.setupPipelineForAppendOrRecovery(DFSOutputStream.java:1156)
	at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:454)
[  FAILED  ] TestHdfsClient/0.AppendToFile, where TypeParam = arrow::io::JNIDriver (207 ms)
{noformat}
[~cpcloud] maybe we can look together this afternoon?

> listing empty HDFS directory returns an error instead of returning empty list
> -----------------------------------------------------------------------------
[jira] [Commented] (ARROW-805) listing empty HDFS directory returns an error instead of returning empty list
[ https://issues.apache.org/jira/browse/ARROW-805?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15964461#comment-15964461 ]

Wes McKinney commented on ARROW-805:

Check out https://github.com/cloudera/ibis/blob/master/circle.yml

> listing empty HDFS directory returns an error instead of returning empty list
> -----------------------------------------------------------------------------
>
> Key: ARROW-805
> URL: https://issues.apache.org/jira/browse/ARROW-805
> Project: Apache Arrow
> Issue Type: Bug
> Affects Versions: 0.2.0, 0.3.0
> Reporter: Leif Walsh
> Assignee: Leif Walsh
> Fix For: 0.3.0
>
> https://github.com/apache/arrow/blob/master/cpp/src/arrow/io/hdfs.cc#L409-L410
> {code}
>   if (entries == nullptr) {
>     // If the directory is empty, entries is NULL but errno is 0. Non-zero
>     // errno indicates error
>     //
>     // Note: errno is thread-local
>     if (errno == 0) { num_entries = 0; }
>     { return Status::IOError("HDFS: list directory failed"); }
>   }
> {code}
> I think that should have an else:
> {code}
>   if (entries == nullptr) {
>     // If the directory is empty, entries is NULL but errno is 0. Non-zero
>     // errno indicates error
>     //
>     // Note: errno is thread-local
>     if (errno == 0) {
>       num_entries = 0;
>     } else {
>       return Status::IOError("HDFS: list directory failed");
>     }
>   }
> {code}

--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
[jira] [Commented] (ARROW-805) listing empty HDFS directory returns an error instead of returning empty list
[ https://issues.apache.org/jira/browse/ARROW-805?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15964458#comment-15964458 ]

Leif Walsh commented on ARROW-805:
----------------------------------

How should I use that? docker run bash and build/run tests in there? Any env vars I need to set?

> listing empty HDFS directory returns an error instead of returning empty list
> -----------------------------------------------------------------------------
>
> Key: ARROW-805
> URL: https://issues.apache.org/jira/browse/ARROW-805
> Project: Apache Arrow
> Issue Type: Bug
> Affects Versions: 0.2.0, 0.3.0
> Reporter: Leif Walsh
> Assignee: Leif Walsh
> Fix For: 0.3.0
>
> https://github.com/apache/arrow/blob/master/cpp/src/arrow/io/hdfs.cc#L409-L410
> {code}
>   if (entries == nullptr) {
>     // If the directory is empty, entries is NULL but errno is 0. Non-zero
>     // errno indicates error
>     //
>     // Note: errno is thread-local
>     if (errno == 0) { num_entries = 0; }
>     { return Status::IOError("HDFS: list directory failed"); }
>   }
> {code}
> I think that should have an else:
> {code}
>   if (entries == nullptr) {
>     // If the directory is empty, entries is NULL but errno is 0. Non-zero
>     // errno indicates error
>     //
>     // Note: errno is thread-local
>     if (errno == 0) {
>       num_entries = 0;
>     } else {
>       return Status::IOError("HDFS: list directory failed");
>     }
>   }
> {code}

--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
[jira] [Commented] (ARROW-646) Cache miniconda packages
[ https://issues.apache.org/jira/browse/ARROW-646?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15964454#comment-15964454 ] Wes McKinney commented on ARROW-646: [~cpcloud] [~leif] I marked this for 0.3 -- with the rampant conda / anaconda.org timeouts, if we can cache our conda tarballs using Travis CI caching we may be able to both speed up builds and reduce flakiness. I think this may involve changing the conda cache directory to be someplace other than the miniconda/ install in https://github.com/apache/arrow/blob/master/ci/travis_install_conda.sh#L27. > Cache miniconda packages > > > Key: ARROW-646 > URL: https://issues.apache.org/jira/browse/ARROW-646 > Project: Apache Arrow > Issue Type: Improvement >Reporter: Uwe L. Korn >Assignee: Uwe L. Korn > Fix For: 0.3.0 > > > The unpacking of freshly downloaded conda packages makes up significant time > in our build. Locally this made the conda env creation for Parquet go from > 1min to 14s. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (ARROW-805) listing empty HDFS directory returns an error instead of returning empty list
[ https://issues.apache.org/jira/browse/ARROW-805?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15964455#comment-15964455 ]

Phillip Cloud commented on ARROW-805:
-------------------------------------

Here's a link to the image: https://hub.docker.com/r/cpcloud86/impala/

> listing empty HDFS directory returns an error instead of returning empty list
> -----------------------------------------------------------------------------
>
> Key: ARROW-805
> URL: https://issues.apache.org/jira/browse/ARROW-805
> Project: Apache Arrow
> Issue Type: Bug
> Affects Versions: 0.2.0, 0.3.0
> Reporter: Leif Walsh
> Assignee: Leif Walsh
> Fix For: 0.3.0
>
> https://github.com/apache/arrow/blob/master/cpp/src/arrow/io/hdfs.cc#L409-L410
> {code}
>   if (entries == nullptr) {
>     // If the directory is empty, entries is NULL but errno is 0. Non-zero
>     // errno indicates error
>     //
>     // Note: errno is thread-local
>     if (errno == 0) { num_entries = 0; }
>     { return Status::IOError("HDFS: list directory failed"); }
>   }
> {code}
> I think that should have an else:
> {code}
>   if (entries == nullptr) {
>     // If the directory is empty, entries is NULL but errno is 0. Non-zero
>     // errno indicates error
>     //
>     // Note: errno is thread-local
>     if (errno == 0) {
>       num_entries = 0;
>     } else {
>       return Status::IOError("HDFS: list directory failed");
>     }
>   }
> {code}

--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
[jira] [Updated] (ARROW-646) Cache miniconda packages
[ https://issues.apache.org/jira/browse/ARROW-646?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Wes McKinney updated ARROW-646:

Fix Version/s: 0.3.0

> Cache miniconda packages
>
> Key: ARROW-646
> URL: https://issues.apache.org/jira/browse/ARROW-646
> Project: Apache Arrow
> Issue Type: Improvement
> Reporter: Uwe L. Korn
> Assignee: Uwe L. Korn
> Fix For: 0.3.0
>
> The unpacking of freshly downloaded conda packages makes up significant time
> in our build. Locally this made the conda env creation for Parquet go from
> 1min to 14s.
[jira] [Commented] (ARROW-805) listing empty HDFS directory returns an error instead of returning empty list
[ https://issues.apache.org/jira/browse/ARROW-805?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15964452#comment-15964452 ]

Wes McKinney commented on ARROW-805:

[~cpcloud] has a docker image with HDFS (and Hive/Impala) that we may be able to use for testing HDFS in a portable way. I currently run the tests against a local cluster, but it's pretty ad hoc.

> listing empty HDFS directory returns an error instead of returning empty list
>
> Key: ARROW-805
> URL: https://issues.apache.org/jira/browse/ARROW-805
> Project: Apache Arrow
> Issue Type: Bug
> Affects Versions: 0.2.0, 0.3.0
> Reporter: Leif Walsh
> Assignee: Leif Walsh
> Fix For: 0.3.0
>
> https://github.com/apache/arrow/blob/master/cpp/src/arrow/io/hdfs.cc#L409-L410
> {code}
> if (entries == nullptr) {
>   // If the directory is empty, entries is NULL but errno is 0. Non-zero
>   // errno indicates error
>   //
>   // Note: errno is thread-local
>   if (errno == 0) { num_entries = 0; }
>   { return Status::IOError("HDFS: list directory failed"); }
> }
> {code}
> I think that should have an else:
> {code}
> if (entries == nullptr) {
>   // If the directory is empty, entries is NULL but errno is 0. Non-zero
>   // errno indicates error
>   //
>   // Note: errno is thread-local
>   if (errno == 0) {
>     num_entries = 0;
>   } else {
>     return Status::IOError("HDFS: list directory failed");
>   }
> }
> {code}
[jira] [Assigned] (ARROW-805) listing empty HDFS directory returns an error instead of returning empty list
[ https://issues.apache.org/jira/browse/ARROW-805?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Leif Walsh reassigned ARROW-805:

Assignee: Leif Walsh

> listing empty HDFS directory returns an error instead of returning empty list
>
> Key: ARROW-805
> URL: https://issues.apache.org/jira/browse/ARROW-805
> Project: Apache Arrow
> Issue Type: Bug
> Affects Versions: 0.2.0, 0.3.0
> Reporter: Leif Walsh
> Assignee: Leif Walsh
> Fix For: 0.3.0
>
> https://github.com/apache/arrow/blob/master/cpp/src/arrow/io/hdfs.cc#L409-L410
> {code}
> if (entries == nullptr) {
>   // If the directory is empty, entries is NULL but errno is 0. Non-zero
>   // errno indicates error
>   //
>   // Note: errno is thread-local
>   if (errno == 0) { num_entries = 0; }
>   { return Status::IOError("HDFS: list directory failed"); }
> }
> {code}
> I think that should have an else:
> {code}
> if (entries == nullptr) {
>   // If the directory is empty, entries is NULL but errno is 0. Non-zero
>   // errno indicates error
>   //
>   // Note: errno is thread-local
>   if (errno == 0) {
>     num_entries = 0;
>   } else {
>     return Status::IOError("HDFS: list directory failed");
>   }
> }
> {code}
[jira] [Commented] (ARROW-805) listing empty HDFS directory returns an error instead of returning empty list
[ https://issues.apache.org/jira/browse/ARROW-805?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=1596#comment-1596 ]

Wes McKinney commented on ARROW-805:

Looks right, yep

> listing empty HDFS directory returns an error instead of returning empty list
>
> Key: ARROW-805
> URL: https://issues.apache.org/jira/browse/ARROW-805
> Project: Apache Arrow
> Issue Type: Bug
> Affects Versions: 0.2.0, 0.3.0
> Reporter: Leif Walsh
> Fix For: 0.3.0
>
> https://github.com/apache/arrow/blob/master/cpp/src/arrow/io/hdfs.cc#L409-L410
> {code}
> if (entries == nullptr) {
>   // If the directory is empty, entries is NULL but errno is 0. Non-zero
>   // errno indicates error
>   //
>   // Note: errno is thread-local
>   if (errno == 0) { num_entries = 0; }
>   { return Status::IOError("HDFS: list directory failed"); }
> }
> {code}
> I think that should have an else:
> {code}
> if (entries == nullptr) {
>   // If the directory is empty, entries is NULL but errno is 0. Non-zero
>   // errno indicates error
>   //
>   // Note: errno is thread-local
>   if (errno == 0) {
>     num_entries = 0;
>   } else {
>     return Status::IOError("HDFS: list directory failed");
>   }
> }
> {code}
[jira] [Commented] (ARROW-805) listing empty HDFS directory returns an error instead of returning empty list
[ https://issues.apache.org/jira/browse/ARROW-805?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15964439#comment-15964439 ]

Leif Walsh commented on ARROW-805:

Does that fix look right? If so I can implement and test.

> listing empty HDFS directory returns an error instead of returning empty list
>
> Key: ARROW-805
> URL: https://issues.apache.org/jira/browse/ARROW-805
> Project: Apache Arrow
> Issue Type: Bug
> Affects Versions: 0.2.0, 0.3.0
> Reporter: Leif Walsh
> Fix For: 0.3.0
>
> https://github.com/apache/arrow/blob/master/cpp/src/arrow/io/hdfs.cc#L409-L410
> {code}
> if (entries == nullptr) {
>   // If the directory is empty, entries is NULL but errno is 0. Non-zero
>   // errno indicates error
>   //
>   // Note: errno is thread-local
>   if (errno == 0) { num_entries = 0; }
>   { return Status::IOError("HDFS: list directory failed"); }
> }
> {code}
> I think that should have an else:
> {code}
> if (entries == nullptr) {
>   // If the directory is empty, entries is NULL but errno is 0. Non-zero
>   // errno indicates error
>   //
>   // Note: errno is thread-local
>   if (errno == 0) {
>     num_entries = 0;
>   } else {
>     return Status::IOError("HDFS: list directory failed");
>   }
> }
> {code}
[jira] [Created] (ARROW-808) [GLib] Remove needless ignore entries
Kouhei Sutou created ARROW-808:

Summary: [GLib] Remove needless ignore entries
Key: ARROW-808
URL: https://issues.apache.org/jira/browse/ARROW-808
Project: Apache Arrow
Issue Type: Improvement
Components: GLib
Reporter: Kouhei Sutou
Priority: Minor
[jira] [Updated] (ARROW-804) [GLib] Update build document
[ https://issues.apache.org/jira/browse/ARROW-804?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Wes McKinney updated ARROW-804:

Fix Version/s: 0.3.0

> [GLib] Update build document
>
> Key: ARROW-804
> URL: https://issues.apache.org/jira/browse/ARROW-804
> Project: Apache Arrow
> Issue Type: Improvement
> Components: GLib
> Reporter: Kouhei Sutou
> Assignee: Kouhei Sutou
> Priority: Minor
> Fix For: 0.3.0
[jira] [Created] (ARROW-807) [GLib] Update "Since" tag
Kouhei Sutou created ARROW-807:

Summary: [GLib] Update "Since" tag
Key: ARROW-807
URL: https://issues.apache.org/jira/browse/ARROW-807
Project: Apache Arrow
Issue Type: Improvement
Components: GLib
Reporter: Kouhei Sutou
Priority: Minor
[jira] [Resolved] (ARROW-804) [GLib] Update build document
[ https://issues.apache.org/jira/browse/ARROW-804?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Wes McKinney resolved ARROW-804.

Resolution: Fixed

Issue resolved by pull request 524
[https://github.com/apache/arrow/pull/524]

> [GLib] Update build document
>
> Key: ARROW-804
> URL: https://issues.apache.org/jira/browse/ARROW-804
> Project: Apache Arrow
> Issue Type: Improvement
> Components: GLib
> Reporter: Kouhei Sutou
> Priority: Minor
[jira] [Assigned] (ARROW-804) [GLib] Update build document
[ https://issues.apache.org/jira/browse/ARROW-804?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Wes McKinney reassigned ARROW-804:

Assignee: Kouhei Sutou

> [GLib] Update build document
>
> Key: ARROW-804
> URL: https://issues.apache.org/jira/browse/ARROW-804
> Project: Apache Arrow
> Issue Type: Improvement
> Components: GLib
> Reporter: Kouhei Sutou
> Assignee: Kouhei Sutou
> Priority: Minor
[jira] [Assigned] (ARROW-803) [GLib] Update package repository URL
[ https://issues.apache.org/jira/browse/ARROW-803?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Wes McKinney reassigned ARROW-803:

Assignee: Kouhei Sutou

> [GLib] Update package repository URL
>
> Key: ARROW-803
> URL: https://issues.apache.org/jira/browse/ARROW-803
> Project: Apache Arrow
> Issue Type: Improvement
> Components: GLib
> Reporter: Kouhei Sutou
> Assignee: Kouhei Sutou
> Priority: Minor
> Fix For: 0.3.0
[jira] [Updated] (ARROW-803) [GLib] Update package repository URL
[ https://issues.apache.org/jira/browse/ARROW-803?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Wes McKinney updated ARROW-803:

Fix Version/s: 0.3.0

> [GLib] Update package repository URL
>
> Key: ARROW-803
> URL: https://issues.apache.org/jira/browse/ARROW-803
> Project: Apache Arrow
> Issue Type: Improvement
> Components: GLib
> Reporter: Kouhei Sutou
> Assignee: Kouhei Sutou
> Priority: Minor
> Fix For: 0.3.0
[jira] [Resolved] (ARROW-803) [GLib] Update package repository URL
[ https://issues.apache.org/jira/browse/ARROW-803?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Wes McKinney resolved ARROW-803.

Resolution: Fixed

Issue resolved by pull request 523
[https://github.com/apache/arrow/pull/523]

> [GLib] Update package repository URL
>
> Key: ARROW-803
> URL: https://issues.apache.org/jira/browse/ARROW-803
> Project: Apache Arrow
> Issue Type: Improvement
> Components: GLib
> Reporter: Kouhei Sutou
> Priority: Minor
[jira] [Assigned] (ARROW-802) [GLib] Add read examples
[ https://issues.apache.org/jira/browse/ARROW-802?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Wes McKinney reassigned ARROW-802:

Assignee: Kouhei Sutou

> [GLib] Add read examples
>
> Key: ARROW-802
> URL: https://issues.apache.org/jira/browse/ARROW-802
> Project: Apache Arrow
> Issue Type: Improvement
> Components: GLib
> Reporter: Kouhei Sutou
> Assignee: Kouhei Sutou
> Priority: Minor
> Fix For: 0.3.0
[jira] [Updated] (ARROW-802) [GLib] Add read examples
[ https://issues.apache.org/jira/browse/ARROW-802?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Wes McKinney updated ARROW-802:

Fix Version/s: 0.3.0

> [GLib] Add read examples
>
> Key: ARROW-802
> URL: https://issues.apache.org/jira/browse/ARROW-802
> Project: Apache Arrow
> Issue Type: Improvement
> Components: GLib
> Reporter: Kouhei Sutou
> Assignee: Kouhei Sutou
> Priority: Minor
> Fix For: 0.3.0
[jira] [Resolved] (ARROW-802) [GLib] Add read examples
[ https://issues.apache.org/jira/browse/ARROW-802?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Wes McKinney resolved ARROW-802.

Resolution: Fixed

Issue resolved by pull request 522
[https://github.com/apache/arrow/pull/522]

> [GLib] Add read examples
>
> Key: ARROW-802
> URL: https://issues.apache.org/jira/browse/ARROW-802
> Project: Apache Arrow
> Issue Type: Improvement
> Components: GLib
> Reporter: Kouhei Sutou
> Priority: Minor
[jira] [Updated] (ARROW-805) listing empty HDFS directory returns an error instead of returning empty list
[ https://issues.apache.org/jira/browse/ARROW-805?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Wes McKinney updated ARROW-805:

Fix Version/s: 0.3.0

> listing empty HDFS directory returns an error instead of returning empty list
>
> Key: ARROW-805
> URL: https://issues.apache.org/jira/browse/ARROW-805
> Project: Apache Arrow
> Issue Type: Bug
> Affects Versions: 0.2.0, 0.3.0
> Reporter: Leif Walsh
> Fix For: 0.3.0
>
> https://github.com/apache/arrow/blob/master/cpp/src/arrow/io/hdfs.cc#L409-L410
> {code}
> if (entries == nullptr) {
>   // If the directory is empty, entries is NULL but errno is 0. Non-zero
>   // errno indicates error
>   //
>   // Note: errno is thread-local
>   if (errno == 0) { num_entries = 0; }
>   { return Status::IOError("HDFS: list directory failed"); }
> }
> {code}
> I think that should have an else:
> {code}
> if (entries == nullptr) {
>   // If the directory is empty, entries is NULL but errno is 0. Non-zero
>   // errno indicates error
>   //
>   // Note: errno is thread-local
>   if (errno == 0) {
>     num_entries = 0;
>   } else {
>     return Status::IOError("HDFS: list directory failed");
>   }
> }
> {code}
[jira] [Commented] (ARROW-805) listing empty HDFS directory returns an error instead of returning empty list
[ https://issues.apache.org/jira/browse/ARROW-805?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15964426#comment-15964426 ]

Wes McKinney commented on ARROW-805:

Marked for 0.3

> listing empty HDFS directory returns an error instead of returning empty list
>
> Key: ARROW-805
> URL: https://issues.apache.org/jira/browse/ARROW-805
> Project: Apache Arrow
> Issue Type: Bug
> Affects Versions: 0.2.0, 0.3.0
> Reporter: Leif Walsh
> Fix For: 0.3.0
>
> https://github.com/apache/arrow/blob/master/cpp/src/arrow/io/hdfs.cc#L409-L410
> {code}
> if (entries == nullptr) {
>   // If the directory is empty, entries is NULL but errno is 0. Non-zero
>   // errno indicates error
>   //
>   // Note: errno is thread-local
>   if (errno == 0) { num_entries = 0; }
>   { return Status::IOError("HDFS: list directory failed"); }
> }
> {code}
> I think that should have an else:
> {code}
> if (entries == nullptr) {
>   // If the directory is empty, entries is NULL but errno is 0. Non-zero
>   // errno indicates error
>   //
>   // Note: errno is thread-local
>   if (errno == 0) {
>     num_entries = 0;
>   } else {
>     return Status::IOError("HDFS: list directory failed");
>   }
> }
> {code}
[jira] [Created] (ARROW-806) [GLib] Support add/remove a column from table
Kouhei Sutou created ARROW-806:

Summary: [GLib] Support add/remove a column from table
Key: ARROW-806
URL: https://issues.apache.org/jira/browse/ARROW-806
Project: Apache Arrow
Issue Type: Improvement
Components: GLib
Reporter: Kouhei Sutou
[jira] [Created] (ARROW-805) listing empty HDFS directory returns an error instead of returning empty list
Leif Walsh created ARROW-805:

Summary: listing empty HDFS directory returns an error instead of returning empty list
Key: ARROW-805
URL: https://issues.apache.org/jira/browse/ARROW-805
Project: Apache Arrow
Issue Type: Bug
Affects Versions: 0.2.0, 0.3.0
Reporter: Leif Walsh

https://github.com/apache/arrow/blob/master/cpp/src/arrow/io/hdfs.cc#L409-L410
{code}
if (entries == nullptr) {
  // If the directory is empty, entries is NULL but errno is 0. Non-zero
  // errno indicates error
  //
  // Note: errno is thread-local
  if (errno == 0) { num_entries = 0; }
  { return Status::IOError("HDFS: list directory failed"); }
}
{code}
I think that should have an else:
{code}
if (entries == nullptr) {
  // If the directory is empty, entries is NULL but errno is 0. Non-zero
  // errno indicates error
  //
  // Note: errno is thread-local
  if (errno == 0) {
    num_entries = 0;
  } else {
    return Status::IOError("HDFS: list directory failed");
  }
}
{code}
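For anyone who wants to try the branch proposed in the ARROW-805 description outside of Arrow, here is a minimal self-contained sketch. It is not the code in hdfs.cc: the Status class below is a hypothetical stand-in for arrow::Status, and InterpretListResult is a made-up helper name that only isolates the NULL/errno decision.

```cpp
#include <cassert>
#include <cerrno>
#include <string>
#include <utility>

// Hypothetical stand-in for arrow::Status, reduced to what this sketch needs.
class Status {
 public:
  static Status OK() { return Status(true, ""); }
  static Status IOError(std::string msg) { return Status(false, std::move(msg)); }
  bool ok() const { return ok_; }
  const std::string& message() const { return msg_; }

 private:
  Status(bool ok, std::string msg) : ok_(ok), msg_(std::move(msg)) {}
  bool ok_;
  std::string msg_;
};

// Hypothetical helper (not an Arrow or libhdfs function): interprets the
// result of a listing call that returns NULL both for an empty directory
// (errno == 0) and for a failure (errno != 0).
Status InterpretListResult(const void* entries, int* num_entries) {
  assert(num_entries != nullptr);
  if (entries == nullptr) {
    if (errno == 0) {
      // Empty directory: report zero entries instead of an error.
      *num_entries = 0;
    } else {
      // Only this branch returns the error; without the `else`, the error
      // return would run unconditionally.
      return Status::IOError("HDFS: list directory failed");
    }
  }
  return Status::OK();
}
```

It may also be worth resetting errno to 0 right before the listing call, since a stale non-zero value left by an earlier call would otherwise be read as a failure. Whether libhdfs does that itself is not something this sketch assumes.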
[jira] [Created] (ARROW-804) [GLib] Update build document
Kouhei Sutou created ARROW-804:

Summary: [GLib] Update build document
Key: ARROW-804
URL: https://issues.apache.org/jira/browse/ARROW-804
Project: Apache Arrow
Issue Type: Improvement
Components: GLib
Reporter: Kouhei Sutou
Priority: Minor
[jira] [Created] (ARROW-803) [GLib] Update package repository URL
Kouhei Sutou created ARROW-803:

Summary: [GLib] Update package repository URL
Key: ARROW-803
URL: https://issues.apache.org/jira/browse/ARROW-803
Project: Apache Arrow
Issue Type: Improvement
Components: GLib
Reporter: Kouhei Sutou
Priority: Minor
[jira] [Created] (ARROW-802) [GLib] Add read examples
Kouhei Sutou created ARROW-802:

Summary: [GLib] Add read examples
Key: ARROW-802
URL: https://issues.apache.org/jira/browse/ARROW-802
Project: Apache Arrow
Issue Type: Improvement
Components: GLib
Reporter: Kouhei Sutou
Priority: Minor