[jira] [Resolved] (ARROW-805) listing empty HDFS directory returns an error instead of returning empty list

2017-04-11 Thread Wes McKinney (JIRA)

 [ 
https://issues.apache.org/jira/browse/ARROW-805?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wes McKinney resolved ARROW-805.

Resolution: Fixed

Issue resolved by pull request 528
[https://github.com/apache/arrow/pull/528]

> listing empty HDFS directory returns an error instead of returning empty list
> -
>
> Key: ARROW-805
> URL: https://issues.apache.org/jira/browse/ARROW-805
> Project: Apache Arrow
>  Issue Type: Bug
>Affects Versions: 0.2.0, 0.3.0
>Reporter: Leif Walsh
>Assignee: Leif Walsh
> Fix For: 0.3.0
>
>
> https://github.com/apache/arrow/blob/master/cpp/src/arrow/io/hdfs.cc#L409-L410
> {code}
> if (entries == nullptr) {
>   // If the directory is empty, entries is NULL but errno is 0. Non-zero
>   // errno indicates error
>   //
>   // Note: errno is thread-local
>   if (errno == 0) { num_entries = 0; }
>   { return Status::IOError("HDFS: list directory failed"); }
> }
> {code}
> I think that should have an else:
> {code}
> if (entries == nullptr) {
>   // If the directory is empty, entries is NULL but errno is 0. Non-zero
>   // errno indicates error
>   //
>   // Note: errno is thread-local
>   if (errno == 0) {
>     num_entries = 0;
>   } else {
>     return Status::IOError("HDFS: list directory failed");
>   }
>   }
> }
> {code}
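
The effect of the missing else can be reproduced without an HDFS cluster. What follows is a minimal sketch, not the actual Arrow code: Status, FakeListDirectory, ListBuggy and ListFixed are hypothetical stand-ins made up for illustration, and the fake lister only mimics the convention described in the quoted comment (NULL result with errno 0 for an empty directory, NULL with non-zero errno for a failure).

```cpp
#include <cerrno>
#include <string>

// Hypothetical stand-in for arrow::Status; only what this sketch needs.
struct Status {
  bool ok_;
  std::string msg_;
  static Status OK() { return {true, ""}; }
  static Status IOError(const std::string& msg) { return {false, msg}; }
  bool ok() const { return ok_; }
};

// Hypothetical lister that always returns nullptr. With fail == false it
// behaves like an empty directory (errno == 0); with fail == true it
// behaves like a real failure (errno != 0).
const char** FakeListDirectory(bool fail, int* num_entries) {
  *num_entries = -1;
  errno = fail ? EIO : 0;
  return nullptr;
}

// Shape of the reported code: the second block is unconditional, so the
// empty-directory case sets num_entries and then still returns an error.
Status ListBuggy(bool fail, int* num_entries) {
  const char** entries = FakeListDirectory(fail, num_entries);
  if (entries == nullptr) {
    if (errno == 0) { *num_entries = 0; }
    { return Status::IOError("HDFS: list directory failed"); }
  }
  return Status::OK();
}

// Shape of the proposed fix: the error is returned only when errno is
// non-zero, so an empty directory yields OK with zero entries.
Status ListFixed(bool fail, int* num_entries) {
  const char** entries = FakeListDirectory(fail, num_entries);
  if (entries == nullptr) {
    if (errno == 0) {
      *num_entries = 0;
    } else {
      return Status::IOError("HDFS: list directory failed");
    }
  }
  return Status::OK();
}
```

With fail set to false, ListBuggy reports an IOError even though it has already stored 0 entries, while ListFixed returns OK; with fail set to true, both return the IOError.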



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (ARROW-539) [Python] Support reading Parquet datasets with standard partition directory schemes

2017-04-11 Thread Wes McKinney (JIRA)

[ 
https://issues.apache.org/jira/browse/ARROW-539?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15965070#comment-15965070
 ] 

Wes McKinney commented on ARROW-539:


PR: https://github.com/apache/arrow/pull/529

> [Python] Support reading Parquet datasets with standard partition directory 
> schemes
> ---
>
> Key: ARROW-539
> URL: https://issues.apache.org/jira/browse/ARROW-539
> Project: Apache Arrow
>  Issue Type: New Feature
>  Components: Python
>Reporter: Wes McKinney
>Assignee: Wes McKinney
> Fix For: 0.3.0
>
> Attachments: partitioned_parquet.tar.gz
>
>
> Currently, we only support multi-file directories with a flat structure 
> (non-partitioned). 





[jira] [Commented] (ARROW-805) listing empty HDFS directory returns an error instead of returning empty list

2017-04-11 Thread Leif Walsh (JIRA)

[ 
https://issues.apache.org/jira/browse/ARROW-805?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15965047#comment-15965047
 ] 

Leif Walsh commented on ARROW-805:
--

OK then, let's just focus on merging this.



[jira] [Commented] (ARROW-805) listing empty HDFS directory returns an error instead of returning empty list

2017-04-11 Thread Wes McKinney (JIRA)

[ 
https://issues.apache.org/jira/browse/ARROW-805?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15965039#comment-15965039
 ] 

Wes McKinney commented on ARROW-805:


I don't think we'll be able to run this CI in the main repo unless we can get 
Circle CI. We should set up an arrow-hadoop repository someplace (I can 
volunteer wesm/arrow-hadoop) where we have more admin control.



[jira] [Commented] (ARROW-805) listing empty HDFS directory returns an error instead of returning empty list

2017-04-11 Thread Leif Walsh (JIRA)

[ 
https://issues.apache.org/jira/browse/ARROW-805?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15965027#comment-15965027
 ] 

Leif Walsh commented on ARROW-805:
--

Living proof:

{noformat}
ubuntu@49b5b4f128cb:~/build$ ARROW_HDFS_TEST_PORT=9000 ARROW_HDFS_TEST_USER=hdfs ARROW_HDFS_TEST_HOST=impala debug/io-hdfs-test
[==] Running 18 tests from 2 test cases.
[--] Global test environment set-up.
[--] 9 tests from TestHdfsClient/0, where TypeParam = arrow::io::JNIDriver
[ RUN  ] TestHdfsClient/0.ConnectsAgain
17/04/11 22:06:02 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
17/04/11 22:06:03 WARN shortcircuit.DomainSocketFactory: The short-circuit local reads feature cannot be used because libhadoop cannot be loaded.
[   OK ] TestHdfsClient/0.ConnectsAgain (1837 ms)
[ RUN  ] TestHdfsClient/0.CreateDirectory
[   OK ] TestHdfsClient/0.CreateDirectory (211 ms)
[ RUN  ] TestHdfsClient/0.GetCapacityUsed
[   OK ] TestHdfsClient/0.GetCapacityUsed (156 ms)
[ RUN  ] TestHdfsClient/0.GetPathInfo
[   OK ] TestHdfsClient/0.GetPathInfo (393 ms)
[ RUN  ] TestHdfsClient/0.AppendToFile
[   OK ] TestHdfsClient/0.AppendToFile (232 ms)
[ RUN  ] TestHdfsClient/0.ListDirectory
[   OK ] TestHdfsClient/0.ListDirectory (228 ms)
[ RUN  ] TestHdfsClient/0.ReadableMethods
[   OK ] TestHdfsClient/0.ReadableMethods (230 ms)
[ RUN  ] TestHdfsClient/0.LargeFile
[   OK ] TestHdfsClient/0.LargeFile (263 ms)
[ RUN  ] TestHdfsClient/0.RenameFile
[   OK ] TestHdfsClient/0.RenameFile (192 ms)
[--] 9 tests from TestHdfsClient/0 (3742 ms total)

[--] 9 tests from TestHdfsClient/1, where TypeParam = arrow::io::PivotalDriver
[ RUN  ] TestHdfsClient/1.ConnectsAgain
[   OK ] TestHdfsClient/1.ConnectsAgain (150 ms)
[ RUN  ] TestHdfsClient/1.CreateDirectory
[   OK ] TestHdfsClient/1.CreateDirectory (156 ms)
[ RUN  ] TestHdfsClient/1.GetCapacityUsed
[   OK ] TestHdfsClient/1.GetCapacityUsed (127 ms)
[ RUN  ] TestHdfsClient/1.GetPathInfo
[   OK ] TestHdfsClient/1.GetPathInfo (196 ms)
[ RUN  ] TestHdfsClient/1.AppendToFile
[   OK ] TestHdfsClient/1.AppendToFile (225 ms)
[ RUN  ] TestHdfsClient/1.ListDirectory
[   OK ] TestHdfsClient/1.ListDirectory (230 ms)
[ RUN  ] TestHdfsClient/1.ReadableMethods
[   OK ] TestHdfsClient/1.ReadableMethods (195 ms)
[ RUN  ] TestHdfsClient/1.LargeFile
[   OK ] TestHdfsClient/1.LargeFile (251 ms)
[ RUN  ] TestHdfsClient/1.RenameFile
[   OK ] TestHdfsClient/1.RenameFile (189 ms)
[--] 9 tests from TestHdfsClient/1 (1719 ms total)

[--] Global test environment tear-down
[==] 18 tests from 2 test cases ran. (5462 ms total)
[  PASSED  ] 18 tests.
{noformat}
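
The three environment variables in the command above select the cluster the test connects to. How io-hdfs-test actually reads them is not shown in this thread; the following is only a sketch under the assumption that they are read with std::getenv and fall back to defaults when unset. HdfsTestConfig, EnvOr, LoadHdfsTestConfig and the default values are hypothetical.

```cpp
#include <cstdlib>
#include <string>

// Hypothetical holder for the connection settings used by the test run.
struct HdfsTestConfig {
  std::string host;
  int port;
  std::string user;
};

// Returns the value of the environment variable, or the fallback when the
// variable is not set.
std::string EnvOr(const char* name, const std::string& fallback) {
  const char* value = std::getenv(name);
  return value == nullptr ? fallback : std::string(value);
}

// Reads the three variables from the command line above. The defaults are
// placeholders chosen for this sketch, not values confirmed by the thread.
HdfsTestConfig LoadHdfsTestConfig() {
  HdfsTestConfig config;
  config.host = EnvOr("ARROW_HDFS_TEST_HOST", "localhost");
  config.port = std::stoi(EnvOr("ARROW_HDFS_TEST_PORT", "20500"));
  config.user = EnvOr("ARROW_HDFS_TEST_USER", "root");
  return config;
}
```

Run with the variables from the log set, such a loader would yield host impala, port 9000 and user hdfs.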



[jira] [Commented] (ARROW-805) listing empty HDFS directory returns an error instead of returning empty list

2017-04-11 Thread Leif Walsh (JIRA)

[ 
https://issues.apache.org/jira/browse/ARROW-805?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15965024#comment-15965024
 ] 

Leif Walsh commented on ARROW-805:
--

Good news, everyone!  I got these tests to pass.  I had to run them in a 
separate Docker container linked to the HDFS one, though, which isn't 
great.  I'll try my hand at automating it tonight but don't expect to get that 
far.

This PR has my changes but not the CI changes: 
https://github.com/apache/arrow/pull/528

I think we should deal with CI in a separate PR, though I will start working on 
it now.



[jira] [Commented] (ARROW-720) [java] arrow should not have a dependency on slf4j bridges in compile

2017-04-11 Thread Julien Le Dem (JIRA)

[ 
https://issues.apache.org/jira/browse/ARROW-720?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15964815#comment-15964815
 ] 

Julien Le Dem commented on ARROW-720:
-

Possibly, although it's more related to this:
https://github.com/apache/arrow/commit/55d8f99c351c22c2357924b4e70fcef7c8fd119a
I think it is an interaction between the version of Maven and Maven plugins that 
use incompatible versions of slf4j.
You can add the parameter -Dcheckstyle.skip as a workaround.

> [java] arrow should not have a dependency on slf4j bridges in compile
> -
>
> Key: ARROW-720
> URL: https://issues.apache.org/jira/browse/ARROW-720
> Project: Apache Arrow
>  Issue Type: Bug
>  Components: Java - Vectors
>Reporter: Julien Le Dem
>Assignee: Julien Le Dem
>
> See: 
> https://github.com/apache/arrow/blob/d2d27555b4b2f3f0ba26539211bfe8b4d1b52481/java/pom.xml#L472
> As a library, Arrow should not pick the direction of the bridges.
> We should move those to test scope.





[jira] [Resolved] (ARROW-808) [GLib] Remove needless ignore entries

2017-04-11 Thread Wes McKinney (JIRA)

 [ 
https://issues.apache.org/jira/browse/ARROW-808?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wes McKinney resolved ARROW-808.

Resolution: Fixed

Issue resolved by pull request 527
[https://github.com/apache/arrow/pull/527]

> [GLib] Remove needless ignore entries
> -
>
> Key: ARROW-808
> URL: https://issues.apache.org/jira/browse/ARROW-808
> Project: Apache Arrow
>  Issue Type: Improvement
>  Components: GLib
>Reporter: Kouhei Sutou
>Priority: Minor
>






[jira] [Updated] (ARROW-808) [GLib] Remove needless ignore entries

2017-04-11 Thread Wes McKinney (JIRA)

 [ 
https://issues.apache.org/jira/browse/ARROW-808?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wes McKinney updated ARROW-808:
---
Fix Version/s: 0.3.0

> [GLib] Remove needless ignore entries
> -
>
> Key: ARROW-808
> URL: https://issues.apache.org/jira/browse/ARROW-808
> Project: Apache Arrow
>  Issue Type: Improvement
>  Components: GLib
>Reporter: Kouhei Sutou
>Assignee: Kouhei Sutou
>Priority: Minor
> Fix For: 0.3.0
>
>






[jira] [Assigned] (ARROW-808) [GLib] Remove needless ignore entries

2017-04-11 Thread Wes McKinney (JIRA)

 [ 
https://issues.apache.org/jira/browse/ARROW-808?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wes McKinney reassigned ARROW-808:
--

Assignee: Kouhei Sutou

> [GLib] Remove needless ignore entries
> -
>
> Key: ARROW-808
> URL: https://issues.apache.org/jira/browse/ARROW-808
> Project: Apache Arrow
>  Issue Type: Improvement
>  Components: GLib
>Reporter: Kouhei Sutou
>Assignee: Kouhei Sutou
>Priority: Minor
> Fix For: 0.3.0
>
>






[jira] [Assigned] (ARROW-807) [GLib] Update "Since" tag

2017-04-11 Thread Wes McKinney (JIRA)

 [ 
https://issues.apache.org/jira/browse/ARROW-807?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wes McKinney reassigned ARROW-807:
--

Assignee: Kouhei Sutou

> [GLib] Update "Since" tag
> -
>
> Key: ARROW-807
> URL: https://issues.apache.org/jira/browse/ARROW-807
> Project: Apache Arrow
>  Issue Type: Improvement
>  Components: GLib
>Reporter: Kouhei Sutou
>Assignee: Kouhei Sutou
>Priority: Minor
> Fix For: 0.3.0
>
>






[jira] [Resolved] (ARROW-807) [GLib] Update "Since" tag

2017-04-11 Thread Wes McKinney (JIRA)

 [ 
https://issues.apache.org/jira/browse/ARROW-807?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wes McKinney resolved ARROW-807.

Resolution: Fixed

Issue resolved by pull request 526
[https://github.com/apache/arrow/pull/526]

> [GLib] Update "Since" tag
> -
>
> Key: ARROW-807
> URL: https://issues.apache.org/jira/browse/ARROW-807
> Project: Apache Arrow
>  Issue Type: Improvement
>  Components: GLib
>Reporter: Kouhei Sutou
>Priority: Minor
>






[jira] [Updated] (ARROW-807) [GLib] Update "Since" tag

2017-04-11 Thread Wes McKinney (JIRA)

 [ 
https://issues.apache.org/jira/browse/ARROW-807?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wes McKinney updated ARROW-807:
---
Fix Version/s: 0.3.0

> [GLib] Update "Since" tag
> -
>
> Key: ARROW-807
> URL: https://issues.apache.org/jira/browse/ARROW-807
> Project: Apache Arrow
>  Issue Type: Improvement
>  Components: GLib
>Reporter: Kouhei Sutou
>Assignee: Kouhei Sutou
>Priority: Minor
> Fix For: 0.3.0
>
>






[jira] [Updated] (ARROW-806) [GLib] Support add/remove a column from table

2017-04-11 Thread Wes McKinney (JIRA)

 [ 
https://issues.apache.org/jira/browse/ARROW-806?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wes McKinney updated ARROW-806:
---
Fix Version/s: 0.3.0

> [GLib] Support add/remove a column from table
> -
>
> Key: ARROW-806
> URL: https://issues.apache.org/jira/browse/ARROW-806
> Project: Apache Arrow
>  Issue Type: Improvement
>  Components: GLib
>Reporter: Kouhei Sutou
>Assignee: Kouhei Sutou
> Fix For: 0.3.0
>
>






[jira] [Resolved] (ARROW-806) [GLib] Support add/remove a column from table

2017-04-11 Thread Wes McKinney (JIRA)

 [ 
https://issues.apache.org/jira/browse/ARROW-806?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wes McKinney resolved ARROW-806.

Resolution: Fixed

Issue resolved by pull request 525
[https://github.com/apache/arrow/pull/525]

> [GLib] Support add/remove a column from table
> -
>
> Key: ARROW-806
> URL: https://issues.apache.org/jira/browse/ARROW-806
> Project: Apache Arrow
>  Issue Type: Improvement
>  Components: GLib
>Reporter: Kouhei Sutou
>






[jira] [Assigned] (ARROW-806) [GLib] Support add/remove a column from table

2017-04-11 Thread Wes McKinney (JIRA)

 [ 
https://issues.apache.org/jira/browse/ARROW-806?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wes McKinney reassigned ARROW-806:
--

Assignee: Kouhei Sutou

> [GLib] Support add/remove a column from table
> -
>
> Key: ARROW-806
> URL: https://issues.apache.org/jira/browse/ARROW-806
> Project: Apache Arrow
>  Issue Type: Improvement
>  Components: GLib
>Reporter: Kouhei Sutou
>Assignee: Kouhei Sutou
> Fix For: 0.3.0
>
>






[jira] [Commented] (ARROW-801) [JAVA] Provide direct access to underlying buffer memory addresses in consistent way without generating garbage or large amount indirections

2017-04-11 Thread Julien Le Dem (JIRA)

[ 
https://issues.apache.org/jira/browse/ARROW-801?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15964572#comment-15964572
 ] 

Julien Le Dem commented on ARROW-801:
-

We'd need to cast only if the types are not declared properly. 
java.util.Iterator is pretty old. Guava and Scala, for example, clearly separate 
mutable and immutable collections without needing to cast. 
[Fix|Variable]WidthVector can be subclasses of FieldVector. I think we should 
try before declaring it's too much work.

> [JAVA] Provide direct access to underlying buffer memory addresses in 
> consistent way without generating garbage or large amount indirections
> 
>
> Key: ARROW-801
> URL: https://issues.apache.org/jira/browse/ARROW-801
> Project: Apache Arrow
>  Issue Type: Bug
>  Components: Java - Vectors
>Reporter: Jacques Nadeau
>
> When working with Arrow vectors recently, we observed a situation where our 
> time was dominated by calls to getFieldBuffers() to be able to retrieve 
> memory addresses (22s out of 26s total for a piece of code). We should 
> provide a direct mechanism to access this data so we can avoid all the extra 
> indirection and object creation. 
> A proposal:
> getBitAddress();
> getDataAddress();
> getOffsetAddress();
> These interfaces would be made available at the FieldVector interface and 
> simply throw UnsupportedOperationException where not supported.
> Unsupported Operations: 
> data for list type
> offset for fixed width types
> data and offset for struct type
> data for union type





[jira] [Commented] (ARROW-801) [JAVA] Provide direct access to underlying buffer memory addresses in consistent way without generating garbage or large amount indirections

2017-04-11 Thread Jacques Nadeau (JIRA)

[ 
https://issues.apache.org/jira/browse/ARROW-801?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15964505#comment-15964505
 ] 

Jacques Nadeau commented on ARROW-801:
--

Since these are low-level interfaces, I'd prefer to not have to constantly deal 
with type casting. (Much like java.util.Iterator has remove() even if it isn't 
always supported.) 

Using the [Fix|Variable]WidthVector interfaces is especially problematic since 
they aren't connected to FieldVector.  I think the interface hierarchy could be 
improved but I'd prefer not to block this work on that (since I think that is a 
longer-term piece of work).



[jira] [Commented] (ARROW-805) listing empty HDFS directory returns an error instead of returning empty list

2017-04-11 Thread Leif Walsh (JIRA)

[ 
https://issues.apache.org/jira/browse/ARROW-805?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15964499#comment-15964499
 ] 

Leif Walsh commented on ARROW-805:
--

Okay, I got it set up but got a failure. I think it might be failing a 
permission check?  Not sure yet.

{noformat}
ARROW_HDFS_TEST_PORT=9000 ARROW_HDFS_TEST_USER=hdfs ARROW_HDFS_TEST_HOST=impala debug/io-hdfs-test
[==] Running 18 tests from 2 test cases.
[--] Global test environment set-up.
[--] 9 tests from TestHdfsClient/0, where TypeParam = arrow::io::JNIDriver
[ RUN  ] TestHdfsClient/0.ConnectsAgain
2017-04-11 11:13:25,511 WARN  [main] util.NativeCodeLoader (NativeCodeLoader.java:(62)) - Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
[   OK ] TestHdfsClient/0.ConnectsAgain (1381 ms)
[ RUN  ] TestHdfsClient/0.CreateDirectory
/home/leif/git/arrow/cpp/src/arrow/io/io-hdfs-test.cc:174: Failure
Value of: s.ok()
  Actual: false
Expected: true
[  FAILED  ] TestHdfsClient/0.CreateDirectory, where TypeParam = arrow::io::JNIDriver (201 ms)
[ RUN  ] TestHdfsClient/0.GetCapacityUsed
[   OK ] TestHdfsClient/0.GetCapacityUsed (137 ms)
[ RUN  ] TestHdfsClient/0.GetPathInfo
[   OK ] TestHdfsClient/0.GetPathInfo (799 ms)
[ RUN  ] TestHdfsClient/0.AppendToFile
2017-04-11 11:13:27,536 WARN  [Thread-15] hdfs.DFSClient (DFSOutputStream.java:run(557)) - DataStreamer Exception
java.io.IOException: Failed to replace a bad datanode on the existing pipeline due to no more good datanodes being available to try. (Nodes: current=[DatanodeInfoWithStorage[172.17.0.2:50010,DS-12c742d4-9f2f-46b4-a512-ee1a1ebd732b,DISK]], original=[DatanodeInfoWithStorage[172.17.0.2:50010,DS-12c742d4-9f2f-46b4-a512-ee1a1ebd732b,DISK]]). The current failed datanode replacement policy is DEFAULT, and a client may configure this via 'dfs.client.block.write.replace-datanode-on-failure.policy' in its configuration.
    at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.findNewDatanode(DFSOutputStream.java:914)
    at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.addDatanode2ExistingPipeline(DFSOutputStream.java:988)
    at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.setupPipelineForAppendOrRecovery(DFSOutputStream.java:1156)
    at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:454)
FSDataOutputStream#close error:
java.io.IOException: Failed to replace a bad datanode on the existing pipeline due to no more good datanodes being available to try. (Nodes: current=[DatanodeInfoWithStorage[172.17.0.2:50010,DS-12c742d4-9f2f-46b4-a512-ee1a1ebd732b,DISK]], original=[DatanodeInfoWithStorage[172.17.0.2:50010,DS-12c742d4-9f2f-46b4-a512-ee1a1ebd732b,DISK]]). The current failed datanode replacement policy is DEFAULT, and a client may configure this via 'dfs.client.block.write.replace-datanode-on-failure.policy' in its configuration.
    at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.findNewDatanode(DFSOutputStream.java:914)
    at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.addDatanode2ExistingPipeline(DFSOutputStream.java:988)
    at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.setupPipelineForAppendOrRecovery(DFSOutputStream.java:1156)
    at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:454)
/home/leif/git/arrow/cpp/src/arrow/io/io-hdfs-test.cc:236: Failure
Failed
IOError: HDFS: CloseFile failed
2017-04-11 11:13:27,550 ERROR [main] hdfs.DFSClient (DFSClient.java:closeAllFilesBeingWritten(940)) - Failed to close inode 16419
java.io.IOException: Failed to replace a bad datanode on the existing pipeline due to no more good datanodes being available to try. (Nodes: current=[DatanodeInfoWithStorage[172.17.0.2:50010,DS-12c742d4-9f2f-46b4-a512-ee1a1ebd732b,DISK]], original=[DatanodeInfoWithStorage[172.17.0.2:50010,DS-12c742d4-9f2f-46b4-a512-ee1a1ebd732b,DISK]]). The current failed datanode replacement policy is DEFAULT, and a client may configure this via 'dfs.client.block.write.replace-datanode-on-failure.policy' in its configuration.
    at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.findNewDatanode(DFSOutputStream.java:914)
    at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.addDatanode2ExistingPipeline(DFSOutputStream.java:988)
    at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.setupPipelineForAppendOrRecovery(DFSOutputStream.java:1156)
    at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:454)
[  FAILED  ] TestHdfsClient/0.AppendToFile, where TypeParam = arrow::io::JNIDriver (207 ms)
{noformat}

[~cpcloud] maybe we can look together this afternoon?


[jira] [Commented] (ARROW-805) listing empty HDFS directory returns an error instead of returning empty list

2017-04-11 Thread Wes McKinney (JIRA)

[ 
https://issues.apache.org/jira/browse/ARROW-805?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15964461#comment-15964461
 ] 

Wes McKinney commented on ARROW-805:


Check out https://github.com/cloudera/ibis/blob/master/circle.yml



[jira] [Commented] (ARROW-805) listing empty HDFS directory returns an error instead of returning empty list

2017-04-11 Thread Leif Walsh (JIRA)

[ 
https://issues.apache.org/jira/browse/ARROW-805?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15964458#comment-15964458
 ] 

Leif Walsh commented on ARROW-805:
--

How should I use that?  docker run bash and build/run tests in there?  Any env 
vars I need to set?



[jira] [Commented] (ARROW-646) Cache miniconda packages

2017-04-11 Thread Wes McKinney (JIRA)

[ 
https://issues.apache.org/jira/browse/ARROW-646?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15964454#comment-15964454
 ] 

Wes McKinney commented on ARROW-646:


[~cpcloud] [~leif] I marked this for 0.3 -- with the rampant conda / 
anaconda.org timeouts, if we can cache our conda tarballs using Travis CI 
caching we may be able to both speed up builds and reduce flakiness. I think 
this may involve changing the conda cache directory to be someplace other than 
the miniconda/ install in 
https://github.com/apache/arrow/blob/master/ci/travis_install_conda.sh#L27. 

> Cache miniconda packages
> 
>
> Key: ARROW-646
> URL: https://issues.apache.org/jira/browse/ARROW-646
> Project: Apache Arrow
>  Issue Type: Improvement
>Reporter: Uwe L. Korn
>Assignee: Uwe L. Korn
> Fix For: 0.3.0
>
>
> The unpacking of freshly downloaded conda packages takes up a significant 
> share of our build time. Locally, caching them made the conda env creation 
> for Parquet go from 1 min to 14 s.





[jira] [Commented] (ARROW-805) listing empty HDFS directory returns an error instead of returning empty list

2017-04-11 Thread Phillip Cloud (JIRA)

[ 
https://issues.apache.org/jira/browse/ARROW-805?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15964455#comment-15964455
 ] 

Phillip Cloud commented on ARROW-805:
-

Here's a link to the image: https://hub.docker.com/r/cpcloud86/impala/

> listing empty HDFS directory returns an error instead of returning empty list
> -
>
> Key: ARROW-805
> URL: https://issues.apache.org/jira/browse/ARROW-805
> Project: Apache Arrow
>  Issue Type: Bug
>Affects Versions: 0.2.0, 0.3.0
>Reporter: Leif Walsh
>Assignee: Leif Walsh
> Fix For: 0.3.0
>
>
> https://github.com/apache/arrow/blob/master/cpp/src/arrow/io/hdfs.cc#L409-L410
> {code}
> if (entries == nullptr) {
>   // If the directory is empty, entries is NULL but errno is 0. Non-zero
>   // errno indicates error
>   //
>   // Note: errno is thread-local
>   if (errno == 0) { num_entries = 0; }
>   { return Status::IOError("HDFS: list directory failed"); }
> }
> {code}
> I think that should have an else:
> {code}
> if (entries == nullptr) {
>   // If the directory is empty, entries is NULL but errno is 0. Non-zero
>   // errno indicates error
>   //
>   // Note: errno is thread-local
>   if (errno == 0) {
>     num_entries = 0;
>   } else {
>     return Status::IOError("HDFS: list directory failed");
>   }
> }
> {code}





[jira] [Updated] (ARROW-646) Cache miniconda packages

2017-04-11 Thread Wes McKinney (JIRA)

 [ 
https://issues.apache.org/jira/browse/ARROW-646?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wes McKinney updated ARROW-646:
---
Fix Version/s: 0.3.0

> Cache miniconda packages
> 
>
> Key: ARROW-646
> URL: https://issues.apache.org/jira/browse/ARROW-646
> Project: Apache Arrow
>  Issue Type: Improvement
>Reporter: Uwe L. Korn
>Assignee: Uwe L. Korn
> Fix For: 0.3.0
>
>
> The unpacking of freshly downloaded conda packages takes up a significant 
> share of our build time. Locally, caching them made the conda env creation 
> for Parquet go from 1 min to 14 s.





[jira] [Commented] (ARROW-805) listing empty HDFS directory returns an error instead of returning empty list

2017-04-11 Thread Wes McKinney (JIRA)

[ 
https://issues.apache.org/jira/browse/ARROW-805?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15964452#comment-15964452
 ] 

Wes McKinney commented on ARROW-805:


[~cpcloud] has a docker image with HDFS (and Hive/Impala) that we may be able 
to use for testing HDFS in a portable way. I currently run the tests against a 
local cluster, but it's pretty ad hoc. 

> listing empty HDFS directory returns an error instead of returning empty list
> -
>
> Key: ARROW-805
> URL: https://issues.apache.org/jira/browse/ARROW-805
> Project: Apache Arrow
>  Issue Type: Bug
>Affects Versions: 0.2.0, 0.3.0
>Reporter: Leif Walsh
>Assignee: Leif Walsh
> Fix For: 0.3.0
>
>
> https://github.com/apache/arrow/blob/master/cpp/src/arrow/io/hdfs.cc#L409-L410
> {code}
> if (entries == nullptr) {
>   // If the directory is empty, entries is NULL but errno is 0. Non-zero
>   // errno indicates error
>   //
>   // Note: errno is thread-local
>   if (errno == 0) { num_entries = 0; }
>   { return Status::IOError("HDFS: list directory failed"); }
> }
> {code}
> I think that should have an else:
> {code}
> if (entries == nullptr) {
>   // If the directory is empty, entries is NULL but errno is 0. Non-zero
>   // errno indicates error
>   //
>   // Note: errno is thread-local
>   if (errno == 0) {
>     num_entries = 0;
>   } else {
>     return Status::IOError("HDFS: list directory failed");
>   }
> }
> {code}





[jira] [Assigned] (ARROW-805) listing empty HDFS directory returns an error instead of returning empty list

2017-04-11 Thread Leif Walsh (JIRA)

 [ 
https://issues.apache.org/jira/browse/ARROW-805?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Leif Walsh reassigned ARROW-805:


Assignee: Leif Walsh

> listing empty HDFS directory returns an error instead of returning empty list
> -
>
> Key: ARROW-805
> URL: https://issues.apache.org/jira/browse/ARROW-805
> Project: Apache Arrow
>  Issue Type: Bug
>Affects Versions: 0.2.0, 0.3.0
>Reporter: Leif Walsh
>Assignee: Leif Walsh
> Fix For: 0.3.0
>
>
> https://github.com/apache/arrow/blob/master/cpp/src/arrow/io/hdfs.cc#L409-L410
> {code}
> if (entries == nullptr) {
>   // If the directory is empty, entries is NULL but errno is 0. Non-zero
>   // errno indicates error
>   //
>   // Note: errno is thread-local
>   if (errno == 0) { num_entries = 0; }
>   { return Status::IOError("HDFS: list directory failed"); }
> }
> {code}
> I think that should have an else:
> {code}
> if (entries == nullptr) {
>   // If the directory is empty, entries is NULL but errno is 0. Non-zero
>   // errno indicates error
>   //
>   // Note: errno is thread-local
>   if (errno == 0) {
>     num_entries = 0;
>   } else {
>     return Status::IOError("HDFS: list directory failed");
>   }
> }
> {code}





[jira] [Commented] (ARROW-805) listing empty HDFS directory returns an error instead of returning empty list

2017-04-11 Thread Wes McKinney (JIRA)

[ 
https://issues.apache.org/jira/browse/ARROW-805?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=1596#comment-1596
 ] 

Wes McKinney commented on ARROW-805:


Looks right, yep

> listing empty HDFS directory returns an error instead of returning empty list
> -
>
> Key: ARROW-805
> URL: https://issues.apache.org/jira/browse/ARROW-805
> Project: Apache Arrow
>  Issue Type: Bug
>Affects Versions: 0.2.0, 0.3.0
>Reporter: Leif Walsh
> Fix For: 0.3.0
>
>
> https://github.com/apache/arrow/blob/master/cpp/src/arrow/io/hdfs.cc#L409-L410
> {code}
> if (entries == nullptr) {
>   // If the directory is empty, entries is NULL but errno is 0. Non-zero
>   // errno indicates error
>   //
>   // Note: errno is thread-local
>   if (errno == 0) { num_entries = 0; }
>   { return Status::IOError("HDFS: list directory failed"); }
> }
> {code}
> I think that should have an else:
> {code}
> if (entries == nullptr) {
>   // If the directory is empty, entries is NULL but errno is 0. Non-zero
>   // errno indicates error
>   //
>   // Note: errno is thread-local
>   if (errno == 0) {
>     num_entries = 0;
>   } else {
>     return Status::IOError("HDFS: list directory failed");
>   }
> }
> {code}





[jira] [Commented] (ARROW-805) listing empty HDFS directory returns an error instead of returning empty list

2017-04-11 Thread Leif Walsh (JIRA)

[ 
https://issues.apache.org/jira/browse/ARROW-805?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15964439#comment-15964439
 ] 

Leif Walsh commented on ARROW-805:
--

Does that fix look right? If so I can implement and test. 

> listing empty HDFS directory returns an error instead of returning empty list
> -
>
> Key: ARROW-805
> URL: https://issues.apache.org/jira/browse/ARROW-805
> Project: Apache Arrow
>  Issue Type: Bug
>Affects Versions: 0.2.0, 0.3.0
>Reporter: Leif Walsh
> Fix For: 0.3.0
>
>
> https://github.com/apache/arrow/blob/master/cpp/src/arrow/io/hdfs.cc#L409-L410
> {code}
> if (entries == nullptr) {
>   // If the directory is empty, entries is NULL but errno is 0. Non-zero
>   // errno indicates error
>   //
>   // Note: errno is thread-local
>   if (errno == 0) { num_entries = 0; }
>   { return Status::IOError("HDFS: list directory failed"); }
> }
> {code}
> I think that should have an else:
> {code}
> if (entries == nullptr) {
>   // If the directory is empty, entries is NULL but errno is 0. Non-zero
>   // errno indicates error
>   //
>   // Note: errno is thread-local
>   if (errno == 0) {
>     num_entries = 0;
>   } else {
>     return Status::IOError("HDFS: list directory failed");
>   }
> }
> {code}





[jira] [Created] (ARROW-808) [GLib] Remove needless ignore entries

2017-04-11 Thread Kouhei Sutou (JIRA)
Kouhei Sutou created ARROW-808:
--

 Summary: [GLib] Remove needless ignore entries
 Key: ARROW-808
 URL: https://issues.apache.org/jira/browse/ARROW-808
 Project: Apache Arrow
  Issue Type: Improvement
  Components: GLib
Reporter: Kouhei Sutou
Priority: Minor








[jira] [Updated] (ARROW-804) [GLib] Update build document

2017-04-11 Thread Wes McKinney (JIRA)

 [ 
https://issues.apache.org/jira/browse/ARROW-804?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wes McKinney updated ARROW-804:
---
Fix Version/s: 0.3.0

> [GLib] Update build document
> 
>
> Key: ARROW-804
> URL: https://issues.apache.org/jira/browse/ARROW-804
> Project: Apache Arrow
>  Issue Type: Improvement
>  Components: GLib
>Reporter: Kouhei Sutou
>Assignee: Kouhei Sutou
>Priority: Minor
> Fix For: 0.3.0
>
>






[jira] [Created] (ARROW-807) [GLib] Update "Since" tag

2017-04-11 Thread Kouhei Sutou (JIRA)
Kouhei Sutou created ARROW-807:
--

 Summary: [GLib] Update "Since" tag
 Key: ARROW-807
 URL: https://issues.apache.org/jira/browse/ARROW-807
 Project: Apache Arrow
  Issue Type: Improvement
  Components: GLib
Reporter: Kouhei Sutou
Priority: Minor








[jira] [Resolved] (ARROW-804) [GLib] Update build document

2017-04-11 Thread Wes McKinney (JIRA)

 [ 
https://issues.apache.org/jira/browse/ARROW-804?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wes McKinney resolved ARROW-804.

Resolution: Fixed

Issue resolved by pull request 524
[https://github.com/apache/arrow/pull/524]

> [GLib] Update build document
> 
>
> Key: ARROW-804
> URL: https://issues.apache.org/jira/browse/ARROW-804
> Project: Apache Arrow
>  Issue Type: Improvement
>  Components: GLib
>Reporter: Kouhei Sutou
>Priority: Minor
>






[jira] [Assigned] (ARROW-804) [GLib] Update build document

2017-04-11 Thread Wes McKinney (JIRA)

 [ 
https://issues.apache.org/jira/browse/ARROW-804?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wes McKinney reassigned ARROW-804:
--

Assignee: Kouhei Sutou

> [GLib] Update build document
> 
>
> Key: ARROW-804
> URL: https://issues.apache.org/jira/browse/ARROW-804
> Project: Apache Arrow
>  Issue Type: Improvement
>  Components: GLib
>Reporter: Kouhei Sutou
>Assignee: Kouhei Sutou
>Priority: Minor
>






[jira] [Assigned] (ARROW-803) [GLib] Update package repository URL

2017-04-11 Thread Wes McKinney (JIRA)

 [ 
https://issues.apache.org/jira/browse/ARROW-803?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wes McKinney reassigned ARROW-803:
--

Assignee: Kouhei Sutou

> [GLib] Update package repository URL
> 
>
> Key: ARROW-803
> URL: https://issues.apache.org/jira/browse/ARROW-803
> Project: Apache Arrow
>  Issue Type: Improvement
>  Components: GLib
>Reporter: Kouhei Sutou
>Assignee: Kouhei Sutou
>Priority: Minor
> Fix For: 0.3.0
>
>






[jira] [Updated] (ARROW-803) [GLib] Update package repository URL

2017-04-11 Thread Wes McKinney (JIRA)

 [ 
https://issues.apache.org/jira/browse/ARROW-803?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wes McKinney updated ARROW-803:
---
Fix Version/s: 0.3.0

> [GLib] Update package repository URL
> 
>
> Key: ARROW-803
> URL: https://issues.apache.org/jira/browse/ARROW-803
> Project: Apache Arrow
>  Issue Type: Improvement
>  Components: GLib
>Reporter: Kouhei Sutou
>Assignee: Kouhei Sutou
>Priority: Minor
> Fix For: 0.3.0
>
>






[jira] [Resolved] (ARROW-803) [GLib] Update package repository URL

2017-04-11 Thread Wes McKinney (JIRA)

 [ 
https://issues.apache.org/jira/browse/ARROW-803?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wes McKinney resolved ARROW-803.

Resolution: Fixed

Issue resolved by pull request 523
[https://github.com/apache/arrow/pull/523]

> [GLib] Update package repository URL
> 
>
> Key: ARROW-803
> URL: https://issues.apache.org/jira/browse/ARROW-803
> Project: Apache Arrow
>  Issue Type: Improvement
>  Components: GLib
>Reporter: Kouhei Sutou
>Priority: Minor
>






[jira] [Assigned] (ARROW-802) [GLib] Add read examples

2017-04-11 Thread Wes McKinney (JIRA)

 [ 
https://issues.apache.org/jira/browse/ARROW-802?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wes McKinney reassigned ARROW-802:
--

Assignee: Kouhei Sutou

> [GLib] Add read examples
> 
>
> Key: ARROW-802
> URL: https://issues.apache.org/jira/browse/ARROW-802
> Project: Apache Arrow
>  Issue Type: Improvement
>  Components: GLib
>Reporter: Kouhei Sutou
>Assignee: Kouhei Sutou
>Priority: Minor
> Fix For: 0.3.0
>
>






[jira] [Updated] (ARROW-802) [GLib] Add read examples

2017-04-11 Thread Wes McKinney (JIRA)

 [ 
https://issues.apache.org/jira/browse/ARROW-802?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wes McKinney updated ARROW-802:
---
Fix Version/s: 0.3.0

> [GLib] Add read examples
> 
>
> Key: ARROW-802
> URL: https://issues.apache.org/jira/browse/ARROW-802
> Project: Apache Arrow
>  Issue Type: Improvement
>  Components: GLib
>Reporter: Kouhei Sutou
>Assignee: Kouhei Sutou
>Priority: Minor
> Fix For: 0.3.0
>
>






[jira] [Resolved] (ARROW-802) [GLib] Add read examples

2017-04-11 Thread Wes McKinney (JIRA)

 [ 
https://issues.apache.org/jira/browse/ARROW-802?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wes McKinney resolved ARROW-802.

Resolution: Fixed

Issue resolved by pull request 522
[https://github.com/apache/arrow/pull/522]

> [GLib] Add read examples
> 
>
> Key: ARROW-802
> URL: https://issues.apache.org/jira/browse/ARROW-802
> Project: Apache Arrow
>  Issue Type: Improvement
>  Components: GLib
>Reporter: Kouhei Sutou
>Priority: Minor
>






[jira] [Updated] (ARROW-805) listing empty HDFS directory returns an error instead of returning empty list

2017-04-11 Thread Wes McKinney (JIRA)

 [ 
https://issues.apache.org/jira/browse/ARROW-805?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wes McKinney updated ARROW-805:
---
Fix Version/s: 0.3.0

> listing empty HDFS directory returns an error instead of returning empty list
> -
>
> Key: ARROW-805
> URL: https://issues.apache.org/jira/browse/ARROW-805
> Project: Apache Arrow
>  Issue Type: Bug
>Affects Versions: 0.2.0, 0.3.0
>Reporter: Leif Walsh
> Fix For: 0.3.0
>
>
> https://github.com/apache/arrow/blob/master/cpp/src/arrow/io/hdfs.cc#L409-L410
> {code}
> if (entries == nullptr) {
>   // If the directory is empty, entries is NULL but errno is 0. Non-zero
>   // errno indicates error
>   //
>   // Note: errno is thread-local
>   if (errno == 0) { num_entries = 0; }
>   { return Status::IOError("HDFS: list directory failed"); }
> }
> {code}
> I think that should have an else:
> {code}
> if (entries == nullptr) {
>   // If the directory is empty, entries is NULL but errno is 0. Non-zero
>   // errno indicates error
>   //
>   // Note: errno is thread-local
>   if (errno == 0) {
>     num_entries = 0;
>   } else {
>     return Status::IOError("HDFS: list directory failed");
>   }
> }
> {code}





[jira] [Commented] (ARROW-805) listing empty HDFS directory returns an error instead of returning empty list

2017-04-11 Thread Wes McKinney (JIRA)

[ 
https://issues.apache.org/jira/browse/ARROW-805?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15964426#comment-15964426
 ] 

Wes McKinney commented on ARROW-805:


Marked for 0.3

> listing empty HDFS directory returns an error instead of returning empty list
> -
>
> Key: ARROW-805
> URL: https://issues.apache.org/jira/browse/ARROW-805
> Project: Apache Arrow
>  Issue Type: Bug
>Affects Versions: 0.2.0, 0.3.0
>Reporter: Leif Walsh
> Fix For: 0.3.0
>
>
> https://github.com/apache/arrow/blob/master/cpp/src/arrow/io/hdfs.cc#L409-L410
> {code}
> if (entries == nullptr) {
>   // If the directory is empty, entries is NULL but errno is 0. Non-zero
>   // errno indicates error
>   //
>   // Note: errno is thread-local
>   if (errno == 0) { num_entries = 0; }
>   { return Status::IOError("HDFS: list directory failed"); }
> }
> {code}
> I think that should have an else:
> {code}
> if (entries == nullptr) {
>   // If the directory is empty, entries is NULL but errno is 0. Non-zero
>   // errno indicates error
>   //
>   // Note: errno is thread-local
>   if (errno == 0) {
>     num_entries = 0;
>   } else {
>     return Status::IOError("HDFS: list directory failed");
>   }
> }
> {code}





[jira] [Created] (ARROW-806) [GLib] Support add/remove a column from table

2017-04-11 Thread Kouhei Sutou (JIRA)
Kouhei Sutou created ARROW-806:
--

 Summary: [GLib] Support add/remove a column from table
 Key: ARROW-806
 URL: https://issues.apache.org/jira/browse/ARROW-806
 Project: Apache Arrow
  Issue Type: Improvement
  Components: GLib
Reporter: Kouhei Sutou








[jira] [Created] (ARROW-805) listing empty HDFS directory returns an error instead of returning empty list

2017-04-11 Thread Leif Walsh (JIRA)
Leif Walsh created ARROW-805:


 Summary: listing empty HDFS directory returns an error instead of 
returning empty list
 Key: ARROW-805
 URL: https://issues.apache.org/jira/browse/ARROW-805
 Project: Apache Arrow
  Issue Type: Bug
Affects Versions: 0.2.0, 0.3.0
Reporter: Leif Walsh


https://github.com/apache/arrow/blob/master/cpp/src/arrow/io/hdfs.cc#L409-L410

{code}
if (entries == nullptr) {
  // If the directory is empty, entries is NULL but errno is 0. Non-zero
  // errno indicates error
  //
  // Note: errno is thread-local
  if (errno == 0) { num_entries = 0; }
  { return Status::IOError("HDFS: list directory failed"); }
}
{code}

I think that should have an else:

{code}
if (entries == nullptr) {
  // If the directory is empty, entries is NULL but errno is 0. Non-zero
  // errno indicates error
  //
  // Note: errno is thread-local
  if (errno == 0) {
    num_entries = 0;
  } else {
    return Status::IOError("HDFS: list directory failed");
  }
}
{code}
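
To make the difference between the two snippets concrete, here is a minimal, 
self-contained sketch that can be compiled without HDFS. Everything in it is an 
assumption made for illustration: Status is a stripped-down stand-in for 
arrow::Status, FakeListDirectory is a hypothetical replacement for the libhdfs 
listing call, and the paths are made up. Resetting errno before the call is 
also an addition of this sketch, not something taken from hdfs.cc.

```cpp
#include <cassert>
#include <cerrno>
#include <string>
#include <vector>

// Hypothetical stand-in for arrow::Status, reduced to what this sketch needs.
struct Status {
  bool ok;
  std::string message;
  static Status OK() { return Status{true, ""}; }
  static Status IOError(const std::string& msg) { return Status{false, msg}; }
};

// Hypothetical stand-in for the libhdfs listing call. As described in the
// issue, it returns nullptr both for an empty directory (errno untouched)
// and on failure (errno set to a non-zero value).
const char* const* FakeListDirectory(const std::string& path, int* num_entries) {
  static const char* const kEntries[] = {"part-0", "part-1"};
  *num_entries = 0;
  if (path == "/empty") return nullptr;
  if (path == "/missing") {
    errno = ENOENT;
    return nullptr;
  }
  *num_entries = 2;
  return kEntries;
}

// Reported behavior: the bare block after the `if` is not an else branch,
// so it runs whenever entries is nullptr, including for an empty directory.
Status ListDirectoryBuggy(const std::string& path, std::vector<std::string>* out) {
  assert(out != nullptr);
  int num_entries = 0;
  errno = 0;  // assumption of this sketch: clear any stale errno first
  const char* const* entries = FakeListDirectory(path, &num_entries);
  if (entries == nullptr) {
    if (errno == 0) { num_entries = 0; }
    { return Status::IOError("HDFS: list directory failed"); }
  }
  out->clear();
  for (int i = 0; i < num_entries; ++i) out->push_back(entries[i]);
  return Status::OK();
}

// Proposed behavior: only a non-zero errno is treated as an error.
Status ListDirectoryFixed(const std::string& path, std::vector<std::string>* out) {
  assert(out != nullptr);
  int num_entries = 0;
  errno = 0;  // assumption of this sketch: clear any stale errno first
  const char* const* entries = FakeListDirectory(path, &num_entries);
  if (entries == nullptr) {
    if (errno == 0) {
      num_entries = 0;  // empty directory: fall through with zero entries
    } else {
      return Status::IOError("HDFS: list directory failed");
    }
  }
  out->clear();
  for (int i = 0; i < num_entries; ++i) out->push_back(entries[i]);
  return Status::OK();
}
```

With these stand-ins, ListDirectoryBuggy("/empty", &out) reports an IOError, 
while ListDirectoryFixed("/empty", &out) succeeds and leaves out empty. Both 
versions still report an error for "/missing".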





[jira] [Created] (ARROW-804) [GLib] Update build document

2017-04-11 Thread Kouhei Sutou (JIRA)
Kouhei Sutou created ARROW-804:
--

 Summary: [GLib] Update build document
 Key: ARROW-804
 URL: https://issues.apache.org/jira/browse/ARROW-804
 Project: Apache Arrow
  Issue Type: Improvement
  Components: GLib
Reporter: Kouhei Sutou
Priority: Minor








[jira] [Created] (ARROW-803) [GLib] Update package repository URL

2017-04-11 Thread Kouhei Sutou (JIRA)
Kouhei Sutou created ARROW-803:
--

 Summary: [GLib] Update package repository URL
 Key: ARROW-803
 URL: https://issues.apache.org/jira/browse/ARROW-803
 Project: Apache Arrow
  Issue Type: Improvement
  Components: GLib
Reporter: Kouhei Sutou
Priority: Minor








[jira] [Created] (ARROW-802) [GLib] Add read examples

2017-04-11 Thread Kouhei Sutou (JIRA)
Kouhei Sutou created ARROW-802:
--

 Summary: [GLib] Add read examples
 Key: ARROW-802
 URL: https://issues.apache.org/jira/browse/ARROW-802
 Project: Apache Arrow
  Issue Type: Improvement
  Components: GLib
Reporter: Kouhei Sutou
Priority: Minor





