[jira] [Commented] (ARROW-1585) serialize_pandas round trip fails on integer columns

2017-10-07 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ARROW-1585?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16195807#comment-16195807
 ] 

ASF GitHub Bot commented on ARROW-1585:
---

Github user cpcloud commented on the issue:

https://github.com/apache/arrow/pull/1161
  
@wesm This is now passing on Travis. Waiting for appveyor to start: 
https://ci.appveyor.com/project/cpcloud/arrow/build/1.0.329


> serialize_pandas round trip fails on integer columns
> 
>
> Key: ARROW-1585
> URL: https://issues.apache.org/jira/browse/ARROW-1585
> Project: Apache Arrow
>  Issue Type: Bug
>  Components: Python
>Affects Versions: 0.7.0
>Reporter: Tom Augspurger
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 0.8.0
>
>
> This roundtrip fails, since the integer column label isn't converted back 
> from a string after deserializing
> {code:python}
> In [1]: import pandas as pd
> In [2]: import pyarrow as pa
> In [3]: pa.deserialize_pandas(pa.serialize_pandas(pd.DataFrame({"0": [1, 2]}))).columns
> Out[3]: Index(['0'], dtype='object')
> {code}
> That should be an {{ Int64Index([0]) }} for the columns.
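
One way a round trip like this can preserve integer labels is to record each label's original type next to its string form and cast back on load. The sketch below uses only the standard library; it is not pyarrow's implementation, and the helper names serialize_columns and deserialize_columns are hypothetical.

```python
import json
from typing import List, Union

Label = Union[int, str]


def serialize_columns(labels: List[Label]) -> str:
    """Store labels as strings plus the original type name of each label."""
    payload = {
        "names": [str(label) for label in labels],
        "types": [type(label).__name__ for label in labels],
    }
    return json.dumps(payload)


def deserialize_columns(data: str) -> List[Label]:
    """Rebuild labels, casting back to int where the metadata says so."""
    payload = json.loads(data)
    casts = {"int": int, "str": str}  # only these two label types are handled
    return [casts[t](n) for n, t in zip(payload["names"], payload["types"])]


restored = deserialize_columns(serialize_columns([0, "a"]))
```

Without the recorded type names, the label 0 would come back as the string '0', which is the symptom reported above.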



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (ARROW-1585) serialize_pandas round trip fails on integer columns

2017-10-07 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ARROW-1585?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16195812#comment-16195812
 ] 

ASF GitHub Bot commented on ARROW-1585:
---

Github user wesm commented on the issue:

https://github.com/apache/arrow/pull/1161
  
Sweet, will keep an eye on the build


> serialize_pandas round trip fails on integer columns
> 
>
> Key: ARROW-1585
> URL: https://issues.apache.org/jira/browse/ARROW-1585
> Project: Apache Arrow
>  Issue Type: Bug
>  Components: Python
>Affects Versions: 0.7.0
>Reporter: Tom Augspurger
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 0.8.0
>
>
> This roundtrip fails, since the integer column label isn't converted back 
> from a string after deserializing
> {code:python}
> In [1]: import pandas as pd
> In [2]: import pyarrow as pa
> In [3]: pa.deserialize_pandas(pa.serialize_pandas(pd.DataFrame({"0": [1, 2]}))).columns
> Out[3]: Index(['0'], dtype='object')
> {code}
> That should be an {{ Int64Index([0]) }} for the columns.





[jira] [Commented] (ARROW-1641) [C++] Do not include in public headers

2017-10-07 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ARROW-1641?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16195867#comment-16195867
 ] 

ASF GitHub Bot commented on ARROW-1641:
---

Github user asfgit closed the pull request at:

https://github.com/apache/arrow/pull/1165


> [C++] Do not include  in public headers
> --
>
> Key: ARROW-1641
> URL: https://issues.apache.org/jira/browse/ARROW-1641
> Project: Apache Arrow
>  Issue Type: Improvement
>  Components: C++
>Reporter: Wes McKinney
>Assignee: Wes McKinney
>  Labels: pull-request-available
> Fix For: 0.8.0
>
>
> This is a part of ARROW-1134





[jira] [Resolved] (ARROW-1641) [C++] Do not include in public headers

2017-10-07 Thread Wes McKinney (JIRA)

 [ 
https://issues.apache.org/jira/browse/ARROW-1641?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wes McKinney resolved ARROW-1641.
-
Resolution: Fixed

Issue resolved by pull request 1165
[https://github.com/apache/arrow/pull/1165]

> [C++] Do not include  in public headers
> --
>
> Key: ARROW-1641
> URL: https://issues.apache.org/jira/browse/ARROW-1641
> Project: Apache Arrow
>  Issue Type: Improvement
>  Components: C++
>Reporter: Wes McKinney
>Assignee: Wes McKinney
>  Labels: pull-request-available
> Fix For: 0.8.0
>
>
> This is a part of ARROW-1134





[jira] [Commented] (ARROW-1641) [C++] Do not include in public headers

2017-10-07 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ARROW-1641?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16195865#comment-16195865
 ] 

ASF GitHub Bot commented on ARROW-1641:
---

Github user wesm commented on the issue:

https://github.com/apache/arrow/pull/1165
  
+1


> [C++] Do not include  in public headers
> --
>
> Key: ARROW-1641
> URL: https://issues.apache.org/jira/browse/ARROW-1641
> Project: Apache Arrow
>  Issue Type: Improvement
>  Components: C++
>Reporter: Wes McKinney
>Assignee: Wes McKinney
>  Labels: pull-request-available
> Fix For: 0.8.0
>
>
> This is a part of ARROW-1134





[jira] [Commented] (ARROW-1250) [Python] Define API for user type checking of array types

2017-10-07 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ARROW-1250?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16195869#comment-16195869
 ] 

ASF GitHub Bot commented on ARROW-1250:
---

Github user asfgit closed the pull request at:

https://github.com/apache/arrow/pull/1183


> [Python] Define API for user type checking of array types
> -
>
> Key: ARROW-1250
> URL: https://issues.apache.org/jira/browse/ARROW-1250
> Project: Apache Arrow
>  Issue Type: New Feature
>  Components: Python
>Reporter: Wes McKinney
>Assignee: Wes McKinney
>  Labels: pull-request-available
> Fix For: 0.8.0
>
>
> We have some subclasses of {{pyarrow.lib.DataType}}, but we haven't been 
> designing with the intent of writing {{isinstance(arr.type, 
> pyarrow.TimestampType)}}. We should think about the public API for such 
> type-checking or other type of schema validation. 
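
One possible shape for such an API is a set of predicate functions keyed on a type id, so callers do not depend on concrete classes via isinstance. The sketch below is a standard-library illustration only; TypeId, DataType, is_timestamp and is_integer are hypothetical stand-ins defined here, not the classes or functions shipped by pyarrow.

```python
from enum import Enum, auto


class TypeId(Enum):
    INT64 = auto()
    TIMESTAMP = auto()


class DataType:
    """Hypothetical stand-in for a data type object exposing a type id."""

    def __init__(self, type_id: TypeId) -> None:
        self.id = type_id


def is_timestamp(data_type: DataType) -> bool:
    """True if the type is a timestamp, regardless of its concrete class."""
    return data_type.id is TypeId.TIMESTAMP


def is_integer(data_type: DataType) -> bool:
    """True if the type is the (only) integer type modelled in this sketch."""
    return data_type.id is TypeId.INT64


ts_type = DataType(TypeId.TIMESTAMP)
```

Callers would then write is_timestamp(arr_type) instead of an isinstance check, which leaves the class hierarchy free to change.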





[jira] [Resolved] (ARROW-1250) [Python] Define API for user type checking of array types

2017-10-07 Thread Wes McKinney (JIRA)

 [ 
https://issues.apache.org/jira/browse/ARROW-1250?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wes McKinney resolved ARROW-1250.
-
Resolution: Fixed

Issue resolved by pull request 1183
[https://github.com/apache/arrow/pull/1183]

> [Python] Define API for user type checking of array types
> -
>
> Key: ARROW-1250
> URL: https://issues.apache.org/jira/browse/ARROW-1250
> Project: Apache Arrow
>  Issue Type: New Feature
>  Components: Python
>Reporter: Wes McKinney
>Assignee: Wes McKinney
>  Labels: pull-request-available
> Fix For: 0.8.0
>
>
> We have some subclasses of {{pyarrow.lib.DataType}}, but we haven't been 
> designing with the intent of writing {{isinstance(arr.type, 
> pyarrow.TimestampType)}}. We should think about the public API for such 
> type-checking or other type of schema validation. 





[jira] [Resolved] (ARROW-1585) serialize_pandas round trip fails on integer columns

2017-10-07 Thread Wes McKinney (JIRA)

 [ 
https://issues.apache.org/jira/browse/ARROW-1585?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wes McKinney resolved ARROW-1585.
-
Resolution: Fixed

Issue resolved by pull request 1161
[https://github.com/apache/arrow/pull/1161]

> serialize_pandas round trip fails on integer columns
> 
>
> Key: ARROW-1585
> URL: https://issues.apache.org/jira/browse/ARROW-1585
> Project: Apache Arrow
>  Issue Type: Bug
>  Components: Python
>Affects Versions: 0.7.0
>Reporter: Tom Augspurger
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 0.8.0
>
>
> This roundtrip fails, since the integer column label isn't converted back 
> from a string after deserializing
> {code:python}
> In [1]: import pandas as pd
> In [2]: import pyarrow as pa
> In [3]: pa.deserialize_pandas(pa.serialize_pandas(pd.DataFrame({"0": [1, 2]}))).columns
> Out[3]: Index(['0'], dtype='object')
> {code}
> That should be an {{ Int64Index([0]) }} for the columns.





[jira] [Commented] (ARROW-1585) serialize_pandas round trip fails on integer columns

2017-10-07 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ARROW-1585?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16195886#comment-16195886
 ] 

ASF GitHub Bot commented on ARROW-1585:
---

Github user asfgit closed the pull request at:

https://github.com/apache/arrow/pull/1161


> serialize_pandas round trip fails on integer columns
> 
>
> Key: ARROW-1585
> URL: https://issues.apache.org/jira/browse/ARROW-1585
> Project: Apache Arrow
>  Issue Type: Bug
>  Components: Python
>Affects Versions: 0.7.0
>Reporter: Tom Augspurger
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 0.8.0
>
>
> This roundtrip fails, since the integer column label isn't converted back 
> from a string after deserializing
> {code:python}
> In [1]: import pandas as pd
> In [2]: import pyarrow as pa
> In [3]: pa.deserialize_pandas(pa.serialize_pandas(pd.DataFrame({"0": [1, 2]}))).columns
> Out[3]: Index(['0'], dtype='object')
> {code}
> That should be an {{ Int64Index([0]) }} for the columns.





[jira] [Assigned] (ARROW-1585) serialize_pandas round trip fails on integer columns

2017-10-07 Thread Wes McKinney (JIRA)

 [ 
https://issues.apache.org/jira/browse/ARROW-1585?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wes McKinney reassigned ARROW-1585:
---

Assignee: Phillip Cloud

> serialize_pandas round trip fails on integer columns
> 
>
> Key: ARROW-1585
> URL: https://issues.apache.org/jira/browse/ARROW-1585
> Project: Apache Arrow
>  Issue Type: Bug
>  Components: Python
>Affects Versions: 0.7.0
>Reporter: Tom Augspurger
>Assignee: Phillip Cloud
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 0.8.0
>
>
> This roundtrip fails, since the integer column label isn't converted back 
> from a string after deserializing
> {code:python}
> In [1]: import pandas as pd
> In [2]: import pyarrow as pa
> In [3]: pa.deserialize_pandas(pa.serialize_pandas(pd.DataFrame({"0": [1, 2]}))).columns
> Out[3]: Index(['0'], dtype='object')
> {code}
> That should be an {{ Int64Index([0]) }} for the columns.





[jira] [Resolved] (ARROW-1586) [PYTHON] serialize_pandas roundtrip loses columns name

2017-10-07 Thread Wes McKinney (JIRA)

 [ 
https://issues.apache.org/jira/browse/ARROW-1586?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wes McKinney resolved ARROW-1586.
-
Resolution: Fixed

Resolved in 
https://github.com/apache/arrow/commit/e31c2e376fb5df1d9143377b76b9a0d3f79ebbd4

> [PYTHON] serialize_pandas roundtrip loses columns name
> --
>
> Key: ARROW-1586
> URL: https://issues.apache.org/jira/browse/ARROW-1586
> Project: Apache Arrow
>  Issue Type: Bug
>  Components: Python
>Affects Versions: 0.7.0
>Reporter: Tom Augspurger
>Assignee: Phillip Cloud
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 0.8.0
>
>
> The serialize / deserialize roundtrip loses {{ df.columns.name }}
> {code:python}
> In [1]: import pandas as pd
> In [2]: import pyarrow as pa
> In [3]: df = pd.DataFrame([[1, 2]], columns=pd.Index(['a', 'b'], name='col_name'))
> In [4]: df.columns.name
> Out[4]: 'col_name'
> In [5]: pa.deserialize_pandas(pa.serialize_pandas(df)).columns.name
> {code}
> Is this in scope for pyarrow? I suspect it would require an update to the 
> pandas section of the Schema metadata.





[jira] [Commented] (ARROW-1585) serialize_pandas round trip fails on integer columns

2017-10-07 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ARROW-1585?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16195888#comment-16195888
 ] 

ASF GitHub Bot commented on ARROW-1585:
---

Github user wesm commented on the issue:

https://github.com/apache/arrow/pull/1161
  
Thanks!


> serialize_pandas round trip fails on integer columns
> 
>
> Key: ARROW-1585
> URL: https://issues.apache.org/jira/browse/ARROW-1585
> Project: Apache Arrow
>  Issue Type: Bug
>  Components: Python
>Affects Versions: 0.7.0
>Reporter: Tom Augspurger
>Assignee: Phillip Cloud
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 0.8.0
>
>
> This roundtrip fails, since the integer column label isn't converted back 
> from a string after deserializing
> {code:python}
> In [1]: import pandas as pd
> In [2]: import pyarrow as pa
> In [3]: pa.deserialize_pandas(pa.serialize_pandas(pd.DataFrame({"0": [1, 2]}))).columns
> Out[3]: Index(['0'], dtype='object')
> {code}
> That should be an {{ Int64Index([0]) }} for the columns.





[jira] [Created] (ARROW-1656) [C++] Endianness Macro is Incorrect on Windows

2017-10-07 Thread Phillip Cloud (JIRA)
Phillip Cloud created ARROW-1656:


 Summary: [C++] Endianness Macro is Incorrect on Windows
 Key: ARROW-1656
 URL: https://issues.apache.org/jira/browse/ARROW-1656
 Project: Apache Arrow
  Issue Type: Bug
  Components: C++
Affects Versions: 0.7.1
Reporter: Phillip Cloud
Assignee: Phillip Cloud
 Fix For: 0.8.0


This leads to all of the {{ToBigEndian}} implementations having no effect, when 
they should be byte-swapping.
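
To illustrate the failure mode, here is a minimal Python sketch (the issue itself concerns C++ macros, so this is an analogy, not the Arrow code). The helpers byte_swap32 and to_big_endian32 are hypothetical names; the host_is_little_endian flag plays the role of the endianness macro. When the flag is wrong on a little-endian host, the conversion silently becomes a no-op.

```python
import sys


def byte_swap32(value: int) -> int:
    """Reverse the byte order of an unsigned 32-bit integer."""
    return int.from_bytes(value.to_bytes(4, "little"), "big")


def to_big_endian32(value: int, host_is_little_endian: bool) -> int:
    """Return the 32-bit pattern whose in-memory bytes are big-endian.

    On a little-endian host the bytes must be swapped; on a big-endian
    host the value is already in the right order.
    """
    return byte_swap32(value) if host_is_little_endian else value


# Reliable detection: ask the runtime rather than trusting a build-time flag.
HOST_IS_LITTLE = sys.byteorder == "little"

value = 0x01020304
swapped = to_big_endian32(value, host_is_little_endian=True)
unswapped = to_big_endian32(value, host_is_little_endian=False)
```

Here swapped holds the reversed byte pattern, while unswapped equals the input, which is what a misdetected little-endian host would produce.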





[jira] [Assigned] (ARROW-1652) [JS] Batch hint for Vector.get

2017-10-07 Thread Paul Taylor (JIRA)

 [ 
https://issues.apache.org/jira/browse/ARROW-1652?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Paul Taylor reassigned ARROW-1652:
--

Assignee: Paul Taylor

> [JS] Batch hint for Vector.get
> --
>
> Key: ARROW-1652
> URL: https://issues.apache.org/jira/browse/ARROW-1652
> Project: Apache Arrow
>  Issue Type: Improvement
>  Components: JavaScript
>Reporter: Brian Hulette
>Assignee: Paul Taylor
>  Labels: Performance
>
> The {{Vector.get}} function just accepts an index, and looks up the 
> appropriate record batch on every call. This can lead to a lot of additional 
> lookups when iterating by index. It would be nice if {{Vector.get}} accepted 
> an optional batch hint, similar to 
> [{{Vector.range}}|https://github.com/apache/arrow/blob/master/js/src/vector/typed.ts#L51].
> Additionally, if {{Table}} had some knowledge of the batches in its Vectors, 
> it could use this batch hint to improve performance when iterating over rows.
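
The idea of a batch hint can be sketched as follows. This is a Python illustration under stated assumptions, not the Arrow JS code: ChunkedVector, find_batch and the batch_hint parameter are hypothetical names. A lookup without a hint binary-searches the batch offsets; a lookup with a valid hint skips the search, and a stale hint falls back to it.

```python
from bisect import bisect_right
from typing import List, Optional, Sequence


class ChunkedVector:
    """Hypothetical stand-in for a vector backed by several record batches."""

    def __init__(self, batches: Sequence[Sequence[int]]) -> None:
        self.batches = [list(b) for b in batches]
        # offsets[i] is the global index of the first element of batch i.
        self.offsets: List[int] = []
        total = 0
        for batch in self.batches:
            self.offsets.append(total)
            total += len(batch)
        self.length = total

    def find_batch(self, index: int) -> int:
        """Binary-search for the batch that contains a global index."""
        return bisect_right(self.offsets, index) - 1

    def _contains(self, batch: int, index: int) -> bool:
        """Check whether the given batch number covers the global index."""
        if not 0 <= batch < len(self.batches):
            return False
        start = self.offsets[batch]
        return start <= index < start + len(self.batches[batch])

    def get(self, index: int, batch_hint: Optional[int] = None) -> int:
        """Return the element at index, skipping the search if the hint fits."""
        if not 0 <= index < self.length:
            raise IndexError(index)
        batch = batch_hint
        if batch is None or not self._contains(batch, index):
            batch = self.find_batch(index)
        return self.batches[batch][index - self.offsets[batch]]


vec = ChunkedVector([[10, 11], [20], [30, 31, 32]])
```

A row iterator could remember the batch of the previous row and pass it as the hint, so that most calls avoid the search entirely.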





[jira] [Updated] (ARROW-1656) [C++] Endianness Macro is Incorrect on Windows And Mac

2017-10-07 Thread Phillip Cloud (JIRA)

 [ 
https://issues.apache.org/jira/browse/ARROW-1656?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Phillip Cloud updated ARROW-1656:
-
Summary: [C++] Endianness Macro is Incorrect on Windows And Mac  (was: 
[C++] Endianness Macro is Incorrect on Windows)

> [C++] Endianness Macro is Incorrect on Windows And Mac
> --
>
> Key: ARROW-1656
> URL: https://issues.apache.org/jira/browse/ARROW-1656
> Project: Apache Arrow
>  Issue Type: Bug
>  Components: C++
>Affects Versions: 0.7.1
>Reporter: Phillip Cloud
>Assignee: Phillip Cloud
> Fix For: 0.8.0
>
>
> This leads to all of the {{ToBigEndian}} implementations having no effect, 
> when they should be byte-swapping.





[jira] [Commented] (ARROW-1656) [C++] Endianness Macro is Incorrect on Windows And Mac

2017-10-07 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ARROW-1656?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16195934#comment-16195934
 ] 

ASF GitHub Bot commented on ARROW-1656:
---

Github user cpcloud commented on the issue:

https://github.com/apache/arrow/pull/1184
  
I have to do some more research to figure out the best way to handle this.


> [C++] Endianness Macro is Incorrect on Windows And Mac
> --
>
> Key: ARROW-1656
> URL: https://issues.apache.org/jira/browse/ARROW-1656
> Project: Apache Arrow
>  Issue Type: Bug
>  Components: C++
>Affects Versions: 0.7.1
>Reporter: Phillip Cloud
>Assignee: Phillip Cloud
>  Labels: pull-request-available
> Fix For: 0.8.0
>
>
> This leads to all of the {{ToBigEndian}} implementations having no effect, 
> when they should be byte-swapping.





[jira] [Updated] (ARROW-1656) [C++] Endianness Macro is Incorrect on Windows And Mac

2017-10-07 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/ARROW-1656?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated ARROW-1656:
--
Labels: pull-request-available  (was: )

> [C++] Endianness Macro is Incorrect on Windows And Mac
> --
>
> Key: ARROW-1656
> URL: https://issues.apache.org/jira/browse/ARROW-1656
> Project: Apache Arrow
>  Issue Type: Bug
>  Components: C++
>Affects Versions: 0.7.1
>Reporter: Phillip Cloud
>Assignee: Phillip Cloud
>  Labels: pull-request-available
> Fix For: 0.8.0
>
>
> This leads to all of the {{ToBigEndian}} implementations having no effect, 
> when they should be byte-swapping.





[jira] [Commented] (ARROW-1631) [C++] Add GRPC to ThirdpartyToolchain.cmake

2017-10-07 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ARROW-1631?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16195936#comment-16195936
 ] 

ASF GitHub Bot commented on ARROW-1631:
---

Github user MaxRis commented on the issue:

https://github.com/apache/arrow/pull/1182
  
@wesm I have changed `ARROW_WITH_GRPC` to be `OFF` by default. I also removed 
the changes to the CI build scripts.


>  [C++] Add GRPC to ThirdpartyToolchain.cmake
> 
>
> Key: ARROW-1631
> URL: https://issues.apache.org/jira/browse/ARROW-1631
> Project: Apache Arrow
>  Issue Type: New Feature
>  Components: C++
> Environment: Windows, Linux, OS X
>Reporter: Max Risuhin
>Assignee: Max Risuhin
>  Labels: pull-request-available
>
> Building the GRPC library and linking against it should be supported by the 
> CMake build scripts (ThirdpartyToolchain.cmake).





[jira] [Updated] (ARROW-633) [Java] Add support for FixedSizeBinary type

2017-10-07 Thread Wes McKinney (JIRA)

 [ 
https://issues.apache.org/jira/browse/ARROW-633?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wes McKinney updated ARROW-633:
---
Fix Version/s: (was: 0.8.0)
   0.9.0

> [Java] Add support for FixedSizeBinary type
> ---
>
> Key: ARROW-633
> URL: https://issues.apache.org/jira/browse/ARROW-633
> Project: Apache Arrow
>  Issue Type: New Feature
>  Components: Java - Vectors
>Reporter: Wes McKinney
>Assignee: Jingyuan Wang
> Fix For: 0.9.0
>
>



