[jira] [Created] (ARROW-374) Python: clarify unicode vs. binary in API

2016-11-08 Thread Jochen Ott (JIRA)
Jochen Ott created ARROW-374: Summary: Python: clarify unicode vs. binary in API Key: ARROW-374 URL: https://issues.apache.org/jira/browse/ARROW-374 Project: Apache Arrow Issue Type: Improvement
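
For readers unfamiliar with the distinction this issue title refers to, here is a minimal sketch in plain Python 3. It is not taken from the issue and makes no Arrow calls; the helper name `describe` is hypothetical and used only for illustration. It shows that unicode text (`str`) and binary data (`bytes`) are separate types, and that the same value can have different lengths in each form.

```python
# Illustrative only: plain Python 3, no Arrow API involved.
text = "héllo"               # unicode text (str): a sequence of code points
raw = text.encode("utf-8")   # binary data (bytes): the UTF-8 encoding of text


def describe(value):
    """Hypothetical helper: classify a value as unicode text or binary data."""
    if isinstance(value, str):
        return "unicode"
    if isinstance(value, (bytes, bytearray)):
        return "binary"
    raise TypeError("expected str or bytes, got " + type(value).__name__)


print(describe(text), len(text))  # length counted in code points
print(describe(raw), len(raw))    # length counted in bytes
print(raw.decode("utf-8") == text)
```

The two lengths differ because "é" occupies two bytes in UTF-8, which is one reason an API benefits from stating which of the two types it accepts and returns.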

Re: Error Sending ArrowRecordBatch from Java to Python

2016-11-08 Thread Bryan Cutler
Thanks Wes! I'd be happy to help out with the effort - I'll look into the refs you mentioned. I updated SPARK-13534 with what we have so far, but not much has been done with the internal format conversion yet. Bryan On Tue, Nov 8, 2016 at 4:34 PM, Wes McKinney wrote: > Until we have integrati

Re: Error Sending ArrowRecordBatch from Java to Python

2016-11-08 Thread Wes McKinney
Until we have integration tests proving otherwise, you can assume that the IPC wire/file formats are currently incompatible (not on purpose). Both Julien and I are working on corresponding efforts in Java and C++ this week (e.g. see https://github.com/apache/arrow/pull/201 and ARROW-373), but it's

[jira] [Assigned] (ARROW-373) [C++] Implement C++ version of JSON file format for testing

2016-11-08 Thread Wes McKinney (JIRA)
[ https://issues.apache.org/jira/browse/ARROW-373?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wes McKinney reassigned ARROW-373: Assignee: Wes McKinney > [C++] Implement C++ version of JSON file format for testing

Error Sending ArrowRecordBatch from Java to Python

2016-11-08 Thread Bryan Cutler
Hi Devs, I'm currently working on SPARK-13534 to use Arrow in the Spark DataFrame toPandas conversion, and I'm stuck on an invalid metadata size error when sending a simple ArrowRecordBatch created in Java over a socket to Python. The strategy so far is as follows: Java side: - make a simple A

Re: Question

2016-11-08 Thread Julien Le Dem
Hi all, Just to clarify: yes, Arrow intends to define network protocols. The file format is merely the network messages written to a file. We are also looking into IPC (inter-process communication) using shared memory. On Thu, Nov 3, 2016 at 5:51 AM, Donald Foss wrote: > Abdulrahman, your schema diagram

[jira] [Commented] (ARROW-372) Create JSON arrow file format for integration tests

2016-11-08 Thread Julien Le Dem (JIRA)
[ https://issues.apache.org/jira/browse/ARROW-372?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15649014#comment-15649014 ] Julien Le Dem commented on ARROW-372: The JSON representation of the schema is definer

[jira] [Commented] (ARROW-372) Create JSON arrow file format for integration tests

2016-11-08 Thread Wes McKinney (JIRA)
[ https://issues.apache.org/jira/browse/ARROW-372?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15648792#comment-15648792 ] Wes McKinney commented on ARROW-372: Can you show the schema also? Thanks > Create JSO

[jira] [Commented] (ARROW-373) [C++] Implement C++ version of JSON file format for testing

2016-11-08 Thread Wes McKinney (JIRA)
[ https://issues.apache.org/jira/browse/ARROW-373?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15648718#comment-15648718 ] Wes McKinney commented on ARROW-373: Depends on ARROW-372 > [C++] Implement C++ versio

[jira] [Created] (ARROW-373) [C++] Implement C++ version of JSON file format for testing

2016-11-08 Thread Wes McKinney (JIRA)
Wes McKinney created ARROW-373: Summary: [C++] Implement C++ version of JSON file format for testing Key: ARROW-373 URL: https://issues.apache.org/jira/browse/ARROW-373 Project: Apache Arrow I

[jira] [Updated] (ARROW-372) Create JSON arrow file format for integration tests

2016-11-08 Thread Julien Le Dem (JIRA)
[ https://issues.apache.org/jira/browse/ARROW-372?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Julien Le Dem updated ARROW-372: Description: {noformat} { "schema" : ..., "batches" : [{ "count" : 10, "columns" : [

[jira] [Updated] (ARROW-372) Create JSON arrow file format for integration tests

2016-11-08 Thread Julien Le Dem (JIRA)
[ https://issues.apache.org/jira/browse/ARROW-372?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Julien Le Dem updated ARROW-372: Description: {noformat} { "schema" : ..., "batches" : [{ "count" : ..., "columns" : {

[jira] [Created] (ARROW-372) Create JSON arrow file format for integration tests

2016-11-08 Thread Julien Le Dem (JIRA)
Julien Le Dem created ARROW-372: Summary: Create JSON arrow file format for integration tests Key: ARROW-372 URL: https://issues.apache.org/jira/browse/ARROW-372 Project: Apache Arrow Issue Ty

[jira] [Commented] (ARROW-370) Python: Pandas conversion from `datetime.date` columns

2016-11-08 Thread Wes McKinney (JIRA)
[ https://issues.apache.org/jira/browse/ARROW-370?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15647890#comment-15647890 ] Wes McKinney commented on ARROW-370: I think we can do optimistic type inference on dty

[jira] [Updated] (ARROW-371) Python: Table with null timestamp becomes float in pandas

2016-11-08 Thread Uwe L. Korn (JIRA)
[ https://issues.apache.org/jira/browse/ARROW-371?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Uwe L. Korn updated ARROW-371: Affects Version/s: 0.1.0 > Python: Table with null timestamp becomes float in pandas

[jira] [Created] (ARROW-371) Python: Table with null timestamp becomes float in pandas

2016-11-08 Thread Jochen Ott (JIRA)
Jochen Ott created ARROW-371: Summary: Python: Table with null timestamp becomes float in pandas Key: ARROW-371 URL: https://issues.apache.org/jira/browse/ARROW-371 Project: Apache Arrow Issue Ty

[jira] [Updated] (ARROW-370) Python: Pandas conversion from `datetime.date` columns

2016-11-08 Thread Uwe L. Korn (JIRA)
[ https://issues.apache.org/jira/browse/ARROW-370?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Uwe L. Korn updated ARROW-370: Component/s: Python > Python: Pandas conversion from `datetime.date` columns

[jira] [Created] (ARROW-370) Python: Pandas conversion from `datetime.date` columns

2016-11-08 Thread Uwe L. Korn (JIRA)
Uwe L. Korn created ARROW-370: Summary: Python: Pandas conversion from `datetime.date` columns Key: ARROW-370 URL: https://issues.apache.org/jira/browse/ARROW-370 Project: Apache Arrow Issue Typ