[jira] [Created] (ARROW-2544) [CI] Run C++ tests with two jobs on Travis-CI

2018-05-07 Thread Antoine Pitrou (JIRA)
Antoine Pitrou created ARROW-2544:
-

 Summary: [CI] Run C++ tests with two jobs on Travis-CI
 Key: ARROW-2544
 URL: https://issues.apache.org/jira/browse/ARROW-2544
 Project: Apache Arrow
  Issue Type: Improvement
  Components: Continuous Integration
Reporter: Antoine Pitrou
Assignee: Omer Katz


See https://github.com/apache/arrow/pull/1899



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


Re: Writing empty strings to parquet files

2018-05-07 Thread scarrascoso
Hi Wes:

Thanks for your message.

I would say that both test_pandas_parquet_1_0_rountrip and 
test_pandas_parquet_2_0_rountrip (in 
arrow/python/pyarrow/tests/test_parquet.py) already test this.
Sorry I didn’t realize this sooner.

All the best,

Sergio Carrascoso

> On 5 May 2018, at 01:31, Wes McKinney wrote:
> 
> Thanks Sergio. If we don't have any unit tests explicitly testing
> this, it would be a good idea to add some anyway.
> 
> - Wes
> 
> On Fri, May 4, 2018 at 12:26 PM,   wrote:
>> Hi Uwe:
>> 
>> Thanks a lot for your feedback.
>> 
>> While preparing a simple example to reproduce this issue, I have been able 
>> to get the expected behavior (empty strings properly written as ‘’ in the 
>> parquet file).
>> So there is actually no problem with Parquet.write_table.
>> 
>> The problem was a bug in my own process: two steps were in the wrong 
>> order, so unicode formatting was applied to None values earlier than 
>> expected, turning them into ‘None’.
>> 
>> Again, thank you very much and apologies for the noise.
>> 
>> Best,
>> 
>> Sergio Carrascoso
>> 
>>> On 4 May 2018, at 10:54, Uwe L. Korn wrote:
>>> 
>>> Hello Sergio,
>>> 
>>> this is definitely unwanted behaviour. Can you open an issue on 
>>> https://issues.apache.org/jira/projects/PARQUET and provide a minimal 
>>> reproducing example? There is definitely a difference between empty strings 
>>> and null strings. Parquet also supports this distinction, so we should 
>>> support roundtripping them.
>>> 
>>> Uwe
>>> 
>>> On Thu, May 3, 2018, at 8:47 AM, scarrasc...@ravenpack.com wrote:
>>>> 
>>>> Hi:
>>>> 
>>>> I would like to know if there is any way in PyArrow to write empty
>>>> string values to a parquet file.
>>>> When I use Parquet.write_table, if any column contains empty string
>>>> values, they end up as None in the parquet file.
>>>> My process depends on these values being properly written as empty
>>>> strings in the parquet files.
>>>> 
>>>> To provide some context, my current workflow is the following:
>>>> 
>>>> - Read content from json files (using Pandas.read_json)
>>>> - Convert the corresponding dataframe to a PyArrow table (using
>>>> PyArrow.Table.from_pandas)
>>>> - Finally, write the table to a parquet file (using Parquet.write_table)
>>>> 
>>>> I have done some checks during the process, and the empty string values
>>>> are being honored up to the step that writes the parquet file.
>>>> 
>>>> The options for the write_table method don't provide anything specific
>>>> for this. Is this behavior (writing '' as None) an unavoidable default?
>>>> Is there any other way to write the parquet files where I have more
>>>> options to deal with this?
>>>> 
>>>> Any hint or feedback will be greatly appreciated.
>>>> 
>>>> Thanks a lot in advance, all the best.
>>>> 
>>>> Sergio Carrascoso
>>>> 
>> 
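
For readers following this thread, here is a minimal sketch of the kind of ordering bug Sergio describes, where text formatting runs before missing values are handled. It is not taken from his code: the function names (to_text, format_then_check, check_then_format) and the sample values are hypothetical, and the formatting step is assumed to be a plain str.format call.

```python
def to_text(value):
    """Hypothetical formatting step: render any value as text."""
    return "{}".format(value)


def format_then_check(values):
    # Wrong order: None is formatted first and becomes the string 'None',
    # so the later null check never sees a missing value.
    formatted = [to_text(v) for v in values]
    return [None if v is None else v for v in formatted]


def check_then_format(values):
    # Intended order: missing values are set aside before formatting,
    # so None stays None and '' stays ''.
    return [None if v is None else to_text(v) for v in values]


values = ["", None, "abc"]
print(format_then_check(values))  # ['', 'None', 'abc']
print(check_then_format(values))  # ['', None, 'abc']
```

In both orderings the empty string survives unchanged; only the None value is affected, which is consistent with the conclusion above that the writer itself was not at fault.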



[jira] [Created] (ARROW-2545) [Python] Arrow fails linking against statically-compiled Python

2018-05-07 Thread Antoine Pitrou (JIRA)
Antoine Pitrou created ARROW-2545:
-

 Summary: [Python] Arrow fails linking against statically-compiled Python
 Key: ARROW-2545
 URL: https://issues.apache.org/jira/browse/ARROW-2545
 Project: Apache Arrow
  Issue Type: Bug
  Components: Python
Affects Versions: 0.9.0
Reporter: Antoine Pitrou


See 
https://issues.apache.org/jira/browse/ARROW-1661?focusedCommentId=16462745&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-16462745
To link statically against {{libpythonXX.a}}, you need to add some system 
libraries such as {{libutil}}; otherwise some symbols end up unresolved.
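
One possible way to find out which system libraries a given interpreter expects is to ask the interpreter itself. The sketch below reads the LIBS and SYSLIBS build variables through the standard sysconfig module. The helper name extra_static_link_flags is hypothetical, the variables may be unset on some platforms (Windows, for example), and nothing here reflects how Arrow's build actually resolves the issue.

```python
import shlex
import sysconfig


def extra_static_link_flags():
    """Collect the system library flags recorded when this Python was built.

    LIBS and SYSLIBS come from the Makefile used to build CPython; either
    may be None or empty, in which case it contributes nothing.
    """
    flags = []
    for name in ("LIBS", "SYSLIBS"):
        value = sysconfig.get_config_var(name)
        if value:
            for flag in shlex.split(value):
                # Keep the first occurrence of each flag, preserving order.
                if flag not in flags:
                    flags.append(flag)
    return flags


print(extra_static_link_flags())
```

The printed list depends on the platform and on how the interpreter was configured, so it should be treated as a starting point for the link line rather than a fixed answer.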





[jira] [Created] (ARROW-2546) [CI] Intermittent npm failures

2018-05-07 Thread Antoine Pitrou (JIRA)
Antoine Pitrou created ARROW-2546:
-

 Summary: [CI] Intermittent npm failures
 Key: ARROW-2546
 URL: https://issues.apache.org/jira/browse/ARROW-2546
 Project: Apache Arrow
  Issue Type: Bug
  Components: Continuous Integration, JavaScript
Reporter: Antoine Pitrou


See for example https://travis-ci.org/apache/arrow/jobs/375891278 .

{code}
npm WARN deprecated gulp-util@3.0.8: gulp-util is deprecated - replace it, following the guidelines at https://medium.com/gulpjs/gulp-util-ca3b1f9f9ac5
npm WARN deprecated standard-format@1.6.10: standard-format is deprecated in favor of a built-in autofixer in 'standard'. Usage: standard --fix
npm WARN deprecated minimatch@2.0.10: Please update to minimatch 3.0.2 or higher to avoid a RegExp DoS issue
npm WARN tar ENOENT: no such file or directory, open '/home/travis/build/apache/arrow/js/node_modules/.staging/google-closure-compiler-2d7bab98/contrib/externs/maps/google_maps_api_v3_23.js'
npm WARN ajv-keywords@3.2.0 requires a peer of ajv@^6.0.0 but none is installed. You must install peer dependencies yourself.
npm WARN optional SKIPPING OPTIONAL DEPENDENCY: fsevents@1.2.3 (node_modules/fsevents):
npm WARN enoent SKIPPING OPTIONAL DEPENDENCY: ENOENT: no such file or directory, rename '/home/travis/build/apache/arrow/js/node_modules/.staging/fsevents-5f35bbaf/node_modules/abbrev' -> '/home/travis/build/apache/arrow/js/node_modules/.staging/abbrev-e214f964'
npm ERR! code EINTEGRITY
npm ERR! sha512-bqB1yS6o9TNA9ZC/MJxM0FZzPnZdtHj0xWK/IZ5khzVqdpGul/R/EIiHRgFXlwTD7PSIaYVnGKq1QgMCu2mnqw== integrity checksum failed when using sha512: wanted sha512-bqB1yS6o9TNA9ZC/MJxM0FZzPnZdtHj0xWK/IZ5khzVqdpGul/R/EIiHRgFXlwTD7PSIaYVnGKq1QgMCu2mnqw== but got sha512-kgTmj+eAwkxGNzcVy5l66pJ3Exmxgj4IdQQ5fK53JTbfThLZFQybsk64V8pq2MMKXcqkkU6/0gGHXKbURv065w==. (4688848 bytes)
npm ERR! A complete log of this run can be found in:
npm ERR! /home/travis/.npm/_logs/2018-05-07T13_34_45_558Z-debug.log
{code}
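
The "wanted" and "got" values in the log have the shape of an algorithm prefix followed by a base64-encoded digest. As a rough illustration of how such a check can fail when a download is corrupted or truncated, the sketch below computes a string of that shape for a byte payload and compares it with an expected value. The helper names (sri_sha512, matches_integrity) and the sample payload are hypothetical; this is not npm's implementation.

```python
import base64
import hashlib


def sri_sha512(data):
    """Return a 'sha512-<base64 digest>' string for a bytes payload."""
    digest = hashlib.sha512(data).digest()
    return "sha512-" + base64.b64encode(digest).decode("ascii")


def matches_integrity(data, wanted):
    """True when the payload hashes to the wanted integrity string."""
    return sri_sha512(data) == wanted


payload = b"example tarball bytes"  # stand-in for a downloaded package
wanted = sri_sha512(payload)

print(matches_integrity(payload, wanted))       # True
print(matches_integrity(payload[:-1], wanted))  # False: payload was truncated
```

Any change to the bytes produces a different digest, so a check of this kind fails whenever the downloaded content differs from what was expected, which would fit the intermittent nature of the failures reported here.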





[jira] [Created] (ARROW-2547) [Format] Fix off-by-one in List<List<byte>> example

2018-05-07 Thread Uwe L. Korn (JIRA)
Uwe L. Korn created ARROW-2547:
--

 Summary: [Format] Fix off-by-one in List<List<byte>> example
 Key: ARROW-2547
 URL: https://issues.apache.org/jira/browse/ARROW-2547
 Project: Apache Arrow
  Issue Type: Improvement
  Components: Format
Reporter: Uwe L. Korn
 Fix For: 0.10.0








[jira] [Created] (ARROW-2548) [Format] Clarify `List` Array example

2018-05-07 Thread Uwe L. Korn (JIRA)
Uwe L. Korn created ARROW-2548:
--

 Summary: [Format] Clarify `List` Array example
 Key: ARROW-2548
 URL: https://issues.apache.org/jira/browse/ARROW-2548
 Project: Apache Arrow
  Issue Type: Improvement
  Components: Format
Reporter: Uwe L. Korn
 Fix For: 0.10.0








[jira] [Created] (ARROW-2549) [GLib] Apply arrow::StatusCodes changes to GArrowError

2018-05-07 Thread Kouhei Sutou (JIRA)
Kouhei Sutou created ARROW-2549:
---

 Summary: [GLib] Apply arrow::StatusCodes changes to GArrowError
 Key: ARROW-2549
 URL: https://issues.apache.org/jira/browse/ARROW-2549
 Project: Apache Arrow
  Issue Type: Improvement
  Components: GLib
Affects Versions: 0.9.0
Reporter: Kouhei Sutou
Assignee: Kouhei Sutou
 Fix For: 0.10.0








[jira] [Created] (ARROW-2550) [C++] Add missing status codes into arrow::StatusCode::CodeAsString()

2018-05-07 Thread Kouhei Sutou (JIRA)
Kouhei Sutou created ARROW-2550:
---

 Summary: [C++] Add missing status codes into arrow::StatusCode::CodeAsString()
 Key: ARROW-2550
 URL: https://issues.apache.org/jira/browse/ARROW-2550
 Project: Apache Arrow
  Issue Type: Improvement
  Components: C++
Affects Versions: 0.9.0
Reporter: Kouhei Sutou
Assignee: Kouhei Sutou
 Fix For: 0.10.0





