Removed "Create Blog" permission in the wiki for everyone but admins

2018-07-17 Thread Uwe L. Korn
Hello,

just as an FYI: I have removed the "Create Blog" permission in the Arrow wiki
as this was a source of spam in recent days. I don't think we're going to
make use of the Blog feature in Confluence, so this should not be a problem.

Cheers
Uwe


Re: Arrow meetup in Hyderabad July 24

2018-07-17 Thread Wes McKinney
hey Kelly,

That's great -- FYI the e-mail that was generated by Meetup.com when
you created the event did not include the location. I got confused and
thought it was in SF/Bay Area. Maybe we should put the location in the
title of the meetup event?

As a second thing, the meetup group's front matter "Apache Arrow is
backed by key developers of 13 major open source projects, including
Calcite, Cassandra, Dremio, Drill, Hadoop, HBase, Ibis, Impala, Kudu,
Pandas, Parquet, Phoenix, Spark, and Storm making it the de-facto
standard for columnar in-memory analytics." is getting a bit out of
date. I don't know how many more open source projects we intersect
with at this point, but it's a lot. The scope of the project has also
expanded significantly (with the open standard for columnar data still
being a key selling point for the project).

On this subject, http://arrow.apache.org is due for a bit of sprucing
up of language as we near the 0.10.0 release.

Thanks for organizing these meetups!
Wes

On Mon, Jul 16, 2018 at 4:47 PM, Kelly Stirman wrote:
> We're organizing a meetup in Hyderabad next week. Would anyone like to give
> a talk? Apologies, I know it's a long shot due to location and short notice
> (some of our Mountain View team will be visiting our team there who is
> working on Gandiva).
>
> https://www.meetup.com/Apache-Arrow-Meetup/events/252744998/
>
> This will be in HITEC City, so close to lots of engineering teams in case
> you have friends in the area you think would be interested.
>
> Thanks,
> Kelly


[jira] [Created] (ARROW-2872) [Python] Add pytest mark to opt into TensorFlow unit tests

2018-07-17 Thread Wes McKinney (JIRA)
Wes McKinney created ARROW-2872:
---

             Summary: [Python] Add pytest mark to opt into TensorFlow unit tests
                 Key: ARROW-2872
                 URL: https://issues.apache.org/jira/browse/ARROW-2872
             Project: Apache Arrow
          Issue Type: Bug
          Components: Python
            Reporter: Wes McKinney
             Fix For: 0.10.0


After pulling in ARROW-1744, I found this a little bit unfriendly: 

{code}
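pyarrow/tests/test_plasma_tf_op.py::test_plasma_tf_op FAILED   [ 82%]
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> captured stdout >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
TensorFlow version: 1.8.0
Compiling Plasma TensorFlow Op...
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> captured stderr >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>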
+++ dirname /home/wesm/code/arrow/python/pyarrow/tensorflow/build.sh
++ cd /home/wesm/code/arrow/python/pyarrow/tensorflow
++ pwd
+ PYARROW_TENSORFLOW_DIR=/home/wesm/code/arrow/python/pyarrow/tensorflow
++ python -c 'import tensorflow as tf; print(" ".join(tf.sysconfig.get_compile_flags()))'
+ TF_CFLAGS='-I/home/wesm/miniconda/envs/arrow-dev/lib/python3.6/site-packages/tensorflow/include -D_GLIBCXX_USE_CXX11_ABI=0'
++ python -c 'import tensorflow as tf; print(" ".join(tf.sysconfig.get_link_flags()))'
+ TF_LFLAGS='-L/home/wesm/miniconda/envs/arrow-dev/lib/python3.6/site-packages/tensorflow -ltensorflow_framework'
++ uname
+ '[' Linux == Darwin ']'
+ NDEBUG=-DNDEBUG
++ pkg-config --cflags --libs plasma arrow arrow-python
+ g++ -std=c++11 -g -shared 
/home/wesm/code/arrow/python/pyarrow/tensorflow/plasma_op.cc -o 
/home/wesm/code/arrow/python/pyarrow/tensorflow/plasma_op.so -DNDEBUG 
-I/home/wesm/local/include 
-I/home/wesm/miniconda/envs/arrow-dev/include/python3.6m 
-I/home/wesm/local/include -L/home/wesm/local/lib -lplasma -larrow_python 
-larrow -fPIC 
-I/home/wesm/miniconda/envs/arrow-dev/lib/python3.6/site-packages/tensorflow/include
 -D_GLIBCXX_USE_CXX11_ABI=0 
-L/home/wesm/miniconda/envs/arrow-dev/lib/python3.6/site-packages/tensorflow 
-ltensorflow_framework -O2
/home/wesm/code/arrow/python/pyarrow/tensorflow/plasma_op.cc:33:10: fatal error: arrow/adapters/tensorflow/convert.h: No such file or directory
 #include "arrow/adapters/tensorflow/convert.h"
          ^
compilation terminated.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> traceback >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>

use_gpu = False

    @pytest.mark.plasma
    def test_plasma_tf_op(use_gpu=False):
        import pyarrow.plasma as plasma

>       plasma.build_plasma_tensorflow_op()

pyarrow/tests/test_plasma_tf_op.py:89:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
pyarrow/plasma.py:56: in build_plasma_tensorflow_op
    subprocess.check_call(["bash", script_path])
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

popenargs = (['bash', '/home/wesm/code/arrow/python/pyarrow/tensorflow/build.sh'],)
kwargs = {}, retcode = 1
cmd = ['bash', '/home/wesm/code/arrow/python/pyarrow/tensorflow/build.sh']

    def check_call(*popenargs, **kwargs):
        """Run command with arguments.  Wait for command to complete.  If
        the exit code was zero then return, otherwise raise
        CalledProcessError.  The CalledProcessError object will have the
        return code in the returncode attribute.

        The arguments are the same as for the call function.  Example:

        check_call(["ls", "-l"])
        """
        retcode = call(*popenargs, **kwargs)
        if retcode:
            cmd = kwargs.get("args")
            if cmd is None:
                cmd = popenargs[0]
>           raise CalledProcessError(retcode, cmd)
E           subprocess.CalledProcessError: Command '['bash', '/home/wesm/code/arrow/python/pyarrow/tensorflow/build.sh']' returned non-zero exit status 1.

../../../miniconda/envs/arrow-dev/lib/python3.6/subprocess.py:291: CalledProcessError
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> entering PDB >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
> /home/wesm/miniconda/envs/arrow-dev/lib/python3.6/subprocess.py(291)check_call()
-> raise CalledProcessError(retcode, cmd)
{code}

This occurs if you pass {{-DARROW_PLASMA=ON}} to CMake but do not also pass
{{-DARROW_TENSORFLOW=ON}}.
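
For illustration, an opt-in mark could be wired up in a conftest.py roughly
as follows. This is only a sketch under stated assumptions: the mark name
"tensorflow", the --enable-tensorflow option and the PYARROW_TEST_TENSORFLOW
environment variable are hypothetical names chosen here, not something the
issue or pyarrow defines.

```python
import os


def tensorflow_tests_enabled(cli_flag=False, environ=None):
    """Return True only when TensorFlow tests were explicitly requested.

    PYARROW_TEST_TENSORFLOW is a hypothetical environment variable.
    """
    environ = os.environ if environ is None else environ
    value = environ.get("PYARROW_TEST_TENSORFLOW", "").lower()
    return bool(cli_flag) or value in ("1", "true", "on")


def pytest_addoption(parser):
    # Hypothetical command-line switch; tests stay disabled without it.
    parser.addoption("--enable-tensorflow", action="store_true",
                     default=False, help="run tests marked 'tensorflow'")


def pytest_configure(config):
    # Register the mark so pytest does not warn about an unknown marker.
    config.addinivalue_line(
        "markers", "tensorflow: test needs the Plasma TensorFlow op")


def pytest_collection_modifyitems(config, items):
    import pytest  # imported lazily; only needed when pytest runs the hook

    if tensorflow_tests_enabled(config.getoption("--enable-tensorflow")):
        return
    skip = pytest.mark.skip(reason="needs --enable-tensorflow to run")
    for item in items:
        if "tensorflow" in item.keywords:
            item.add_marker(skip)
```

With something like this in place, a test decorated with
@pytest.mark.tensorflow would be skipped by default and would only try to
build the op when the option (or the environment variable) is set.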



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (ARROW-2871) [Python] Array.to_numpy is invalid for boolean arrays

2018-07-17 Thread Wes McKinney (JIRA)
Wes McKinney created ARROW-2871:
---

             Summary: [Python] Array.to_numpy is invalid for boolean arrays
                 Key: ARROW-2871
                 URL: https://issues.apache.org/jira/browse/ARROW-2871
             Project: Apache Arrow
          Issue Type: Bug
          Components: Python
            Reporter: Wes McKinney
            Assignee: Wes McKinney
             Fix For: 0.10.0


This should raise for the time being:

{code}
In [1]: import pyarrow as pa

In [2]: arr = pa.array([True, False, True])

In [3]: arr
Out[3]: 

[
  True,
  False,
  True
]

In [4]: arr.to_numpy()
Out[4]: array([ True])
{code}
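
A plausible reading of the output above (an assumption; the issue does not
state the cause) is that Arrow stores booleans bit-packed, one bit per value
and least-significant bit first, so a zero-copy view over the data buffer
sees bytes rather than individual values. The pure-Python sketch below mimics
that layout without using pyarrow; pack_bits and unpack_bits are hypothetical
helper names.

```python
def pack_bits(values):
    """Pack booleans into bytes, least-significant bit first."""
    packed = bytearray((len(values) + 7) // 8)
    for i, value in enumerate(values):
        if value:
            packed[i // 8] |= 1 << (i % 8)
    return bytes(packed)


def unpack_bits(packed, length):
    """Recover the first `length` booleans from a bit-packed buffer."""
    return [bool((packed[i // 8] >> (i % 8)) & 1) for i in range(length)]


values = [True, False, True]
packed = pack_bits(values)                  # a single byte, 0b00000101

# Naive view: treat every byte of the buffer as one boolean.
naive = [bool(byte) for byte in packed]

# Correct view: expand each bit into its own value (this needs a copy).
correct = unpack_bits(packed, len(values))

print(naive)    # [True]
print(correct)  # [True, False, True]
```

Three values collapse into one byte, which is consistent with the
single-element result shown in the session above.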





[jira] [Created] (ARROW-2870) [Python] Define API for handling null markers from Array.to_numpy

2018-07-17 Thread Wes McKinney (JIRA)
Wes McKinney created ARROW-2870:
---

             Summary: [Python] Define API for handling null markers from Array.to_numpy
                 Key: ARROW-2870
                 URL: https://issues.apache.org/jira/browse/ARROW-2870
             Project: Apache Arrow
          Issue Type: Improvement
          Components: Python
            Reporter: Wes McKinney


This is follow-up work for {{Array.to_numpy}}, which was started in ARROW-564.





[jira] [Created] (ARROW-2869) [Python] Add documentation for Array.to_numpy

2018-07-17 Thread Wes McKinney (JIRA)
Wes McKinney created ARROW-2869:
---

             Summary: [Python] Add documentation for Array.to_numpy
                 Key: ARROW-2869
                 URL: https://issues.apache.org/jira/browse/ARROW-2869
             Project: Apache Arrow
          Issue Type: Improvement
          Components: Python
            Reporter: Wes McKinney
             Fix For: 0.10.0


See patch for ARROW-564

As part of this, we should probably add a "Using pyarrow with NumPy" section to
the docs (there is one for pandas already).





[jira] [Created] (ARROW-2868) [Packaging] Fix centos-7 build

2018-07-17 Thread Krisztian Szucs (JIRA)
Krisztian Szucs created ARROW-2868:
--

             Summary: [Packaging] Fix centos-7 build
                 Key: ARROW-2868
                 URL: https://issues.apache.org/jira/browse/ARROW-2868
             Project: Apache Arrow
          Issue Type: Task
          Components: Packaging
            Reporter: Krisztian Szucs
             Fix For: 0.10.0


This is the only failing build: 
https://travis-ci.org/kszucs/crossbow/builds/404781005





[jira] [Created] (ARROW-2867) [Python] Incorrect example for Cython usage

2018-07-17 Thread Antoine Pitrou (JIRA)
Antoine Pitrou created ARROW-2867:
-

             Summary: [Python] Incorrect example for Cython usage
                 Key: ARROW-2867
                 URL: https://issues.apache.org/jira/browse/ARROW-2867
             Project: Apache Arrow
          Issue Type: Bug
          Components: Documentation, Python
    Affects Versions: 0.9.0
            Reporter: Antoine Pitrou
            Assignee: Antoine Pitrou


When blindly pasting the Cython distutils example, one might get the following 
error:
{code}
Traceback (most recent call last):
  File "setup.py", line 20, in <module>
    ext_modules=ext_modules,
  File "/home/antoine/miniconda3/envs/pyarrow/lib/python3.6/distutils/core.py", line 148, in setup
    dist.run_commands()
  File "/home/antoine/miniconda3/envs/pyarrow/lib/python3.6/distutils/dist.py", line 955, in run_commands
    self.run_command(cmd)
  File "/home/antoine/miniconda3/envs/pyarrow/lib/python3.6/distutils/dist.py", line 974, in run_command
    cmd_obj.run()
  File "/home/antoine/miniconda3/envs/pyarrow/lib/python3.6/distutils/command/build_ext.py", line 339, in run
    self.build_extensions()
  File "/home/antoine/miniconda3/envs/pyarrow/lib/python3.6/distutils/command/build_ext.py", line 448, in build_extensions
    self._build_extensions_serial()
  File "/home/antoine/miniconda3/envs/pyarrow/lib/python3.6/distutils/command/build_ext.py", line 473, in _build_extensions_serial
    self.build_extension(ext)
  File "/home/antoine/miniconda3/envs/pyarrow/lib/python3.6/distutils/command/build_ext.py", line 558, in build_extension
    target_lang=language)
  File "/home/antoine/miniconda3/envs/pyarrow/lib/python3.6/distutils/ccompiler.py", line 717, in link_shared_object
    extra_preargs, extra_postargs, build_temp, target_lang)
  File "/home/antoine/miniconda3/envs/pyarrow/lib/python3.6/distutils/unixccompiler.py", line 159, in link
    libraries)
  File "/home/antoine/miniconda3/envs/pyarrow/lib/python3.6/distutils/ccompiler.py", line 1089, in gen_lib_options
    lib_opts.append(compiler.library_dir_option(dir))
  File "/home/antoine/miniconda3/envs/pyarrow/lib/python3.6/distutils/unixccompiler.py", line 207, in library_dir_option
    return "-L" + dir
TypeError: must be str, not list
{code}
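
The last frame shows "-L" + dir receiving a list, which is consistent with a
list having been nested inside the extension's library_dirs. One way that can
happen (an assumption; the issue does not show the offending line) is calling
append() with a helper that already returns a list, where extend() was meant.
The sketch below reproduces the failure mode in isolation; get_library_dirs
and the path it returns are hypothetical stand-ins.

```python
def get_library_dirs():
    """Hypothetical stand-in for a helper that returns a list of directories."""
    return ["/opt/arrow/lib"]


def library_dir_option(dir):
    # Same expression as the last frame of the traceback.
    return "-L" + dir


# Broken: append() nests the whole list as a single element.
library_dirs = []
library_dirs.append(get_library_dirs())

error = None
try:
    [library_dir_option(d) for d in library_dirs]
except TypeError as exc:
    error = exc  # str + list is rejected, as in the traceback

# Fixed: extend() adds each directory string individually.
fixed_dirs = []
fixed_dirs.extend(get_library_dirs())
options = [library_dir_option(d) for d in fixed_dirs]
```

If that is indeed the cause, switching the documented example from append()
to extend() for list-valued helpers would avoid the error.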


