[ 
https://issues.apache.org/jira/browse/ARROW-640?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16394627#comment-16394627
 ] 

Alex Hagerman edited comment on ARROW-640 at 3/11/18 9:02 PM:
--------------------------------------------------------------

I think the behavior has changed since the original ticket. Equality comparison 
now appears to work; I tested it with strings and numbers. However, set(arr) 
now raises a TypeError because the scalar values are unhashable. I'm going to 
continue looking into this, but if anybody has thoughts on it I'd be happy to 
hear them. Also, from_pylist appears to have been removed, but searching the 
changelog on GitHub I found only its addition in 0.3, not its removal. I'm 
going to look at the history of __eq__ on ArrayValue and as_py, then work on 
what would make sense for __hash__.
{code:python}
%load_ext Cython
import pyarrow as pa

pylist = [1,1,1,2]
arr = pa.array(pylist)
arr
<pyarrow.lib.Int64Array object at 0x7fbad56e4c28>
[
  1,
  1,
  1,
  2
]
arr[0] == arr[1]
True
arr[0] == arr[3]
False
word_list = ['test', 'not the same', 'test', 'nope']
word_list[0] == word_list[2]
True
word_list[0] == word_list[1]
False
pa.array.__eq__
<method-wrapper '__eq__' of builtin_function_or_method object at 0x7fbaab609990>
set(arr)
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-9-ba21c71e79f9> in <module>()
----> 1 set(arr)

TypeError: unhashable type: 'pyarrow.lib.Int64Value'
arr_list = pa.from_pylist([1, 1, 1, 2])
---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<ipython-input-10-30966022c9ed> in <module>()
----> 1 arr_list = pa.from_pylist([1, 1, 1, 2])

AttributeError: module 'pyarrow' has no attribute 'from_pylist'
{code}
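
One likely explanation for the change in behavior is a general Python 3 rule rather than anything specific to Arrow: a class that defines __eq__ without also defining __hash__ has its __hash__ set to None, so its instances cannot go into a set. The sketch below reproduces the three stages in pure Python. The classes are hypothetical stand-ins that only mimic an as_py() method; they are not pyarrow code, and hashing the as_py() value is just one possible choice for a sensible __hash__.

```python
class _Value:
    """Hypothetical stand-in for a scalar wrapper exposing as_py()."""

    def __init__(self, value):
        self._value = value

    def as_py(self):
        return self._value


class IdentityValue(_Value):
    """Default object semantics: equality and hash are based on identity."""


class EqOnlyValue(_Value):
    """Defines __eq__ only, so Python 3 sets __hash__ to None."""

    def __eq__(self, other):
        other_py = other.as_py() if isinstance(other, _Value) else other
        return self.as_py() == other_py


class HashableValue(EqOnlyValue):
    """Adds a __hash__ that agrees with __eq__ by hashing the Python value."""

    def __hash__(self):
        return hash(self.as_py())


raw = [1, 1, 1, 2]

# Stage 1: behavior resembling the original report.
identity = [IdentityValue(v) for v in raw]
print(identity[0] == identity[1])   # False
print(len(set(identity)))           # 4

# Stage 2: behavior resembling the session above.
eq_only = [EqOnlyValue(v) for v in raw]
print(eq_only[0] == eq_only[1])     # True
try:
    set(eq_only)
except TypeError as exc:
    print("TypeError:", exc)

# Stage 3: equal values hash equally, so the set deduplicates them.
hashable = [HashableValue(v) for v in raw]
print(sorted(v.as_py() for v in set(hashable)))  # [1, 2]
```

Until the scalar types are hashable, converting to Python objects first, for example set(arr.to_pylist()), should sidestep the error.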
 


> [Python] Arrow scalar values should have a sensible __hash__ and comparison
> ---------------------------------------------------------------------------
>
>                 Key: ARROW-640
>                 URL: https://issues.apache.org/jira/browse/ARROW-640
>             Project: Apache Arrow
>          Issue Type: Bug
>          Components: Python
>            Reporter: Miki Tebeka
>            Assignee: Alex Hagerman
>            Priority: Major
>             Fix For: 0.10.0
>
>
> {noformat}
> In [86]: arr = pa.from_pylist([1, 1, 1, 2])
> In [87]: set(arr)
> Out[87]: {1, 2, 1, 1}
> In [88]: arr[0] == arr[1]
> Out[88]: False
> In [89]: arr
> Out[89]: 
> <pyarrow.array.Int64Array object at 0x7f8c8c739e08>
> [
>   1,
>   1,
>   1,
>   2
> ]
> {noformat}


