[jira] [Commented] (ARROW-15645) [Flight][Java][C++] Data read through Flight is having endianness issue on s390x

2022-02-25 Thread Ravi Gummadi (Jira)


[ 
https://issues.apache.org/jira/browse/ARROW-15645?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17498004#comment-17498004
 ] 

Ravi Gummadi commented on ARROW-15645:
--

Thanks for the details, [~apitrou]. I will watch 
[https://issues.apache.org/jira/projects/ARROW/issues/ARROW-15778] and test in 
my environment once a fix for ARROW-15778 is available.

> [Flight][Java][C++] Data read through Flight is having endianness issue on 
> s390x
> 
>
> Key: ARROW-15645
> URL: https://issues.apache.org/jira/browse/ARROW-15645
> Project: Apache Arrow
> Issue Type: Bug
> Components: C++, FlightRPC, Java
> Affects Versions: 5.0.0
> Environment: Linux s390x (big endian)
> Reporter: Ravi Gummadi
> Priority: Major
>
> I am facing an endianness issue on s390x (big endian) when converting data 
> read through Flight to a pandas DataFrame.
> (1) table.validate() fails with error
> {code}
> Traceback (most recent call last):
>   File "/tmp/2.py", line 51, in <module>
>     table.validate()
>   File "pyarrow/table.pxi", line 1232, in pyarrow.lib.Table.validate
>   File "pyarrow/error.pxi", line 99, in pyarrow.lib.check_status
> pyarrow.lib.ArrowInvalid: Column 1: In chunk 0: Invalid: Negative offsets in 
> binary array
> {code}
> (2) table.to_pandas() gives a segmentation fault
> 
> Here is the sample code I am using:
> {code:python}
> from pyarrow import flight
> import os
> import json
> flight_endpoint = os.environ.get("flight_server_url", 
> "grpc+tls://...local:443")
> print(flight_endpoint)
> #
> class TokenClientAuthHandler(flight.ClientAuthHandler):
>     """An example implementation of authentication via handshake.
>        With the default constructor, the user token is read from the 
> environment: TokenClientAuthHandler().
>        You can also pass a user token as parameter to the constructor, 
> TokenClientAuthHandler(yourtoken).
>     """
>     def __init__(self, token: str = None):
>         super().__init__()
>         if token is not None:
>             strToken = 'Bearer {}'.format(token)
>         else:
>             strToken = 'Bearer {}'.format(os.environ.get("some_auth_token"))
>         self.token = strToken.encode('utf-8')
>         #print(self.token)
>     def authenticate(self, outgoing, incoming):
>         outgoing.write(self.token)
>         self.token = incoming.read()
>     def get_token(self):
>         return self.token
>     
> readClient = flight.FlightClient(flight_endpoint)
> readClient.authenticate(TokenClientAuthHandler())
> cmd = json.dumps({...})
> descriptor = flight.FlightDescriptor.for_command(cmd)
> flightInfo = readClient.get_flight_info(descriptor)
> reader = readClient.do_get(flightInfo.endpoints[0].ticket)
> table = reader.read_all()
> print(table)
> print(table.num_columns)
> print(table.num_rows)
> table.validate()
> table.to_pandas()
> {code}
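
For illustration only: the "Negative offsets in binary array" error above is the kind of failure that appears when int32 offsets are written in one byte order and read in the other. The sketch below uses only Python's struct module. The offset values are hypothetical, and it is not a trace of what Flight does internally.

```python
import struct

# Hypothetical offsets for a binary array holding three values
# (byte lengths 3, 197 and 100). These numbers are made up.
offsets = [0, 3, 200, 300]

# Serialize the offsets as little-endian int32 values.
wire = struct.pack("<4i", *offsets)

# Interpret the same bytes as big-endian int32 values, i.e. with the
# wrong byte order for this buffer.
misread = list(struct.unpack(">4i", wire))

# Interpret the bytes with the byte order they were written in.
correct = list(struct.unpack("<4i", wire))

print("misread:", misread)
print("correct:", correct)
print("any negative offset:", any(o < 0 for o in misread))
```

Small offsets such as 200 turn into large or negative numbers once their bytes are reversed, which is why a validation pass can reject the array before any data is touched.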



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Commented] (ARROW-15645) [Flight][Java][C++] Data read through Flight is having endianness issue on s390x

2022-02-24 Thread Antoine Pitrou (Jira)


[ 
https://issues.apache.org/jira/browse/ARROW-15645?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17497458#comment-17497458
 ] 

Antoine Pitrou commented on ARROW-15645:


If my diagnosis above is correct, then this is really caused by ARROW-15778.

You could work around it by disabling endianness conversion on the Flight client 
side, but unfortunately that option is not exposed in Python (see ARROW-15777).



