Hi Joris,
  Thank you for your reply. I solved this problem by using 
'table.rename_columns()'. By the way, according to its method description, 
rename_columns() creates a new table. Does this mean I need extra 
memory to store the new table?
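
For reference, the workaround can be sketched roughly as below. This is only a minimal sketch: unique_names() and rename_duplicate_columns() are hypothetical helper names made up for illustration, not part of pyarrow, and `table` is assumed to be a pyarrow.Table. The only pyarrow calls used are Table.column_names and Table.rename_columns().

```python
from collections import Counter


def unique_names(names):
    """Suffix duplicated names with _1, _2, ...; names that occur once are kept."""
    totals = Counter(names)   # how often each name occurs overall
    seen = Counter()          # how often each name has been seen so far
    result = []
    for name in names:
        if totals[name] > 1:
            seen[name] += 1
            result.append(f"{name}_{seen[name]}")
        else:
            result.append(name)
    return result


def rename_duplicate_columns(table):
    # Assumption: `table` is a pyarrow.Table. rename_columns() returns a
    # new Table object; the original table is left unchanged.
    return table.rename_columns(unique_names(table.column_names))
```

With column names ['int', 'int'], unique_names() gives ['int_1', 'int_2'], so the renamed table can then be passed to to_pandas(). Note that this sketch does not guard against a generated name colliding with an existing one (for example, a table that already has a column called 'int_1'). As far as I understand, the table returned by rename_columns() refers to the same column data as the original, so only the schema should be rebuilt, but I have not verified that here.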

  Another question: why are the columns automatically named ['int', 
'int'] after reading from the socket, instead of ['int_1', 'int_2']?


Best regards,
maqy

From: Joris Van den Bossche
Sent: May 11, 2020 16:28
To: [email protected]
Subject: Re: pyarrow.Table.to_pandas() raise ValueError: Found non-unique column index

Hi Maqy,

Can you rename the columns?

Currently, the to_pandas method does not support converting pyarrow Tables to 
pandas DataFrames if there are duplicate column names present.
I suppose that with some effort, it might be possible to support this, though, 
if someone is interested in looking into this.

Best,
Joris

On Mon, 11 May 2020 at 07:18, maqy <[email protected]> wrote:
 I use pyarrow to receive Arrow data sent from Java; the data consists of 
two int columns. The Python code I use is:
```
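import pyarrow as pa

# socket_server is assumed to be an already-listening socket.socket
client, addr = socket_server.accept()
my_file = client.makefile("rb")

reader = pa.RecordBatchStreamReader(my_file)
table = reader.read_all()
# raises ValueError: Found non-unique column index
df = table.to_pandas()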
```
    The reason for this problem is that the column names of the table are 
['int', 'int']. How should I solve this problem?
 
Best regards,
maqy
 
