p-ortmann opened a new issue, #41667:
URL: https://github.com/apache/arrow/issues/41667

   ### Describe the bug, including details regarding any error messages, version, and platform.
   
   Platform: macOS 14.5 (23F79)
   Version: 15.0.2 and 16.1.0.
   ```python
   import pyarrow as pa
   import pandas as pd
   import pyarrow.parquet as pq
   data = {
       'column1': [1, 2, None],
       'column2': ['a', None, 'c']
   }
   
   schema = pa.schema([
       pa.field('column1', pa.int64(), nullable=True),
       pa.field('column2', pa.string(), nullable=False)  # make column2 non-nullable
   ])
   
   # set up the table with data that doesn't match the schema
   table = pa.Table.from_pydict(data, schema=schema)
   assert table.schema.equals(pa.schema(schema))
   
   print('table before writing \n')
   print(table.to_pandas())
   pq.write_table(table, 'output.parquet')
   
   table = pq.read_table('output.parquet')
   
   print('table after writing and reading \n')
   print(table.to_pandas())
   ```
   
   yields 
   
   ```
   table before writing 
   
      column1 column2
   0      1.0       a
   1      2.0    None
   2      NaN       c
   table after writing and reading 
   
      column1 column2
   0      1.0       a
   1      2.0       c
   2      NaN       a
   ```
   which is not correct for column2: the `None` in row 1 is gone, and rows 1 and 2 now read `c` and `a` instead of `None` and `c`.
   
   I would expect this to fail when the table is set up, which is what happens if you replace
   ```python 
   table = pa.Table.from_pydict(data, schema=schema)
   ``` 
   with
   ```python
   dataframe = pd.DataFrame(data)
   table = pa.Table.from_pandas(dataframe, schema=schema)
   ```
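
Until this is resolved, a check along the following lines could be run on the table before calling `pq.write_table`. This is only a sketch of a possible workaround, not something pyarrow provides: `null_violations` and `require_no_null_violations` are hypothetical helper names. It relies only on iterating `table.schema`, on `Field.name` and `Field.nullable`, and on `ChunkedArray.null_count`.

```python
def null_violations(table):
    """Return {column name: null count} for fields declared nullable=False.

    `table` is expected to be a pyarrow.Table. Hypothetical helper,
    not part of the pyarrow API.
    """
    violations = {}
    for field in table.schema:
        if field.nullable:
            # Nulls are allowed in this column, nothing to check.
            continue
        nulls = table.column(field.name).null_count
        if nulls > 0:
            violations[field.name] = nulls
    return violations


def require_no_null_violations(table):
    """Raise ValueError if any non-nullable field contains nulls.

    Returns the table unchanged otherwise, so the call can be chained.
    """
    violations = null_violations(table)
    if violations:
        raise ValueError(
            f"nulls found in non-nullable columns: {violations}"
        )
    return table
```

With the table from the reproduction above, `pq.write_table(require_no_null_violations(table), 'output.parquet')` should raise a `ValueError` naming `column2` instead of writing a file that reads back incorrectly.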
   
   ### Component(s)
   
   Python

