ng-oliver commented on issue #25946:
URL: https://github.com/apache/beam/issues/25946#issuecomment-2980482411

   > hello - i'd also like to bump this
   > 
   > whenever the `schema` argument provided contains `timestamp`, `date`, or `datetime`, the SDK will assume the data has the `micros` attribute, which isn't the case in python native datetime
   > 
   > 
   > [beam/sdks/python/apache_beam/utils/timestamp.py, line 63](https://github.com/apache/beam/blob/99202b237e364bf77f40b6da0ec22cb7b17c37d0/sdks/python/apache_beam/utils/timestamp.py#L63) (commit [99202b2](/apache/beam/commit/99202b237e364bf77f40b6da0ec22cb7b17c37d0)):
   > 
   >     self.micros = int(seconds * 1000000) + int(micros)
   > 
   > The legacy `method=STREAMING_INSERT` is 10x easier to use as it handles reading JSON and writing to BigQuery seamlessly. Appreciate an upgrade on `method=STORAGE_API` to align with Google's [pricing incentive](https://cloud.google.com/bigquery/pricing#data_ingestion_pricing): `storage_api` has 2TB free ingestion per month and is half the price of `streaming_insert` on a pay-as-you-go basis.
   
   I have a temporary workaround: for each BigQuery timestamp column in my JSON, I convert the value to the Beam `Timestamp` class.
   
   The JSON then flows smoothly into `WriteToBigQuery(method='STORAGE_WRITE_API')`.
   
   ```
   import json
   import logging

   import apache_beam as beam


   class ParseJsonLineFn(beam.DoFn):
       """Parses one JSON line per element, optionally converting timestamp fields."""

       def __init__(self, convert_to_beam_timestamp=False):
           self.convert_to_beam_timestamp = convert_to_beam_timestamp
   
       def process(self, line):
           if not line.strip():
               return  # Skip empty lines
   
           try:
               json_obj = json.loads(line)
   
               # Timestamp conversion. parse_to_beam_timestamp is my own helper
               # (not defined in this snippet); the field names are examples.
               if self.convert_to_beam_timestamp:
                   for field in ["timestamp_1", "timestamp_2"]:
                       if field in json_obj:
                           json_obj[field] = parse_to_beam_timestamp(json_obj[field])
   
               yield json_obj
   
           except Exception as e:
               logging.error(f"Failed to parse JSON line: {e}")
               raise
   ```
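
The snippet above calls `parse_to_beam_timestamp` without defining it. Below is a minimal sketch of what such a helper might look like, not the commenter's actual implementation. It assumes the JSON values are ISO 8601 strings (or `datetime` objects) and that values without a UTC offset are meant as UTC. `to_epoch_micros` is a hypothetical name introduced here for illustration. The only Beam call used is the `Timestamp(micros=...)` constructor from `apache_beam.utils.timestamp`.

```python
from datetime import datetime, timezone

_EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def to_epoch_micros(value):
    """Return integer microseconds since the Unix epoch.

    Accepts an ISO 8601 string or a datetime (hypothetical helper).
    """
    if isinstance(value, str):
        text = value.strip()
        # datetime.fromisoformat() only accepts a trailing "Z" from Python 3.11 on.
        if text.endswith("Z"):
            text = text[:-1] + "+00:00"
        value = datetime.fromisoformat(text)
    if value.tzinfo is None:
        # Assumption: values without an offset are UTC.
        value = value.replace(tzinfo=timezone.utc)
    delta = value - _EPOCH
    # Integer arithmetic avoids float rounding at microsecond precision.
    return (delta.days * 86400 + delta.seconds) * 1_000_000 + delta.microseconds


def parse_to_beam_timestamp(value):
    """Wrap the epoch microseconds in Beam's Timestamp class."""
    # Imported lazily so to_epoch_micros can be used without Beam installed.
    from apache_beam.utils.timestamp import Timestamp

    return Timestamp(micros=to_epoch_micros(value))
```

With this in scope, `ParseJsonLineFn(convert_to_beam_timestamp=True)` would replace each listed field with a Beam `Timestamp` before the rows reach the sink.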

