Kludex commented on issue #33176:
URL: https://github.com/apache/beam/issues/33176#issuecomment-2538955701

   Okay... Hi there again... 👋 
   
   Let me explain where things stand now. I'm stopping here on my own to ask 
for advice.
   
   I've been trying to make this work on Dataflow. This is the pipeline I have 
on hand (for testing purposes):
   
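   ```py
   import traceback
   
   import apache_beam as beam
   import logfire
   from apache_beam import Pipeline
   from apache_beam.options.pipeline_options import PipelineOptions
   
   
   class Split(beam.DoFn):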
       def process(self, element: str):
           logfire.info(f"in split {element}")
           return element.split(" ")
   
   
   def logfire_print(element: str):
       with logfire.span("Print"):
           logfire.info(element)
           logfire.info("{globals}", globals=globals())
           logfire.info("{locals}", locals=locals())
           logfire.info("{traceback}", traceback=traceback.format_stack())
   
   
   pipeline_options = PipelineOptions()
   
   with Pipeline(options=pipeline_options) as pipeline:
       text = [
           "To be, or not to be: that is the question: ",
           "Whether 'tis nobler in the mind to suffer ",
           "The slings and arrows of outrageous fortune, ",
           "Or to take arms against a sea of troubles, ",
       ]
   
       pipeline = (
           pipeline
           | "Create" >> beam.Create(text)
           | "Split" >> beam.ParDo(Split())
           | "Filter" >> beam.Filter(lambda x: x != "the")
           | "Print" >> beam.Map(logfire_print)
       )
   ``` 
   
   What we want is a span that starts when we run the pipeline, i.e. on the 
machine that launches it:
   
   ```bash
    uv run python main.py \
       --region ... \
       --runner DataflowRunner \
       --project ... \
       --temp_location ... \
       --requirements_file ./requirements.txt \
       --save_main_session
   ```
   
   We also want a span to be created whenever a step runs. For example, each 
time `logfire_print` is called for an element, the call should be wrapped in an 
automatic span, so that any `logfire.info()` inside it is nested under that 
span.
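   
   A minimal sketch of that per-step wrapping, using only the standard library. 
The `span`, `info`, and `EVENTS` names are hypothetical stand-ins that record 
events in a list; in the real pipeline `logfire.span` and `logfire.info` would 
take their place. `with_span` is likewise an illustrative helper, not an 
existing Beam or Logfire API.

```python
import functools
from contextlib import contextmanager

# Stand-in tracing backend (assumption): records (event, name) tuples in order.
EVENTS = []


@contextmanager
def span(name):
    """Hypothetical replacement for `logfire.span(name)`."""
    EVENTS.append(("start", name))
    try:
        yield
    finally:
        EVENTS.append(("end", name))


def info(message):
    """Hypothetical replacement for `logfire.info(message)`."""
    EVENTS.append(("info", message))


def with_span(fn):
    """Wrap `fn` so that every call runs inside a span named after it."""

    @functools.wraps(fn)
    def wrapper(*args, **kwargs):
        with span(fn.__name__):
            return fn(*args, **kwargs)

    return wrapper


@with_span
def logfire_print(element):
    info(element)


# Each call opens its own span, and the info event lands inside it.
for word in ["To", "be"]:
    logfire_print(word)
```

   Applying the wrapper explicitly, e.g. `beam.Map(with_span(logfire_print))`, 
would make the wrapping part of the callable handed to Beam. Whether that 
wrapper serializes cleanly on Dataflow is not verified here.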
   
   The problem is that I can't monkeypatch anything on the worker side, 
because the function is pickled to be sent to the worker.
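   
   To illustrate why a patch applied on the launching machine may not reach 
the worker, here is a sketch using the standard `pickle` module, which 
serializes importable functions by reference (module and attribute name only). 
This is an assumption about the mechanism: Beam's own pickler may treat 
functions defined in `__main__` differently. `json.dumps` is just a convenient 
library function to patch, `patched` is a hypothetical monkeypatch, and the 
"worker" is simulated by undoing the patch in the same process.

```python
import json
import pickle

original = json.dumps

# Stdlib pickle stores only the module and attribute name, not the function body.
payload = pickle.dumps(original)
names_only = b"json" in payload and b"dumps" in payload


def patched(*args, **kwargs):
    """Hypothetical monkeypatch standing in for an instrumented version."""
    return "patched"


# A process that applied the patch resolves the name to the patched function...
json.dumps = patched
try:
    restored_with_patch = pickle.loads(payload)
finally:
    json.dumps = original  # undo the patch

# ...while a process that never applied it (like a fresh worker) gets the original.
restored_without_patch = pickle.loads(payload)
```

   The name is looked up in whichever process unpickles the payload, so a patch 
would have to be applied again on the worker to take effect there.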
   
   <img width="883" alt="Screenshot 2024-12-12 at 14 28 07" 
src="https://github.com/user-attachments/assets/11a9758d-930c-4ba8-b9e3-aba9e64b16d7" />
   
   This is what I got ☝️
   
   I wanted `Print` to be inside the first big span...

