Thanks for the update! I also was not able to repro, so presumably something is fixed? :-)

Thanks,
Evan

On Mon, Mar 21, 2022 at 8:40 PM Valentyn Tymofieiev <[email protected]> wrote:

> I came across this thread and wasn't able to reproduce the `expecting a KV coder, but had Strings` error, so hopefully that's fixed now. I had to modify the repro to add .with_outputs() to line 49 in https://gist.github.com/egalpin/2d6ad2210cf9f66108ff48a9c7566ebc
>
> On Mon, Sep 27, 2021 at 5:58 PM Robert Bradshaw <[email protected]> wrote:
>
>> As a workaround, can you try passing the use_portable_job_submission experiment?
>>
>> On Mon, Sep 27, 2021 at 2:19 PM Luke Cwik <[email protected]> wrote:
>> >
>> > Sorry, I forgot that you had a minimal repro for this issue, I attached details to the internal bug.
>> >
>> > On Mon, Sep 27, 2021 at 2:18 PM Luke Cwik <[email protected]> wrote:
>> >>
>> >> There is an internal bug 195053987 that matches what you're describing but we were unable to get a minimal repro for it. It would be useful if you had a minimal repro for the issue that I could update the internal bug with details and/or you could reach out to GCP support with job ids and/or minimal repros to get support as well.
>> >>
>> >> On Wed, Sep 22, 2021 at 6:57 AM Evan Galpin <[email protected]> wrote:
>> >>>
>> >>> Thanks for the response Luke :-)
>> >>>
>> >>> I did try setting <pcoll>.element_type for each resulting PCollection using "apache_beam.typehints.typehints.KV" to describe the elements, which passed type checking. I also ran the full dataset (batch job) without the GBK in question but instead using a dummy DoFn in its place which asserted that every element that would be going into the GBK was a 2-tuple, along with using --runtime_type_check, all of which ran successfully without the GBK after the TaggedOutput DoFn.
>> >>>
>> >>> Adding back the GBK also runs end-to-end successfully on the DirectRunner using the identical dataset. But as soon as I add the GBK and use the DataflowRunner (v2), I get errors as soon as the optimized step involving the GBK is in the "running" status:
>> >>>
>> >>> - "Could not start worker docker container"
>> >>> - "Error syncing pod"
>> >>> - "Check failed: pair_coder Strings" or "Check failed: kv_coder : expecting a KV coder, but had Strings"
>> >>>
>> >>> Anything further to try? I can also provide Job IDs from Dataflow if helpful (and safe to share).
>> >>>
>> >>> Thanks,
>> >>> Evan
>> >>>
>> >>> On Wed, Sep 22, 2021 at 1:09 AM Luke Cwik <[email protected]> wrote:
>> >>>>
>> >>>> Have you tried setting the element_type[1] explicitly on each output PCollection that is returned after applying the multi-output ParDo?
>> >>>> I believe you'll get a DoOutputsTuple[2] returned after applying the multi-output ParDo which allows access to the underlying PCollection objects.
>> >>>>
>> >>>> 1: https://github.com/apache/beam/blob/ebf2aacf37b97fc85b167271f184f61f5b06ddc3/sdks/python/apache_beam/pvalue.py#L99
>> >>>> 2: https://github.com/apache/beam/blob/ebf2aacf37b97fc85b167271f184f61f5b06ddc3/sdks/python/apache_beam/pvalue.py#L234
>> >>>>
>> >>>> On Tue, Sep 21, 2021 at 10:29 AM Evan Galpin <[email protected]> wrote:
>> >>>>>
>> >>>>> This is badly plaguing a pipeline I'm currently developing, where the exact same data set and code runs end-to-end on DirectRunner, but fails on DataflowRunner with either "Check failed: kv_coder : expecting a KV coder, but had Strings" or "Check failed: pair_coder Strings" hidden in the harness logs. It seems to be consistently repeatable with any TaggedOutput + GBK afterwards.
>> >>>>>
>> >>>>> Any advice on how to proceed?
>> >>>>>
>> >>>>> Thanks,
>> >>>>> Evan
>> >>>>>
>> >>>>> On Fri, Sep 17, 2021 at 11:20 AM Evan Galpin <[email protected]> wrote:
>> >>>>>>
>> >>>>>> The Dataflow error logs only showed 1 error which was: "The job failed because a work item has failed 4 times. Look in previous log entries for the cause of each one of the 4 failures. For more information, see https://cloud.google.com/dataflow/docs/guides/common-errors. The work item was attempted on these workers: beamapp-XXXX-XXXXX-kt85-harness-8k2c Root cause: The worker lost contact with the service." In "Diagnostics" there were errors stating "Error syncing pod: Could not start worker docker container". The harness logs, i.e. "projects/my-project/logs/dataflow.googleapis.com%2Fharness", finally contained an error that looked suspect, which was "Check failed: kv_coder : expecting a KV coder, but had Strings". Below [1] is a link to possibly a stacktrace or extra detail, but it is internal to Google so I don't have access.
>> >>>>>>
>> >>>>>> [1] https://symbolize.corp.google.com/r/?trace=55a197abcf56,55a197abbe33,55a197abb97e,55a197abd708,55a196d4e22f,55a196d4d8d3,55a196d4da35,55a1967ec247,55a196f62b26,55a1968969b3,55a196886613,55a19696b0e6,55a196969815,55a1969693eb,55a19696916e,55a1969653bc,55a196b0150a,55a196b04e11,55a1979fc8df,7fe7736674e7,7fe7734dc22c&map=13ddc0ac8b57640c29c5016eb26ef88e:55a1956e7000-55a197bd5010,f1c96c67b57b74a4d7050f34aca016eef674f765:7fe773660000-7fe773676dac,76b955c7af655a4c1e53b8d4aaa0255f3721f95f:7fe7734a5000-7fe7736464c4
>> >>>>>>
>> >>>>>> On Thu, Sep 9, 2021 at 6:46 PM Robert Bradshaw <[email protected]> wrote:
>> >>>>>>>
>> >>>>>>> Huh, that's strange. Yes, the exact error on the service would be helpful.
>> >>>>>>>
>> >>>>>>> On Wed, Sep 8, 2021 at 10:12 AM Evan Galpin <[email protected]> wrote:
>> >>>>>>> >
>> >>>>>>> > Thanks for the response. I've created a gist here to demonstrate a minimal repro: https://gist.github.com/egalpin/2d6ad2210cf9f66108ff48a9c7566ebc
>> >>>>>>> >
>> >>>>>>> > It seemed to run fine both on DirectRunner and PortableRunner (embed mode), but Dataflow v2 runner raised an error at runtime seemingly associated with the Shuffle service? I have job IDs and trace links if those are helpful as well.
>> >>>>>>> >
>> >>>>>>> > Thanks,
>> >>>>>>> > Evan
>> >>>>>>> >
>> >>>>>>> > On Tue, Sep 7, 2021 at 4:35 PM Robert Bradshaw <[email protected]> wrote:
>> >>>>>>> >>
>> >>>>>>> >> This is not yet supported. Using a union for now is the way to go. (If only the last value of the union was used, that sounds like a bug. Do you have a minimal repro?)
>> >>>>>>> >>
>> >>>>>>> >> On Tue, Sep 7, 2021 at 1:23 PM Evan Galpin <[email protected]> wrote:
>> >>>>>>> >> >
>> >>>>>>> >> > Hi all,
>> >>>>>>> >> >
>> >>>>>>> >> > What is the recommended way to write type hints for a tagged output DoFn where the outputs to different tags have different types?
>> >>>>>>> >> >
>> >>>>>>> >> > I tried using a Union to describe each of the possible output types, but that resulted in mismatched coder errors where only the last entry in the Union was used as the assumed type. Is there a way to associate a type hint to a tag or something like that?
>> >>>>>>> >> >
>> >>>>>>> >> > Thanks,
>> >>>>>>> >> > Evan
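
One debugging step mentioned in this thread was swapping the GBK for a dummy DoFn that asserts every element headed into it is a 2-tuple. The core of that check could look roughly like the sketch below. It is plain Python with no Beam dependency, and the names `assert_kv` and `check_all` are hypothetical, not taken from the gist. In an actual pipeline the same check would presumably be wrapped in a DoFn or passed to `beam.Map`.

```python
from typing import Any, Iterable, Iterator, Tuple


def assert_kv(element: Any) -> Tuple[Any, Any]:
    """Return the element unchanged if it is a 2-tuple, else raise TypeError.

    Hypothetical helper: mirrors the "every element is a 2-tuple" check
    described in the thread, not code from the original repro.
    """
    if not (isinstance(element, tuple) and len(element) == 2):
        raise TypeError(f"expected a (key, value) 2-tuple, got {element!r}")
    return element


def check_all(elements: Iterable[Any]) -> Iterator[Tuple[Any, Any]]:
    """Lazily validate a stream of elements, yielding each one that passes."""
    for element in elements:
        yield assert_kv(element)
```

A check like this only validates the shape of the Python values. As the thread shows, it can pass while the runner still infers a non-KV coder, so a clean run here does not rule out the coder mismatch.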
