Hi,

The model you use should be chosen based on where the audio being transcribed comes from. If the audio does not match any of the specialized models, the default model is appropriate. You can find a breakdown of each model in the request configuration documentation <https://cloud.google.com/speech-to-text/docs/reference/rest/v1/RecognitionConfig>. If you want to use speaker diarization <https://cloud.google.com/speech-to-text/docs/multiple-voices#speaker_diarization>, note that it is only available for the phone_call model.
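In case it helps, here is a minimal sketch of how the model choice could look in the JSON request configuration. The field names (`languageCode`, `model`, `diarizationConfig`, `enableSpeakerDiarization`) follow the REST v1 RecognitionConfig reference linked above. The helper name `build_recognition_config` and the `en-US` default are my own assumptions for illustration, and the diarization check only mirrors the restriction described in this reply.

```python
import json


def build_recognition_config(model="default", language_code="en-US", diarize=False):
    """Build a RecognitionConfig-style dict (hypothetical helper, not a library call)."""
    config = {
        "languageCode": language_code,  # assumed language; change to match your audio
        "model": model,                 # e.g. "default", "video", "phone_call"
    }
    if diarize:
        # Mirrors the restriction described in this reply: diarization
        # is only offered for the phone_call model.
        if model != "phone_call":
            raise ValueError("speaker diarization requires model='phone_call'")
        config["diarizationConfig"] = {"enableSpeakerDiarization": True}
    return config


# Print the config for a phone recording with speaker diarization enabled.
print(json.dumps(build_recognition_config("phone_call", diarize=True), indent=2))
```

The resulting dict would go under the `config` key of the request body, next to the `audio` field.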
Regarding audio length: whether you choose the default, video, or phone_call model, you shouldn't have any problems transcribing a 3-hour audio file. The content limits <https://cloud.google.com/speech-to-text/quotas#content> depend on the request type rather than on the model chosen.
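To make the "request type, not model" point concrete, here is a rough sketch that picks a REST method from the audio duration. The limits used (about 1 minute for synchronous requests, about 480 minutes for asynchronous ones) are what I recall from the content limits page linked above, so please treat them as assumptions and check that page for current values. The function name `pick_request_method` is hypothetical.

```python
# Assumed limits, taken from the content limits page; verify before relying on them.
SYNC_LIMIT_S = 60            # synchronous requests: roughly 1 minute of audio
ASYNC_LIMIT_S = 480 * 60     # asynchronous requests: roughly 480 minutes of audio


def pick_request_method(duration_s):
    """Return the REST method suited to audio of the given length in seconds."""
    if duration_s <= SYNC_LIMIT_S:
        return "speech:recognize"
    if duration_s <= ASYNC_LIMIT_S:
        return "speech:longrunningrecognize"
    raise ValueError("audio exceeds the asynchronous limit; split it into parts")


# A 3-hour recording is 180 minutes, which fits within the asynchronous limit.
print(pick_request_method(3 * 60 * 60))
```

For a file that long, the audio would normally be referenced by a Cloud Storage URI in the request rather than sent inline.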