Re: [I] [SIP-161] Translating Superset asset data [superset]

via GitHub Tue, 08 Apr 2025 22:28:14 -0700


pomegranited commented on issue #32854:
URL: https://github.com/apache/superset/issues/32854#issuecomment-2788318466


   Thank you for your feedback @mistercrunch !
   
   > ### Clarifying the use of string_ids and/or use strings as keys in i18n_translations.
   
   My thinking here was to avoid having these jinja filters in the actual 
fields, and wrap the entire string for templated fields at runtime.
   
   This gives us a few advantages which I detailed in the final point under 
"Rejected Alternatives". What do you think?
   
   > ### About caching / About batch-retrieval
   
   Ah, good point: batch retrieval probably won't work for a jinja filter! I 
have added an `I18N_ASSET_TRANSLATIONS_CACHE_CONFIG` setting to the description 
above, so operators can configure caching if desired.
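
To make the caching option concrete, here is a rough sketch of what an operator might put in `superset_config.py`. It assumes the new setting takes a Flask-Caching style dict, like Superset's existing cache settings; the SIP description is the authority on the actual shape, and the Redis URL and key prefix below are placeholders.

```python
# superset_config.py (sketch only)
# Assumption: I18N_ASSET_TRANSLATIONS_CACHE_CONFIG accepts the same
# Flask-Caching keys as Superset's other *_CACHE_CONFIG settings.
I18N_ASSET_TRANSLATIONS_CACHE_CONFIG = {
    "CACHE_TYPE": "RedisCache",
    "CACHE_DEFAULT_TIMEOUT": 60 * 60,  # seconds; placeholder value
    "CACHE_KEY_PREFIX": "superset_i18n_assets_",  # placeholder prefix
    "CACHE_REDIS_URL": "redis://localhost:6379/2",  # placeholder URL
}
```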
   
   > ### About auto-populating translation through a service
   > In the age of AI, I think no one should manually translate stuff anymore. 
   
   Agreed, but we also need to be able to correct what the AI gets wrong, so I 
don't think we can ditch the UI completely.
   
   > The jinja macro, as it encounters new strings, will auto-populate rows in 
i18n_translations and flag them as "to-be-translated".
   
   I don't think we need to flag them -- the absence of a translation for a 
given language is enough to know it needs to be translated.
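
As a minimal sketch of why no flag is needed: the strings still to be translated are simply the translatable fields that have no row for a given language. The `Translation` shape and the `missing_translations` helper below are hypothetical illustrations, not the SIP's actual schema or API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Translation:
    # Hypothetical row shape; the real i18n_translations columns may differ.
    asset_uuid: str
    model: str
    field: str
    language: str
    text: str


def missing_translations(source_fields, rows, languages):
    """Return (asset_uuid, model, field, language) tuples that have no row."""
    translated = {(r.asset_uuid, r.model, r.field, r.language) for r in rows}
    return [
        (uuid, model, field, lang)
        for (uuid, model, field) in source_fields
        for lang in languages
        if (uuid, model, field, lang) not in translated
    ]


# One dashboard title, translated to French but not German.
rows = [Translation("a1", "Dashboard", "dashboard_title", "fr", "Ventes")]
fields = [("a1", "Dashboard", "dashboard_title")]
todo = missing_translations(fields, rows, ["fr", "de"])
# todo == [("a1", "Dashboard", "dashboard_title", "de")]
```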
   
   >  It seems writing a simple job, or generating async jobs to call a service 
using celery-beat (a cron that can run every N minutes that looks for new 
string, batches a call to GPT or google translate).
   
   Someone can write this for sure -- but I won't do it as part of this SIP 
because of the open source issue. I am providing a command-line tool for 
extracting/importing translations though, so I will ensure that the API I 
create is general enough for someone to use it with a machine translation 
service.
   
   > Even if we do want/need a UI for translation, i'd suggest building it 
externally to Superset maybe (similar to the poedit approach), so we'd only 
have to expose a CRUD REST API (if even) as you could slap a simple UI on top 
(Claude can probably build one quickly, or even plug a simple no-code solution 
like ReTool or AirTable on top of that table/model). It punts this UI 
complexity outside of Superset.
   
   I'm definitely in favour of a simple UI for these translations. But there's 
a huge advantage to having an in-context UI when humans are translating these 
strings, and I'd like to preserve that.
   
   > ### About garbage collection
   > Lots of strings might get orphaned, and can be good to trim the table. 
That would require maintaining some sort of "last_used_dttm" thing, but that's 
more IO to manage.
   
   I think we can trim the table without recording "last used" -- since we're 
storing the asset UUID + field and model names, we can easily identify 
translated strings that are no longer used. I've added a 2nd "command line 
tool" to handle this for Phase 1 -- operators can run/cron this as needed.
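
The trimming idea can be sketched as a set difference: any translation whose (asset UUID, model, field) no longer matches a live asset field is an orphan. The `find_orphans` helper and the tuple layout are assumptions for illustration only; the actual command-line tool may work differently.

```python
def find_orphans(translation_keys, live_fields):
    """Return translation keys whose (asset UUID, model, field) is gone.

    translation_keys: iterable of (asset_uuid, model, field, language)
    live_fields:      iterable of (asset_uuid, model, field) that still exist
    """
    live = set(live_fields)
    return [key for key in translation_keys if key[:3] not in live]


# Chart "c9" was deleted, so its French translation is orphaned.
keys = [
    ("a1", "Dashboard", "dashboard_title", "fr"),
    ("c9", "Slice", "slice_name", "fr"),
]
live = [("a1", "Dashboard", "dashboard_title")]
orphans = find_orphans(keys, live)
# orphans == [("c9", "Slice", "slice_name", "fr")]
```

No "last used" timestamp is read or written here, which is the point: the check only needs the identifiers already stored with each translation.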
   
   > ### About Jinja overhead
   > ...simply not bothering with doing the template logic if/when we're in a 
non-i18n environment.
   
   True. I've updated the description to state that this feature will be 
enabled/used only if both the feature flag and multiple languages are enabled 
in Superset.
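
A rough sketch of that guard, assuming a hypothetical flag name (`ENABLE_ASSET_TRANSLATIONS`) and a languages mapping keyed by locale code; neither name is confirmed by the SIP, and the function is illustrative only.

```python
def asset_translation_enabled(feature_flags, languages):
    """True only when the flag is on AND more than one language is configured.

    feature_flags: dict of flag name -> bool
                   (ENABLE_ASSET_TRANSLATIONS is a hypothetical name)
    languages:     dict keyed by locale code, e.g. {"en": {...}, "fr": {...}}
    """
    flag_on = bool(feature_flags.get("ENABLE_ASSET_TRANSLATIONS", False))
    return flag_on and len(languages) > 1


# Single-language deployments skip the template/translation work entirely.
single = asset_translation_enabled({"ENABLE_ASSET_TRANSLATIONS": True}, {"en": {}})
multi = asset_translation_enabled(
    {"ENABLE_ASSET_TRANSLATIONS": True}, {"en": {}, "fr": {}}
)
# single is False, multi is True
```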


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscr...@superset.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


---------------------------------------------------------------------
For additional commands, e-mail: notifications-h...@superset.apache.org
