It looks like model_info is not implemented at all.  E.g.
https://ores-legacy.wikimedia.org/v3/scores/enwiki?model_info=statistics.thresholds.true.%22maximum+recall+@+precision+%3E=+0.9%22&models=damaging

I get {"detail":{"error":{"code":"bad request","message":"model_info query
parameter is not supported by this endpoint anymore. For more information
please visit https://wikitech.wikimedia.org/wiki/ORES"}}}

But when I go to that page, nothing discusses model_info.  Is there a way
to get this from LiftWing?
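
For reference, here is a minimal reproduction of the request above as a
Python sketch (the requests library is assumed, and the User-Agent value is
a hypothetical placeholder, per the UA policy mentioned further down):

    # Sketch: reproduce the model_info request that ores-legacy rejects.
    import requests

    resp = requests.get(
        "https://ores-legacy.wikimedia.org/v3/scores/enwiki",
        params={
            "models": "damaging",
            "model_info": 'statistics.thresholds.true."maximum recall @ precision >= 0.9"',
        },
        # Hypothetical UA per https://meta.wikimedia.org/wiki/User-Agent_policy
        headers={"User-Agent": "threshold-check/0.1 (example@example.org)"},
    )
    print(resp.status_code)  # a "bad request" status
    print(resp.json())       # the {"detail": {"error": ...}} payload quoted above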

On Fri, Sep 22, 2023 at 8:53 AM Aaron Halfaker <aaron.halfa...@gmail.com>
wrote:

> Do you have a tag for filing bugs against ORES-legacy?  I can't seem to
> find a relevant one in phab.
>
> On Fri, Sep 22, 2023 at 8:39 AM Luca Toscano <ltosc...@wikimedia.org>
> wrote:
>
>> Hi Aaron!
>>
>> Thanks for following up. The API is almost compatible with what ORES
>> currently does, but there are limitations (like the max number of revisions
>> in a batch, etc.). The API clearly states when something is not supported,
>> so you can check its compatibility now by making some requests to:
>>
>> https://ores-legacy.wikimedia.org
>>
>> If you open a task with a list of systems that you need to migrate, we
>> can definitely take a look and help. So far the traffic being served by
>> ORES has been reduced to a few clients, and none of them run with
>> recognizable UAs (see https://meta.wikimedia.org/wiki/User-Agent_policy),
>> so we'll try our best to support them. The migration to Lift Wing has been
>> widely publicized, and a lot of documentation is available to help you
>> migrate. We'd suggest trying Lift Wing for your systems instead (see
>> https://wikitech.wikimedia.org/wiki/Machine_Learning/LiftWing/Usage).
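>>
>> As a rough sketch, a Lift Wing request looks like the following (the model
>> name and revision id are illustrative; please check the Usage page above
>> for the authoritative endpoint and authentication details):
>>
>> # Sketch: score one revision with the enwiki damaging model on Lift Wing.
>> import requests
>>
>> resp = requests.post(
>>     "https://api.wikimedia.org/service/lw/inference/v1/models/enwiki-damaging:predict",
>>     json={"rev_id": 12345},  # illustrative revision id
>>     # Recognizable UA per https://meta.wikimedia.org/wiki/User-Agent_policy
>>     headers={"User-Agent": "my-tool/0.1 (example@example.org)"},
>> )
>> print(resp.json())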
>>
>> The Machine Learning team's plan is to eventually deprecate ores-legacy
>> too, in order to maintain only one system (namely Lift Wing). There is no
>> final date yet; we'll try to reach out to all remaining users first, so if
>> you plan to keep using ores-legacy please follow up with us first :)
>>
>> Thanks!
>>
>> Luca (on behalf of the ML Team)
>>
>> On Fri, Sep 22, 2023 at 5:10 PM Aaron Halfaker <aaron.halfa...@gmail.com>
>> wrote:
>>
>>> Does the new ores-legacy support the same feature set, e.g. features
>>> output, injection, and threshold optimizations? Or is it just prediction?
>>> This will affect some of the systems I need to migrate.
>>>
>>> On Fri, Sep 22, 2023, 06:21 Ilias Sarantopoulos <
>>> isarantopou...@wikimedia.org> wrote:
>>>
>>>> Hello!
>>>>
>>>>
>>>> As a next step in the deprecation process of ORES
>>>> (https://wikitech.wikimedia.org/wiki/ORES), the Machine Learning team will
>>>> switch the backend of ores.wikimedia.org to ores-legacy, a k8s
>>>> application meant to provide a compatibility layer between ORES and Lift
>>>> Wing, so that users who have not yet migrated to Lift Wing will be
>>>> transparently migrated. Ores-legacy is an application that has the same
>>>> API as ORES but in the background makes requests to Lift Wing, allowing us
>>>> to decommission the ORES servers before all clients have moved.
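>>>>
>>>> For example, a client that calls the ORES v3 scores API today should be
>>>> able to exercise the compatibility layer with something like this sketch
>>>> (the revision id is illustrative):
>>>>
>>>> # Sketch: the same ORES-style scores call, served by ores-legacy.
>>>> import requests
>>>>
>>>> resp = requests.get(
>>>>     "https://ores-legacy.wikimedia.org/v3/scores/enwiki",
>>>>     params={"models": "damaging|goodfaith", "revids": "12345"},
>>>>     headers={"User-Agent": "my-tool/0.1 (example@example.org)"},
>>>> )
>>>> print(resp.json())  # same response shape as the old ORES endpoint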
>>>>
>>>> This change is planned to take place on Monday, the 25th of September.
>>>> If you have a client/application that is still using ORES, we expect this
>>>> switch to be transparent for you.
>>>>
>>>> However, keep in mind that ores-legacy is not a 100% replacement for
>>>> ORES, as some old and unused features are no longer supported.
>>>>
>>>> If you see anything out of the ordinary, feel free to contact the
>>>> Machine Learning team:
>>>>
>>>> IRC libera: #wikimedia-ml
>>>>
>>>> Phabricator: Machine-Learning-team tag
>>>>
>>>> Thank you!
>>>>
>>>>
>>>> On Wed, Aug 9, 2023 at 1:22 PM Chaloemphon Praphuchakang <
>>>> yoshrakpra...@gmail.com> wrote:
>>>>
>>>>>
>>>>> On Tue, 8 Aug 2023, 10:45 Tilman Bayer, <haebw...@gmail.com> wrote:
>>>>>
>>>>>>
>>>>>> Hi Chris,
>>>>>>
>>>>>> On Mon, Aug 7, 2023 at 11:51 AM Chris Albon <cal...@wikimedia.org>
>>>>>> wrote:
>>>>>>
>>>>>>> Hi Tilman,
>>>>>>>
>>>>>>> Most of the work is still very experimental. We have hosted a few
>>>>>>> LLMs on Lift Wing already (StarCoder for example) but they were just
>>>>>>> running on CPU, far too slow for real use cases. But it proves that we
>>>>>>> can easily host LLMs on Lift Wing. We have been pretty quiet about it
>>>>>>> while we focus on the ORES migration, but it is our next big project.
>>>>>>> More soon hopefully!
>>>>>>>
>>>>>> Understood. Looking forward to learning more later!
>>>>>>
>>>>>>
>>>>>>> Where we are now is that we have budget for a big GPU purchase
>>>>>>> (~10-20 GPUs depending on cost). The question we will try to answer
>>>>>>> after the ORES migration is complete is: what GPUs should we purchase?
>>>>>>> We are trying to balance our strong preference to stay open source
>>>>>>> (i.e. AMD ROCm) in a world dominated by a single closed-source vendor
>>>>>>> (i.e. Nvidia). In addition, do we go for a few expensive GPUs better
>>>>>>> suited to LLMs (A100, H100, etc.) or a mix of big and small? We will
>>>>>>> need to figure out all this.
>>>>>>>
>>>>>> I see. On that matter, what do you folks make of the recent
>>>>>> announcements of AMD's partnerships with Hugging Face and PyTorch[5]?
>>>>>> (which, I understand, came after the ML team had already launched the
>>>>>> aforementioned new AMD explorations)
>>>>>>
>>>>>> "Open-source AI: AMD looks to Hugging Face and Meta spinoff PyTorch
>>>>>> to take on Nvidia [...]
>>>>>> Both partnerships involve AMD’s ROCm AI software stack, the company’s
>>>>>> answer to Nvidia’s proprietary CUDA platform and application-programming
>>>>>> interface. AMD called ROCm an open and portable AI system with
>>>>>> out-of-the-box support that can port to existing AI models. [...B]oth AMD
>>>>>> and Hugging Face are dedicating engineering resources to each other and
>>>>>> sharing data to ensure that the constantly updated AI models from Hugging
>>>>>> Face, which might not otherwise run well on AMD hardware, would be
>>>>>> “guaranteed” to work on hardware like the MI300X. [...] AMD said PyTorch
>>>>>> will fully upstream the ROCm software stack and “provide immediate ‘day
>>>>>> zero’ support for PyTorch 2.0 with ROCm release 5.4.2 on all AMD Instinct
>>>>>> accelerators,” which is meant to appeal to those customers looking to
>>>>>> switch from Nvidia’s software ecosystem."
>>>>>>
>>>>>>
>>>>>> In their own announcement, Hugging Face offered further details,
>>>>>> including a pretty impressive list of models to be supported:[6]
>>>>>>
>>>>>>
>>>>>> "We intend to support state-of-the-art transformer architectures for
>>>>>> natural language processing, computer vision, and speech, such as BERT,
>>>>>> DistilBERT, ROBERTA, Vision Transformer, CLIP, and Wav2Vec2. Of course,
>>>>>> generative AI models will be available too (e.g., GPT2, GPT-NeoX, T5, 
>>>>>> OPT,
>>>>>> LLaMA), including our own BLOOM and StarCoder models. Lastly, we will 
>>>>>> also
>>>>>> support more traditional computer vision models, like ResNet and ResNext,
>>>>>> and deep learning recommendation models, a first for us. [..] We'll do 
>>>>>> our
>>>>>> best to test and validate these models for PyTorch, TensorFlow, and ONNX
>>>>>> Runtime for the above platforms. [...] We will integrate the AMD ROCm SDK
>>>>>> seamlessly in our open-source libraries, starting with the transformers
>>>>>> library."
>>>>>>
>>>>>>
>>>>>> Do you think this may promise too much, or could it point to a
>>>>>> possible solution to the Foundation's conundrum?
>>>>>> In any case, this seems to be an interesting moment in which many in AI
>>>>>> are trying to move away from Nvidia's proprietary CUDA platform, though
>>>>>> most of them probably more for financial and availability reasons, given
>>>>>> the current GPU shortages[7] (which the ML team is undoubtedly aware of
>>>>>> already; I mention this as context for others on this list. See also
>>>>>> MarketWatch's remarks about current margins[5]).
>>>>>>
>>>>>> Regards, Tilman
>>>>>>
>>>>>>
>>>>>> [5]
>>>>>> https://archive.ph/2023.06.15-173527/https://www.marketwatch.com/amp/story/open-source-ai-amd-looks-to-hugging-face-and-meta-spinoff-pytorch-to-take-on-nvidia-e4738f87
>>>>>> [6] https://huggingface.co/blog/huggingface-and-amd
>>>>>> [7] See e.g.
>>>>>> https://gpus.llm-utils.org/nvidia-h100-gpus-supply-and-demand/
>>>>>> (avoid playing the song though. Don't say I didn't warn you)
>>>>>>
>>>>>>
>>>>>>> I wouldn't characterize the WMF Language Team's use of CPU as being
>>>>>>> because of AMD; rather, at the time we didn't have the budget for GPUs,
>>>>>>> so Lift Wing didn't have any. Since then we have moved two GPUs onto
>>>>>>> Lift Wing for testing, but they are pretty old (2017ish). Once we make
>>>>>>> the big GPU purchase, Lift Wing will gain a lot of functionality for
>>>>>>> LLMs and similar models.
>>>>>>>
>>>>>>> Chris
>>>>>>>
>>>>>>> On Sun, Aug 6, 2023 at 9:57 PM Tilman Bayer <haebw...@gmail.com>
>>>>>>> wrote:
>>>>>>>
>>>>>>>> On Thu, Aug 3, 2023 at 7:16 AM Chris Albon <cal...@wikimedia.org>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>>> Hi everybody,
>>>>>>>>>
>>>>>>>>> TL;DR: We would like users of ORES models to migrate to our new
>>>>>>>>> open source ML infrastructure, Lift Wing, within the next five months.
>>>>>>>>> We are available to help you do that, from advice to making code
>>>>>>>>> commits. It is important to note: all ML models currently accessible
>>>>>>>>> on ORES are also currently accessible on Lift Wing.
>>>>>>>>>
>>>>>>>>> As part of the Machine Learning Modernization Project (
>>>>>>>>> https://www.mediawiki.org/wiki/Machine_Learning/Modernization),
>>>>>>>>> the Machine Learning team has deployed Wikimedia’s new machine
>>>>>>>>> learning inference infrastructure, called Lift Wing (
>>>>>>>>> https://wikitech.wikimedia.org/wiki/Machine_Learning/LiftWing).
>>>>>>>>> Lift Wing brings a lot of new features, such as support for GPU-based
>>>>>>>>> models, open source LLM hosting, auto-scaling, stability, and the
>>>>>>>>> ability to host a larger number of models.
>>>>>>>>>
>>>>>>>>
>>>>>>>> This sounds quite exciting! What's the best place to read up on that
>>>>>>>> planned support for GPU-based models and open source LLMs? (I also saw
>>>>>>>> in the recent NYT article[1] that the team is "in the process of
>>>>>>>> adapting A.I. models that are 'off the shelf' — essentially models that
>>>>>>>> have been made available by researchers for anyone to freely customize
>>>>>>>> — so that Wikipedia’s editors can use them for their work.")
>>>>>>>>
>>>>>>>> I'm aware of the history[2] of not being able to use NVIDIA GPUs due
>>>>>>>> to their CUDA drivers being proprietary. It was mentioned recently in
>>>>>>>> the Wikimedia AI Telegram group that this is still a serious
>>>>>>>> limitation, despite some new explorations with AMD GPUs[3] - to the
>>>>>>>> point that e.g. the WMF's Language team has resorted to using models
>>>>>>>> without GPU support (CPU only).[4]
>>>>>>>> It sounds like there is reasonable hope that this situation could
>>>>>>>> change fairly soon? Would it also mean both at the same time, i.e. open
>>>>>>>> source LLMs running with GPU support (considering that at least some
>>>>>>>> well-known ones appear to require torch.cuda.is_available() == True for
>>>>>>>> that)?
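>>>>>>>>
>>>>>>>> (For illustration, this is the kind of check I mean; as I understand
>>>>>>>> it, PyTorch's ROCm builds expose the same torch.cuda API via HIP, so
>>>>>>>> the check can also succeed on AMD GPUs:)
>>>>>>>>
>>>>>>>> # Sketch of the device-availability check referenced above. On ROCm
>>>>>>>> # builds, torch.version.hip is set and torch.cuda.is_available() can
>>>>>>>> # report AMD GPUs, since torch.cuda is backed by HIP there.
>>>>>>>> import torch
>>>>>>>>
>>>>>>>> device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
>>>>>>>> print(device, getattr(torch.version, "hip", None))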
>>>>>>>>
>>>>>>>> Regards, Tilman
>>>>>>>>
>>>>>>>> [1]
>>>>>>>> https://www.nytimes.com/2023/07/18/magazine/wikipedia-ai-chatgpt.html
>>>>>>>> [2]
>>>>>>>> https://techblog.wikimedia.org/2020/04/06/saying-no-to-proprietary-code-in-production-is-hard-work-the-gpu-chapter/
>>>>>>>> [3] https://phabricator.wikimedia.org/T334583 etc.
>>>>>>>> [4]
>>>>>>>> https://diff.wikimedia.org/2023/06/13/mint-supporting-underserved-languages-with-open-machine-translation/
>>>>>>>> or https://thottingal.in/blog/2023/07/21/wikiqa/ (experimental
>>>>>>>> but, I understand, written to be deployable on WMF infrastructure)
>>>>>>>>
>>>>>>>>
>>>>>>>>>
>>>>>>>>> With the creation of Lift Wing, the team is turning its attention
>>>>>>>>> to deprecating the current machine learning infrastructure, ORES. ORES
>>>>>>>>> has served us really well over the years; it was a successful project,
>>>>>>>>> but it came before radical changes in technology like Docker,
>>>>>>>>> Kubernetes and, more recently, MLOps. The servers that run ORES are at
>>>>>>>>> the end of their planned lifespan, so to save costs we are going to
>>>>>>>>> shut them down in early 2024.
>>>>>>>>>
>>>>>>>>> We have outlined a deprecation path on Wikitech (
>>>>>>>>> https://wikitech.wikimedia.org/wiki/ORES); please read that page if
>>>>>>>>> you are a maintainer of a tool or code that uses the ORES endpoint (
>>>>>>>>> https://ores.wikimedia.org/). If you have any doubts or if you need
>>>>>>>>> assistance in migrating to Lift Wing, feel free to contact the ML team
>>>>>>>>> via:
>>>>>>>>>
>>>>>>>>> - Email: m...@wikimedia.org
>>>>>>>>> - Phabricator: #Machine-Learning-Team tag
>>>>>>>>> - IRC (Libera): #wikimedia-ml
>>>>>>>>>
>>>>>>>>> The Machine Learning team is available to help projects migrate,
>>>>>>>>> from offering advice to making code commits. We want to make this as
>>>>>>>>> easy as possible for folks.
>>>>>>>>>
>>>>>>>>> High-level timeline:
>>>>>>>>>
>>>>>>>>> *By September 30th, 2023:* Infrastructure powering the ORES API
>>>>>>>>> endpoint will be migrated from ORES to Lift Wing. For users, the API
>>>>>>>>> endpoint will remain the same, and most users won’t notice any change;
>>>>>>>>> rather, just the backend services powering the endpoint will change.
>>>>>>>>>
>>>>>>>>> Details: We'd like to add a DNS CNAME that points ores.wikimedia.org
>>>>>>>>> to ores-legacy.wikimedia.org, a new endpoint that offers an almost
>>>>>>>>> complete replacement of the ORES API, calling Lift Wing behind the
>>>>>>>>> scenes. In an ideal world we'd migrate all tools to Lift Wing before
>>>>>>>>> decommissioning the infrastructure behind ores.wikimedia.org, but that
>>>>>>>>> turned out to be really challenging, so to avoid disrupting users we
>>>>>>>>> chose to implement a transition layer/API.
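>>>>>>>>>
>>>>>>>>> (For those who want to confirm the switch from the client side, a
>>>>>>>>> sketch using the third-party dnspython package; the expected target is
>>>>>>>>> based on the CNAME described above:)
>>>>>>>>>
>>>>>>>>> # Sketch: check where ores.wikimedia.org points once the CNAME exists.
>>>>>>>>> # Requires dnspython (pip install dnspython).
>>>>>>>>> import dns.resolver
>>>>>>>>>
>>>>>>>>> for rr in dns.resolver.resolve("ores.wikimedia.org", "CNAME"):
>>>>>>>>>     print(rr.target)  # expected: ores-legacy.wikimedia.org.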
>>>>>>>>>
>>>>>>>>> To summarize: if you don't have time to migrate to Lift Wing before
>>>>>>>>> September, your code/tool should work just fine on
>>>>>>>>> ores-legacy.wikimedia.org, and you won't have to change a line of your
>>>>>>>>> code, thanks to the DNS CNAME. The ores-legacy endpoint is not a 100%
>>>>>>>>> replacement for ORES (we removed some very old and unused features),
>>>>>>>>> so we highly recommend at least testing the new endpoint for your use
>>>>>>>>> case to avoid surprises when we make the switch. In case you find
>>>>>>>>> anything weird, please report it to us using the aforementioned
>>>>>>>>> channels.
>>>>>>>>>
>>>>>>>>> *September to January:* We will be reaching out to every user of
>>>>>>>>> ORES we can identify and working with them to make the migration
>>>>>>>>> process as easy as possible.
>>>>>>>>> easy as possible.
>>>>>>>>>
>>>>>>>>> *By January 2024:* If all goes well, we would like zero traffic
>>>>>>>>> on the ORES API endpoint so we can turn off the ores-legacy API.
>>>>>>>>>
>>>>>>>>> If you want more information about Lift Wing, please check
>>>>>>>>> https://wikitech.wikimedia.org/wiki/Machine_Learning/LiftWing
>>>>>>>>>
>>>>>>>>> Thanks in advance for your patience and help!
>>>>>>>>>
>>>>>>>>> Regards,
>>>>>>>>>
>>>>>>>>> The Machine Learning Team
>>>>>>>>
>>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>
>
>
_______________________________________________
Wikitech-l mailing list -- wikitech-l@lists.wikimedia.org
To unsubscribe send an email to wikitech-l-le...@lists.wikimedia.org
https://lists.wikimedia.org/postorius/lists/wikitech-l.lists.wikimedia.org/
