so with petals down, i was wondering, how would i finetune llama 405b
without petals?

basically, i would do it very slowly :s

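i was thinking of getting into the nitty gritty of backpropagation
graphs and doing it layer by layer.
for example, if you have to offload every layer, you could update one
layer's weights at the same time as forward passing the next batch
through the layers that are already updated.
at best this would approach double the speed, and only if the update
and the forward pass take about as long as each other and don't fight
over the same transfer bandwidth.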
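
here is a rough sketch of that overlap, just to pin the idea down. it is
a toy under loud assumptions: each "layer" is a single scalar weight
(y = w * x), the loss is 0.5 * (out - target)**2, the optimizer is plain
sgd, and a one-worker thread pool stands in for whatever would really do
the offloaded update. all the function names are made up for
illustration. the point is only the ordering: layer i's update has to be
finished before the next batch reaches layer i, and layer i+1's update
runs while layer i is doing its forward step.

```python
from concurrent.futures import ThreadPoolExecutor

def forward(weights, x):
    """Run x through a chain of scalar layers y = w * x, keeping each layer's input."""
    inputs = []
    for w in weights:
        inputs.append(x)
        x = w * x
    return x, inputs

def backward(weights, inputs, out, target):
    """Gradient of 0.5 * (out - target)**2 with respect to each weight."""
    grads = [0.0] * len(weights)
    upstream = out - target
    for i in reversed(range(len(weights))):
        grads[i] = upstream * inputs[i]
        upstream = upstream * weights[i]
    return grads

def sgd_update(weights, grads, i, lr):
    """Plain SGD step for one layer (stand-in for an offloaded update)."""
    weights[i] -= lr * grads[i]

def update_then_forward(weights, grads, x, lr, pool):
    """Apply the pending updates while forwarding the next batch.

    While layer i runs its forward step, layer i+1's update runs in the pool.
    """
    inputs = []
    pending = pool.submit(sgd_update, weights, grads, 0, lr)
    for i in range(len(weights)):
        pending.result()  # layer i must be up to date before it is used
        if i + 1 < len(weights):
            pending = pool.submit(sgd_update, weights, grads, i + 1, lr)
        inputs.append(x)
        x = weights[i] * x
    return x, inputs

def train(weights, batches, lr=0.01):
    """Train in place on (x, target) pairs; returns the loss seen for each batch."""
    losses = []
    with ThreadPoolExecutor(max_workers=1) as pool:
        x, target = batches[0]
        out, inputs = forward(weights, x)
        for nxt in batches[1:] + [None]:
            losses.append(0.5 * (out - target) ** 2)
            grads = backward(weights, inputs, out, target)
            if nxt is None:
                # no next batch to overlap with, so just finish the updates
                for i in range(len(weights)):
                    sgd_update(weights, grads, i, lr)
                break
            x, target = nxt
            out, inputs = update_then_forward(weights, grads, x, lr, pool)
    return losses
```

because every layer is updated before it is used, this gives the same
weights as the ordinary forward, backward, update loop. the toy itself
won't run any faster, since scalar math in python threads doesn't
overlap; any real gain would come from the update and the forward pass
running on different hardware or transfer paths.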