# we improved the stability of low-data machine learning model training

## why is this useful?

low-data model training would let individual users build AI models without
access to big data, including the kinds of models used at scale to addict
users to products, services, or behaviors.

the ability to recreate these models with the user in control would give
people many more options for personal freedom.

additionally, low-data training makes for more powerful AI.

## what was the issue?

karl’s familiar plan involved observing an attempt and improving from its
results, which can trigger us, preventing our belief that it would work. we
didn’t know why different neuron groups held opposite confidence about it.
it was confusing.

additionally, we’ve become more sensitive to reviewing current research in
order to integrate existing work, and the potential economic impact of such
work can make it hard to find.

## what was the path to the new approach?

we combined two lines of research we’ve been exposed to, to make a plan
that reduces the space of unknowns and hence requires fewer testing cycles.

basically, models trained on short data can generalize correctly to long
data, but only if the training on short data is done in a way that produces
this, which is generally associated with the number of batches as well as
other training parameters.
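
to make that concrete, here is a minimal sketch in python of what scoring
a training-parameter choice by long-data generalization could look like.
the task, model, and parameter grid are all hypothetical placeholders, not
our method:

```python
# toy harness: hypothetical task and model; only the shape of the idea matters.
import torch
import torch.nn as nn

def make_batch(batch, length, vocab=16):
    x = torch.randint(0, vocab, (batch, length))
    return x, x  # identity task: predict each token back

def train_and_score(num_batches, lr, short_len=8, long_len=64):
    model = nn.Sequential(nn.Embedding(16, 32), nn.Linear(32, 16))
    opt = torch.optim.Adam(model.parameters(), lr=lr)
    loss_fn = nn.CrossEntropyLoss()
    for _ in range(num_batches):            # train only on short data
        x, y = make_batch(32, short_len)
        loss = loss_fn(model(x).transpose(1, 2), y)
        opt.zero_grad(); loss.backward(); opt.step()
    with torch.no_grad():                   # score on long data it never saw
        x, y = make_batch(32, long_len)
        return -loss_fn(model(x).transpose(1, 2), y).item()

# the claim: generalization to long data varies with batch count and
# other training parameters, so choices can be compared by this score
scores = {(n, lr): train_and_score(n, lr)
          for n in (50, 200) for lr in (1e-3, 1e-2)}
print(max(scores, key=scores.get))
```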

meanwhile, the human mind commonly meditates on short data to prepare
successfully for long data, and we have extensive exposure to and cognition
around this.

## so what’s the new approach?

we propose training a simple metamodel that decides how many training steps
to take, what data to select, and which other parameters to use when
training on short data, so as to maximize generalization to large data.
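
a minimal sketch of such a metamodel, with hypothetical descriptor
features, layer sizes, and parameter ranges: it reads a small summary of
the short dataset and emits training parameters. since a discrete step
count blocks gradients from the inner training run, one simple option
(assumed here, not prescribed) is to fit it to the best parameters found by
search, as in the harness above:

```python
# illustrative only: descriptor features, layer sizes, and parameter
# ranges are assumptions, not a specification.
import torch
import torch.nn as nn

class MetaModel(nn.Module):
    def __init__(self, descriptor_dim=4):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(descriptor_dim, 32), nn.ReLU(), nn.Linear(32, 2))

    def forward(self, descriptor):
        raw = self.net(descriptor)
        num_batches = 10 + 490 * torch.sigmoid(raw[0])  # ~10..500 steps
        lr = 10 ** (-4 + 3 * torch.sigmoid(raw[1]))     # ~1e-4..1e-1
        return num_batches, lr

meta = MetaModel()
steps, lr = meta(torch.tensor([8.0, 16.0, 0.5, 1.0]))  # made-up descriptor
print(int(steps.item()), lr.item())
```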

the metamodel could optionally also perform the training itself by
generating model weights. although this is not required, it could speed up
training if the metamodel learns to outperform the optimizer.
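
a metamodel that emits weights directly is, in the literature, a
hypernetwork. a sketch with made-up sizes, where the generated weights are
used in place of an optimized layer:

```python
# illustrative hypernetwork: descriptor and target sizes are assumptions.
import torch
import torch.nn as nn

class WeightGenerator(nn.Module):
    def __init__(self, descriptor_dim=4, target_params=16 * 32 + 32):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(descriptor_dim, 64), nn.ReLU(),
            nn.Linear(64, target_params))

    def forward(self, descriptor):
        flat = self.net(descriptor)
        # unflatten into a weight matrix and bias for a 16->32 linear layer
        w = flat[: 16 * 32].view(32, 16)
        b = flat[16 * 32:]
        return w, b

gen = WeightGenerator()
w, b = gen(torch.randn(4))      # descriptor of the short dataset
x = torch.randn(5, 16)
y = x @ w.t() + b               # use the generated weights directly
```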

these metamodel approaches are of course obvious, but writing them down
clearly is needed because their use is also rare.

a common concern with training metamodels is the amount of data needed.
this problem seems solved to us by including the metamodel itself in its
own expanding dataset, such that it learns to generalize itself to larger
data.

including the metamodel in its own dataset can confuse developers and
designers, so clear separation of concepts when implementing seems helpful.
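
one possible separation of concepts, sketched with hypothetical names: a
task record is data *about* a training run, and the metamodel’s own
training run is logged as just another record, so the two roles (learner
and learned-about) never share an object:

```python
# hypothetical names throughout; only the separation of roles matters.
from dataclasses import dataclass, field

@dataclass
class Task:
    descriptor: list        # summary features of the short dataset
    params_used: dict       # e.g. {"num_batches": 200, "lr": 1e-3}
    generalization: float   # score on the long/held-out data

@dataclass
class MetaDataset:
    tasks: list = field(default_factory=list)

    def add(self, task: Task):
        self.tasks.append(task)

# ordinary tasks go in first
data = MetaDataset()
data.add(Task([8, 16, 0.5, 1.0], {"num_batches": 200, "lr": 1e-3}, -0.7))

# the metamodel's own (small-data) training run is logged the same way,
# so the metamodel can learn to generalize itself to more tasks
data.add(Task([len(data.tasks), 4, 0.0, 0.0],
              {"num_batches": 50, "lr": 1e-2}, -1.2))
```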

## what problems remain?

simulations imply that expected failure areas remain that will require
testing cycles, although the diversity of this space may have been
significantly shrunk.

some simulations imply the approach fails reliably. it is hard to discern
whether this is for political or mathematical reasons; we suspect the
former, and that the challenge can be surmounted.

a next step for researchers could be to describe more clearly how to
address confusion around applying the metamodel to itself. this would
significantly aid belief in its success.

a next step for engineers could be to implement the approach with systems
small enough that the metamodel need not apply to itself, and to deliver
results or feedback. a clear working system would make it easier to
describe how to address the confusion.
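
a sketch of that smaller, non-self-applied system, with assumed helpers:
train_and_score and MetaModel are the sketches above, and make_descriptor
is a hypothetical featurizer of a task. records are collected by random
search and the metamodel is fit to them, and no record ever describes the
metamodel itself:

```python
# hypothetical outer loop; helpers are assumed, as noted above.
import random

def collect_records(tasks, trials=8):
    records = []
    for task in tasks:
        best = None
        for _ in range(trials):
            params = {"num_batches": random.choice([50, 200, 500]),
                      "lr": random.choice([1e-3, 1e-2])}
            score = train_and_score(**params)    # inner run on short data
            if best is None or score > best[1]:
                best = (params, score)
        records.append((make_descriptor(task), *best))
    # fit MetaModel to predict the best params from each descriptor;
    # crucially, no record here describes the metamodel's own training
    return records
```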

## in short

to make a big model on a small dataset, train a training model that
improves generalization to more data.

this is more powerful but more difficult if it also improves generalization
of itself.