Hi, Sean.

Perhaps I don't understand. As I see it, ParamGridBuilder builds an
Array[ParamMap]. What I am proposing is a new class that also builds an
Array[ParamMap] via its build() method, so there would be no "change in the
APIs". This new class would, of course, have methods that define the
search space (log, linear, etc.) over which random values are chosen.
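
To make that concrete, here is a minimal sketch of the idea in plain Python
with no Spark dependency. The class name RandomParamBuilder, its add_log and
add_linear methods, and the parameter names in the usage lines are all
hypothetical, and build() returns a list of dicts standing in for
Array[ParamMap]. Treat it as an illustration under those assumptions, not as
the proposed implementation.

```python
import math
import random


class RandomParamBuilder:
    """Hypothetical stand-in for the proposed class; not part of Spark."""

    def __init__(self, seed=None):
        # A private generator so a given seed reproduces the same draws.
        self._rng = random.Random(seed)
        self._samplers = {}

    def add_linear(self, name, low, high):
        # Values are spread uniformly in linear space between low and high.
        self._samplers[name] = lambda: self._rng.uniform(low, high)
        return self

    def add_log(self, name, low, high):
        # Values are spread uniformly in log10 space: draw the exponent
        # uniformly, then exponentiate. Both bounds must be positive.
        if low <= 0 or high <= 0:
            raise ValueError("log-space bounds must be positive")
        lo, hi = math.log10(low), math.log10(high)
        self._samplers[name] = lambda: 10.0 ** self._rng.uniform(lo, hi)
        return self

    def build(self, n):
        # n independent draws; each dict plays the role of one ParamMap.
        return [
            {name: draw() for name, draw in self._samplers.items()}
            for _ in range(n)
        ]


# Illustrative usage: the parameter names are placeholders.
param_maps = (
    RandomParamBuilder(seed=42)
    .add_log("regParam", 1e-7, 1e-4)
    .add_linear("elasticNetParam", 0.0, 1.0)
    .build(3)
)
for pm in param_maps:
    print(pm)
```

The point of add_log is that samples between 1e-7 and 1e-4 land roughly
evenly across the decades, which a fraction of a linear grid would not give.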

Now, if this is too trivial to warrant the work and people prefer Hyperopt,
then so be it. It might be useful for people not using Python, but they can
just roll their own, I guess.

Anyway, looking forward to hearing what you think.

Regards,

Phillip



On Fri, Jan 29, 2021 at 4:18 PM Sean Owen <sro...@gmail.com> wrote:

> I think that's a bit orthogonal - right now you can't specify continuous
> spaces. The straightforward thing is to allow random sampling from a big
> grid. You can create a geometric series of values to try, of course -
> 0.001, 0.01, 0.1, etc.
> Yes I get that if you're randomly choosing, you can randomly choose from a
> continuous space of many kinds. I don't know if it helps a lot vs the
> change in APIs (and continuous spaces don't make as much sense for grid
> search)
> Of course it helps a lot if you're doing a smarter search over the space,
> like what hyperopt does. For that, I mean, one can just use hyperopt +
> Spark ML already if desired.
>
> On Fri, Jan 29, 2021 at 9:01 AM Phillip Henry <londonjava...@gmail.com>
> wrote:
>
>> Thanks, Sean! I hope to offer a PR next week.
>>
>> Not sure about a dependency on the grid search, though - but happy to
>> hear your thoughts. I mean, you might want to explore logarithmic space
>> evenly. For example, something like "please search 1e-7 to 1e-4" leads to
>> a reasonably random sample being {3e-7, 2e-6, 9e-5}. These are (roughly)
>> evenly spaced in logarithmic space but not in linear space. So, saying what
>> fraction of a grid search to sample wouldn't make sense (unless the grid
>> was warped, of course).
>>
>> Does that make sense? It might be better for me to just write the code as
>> I don't think it would be very complicated.
>>
>> Happy to hear your thoughts.
>>
>> Phillip
>>
>>
>>
>> On Fri, Jan 29, 2021 at 1:47 PM Sean Owen <sro...@gmail.com> wrote:
>>
>>> I don't know of anyone working on that. Yes I think it could be useful.
>>> I think it might be easiest to implement by simply having some parameter to
>>> the grid search process that says what fraction of all possible
>>> combinations you want to randomly test.
>>>
>>> On Fri, Jan 29, 2021 at 5:52 AM Phillip Henry <londonjava...@gmail.com>
>>> wrote:
>>>
>>>> Hi,
>>>>
>>>> I have no work at the moment so I was wondering if anybody would be
>>>> interested in me contributing code that generates an Array[ParamMap] for
>>>> random hyperparameters?
>>>>
>>>> Apparently, this technique can find a hyperparameter setting in the top
>>>> 5% of the parameter space in fewer than 60 iterations with 95%
>>>> confidence [1].
>>>>
>>>> I notice that the Spark code base has only the brute force
>>>> ParamGridBuilder unless I am missing something.
>>>>
>>>> Hyperparameter optimization is an area of interest to me but I don't
>>>> want to re-invent the wheel. So, if this work is already underway or there
>>>> are libraries out there to do it please let me know and I'll shut up :)
>>>>
>>>> Regards,
>>>>
>>>> Phillip
>>>>
>>>> [1]
>>>> https://www.oreilly.com/library/view/evaluating-machine-learning/9781492048756/ch04.html
>>>>
>>>
