[
https://issues.apache.org/jira/browse/SOLR-11838?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16343656#comment-16343656
]
Jeroen Steggink commented on SOLR-11838:
----------------------------------------
As a start, I think applying models for LTR or classifying documents/fields
when indexing would be most useful.
One thing we shouldn't underestimate is data structures for Neural Networks.
Depending on the network structure a model may depend on a specific data
structure. For example, timeseries-vectors are very different from other
vectors. Are we doing just bag-of-words, or do we keep the order of words? How
many fields would you like as input? How many inputs can a model have?
(ComputationGraphs would be preferable, as they are more flexible.)
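To make the bag-of-words versus word-order question concrete, here is a minimal sketch in plain Python. It is not Solr or DL4J code; the helper names and the two sample sentences are made up for illustration. It shows that two texts with the same words in a different order collapse to the same bag-of-words vector, while an index sequence keeps them apart.

```python
from collections import Counter


def build_vocab(docs):
    """Map each distinct token to an integer index, in first-seen order."""
    vocab = {}
    for tokens in docs:
        for tok in tokens:
            vocab.setdefault(tok, len(vocab))
    return vocab


def bag_of_words(tokens, vocab):
    """Fixed-length count vector: word order is discarded."""
    counts = Counter(tokens)
    return [counts.get(term, 0) for term in vocab]


def index_sequence(tokens, vocab):
    """Variable-length list of term indices: word order is preserved."""
    return [vocab[tok] for tok in tokens]


# Hypothetical sample documents, already tokenized.
doc_a = "dog bites man".split()
doc_b = "man bites dog".split()
vocab = build_vocab([doc_a, doc_b])

print(bag_of_words(doc_a, vocab), bag_of_words(doc_b, vocab))
print(index_sequence(doc_a, vocab), index_sequence(doc_b, vocab))
```

A model that consumes the first form needs a fixed-size input layer; one that consumes the second needs to handle sequences, which is the kind of structural difference the paragraph above is pointing at.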
Furthermore, we should think about what is actually going to work. Having
one-hot encoding for all terms in an index could be problematic. There is
already a logistic regression implementation which works great for simple
classification. If we're going to use DL4J it should add something more than
Solr already offers.
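A rough back-of-the-envelope sketch of why one-hot encoding every term could be a problem. The vocabulary size, document length, and 4-byte value width below are assumed figures chosen only for illustration, not measurements from any real index.

```python
def dense_one_hot_bytes(num_tokens, vocab_size, bytes_per_value=4):
    """Bytes needed for a dense num_tokens x vocab_size one-hot matrix."""
    return num_tokens * vocab_size * bytes_per_value


def index_bytes(num_tokens, bytes_per_index=4):
    """Bytes needed if each token is stored as one integer index instead."""
    return num_tokens * bytes_per_index


# Hypothetical figures, for illustration only.
VOCAB_SIZE = 1_000_000  # distinct terms in the index
DOC_TOKENS = 200        # tokens in a single document

dense = dense_one_hot_bytes(DOC_TOKENS, VOCAB_SIZE)
compact = index_bytes(DOC_TOKENS)
print(f"dense one-hot: {dense / 1e6:.0f} MB per document")
print(f"index list:    {compact} bytes per document")
```

Under these assumptions a single document costs 800 MB in dense one-hot form against 800 bytes as a list of term indices, so any practical approach would need a sparse or otherwise reduced representation.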
Maybe we can think of a few specific use cases to make a prototype for?
I think we can make a DataVec record reader for Solr (@[~kwatters]). But I
guess this is something we can add to DataVec itself, instead of adding this to
Solr. An alternative could be to use Solr's Streaming API to return data in a
format which is efficient and could be directly used by DataVec.
Another thing I'd like to mention is dependencies. Instead of relying on DL4J
specifically, we could think about abstracting data input and output for
machine learning and applying models in general. As a DL4J user I'm not very
interested in running it on a Solr server. I have dedicated servers running
DL4J models which I serve using REST APIs. The reason is that I have servers
with GPUs and lots of RAM dedicated to this type of workload. Solr, on the
other hand, can be very demanding in a different way.
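One way to picture the abstraction suggested above is a small scoring interface with interchangeable in-process and remote implementations. This is only a sketch: the class names, the `features`/`outputs` JSON payload, and the injected transport function are all hypothetical and do not correspond to any existing Solr or DL4J API.

```python
import json
from abc import ABC, abstractmethod
from typing import Callable, List, Sequence


class ModelScorer(ABC):
    """Hypothetical abstraction: turn a feature vector into model outputs."""

    @abstractmethod
    def score(self, features: Sequence[float]) -> List[float]:
        ...


class LinearScorer(ModelScorer):
    """In-process stand-in model: a single weighted sum."""

    def __init__(self, weights: Sequence[float]):
        self.weights = list(weights)

    def score(self, features: Sequence[float]) -> List[float]:
        return [sum(w * x for w, x in zip(self.weights, features))]


class RemoteScorer(ModelScorer):
    """Delegates to an external model server through an injected transport.

    `post` takes a JSON request body and returns a JSON response body; in a
    real deployment it would wrap an HTTP call to the model server.
    """

    def __init__(self, post: Callable[[str], str]):
        self.post = post

    def score(self, features: Sequence[float]) -> List[float]:
        reply = self.post(json.dumps({"features": list(features)}))
        return json.loads(reply)["outputs"]


def fake_server(body: str) -> str:
    """Stand-in for a remote model server: returns the sum of the features."""
    feats = json.loads(body)["features"]
    return json.dumps({"outputs": [sum(feats)]})


for scorer in (LinearScorer([0.5, 2.0]), RemoteScorer(fake_server)):
    print(type(scorer).__name__, scorer.score([2.0, 1.0]))
```

The calling code only sees `ModelScorer`, so the model could run inside the search server or on dedicated GPU machines without the caller changing.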
> explore supporting Deeplearning4j NeuralNetwork models
> ------------------------------------------------------
>
> Key: SOLR-11838
> URL: https://issues.apache.org/jira/browse/SOLR-11838
> Project: Solr
> Issue Type: New Feature
> Reporter: Christine Poerschke
> Priority: Major
> Attachments: SOLR-11838.patch
>
>
> [~yuyano] wrote in SOLR-11597:
> bq. ... If we think to apply this to more complex neural networks in the
> future, we will need to support layers ...
> [~malcorn_redhat] wrote in SOLR-11597:
> bq. ... In my opinion, if this is a route Solr eventually wants to go, I
> think a better strategy would be to just add a dependency on
> [Deeplearning4j|https://deeplearning4j.org/] ...
> Creating this ticket for the idea to be explored further (if anyone is
> interested in exploring it), complementary to and independent of the
> SOLR-11597 RankNet related effort.